Keeping Code Clean with Cruft4J

Ben Northrop

If you're working on a system of enterprise scale, then you're probably "fighting the good fight" against code entropy. Tools like PMD, FindBugs, and Sonar are excellent weapons in this battle. Each gives a slightly different perspective on code rot: some flag dormant bugs, others violations of coding best practices, and still others deviations from style conventions. All that's left to do is go in there and clean it up. So get to it!

Unfortunately, as we know, it's just not that easy. Cleaning up code takes work, and every hour spent cleaning up code is an hour not spent adding functionality. This makes Project Managers and Business Owners squirm (justifiably!). Cleaning up code is also risky, since every line of code changed is a line of code that could break. And this, of course, makes QA nervous.

The point is that although cleaning up code is inherently a developer activity, it requires team buy-in. And to get this buy-in, you need to prove to Project Managers, Business Owners, and QA that it's a worthwhile endeavor. Even if they accept the concept of technical debt in principle, before they "ok" your code cleaning expedition they probably still want to know:

  • How big is the problem (i.e. how does our code quality compare to other systems)?
  • Are things getting better or worse?

Armed with just a long list of obscure coding violations from PMD and the others, these questions are difficult to answer. Sure, your system has 1,358 violations, but is that good or bad? Are these violations even important?

This is where a tool called Cruft4J can help. Cruft4J is a static source code analysis tool, like the others, but instead of a detailed list of coding violations, Cruft4J spits out just one score:

Cruft Score = (Complexity + Copy-Paste) / Lines of Code
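
To make the arithmetic concrete, here is a minimal sketch that applies the formula above literally. It is not Cruft4J's code or API: the class name, method name, and input numbers are all made up for illustration, and it assumes the complexity and copy-paste totals have already been measured by some other means. It also applies no scaling, so its output should not be compared with published cruft scores.

```java
// Hypothetical sketch: illustrates the formula only, not Cruft4J's actual implementation.
final class CruftScoreSketch {

    /**
     * Applies the formula as written: (complexity + copyPaste) / linesOfCode.
     * How each input is measured, and whether the result is scaled, is not covered here.
     */
    static double cruftScore(long complexity, long copyPaste, long linesOfCode) {
        if (linesOfCode <= 0) {
            throw new IllegalArgumentException("linesOfCode must be positive");
        }
        return (double) (complexity + copyPaste) / linesOfCode;
    }

    public static void main(String[] args) {
        // Made-up inputs for illustration only.
        long complexity = 120;
        long copyPaste = 80;
        long linesOfCode = 1000;
        System.out.println(cruftScore(complexity, copyPaste, linesOfCode));
    }
}
```

Dividing by lines of code is what makes the number comparable across systems of different sizes: a large code base is not penalized just for being large.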

To answer the first question (how big is the problem?), run Cruft4J once and compare the resulting score against some benchmarks of open source code. Obviously, the cruft score isn't a perfect measure of code quality, but if your system scores 200 and the average score is 52, you know something is probably wrong.

To answer the second question (are things getting better or worse?), run Cruft4J regularly, or better yet, hook it into your build. That makes it easy to spot trends, so you can tell whether the ship is headed in the right or wrong direction.
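
One way to turn a series of scores into a "better or worse" answer is to fit a straight line through them and look at its slope. The sketch below does that with an ordinary least-squares fit over the build index. Everything in it is an assumption for illustration: the class, the `Trend` labels, the tolerance parameter, and the sample scores are hypothetical, and Cruft4J itself may report trends differently or not at all. It assumes a higher score means more cruft.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: trend detection over scores recorded once per build.
final class CruftTrendSketch {

    enum Trend { IMPROVING, WORSENING, FLAT }

    /** Least-squares slope of score against build index (0, 1, 2, ...). */
    static double slope(List<Double> scores) {
        int n = scores.size();
        if (n < 2) {
            throw new IllegalArgumentException("need at least two scores");
        }
        double meanX = (n - 1) / 2.0;
        double meanY = 0;
        for (double s : scores) {
            meanY += s;
        }
        meanY /= n;

        double numerator = 0;
        double denominator = 0;
        for (int i = 0; i < n; i++) {
            numerator += (i - meanX) * (scores.get(i) - meanY);
            denominator += (i - meanX) * (i - meanX);
        }
        return numerator / denominator;
    }

    /** A rising score means more cruft, so a positive slope is bad news. */
    static Trend classify(List<Double> scores, double tolerance) {
        double s = slope(scores);
        if (s > tolerance) {
            return Trend.WORSENING;
        }
        if (s < -tolerance) {
            return Trend.IMPROVING;
        }
        return Trend.FLAT;
    }

    public static void main(String[] args) {
        // Made-up scores from four consecutive builds.
        List<Double> scores = Arrays.asList(60.0, 58.0, 55.0, 52.0);
        System.out.println(classify(scores, 0.5));
    }
}
```

The tolerance keeps small build-to-build wobble from being reported as a trend; what counts as "small" is a judgment call for each team.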

Now there are other tools, like Crap4J and Sonar's Technical Debt, that also provide a single score. These tools, however, include factors that, in my opinion, are sometimes dubious indicators of code quality. For instance, Sonar's TD considers the percentage of methods that have Javadoc, but just because a method has a Javadoc comment doesn't mean that comment is actually useful! Cruft4J, by contrast, stands on the unassailable principles of Decomposition and Don't Repeat Yourself. It's never the case that a complex method could not be broken down, or that copy-and-pasting code is preferable to other forms of re-use.

In the end, Cruft4J is not the solution, but instead a vehicle for getting the conversation started. As Business Owners and Managers come to understand the extent of the code rot, a better case can be made not only to clean up code, but also to invest in processes that prevent bad code in the first place (e.g. code reviews, unit testing, etc.). Anyway, I hope it helps, and I'd love to hear your thoughts.

A final note: Cruft4J was built as part of an internal initiative we call Summa Labs. In Summa Labs, we work on side projects, big or small, for the purpose of honing our technical skills, collaborating with others, building cool stuff, or just having fun. If you're interested in Summa or Summa Labs, contact us!

Ben Northrop

Distinguished Technical Consultant