Project Activity Metrics, Charts, and Stuff

Yesterday we were reminded that it is difficult to provide good activity metrics. Certainly Git commit statistics are a good means of indicating that there is activity, but different styles of development can leave very different impressions: a lot of very small commits leaves the impression that there is more activity than a smaller number of large commits. That is, when you just count commits, a one-line contribution is equal to a 1,000-line contribution (skipping merge commits is a no-brainer).

I’ve tinkered with using diff data to quantify individual commits, but even that can be misleading (e.g. generated code inflates the numbers, and removing vast chunks of code is often a net positive contribution). It’s made even harder when you consider that real software developers can spend a lot of time working out a solution to a problem that results in a relatively small contribution of code.
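To make the gap between the two measures concrete, here is a minimal sketch that tallies both commit counts and changed lines per author. It is not the query behind our charts; it assumes input roughly in the shape that `git log --no-merges --numstat --format="commit %ae"` produces, and the sample log, the addresses, and the `summarize` helper are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical sample, roughly the shape produced by:
#   git log --no-merges --numstat --format="commit %ae"
# (merge commits are already excluded by --no-merges).
SAMPLE_LOG = (
    "commit alice@example.org\n"
    "1\t0\tsrc/Foo.java\n"
    "\n"
    "commit alice@example.org\n"
    "0\t1\tsrc/Foo.java\n"
    "\n"
    "commit bob@example.org\n"
    "600\t400\tsrc/Bar.java\n"
    "-\t-\ticons/logo.png\n"
)


def summarize(log_text):
    """Return {author: {"commits": n, "lines": added + deleted}}."""
    stats = defaultdict(lambda: {"commits": 0, "lines": 0})
    author = None
    for line in log_text.splitlines():
        if line.startswith("commit "):
            # Start of a new commit: remember the author and count it.
            author = line[len("commit "):].strip()
            stats[author]["commits"] += 1
        elif line.strip() and author is not None:
            added, deleted, _path = line.split("\t", 2)
            # Binary files report "-" for both counts; skip them.
            if added.isdigit() and deleted.isdigit():
                stats[author]["lines"] += int(added) + int(deleted)
    return dict(stats)


if __name__ == "__main__":
    for who, totals in summarize(SAMPLE_LOG).items():
        print(who, totals)
```

In this made-up sample, one author has twice as many commits as the other while touching a tiny fraction of the lines. Neither number says anything about generated code, deletions that improve the code base, or the thinking time behind a small change.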

Code talks at Eclipse, so making an initial assessment of liveliness by looking at the Git commit record makes a lot of sense. When I assess a project for activity, I usually start with commit metrics and dig in from there. Bugzilla, mailing list, and forum activity are good places to look for evidence of daily activity (the Dashboard is a good source for this information). I also look to see if the project is making regular releases or generating milestone builds. Finally, I actually try to communicate directly with the project team. It’s amazingly easy to write a quick “how’s it going?” note if I’m concerned that I don’t see enough evidence of activity.

Our open source projects information site, the so-called Project Management Interface, includes some handy charts intended to provide quick insight into project Git commit liveliness. They can, however, be a bit difficult to understand if you are not familiar with our arcane project structure. So I’ve made a change.

The query behind the Git commit charts that we display for a project now includes data from all subprojects recursively. So, for example, the charts for the Tools top-level project include data from all Tools projects. Likewise, the charts for JDT include the subprojects (Core, Debug, and UI). I quite like the change as it makes it much easier to get an understanding of both the real project activity and the diversity of the committer base.
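The recursive roll-up can be sketched in a few lines. This is only an illustration of the idea, not the actual query: the project identifiers, commit counts, committer names, and the `rollup` helper are all hypothetical.

```python
# Hypothetical project tree: parent id -> list of direct subproject ids.
SUBPROJECTS = {
    "jdt": ["jdt.core", "jdt.debug", "jdt.ui"],
}

# Hypothetical per-project activity: (commit count, set of committers).
# The numbers and names are invented for illustration only.
ACTIVITY = {
    "jdt": (0, set()),
    "jdt.core": (40, {"ann", "raj"}),
    "jdt.debug": (10, {"raj"}),
    "jdt.ui": (25, {"lee", "ann"}),
}


def rollup(project):
    """Return (commits, committers) for a project and all its descendants."""
    commits, own_committers = ACTIVITY.get(project, (0, set()))
    committers = set(own_committers)  # copy so the source data is not mutated
    for child in SUBPROJECTS.get(project, []):
        child_commits, child_committers = rollup(child)
        commits += child_commits
        committers |= child_committers  # someone active in two subprojects counts once
    return commits, committers


if __name__ == "__main__":
    total, people = rollup("jdt")
    print(total, sorted(people))
```

Collecting committers as a set matters for the diversity picture: a committer who works in both Core and UI should show up once in the parent project's chart, not twice.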

But even with this change, it’s difficult to fully assess activity based on just one project’s commit metrics. Many people in the community equate JDT with “Eclipse IDE” and while JDT is what makes Eclipse a Java IDE, it relies heavily on frameworks implemented by the Platform project. So, activity in the Platform is an indication of activity in the Java IDE. This is made more challenging when you add the projects that leverage the extensibility of the platform, e.g. Web Tools, Mylyn, Code Recommenders, m2e (Maven), Buildship (Gradle), Git (EGit/JGit). A collection of activity charts including all of these projects would be interesting…

We don’t have an activity problem in our Java IDE. But we do have an opportunity to do a better job of communicating what we do and how we do it.

If you’re curious about ongoing work in Java 9 support, check out the beta.

This entry was posted in Community, Eclipse 101, Java.
