First off, thanks to all of the ohloh team -- this is an excellent resource. We're having fun with it. I've noticed several issues, however, that I hope will be looked into at some point.
The biggest issue I noticed is that the processing doesn't seem to take Attic/deleted/moved files into account, at least when computing developers' years of contribution. Our project, BRL-CAD -- http://www.ohloh.net/projects/3996 -- has almost 25 years of CVS history (potentially one of the longest project histories retained in a revision control system), but the counts on our project page don't reflect this. For example, it lists the project's original author (mike) at only 7.0 years of contribution when that should be 17 years. There are commits from him from 1983 all the way through 2000.
One possible cause for the 7.0 year figure is that in 2004 our repository was vastly reorganized, changing from a top-level-heavy repository to a more hierarchical organization. This resulted in a ton of files and directories moving around in CVS (via delete/re-add). If our repository is checked out unpruned, or even if Attic files for existing directories were considered, the logs are massively more extensive. Since our reorganization was so extensive, the years of contribution are pretty much wrong for all but the newest contributors (since we went open source in 2004), which is noticeable given the major discrepancy for all our devs. If it is taking the Attic into account, then I can only imagine there is some other bug/assumption in the years-of-contribution calculation.
Example statcvs report that shows our reorganization and extent of history: http://ftp.brlcad.org/statcvs/
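To make the Attic point concrete, here is a rough Python sketch of how the discrepancy could arise. I don't know how ohloh actually computes years of contribution, so the function, the record layout, and the sample dates and file names below are all made up for illustration. The only assumption taken from CVS itself is that removed files live under an Attic directory.

```python
from datetime import date

# Hypothetical commit records: (author, commit date, repository path).
# Dates and file names are invented for illustration, not read from our CVS log.
COMMITS = [
    ("mike", date(1983, 1, 1), "librt/Attic/a.c,v"),  # file later deleted/moved
    ("mike", date(1993, 1, 1), "src/librt/b.c,v"),
    ("mike", date(2000, 1, 1), "src/librt/b.c,v"),
]


def contribution_years(commits, include_attic=True):
    """Return {author: years between first and last commit}, rounded to 0.1."""
    spans = {}
    for author, day, path in commits:
        # Assumption: a path with an "Attic" component is a removed file.
        if not include_attic and "Attic" in path.split("/"):
            continue
        first, last = spans.get(author, (day, day))
        spans[author] = (min(first, day), max(last, day))
    return {
        author: round((last - first).days / 365.25, 1)
        for author, (first, last) in spans.items()
    }


print(contribution_years(COMMITS, include_attic=True))
print(contribution_years(COMMITS, include_attic=False))
```

With this invented data, skipping the Attic entry shortens mike's span from 17.0 years to 7.0 years, which is the kind of gap we are seeing on our project page.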
Speaking of statcvs, one of the coolest aspects of the statcvs report is the authorship speckle graph. That would make for a great ohloh feature, especially if it took the magnitude of each commit into account (which statcvs doesn't). Example, red graph near bottom: http://ftp.brlcad.org/statcvs/authors.html
Other feature requests (not that you probably don't have more than enough to do already):
- Include BSD-style licenses in the "Licenses" section file count
- Allow specification of exclusion paths that should not be processed (e.g. our src/other is entirely 3rd-party dependency code and shouldn't be included in our statistics)
- Account for Attic changes in project costs
- Update enlistment reprocessing more frequently (perhaps once a week, or at least on request)
- Add an "age of project revision history" metric (even our non-Attic files go back almost 20 years)
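For the exclusion-path request, something as simple as a directory-prefix check would cover our case. This is only a sketch in Python: is_excluded and the sample paths are hypothetical names I made up, not anything ohloh provides.

```python
from pathlib import PurePosixPath


def is_excluded(path, excluded_dirs):
    """Return True if path lies inside any of the excluded directories."""
    parts = PurePosixPath(path).parts
    for directory in excluded_dirs:
        dir_parts = PurePosixPath(directory).parts
        # Compare whole path components so "src/other" does not
        # accidentally match a sibling such as "src/otherstuff".
        if parts[:len(dir_parts)] == dir_parts:
            return True
    return False


# Hypothetical file list; only src/other comes from our actual layout.
files = ["src/other/somelib/x.c", "src/otherstuff/y.c", "src/librt/z.c"]
counted = [f for f in files if not is_excluded(f, ["src/other"])]
print(counted)
```

Matching on whole path components rather than raw string prefixes keeps similarly named directories from being dropped by mistake.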
Thanks again for the great site. Look forward to seeing more improvements!
Cheers!
Sean