Is it possible to exclude code from the statistics? There's a lot of generated code, which leads to a "4 person year" project even though it's only a few months old and uses a third-party library.
I have a similar problem. I have a project with a 3rd party library in the same repository, so it lists "3 person year" instead of the correct 1-4 months.
Sorry, this is not currently possible.
For a long time, there has been a good idea floating around: Ohloh should support some kind of
robots.txt-like file that would allow you to instruct Ohloh to ignore or give special treatment to certain directories.
I think that's a great idea, but we've simply never had the development resources required to get it done.
If you are using Subversion, there may be a workaround, but it's a lot of work: rather than enlisting your entire trunk in Ohloh, you can individually add every directory except the directory containing the 3rd party library. If you have a lot of directories, I can appreciate that this may not be a realistic option.
A few of the projects I manage show as "mostly written in XSLT" because we have the DocBook stylesheets in our SVN repositories. We'd also appreciate a way to exclude a particular directory.
I have added a ticket: http://labs.ohloh.net/ohcount/ticket/317; unfortunately I am not a Ruby coder (yet); anyone else up for it?
(see also thread https://www.ohloh.net/topics/3356?page=1#post_10651)
I'm hitting this with my project as well. I've just added an app I'm writing called "WarFoundry". It has a System.Windows.Forms (Microsoft .Net Windows native) front end. All of the ".resx" (resource) files are bumping our XML line count way out of proportion. The Glade files for the GTK# front-end are probably doing the same thing. Both are all auto-generated files.
Unfortunately the "multiple directories" idea won't work because resx files are in the same place as .cs and Glade files are in a folder below the main code. What would be great for that situation would be an "ignore file name pattern" option :)
A better workaround for Subversion is to make the 3rd party code an svn external. Last time I checked, Ohloh doesn't traverse externals.
You can do this by adding the 3rd party code outside the regular code tree, for example at /3rdparty instead of /trunk/3rdparty, and making /trunk/3rdparty an external pointing to /3rdparty.
OK, you all have some ideas, but I can't agree with most of them. I'd say none of us wants to restructure our project just to look good in the Ohloh statistics.
The only good solution I see is to set paths/patterns to exclude in the Ohloh control panel.
Looks like there's a minor false-alarm with my project :) While digging around I found that I'd included the Log4Net documentation as well as the DLL (I hadn't paid attention to what was in the .xml file). It appears that although .resx files are XML, Ohloh doesn't pick them up as such.
Still, the general idea of "filterable paths" for when a project does include code or files that are being picked up but aren't wanted in the count is a good one :)
This is really necessary, as it is currently what keeps me from adding my primary git repositories to Ohloh. As it is now, I'm manually syncing the changes into a separate Subversion repo where I have enlisted only the src directory.
But I have one issue with the include/exclude idea: why don't we just classify paths into categories like sourcecode, data, docs, external, etc.? For example:
/lib/ > external
/docs/ > documentation
data/ > data
* > sourcecode
It shouldn't be too hard to match each path against this during counting and assign the score to the appropriate category.
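As a rough illustration of the matching described above, here is a minimal Python sketch. It is not Ohcount or Ohloh code; the `RULES` table, `classify`, and `count_by_category` are hypothetical names, and the rules are treated as simple path prefixes with the first match winning (the catch-all `*` becomes an empty prefix).

```python
from collections import Counter

# Hypothetical rule table mirroring the list above: (path prefix, category).
# Rules are checked in order and the first match wins; the empty prefix
# plays the role of the "*" catch-all.
RULES = [
    ("lib/", "external"),
    ("docs/", "documentation"),
    ("data/", "data"),
    ("", "sourcecode"),
]


def classify(path):
    """Return the category of a repository path under RULES."""
    normalized = path.lstrip("/")  # treat "/lib/x" and "lib/x" alike
    for prefix, category in RULES:
        if normalized.startswith(prefix):
            return category
    return "sourcecode"  # only reached if RULES has no catch-all


def count_by_category(line_counts):
    """Sum per-file line counts (path -> lines) into per-category totals."""
    totals = Counter()
    for path, lines in line_counts.items():
        totals[classify(path)] += lines
    return totals
```

With a table like this, a file under `lib/` would be added to the "external" total instead of inflating the project's own source count.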
I agree with your idea -- I'd always visualized this as more of a tagging system than an exclude/include system.
Initially, we might only honor the "ignore" tag, but as time goes on we might allow code to be tagged in all kinds of interesting ways.
@robin, good to hear that, but the main question still remains: is this feature ever going to be implemented?
The topic has existed for quite some time, and most of the solutions presented would be quite easy to implement.
Sorry, I can't make any estimate when we will get to this.
We are currently focused on performance and reliability issues. We're physically moving to a new data center, and we are redesigning our source control processing for better scalability. I can't guess how long it will be before we have free cycles to add new features.
And while I agree that the solutions on this thread are good ones in principle, when you think about actually implementing them, they turn out to be surprisingly complicated.
Does a robots.txt-style file apply to all revisions of a repository, or just particular revisions? If I rename or move code, I'll need to change my robots.txt. How does the time axis of robots.txt work? How do I know which revisions are covered by which robots.txt?
How do I confirm that Ohloh processed my robots.txt correctly? How will Ohloh explain that some code is ignored intentionally? Currently, Ohloh doesn't even let me browse the code at all. How can I debug the reason for missing/extra code?
Finally, what happens after someone makes a change to the robots.txt? One small change might require Ohloh to do a full recalculation across the full history of the project, which might take a week of server time on a large project. How will we avoid that?
Those are some reasons why this feature still does not exist. I'd really like to get it done, but it's messier than it seems at first. Maybe after we've hired some more help... :-)
The easy answer would be not to use a file present IN the repository, but rather to require the person enlisting the repository to supply the patterns for categorizing before any processing takes place. These patterns could be immutable and hence would not lead to any extra processing. Rather, by having an ignore tag (that actually caused the parser to ignore the files) you would free CPU cycles for more important work.
Just to go back and correct one of my previous statements, it appears that Ohcount does count .resx (.Net's "resources wrapped in XML") files as XML - https://www.ohloh.net/p/WarFoundry/commits/48803516?page=4 - as well as .manifest files (which are XML, but are also auto-generated) - https://www.ohloh.net/p/WarFoundry/commits/48803516?page=5 - and a few others.
So, from a C# project point of view, filtering based on extensions to remove auto-generated code from the list would be useful :) Personally, I'd prefer an Ohloh-based solution checking file paths against a pattern rather than some extra file to put in the repo (which "contaminates" it with unnecessary junk).
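To make the pattern idea concrete, here is a small Python sketch of checking file names against ignore patterns. It is only an illustration, not anything Ohloh or Ohcount provides; `IGNORE_PATTERNS`, `is_ignored`, and `counted_files` are hypothetical names, and the two patterns are simply the extensions mentioned above.

```python
import posixpath
from fnmatch import fnmatchcase

# Hypothetical per-project setting: shell-style patterns for file names
# that should be left out of the line counts.
IGNORE_PATTERNS = ["*.resx", "*.manifest"]


def is_ignored(path, patterns=IGNORE_PATTERNS):
    """True if the file name part of a repository path matches any pattern."""
    name = posixpath.basename(path)
    # fnmatchcase is used so matching is case-sensitive on every platform.
    return any(fnmatchcase(name, pattern) for pattern in patterns)


def counted_files(paths, patterns=IGNORE_PATTERNS):
    """Return only the paths that should still be counted."""
    return [p for p in paths if not is_ignored(p, patterns)]
```

Because the match is on the file name rather than the directory, this would cover the case where .resx files sit next to the .cs files they belong to.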
The recalculation problem could be an issue, unless it just never gets applied retrospectively (much like a commit - once you make it then it is always there, which is why my line count spiked like crazy because of some XML docs!)
@robin, any news on this feature? I'm betting this is quite a big showstopper for git users...
What needs to happen is for someone to add an ignore setting to ohcount which takes a folder path as input. Then robin can come along and just add it to ohloh.
Not to be contrary, but this is a bit more difficult than just implementing ignore features in Ohcount.
There's the whole question of time specificity -- if someone changes the ignore settings for a project, does that apply only to the code moving forward, or will it apply to all of old source control history as well? Ohloh doesn't have the processing power to re-calculate from scratch the line counts for the entire project history every time the ignore settings are changed.
What happens if the code that needs to be ignored changes its location in the source tree over time?
There's also the question of how to communicate to the users which project contents are or aren't being ignored by Ohloh, and whether the settings entered by the users are working correctly.
This problem is a bit trickier -- and a lot more expensive computationally -- than it seems at first glance, which is why we have been dragging our feet on an implementation.
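One way to picture the "moving forward only" option is a rule that records the revision from which it takes effect, so older history never needs recounting. The Python sketch below is purely hypothetical and not how Ohloh works; `IgnoreRule` and `counted_paths` are made-up names, and revisions are assumed to be ordered integers as in Subversion (git would need some other ordering).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IgnoreRule:
    prefix: str         # directory prefix to ignore, e.g. "3rdparty/"
    from_revision: int  # first revision the rule applies to


def counted_paths(revision, paths, rules):
    """Return the paths that still count at the given revision.

    A rule only affects revisions at or after its from_revision, so
    counts already computed for older revisions stay valid.
    """
    active = [rule.prefix for rule in rules if revision >= rule.from_revision]
    return [p for p in paths if not any(p.startswith(a) for a in active)]
```

This sidesteps the recalculation cost, but it does nothing for code that moves within the tree, and older revisions keep their inflated counts.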
There will always be things that Ohcount cannot catch, no matter how much effort you put in.
In my opinion it would make more sense to provide an option to project owners/managers in the Ohloh web-interface. Such an option should be on a per-project basis and allow project owners/managers to flag certain files in the project for non-standard treatment.
This would then allow files to be flagged "meta-data" or "ignore" and Ohcount could then be made not to count them.
It could also allow files to be flagged "embedded library" and Ohcount could count the lines separately and list them in the statistics under "embedded libraries". If there was also an option to specify another Ohloh tracked project as the origin for such an embedded library, then that would make it possible to automatically increment the use count for that project.
Last but not least, it would also allow non-standard use of file extensions to be fixed by the project owners without having to add disambiguation code to Ohcount and without an old project requiring renaming of filenames and trickery to hide the history from Ohcount.
I just added my latest project, which uses git as its repository. In the git repository itself I've bundled another open source project, since anything else would make it a pain in the ass to handle (other types of repositories as a submodule? Forget it!). It would be really nice if I could exclude that directory from my stats, since it adds a ridiculous amount of work by someone else to my project and also falsifies the license stats (my project is MIT licensed, the bundled one BSD-style).
I would love to see this feature added too; it is a common problem in many projects I work on. A simple system where we could simply ignore specific paths would work for 90% of cases, I suspect. I would want it to go back to the start in general though, not just from now on...
I appreciate the resources that this might require. Maybe an extra checkbox to request such an action? I don't know how you do scheduling. Thanks for all of the work you guys put into Ohloh!
I agree, would be a very nice option. Good luck including it!
Just to make sure everyone is aware, we recently deployed a feature to tell Ohloh which files to ignore. See more at https://www.ohloh.net/blog/LatestUpdatesToIgnoringFilesandDirectories
Copyright © 2013 Black Duck Software, Inc. and its contributors, Some Rights Reserved. Unless otherwise marked, this work is licensed under a Creative Commons Attribution 3.0 Unported License . Ohloh ® and the Ohloh logo are trademarks of Black Duck Software, Inc. in the United States and/or other jurisdictions. All other trademarks are the property of their respective holders.