I use docstrings in all of the Python projects I work on. Could it be that they are still not counted as comments on Ohloh?
Regards, Armin
Agreed, not counting docstrings in Python projects results in flawed metrics. Pretty much all Python projects have the "Few source code comments" annotation, even if they are extremely well commented via docstrings.
To bump this topic a bit, here is a regexp for finding docstrings:
^(\s*)("""(?:.|\n)*?""")
and
^(\s*)('''(?:.|\n)*?''')
Regards, Armin
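For anyone who wants to try the two patterns above, here is a minimal sketch of how they might be applied with Python's standard re module. The function name count_docstring_lines and the SAMPLE text are made up for illustration, and re.MULTILINE is an assumption about how the patterns are meant to be used (without it, ^ only matches at the very start of the file). One known limitation: a closing triple quote that sits alone on its own line can be mistaken for an opening one.

```python
import re

# The two patterns from the post, compiled with re.MULTILINE so that
# ^ matches at the start of every line, not only at the start of the file.
DOUBLE = re.compile(r'^(\s*)("""(?:.|\n)*?""")', re.MULTILINE)
SINGLE = re.compile(r"^(\s*)('''(?:.|\n)*?''')", re.MULTILINE)

# Hypothetical input: one real docstring and one assigned string.
SAMPLE = '''\
def greet(name):
    """Return a greeting.

    The name is used as given.
    """
    return "Hello, " + name

BANNER = """not a docstring"""
'''


def count_docstring_lines(source):
    """Count source lines covered by triple-quoted strings that begin a line."""
    total = 0
    for pattern in (DOUBLE, SINGLE):
        for match in pattern.finditer(source):
            # group(2) is the quoted string itself, without the indentation.
            total += match.group(2).count("\n") + 1
    return total


print(count_docstring_lines(SAMPLE))
```

The string assigned to BANNER is skipped because something other than whitespace comes before the opening quotes on that line.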
Anything new?
I know that I'm being very annoying, but I always get a strange feeling when I notice that my projects once again have the "few comments" factoid although the source code is well documented... Maybe there is a chance to get docstrings counted.
Hi Armin,
I'm sorry to report there hasn't been any progress here. I'll crack open the source analyzer and see what I can do. I'll report back tomorrow.
OK, here's where I got:
Adding the triple-single-quote and triple-double-quote was pretty easy, but some further research shows that it's only considered a docstring IF this triplet is found as the FIRST statement in a function (or a module or class) - otherwise it's just a handy way to write strings without escaping quotes or newlines (see: DiveIntoPython).
Before I turn this on, I'm curious to know how often triple-quoting is used in a non-docstring case. Is it often enough to warrant adding more rules - or is it rare enough to ignore?
It might also be used for other strings in code. While looking into this, you might also consider detecting the license variable in each module, which can contain the license text. This is also something Ohloh currently fails to detect.
You can see example here: http://viewsvn.cihar.com/viewvc.cgi/imap-utils/trunk/IMAPUtils/init.py?view=markup
Ed: I deleted the duplicates
Sorry for the flood, but I didn't submit anything more than once.
@Jason: Yes. It's a normal string too. Just make sure it's not assigned to a variable. That's why I proposed the ^\s*
@Michal: your link resolves to a server error - can you please update?
@Armin: Aaahhh.... yes - that makes a lot of sense.
Note that docstrings don't need to be triple-quoted: any Python string literal works.
Jason: I don't know the exact numbers, but I'm fairly certain triple-quoted literals are used often enough as non-docstrings to skew the statistics.
The easiest way to reliably extract docstrings would probably be via Python 2.5's AST support.
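As a rough sketch of the AST idea, the following uses the standard ast module of a current Python 3 (3.8 or later, because of end_lineno), which is an assumption and not the Python 2.5 API available when this was posted. The function name docstring_line_count and the SAMPLE text are invented for the example. It relies on ast.get_docstring, so it also picks up docstrings that are not triple-quoted and ignores strings that are merely assigned.

```python
import ast

# Hypothetical input: a module docstring, an assigned triple-quoted
# string, and a function with a single-quoted docstring.
SAMPLE = '''\
"""Module docstring."""

BANNER = """assigned,
so not a docstring"""

def greet(name):
    'Single-quoted docstrings count too.'
    return "Hello, " + name
'''

# Node types that can carry a docstring.
DOC_OWNERS = (ast.Module, ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)


def docstring_line_count(source):
    """Return the number of source lines occupied by docstrings."""
    total = 0
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, DOC_OWNERS):
            if ast.get_docstring(node, clean=False) is not None:
                first = node.body[0]  # the docstring expression statement
                total += first.end_lineno - first.lineno + 1
    return total


print(docstring_line_count(SAMPLE))
```

The trade-off is that the file has to parse with the interpreter running the analysis, which may not hold for code written for a different Python version.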
Traversing the AST would be one way, but with the whitespace rule the regexp won't catch normal strings that often. We use the same approach in a syntax highlighter, and so far it hasn't highlighted a single line of code wrongly :-)
@Jason: Sorry, I meant this file (I forgot that underscores will be treated as bold).
Is there any news on this?
You could hack in docstrings by looking for lines starting with class or def, followed by a string: either a one-liner "string" or a possibly multiline string delimited by """/'''. Parsing the code would be much more accurate.
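A minimal sketch of that line-based hack might look like the following. Everything here (the HEADER pattern, the function name heuristic_docstring_lines, the SAMPLE text) is made up for illustration. It assumes the class or def header fits on one line ending in a colon, and it ignores module docstrings and string prefixes such as r or u, so it is only a rough approximation.

```python
import re

# A one-line class/def header ending in a colon (an assumption: headers
# that wrap over several lines are not recognised).
HEADER = re.compile(r"^\s*(?:class|def)\b.*:\s*$")
# Triple quotes must be tried before single quotes.
QUOTES = ('"""', "'''", '"', "'")

SAMPLE = '''\
class Greeter:
    """Greets people.

    Politely.
    """

    def greet(self, name):
        "Return a greeting."
        return "Hello, " + name

    def shout(self, name):
        return name.upper()
'''


def heuristic_docstring_lines(source):
    """Count lines of strings that directly follow a class/def header."""
    lines = source.splitlines()
    total = 0
    i = 0
    while i < len(lines):
        if HEADER.match(lines[i]) and i + 1 < len(lines):
            body = lines[i + 1].strip()
            quote = next((q for q in QUOTES if body.startswith(q)), None)
            if quote is not None:
                # Walk forward until the closing delimiter shows up.
                j = i + 1
                rest = body[len(quote):]
                while quote not in rest and j + 1 < len(lines):
                    j += 1
                    rest = lines[j]
                total += j - i
                i = j
        i += 1
    return total


print(heuristic_docstring_lines(SAMPLE))
```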
Docstrings are generally the most helpful comments in the code; actual comments tend to be things like licenses, notes about tricky bits of the implementation, fixmes, and, worst of all, commented-out code. If you count just those and not docstrings, you might well be getting the whole thing backwards.
Just wanted to put in another vote for making this change happen. I had assumed that the "Few source code comments" flag was just a backhanded compliment to Python's high readability :)
Actually, it might be interesting to make the source-code-comments rating relative to other projects in the same (primary) language.
Just throwing out another +1 for this feature. It would really help the accuracy of reporting for Python projects. I work on a few codebases which have thousands of functions without comments but with docstrings; implying this is the same as little documentation by reporting "few source code comments" on the project page is quite misleading.
+1 from me too.
+1
+1 too
Greetings y'all.
I'm sorry this hasn't been done yet - we're swamped for features/support and we're continually "behind" everything we want to see done on ohloh.
However, all is not lost. I wanted to remind everyone that since this original post, we've released the source code parser as open source.
A docstring aficionado could clone the git repo (git://labs.ohloh.net/git/ohcount.git or http://wiki.github.com/robinluckey/ohcount). Find us on our irc channel (irc.freenode.net#ohloh) for code questions/ideas.
When a proper patch is submitted, we'll integrate it into ohloh asap. Cheers!
I implemented this over the summer; it appears to work fine for me, what seems to be the problem?
I suspect the recent +1 voters simply aren't aware that this has already been implemented.... am I correct?