Hidden Old History: Bug?

Kevin Deldycke

over 3 years ago

Hi!

First, I would like to congratulate the Ohloh team on their work. I think Ohloh is a wonderful tool that can serve the community in multiple ways. Good job!

Then, my problem:

Ohloh just finished calculating the statistics for the ERP5 project ( http://www.ohloh.net/projects/4282/analyses/latest ). You can see that the history starts in 2006. But when you browse the project's SVN repository, you can clearly see that the first files were committed in 2002 (example: http://svn.erp5.org/erp5/trunk/products/ERP5/VERSION.txt?view=log ).

Actually, it appears that Ohloh is not able to parse log data from before Mar 24 2006, which is the date our repository was migrated from CVS to SVN. This may explain the bug.

Now, do you have any idea why our ViewCVS web interface is able to calculate the history while Ohloh is not?


Robin Luckey

over 3 years ago

Hi Kevin,

Thanks for the kind words. I wish I had better news for you in response.

With Subversion repositories, Ohloh examines only the current development line. If files are moved to or from other branches, Ohloh will not see what happens to those files on the branches.

If you create and edit files on a branch, then copy them to the trunk, we do not see that early editing. It looks to Ohloh as if the file was created fully formed on the day it was moved into the trunk.

On 24 Mar 2006, it looks like all of the files in this project were moved from trunk/ERP5 to erp5/trunk. From Ohloh's perspective, this is equivalent to creating a new branch called erp5/trunk.

On this repository, Ohloh measures only the erp5/trunk line, so all of the prior activity on the trunk/ERP5 line was ignored.

This isn't exactly a bug -- we do this on purpose. We fetch the Subversion log using the --stop-on-copy option. We do this because in Subversion it's delightfully easy to create a new branch of the code, so some projects have literally thousands of branches. The problem this creates for us is that each branch is potentially as large as the trunk itself, which makes our processing and storage requirements orders of magnitude larger. On our end, it's very hard to know which branches we are supposed to be following at each point in history. We had to make a hard decision to follow only the current trunk.
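To make the effect of --stop-on-copy concrete, here is a minimal Python sketch of a log walk over a toy revision list. The `Revision` record, the `walk_log` helper, and the sample revision numbers and dates are hypothetical illustrations, not Ohloh's code or Subversion's implementation. Only the two path names and the 24 Mar 2006 move come from this thread.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Revision:
    number: int
    date: str
    path: str                           # line of development the commit landed on
    copied_from: Optional[str] = None   # set when this commit created `path` by copy/move


# Toy history: numbers and dates are made up, except the 2006-03-24 move.
HISTORY = [
    Revision(1, "2002-06-01", "trunk/ERP5"),
    Revision(2, "2004-01-15", "trunk/ERP5"),
    Revision(3, "2006-03-24", "erp5/trunk", copied_from="trunk/ERP5"),
    Revision(4, "2007-03-08", "erp5/trunk"),
]


def walk_log(history: List[Revision], path: str, stop_on_copy: bool) -> List[Revision]:
    """Return the revisions reachable from `path`, newest first."""
    seen = []
    current = path
    for rev in sorted(history, key=lambda r: r.number, reverse=True):
        if rev.path != current:
            continue
        seen.append(rev)
        if rev.copied_from is not None:
            if stop_on_copy:
                break                   # history ends at the copy
            current = rev.copied_from   # otherwise follow the copy source
    return seen


if __name__ == "__main__":
    for flag in (True, False):
        revs = walk_log(HISTORY, "erp5/trunk", stop_on_copy=flag)
        print(flag, [r.date for r in revs])
```

With `stop_on_copy=True` the walk ends at the revision that created erp5/trunk, so nothing committed on trunk/ERP5 is reported. On the command line, the same difference is what separates `svn log --stop-on-copy` from a plain `svn log` on that path.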

We may revisit this problem in the future, but for now storage and compute limitations require us to do this.

One option that might improve the report for this project would be to import the trunk/ERP5 development line in addition to the erp5/trunk line -- effectively treating these as two separate Subversion repositories. You'd then be able to see all of the older history. Be aware, however, that on Mar 24 2006, it's going to look as if everyone on the project deleted all of their code, and then someone else on the team (you?) re-added all of that code.


Kevin Deldycke

over 3 years ago

Thanks, Robin, for your very fast and detailed answer!

> With Subversion repositories, Ohloh examines only the current development line. If files are moved to or from other branches, Ohloh will not see what happens to those files on the branches.

OK, I understand. This is quite sad, because in SVN, copy and move are quite similar. So if you rename a file (= "move to another name"), Ohloh will lose its history.

> One option that might improve the report for this project would be to import the trunk/ERP5 development line in addition to the erp5/trunk line -- effectively treating these as two separate Subversion repositories.

That's a good idea!

> Be aware, however, that on Mar 24 2006, it's going to look as if everyone on the project deleted all of their code, and then someone else on the team (you?) re-added all of that code.

It's not that bad. Mar 24 2006 will just appear as a singularity, nothing else.


Kevin Deldycke

over 3 years ago

> One option that might improve the report for this project would be to import the trunk/ERP5 development line in addition to the erp5/trunk line -- effectively treating these as two separate Subversion repositories.

Unfortunately, this doesn't work!

I don't understand why, but it looks like "svn.erp5.org/repos/public/trunk/ERP5" doesn't exist:

    [user@local ~]$ svn info https://svn.erp5.org/repos/public/erp5/trunk
    Path: trunk
    URL: https://svn.erp5.org/repos/public/erp5/trunk
    Repository Root: https://svn.erp5.org/repos/public
    Repository UUID: 20353a03-c40f-0410-a6d1-a30d3c3de9de
    Revision: 13297
    Node Kind: directory
    Last Changed Author: vincent
    Last Changed Rev: 13297
    Last Changed Date: 2007-03-08 19:12:28 +0100 (Thu, 08 Mar 2007)

works, but the following does not:

    [user@local ~]$ svn info https://svn.erp5.org/repos/public/trunk/ERP5
    https://svn.erp5.org/repos/public/trunk/ERP5:  (Not a valid URL)

Has anyone had a similar issue?


Cetin Sert

over 3 years ago

I have had this issue fairly recently with my own project [Tenka Text]. I moved everything from root to trunk and the whole history got lost on ohloh. o_O