Yes, I'm afraid that the Gentoo Linux download has not been cooperative, and it may be too large to succeed with our current download strategy. Our system is brute force: first we fetch the log and parse the revision history, and then we download each patch set from the CVS server, one at a time.
For Gentoo, there are over 381,000 patch sets to download. That's very large -- the largest project on our system to date. The responses from the CVS server are not very fast. Even if everything worked perfectly, this could take several weeks for us to complete.
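To put rough numbers on that, here is a back-of-the-envelope sketch. The 381,000 figure is the one above; the per-patch-set timings are purely assumed values for illustration, not measurements from the Gentoo CVS server.

```python
# Rough estimate of how long a one-at-a-time download would take.
# PATCH_SETS comes from the post; the per-patch-set timings below are
# assumptions for illustration, not measured server response times.
PATCH_SETS = 381_000
SECONDS_PER_DAY = 86_400


def estimated_days(patch_sets: int, seconds_per_patch_set: float) -> float:
    """Total wall-clock days if every download succeeds on the first try."""
    return patch_sets * seconds_per_patch_set / SECONDS_PER_DAY


if __name__ == "__main__":
    for secs in (1, 5, 10):  # hypothetical average seconds per patch set
        days = estimated_days(PATCH_SETS, secs)
        print(f"{secs:>2} s per patch set -> {days:.1f} days")
```

At an assumed 5 seconds per patch set this works out to roughly 22 days, which is where "several weeks" comes from, and that is before any timeouts or retries.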
The bad news is that because the project is so big, and the history is so long, many of our downloads are timing out, either because of CVS server slowdowns or networking issues. I know that reaching that far back into the past on such a big project can be very hard on a server.
Our system can recover from an interruption due to a networking or server problem, but this slows us down to the point that if it happens too many times, it may take so long that we never catch up.
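For anyone curious what that kind of recovery looks like, here is a minimal sketch, not our actual code. The names `download_all`, `fetch`, and `done` are hypothetical: `fetch` stands in for whatever pulls a single patch set from the CVS server, and `done` is the set of patch sets already stored from earlier runs.

```python
import time
from typing import Callable, Iterable, List, Set


def download_all(
    patch_ids: Iterable[int],
    fetch: Callable[[int], None],   # hypothetical: downloads one patch set
    done: Set[int],                 # patch sets already fetched in earlier runs
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[int]:
    """Fetch every patch set not yet in `done`; return the ids that kept failing."""
    failed: List[int] = []
    for pid in patch_ids:
        if pid in done:
            continue  # resume point: skip work finished before the interruption
        for attempt in range(max_retries):
            try:
                fetch(pid)
            except (TimeoutError, ConnectionError):
                # Back off before the next attempt: 1x, 2x, 4x the base delay.
                sleep(base_delay * 2 ** attempt)
            else:
                done.add(pid)
                break
        else:
            # Every attempt failed; record it so a later run can try again.
            failed.append(pid)
    return failed
```

Each retry costs waiting time on top of the download itself, so with hundreds of thousands of patch sets even a modest failure rate adds up quickly. That is the slowdown I'm describing.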
Some very large, old projects have been successfully imported into our system, but these projects are evidently using extremely powerful CVS servers.
The real issue is catching up on all of the old history. Once we've done that, keeping up with daily changes would be relatively easy.
To succeed with a project of this size, I think the solution is to create our own mirror of the Gentoo CVS server on one of our local machines. I know that some projects such as OpenBSD allow you to create your own source control mirror using CVSup. This would be a much more efficient way to pull the entire history than the one-diff-at-a-time brute force method we use. Does Gentoo offer this? It's not something that we currently support, but to get the very largest projects into our system it's something we're considering.
Until then, I'm happy to keep scheduling Gentoo downloads, but when I see a server timeout error after 20 of 381,000 downloads, I get a bit pessimistic. I have to admit we might have to let it wait until we have time to implement the local CVS mirror feature.
I added the SVN resources, which might be easier to fetch given the problems you (Robin) mentioned. No failures so far. Something else to note: "Additionally the following CVS commands are blocked for safety and security reasons: Kerberos-encrypt Gssapi-encrypt Gssapi-authenticate add remove admin import init history watch-on watch-off watch-add watch-remove watchers editors edit version tag rtag checkin" (quoted from http://anoncvs.gentoo.org/)
Hi Robin, we'd be interested in supplying you with the data you require about Gentoo; if you're interested get in touch with robbat2 [at] gentoo [dot] org directly :)
That's great news!
Once we manage to get some relative peace on our schedule we'll get going on this. Thanks!
Any news here so far?
I meant to ask about the status of this, in case that wasn't clear. Just tell us what we can do on our end.
Any news regarding the Gentoo CVS import?
Copyright © 2013 Black Duck Software, Inc. and its contributors, Some Rights Reserved. Unless otherwise marked, this work is licensed under a Creative Commons Attribution 3.0 Unported License. Ohloh® and the Ohloh logo are trademarks of Black Duck Software, Inc. in the United States and/or other jurisdictions. All other trademarks are the property of their respective holders.