Ohloh Blog rss feed




Beginning today, we're trying out a new method of importing Subversion repositories. We'll be using this new method on a limited number of projects, and as the code proves itself we'll gradually phase it in for all Subversion projects.

If you see an enlistment on your project marked "Subversion (Sync Beta)", then your project is part of this experiment. Please let us know if you notice anything unusual. We expect the new project reports to be identical to the previous ones.

For those who are curious, here's a description of what's changing.

Internally, Ohloh currently stores all project source in Git repositories, and project reports are all prepared based on these Git repositories. This means that source code stored in other formats (Subversion or CVS) must be converted to Git. Ohloh uses an in-house tool to convert a single Subversion or CVS branch to Git.

While this converter has been extremely reliable for us, it has a lot of limitations in its design, and it's also painfully slow.

We're trying a new strategy now: we will begin storing Subversion repositories in their native format. We'll be using the svnsync command to create local mirrors of entire Subversion repositories.

This has a lot of great benefits:

  1. It's much faster, which means more frequent project report updates for you and less server maintenance for us.
  2. Ohloh's Subversion converter follows only a single branch, and it cannot follow directory renames (the infamous --stop-on-copy limitation that causes so much forum traffic). The use of svnsync removes these limitations. Importantly, this does not mean that projects on Ohloh with missing history will suddenly fix themselves -- this is just the first step towards that goal.
  3. This code introduces a new abstraction layer in the Ohloh architecture. We now allow multiple native source code formats on our servers. This opens up the ability (finally) for us to add additional source control systems, like Mercurial.
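
For the curious, the mirroring step described above can be sketched roughly as follows. This is only an illustration, not Ohloh's actual import code: the function names, the source URL, and the mirror path are all made up for the example. It assumes the standard svnadmin and svnsync command-line tools are installed, and it relies on the fact that svnsync needs the mirror repository to permit revision property changes (via a pre-revprop-change hook).

```python
import subprocess
from pathlib import Path


def mirror_commands(source_url, mirror_dir):
    """Return the commands that create and fill a local svnsync mirror.

    The names and layout here are illustrative assumptions only.
    """
    # svnsync addresses the mirror by URL, so turn the local path into file://...
    mirror_url = Path(mirror_dir).resolve().as_uri()
    return [
        ["svnadmin", "create", mirror_dir],             # empty destination repository
        ["svnsync", "init", mirror_url, source_url],    # tie the mirror to its source
        ["svnsync", "sync", mirror_url],                # copy revisions across
    ]


def allow_revprop_changes(mirror_dir):
    """Install a pre-revprop-change hook that always succeeds (Unix shell hook)."""
    hook = Path(mirror_dir) / "hooks" / "pre-revprop-change"
    hook.write_text("#!/bin/sh\nexit 0\n")
    hook.chmod(0o755)


def create_mirror(source_url, mirror_dir):
    """Run the whole sequence; raises CalledProcessError if any step fails."""
    create, init, sync = mirror_commands(source_url, mirror_dir)
    subprocess.run(create, check=True)
    allow_revprop_changes(mirror_dir)   # must exist before "svnsync init"
    subprocess.run(init, check=True)
    subprocess.run(sync, check=True)


if __name__ == "__main__":
    # Hypothetical source URL and mirror location, for illustration only.
    for cmd in mirror_commands("http://svn.example.org/repo", "/tmp/mirror"):
        print(" ".join(cmd))
```

Running the sync command again later picks up only the revisions added since the previous run, which is part of why a mirror like this is cheap to keep up to date.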

Most software doesn't survive very long. The hard truth is that more than 80% of the open source software being written today will be forgotten in a few years.

For those projects that do succeed and thrive, the developers typically decide at some point that they need a new source control system. For many reasons (lack of time, lack of tools, or a simple desire to start fresh), most projects simply throw away their development history at this point and start again.

All of which means that most source control repositories are lucky to survive more than a couple of years.

However, there's a class of meticulous, responsible, (obsessed?) programmer that somehow manages to keep the same thread of development alive and unbroken for decades.

Among the 14,000 repositories Ohloh is currently tracking, here are the three oldest open source repositories -- all three of which are still under continuous development today. [And if we're missing something, please let us know!]

Number 3: GCC, the GNU Compiler Collection - View Contributors

Started in November 1988

This building block of the open source world is all the more impressive for surviving a conversion from RCS through CVS and into Subversion. Ohloh found 350 developers listed in this repository, two of whom have over 10 years of development experience.

Number 2: GNU Emacs - View Contributors

Started in April 1985

No matter which side of the Emacs/Vim debate you come down on, you have to hand it to this team for keeping their CVS repository alive for over 22 years. And yes, rms is still hacking away after 20,000 commits. Impressive.

Number 1: BRL-CAD - View Contributors

Started in April 1983

When this open source repository was getting started, I was saving my pennies for a Sinclair ZX-81 kit. That computer and all its code are long gone, but BRL-CAD marches on. Not surprisingly, BRL-CAD developers were very helpful in wringing obscure bugs out of the Ohloh CVS parser [thanks sean!].


I admit to being amazed when I take a peek at Alexa and compare our pageviews with other web sites that focus on open source software. To be sure, we are nowhere near SourceForge, the granddaddy of all open source sites, but we are fast approaching or have surpassed the pageviews of FreshMeat, OSDir and Swik.

Why is this? We finally (sigh!) realized that finding out more about the people creating and using open source projects is a fascinating thing. Each time we ship a feature that reveals more about the people who create and use open source software, our traffic goes up.

Can't argue with that feedback! But perhaps some of you have a different opinion. Let me know if you do.

--Scott

written by Jason Allen
aug 14 2007

One of the most difficult challenges of running Ohloh is deciding which, among the plethora of great ideas, to implement next. You (our community) are continually suggesting very interesting ideas - as well as requesting reasonable bug fixes. With our current resources, I figure we could spend the next year just working on the stuff you've proposed already - without taking into account the suggestions that keep on coming.

Beyond that, we also have some "internally"-motivated features (money!). Actually, it's not just money - we also want to push forward with our vision of where we see Ohloh going.

All of which leaves us with agonizing choices to make. Fix a bug? Implement a requested feature? Or push Ohloh forward?

As a partial remedy to this problem, we're going to open up our service so that anyone can leverage our data and build on it. This will:

  • enable integration of Ohloh data into any widgets, apps, etc.
  • address the problem that Ohloh is currently a data black hole (difficult to get data out)

So far, we are planning on creating a REST API on top of Ohloh, making it possible to get information on projects, accounts, stacks, etc. We would support XML and JSON out/in and also require a free API key that would limit the number of requests a day (to keep our servers responsive).
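
To make the idea a little more concrete, here is a rough sketch of what a client-side request might look like. Since the API is still being designed, everything in it is an assumption made up for illustration: the base URL, the URL layout, the `api_key` parameter name, and the quota helper are not a real or promised interface. It only shows the general shape: pick a resource, pick XML or JSON, attach your key, and stay under a daily request limit.

```python
from datetime import date
from urllib.parse import urlencode

# Hypothetical base URL; no real endpoint has been announced.
BASE_URL = "https://api.example.org"


def build_request_url(resource, item_id, api_key, fmt="xml"):
    """Build a request URL for one resource (URL layout is an assumption)."""
    if fmt not in ("xml", "json"):
        raise ValueError("format must be 'xml' or 'json'")
    query = urlencode({"api_key": api_key})  # handles escaping of the key
    return f"{BASE_URL}/{resource}/{item_id}.{fmt}?{query}"


class DailyQuota:
    """Client-side counter for a per-day request limit (illustrative only)."""

    def __init__(self, limit):
        self.limit = limit
        self.day = None
        self.used = 0

    def allow(self, today):
        """Return True and count the request if today's quota is not used up."""
        if today != self.day:   # first request of a new day resets the counter
            self.day = today
            self.used = 0
        if self.used >= self.limit:
            return False
        self.used += 1
        return True


if __name__ == "__main__":
    quota = DailyQuota(limit=1000)  # made-up limit
    if quota.allow(date.today()):
        print(build_request_url("projects", 42, "YOUR_API_KEY", fmt="json"))
```

The quota class does nothing more than count requests per calendar day; the real limit would be enforced on the server against the API key.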

What could one do with this?

I'm hoping you'll tell us. Off the top of my head, I imagine people might want to

  • track who has kudo'd them
  • write a DOAP converter
  • resurface project analytics into their own websites
  • enrich a user identity with Ohloh stats
  • derive their own data, such as the most popular language in the UK

While we're still designing this feature, now is the time to pipe up and provide feedback and concerns. Thanks in advance!


As a coder by profession, I think I'm not alone when I say that I don't see the appeal in using most social networks.

They require lots of input (who are you? what do you do? where do you work? what do you like? who are your friends? don't you have more friends - please - add more friends!).

Then, in return for all this input, you get very little - mostly you spend more time wondering if you should accept someone else as a friend.

Here at Ohloh we often get asked: "Why doesn't Ohloh focus on being the 'myspace/facebook/linkedin' of open source developers?"

The answer is two-fold:

Yes - we are heading there. We do want to be the place where open source developers connect. However...

No - we're not going to just jam in random (often annoying) social networking features.

Our strategy is simple: we're going to attempt to reverse the value equation: you will tell Ohloh relatively little about yourself (basic bio and what software you create and use) - and, in return, we'll give you more and more relevant info out of it.

So far we've covered:

  • we provide automated metrics on software projects
  • we give you stats on your own development
  • we provide relevant suggestions on what other software you should consider using

As usual, our problem is what to focus on first. This is where you come in. Let's start discussing where you'd like Ohloh to grow in this space - and, if you must, where you'd like us to not go. Fire away...