Today the de facto standard for (open source) versioning systems seems to be Subversion, but it wasn’t always like that. Up until a couple of years ago most projects were using CVS.
CVS has some limitations though. It doesn’t version the moving or renaming of files or directories. In the world of Java, where refactoring is common, this is quite a big issue: a single refactoring action can move complete directory trees. Clearly CVS isn’t up to this.
Another issue with CVS is that it doesn’t offer atomic commits. An atomic commit means that when a user commits a change to the repository, either the whole change is applied or it is rolled back completely. If a network connection goes down during a commit, a CVS repository can be left partially changed, often leaving the code in an unusable state. And when one user updates a local working copy while another user is committing a change, the first user can retrieve a partial commit from the CVS server.
Branches are also an issue with CVS. In the CVS methodology you do most of the work on the trunk, and branches should generally be short-lived. In practice your repository becomes slower and slower as you create more branches.
The development of CVS started in the eighties, and at the turn of the century its age was really starting to show. Early in 2000 CollabNet started to look for developers to write a replacement for CVS. After fourteen months of coding, Subversion became “self-hosting” in August 2001. In February 2004 version 1.0 was released.
Subversion’s most important improvements lie in the areas where CVS ‘failed’. It provides a mechanism for moving and copying files without losing their history. Copies are very cheap, and because a branch is simply a copy, branches are cheap too. Commits are atomic. There are other improvements as well: Subversion can, for example, store symbolic links (since version 1.1), and it handles binary files better than CVS does. Apart from versioning symbolic links, versions 1.0 to 1.4 mostly implemented basic functionality or fixed bugs.
In June 2008 version 1.5 of Subversion was released. Its two biggest additions are merge tracking and sparse checkouts. Sparse checkouts let you check out just part of a repository, which can be helpful in large repositories. Similar functionality already existed with the -N option on checkouts, but in 1.5 it has been greatly refined. I will go a little deeper into merge tracking in the following paragraphs.
When you work in an environment that demands a lot of flexibility from your versioning system, you will end up with a lot of branches that are active at the same time. You might start several branches for the same release because not all features will be ready at the same time. This way you can easily keep unfinished features out of your next shipment. Meanwhile developers might also be working on other projects on the same code base, meaning even more branches. All in all there are a lot of branches to keep track of, a whole lot of merges to be done, and the status of those merges to be recorded for later use. This is where version 1.5 really helps out, because as of this release your merge activities are stored in the repository. Mind you, merge tracking in 1.5 is still foundational: the basic functionality is implemented, but parts of the specification still have to be done. Merging can also still be a bit slow at times. Merge tracking works in the following way.
Let’s say you have a project in Subversion and you’ve made a feature branch called feature-branch. You’ve built some great new features in that branch and would now like to catch up with the trunk, so that you get the latest bug fixes or the functionality from another branch that has been merged to the trunk. Let’s assume you are already in the directory that holds your working copy of the feature branch. You would do the catch-up in the following way:
svn merge http://server/sources/my-great-app/trunk
I’ll explain what happened here. The svn merge command merges the changes made on the trunk of the my-great-app module, in the repository called sources, into the working copy of our branch.
So what’s so great about this? Well, you don’t have to specify which revisions to merge. You give this merge command the first time you catch up with the trunk, and when you later want to catch up again you just give the same command.
So how does Subversion know what to merge? When you merge, Subversion sets a property called svn:mergeinfo. To see the value of this property after a catch-up, give the following command in the root of the working copy of your branch:
svn propget svn:mergeinfo
The result might look something like this:
/my-great-app/trunk:2-43
This means that the changes from revisions 2 to 43 have been merged into your working copy.
Merge tracking as it is (going to be) implemented in Subversion brings three benefits. First of all, it reduces the bookkeeping overhead of branch maintenance, because the bookkeeping is now done automatically. There is no need to write in your commit message which branch and which revisions you merged. No more forgetting to do it and then having to search through the repository log to find out what exactly you did earlier.
Secondly, it avoids common cases of the ‘repeated merge’ problem. Because Subversion keeps track of what you have merged, you won’t merge the same revisions more than once.
And thirdly, you can merge in a set of preferred revisions to get just the features you want, without having a problem later when you catch up with the rest of the revisions. This is called cherry picking. It is used, for example, to get particular bug fixes for the parts of the code you are working on into your branch.
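The second and third benefits come down to simple set bookkeeping. The Python sketch below is only a model of that idea, not Subversion's actual implementation: the function name eligible_revisions and all revision numbers are made up for illustration, and it assumes every revision in the range touched the trunk.

```python
def eligible_revisions(source_revisions, merged_revisions):
    """Return the source revisions that have not been merged yet, in order."""
    merged = set(merged_revisions)
    return [rev for rev in source_revisions if rev not in merged]


# Hypothetical history: trunk has changes in revisions 2..50.
trunk_revisions = list(range(2, 51))

# Recorded merge history on the branch: an earlier catch-up covered 2..43,
# and revision 47 (say, a bug fix) was cherry-picked afterwards.
merged = set(range(2, 44)) | {47}

# The next catch-up only has to consider what is left, so neither the
# earlier catch-up nor the cherry-picked revision is merged a second time.
remaining = eligible_revisions(trunk_revisions, merged)
print(remaining)  # [44, 45, 46, 48, 49, 50]
```

Because the merge history is recorded, the same catch-up command can be repeated safely: whatever is already in the merged set is skipped.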
If you merged from several branches the result of a propget command on svn:mergeinfo might look like this:
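Such a value holds one line per merge source, each of the form path:revisions, where the revisions are a comma-separated list of single revisions and inclusive ranges. As a rough sketch of how such a value can be read, the Python snippet below parses a hypothetical one. The branch name other-feature and all revision numbers are invented, and parse_mergeinfo is a helper written for this illustration, not part of Subversion.

```python
def parse_mergeinfo(text):
    """Parse an svn:mergeinfo value into {source_path: sorted revision list}."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # The path and the revision list are separated by the last colon.
        path, _, ranges = line.rpartition(":")
        revisions = set()
        for item in ranges.split(","):
            # A trailing '*' marks a non-inheritable range; this sketch
            # drops the marker and keeps only the revision numbers.
            item = item.strip().rstrip("*")
            if not item:
                continue
            if "-" in item:
                start, end = item.split("-")
                revisions.update(range(int(start), int(end) + 1))
            else:
                revisions.add(int(item))
        result[path] = sorted(revisions)
    return result


# Hypothetical value: a catch-up from trunk plus merges from an invented branch.
sample = """\
/my-great-app/branches/other-feature:45-48,52
/my-great-app/trunk:2-43
"""

info = parse_mergeinfo(sample)
for path, revs in info.items():
    print(path, "first:", revs[0], "last:", revs[-1], "count:", len(revs))
```

Each source path keeps its own revision list, which is what lets Subversion tell merges from different branches apart.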
Merge tracking is really a valuable addition to Subversion. It makes the life of trunk masters a bit easier and less knowledgeable users of Subversion can now do merge chores with a smaller risk that they make mistakes. And when a mistake has been made with merges it’s easier to find out what exactly has been merged into your branch. This all adds to the quality of your repository, so there are good reasons to upgrade to 1.5 if you haven’t already done so.
Marco van Os