Using Subversion at CSAIL

What is Subversion?

According to the Subversion home page, it is "a compelling replacement for CVS." Subversion is a version control system that was built with CVS users in mind, but it addresses many of the limitations and annoyances of CVS. It allows you to manage versioning on directories, for example, where CVS does not. Subversion has module-based support for integration with Apache2, and can leverage its authentication and access control models to allow people who don't have local user accounts to collaborate with local users.

The topic of version control is outside the scope of this document, as are the nitty-gritty underpinnings of Subversion. For more info on these topics, please read the SVN Book.

How do I create a Subversion repository at CSAIL?

1. Determine where to put your repositories.

All repositories should be created in the appropriate group, project, or personal directory in the CSAIL AFS cell. If your repository is for use across your entire research group, we suggest that you create a directory called REPOS in the top-level of your group's directory, e.g., /afs/csail.mit.edu/group/tig/REPOS. If your repository is for more personal projects, keeping it in REPOS in your home directory makes sense.

2. Create the empty repository structure.

To do anything useful with Subversion, you need to have the subversion package installed on your workstation. It is installed by default on all CSAIL Debian workstations. For other operating systems, you'll need to consult the Subversion Project Packages web page.

You will need to create an empty repository structure; later you can import files into it. IMPORTANT: Old versions of Subversion (1.1.4 and earlier) will by default attempt to create their "housekeeping" files in the BerkeleyDB format, which does not work properly on a network file system like AFS or NFS. With those versions, YOU MUST use the alternate backend, which is called FSFS. Newer versions use the FSFS backend by default. In the following example, we will create an empty repository at /afs/csail.mit.edu/group/tig/REPOS/test_repo:

mpearrow@shaggy:~$ cd /afs/csail/group/tig/
mpearrow@shaggy:/afs/csail/group/tig$ mkdir REPOS
mpearrow@shaggy:/afs/csail/group/tig$ cd REPOS/
mpearrow@shaggy:/afs/csail/group/tig/REPOS$ ls
mpearrow@shaggy:/afs/csail/group/tig/REPOS$ svnadmin create --fs-type=fsfs ./test_repo
mpearrow@shaggy:/afs/csail/group/tig/REPOS$ ls
test_repo
mpearrow@shaggy:/afs/csail/group/tig/REPOS$ cd test_repo/
mpearrow@shaggy:/afs/csail/group/tig/REPOS/test_repo$ ls
README.txt  conf  dav  db  format  hooks  locks
mpearrow@shaggy:/afs/csail/group/tig/REPOS/test_repo$ 

You will doubtless read all sorts of warnings about not running Subversion repositories on network file systems. And, in fact, it is a really bad idea to do so, unless you are using the FSFS backend. As long as you always remember to use the --fs-type=fsfs flag when you create repositories, you'll be all set.
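If you want to double-check which backend a repository actually ended up with, the db/fs-type file inside the repository records it. The following is a minimal sketch; repo_backend is a made-up helper name, not a Subversion command.

```shell
# Print the storage backend of a Subversion repository ("fsfs" or "bdb").
# repo_backend is a hypothetical helper name, not part of Subversion.
repo_backend() {
    repo=$1
    if [ -r "$repo/db/fs-type" ]; then
        # The file holds a single word naming the backend.
        tr -d '[:space:]' < "$repo/db/fs-type"
        echo
    else
        echo "unknown"
        return 1
    fi
}

# Example: repo_backend /afs/csail/group/tig/REPOS/test_repo
```

Anything other than fsfs on a repository that lives in AFS deserves a closer look.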

3. Import stuff into the repository.

Let's say you have an existing codebase tree in your home directory, in a subdirectory called project_x. You will need to do an initial import of your code into Subversion's format. The following example shows us importing that code base into the newly created repository.

mpearrow@shaggy:~/Work$ ls
project_x
mpearrow@shaggy:~/Work$ svn import ./project_x/ file:///afs/csail/group/tig/REPOS/test_repo/ --message "initial import of project_x"
Adding         project_x/foo
Adding         project_x/bar
Adding         project_x/pub
Adding         project_x/pub/beer
Adding         project_x/baz

Committed revision 1.
mpearrow@shaggy:~/Work$ 

The svn import command above tells Subversion to take the contents of your local directory "project_x" and import it into the new, empty repository. Note that the path to the repository is a "file" URL. We can use this method since the AFS filesystem is, within our context, a "local" filesystem. If it had been a filesystem on a remote server, we would have had to use a different access method (which we'll discuss later in this document).
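The three slashes in a "file" URL are not a typo: the URL has an empty host part, followed by the absolute path to the repository. As a small sketch (file_url is a hypothetical helper, not an svn command), the URL can be built from a path like this:

```shell
# Build a file:// URL from an absolute repository path.
# file_url is a hypothetical helper, not an svn command.
file_url() {
    case $1 in
        /*) printf 'file://%s\n' "$1" ;;
        *)  echo "file_url: need an absolute path: $1" >&2; return 1 ;;
    esac
}

file_url /afs/csail/group/tig/REPOS/test_repo
# prints file:///afs/csail/group/tig/REPOS/test_repo
```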

4. Make sure you can check out a copy of the repository.

Once you have created the repository, you will likely never deal directly with it again. All modifications to the repository are done by way of edits you make to local copies (called "working copies"). To get a working copy, you just need to check one out. The following example shows how to do that.

mpearrow@shaggy:~/Work$ cd  
mpearrow@shaggy:~$ mkdir Wark
mpearrow@shaggy:~$ cd Wark
mpearrow@shaggy:~/Wark$ ls
mpearrow@shaggy:~/Wark$ svn checkout file:///afs/csail.mit.edu/group/tig/REPOS/test_repo
A  test_repo/foo
A  test_repo/bar
A  test_repo/pub
A  test_repo/pub/beer
A  test_repo/baz
Checked out revision 1.
mpearrow@shaggy:~/Wark$ 

We now have a local working copy of the repository, and we can share the URL of the repository with our group mates so they can also check out a copy and begin working.
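If you forget which URL a working copy came from, svn info will tell you. Here is a rough sketch that pulls out just the URL; wc_url is a hypothetical helper, and it assumes svn info prints its usual English "URL:" line.

```shell
# Print the repository URL that a working copy was checked out from.
# wc_url is a hypothetical helper; it parses the "URL:" line of "svn info".
wc_url() {
    # LC_ALL=C asks svn for untranslated output so the "URL:" label matches.
    LC_ALL=C svn info "${1:-.}" | sed -n 's/^URL: //p'
}

# Example: wc_url ~/Wark/test_repo
```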

How would I do the same checkout on Windows?

You will first need to install AFS. Once you have done this, the procedure is almost the same, except that you use afs as the host in the URL:

c:\My Dir>svn checkout file://afs/csail.mit.edu/group/tig/REPOS/test_repo

If you have an appropriate drive letter mapped to AFS and it will stay mapped for the life of your checkout, you could alternatively use:

c:\My Dir>svn checkout file:///g:/tig/REPOS/test_repo

How do I change a working copy and update the repository?

There's no particular magic to editing a working copy; just use your favorite editor and hack away. When you're done, you will want to commit your changes. In the following example, we've edited the file pub/beer, and want to push the change to the repository.

mpearrow@shaggy:~/Wark/test_repo/pub$ svn commit -m "added more beer"
Sending        pub/beer
Transmitting file data .
Committed revision 2.
mpearrow@shaggy:~/Wark/test_repo/pub$ 
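
Before committing, it is often worth looking at what you are about to send. The sketch below wraps the standard svn status, svn diff, and svn commit subcommands; review_and_commit is a hypothetical name, and in practice you would probably read the diff before deciding to commit rather than run all three blindly.

```shell
# Show pending changes, then commit them with the given log message.
# review_and_commit is a hypothetical wrapper around standard svn subcommands.
review_and_commit() {
    msg=$1
    svn status || return 1      # which files are modified, added, or unversioned
    svn diff   || return 1      # the exact line-by-line changes
    svn commit -m "$msg"
}

# Example: review_and_commit "added more beer"
```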

How do I make sure I have the latest version of files in my working copy?

Easy; just use svn update. Is this all starting to feel a lot like CVS?

mpearrow@shaggy:~/Wark/test_repo$ svn update
U  bar
Updated to revision 3.

Hmmm, someone else went to the bar. Now I have the updated version of that file, and my working copy is at revision 3.

How do I do (whatever) in Subversion?

That's not really what this document is for, but the SVN Book will probably have the answers you are looking for.

If not, and you are a member of the CSAIL community, please join the CSAIL svn-users mailing list. If you don't know how to join a CSAIL mailing list, ask one of us or one of your group mates.

The rest of this document is only related to issues that are particular to our implementation here at CSAIL and will probably be of little use to anyone else.

Other access methods

So far, we've only seen how to use Subversion with the file: protocol, which expects to work on "local" filesystems. This actually works fine if all you are doing is working on files with your CSAIL colleagues, who all have CSAIL accounts. However, this isn't always adequate, especially if you need to collaborate with people outside of CSAIL. There are four different ways to access a Subversion repository:

  • Via the file:/// "protocol";
  • Via the svn custom client/server protocol;
  • Via a combination of the SVN program and an SSH tunnel (called svn+ssh);
  • Via Apache2 DAV, using the https:// protocol.

Each of these access methods has strengths and weaknesses. The most straightforward approach is to simply use the file: protocol, since any repository to which you have access (this is also discussed later) can be checked out, and if you have write privileges to the repository, updated from any machine that is attached to the CSAIL AFS cell. This is probably good enough for the majority of use, and in fact does not even require a dedicated server. All management of the repository is done by the client processes and the repository's housekeeping files.

Note: it is possible for you to share your repository via the file: protocol to people with Athena accounts, but no CSAIL account, by setting the AFS access control list for the repository so that their Athena instance has read or read-write privileges.
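
AFS access control lists are set per directory, so granting someone access to a repository means touching every directory inside it. The following is a sketch only: grant_repo_access is a hypothetical helper, and the exact principal name to use for an Athena user is an assumption you should confirm before running it.

```shell
# Grant AFS rights on every directory of a repository to one principal.
# grant_repo_access is a hypothetical helper; "fs sa" is the AFS setacl command.
grant_repo_access() {
    repo=$1 who=$2 rights=$3
    find "$repo" -type d | while IFS= read -r dir; do
        fs sa "$dir" "$who" "$rights" || exit 1
    done
}

# Example (the principal name is a placeholder): read-only access
# grant_repo_access /afs/csail/group/tig/REPOS/test_repo some_athena_user rl
```

Use rl for read-only access; read-write access needs more rights (the https setup later in this document uses rlidwk for the svn user).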

The svn custom protocol, as it turns out, is mostly useless for our purposes. It can only support global access settings, so setting fine-grained access control to repositories isn't possible. We won't consider this protocol for the remainder of our discussion.

svn+ssh can be useful if you are working on a computer that is not attached to the CSAIL AFS cell, but you do have a CSAIL login account. This combo opens an SSH tunnel to a CSAIL computer that has the Subversion package installed (login.csail.mit.edu is a good host to use for this purpose). To use this access method, just use a URL of the form

svn+ssh://login.csail.mit.edu/afs/csail/<path-to-your-repository>
For example:

magick:mpearrow> svn list svn+ssh://login.csail.mit.edu/afs/csail/group/tig/REPOS/test_repo
bar
baz
foo
pub/

In the example above, there was no prompt for a password since I had valid Kerberos credentials on my local machine and a Kerberos-aware version of ssh. As an aside, this method is really nice if you want to make backups of a laptop's documents (we'd really prefer that you not back up your OS and applications this way).
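
Checking out over svn+ssh works the same way as listing. A small sketch, where checkout_via_ssh is a hypothetical helper and the default host is simply the login host suggested above:

```shell
# Check out a repository through an SSH tunnel to a CSAIL login host.
# checkout_via_ssh is a hypothetical helper; the default host comes from the text above.
checkout_via_ssh() {
    repo_path=$1                      # absolute AFS path to the repository
    host=${2:-login.csail.mit.edu}
    svn checkout "svn+ssh://$host$repo_path"
}

# Example: checkout_via_ssh /afs/csail/group/tig/REPOS/test_repo
```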

Setting up https access

However, if you need to collaborate with someone who does not have a CSAIL account, you will need to use the https access method for Subversion, which makes the repository accessible as a WebDAV share. Once configured properly, your repository will be accessible from a URL of the form

https://svn.csail.mit.edu/coolNameGoesHere

Unlike the file: and svn+ssh access methods, this gives you the ability to, and requires that you, manage your own usernames and passwords. These live in a simple flat file, which is created with the htpasswd command. See AuthUserFile for more information. You can also define fairly complex, fine-grained access control based on the directory in the repository. See the Subversion Book for more information.
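
To make this concrete, here is a sketch of creating both files. The user names alice and bob, the file paths, and the helper name write_sample_authz are all made up for illustration; the access control file uses Subversion's path-based authorization format, where each section names a path in the repository and each line gives a user r or rw access.

```shell
# Write a sample path-based access control file.
# write_sample_authz, alice, and bob are hypothetical names.
write_sample_authz() {
    cat > "$1" <<'EOF'
[/]
alice = rw
bob = r

[/pub]
bob = rw
EOF
}

# Create the password file with its first user (htpasswd prompts for a password),
# then add a second user without -c so the file is not overwritten:
#   htpasswd -c /path/to/htpasswd alice
#   htpasswd /path/to/htpasswd bob
#   write_sample_authz /path/to/authz
```

In this sample, alice can write everywhere, while bob can read everything but only write under /pub.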

Once you've created your htpasswd file (and also an access control file, if so desired) you must give the SVN WebDAV server access to your repository. Change to the directory containing your repository and run

find . -type d -exec fs sa {} svn rlidwk \;

Note: even if the svn user already has appropriate access permissions, please make sure that your own effective AFS permissions on the repository directory (either through an AFS group or directly) are all (rlidwka). If they are not, the final step (the web script that creates the mapping) will fail.

Make sure that the htpasswd and optional access control files are in a location in AFS accessible by svn but not necessarily by other users.

Finally, to set up and enable the SVN WebDAV mapping, go to https://svn.csail.mit.edu:1443/admin/admin.cgi. Fill in your desired location and the path to your SVN repository. Type the paths to your htpasswd and/or access control files in the fields labeled "Password File" and "ACL File". Be sure to click "Use Authentication" if you want the server to check usernames and passwords! Once done, click commit and then try out your repository.

Important note about SVN repositories on network filesystems

Older versions of Subversion used BerkeleyDB as their back-end storage mechanism. BerkeleyDB doesn't mix well with network filesystems like AFS or NFS, and its use on these filesystems could result in data loss. We strongly recommend that you convert any repositories with the BerkeleyDB backend to the FSFS backend. You can find out which backend you're using by examining the db/fs-type file in your repository. If it contains the string bdb, you're using the BerkeleyDB format. Instructions for converting a bdb repository to fsfs are given in the following example:

  • correct any errors in place
svnadmin recover /afs/csail.mit.edu/group/rvsn/papers 

  • dump all svn actions to a log
svnadmin dump /afs/csail.mit.edu/group/rvsn/papers > svn.dump 

  • move existing repos out of the way
cd /afs/csail.mit.edu/group/rvsn/
mv papers papers.bdb

  • recreate repos, asking for FSFS explicitly (old versions default to BerkeleyDB)
svnadmin create --fs-type=fsfs papers

  • replay the log
svnadmin load /afs/csail.mit.edu/group/rvsn/papers < svn.dump

  • if everything worked
rm svn.dump
rm -rf papers.bdb
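
If you have several repositories to convert, the steps above can be strung together. The following is a cautious sketch rather than a tested tool: convert_to_fsfs is a hypothetical helper, placing the dump file next to the repository is an assumption, and the function deliberately leaves the final cleanup to you.

```shell
# Convert a BerkeleyDB repository to FSFS, keeping the original as <repo>.bdb.
# convert_to_fsfs is a hypothetical helper built from the svnadmin steps above;
# it stops at the first failure and never deletes anything.
convert_to_fsfs() {
    repo=${1%/}
    dump=$repo.dump                   # dump file next to the repository (assumption)

    # Only touch repositories that really use the BerkeleyDB backend.
    [ "$(cat "$repo/db/fs-type" 2>/dev/null)" = "bdb" ] || {
        echo "$repo is not a BerkeleyDB repository; nothing to do" >&2
        return 1
    }

    svnadmin recover "$repo"               || return 1
    svnadmin dump "$repo" > "$dump"        || return 1
    mv "$repo" "$repo.bdb"                 || return 1
    svnadmin create --fs-type=fsfs "$repo" || return 1
    svnadmin load "$repo" < "$dump"        || return 1

    echo "Converted. After verifying a checkout, remove $dump and $repo.bdb by hand."
}

# Example: convert_to_fsfs /afs/csail.mit.edu/group/rvsn/papers
```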