Using Subversion at CSAIL
What is Subversion?
According to the
Subversion home page, it is "a compelling replacement for CVS." Subversion is a version control system that was built with CVS users in mind, but it addresses many of the limitations and annoyances of CVS. It allows you to manage versioning on directories, for example, where CVS does not. Subversion has module-based support for integration with Apache2, and can leverage its authentication and access control models to allow people who don't have local user accounts to collaborate with local users.
The topic of version control is outside the scope of this document, as are the nitty-gritty underpinnings of Subversion. For more info on these topics, please read the
SVN Book.
How do I create a Subversion repository at CSAIL?
1. Determine where to put your repositories.
All repositories should be created in the appropriate group, project, or personal directory in the CSAIL AFS cell. If your repository is for use across your entire research group, we suggest that you create a directory called REPOS in the top-level of your group's directory, e.g.,
/afs/csail.mit.edu/group/tig/REPOS. If your repository is for more personal projects, keeping it in REPOS in your home directory makes sense.
2. Create the empty repository structure.
To do anything useful with Subversion, you need to have the subversion package installed on your workstation. It is installed by default on all CSAIL Debian workstations. For other operating systems, you'll need to consult the
Subversion Project Packages web page.
You will need to create an empty repository structure and later you can import files into it.
IMPORTANT: Old versions of subversion (1.1.4 and earlier) by default will attempt to create "housekeeping" files with the BerkeleyDB format, which will not work properly on a network file system like AFS or NFS.
With these versions, YOU MUST use the alternate backend, which is called
FSFS. All newer versions use the FSFS backend by default. In the following example, we will create an empty repository at
/afs/csail.mit.edu/group/tig/REPOS/test_repo:
mpearrow@shaggy:~$ cd /afs/csail/group/tig/
mpearrow@shaggy:/afs/csail/group/tig$ mkdir REPOS
mpearrow@shaggy:/afs/csail/group/tig$ cd REPOS/
mpearrow@shaggy:/afs/csail/group/tig/REPOS$ ls
mpearrow@shaggy:/afs/csail/group/tig/REPOS$ svnadmin create --fs-type=fsfs ./test_repo
mpearrow@shaggy:/afs/csail/group/tig/REPOS$ ls
test_repo
mpearrow@shaggy:/afs/csail/group/tig/REPOS$ cd test_repo/
mpearrow@shaggy:/afs/csail/group/tig/REPOS/test_repo$ ls
README.txt conf dav db format hooks locks
mpearrow@shaggy:/afs/csail/group/tig/REPOS/test_repo$
You will doubtless read all sorts of warnings about not running Subversion repositories on network file systems. And, in fact, it is a really bad idea to do so, unless you are using the FSFS backend. As long as you always remember to use the
--fs-type=fsfs flag when you create repositories, you'll be all set.
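One way to make the flag hard to forget is to wrap repository creation in a small shell function. The following is only a minimal sketch, not an official CSAIL tool: the function name create_fsfs_repo and the SVNADMIN override variable are hypothetical, and the final check assumes the db/fs-type file described near the end of this document.

```shell
# Command used to create repositories; override for testing (hypothetical variable).
SVNADMIN=${SVNADMIN:-svnadmin}

# create_fsfs_repo PATH
# Create a new repository at PATH, always asking for the FSFS backend,
# then confirm that FSFS is what actually ended up on disk.
create_fsfs_repo() (
    repo=$1
    if [ -e "$repo" ]; then
        echo "refusing to overwrite existing path: $repo" >&2
        exit 1
    fi
    "$SVNADMIN" create --fs-type=fsfs "$repo" || exit 1
    # db/fs-type records the backend the repository was created with.
    [ "$(cat "$repo/db/fs-type")" = "fsfs" ]
)
```

For example, create_fsfs_repo /afs/csail.mit.edu/group/tig/REPOS/test_repo would create the repository from the example above, and would return a failure status if the path already existed or the backend turned out not to be FSFS.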
3. Import stuff into the repository.
Let's say you have an existing codebase tree in your home directory, in a subdirectory called
project_x. You will need to do an initial import of your code into Subversion's format. The following example shows us importing that code base into the newly created repository.
mpearrow@shaggy:~/Work$ ls
project_x
mpearrow@shaggy:~/Work$ svn import ./project_x/ file:///afs/csail/group/tig/REPOS/test_repo/ --message "initial import of project_x"
Adding project_x/foo
Adding project_x/bar
Adding project_x/pub
Adding project_x/pub/beer
Adding project_x/baz
Committed revision 1.
mpearrow@shaggy:~/Work$
The
svn import command above tells Subversion to take the contents of your local directory "project_x" and import it into the new, empty repository. Note that the path to the repository is a "file" URL. We can use this method since the AFS filesystem is, within our context, a "local" filesystem. If it had been a filesystem on a remote server, we would have had to use a different access method (which we'll discuss later in this document).
4. Make sure you can check out a copy of the repository.
Once you have created the repository, you will likely never deal directly with it again. All modifications to the repository are done by way of edits you make to local copies (called "working copies"). To get a working copy, you just need to check one out. The following example shows how to do that.
mpearrow@shaggy:~/Work$ cd
mpearrow@shaggy:~$ mkdir Wark
mpearrow@shaggy:~$ cd Wark
mpearrow@shaggy:~/Wark$ ls
mpearrow@shaggy:~/Wark$ svn checkout file:///afs/csail.mit.edu/group/tig/REPOS/test_repo
A test_repo/foo
A test_repo/bar
A test_repo/pub
A test_repo/pub/beer
A test_repo/baz
Checked out revision 1.
mpearrow@shaggy:~/Wark$
We now have a local working copy of the repository, and we can share the URL of the repository with our group mates so they can also check out a copy and begin working.
How would I do the same checkout on Windows?
You will first need to install AFS. Once you have done this, the procedure is almost the same, except that you use
afs as the host in the URL:
c:\My Dir>svn checkout file://afs/csail.mit.edu/group/tig/REPOS/test_repo
If you have an appropriate drive letter mapped to AFS and it will stay mapped for the life of your checkout, you could alternatively use:
c:\My Dir>svn checkout file:///g:/tig/REPOS/test_repo
How do I change a working copy and update the repository?
There's no particular magic to editing a working copy; just use your favorite editor and hack away. When you're done, you will want to commit your changes. In the following example, we've edited the file pub/beer and want to push the change to the repository.
mpearrow@shaggy:~/Wark/test_repo/pub$ svn commit -m "added more beer"
Sending pub/beer
Transmitting file data .
Committed revision 2.
mpearrow@shaggy:~/Wark/test_repo/pub$
How do I make sure I have the latest version of files in my working copy?
Easy; just use svn update. Is this all starting to feel a lot like CVS?
mpearrow@shaggy:~/Wark/test_repo$ svn update
U bar
Updated to revision 3.
Hmmm, someone else went to the bar. Now I have an updated version of that file, and my working copy is at revision 3.
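Putting the last two answers together, a typical edit cycle is: look at what you changed, pull in everyone else's commits, then commit. A minimal sketch follows. The function name commit_reviewed and the SVN override variable are made up for illustration, while status, diff, update, and commit are standard svn subcommands.

```shell
# Client command to run; override for testing (hypothetical variable).
SVN=${SVN:-svn}

# commit_reviewed MESSAGE
# Run from inside a working copy. Stops at the first step that fails.
commit_reviewed() (
    message=$1
    if [ -z "$message" ]; then
        echo "usage: commit_reviewed MESSAGE" >&2
        exit 1
    fi
    "$SVN" status || exit 1          # which files did I touch?
    "$SVN" diff || exit 1            # what exactly did I change?
    "$SVN" update || exit 1          # merge in other people's commits first
    "$SVN" commit -m "$message"      # then send my changes to the repository
)
```

In the beer example above this would be invoked as commit_reviewed "added more beer". If the update step reports a conflict, resolve it by hand before committing.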
How do I do (whatever) in Subversion?
That's not really what this document is for, but the
SVN Book will probably have the answers you are looking for.
If not, and you are
a member of the CSAIL community, please join the CSAIL svn-users mailing list. If you don't know how to join a CSAIL mailing list, ask one of us or one of your group mates.
The rest of this document is only related to issues that are particular to our implementation here at CSAIL and will probably be of little use to anyone else.
Other access methods
So far, we've only seen how to use Subversion with the
file: protocol, which expects to work on "local" filesystems. This actually works fine if all you are doing is working on files with your CSAIL colleagues, who all have CSAIL accounts. However, this isn't always adequate, especially if you need to collaborate with people outside of CSAIL. There are four different ways to access a Subversion repository:
- Via the file:/// "protocol";
- Via the svn custom client/server protocol;
- Via a combination of the SVN program and an SSH tunnel (called svn+ssh);
- Via Apache2 DAV, using the https:// protocol.
Each of these access methods has strengths and weaknesses. The most straightforward approach is to simply use the
file: protocol, since any repository to which you have access (this is also discussed later) can be checked out and, if you have write privileges, updated from any machine that is attached to the CSAIL AFS cell. This is probably good enough for the majority of uses, and in fact does not even require a dedicated server. All management of the repository is done by the client processes and the repository's housekeeping files.
Note: it is possible for you to share your repository via the
file: protocol with people who have Athena accounts but no CSAIL account, by setting the AFS access control list for the repository so that their Athena instance has read or read-write privileges.
The
svn custom protocol, as it turns out, is mostly useless for our purposes. It can only support global access settings, so setting fine-grained access control to repositories isn't possible. We won't consider this protocol for the remainder of our discussion.
svn+ssh can be useful if you are working on a computer that is not attached to the CSAIL AFS cell, but you do have a CSAIL login account. This combo opens an SSH tunnel to a CSAIL computer that has the Subversion package installed (
login.csail.mit.edu is a good host to use for this purpose). To use this access method, just use a URL of the form
svn+ssh://login.csail.mit.edu/afs/csail/<path-to-your-repository>:
magick:mpearrow> svn list svn+ssh://login.csail.mit.edu/afs/csail/group/tig/REPOS/test_repo
bar
baz
foo
pub/
In the example above, there was no prompt for a password since I had valid Kerberos credentials on my local machine and a Kerberos-aware version of ssh. As an aside, this method is really nice if you want to make backups of a laptop's documents (we'd really prefer that you not back up your OS and applications this way).
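The file: and svn+ssh URLs for a repository differ only in their prefix, so both can be derived from the repository's AFS path. The helper below is a small sketch of that mapping. The function name repo_url and the SVN_SSH_HOST variable are hypothetical; the default host is the login.csail.mit.edu suggested above.

```shell
# repo_url METHOD PATH
# Print the URL for an absolute repository PATH using METHOD (file or svn+ssh).
repo_url() (
    method=$1
    path=$2
    case $path in
        /*) ;;
        *)  echo "repository path must be absolute: $path" >&2
            exit 1 ;;
    esac
    case $method in
        file)
            # Three slashes in total: "file://" plus the leading "/" of the path.
            printf 'file://%s\n' "$path" ;;
        svn+ssh)
            printf 'svn+ssh://%s%s\n' "${SVN_SSH_HOST:-login.csail.mit.edu}" "$path" ;;
        *)
            echo "unknown access method: $method" >&2
            exit 1 ;;
    esac
)
```

For instance, svn list "$(repo_url svn+ssh /afs/csail/group/tig/REPOS/test_repo)" should be equivalent to the svn list command shown above.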
Setting up https access
If you need to collaborate with someone who does not have a CSAIL account, you will need to use the
https access method for Subversion, which makes the repository accessible as a
WebDAV share. Once configured properly, your repository will be accessible from a URL of the form
https://svn.csail.mit.edu/coolNameGoesHere
Unlike the file: and svn+ssh access methods, this both allows and requires you to manage your own usernames and passwords. These live in a simple flat file created with the htpasswd command. See
AuthUserFile for more information. You can also define fairly complex, fine-grained access control based on the directory in the repository. See
the Subversion Book for more information.
Once you've created your htpasswd file (and also an access control file, if so desired), you must give the SVN
WebDAV server access to your repository. Change to the directory containing your repository and run
find . -type d -exec fs sa {} svn rlidwk \;
Note: even if the
svn user already has appropriate access permissions, please make sure that
your effective AFS permissions to the repository directory (either through an AFS group or directly) are
all (
rlidwka). If they are not, the final step (web script creating the mapping) will fail.
Make sure that the htpasswd and optional access control files are in a location in AFS accessible by svn but not necessarily by other users.
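As a rough sketch of what those two files might look like: the password file is created interactively with htpasswd, and the access control file is a short text file in the path-based authorization format described in the Subversion Book. The helper name write_authz, the file names, and the users alice and bob are all hypothetical.

```shell
# write_authz FILE USER...
# Write an access control file that gives each listed user read-write
# access to the whole repository and gives everyone else no access.
write_authz() (
    file=$1
    shift
    {
        printf '[/]\n'                 # rules for the repository root
        for user in "$@"; do
            printf '%s = rw\n' "$user"
        done
        printf '* =\n'                 # empty value: no access for anyone else
    } > "$file"
)

# Example (hypothetical names). htpasswd prompts for each password;
# -c creates the file, so use it for the first user only.
#   htpasswd -c svn.htpasswd alice
#   htpasswd svn.htpasswd bob
#   write_authz svn.authz alice bob
```

To let everyone else read but not write, the last rule would be "* = r" instead of "* =".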
Finally, to set up and enable the SVN
WebDAV mapping, go to
https://svn.csail.mit.edu:1443/admin/admin.cgi. Fill in your desired location and the path to your SVN repository. Type the paths to your htpasswd file and, if you have one, your access control file in the fields labeled "Password File" and "ACL File". Be sure to click "Use Authentication" if you want the server to check usernames and passwords! Once done, click commit and then try out your repository.
Important note about SVN repositories on network filesystems
Older versions of Subversion used BerkeleyDB as their back-end storage mechanism. BerkeleyDB doesn't mix well with network filesystems like AFS or NFS, and its use on these filesystems could result in data loss. We strongly recommend that you convert any repositories with the BerkeleyDB backend to the FSFS backend. You can find out which backend you're using by examining the
db/fs-type file in your repository. If it contains the string
bdb, you're using the BerkeleyDB format. Instructions for converting a bdb repository to fsfs are given in the following example:
- correct any errors in place
svnadmin recover /afs/csail.mit.edu/group/rvsn/papers
- dump all svn actions to a log (written next to the repository, where the later steps expect it)
svnadmin dump /afs/csail.mit.edu/group/rvsn/papers > /afs/csail.mit.edu/group/rvsn/svn.dump
- move existing repos out of the way
cd /afs/csail.mit.edu/group/rvsn/
mv papers papers.bdb
- recreate repos; pass --fs-type=fsfs explicitly, since older versions default to BerkeleyDB
svnadmin create --fs-type=fsfs papers
- replay the log
svnadmin load /afs/csail.mit.edu/group/rvsn/papers < svn.dump
- clean up, once you have verified that the new repository works
rm svn.dump
rm -rf papers.bdb
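The backend check described above (looking at db/fs-type) is easy to script, which is handy both before a conversion and for confirming the result afterwards. This is a minimal sketch; the function names repo_backend and needs_conversion are hypothetical, and a repository without a db/fs-type file is reported as an error rather than guessed at.

```shell
# repo_backend REPO
# Print the storage backend recorded in REPO/db/fs-type (e.g. bdb or fsfs).
repo_backend() (
    type_file=$1/db/fs-type
    if [ ! -f "$type_file" ]; then
        echo "no db/fs-type file under $1" >&2
        exit 1
    fi
    cat "$type_file"
)

# needs_conversion REPO
# Succeed only if REPO still uses the BerkeleyDB backend.
needs_conversion() {
    [ "$(repo_backend "$1" 2>/dev/null)" = "bdb" ]
}
```

For the repository in the example, needs_conversion /afs/csail.mit.edu/group/rvsn/papers should succeed before the steps above and fail after them.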