CSAIL Web server architecture

There are five main CSAIL Web servers: courses, groups, people, projects, and www. In addition, there are a number of special-purpose Web servers, among them legacy, mit-only, redirects, and webstats. As www is the Lab's official face on the Web, it has a different configuration that will not be described here. All the others run Apache 2.2 on 64-bit CSAIL Debian GNU/Linux.

These servers share a common configuration layout, based on the standard layout for the Debian Apache 2.2 package. All of them are also implemented as virtual machines, so that a service can easily be moved from one physical server to another as resource requirements and hardware reliability dictate. In addition, all of the servers except mit-only and webstats serve multiple virtual hosts. The functions of the servers break down as follows:

| Server | Description |
| courses | academic courses which have storage in /afs/csail/proj/courses/6.xxx; supports both CSAIL and MIT certificates for authentication |
| groups | research groups which have storage in /afs/csail/group/* |
| people | Lab member and alumni home pages and anything else which is served from a user home directory |
| projects | anything which has storage in /afs/csail/proj/* except academic courses |
| legacy | old, unmaintained sites belonging to former groups and discontinued projects; scripting restricted for security; includes www.ai.mit.edu |
| mit-only | projects which, for administrative reasons, must use MIT and not CSAIL certificates for authentication |
| redirects | specialized configuration which only serves redirects; used to remap discontinued URI namespace to current servers, and for active virtual hosts which have a large number of aliases |
| webstats | Web server statistics |
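
The storage-to-server mapping described above can be sketched as a simple prefix lookup. This is only an illustration, not part of the actual server configuration: the function name server_for_storage and the STORAGE_PREFIXES list are hypothetical, and only the three AFS prefixes named in the descriptions are covered. The location of user home directories (served by people) is not stated here, so such paths return None.

```python
from typing import Optional

# Hypothetical lookup table built from the storage locations listed above.
# Order matters: course storage lives under /afs/csail/proj/, so the more
# specific courses prefix must be checked before the general proj prefix.
STORAGE_PREFIXES = [
    ("/afs/csail/proj/courses/", "courses"),
    ("/afs/csail/group/", "groups"),
    ("/afs/csail/proj/", "projects"),
]


def server_for_storage(path: str) -> Optional[str]:
    """Return the server expected to host content stored at `path`.

    Returns None when the path matches none of the documented prefixes
    (for example, a user home directory, whose location is not given here).
    """
    for prefix, server in STORAGE_PREFIXES:
        if path.startswith(prefix):
            return server
    return None
```

For example, server_for_storage("/afs/csail/proj/courses/6.xxx") yields "courses", while any other directory under /afs/csail/proj/ yields "projects".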

Languages and tools available on each web server

The courses, groups, people, projects, and mit-only servers all belong to the configuration class WEB_SERVER_FEATUREFUL and should have nearly identical configurations, including support for PHP4 and PHP5, FastCGI, and popular Web and database framework packages for Perl, PHP, Python, and Ruby. The other servers generally do not support server-side scripting beyond the standard mechanisms included with the Perl, Python, and Ruby packages in Debian.

Server logs

Apache is configured to write its logs in only two locations: /var/log/apache2/access.log for the access log and /var/log/apache2/error.log for the error log. These files are unique to each server. Early each morning, a daily process collects the access logs from all the servers and splits them by the virtual host that was accessed; the split logs are deposited in /afs/csail.mit.edu/proj/www/logs/YYYY-MM-DD/SERVERNAME. After the logs are split, Webalizer runs on webstats to analyze the traffic. It usually takes a few hours for Webalizer to grind through all of CSAIL's logs, so the results for some servers may not be available until about noon.
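
The per-virtual-host split can be illustrated with a short sketch. It is not the actual daily job: it assumes each access-log line begins with the virtual host name, optionally followed by :port (as Apache's %v:%p prefix would produce), which may not match the real log format used here. The names split_by_vhost and log_directory are hypothetical, and how the split files are named inside each SERVERNAME directory is not specified, so the sketch stops at the directory path.

```python
import datetime
from collections import defaultdict
from typing import Dict, Iterable, List

# Root of the split-log tree, as given in the text above.
LOG_ROOT = "/afs/csail.mit.edu/proj/www/logs"


def split_by_vhost(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Group access-log lines by the virtual host that was accessed.

    Assumption: the first whitespace-delimited field of each line is the
    virtual host, optionally followed by ":port". The rest of the line is
    kept unchanged. Blank lines are skipped.
    """
    buckets: Dict[str, List[str]] = defaultdict(list)
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        first, _, rest = line.partition(" ")
        vhost = first.split(":", 1)[0].lower()  # drop the port, normalize case
        buckets[vhost].append(rest)
    return dict(buckets)


def log_directory(day: datetime.date, server: str) -> str:
    """Build the YYYY-MM-DD/SERVERNAME directory for one server's split logs."""
    return f"{LOG_ROOT}/{day.isoformat()}/{server}"
```

Grouping in memory keeps the example short; a job handling a full day of traffic would more likely stream each line straight to an open per-host file.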

-- GarrettWollman - 26 Sep 2007