CSAIL web server architecture
TIG maintains a web-server cluster for the use of the Lab. This cluster contains several back-end servers with different capabilities, and a front-end proxy that automatically obtains certificates for Lab websites.
When asking for a new TIG-hosted website, you don’t need to pay attention to the details of our web cluster. You just need to tell us the domain name of the new website, who will be responsible for it, where to find the content you want to serve from it, and a little bit about what features it needs – whether it’s a static site, or whether it needs support for CGI scripts or PHP, for instance. Then we can set up the server configuration for you and your site will automatically serve your content from the location you told us about (which is typically in AFS).
In addition to custom domains of that kind (such as myproject.csail.mit.edu, or maybe myproject.mit.edu or myproject.example.net), we also have several web domains that automatically serve static content from specific locations in AFS. So if you just want to throw a web page (or video or image or PDF) up on the web somewhere, and you don’t care about it having a particularly short or memorable URL, you don’t need to ask us for anything; you can just put your content in the right place in AFS and it will automatically be served at the corresponding URL.
(There are also some special-purpose websites TIG manages, like the https://www.csail.mit.edu/ website for the lab as a whole, the web server that handles archival content from the AI Lab and LCS, and the data.csail.mit.edu website for serving datasets to the public.)
See also
- Shared websites for static content (in detail)
- Custom domains for websites (in detail)
- Distributing data sets from NFS
- Web server cluster logs
Static versus dynamic content
A website can serve either purely static content or dynamic content.
Static content consists of files that the web server serves as-is from a filesystem, like HTML files (or plain text files), PDFs, multimedia files, CSS files, and so forth. (JavaScript files also count as static if they are run by the user’s web browser, rather than by the web server; the point is that what the web server sends to the user’s browser is exactly what it read from disk.)
Dynamic content is generated by the web server, and the result is sent to the web browser. PHP or CGI scripts fall into this category: The script is run by the web server, and it decides what content to send to the viewer’s web browser. (“Server-side includes” are not very commonly used these days but they also count as dynamic content, where the content to send to the user’s web browser is computed in real time by the server.)
At CSAIL, static content can be served either from a custom domain of your choosing (e.g. https://mysite.csail.mit.edu/) or from one of a handful of shared domains (https://people.csail.mit.edu, https://groups.csail.mit.edu, and so on).
Dynamic content, however, can only be served from a custom domain of your choosing that TIG sets up for you. (We used to allow dynamic content on our shared domains, but that created security and support problems and made upgrades complicated.)
If you are serving dynamic content from CSAIL, you are responsible for upgrading and maintaining your code and keeping it secure. Depending on what tools you’re using this might mean upgrading for compatibility, keeping up with security-related announcements for your tools, responding promptly and fixing problems if we (TIG) or IS&T notify you of potential vulnerabilities, and upgrading to maintain compatibility when TIG upgrades the web-cluster server OSes. If a dynamic website is not kept secure and up-to-date, TIG may disable it until its maintainer fixes it. If we need to contact you about a website, we’ll contact the person (or people) who originally requested the web domain, as recorded in our web-cluster configuration. If you leave CSAIL (or your responsibilities shift) and somebody else is taking over the website from you, it’s your responsibility to make sure we know about the change and we have the new website maintainer listed.
Making static content available on our shared websites
We have several web domains available that automatically serve content from specific places in AFS. To serve content from one of those domains, all you have to do is put your content (HTML files, images, CSS files, etc.) in the right place in AFS, and it will be available at a corresponding URL. So this is a quick-and-dirty way to put (static) content up on the web without having to coordinate with TIG.
Only static files (not CGI or PHP scripts or other dynamic content) can be served automatically from our shared websites, without TIG setting up a custom domain for you. See below under “Custom domains for websites” if you need dynamic content or if you want a shorter, more memorable URL.
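Before copying a directory into one of the shared locations, it can help to check whether it contains anything that would need server-side execution. The following is a minimal sketch, not TIG's actual rule: the suffix list (`.php`, `.cgi`, `.shtml`) and the function name `find_dynamic_files` are assumptions made for illustration, and the real server configuration may treat files differently.

```python
from pathlib import Path

# Assumed suffixes for server-side content (PHP, CGI scripts, server-side
# includes). This list is illustrative; the real server rules may differ.
DYNAMIC_SUFFIXES = {".php", ".cgi", ".shtml"}


def find_dynamic_files(root):
    """Return relative paths under `root` whose suffix suggests dynamic content."""
    root = Path(root)
    return sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in DYNAMIC_SUFFIXES
    )
```

An empty result suggests the directory is plain static content; anything listed would probably need a custom domain instead.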
Details are on the page Shared websites for static content, but the short version is:
AFS directory | URL |
---|---|
$HOME/public_html/ | https://people.csail.mit.edu/$USER |
(e.g. /afs/csail.mit.edu/u/s/somebody) | (e.g. https://people.csail.mit.edu/somebody/) |
/afs/csail.mit.edu/group/GROUPNAME/www/data | https://groups.csail.mit.edu/GROUPNAME |
/afs/csail.mit.edu/proj/PROJECTNAME/www/data | https://projects.csail.mit.edu/PROJECTNAME |
/afs/csail.mit.edu/proj/courses/COURSENUMBER/www/data | https://courses.csail.mit.edu/COURSENUMBER |
(The .../www/ part in some of those locations is because group, project, and course directories can contain stuff that’s not intended to be served over the web, and the .../data/ part is a historical artifact from a previous web-service architecture.)
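The mapping in the table can be sketched as a small function that turns an AFS path into the URL where it would be served. This is only an illustration: the function name `shared_url` and its `home` and `user` parameters are hypothetical, and it assumes that files below each listed directory map one-to-one onto URL paths below the listed URL.

```python
from pathlib import PurePosixPath
from typing import Optional

AFS_ROOT = PurePosixPath("/afs/csail.mit.edu")


def _join(base: str, name: str, rest: tuple) -> str:
    # Build "base/name/rest..." without a trailing slash.
    return "/".join((base, name) + tuple(rest))


def shared_url(afs_path: str, home: Optional[str] = None,
               user: Optional[str] = None) -> Optional[str]:
    """Return the shared-site URL for an AFS path, or None if it is not served."""
    path = PurePosixPath(afs_path)

    # Personal pages: $HOME/public_html/... -> people.csail.mit.edu/$USER/...
    if home and user:
        try:
            rest = path.relative_to(PurePosixPath(home) / "public_html")
            return _join("https://people.csail.mit.edu", user, rest.parts)
        except ValueError:
            pass  # not under public_html; try the other locations

    try:
        parts = path.relative_to(AFS_ROOT).parts
    except ValueError:
        return None  # not in the CSAIL AFS cell at all

    # Course directories live under proj/courses/, so test them first.
    if (len(parts) >= 5 and parts[:2] == ("proj", "courses")
            and parts[3:5] == ("www", "data")):
        return _join("https://courses.csail.mit.edu", parts[2], parts[5:])

    # Group and project directories: <kind>/<NAME>/www/data/...
    if (len(parts) >= 4 and parts[0] in ("group", "proj")
            and parts[2:4] == ("www", "data")):
        domain = "groups" if parts[0] == "group" else "projects"
        return _join(f"https://{domain}.csail.mit.edu", parts[1], parts[4:])

    return None
```

For example, `shared_url("/afs/csail.mit.edu/group/GROUPNAME/www/data/index.html")` yields `https://groups.csail.mit.edu/GROUPNAME/index.html`, while a path outside the `www/data` subtree yields `None`.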
We also have a dedicated data.csail.mit.edu domain intended for serving large datasets out of NFS. You need to ask us to set up the mapping from your NFS directory to a public URL, though. That is documented on its own page.
See Shared websites for static content for more detail.
Custom domains for websites
If you want a shorter, more memorable URL than our shared static websites provide, or if you need dynamic content such as CGI or PHP scripts or server-side includes, then you’ll need to arrange for a custom domain (usually something.csail.mit.edu, but it can be a non-CSAIL MIT domain like something.mit.edu, or even a non-MIT domain like something.org if you arrange for the domain yourself).
See Custom domains for websites to learn how these work and how to request them.
Redirects
Another, lighter-weight way to give an existing website (whether hosted at CSAIL or elsewhere) a short, memorable URL is to arrange for a CSAIL domain to redirect to it. This doesn’t hide the “real” location of the website, but it means people can use the shorter domain. It also requires no coordination with the website that actually hosts the content, so you can redirect to an existing website, a particular page on a website, a Dropbox or Google Drive or iCloud file, or anything else you can link to.
TIG maintains a dedicated server that does nothing but redirect URLs. If you want a (new) domain to redirect to a particular location, send mail to help@csail.mit.edu and tell us what new domain you want us to create for you and where that domain should redirect to. (You might want to check that the new domain you want isn’t already in use.)
(That is how you can have us redirect an entire domain. If you just
want to redirect one or more URLs from an existing website – maybe you
reorganized your content, or one particular paper or dataset got moved to
another site – we have a page on Redirecting particular web pages using .htaccess
files.)
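One rough way to check whether a domain name is already in use before requesting a redirect is to see whether it resolves in DNS. The sketch below uses Python's standard `socket.getaddrinfo`; the function name `domain_in_use` and the injectable `resolver` parameter are illustrative choices, not a TIG tool. A name that fails to resolve may still be reserved, and a lookup can fail for temporary reasons, so treat the result as a hint only.

```python
import socket


def domain_in_use(name, resolver=socket.getaddrinfo):
    """Return True if `name` currently resolves in DNS, False otherwise."""
    try:
        resolver(name, None)
    except socket.gaierror:
        # Lookup failed: the name probably has no address records right now.
        return False
    return True
```

Calling `domain_in_use("myproject.csail.mit.edu")` performs a real DNS lookup; passing a different `resolver` makes the function easy to exercise without network access.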
Logs and debugging
As an aid in debugging, TIG provides access to the Apache logs for our web-server cluster, although the process is a bit clumsy. Access to the cluster logs is described under Web server cluster logs.