The Infrastructure Group
The Infrastructure Group (TIG) provides service-oriented computing, communications and operations assistance to support the world-class research underway at CSAIL. On the technical side, TIG handles everything from maintaining and monitoring a complex computing infrastructure 24 hours a day, 7 days a week to ensuring lab members can access cutting-edge computing services and in-person support. TIG maintains the lab’s four on-premises data centers as well as hosting in off-premises facilities. Additionally, TIG actively promotes CSAIL research to the broader MIT community, reputable news organizations and the general public via a full range of media relations and communications services.
We have experienced a multiple-server failure in the Ceph storage system that backs our OpenStack cloud and many TIG services. The Ceph system is automatically recovering as fast as it can; however, some VMs are offline at the moment because two of the three copies of certain data blocks were stored on servers that failed. (This is in addition to the multiple-server failure we had earlier this year, which appears to have had a different cause.) We are unsure when the problem blocks will be restored, because they have to wait for other data to be shifted around to make room. Rebooting affected OpenStack VMs will not restore service and could make matters worse, but starting new VMs from a new OS image may work.
This is currently affecting some VMs that provide CSAIL public web services such as http[s]://groups.csail.mit.edu.
This planned power outage is for the yearly preventative maintenance shutdown, and the entire facility will be offline during this time. The shutdown usually requires 24 hours of downtime (Tuesday 2021/08/10 00:00 - Wednesday 2021/08/11 00:00).
As usual, we will have staff on site before and after to help with the shutdown and to bring things back online. If you have any questions about which machines are out there (or any other questions), please reach out to firstname.lastname@example.org.
TIG is available Mon - Fri, 9 AM - 6 PM US/Eastern Time
If you would like to open a support case regarding Computing or Facilities related issues, sending an email to the address below will create a trouble tracking ticket, to which we will promptly respond.
Due to the impacts of Institute, State, and Federal travel restrictions, most of us are still performing our work from home. The doors to TIG will remain closed. Any on-site walk-in service to TIG will be by appointment only.
Creating a ticket via e-mail is always preferred, but you can also reach us by phone and leave a voice mail if we are unavailable.
When leaving an off-hours message via voicemail, you may be waking up an uncaffeinated sysadmin. Reserve off-hours voicemails for emergencies only.