Data Security at CSAIL
MIT's data security information is available online (http://infoprotect.mit.edu/), as is its policy information (http://infoprotect.mit.edu/wisp).
This page is meant to help CSAIL lab members determine how sensitive their data is, particularly for Medium Risk data.
For an overview of how to properly store and handle data of different security levels, please see the Data Risk Reference Grid.
Higher education institutions have a reputation for employing less stringent data security protocols, which increases the potential for accidental data loss and exposure. The breadth and volume of personal data collected by universities, coupled with high turnover and a population that is, in general, not technically savvy, make data loss at these institutions a nearly epidemic problem.
Some information that is considered sensitive data requires special care and handling. Inappropriate handling of the data could result in penalties, identity theft, financial loss, invasion of privacy, or unauthorized access by an individual or many individuals. The data could also be subject to regulation by state or federal laws and require notification in the event of a disclosure.
The Institute’s Written Information Security Program (WISP) defines three levels of risk: Low, Medium, and High.
Low Risk
This information is meant to be freely available, without access controls, to both members of the MIT community and the general public. Publicly available information may still be subject to University review or disclosure procedures to mitigate potential risks of inappropriate disclosure.
- Directory information for faculty, staff, or students
- Research data that has been de-identified in accordance with applicable rules
- Published research data; published information about the Institute
Low risk data may be stored on AFS, NFS, or local system disks, using standard access controls to limit the set of authenticated users who have access to the data as appropriate.
Low risk data may be backed up according to lab backup policies.
Low risk data is public and requires no special treatment.
Medium Risk
This information is not intended to be freely available to the general public, or to the MIT community, without access controls.
The loss of confidentiality, integrity, or availability of these information assets could reasonably be expected to result in legal liability, reputational damage, or potential for other types of harm.
- MIT IDs with associated identifying information
- Personnel records
- Faculty and staff employment applications, personnel files, benefits, salary, birth date, personal contact information
- De-identified medical and financial data sets that are not covered by law but are governed by individual data security contracts
- Institute financial account numbers and budgets
- Donor contact information and non-public gift information
- Non-public contracts
- Unpublished research papers
- Building floor plans
Medium Risk data is not appropriate for general storage on AFS, NFS, or typical workstation or server local disks, nor is it appropriate for backup with our standard tools.
TIG can provide a Secured Data Environment for this level. Doing so requires purchasing dedicated hardware and planning ahead.
High Risk
This information is subject to legal or regulatory requirements necessitating its proper safeguarding and handling, including possible notification in the event of a breach.
The loss of confidentiality, integrity, or availability of these information assets could reasonably be expected to result in serious harm to individuals or the Institute.
High Risk data should never be stored on CSAIL research systems, as we do not have the security staffing to provide appropriate auditing and training.
Regulated Administrative or Academic Information
- Personal information requiring notification (PIRN)
- MIT credentials with access to Level 2 or higher information
- Student information classified under FERPA
- Health information covered under HIPAA/HITECH
- Credit card information covered by PCI-DSS rules
- Court or national security orders that prohibit disclosure (e.g., subpoenas, National Security Letters)
Regulated Research or Human Subject Information
- Information regarding illegal activities
- National security information
ITAR (International Traffic in Arms Regulations) and the EAR (Export Administration Regulations)
- Export-related security controls on information that is subject to a Technology Control Plan
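As a rough aid for the classification above, the example categories can be collected into a small lookup table. The following is a minimal sketch only: the names `RISK_EXAMPLES` and `risk_level_for` are hypothetical, the category strings are abbreviated from the lists on this page, and anything not listed still needs a judgment call against the WISP definitions.

```python
from typing import Optional

# Hypothetical lookup table: abbreviated example categories from this page,
# keyed by the WISP risk level they are listed under.
RISK_EXAMPLES = {
    "low": [
        "directory information",
        "de-identified research data",
        "published research data",
    ],
    "medium": [
        "mit ids with identifying information",
        "personnel records",
        "non-public contracts",
        "unpublished research papers",
        "building floor plans",
    ],
    "high": [
        "ferpa student information",
        "hipaa health information",
        "pci-dss credit card information",
        "national security information",
    ],
}


def risk_level_for(category: str) -> Optional[str]:
    """Return 'low', 'medium', or 'high' for a listed category, else None."""
    wanted = category.strip().lower()
    for level, examples in RISK_EXAMPLES.items():
        if wanted in examples:
            return level
    # Not listed here: classify by hand against the WISP definitions.
    return None


print(risk_level_for("Building floor plans"))  # medium
print(risk_level_for("lab lunch menu"))        # None
```

A `None` result does not mean the data is low risk; it only means the category is not among the examples encoded here.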
Data Risk Reference Grid
The Data Risk Reference Grid will help you get a quick overview of what data you can safely store or not store, and where.
Legend: Allowed; Prohibited; Allowed with conditions met
|Dropbox, Google Drive, OneDrive cloud storage|
|Secured Data Environment|
Email
Email is inherently an insecure method of communication. Medium and High risk data should never be sent via email. Instead, consider using email to send links to cloud storage that has proper access control for higher risk data. The only exception is files attached with appropriate file-level encryption.
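One way to apply file-level encryption before attaching a file is passphrase-based (symmetric) encryption with GnuPG. The sketch below assumes only that `gpg` is installed; the helper names are hypothetical, and whether this counts as "appropriate" encryption for a particular data set is an assumption to confirm with TIG or the contract governing the data.

```python
import subprocess
from typing import List, Optional


def build_gpg_command(path: str, output: Optional[str] = None) -> List[str]:
    """Build a GnuPG command that symmetrically encrypts `path` with AES-256."""
    encrypted = output if output is not None else path + ".gpg"
    return [
        "gpg",
        "--symmetric",             # passphrase-based, no key pair needed
        "--cipher-algo", "AES256",
        "--output", encrypted,
        path,
    ]


def encrypt_file(path: str, output: Optional[str] = None) -> None:
    """Run gpg on the file; gpg prompts for the passphrase interactively."""
    subprocess.run(build_gpg_command(path, output), check=True)


# Show the command that would be run for a hypothetical file name.
print(build_gpg_command("survey.csv"))
```

The passphrase should be shared with the recipient over a separate channel, not in the same email as the attachment.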
Slack
Slack on its own does not provide the security protocols required for Medium or High risk data. Consider using Slack to send links to cloud storage that has proper access control for higher risk data.
AFS
AFS cannot accommodate Medium Risk data in its current configuration, because Duo two-factor authentication for interactive user and administrator logins cannot be provided. If you require Medium Risk data on AFS, please see our Secured Data Environment.
NFS
NFS does not provide any reasonable security whatsoever. Medium and High risk data should never be stored on NFS.
Dropbox, Google Drive, OneDrive
Medium risk data can be stored in Dropbox, Google Drive, or OneDrive with proper access control and responsible handling in place. Additionally, users should delete the data from their local systems when they are finished with it. For more details, please see MIT IS&T’s Knowledgebase article.
Secured Data Environment
TIG can provide support for creating a Secured Data Environment and compute clusters on user-purchased hardware sufficient for Medium Level Confidential Information, such as de-identified medical or financial datasets.
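The storage rules stated on this page can be collected into a small lookup. This is a sketch rather than the official grid: the names `GRID` and `may_store` are hypothetical, only combinations that the text states explicitly are encoded, and everything else returns `None` so that it gets checked with TIG instead of guessed.

```python
from typing import Optional

ALLOWED = "allowed"
PROHIBITED = "prohibited"
CONDITIONAL = "allowed with conditions met"

# Hypothetical encoding of the rules stated in the text of this page.
GRID = {
    # Low risk: AFS, NFS, or local disk with standard access controls.
    ("afs", "low"): ALLOWED,
    ("nfs", "low"): ALLOWED,
    ("local disk", "low"): ALLOWED,
    # Medium risk: not on AFS, NFS, or typical local disks.
    ("afs", "medium"): PROHIBITED,
    ("nfs", "medium"): PROHIBITED,
    ("local disk", "medium"): PROHIBITED,
    # NFS and Slack are ruled out for both Medium and High risk data.
    ("nfs", "high"): PROHIBITED,
    ("slack", "medium"): PROHIBITED,
    ("slack", "high"): PROHIBITED,
    # Email: only as attachments with appropriate file-level encryption.
    ("email", "medium"): CONDITIONAL,
    ("email", "high"): CONDITIONAL,
    # Cloud storage: proper access control, and delete local copies afterwards.
    ("cloud storage", "medium"): CONDITIONAL,
    ("secured data environment", "medium"): ALLOWED,
}


def may_store(location: str, risk: str) -> Optional[str]:
    """Look up a (location, risk level) pair; None means 'not stated here'."""
    return GRID.get((location.strip().lower(), risk.strip().lower()))


print(may_store("NFS", "medium"))            # prohibited
print(may_store("Cloud storage", "medium"))  # allowed with conditions met
print(may_store("cloud storage", "high"))    # None
```

Returning `None` for unlisted pairs is deliberate: a missing entry should prompt a question, not be read as permission.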