Storage Tiers
CSAIL offers three tiers of NFS storage with different performance characteristics: scratch, production, and archival. Some research groups have their own servers, but TIG operates servers in these tiers for shared use by all CSAIL members.
Scratch
Best for: High-performance storage of temporary files and intermediate computations
Key characteristics:
- No backups or snapshots
- Users limited to 1 TB of public scratch usage
- Group-owned scratch quotas determined by PI
- Temporary quota elevation sometimes available on request
- Overcommitted by design (not enough storage for every user’s full quota simultaneously)
Data management:
- TIG periodically prunes files not accessed recently
- No attempt to preserve or restore data on hardware/software faults
- Deleted or overwritten data cannot be recovered
- All data automatically compressed; quotas applied after compression
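Because quotas are applied after compression, the size a file reports can differ from the space it consumes. The following is a minimal sketch for comparing the two across a directory tree. It assumes a Linux client and assumes that the NFS server reports post-compression usage through the block count, which depends on the server and is not confirmed here. The function name `usage_under` is hypothetical.

```python
import os
import stat

def usage_under(root):
    """Return (apparent_bytes, allocated_bytes) summed over regular files under root."""
    apparent = 0
    allocated = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if not stat.S_ISREG(info.st_mode):
                continue
            apparent += info.st_size
            # On Linux, st_blocks counts 512-byte units. Whether it reflects
            # compressed usage on an NFS mount is an assumption, not a guarantee.
            allocated += info.st_blocks * 512
    return apparent, allocated
```

Hard-linked files are counted once per link, so treat the totals as an estimate rather than an exact quota figure.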
SCRATCH STORAGE IS FOR TEMPORARY FILES ONLY.
Files in /data/scratch, /data/scratch-fast, and /data/scratch-oc40 are automatically deleted if not accessed within six months. They may also be deleted sooner if the filesystem fills up.
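To see which of your files are at risk under the six-month rule, you can check access times yourself. The sketch below is illustrative only: it approximates six months as 183 days, and the exact cutoff and method the cleanup uses are assumptions. The function name `stale_files` is hypothetical.

```python
import os
import stat
import time

# "Six months" approximated as 183 days; the exact cutoff is an assumption.
SIX_MONTHS_SECONDS = 183 * 24 * 60 * 60

def stale_files(root, max_age_seconds=SIX_MONTHS_SECONDS, now=None):
    """Yield (path, days_since_access) for regular files whose atime is older than the cutoff."""
    now = time.time() if now is None else now
    cutoff = now - max_age_seconds
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.lstat(path)  # stat does not itself update the access time
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(info.st_mode) and info.st_atime < cutoff:
                yield path, (now - info.st_atime) / 86400
```

Call it on your own directory under one of the scratch paths and print the results, for example `for path, days in stale_files(my_scratch_dir): print(f"{days:6.0f} days  {path}")`, where `my_scratch_dir` is a placeholder for your directory.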
Do NOT store:
- Conda environments
- Python package libraries or virtual environments
- Software installations
- Original research data
- Anything you need to keep permanently
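Why Python/Conda breaks in scratch: Python caches compiled bytecode (.pyc files) and loads it instead of re-reading the plain-text source files (.py). The source files therefore look unused, and the cleanup procedures delete them even though the bytecode is still being accessed. Your Python environment then fails with cryptic errors because the source files are missing.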
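The bytecode cache can be inspected with the standard library. This sketch lists the source files under a directory that already have a cached .pyc, meaning the files Python can import without reading the .py again. The function name `sources_with_cached_bytecode` is hypothetical, and the sketch assumes the default `__pycache__` layout.

```python
import importlib.util
import os

def sources_with_cached_bytecode(root):
    """Return .py files under root that have a matching .pyc in the bytecode cache."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".py"):
                continue
            source = os.path.join(dirpath, name)
            # Maps pkg/mod.py to pkg/__pycache__/mod.<interpreter-tag>.pyc
            cached = importlib.util.cache_from_source(source)
            if os.path.exists(cached):
                found.append(source)
    return sorted(found)
```

Every file this returns is one whose access time may stop advancing once the cache is warm, which is why environments kept in scratch tend to break months after they were created.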
For permanent storage, use production or archival tiers (see below).
Production
Best for: Active research computing and data analysis (the main workhorse tier)
Key characteristics:
- Optimized for high performance with parallel reads/writes from multiple clients
- Fully provisioned; storage is not overcommitted
- Same checksum and compression settings as scratch storage
Quotas and organization:
- Some groups have shared quota across multiple filesystems
- Most groups have per-filesystem allocation
- Keep unrelated data separate for efficient operations and cost-effective backups
Data protection:
- Regular snapshot policy: hourly, daily, weekly, and monthly snapshots
- Easy file recovery without TIG intervention
- Daily backups when enabled (depends on user needs and access patterns)
Archival
Best for: Read-only reference data and data not actively updated
Key characteristics:
- Intended for data that does not require high-performance parallel random reads
- Automatic data compression enabled
- Compressing data offline (bzip2, lz) before storing it will be more efficient than relying on the automatic compression
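As one way to compress data offline before moving it to archival storage, the sketch below packs a directory into a bzip2-compressed tarball using Python's standard `tarfile` module. The function name `archive_directory` is hypothetical, and the choice of a tarball is an assumption about what suits your data.

```python
import tarfile
from pathlib import Path

def archive_directory(src_dir, dest_archive):
    """Pack src_dir into a bzip2-compressed tarball and return the archive size in bytes."""
    src = Path(src_dir)
    with tarfile.open(dest_archive, "w:bz2") as tar:
        # arcname stores paths relative to the directory name, not the full path
        tar.add(src, arcname=src.name)
    return Path(dest_archive).stat().st_size
```

Write the archive outside the directory being packed so that it does not include itself, and verify that it can be read back before deleting the originals.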
Data protection:
- Weekly backups (optional, per-filesystem)
- Standard snapshot policy applied
- Best for read-only or rarely-updated data
Data Center Locations
All three storage tiers are split between:
- 32-341 (main machine room in Stata) - high-speed access from OpenStack and group-owned compute clusters in Stata
- OC40-250 (Massachusetts Green High-Performance Computing Center in Holyoke)
When performance matters, ensure you use servers in the same building as your data. The Holyoke data center has an annual 24-hour maintenance shutdown (usually in June) during which all servers and data there are inaccessible. The community is notified a few months in advance.
Storage Tiers Comparison
| Feature | Scratch | Production | Archival |
|---|---|---|---|
| Use case | Temporary, intermediate files | Active research data | Read-only reference data |
| Snapshots | None | Yes | Yes |
| Backups | None | Daily (when enabled) | Weekly (optional, per filesystem) |
| Quota | 1 TiB per user | Per filesystem | Per filesystem |
| Performance | High | High | Medium (optimized for reliability) |
| File cleanup | After six months of non-access | Never | Never |
| Cost | Lowest | Higher | Medium |


