Groups at CSAIL
What’s a “group”?
CSAIL’s computing infrastructure has a bunch of different things called “groups”, and it can be confusing sometimes. These functions arose over time, and some of them reflect historical circumstances that don’t necessarily apply in the way you expect, especially if you have experience with computing facilities at other institutions. This guide will attempt to explain what the different kinds of groups are and how they are used.
Fundamentally, for computing purposes, a group serves two functions: access control and delegation of administration. Different services have different access-control policies and take notice of different kinds of groups (sometimes internal to the service, most of the time external).
All of these different kinds of groups are “free”, except insofar as they are limited by prior claims on names and the potential for confusion. CSAIL users should therefore take advantage of them in whatever way fits the needs of their research. TIG will try to offer guidance on the most appropriate way to set up groups to meet your needs; if you find that things are not working the way you want, please ask for assistance.
History
Historically, groups were aligned with what in the Laboratory for Computer Science were called “research groups”: one or more faculty or principal investigators, an ongoing research activity and budget, research staff, administrative assistants, and students. Most LCS research groups were multi-PI and had their own computing and storage resources and system administrators. Most groups in the Artificial Intelligence Laboratory were single-PI and shared a common computing infrastructure, storage, and system administration group with the rest of the AI Lab. When the two labs merged in 2003, services that one lab had originally implemented generally retained whatever “native” access control model they had started with.
As a result of a security incident, it was decided that an
authenticated, secure shared storage infrastructure was required, and
the obvious answer at the time was to deploy AFS, which was already in
use on the Athena system and therefore familiar to most faculty and
students on campus.
AFS had the additional benefit of allowing delegated access control so
that system administrators would not be spending all their time
editing a centralized /etc/group file: users could set access lists
on their own.
Some researchers found the security mechanism of AFS too onerous for
their computing needs, and insisted on keeping the old insecure NFS
storage around.
In order to support this usage without losing the benefits of
delegated administration, TIG developed a mechanism to automatically
populate a Unix-style groups database (originally /etc/group but now
in an LDAP directory) with the memberships of a select subset of AFS
groups, so the contents of the AFS protection database would largely
be reflected in Unix (and therefore NFS) permissions after a
propagation delay.
Meanwhile, the INQUIR account management system had its own notion of research groups, based on the LCS model. (While INQUIR existed as a PDP-10 program on the AI ITS machines, it had been abandoned with the machines themselves.) With many PIs no longer employing their own system administrators, there needed to be a way for users to sign up for accounts online, using the web, rather than having a sysadmin manually create one by running a terminal application on a central server. But we could not just allow anyone to sign up for an account, without the approval of someone responsible, since we knew that this would lead very quickly to abuse. Yet, PIs did not want to have to manually approve their new students’ accounts, especially when they were traveling and the account setup was urgently needed. So the existing INQUIR research group mechanism, which had already included an “administrator flag” for the group system administrators, was extended to use the same mechanism (and the same flag) to allow for delegated user account administration, including new user signup, account record editing, and (when user expiration was introduced) account renewals.
INQUIR groups
- Used for:
- Access control for administrative updates to INQUIR records; account signup; INQUIR record audit notifications
- Visible in:
- Controls write access to INQUIR records and generation of audit email for updates
- Source of truth:
- A table in the INQUIR database
- Who can create?
- TIG sysadmins
- Who can modify?
- TIG sysadmins; those designated as group administrators can add and remove other administrators
- Special restrictions:
- Names must be unique and meaningful to new users signing up; should abbreviate to something that is a valid AFS protection group
The original INQUIR database represented groups as a free-form text field. To allow for easy keyboard completion, this was converted into a controlled vocabulary, which also made it possible for users (primarily sysadmins, consultants, and administrative assistants) to belong to multiple groups, although they still can only have one supervisor. Because the set of historic LCS research groups was significantly smaller than the set of potential supervisors, INQUIR functions including account signup were implemented as “select a group first, then select a supervisor from a group-specific list”, rather than having users select a supervisor first and then inheriting the supervisor’s group membership. (This also makes it easier to deal with new users who have no supervisor, such as new faculty hires.)
Associated with each user’s group membership is a three-way status flag: it can be “administrator”, “administrator (notifications disabled)”, or “regular group member”. Members of a group who have either administrator status can approve new users in that group, change the status of other group members, and can change most of the fields in individual user records. This is primarily used for the annual account expiration cycle. Group members with “administrator” status receive an email notification for all changes to members of that group, and also receive account expiration reminders during the annual account renewal cycle.
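The three-way status flag can be pictured as a small enumeration with two derived permissions. The following is only an illustrative sketch: the class and function names are hypothetical and do not come from the INQUIR code, and it assumes that “notifications disabled” turns off both the change notifications and the expiration reminders.

```python
from enum import Enum


class MembershipStatus(Enum):
    """Status attached to one user's membership in one group (hypothetical names)."""

    ADMINISTRATOR = "administrator"
    ADMINISTRATOR_QUIET = "administrator (notifications disabled)"
    REGULAR = "regular group member"


def can_administer(status: MembershipStatus) -> bool:
    """Either administrator status may approve new users and edit member records."""
    return status in (
        MembershipStatus.ADMINISTRATOR,
        MembershipStatus.ADMINISTRATOR_QUIET,
    )


def receives_notifications(status: MembershipStatus) -> bool:
    """Assumed: only plain "administrator" status gets change and expiration email."""
    return status is MembershipStatus.ADMINISTRATOR
```

Keeping the two questions as separate functions mirrors the text: administrative rights and email notifications are related but not identical.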
Filesystem groups
AFS group volumes
- Used for:
- Low-cost, moderate-volume bulk storage of research data; group web sites
- Visible in:
- /afs/csail.mit.edu/group; AFS mount points (fs command)
- Source of truth:
- AFS mount points in the group volume; AFS file server metadata
- Who can create?
- TIG system administrators
- Who can modify?
- Determined by the AFS access list on each directory
- Special restrictions:
- Must be a Portable Filename as defined by the POSIX standard, except that the . (dot) character is not allowed; TIG recommends all-lower-case and avoidance of underscores for ease of typing.
- Relevant commands:
- fs examine, fs listquota, fs lsmount
When a PI requests the creation of a new INQUIR group (see above), TIG
creates and mounts an AFS volume such that the groups web server
(see below) will automatically recognize the group and serve its
content.
In addition, an AFS protection group will be created with write access
to the volume, which will either be self-administered or have a
separate -admin group created to own it.
By convention, the AFS volume name is group._name_, which imposes
some restrictions on the length of the name; it will be mounted at
/afs/csail.mit.edu/group/_name_.
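The naming rules and the volume and mount-point convention can be combined into one small check. This is a sketch, not TIG's tooling: the function name is hypothetical, and the 22-character limit on the volume name is an assumption based on the commonly cited OpenAFS limit for read/write volumes, so it is passed as a parameter.

```python
import re

# POSIX portable filename characters are A-Z a-z 0-9 . _ - with no leading
# hyphen; CSAIL additionally forbids the dot.
_NAME_RE = re.compile(r"[A-Za-z0-9_][A-Za-z0-9_-]*")

# Assumption: commonly cited OpenAFS limit for a read/write volume name.
DEFAULT_MAX_VOLUME_NAME = 22


def group_volume(name: str, max_volume_name: int = DEFAULT_MAX_VOLUME_NAME) -> tuple[str, str]:
    """Return (volume name, mount path) for a group name, or raise ValueError."""
    if not _NAME_RE.fullmatch(name):
        raise ValueError(f"not a valid group volume name: {name!r}")
    volume = f"group.{name}"
    if len(volume) > max_volume_name:
        raise ValueError(f"volume name too long: {volume!r}")
    return volume, f"/afs/csail.mit.edu/group/{name}"
```

For a hypothetical group called examplegroup, this yields the volume name group.examplegroup and the mount point /afs/csail.mit.edu/group/examplegroup. The check accepts capital letters and underscores, which are allowed but discouraged.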
Note that the AFS volume location database and AFS file servers are globally accessible and volume status information can be queried by unauthenticated remote users.
AFS protection groups
- Used for:
- AFS access controls, including filesystem access lists; a subset is propagated into the Unix group database
- Visible in:
- AFS protection database; AFS access lists (fs command)
- Source of truth:
- AFS protection database
- Who can create?
- User-scoped groups can be created by anyone; system-scoped groups can only be created by members of the group system:administrators.
- Who can modify?
- AFS protection groups can be created, modified, examined, listed, and deleted by the group’s owner, using the pts utility. In addition, AFS groups have a set of access flags that control whether group members, or any authenticated user, can add or remove members (including themselves) and view the group membership.
- Special restrictions:
- Must be unique among all existing protection group, user, and mail alias names; . (dot) characters and leading hyphens are not allowed; : (colon) characters are forbidden except where mandatory.
- Relevant commands:
- pts add, pts remove, pts examine, pts members; fs listacl, fs setacl
AFS protection groups are not to be confused with AFS group volumes as described above; both “group” and “project” volumes will ordinarily correspond one-to-one with an AFS protection group. (Same naming, different types of objects: one is a storage volume and one is a principal for access control purposes.)
Note that the AFS protection database is globally accessible, and some information about users and groups may be queried by unauthenticated remote users.
CSAIL practice for AFS groups generally divides them into two classes:
“self-administered” groups allow any group member to add or remove
other group members, but the group is formally owned by
system:administrators (that is, TIG sysadmins) so that it cannot be
accidentally deleted; “regular” groups (which are actually the
minority) are owned by a separate -admin group, which is itself
self-administered.
Self-administered groups have access flags S-Mar; regular groups
will normally be S-M--.
(To ensure uniqueness, an INQUIR user with relation type
Namespace reservation should be created for each system-level AFS
protection group, including the -admin groups if used.)
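The two classes can be told apart from a group's owner and access flags, both of which pts examine reports. The sketch below only encodes the convention described above. The function name is hypothetical, and it assumes that the owning group of a regular group is named by appending -admin to the group's own name.

```python
def classify_afs_group(name: str, owner: str, flags: str) -> str:
    """Classify a protection group as CSAIL practice describes it.

    `owner` and `flags` are the values reported for the group, for example
    owner "system:administrators" and flags "S-Mar".
    """
    if owner == "system:administrators" and flags == "S-Mar":
        return "self-administered"
    # Assumption: the separate owning group is named "<name>-admin".
    if owner == f"{name}-admin" and flags == "S-M--":
        return "regular"
    return "nonstandard"
```

A result of "nonstandard" is not necessarily an error; it only means the group does not match either of the two usual patterns.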
Autofs mount points
- Used for:
- Automatically mounting centrally managed NFS filesystems
- Visible in:
- /data and /archive; defined in /etc/auto.d/auto.data and /etc/auto.d/auto.archive
- Source of truth:
- Puppet servers; TIG-maintained configuration management repository
- Who can create?
- TIG sysadmins
- Who can modify?
- TIG sysadmins
- Special restrictions:
- TIG recommends avoiding capital letters and underscores
CSAIL NFS uses the Linux automounter, autofs, to provide for some
flexibility in assignment of filesystems to NFS servers.
CSAIL NFS paths start with /data or /archive, and the second
component of the pathname is an arbitrary string that normally
identifies the group, project, or principal investigator.
What is mounted on that path may be an actual filesystem, but more
commonly it will be a second-level automount map, which then in turn
mounts the individual filesystems.
(Some groups have more complex structures.)
There is no technical significance to the name; it is merely an
organizational convenience to group together filesystems belonging to
the same PI or activity.
Unix/LDAP groups
- Used for:
- NFS and local access controls on CSAIL Ubuntu servers and workstations; some web applications
- Visible in:
- ldap.csail.mit.edu directory; Unix system databases; /afs/csail.mit.edu/service/inquir
- Source of truth:
- INQUIR cross-walked with the AFS protection database
- Who can create?
- TIG sysadmins
- Who can modify?
- Anyone who can modify the underlying AFS protection group
- Special restrictions:
- Same as for system-scope AFS protection groups
- Relevant commands:
- getent, groups, id, ldapsearch
In addition to the normal filesystem access control function, Unix
groups are also used for system access controls, enforced for local
logins, SSH logins, and sudo, each of which expresses its requirements
with different logic and in different configuration files.
At present, the process that crosswalks INQUIR with the AFS protection
database and the process that synchronizes LDAP with the result are
run by two separate (and unsynchronized) cron jobs.
The crosswalk runs every 15 minutes, but the LDAP update (which is
much slower) only runs every half hour, so there is a maximum
propagation delay of 45 minutes.
It is planned to fix this some day.
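The 45-minute figure is the sum of the two intervals: a change made just after a crosswalk run waits up to 15 minutes for the next one, and that run's output can then just miss an LDAP update and wait up to 30 minutes more. A minimal sketch of the arithmetic, with a hypothetical function name and ignoring the time the jobs themselves take to run:

```python
def max_propagation_delay(crosswalk_interval_min: int, ldap_interval_min: int) -> int:
    """Worst-case wait, in minutes, for two unsynchronized periodic jobs run in sequence."""
    # Each stage can add up to one full interval of waiting in the worst case.
    return crosswalk_interval_min + ldap_interval_min
```

With the current schedule, max_propagation_delay(15, 30) gives 45.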
Be aware that NFS access control is performed entirely on the client, and clients are free to lie to the NFS server about the identity of the user; moreover, any superuser on any client can change their group IDs to arbitrary values.
Also note that the CSAIL LDAP directory is public and globally
accessible, including all group memberships.
The information in the directory is also globally readable via AFS in
/afs/csail.mit.edu/service/inquir.
The selection of which AFS protection groups are reflected into Unix
groups is gated on the existence of an INQUIR user with relation
type Magic AFS group pseudo-user; the underlying AFS group must
exist and have at least one member before the INQUIR entry is added.
Web server groups
groups.csail.mit.edu
- Used for:
- Web sites
- Visible in:
https://groups.csail.mit.edu/([[:alnum:]][-_[:alnum:]]*)/
- Source of truth:
/afs/csail.mit.edu/group/$1/www/data
- Who can create?
- TIG system administrators
- Who can modify?
- Determined by the AFS access list on each directory
- Special restrictions:
- As defined for AFS group volumes (see above); should also be valid as a label in a DNS name and not require encoding in a URL.
The shared web server groups.csail.mit.edu was created with the
classic LCS “research group” model in mind, as contrasted with
smaller “projects” served from projects.csail.mit.edu.
Today, the division between “groups” and “projects” in CSAIL web space is
fairly arbitrary, but the intuition at least should be
that a “group” has more people and is longer-lived than a “project”.
The groups web server always serves content from the AFS directory
/afs/csail.mit.edu/group/$1/www/data, where $1 is the first
pathname component in the URL; this directory structure is created by
TIG (and configured with read-only access for the web server) as a
part of creating an AFS group volume (see above).
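The mapping from URL to AFS directory can be sketched as a single regular-expression match. This is an illustration rather than the server's actual configuration: the function name is hypothetical, Python has no POSIX bracket classes so [[:alnum:]] is written out as ASCII letters and digits, and the treatment of the rest of the path (simple concatenation) is an assumption.

```python
import re
from typing import Optional

# [[:alnum:]][-_[:alnum:]]* from the URL pattern, spelled out for Python's re.
_GROUP_URL_RE = re.compile(r"/([A-Za-z0-9][-_A-Za-z0-9]*)/(.*)")


def afs_path_for(url_path: str) -> Optional[str]:
    """Map a groups.csail.mit.edu URL path to the AFS path it would be served from."""
    match = _GROUP_URL_RE.fullmatch(url_path)
    if match is None:
        return None  # first path component is not a valid group name
    group, rest = match.groups()
    # Assumption: the remainder of the URL path is appended unchanged.
    return f"/afs/csail.mit.edu/group/{group}/www/data/{rest}"
```

For a hypothetical group named example, the path /example/index.html maps to /afs/csail.mit.edu/group/example/www/data/index.html. The sketch does no normalization of components such as "..", which a real server must handle.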
CSAIL OpenID Connect
- Used for:
- Web site access control
- Visible in:
- groups claim returned by oidc.csail.mit.edu
- Source of truth:
- LDAP directory
- Who can create?
- See Unix/LDAP groups, above
- Who can modify?
- See Unix/LDAP groups, above
- Special restrictions:
- None
Any web application that uses the CSAIL OpenID Connect service for user
authentication can request the groups scope.
(This includes TIG-supported shared web servers such as groups,
projects, and people, as well as applications like WebDNS.)
If the authenticated user approves the release of this information,
the application will receive a groups claim when querying the OIDC
user information endpoint, which will consist of an array of the names
of the groups to which the user belongs.
There is no inherent meaning assigned to this information, and it is
not a standard OpenID Connect scope or claim, but applications or
access lists may be written to make use of it.
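An application that has already fetched the user information response can use the groups claim for a simple allow-list check. The sketch below assumes the response has been decoded into a Python dictionary. The function name and the group names in the usage note are hypothetical, and fetching the response from the OIDC endpoint is out of scope.

```python
from typing import Any, Iterable, Mapping


def user_in_allowed_group(userinfo: Mapping[str, Any], allowed: Iterable[str]) -> bool:
    """Return True if the groups claim names at least one allowed group."""
    # The claim is absent when the scope was not requested or not approved;
    # treat that the same as belonging to no groups.
    groups = userinfo.get("groups", [])
    if not isinstance(groups, list):
        return False
    names = {g for g in groups if isinstance(g, str)}
    return not names.isdisjoint(allowed)
```

For example, a response of {"sub": "alice", "groups": ["example-group"]} passes a check against ["example-group"], while a response with no groups claim does not. Because the OIDC servers cache the LDAP results, a recent membership change may not be visible to such a check right away.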
Note that the OIDC servers cache the results of the LDAP query, and therefore the information returned can be stale.