Preliminary design decisions
In line with other web applications built in-house, Django has been picked as the framework for the IDM solution.
Core LDAP functionality will be implemented in a separate library, which can be reused for other LDAP tools. The goal of the LDAP wrapper library is to abstract the LDAP searches and commands away, and instead present user and group objects for the developers to interact with. See Gerrit for code: https://gerrit.wikimedia.org/g/operations/debs/python-wmf-ldap
Safety is a major concern and the IDM must not become an entry point for an attacker to escalate privileges. In the initial implementation certain operations will require manual interaction from a member of the SRE team. For example, access to servers will require a manual review and merge of a Gerrit change.
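To illustrate the intended abstraction, here is a minimal sketch of what presenting user and group objects could look like. It is not the API of python-wmf-ldap: the class names, the field names and the in-memory directory standing in for an LDAP server are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    """Hypothetical user object; field names are illustrative only."""
    uid: str
    uid_number: int
    email: str
    ssh_keys: list = field(default_factory=list)


@dataclass
class Group:
    """Hypothetical group object holding the uids of its members."""
    cn: str
    members: set = field(default_factory=set)


class Directory:
    """In-memory stand-in for an LDAP server (assumption for this sketch).

    A real wrapper would translate these calls into LDAP searches and
    modify operations, so that callers never build filters or DNs.
    """

    def __init__(self):
        self._users = {}
        self._groups = {}

    def add_user(self, user):
        self._users[user.uid] = user

    def get_user(self, uid):
        return self._users[uid]

    def get_group(self, cn):
        # Create the group lazily so the example stays short.
        return self._groups.setdefault(cn, Group(cn=cn))

    def add_member(self, cn, uid):
        self.get_user(uid)  # raises KeyError for unknown users
        self.get_group(cn).members.add(uid)

    def groups_of(self, uid):
        return sorted(g.cn for g in self._groups.values() if uid in g.members)


directory = Directory()
directory.add_user(User(uid="jdoe", uid_number=5001, email="jdoe@example.org"))
directory.add_member("wmf", "jdoe")
```

Calling code only deals with `User` and `Group` instances, so the same tool logic could be tested against a fake directory like this one and run against LDAP in production.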
Initial use cases for IDM
This document describes the initial use cases for the IDM implementation. The goal is to identify three or more frequently occurring tasks regarding access rights and management. These serve as the guidelines for the initial implementation of the IDM solution.
Major use cases for the wikitech IDM page
The use cases below will serve as starting points for the design and implementation of the IDM solution. Each use case is split into an "as is" and a "to be" part, representing the current state or workflow and the desired outcome.
Account creation currently happens via an extension (https://www.mediawiki.org/wiki/Extension:LDAP_Authentication) to the standard MediaWiki account creation (since it also asks for a shell username), plus the OpenStackManager extension, which is installed on the servers running https://wikitech.wikimedia.org. This MediaWiki installation is separate from the main MediaWiki installation powering the wikis, but the intent is to eventually make wikitech a standard wiki like the rest (tracked as https://phabricator.wikimedia.org/T161553 and https://phabricator.wikimedia.org/T161859). Notably, the initial purpose of the OpenStackManager extension is mostly gone by now, as that functionality has shifted towards OpenStack-internal management UIs (Horizon).
This account creation mechanism applies both to community members and to staff members of the Wikimedia Foundation who are working in a technical capacity.
Account creation generates a user within the main Wikimedia LDAP directory and allocates a UID. It also allows the configuration of SSH public keys used to log into Toolforge and Wikimedia Cloud VPS, and it sets the person’s email address.
The user ID is the same one that is used when logging into the Wikimedia production clusters (which is done by WMF technical staff and community members), but SSH access to production is not configured within the extension and needs a separate Phabricator task (see next section).
On the login page of the Wikimedia IDM, the user is offered the choice to create an account or to log in (see next section).
When creating a new account, the following values can be set:
- Wikimedia Developer account name
- Given name or pseudonym
- Email address
- Password (needs to be given twice) (enforced with a password policy)
- MediaWiki SUL (single user login) account name (or can be done later)
After submitting an account request, the account would be put on hold until the user receives a confirmation email and clicks the confirmation link in it. This makes an email address mandatory, but it seems like an acceptable tradeoff given that it’s important to have a reliable communication channel to the user (e.g. in case of an account compromise, to announce invasive maintenance etc.). Anyone who for some reason does not wish to reveal their main email address to us can still create a secondary one. With account creation a UID is allocated for the user.
Once the user has logged in for the first time with the confirmed account name, further attributes can be configured in addition to the ones required for account creation:
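The confirmation step can be sketched with a signed, time-limited token that is embedded in the emailed link. This is only an illustration under stated assumptions: the secret, the one-day expiry and the function names are hypothetical, and the design does not specify how the link is generated.

```python
import hashlib
import hmac
import time

SECRET_KEY = b"change-me"    # assumption: a server-side secret
MAX_AGE_SECONDS = 24 * 3600  # assumption: links expire after one day


def _signature(account, email, timestamp):
    """HMAC over account name, email address and issue time."""
    message = "\n".join([account, email, timestamp]).encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def make_token(account, email, now=None):
    """Return "<timestamp>:<signature>" for the confirmation link."""
    timestamp = str(int(time.time() if now is None else now))
    return f"{timestamp}:{_signature(account, email, timestamp)}"


def check_token(account, email, token, now=None):
    """True if the token matches account and email and has not expired."""
    timestamp, _, signature = token.partition(":")
    if not timestamp.isdigit():
        return False
    expected = _signature(account, email, timestamp)
    current = time.time() if now is None else now
    fresh = current - int(timestamp) <= MAX_AGE_SECONDS
    # compare_digest avoids leaking the signature through timing.
    return fresh and hmac.compare_digest(signature.encode(), expected.encode())
```

Because the email address is part of the signed message, changing the address invalidates old links, which matches the requirement that a changed address re-triggers confirmation. In a Django project the built-in `django.core.signing` utilities would likely be a better fit than hand-rolled HMAC.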
- Given name or pseudonym
- An optional user affiliation (could be “Wikimedia staff member”, “Community member with NDA”, “Researcher working with Wikimedia under MOU” and others). Depending on the affiliation, some attributes are mandatory or not (e.g. we require the given name for any staff member, but not for community members)
- Email address (mandatory, changing it re-triggers the confirmation mail)
- Password change (needs to be given twice) (enforced with a password policy)
- MediaWiki SUL account name
- IRC username
- Phabricator username (initially this would reliably map a user to a Phabricator name, but a future buildout step could be to have all Phabricator accounts rely on that WDA account; currently they are also linked to MediaWiki SUL accounts via OAuth)
For all changes a history should be logged (within reasonable limits, e.g. which settings were changed at what time)
Gaining additional access needed for a role
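A minimal sketch of the affiliation-dependent attribute rules and the change history could look as follows. The affiliation keys, attribute names and record layout are assumptions; only the rule that staff members must provide a given name, and that email is mandatory, comes from the text above.

```python
from datetime import datetime, timezone

# Assumption: which attributes each affiliation requires.
REQUIRED_BY_AFFILIATION = {
    "staff": {"email", "given_name"},
    "community": {"email"},
}


def missing_attributes(affiliation, attributes):
    """Return the mandatory attributes that are absent or empty."""
    required = REQUIRED_BY_AFFILIATION.get(affiliation, {"email"})
    return sorted(name for name in required if not attributes.get(name))


def change_attribute(attributes, history, name, new_value, when=None):
    """Set one attribute and record what changed at what time.

    A real log should not store secret values such as passwords;
    for those, only the fact of the change would be recorded.
    """
    when = when or datetime.now(timezone.utc)
    history.append({
        "attribute": name,
        "old": attributes.get(name),
        "new": new_value,
        "changed_at": when.isoformat(),
    })
    attributes[name] = new_value
```

For example, `missing_attributes("staff", {"email": "jdoe@example.org"})` reports that `given_name` is still required, while the same attributes would be complete for a community member.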
Requesting and granting access to additional LDAP roles and/or shell group access.
Let’s consider the new user is working in a Wikimedia department which needs to access the Hadoop cluster and various PII-sensitive web services, such as Turnilo.
After enabling the account, the staff member opens a task within Phabricator tagged User-Access-Requests. This ends up on the radar of someone within Site Reliability Engineering, which operates a weekly rotation called Wikimedia Clinic Duty. The ticket is processed and the following steps are taken:
- Validate that the user is a WMF staff member
- The user needs to approve the “Acknowledgement of Wikimedia Server Access Responsibilities” by accessing https://phabricator.wikimedia.org/L3 and confirming it (and the SRE handling the access request needs to confirm it)
- Get confirmation from the person’s manager (after figuring out who the person’s manager is)
- Get confirmation from the service owner of the service to request access for (here the Data Engineering team which operates Hadoop)
- The user needs to provide separate SSH public keys for the SSH access to the Wikimedia production cluster
- Since LDAP does not contain a full name, the full name needs to be provided (or alternatively a pseudonym in select cases)
- If the person is a contractor or a researcher with a time-limited contract/MOU, the designated end date needs to be noted (along with a point of contact to ping when the end date is nearing, to check for extension or account revocation)
Once all the data is in place, access to the cn=wmf LDAP group is granted by the SRE (by running a CLI tool or using the OpenLDAP tools). This enables access to Turnilo and other services guarded by that group.
The SRE handling the access request pushes a git change against a YAML data structure maintained in the Puppet git repository, which adds the new SSH key for the user (along with the rest of the data provided on the task) and adds the user to the access group which enables Hadoop (here “analytics-privatedata-user”).
After logging into Wikimedia IDM: In addition to the base attributes, the user can request access to “account profiles” (this is an interim name, “role” seems like a more fitting name, please leave further feedback in the comments) which enable additional attributes which can be configured. Adding/removing an account profile can also add/remove a user from an LDAP group.
Every profile can declare a validation module which checks internal state (e.g. for the SSH keys it’s checked that certain key type requirements are upheld and that the Cloud VPS keys are distinct from the production keys). Every profile has a group of owners/administrators. If someone asks for a specific profile it can either be granted automatically (e.g. in the case of SSH access to Cloud VPS) or the request is added to a queue where it needs to be approved or rejected by the team or person managing a profile. The profile owner can also leave comments, e.g. to communicate that additional steps are necessary.
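As a sketch of how a profile's validation module and approval path might fit together, the following checks SSH keys and then either grants the profile or queues the request. The allowed key types, the dictionary layout and all function names are assumptions for illustration, not a description of the actual implementation.

```python
ALLOWED_KEY_TYPES = {"ssh-ed25519", "ssh-rsa"}  # assumption, not actual policy


def key_parts(public_key):
    """Split an OpenSSH public key line into (type, base64 blob)."""
    fields = public_key.split()
    if len(fields) < 2:
        raise ValueError("not an OpenSSH public key line")
    return fields[0], fields[1]


def validate_ssh_keys(production_keys, cloud_keys):
    """Return a list of problems; an empty list means the keys pass."""
    problems = []
    for key in production_keys:
        key_type, _ = key_parts(key)
        if key_type not in ALLOWED_KEY_TYPES:
            problems.append(f"key type {key_type} is not allowed")
    # Compare key material only, so a changed comment cannot hide reuse.
    production_blobs = {key_parts(k)[1] for k in production_keys}
    cloud_blobs = {key_parts(k)[1] for k in cloud_keys}
    if production_blobs & cloud_blobs:
        problems.append("production key is also used for Cloud VPS")
    return problems


def request_profile(profile, user, queue):
    """Validate, then grant automatically or enqueue for the owners."""
    problems = profile["validator"](user)
    if problems:
        return "invalid", problems
    if profile["auto_grant"]:
        return "granted", []
    queue.append((profile["name"], user["uid"]))
    return "pending", []


# Hypothetical profile definition for production shell access.
production_ssh = {
    "name": "production-ssh",
    "auto_grant": False,
    "validator": lambda u: validate_ssh_keys(u["production_keys"], u["cloud_keys"]),
}
```

A profile such as Cloud VPS SSH access would set `auto_grant` to true, while profiles guarding sensitive access would leave the request in the queue until an owner approves or rejects it.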
For the specific example of Hadoop access, the user would log into the Wikimedia IDM and request the “Hadoop” role. This would cause the following internal steps:
- If the user affiliation is “WMF staff” or if there’s a pre-existing NDA (e.g. for a researcher or a community member with an NDA), all is fine and we proceed. Otherwise, they are asked to initiate the NDA process (description is out of scope for this section)
- It is validated whether the user has approved the “Acknowledgement of Wikimedia Server Access Responsibilities”; if not, the user is asked to “sign” it.
- The person’s manager gets a notification via email and is asked to confirm that the user should have access to Hadoop (happens by logging into the IDM and confirming)
- The service owner for Hadoop gets a notification via email and is asked to confirm that the user should have access to Hadoop (happens by logging into the IDM and confirming)
- It’s checked whether the user has provided SSH public keys for SSH access to the Wikimedia production cluster; if not, the user is prompted to provide them
- The IDM generates a Puppet patch against the YAML structure in Wikimedia Puppet and the resulting patch is reviewed and merged by an SRE member.
- If LDAP access is also requested (which would be the case for Hadoop), the user gets added to the respective LDAP group.
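The steps above can be sketched as a function that reports what still blocks a request and, once nothing does, which actions follow. The dictionary keys and step labels are hypothetical names chosen for this illustration.

```python
def outstanding_steps(request):
    """Return the checks that still block a Hadoop access request."""
    steps = []
    if not (request["affiliation"] == "staff" or request["has_nda"]):
        steps.append("initiate NDA process")
    if not request["signed_server_access_ack"]:
        steps.append("sign server access acknowledgement")
    if not request["manager_approved"]:
        steps.append("manager approval")
    if not request["service_owner_approved"]:
        steps.append("service owner approval")
    if not request["production_ssh_keys"]:
        steps.append("provide production SSH keys")
    return steps


def next_actions(request):
    """Blocking steps if any remain, otherwise the provisioning actions."""
    pending = outstanding_steps(request)
    if pending:
        return pending
    # The Puppet patch is still reviewed and merged by an SRE.
    actions = ["generate Puppet patch for SRE review"]
    if request["needs_ldap_group"]:
        actions.append("add user to LDAP group")
    return actions
```

Keeping the checks in one pure function makes it straightforward to show the requester exactly which confirmations are still missing.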
Disabling an account
Access is revoked under the following circumstances:
- A WMF staff member departs to a new position
- Time-limited access has expired because the project is completed (e.g. an MOU for researchers, an internship or a short-term contract for some work)
- A community member is no longer interested in contributing
- As an emergency measure for security reasons (e.g. a stolen laptop or indications of account compromise)
The current removal policy is that the account itself is kept (since anyone losing access may still use the account for volunteer work). As such, the current removal process only strips the parts of access which:
- Grant access to the Wikimedia production systems
- Grant access to PII-sensitive groups in Phabricator (e.g. tasks which may contain user data)
- Grant access to PII-sensitive groups in LDAP (such as the group enabling Turnilo)
The removal of access credentials happens via a CLI tool run by SRE (offboard-user) and, if SSH access needs to be revoked, via an additional Puppet patch against the YAML structure mentioned above.
The following describes the intended new workflow once all workflow steps of the IDM have been implemented. Not all the features will be implemented at once, so there will be a mix of old and new processes until all the bits are in place.
The removal of access can be triggered by multiple events (e.g. by the Talent & Culture department of the Wikimedia Foundation notifying about someone departing their job), and the removal is carried out by an administrative account in the IDM.
By default only sensitive access is stripped (as listed above), but there is also an option to disable an account (by stripping all attributes and marking it as disabled, but ensuring that the UID is never reused).
This can be used e.g. if someone indicates that they’ll never use it again or if an account was created maliciously.
Removing an account makes the necessary changes in LDAP and, if SSH shell access is involved, generates a patch against the YAML structure in Puppet, which gets reviewed and merged by an SRE.
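The difference between stripping sensitive access and fully disabling an account, including the rule that a UID is never reused, can be sketched as follows. The account layout, the set of sensitive groups and the function names are assumptions for illustration.

```python
SENSITIVE_GROUPS = {"wmf"}  # assumption: groups guarding PII-sensitive services


def strip_sensitive_access(account):
    """Default removal: keep the account, drop sensitive access.

    Returns True when production SSH keys were present, meaning a
    Puppet patch still has to be reviewed and merged by an SRE.
    """
    account["groups"] -= SENSITIVE_GROUPS
    had_keys = bool(account["production_ssh_keys"])
    account["production_ssh_keys"] = []
    return had_keys


def disable_account(account, retired_uids):
    """Full disable: strip attributes, mark disabled, retire the UID."""
    needs_patch = strip_sensitive_access(account)
    account["groups"].clear()
    account["email"] = None
    account["disabled"] = True
    retired_uids.add(account["uid_number"])
    return needs_patch


def allocate_uid(used_uids, retired_uids, start=1000):
    """Pick the lowest UID that is neither in use nor retired."""
    candidate = start
    while candidate in used_uids or candidate in retired_uids:
        candidate += 1
    return candidate
```

Tracking retired UIDs separately from active ones is what prevents a later account from inheriting file ownership or access that belonged to a disabled one.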
Basic Django project setup
- Scaffolding, starting up a default Django project, laying out project files.
- Override the default user model (following Django best practices).
- Configure LDAP authentication for users.
- Default Django sign-in page.
- Landing page, to confirm successful sign-in.
- Sign-up form.
- Email confirmation flow
- Account creation in LDAP.
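As an illustration of the custom user model and LDAP authentication items, a Django settings fragment could look like the sketch below. It assumes the third-party django-auth-ldap package is used, which the list above does not state; the app label `accounts`, the server URI and the DN template are placeholders.

```python
# settings.py fragment (sketch, not the project's actual configuration)

# Point Django at a custom user model; "accounts" is a hypothetical app.
AUTH_USER_MODEL = "accounts.User"

# Try LDAP first, then fall back to Django's own user database.
AUTHENTICATION_BACKENDS = [
    "django_auth_ldap.backend.LDAPBackend",  # assumes django-auth-ldap
    "django.contrib.auth.backends.ModelBackend",
]

# Placeholder connection details for django-auth-ldap.
AUTH_LDAP_SERVER_URI = "ldap://ldap.example.org"
AUTH_LDAP_USER_DN_TEMPLATE = "uid=%(user)s,ou=people,dc=example,dc=org"
```

Setting `AUTH_USER_MODEL` before the first migration matters, because changing the user model later in a project's life is considerably harder.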
When linking SUL (single user login) accounts with IDM accounts, validation is required. This avoids having unclaimed SUL accounts linked to fraudulent IDM accounts.