Difference between revisions of "Production access"

From Wikitech-static
''For instructions on accessing public Cloud VPS and Toolforge instances, see [[Help:Access]].''
#REDIRECT [[SRE/Production access]]
'''Production''' (sometimes called '''prod''') is the network of servers that run the real, live [[metawiki:Wikimedia_projects|Wikimedia websites]]. Access to production is necessary for [[Deployments|deploying updates]] and other [[w:Site reliability engineering|site reliability engineering]] work, as well as for [[Analytics/Data access|accessing sensitive data]]. This page explains how to request and set up this access.
'''Remember: production access is extremely sensitive'''. With production access, it's possible to break our websites or steal private data about users' activities. If you have access, act carefully and take [[phab:L3|the server access responsibilities]] seriously. Immediately [[SRE Team requests|contact the SRE team]] if you have any doubts about security or if something goes wrong.
== Eligibility ==
Production access is regulated by the [[mw:Wikimedia Site Reliability Engineering|SRE team]]; they grant it only when it is strictly needed and can deny any request that creates an unacceptable security risk.
A request that is likely to be granted needs to contain three main things:
# A '''clear, ongoing need''' for the access. Requests based on a one-time need will not be granted. If you have a one-time need for data, request the data instead.
# A '''non-disclosure agreement''' with the Wikimedia Foundation. If you work for the Foundation, this was probably included in your employment agreement. Otherwise, you'll have to follow [[Volunteer NDA#Privileged LDAP or shell access|the volunteer NDA process]].
# '''Support from a relevant Wikimedia Foundation employee'''. If you work for the Foundation, this should be your supervisor; otherwise, this should be the employee you will be collaborating with.
== Request process ==
[[File:Gamagory shell museum2 2004.jpg|thumb|right|Shells!]]If you've satisfied the eligibility requirements above, you can make the access request by following these instructions.
=== Accounts ===
To follow these instructions, you'll need the following accounts:
* A [[phab:|Phabricator]] account. If you don't have one, see the instructions [[mw:Phabricator/Help#Creating your account|for creating an account on mediawiki.org]].
* A [[Help:Create a Wikimedia developer account|Wikimedia developer account]]. If you don't have one, follow the link.
=== Signing the agreement ===
Next, read and sign the [[phab:L3|Acknowledgement of Wikimedia Server Access Responsibilities]]. Make sure you actually read it; this is a legal agreement and by signing it, you are committing to follow the security practices it describes.
=== Generating your SSH key ===
Since production access uses the [[:en:Secure_Shell|Secure Shell protocol]] (SSH), you'll have to generate a '''new''' SSH keypair. Do '''not''' reuse an existing key; this presents an unacceptable security risk.
GitHub has a [https://help.github.com/articles/generating-a-new-ssh-key-and-adding-it-to-the-ssh-agent/#platform-mac good help page] on generating SSH keys (note that you can switch between Mac, Windows, and Linux documentation right under the title).
We recommend that you use an ED25519 key (or, alternatively, a 4096-bit RSA key). Do ''not'' use DSA keys as they are insecure and rejected by our SSH servers.
To generate an ED25519 key, run the following command in your terminal:
<syntaxhighlight lang="bash">
ssh-keygen -t ed25519
</syntaxhighlight>
To generate an RSA key, run the following command in your terminal:
<syntaxhighlight lang="bash">
ssh-keygen -t rsa -b 4096 -o
</syntaxhighlight>
''Some systems don't support the newer <code>-o</code> option, which saves private keys in a slightly more secure format (OpenSSH rather than PEM). Such systems should be fairly rare, since the option was introduced in OpenSSH 6.5.''
The minimum bit length for this key is 2048 (<code>-b 2048</code>), which was the OpenSSH default before version 8.0 raised it to 3072. More bits won't hurt.
Remember: the key you use for production access must be different from the key you use for [[Portal:Cloud VPS|Cloud VPS]], so do '''not''' paste it into the Openstack field under Special:Preferences on this wiki.
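As a quick sanity check on the key type rules above, the following is a minimal sketch of a shell function that reads the type field of a public key file. The function name <code>check_pubkey_type</code> is hypothetical and not part of any Wikimedia tooling; it only inspects the first field of the file and does not verify RSA key length.

<syntaxhighlight lang="bash">
# Hypothetical helper: report whether a public key file uses a key type
# that this page recommends. It does not check RSA bit length.
check_pubkey_type() {
    local pubkey_file="$1"
    local key_type

    # The first whitespace-separated field of an OpenSSH public key is its type.
    key_type="$(awk 'NR == 1 { print $1 }' "$pubkey_file")"

    case "$key_type" in
        ssh-ed25519) echo "ok: ED25519 key" ;;
        ssh-rsa)     echo "ok: RSA key (confirm the length with: ssh-keygen -l -f $pubkey_file)" ;;
        ssh-dss)     echo "rejected: DSA keys are insecure"; return 1 ;;
        *)           echo "unrecognized key type: $key_type"; return 1 ;;
    esac
}
</syntaxhighlight>

For example, <code>check_pubkey_type ~/.ssh/your_production_ssh_key.pub</code> could be run before pasting the public key into the access request.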
=== Filing the request ===
#[[phab:maniphest/task/edit/form/8/|Create a ticket requesting access]].<ref name=":0">The form automatically adds the ticket to the [[phab:tag/ops-access-requests/|Ops-Access-Requests]] project so the SRE team will see your request.</ref>
## In the title, replace "RESOURCE" with the resource you need access to and "USER" with your name. (For new user requests, make a separate ticket for each user.)
## Add the following information to the description:
##* Your full name
##* Your [[mediawikiwiki:Developer_access|developer access]] username (that is, the one you use for Cloud VPS SSH, not Wikitech login. Wikitech shows this as "instance shell account name" in [[Special:Preferences|preferences]]). We will use this as your production shell username.
##* The public key from your new SSH keypair.<ref>You can also put your public key on your wiki user page, in a Phabricator paste, or in a Gerrit patchset you upload, but you can't include it in an email reply to the task.</ref>
##* A detailed reason for your request. In particular, describe which specific servers you need access to and why. We err on the side of giving fewer permissions rather than more, so the more detailed your request, the more likely you are to get all the permissions you need.
# Get approvals from the following people as comments to the Phabricator task. The comments should be made directly through the web interface, not via email.<ref>This protects against [[w:Email_spoofing|email spoofing]].</ref>
#* The relevant Wikimedia Foundation employee, as explained above.
#* The project lead where your access will be granted.
# For most requests, a three business day waiting period must be observed after the request is filed.<ref name=":1">If you request any level of [[w:Sudo|sudo]] privileges, your request must have a security review at a weekly SRE meeting. Sudo access is granted on an extremely limited basis, and will typically apply to the smallest permissions possible (user/process restricted over all). Expect this process to take at least one business week.</ref>
#*This waiting period may not be required when the request corrects a previous request, but it should be observed for escalations that include permissions not previously approved. It may also be waived in some other circumstances.
# When your request is approved, you will be asked to provide your full legal name, preferred email address for contact, and physical address to the Wikimedia Foundation Legal team (or your employee contact may forward this information on your behalf). This information will be used to customize a non-disclosure agreement, which you will be asked to read, comprehend, and electronically sign through the Foundation's contract management system. The agreement will be similar to the [[Volunteer NDA]].
# The Wikimedia Foundation employee who will be supervising your work will coordinate final sign-off by an [[foundation:Delegation of authority policy#Schedule of Financial Delegations Authority|Executive level staff member of the Wikimedia Foundation]] once all other criteria have been met and before your access is granted.
# Shell access and access to private data are different things. Access to data is granted to volunteers only if they have a formal collaboration with the research team.
If you feel an unreasonable amount of time has passed, you can comment on the ticket to request an update, or directly ask the Operations team member on [[Ops Clinic Duty]] that week.
=== Technical details ===
Production shell users, their keys, and their permissions are managed in <code>[[phab:diffusion/OPUP/browse/production/modules/admin/data/data.yaml|modules/admin/data/data.yaml]]</code> in the ''operations/puppet.git'' repository.
== Setting up your access ==
===Setting up your SSH config===
The standard configuration for people without root access is to establish the SSH connection to a bastion host and proxy it from there to the target host inside the cluster. To do this, add the following to your SSH config file (usually located at ''$HOME/.ssh/config''):
<syntaxhighlight lang="apache">
# Configure the initial connection to the bastion host; set HostName to the bastion closest to you
Host bast
    User your_username_here
    HostName bast1002.wikimedia.org
    IdentityFile ~/.ssh/your_production_ssh_key
    ForwardAgent no
    IdentitiesOnly yes
# Proxy all connections to internal servers through the bastion host
Host *.wmnet
    User your_username_here
    ProxyCommand ssh -W %h:%p bast
    IdentityFile ~/.ssh/your_production_ssh_key
    ForwardAgent no
    IdentitiesOnly yes
</syntaxhighlight>
In the example above you may replace ''bast1002.wikimedia.org'' with the bastion that is physically closest to you:{{BastionMap|caption=1}}
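On clients running OpenSSH 7.3 or later, the <code>ProxyJump</code> directive offers a shorter way to express the same proxying. The fragment below is a sketch under that assumption and is not part of the standard configuration; it reuses the <code>bast</code> alias and the placeholder username and key path from the example above.

<syntaxhighlight lang="apache">
# Assumes OpenSSH 7.3+ on your machine and the "Host bast" entry shown above
Host *.wmnet
    User your_username_here
    ProxyJump bast
    IdentityFile ~/.ssh/your_production_ssh_key
    ForwardAgent no
    IdentitiesOnly yes
</syntaxhighlight>

Use either this form or the <code>ProxyCommand</code> form for <code>*.wmnet</code>, not both.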
===Advanced: operations config===
If you will be setting up new servers or doing other administration work, you can use the below advanced configuration instead. Otherwise, skip this section. If you're not sure, you almost certainly don't need this!
{{Collapse top|Advanced $HOME/.ssh/config for production root users}}
<syntaxhighlight lang="apache">
## Production & External Zones
Host iron.wikimedia.org bast1002.wikimedia.org bast2002.wikimedia.org bast3004.wikimedia.org bast4002.wikimedia.org bast5001.wikimedia.org bastion-restricted.wmflabs.org
    StrictHostKeyChecking yes
    ProxyCommand none
    ControlMaster auto
    IdentitiesOnly yes
Host *.wikimedia.org !gerrit.wikimedia.org !git-ssh.wikimedia.org
    User your_username_here
    StrictHostKeyChecking yes
    IdentitiesOnly yes
    IdentityFile ~/.ssh/your_production_ssh_key
    UserKnownHostsFile ~/.ssh/known_hosts.d/wmf-prod
    ProxyCommand ssh -a -W %h:%p bast1002.wikimedia.org
## Internal Zones
Host *.mgmt.eqiad.wmnet *.mgmt.codfw.wmnet *.mgmt.ulsfo.wmnet *.mgmt.esams.wmnet *.mgmt.eqsin.wmnet
    User root
    StrictHostKeyChecking no
Host *.wmnet
    User your_username_here
    StrictHostKeyChecking yes
    IdentitiesOnly yes
    IdentityFile ~/.ssh/your_production_ssh_key
    UserKnownHostsFile ~/.ssh/known_hosts.d/wmf-prod
Host *.eqiad.wmnet
    ProxyCommand ssh -a -W %h:%p bast1002.wikimedia.org
Host *.codfw.wmnet
    ProxyCommand ssh -a -W %h:%p bast2002.wikimedia.org
Host *.esams.wmnet
    ProxyCommand ssh -a -W %h:%p bast3004.wikimedia.org
Host *.ulsfo.wmnet
    ProxyCommand ssh -a -W %h:%p bast4002.wikimedia.org
Host *.eqsin.wmnet
    ProxyCommand ssh -a -W %h:%p bast5001.wikimedia.org
## Networking Equipment
Host *-eqiad.wikimedia.org *-eqord.wikimedia.org
    ProxyCommand ssh -a -W %h:%p bast1002.wikimedia.org
Host *-codfw.wikimedia.org *-eqdfw.wikimedia.org
    ProxyCommand ssh -a -W %h:%p bast2002.wikimedia.org
Host *-esams.wikimedia.org *-knams.wikimedia.org
    ProxyCommand ssh -a -W %h:%p bast3004.wikimedia.org
Host *-ulsfo.wikimedia.org
    ProxyCommand ssh -a -W %h:%p bast4002.wikimedia.org
Host *-eqsin.wikimedia.org
    ProxyCommand ssh -a -W %h:%p bast5001.wikimedia.org
## Gerrit and Cloud VPS
Host gerrit.wikimedia.org
    User your_username_here
    StrictHostKeyChecking yes
    ProxyCommand none
    IdentitiesOnly yes
    IdentityFile ~/.ssh/your_development_ssh_key
    UserKnownHostsFile ~/.ssh/known_hosts.d/wmf-cloud
Host *.wmflabs.org *.wmflabs
    User your_username_here
    IdentityFile ~/.ssh/your_development_ssh_key
    StrictHostKeyChecking no
    UserKnownHostsFile ~/.ssh/known_hosts.d/wmf-cloud
    ProxyCommand ssh -a -W %h:%p bastion-restricted.wmflabs.org
</syntaxhighlight>
{{Collapse bottom}}
==== Known host files ====
To ensure the validity of the hosts you connect to, enable the ''StrictHostKeyChecking yes'' option and create a local list of known hosts. A [[phab:P5608| utility script is available]] to generate that list and keep it up to date. Read the instructions in the script's header for help on usage. If you need any additional help, contact the script's author.
Before you can use the script, you'll need to bootstrap this setup with at least one bastion host.  Disable strict host key checking, ssh to a bastion, and make sure the fingerprint matches what's listed at [[Help:SSH Fingerprints]].
=== Security ===
{{see also|Help:SSH Fingerprints}}
Do ''not'' use SSH agent forwarding (the <code>-A</code> command line option). Agent forwarding does not make it possible to steal your private key itself, but it does make it possible for someone to hijack your SSH agent and thus your identity, so we do not use it. The <code>-a</code> option (with a lowercase "a") ''disables'' agent forwarding, and is thus included in the sample configurations above.
This page used to recommend adding the following lines to protect against [https://www.debian.org/security/2016/dsa-3446 an SSH bug from 2016]. '''Do not use them any longer''':
<div style="border: 2px solid #d33; padding: 0 16px;">
<syntaxhighlight lang="apache" class="tpl-code-negative">
Host *
    UseRoaming no
</syntaxhighlight>
</div>
However, we are now using an updated version which removed the vulnerable options, so '''you will get an error if your config includes the lines above'''. Just remove them from your config to connect.
Do not use your production cluster SSH key for any other service, including Gerrit or Cloud VPS.
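The two configuration problems described in this section can be checked for mechanically. The following is a minimal sketch, assuming a plain config file with one setting per line; the function name <code>audit_ssh_config</code> is hypothetical, and the simple line matching does not follow <code>Include</code> directives or distinguish between <code>Host</code> blocks.

<syntaxhighlight lang="bash">
# Hypothetical helper: flag SSH client settings that this section warns against.
# Returns 0 if nothing was flagged, 1 if a problem was found, 2 if the file is unreadable.
audit_ssh_config() {
    local config="${1:-$HOME/.ssh/config}"
    local status=0

    if [ ! -r "$config" ]; then
        echo "cannot read $config" >&2
        return 2
    fi

    # Agent forwarding should stay disabled.
    if grep -Eiq '^[[:space:]]*ForwardAgent[[:space:]=]+yes' "$config"; then
        echo "WARNING: ForwardAgent yes found in $config"
        status=1
    fi

    # This page says the old UseRoaming lines now cause an error and should be removed.
    if grep -Eiq '^[[:space:]]*UseRoaming' "$config"; then
        echo "WARNING: UseRoaming found in $config; remove it"
        status=1
    fi

    return "$status"
}
</syntaxhighlight>

Running <code>audit_ssh_config</code> without arguments checks ''$HOME/.ssh/config''. A clean result only means that neither pattern was found; it does not confirm that the rest of the configuration is correct.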
=== Other tips ===
* [[Fundraising/tech/ssh config|Fundraising infrastructure config]]
* [https://phabricator.wikimedia.org/P433 Greg Grossmeier's SSH config]
* [[Managing multiple SSH agents]]
* [https://docs.google.com/document/d/1BwB92e-wNc-y6c5DYfBj7ZxdRFmYlKa-ijzp4t-2f0c/edit Notes] on configuring SSH for production shell access (for the purpose of working with the stats servers stat1002/3/4), by [[User:Zareen|Zareen]]
* [https://people.wikimedia.org/~dzahn/bastion.sh.txt (experimental) Bash script to detect the correct bastion and auto-fix SSH config]
== Debugging ==
If your production access has been approved but you aren't able to log in, you can ask for help in the Phabricator ticket for your access request. If you got access a long time ago and it's a new problem, you can file a new ticket and tag it with [[phab:tag/operations/|#operations]].
Wherever you ask for help, make sure you include your SSH configuration (but not your key itself!) and the output you get when you run your ssh command with the <code>-v</code> option (verbose mode).
If you are prompted for a password when attempting to SSH into production, it generally means that your client is misconfigured; most often you are presenting the wrong public key to the server. <code>ssh -v</code> can help you debug this. To keep things clear when debugging, it's best to connect directly to a bastion host, e.g. <code>ssh -v bast1002.wikimedia.org</code>.
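Verbose output is long, and the lines that show which key was offered are easy to miss. The following is a minimal sketch of a filter for saved <code>ssh -v</code> output. The function name <code>ssh_key_debug_lines</code> and the log file name are hypothetical, and the matched phrases reflect typical OpenSSH client messages, whose exact wording can vary between versions.

<syntaxhighlight lang="bash">
# Hypothetical helper: show only the key-related lines from saved `ssh -v` output.
ssh_key_debug_lines() {
    grep -E 'identity file|Offering .*public key|Permission denied' -- "$1"
}

# Typical usage (commented out because it needs network access):
#   ssh -v bast1002.wikimedia.org 2> ssh-debug.log
#   ssh_key_debug_lines ssh-debug.log
#   ssh-keygen -l -f ~/.ssh/your_production_ssh_key.pub   # fingerprint to compare against
</syntaxhighlight>

If the key path or fingerprint in the filtered output does not match your production key, the <code>IdentityFile</code> setting for that host is the first thing to check.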
== See also ==
* [[Help:Access]] for instructions on accessing Cloud VPS and Toolforge instances
* [[Help:SSH Fingerprints]] for fingerprints of ssh bastion servers
* [[Proxy access to cluster]] for direct web access to production servers behind the firewall
* [[Yubikey-SSH]] and [[Yubikey4 and gpg-agent]] for instructions on using a YubiKey device to manage your ssh key
* [[Managing multiple SSH agents]] for help configuring separate ssh-agent instances for different security realms
* [[Fundraising/tech/ssh config]] for help configuring ssh for access to hosts in the ''frack'' environment
== Notes ==
<references />
[[Category:Operations policies]]

Latest revision as of 15:35, 1 September 2021