Portal:Data Services/Admin/Runbooks/Enable NFS for a project

All procedures in this runbook require admin permissions to complete.

=== Overview ===
[[Portal:Data_Services/Admin/Shared_storage|NFS]] is the primary shared storage system for projects in Cloud VPS and is the main platform for users to place code on the Toolforge execution environment. When a Cloud VPS project would like to use shared storage, we provide a fairly simple path for them to do so. Generally, all of this is done in response to a ticket.


We use NFS for several purposes within Cloud VPS: dumps, scratch, $home, and /data/project. Dumps and scratch services use existing NFS servers that are shared across projects; to support $home or /data/project, you will need to start by creating a new NFS server within the requesting project.

=== Server creation ===
{{Note|If a project only requires access to the shared 'scratch' service or content dumps, skip this section; those servers already exist.}}

There are ready-made cookbooks for creating a new project-local NFS server; those cookbooks are documented at [[Portal:Data_Services/Admin/Runbooks/Create_an_NFS_server]]. You may also want to create a second, failover server if users of the project are sensitive to downtime.

Typically, for a project named 'foo', you would create a server with the prefix 'foo-nfs' and the volume name 'foo'.
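
As a rough sketch only (the cookbook name and the --service-ip flag appear later on this page; the other flags shown here are assumptions, so check the linked runbook for the real arguments):
<syntaxhighlight lang="shell-session">
user@cumin-host:~$ # hypothetical invocation; flag names other than --service-ip are assumptions
user@cumin-host:~$ sudo cookbook wmcs.nfs.add_server --project foo --prefix foo-nfs --volume foo --service-ip
</syntaxhighlight>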

Scaling up to a larger VM (e.g. one with more cores or RAM) is fairly easy, so feel free to start with a small server flavor. Scaling up the storage of an NFS server should also be fairly straightforward, but it's best to leave some slack in the first place.

Once the new NFS server is built, note the path to the NFS volume. It will be something like /srv/foo. If you plan to host multiple shares from this server (e.g. both $home and /data/project), create subdirectories for those shares: /srv/foo/home and /srv/foo/project.
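
A minimal sketch of creating those share subdirectories on the new server (the hostname is illustrative; paths follow the 'foo' example above):
<syntaxhighlight lang="shell-session">
user@foo-nfs-1:~$ sudo mkdir -p /srv/foo/home /srv/foo/project   # one subdirectory per planned share
user@foo-nfs-1:~$ ls /srv/foo
home  project
</syntaxhighlight>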

Note that the new VM will not run an NFS service or export any shares until the yaml configuration step, below.


==== Find out the GID for the project ====
The NFS server will want to know the project GID. On any cloud VM, you can run:

<code>$ getent group project-$project_name</code>

{{Note|content=Any Cloud VPS project is 'project-$name' in LDAP.}}

For example:
<syntaxhighlight lang="shell-session">
user@anyvm:~$ getent group project-wikilink
project-wikilink:*:54031:nskaggs,samwalton9,suecarmol,jsn,novaadmin,crucio
</syntaxhighlight>
The GID is the third field (here, 54031).

==== Update yaml config in Puppet ====
Add a section for the new project to modules/labstore/templates/nfs-mounts.yaml.erb with the project GID and whichever mounts are required. An entry for a project that mounts everything we've got would look like this:

<syntaxhighlight lang="yaml">
  testlabs:
    gid: 50302
    mounts:
      dumps: true
      home: testlabs-nfs.svc.testlabs.eqiad1.wikimedia.cloud:/srv/testlabs/home
      project: testlabs-nfs.svc.testlabs.eqiad1.wikimedia.cloud:/srv/testlabs/project
      scratch: scratch.svc.cloudinfra-nfs.eqiad1.wikimedia.cloud:/srv/scratch
</syntaxhighlight>
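
For contrast, a project that only needs the shared services (see the note under Server creation) gets an entry with no project-local server at all. A minimal sketch, with a hypothetical project name and GID:
<syntaxhighlight lang="yaml">
  barlabs:        # hypothetical project
    gid: 50999    # hypothetical GID, from 'getent group project-barlabs'
    mounts:
      dumps: true
      scratch: scratch.svc.cloudinfra-nfs.eqiad1.wikimedia.cloud:/srv/scratch
</syntaxhighlight>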


Dumps is a special case and only needs to be set to 'true' to work. Any other mount must specify the address of the NFS server followed by the path to the share on that server. Rather than using a specific VM's fqdn, you can use the service name, which will ease any future failovers. Any server created using the wmcs.nfs.add_server cookbook with the --service-ip flag will already have a service address associated with it, which you can find via the Horizon dns->zones interface.

Once puppet is patched, run <code>run-puppet-agent</code> on labstore1004. This will trigger nfs-exportd's configuration changes and restart it. That should create a new file for the project under <code>/etc/exports.d</code> on labstore1004, configured with the project's IPs. There should also be a lot of 'nfsd' processes running on the server.
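
A quick sanity check on the server afterwards (a sketch: exportfs and pgrep are standard tools, and the grep pattern assumes the project is named 'foo'):
<syntaxhighlight lang="shell-session">
user@labstore1004:~$ ls /etc/exports.d/            # a new exports file for the project should appear
user@labstore1004:~$ sudo exportfs -v | grep foo   # confirm the share is actually exported
user@labstore1004:~$ pgrep -c nfsd                 # expect many nfsd worker processes
</syntaxhighlight>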


Note that this same yaml file is also consumed by the NFS client hosts: it tells them what to mount.


=== Enabling on the VMs ===
Use the hiera key <code>mount_nfs</code> to opt in or out (e.g. <code>mount_nfs: true</code>). The default is false at this time. Once the above work is complete, a puppet run on a VM with this key set to true will mount NFS as specified.

Users can be instructed to do this step themselves. This will also enable tc traffic shaping on the VM client, which will not remove itself if NFS is later removed. Setting <code>mount_nfs: false</code> will not remove existing NFS mounts; you must do that by hand after changing hiera.
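
To verify on a client VM afterwards (a sketch; <code>run-puppet-agent</code> is referenced above and findmnt is standard util-linux):
<syntaxhighlight lang="shell-session">
user@foo-vm:~$ sudo run-puppet-agent        # apply the new hiera setting
user@foo-vm:~$ findmnt -t nfs,nfs4          # the mounts defined in nfs-mounts.yaml.erb should appear
</syntaxhighlight>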
== Historical notes for bare-metal pre-NFS servers ==
As of 2022-02-15 we are rapidly moving Cloud VPS projects onto project-specific NFS servers. Straggler projects may still use the bare-metal servers, but any new projects should follow the newer docs above.

Old $home and /data/project mounts were stored on labstore1004; the old maps and scratch mounts were on cloudstore1009 in /srv/maps and /srv/scratch.


=== Support contacts ===


'''Communication and support'''

Support and administration of the WMCS resources is provided by the Wikimedia Foundation Cloud Services team and Wikimedia movement volunteers. Please reach out with questions and join the conversation:

* Discuss and receive general support
* Receive mail announcements about critical changes
** Subscribe to the cloud-announce@ mailing list (all messages are also mirrored to the cloud@ list)
* Track work tasks and report bugs
** Use the Phabricator workboard #Cloud-Services for bug reports and feature requests about the Cloud VPS infrastructure itself
* Learn about major near-term plans
** Read the News wiki page
* Read news and stories about Wikimedia Cloud Services
** Read the Cloud Services Blog (for the broader Wikimedia movement, see the Wikimedia Technical Blog)

=== Related information ===
* [[Portal:Data Services/Admin/Shared storage]]