You are browsing a read-only backup copy of Wikitech. The live site can be found at wikitech.wikimedia.org

Server Lifecycle: Difference between revisions

From Wikitech-static
imported>Volans
(Updated workflows of states and related image)
imported>Klausman

Revision as of 12:29, 8 January 2021

This page describes the lifecycle of Wikimedia servers, starting from the moment we acquire them and until the time we don't own them anymore. A server has various states that it goes through, with several steps that need to happen in each state. The goal is to standardize our processes for 99% of the servers we deploy or decommission and ensure that some necessary steps are taken for consistency, manageability & security reasons.

This assumes the handling of bare metal hardware servers, as it includes DCOps steps. While the general philosophy applies also to Virtual Machines in terms of steps handling and final status, check Ganeti#VM_operations for the usually simplified steps regarding VMs.

The inventory tool used is Netbox and each state change for a host is documented throughout this page.

States

Server Lifecycle | Netbox                      | Racked    | Power
requested        | none, not yet in Netbox     | no        | n/a
spare            | INVENTORY                   | yes or no | off
planned          | PLANNED                     | yes or no | off
staged           | STAGED                      | yes       | on
active           | ACTIVE                      | yes       | on
failed           | FAILED                      | yes       | on or off
decommissioned   | DECOMMISSIONING             | yes       | on or off
unracked         | OFFLINE                     | no        | n/a
recycled         | none, not anymore in Netbox | no        | n/a
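
The mapping in the table above can be written down as a small lookup. This is only an illustrative sketch: the function name netbox_status is hypothetical and not part of any Wikimedia tooling; the state names and Netbox values are the ones listed in the table.

```shell
#!/bin/sh
# Map a lifecycle state (first column of the table) to its Netbox status.
# Prints "none" for states that have no Netbox record.
# netbox_status is a hypothetical helper name used for illustration only.
netbox_status() {
    case "$1" in
        requested|recycled) echo "none" ;;
        spare)              echo "INVENTORY" ;;
        planned)            echo "PLANNED" ;;
        staged)             echo "STAGED" ;;
        active)             echo "ACTIVE" ;;
        failed)             echo "FAILED" ;;
        decommissioned)     echo "DECOMMISSIONING" ;;
        unracked)           echo "OFFLINE" ;;
        *) echo "unknown lifecycle state: $1" >&2; return 1 ;;
    esac
}

netbox_status decommissioned
```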

Server transitions

Diagram of the Server Lifecycle transitions. Dashed lines are for the transitions to the Failed state. Red dashed lines highlight the transition Active -> Failed -> Staged to distinguish it from the Staged <-> Failed one.
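
As a rough companion to the diagram, the transitions that have their own section on this page can be encoded as a simple check. This is a sketch under the assumption that the section headings below are the complete set of allowed transitions; the function name is_valid_transition is hypothetical.

```shell
#!/bin/sh
# Return 0 if "from -> to" is one of the transitions described on this page,
# 1 otherwise. State names are the lowercase lifecycle states.
# is_valid_transition is a hypothetical helper name.
is_valid_transition() {
    case "$1->$2" in
        "requested->spare"|"requested->planned"|"spare->planned") return 0 ;;
        "planned->staged"|"staged->active"|"active->staged") return 0 ;;
        "spare->failed"|"planned->failed"|"staged->failed"|"active->failed") return 0 ;;
        "failed->spare"|"failed->planned"|"failed->staged"|"failed->decommissioned") return 0 ;;
        "spare->decommissioned"|"active->decommissioned") return 0 ;;
        "decommissioned->spare"|"decommissioned->staged"|"decommissioned->unracked") return 0 ;;
        "unracked->recycled") return 0 ;;
        *) return 1 ;;
    esac
}

# A failed host goes back to staged, never straight to active.
if is_valid_transition failed active; then
    echo "allowed"
else
    echo "not allowed"
fi
```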

Requested

  • Hardware Allocation Tech will review request, and detail on ticket if we already have a system that meets these requirements, or if one must be ordered.
  • If hardware is already available and request is approved by SRE management, system will be allocated, skipping the generation of quotes and ordering.
  • If hardware must be ordered, then DC Operations will gather quotes from our approved vendors and perform initial reviews of the quote(s), working with the sub-team that requested the hardware.


Existing System Allocation

See the #Decommissioned -> Staged section below.

  • Only existing systems (not new) use this step if they are requested.
  • If a system must be ordered, please skip this section and proceed to Ordered section.
  • Spare pool allocations are detailed on the #Procurement task identically to new orders.
  • Task is escalated to DC operations manager for approval of spare pool systems.
  • Once approved, the same steps of updating the procurement gsheet & filing a racking task occur from the DC operations person triaging Procurement.

Ordered

  • Only new systems (not existing/reclaimed systems)
  • Quotes are reviewed and selected, then escalated to either DC Operations Management or SRE Management (budget dependent) for order approvals.
  • At the time of Phabricator order approval, a racking sub-task is created and our budget google sheets are updated. DC Ops then places the approved Phabricator task into Coupa for ordering.
  • Coupa approvals and ordering take place.
  • Ordering task is updated by Procurement Manager (Finance) and reassigned to the on-site person for DC Operations to receive (in Coupa) and rack the hardware.
  • Racking task is followed by DC Operations and resolved.

Post Order

  • An installation/deployment task should be created (if it doesn't already exist) for the overall deployment of the system/OS/service and placed in the #operations project.
  • You can include the following steps on this ticket for ease of reference (taken from the entirety of the lifecycle document):
 System Deployment Steps:
  [] - Run the Netbox ProvisionServerNetwork script to provision mgmt and primary IPv4 and mapped IPv6 addresses with related DNS names, primary interfaces (including mgmt and switch), primary cable and vlan.
  [] - Follow DNS/Netbox#Update_generated_records to create and deploy the DNS entries. [link sub-task for on-site work here, sub-task should include the ops-datacenter project]
  [] - system bios and mgmt setup and tested [link sub-task for on-site work here, sub-task should include the ops-datacenter project]
  [] - Run Homer to update the switch interface (description & vlan)
  [] - Update Puppet: install_server module (dhcp and netboot/partitioning) and check site.pp [done via this task when on-site subtasks complete]
  [] - install OS with Server_Lifecycle#Reimage [done via this task when network sub-task(s) complete]
  [] - service implementation [done via this task post puppet acceptance]

Requested -> Spare & Requested -> Planned

Receiving Systems On-Site

  • Before the new hardware arrives on site, a shipment ticket must be placed to the datacenter to allow it to be received.
  • If the shipment has a long enough lead time, the buyer should enter a ticket with the datacenter site. Note sometimes the shipment lead times won't allow this & a shipment notification will instead be sent when shipment arrives. In that event, the on-site technician should enter the receipt ticket with the datacenter vendor.
  • New hardware arrives on site & datacenter vendor notifies us of shipment receipt.
  • Packing slip for delivery should list an RT # & the RT ticket should have been assigned to the on-site technician for receipt at this time.
  • Open boxes, compare box contents to packing slip. Note on slip if correct or incorrect, scan packing slip and attach to ticket.
  • Compare packing slip to order receipt in the RT ticket, note results on ticket.
  • If any part of the order is incorrect, reply on RT ticket with what is wrong, and assign the ticket to the buyer on the ticket.
  • If the entire order was correct, please note on the procurement ticket. Unless the ticket states otherwise, it can be resolved by the receiving on-site technician at that time.
  • Assign asset tag to system, enter system into Netbox immediately, even if not in rack location, with:
  • Device role (dropdown), Manufacturer (dropdown), Device type (dropdown), Serial Number (OEM Serial number or Service tag), Asset tag, Site (dropdown), Platform (dropdown), Purchase date, Support expiry date, Procurement ticket (Phabricator or RT)
    • For State and Name:
      • If host is scheduled to be commissioned: use the hostname from the procurement ticket as Name and PLANNED as State
      • If host is a pure spare host, not to be commissioned: Use the asset tag as Name and INVENTORY as State
  • Hardware warranties should be listed on the order ticket, most servers are three years after ship date.
  • Network equipment has one year coverage, which we renew each year as needed for various hardware.
  • A Phabricator task should exist with racking location and other details; made during the post-order steps above.
  • All systems should have the following common BIOS/ILOM settings: CPU hyperthreading on, CPU virtualization off (except for virt and ganeti hosts), serial redirection to com2, redirection after POST off, boot mode set to legacy BIOS, IPMI enabled, boot order confirmed to list disk first, performance options set to OS performance per watt (Dells).
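
The Name and State rule from the list above can be sketched as a small helper. The function name netbox_name_and_state and the asset tag value are hypothetical and used only for illustration; the rule itself (hostname and PLANNED for hosts to be commissioned, asset tag and INVENTORY for pure spares) is the one stated above.

```shell
#!/bin/sh
# Print the Netbox Name and State to use when first entering a device.
# Usage: netbox_name_and_state <asset_tag> [hostname]
# Pass a hostname only if the host is scheduled to be commissioned.
# netbox_name_and_state is a hypothetical helper name.
netbox_name_and_state() {
    asset_tag=$1
    hostname=${2:-}
    if [ -n "$hostname" ]; then
        # Scheduled to be commissioned: hostname from the procurement ticket.
        printf '%s %s\n' "$hostname" "PLANNED"
    else
        # Pure spare: the asset tag doubles as the name.
        printf '%s %s\n' "$asset_tag" "INVENTORY"
    fi
}

# "WMF0000" is a made-up asset tag.
netbox_name_and_state WMF0000 somehost2002
netbox_name_and_state WMF0000
```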

Requested -> Planned additional steps & Spare -> Planned

  • A hostname must be defined at this stage:
    • Please see Server naming conventions for details on how hostnames are determined.
    • If hostname was not previously assigned, a label with name must be affixed to front and back of server.
  • Netbox entry must be updated to reflect rack location and hostname
  • Run the Netbox ProvisionServerNetwork script to assign mgmt IP, primary IPv4/IPv6, vlan and switch interface
  • Follow the DNS/Netbox#Update_generated_records to create and deploy the mgmt and primary IPs (for mgmt should include both the $assettag.mgmt.site.wmnet as well as $hostname.mgmt.site.wmnet).
  • Run Homer to configure the switch interface (description, vlan).
  • System Bios & out of band mgmt settings are configured at this time.
    • See the Platform-specific documentation for setup instructions for each system type.
    • Serial Redirection and mgmt must be tested at this time
      • On-site Tech should fully test the mgmt interface to ensure it responds to ssh, they are able to login, reboot the system, and watch a successful BIOS POST over serial console.
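
The two management DNS names mentioned above follow the patterns $assettag.mgmt.site.wmnet and $hostname.mgmt.site.wmnet. A minimal sketch of deriving both is shown here; the function name mgmt_fqdns and the asset tag value are hypothetical, and the actual records are generated through Netbox as described above, not by hand.

```shell
#!/bin/sh
# Print the two management FQDNs expected for a host, one per line.
# Usage: mgmt_fqdns <asset_tag> <hostname> <site>
# mgmt_fqdns is a hypothetical helper name.
mgmt_fqdns() {
    printf '%s.mgmt.%s.wmnet\n' "$1" "$3"
    printf '%s.mgmt.%s.wmnet\n' "$2" "$3"
}

# "wmf0000" is a made-up asset tag.
mgmt_fqdns wmf0000 somehost2002 codfw
```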

Planned -> Staged

Preparation

  • DHCP: Add server to appropriate file in Puppet, based on serial console port and speed:
  • modules/install_server/files/dhcpd/linux-host-entries.ttyS0-9600 = com port 1, speed of 9600
  • modules/install_server/files/dhcpd/linux-host-entries.ttyS0-115200 = com port 1, speed of 115200
  • modules/install_server/files/dhcpd/linux-host-entries.ttyS1-115200 = com port 2, speed of 115200 (most hosts)
  • You can pull this information from the management of most systems, as described in their specific pages under Platform-specific documentation.
  • Decide on partition mapping & add server to modules/install_server/files/autoinstall/netboot.cfg
    • Detailed implementation details for our Partman install exist here.
    • The majority of systems should use automatic partitioning, which is set by inclusion on the proper line in netboot.cfg.
    • Any hardware RAID must be set up manually by rebooting and entering the RAID BIOS.
    • Right now there is a mix of hardware and software RAID availability.
    • The file is located in the Puppet repository under modules/install_server.
    • The partman recipes used are located in modules/install_server.
    • If you are uncertain what to pick, lean towards LVM.
    • There are many reasons for this, including ease of expansion if the disk fills up.
  • Check site.pp to ensure that the host will be reimaged into the insetup or insetup_noferm roles based on the requirements. If in doubt check with the service owner.
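
The choice of DHCP file from the list above depends only on the serial console port and speed. As an illustrative sketch (the function name dhcp_entries_file is hypothetical), the lookup could look like this:

```shell
#!/bin/sh
# Print the Puppet DHCP host-entries file for a given serial console setup.
# Usage: dhcp_entries_file <com_port> <speed>
# Only the three combinations listed on this page are known.
# dhcp_entries_file is a hypothetical helper name.
dhcp_entries_file() {
    base="modules/install_server/files/dhcpd/linux-host-entries"
    case "$1:$2" in
        1:9600)   echo "$base.ttyS0-9600" ;;
        1:115200) echo "$base.ttyS0-115200" ;;
        2:115200) echo "$base.ttyS1-115200" ;;   # most hosts
        *) echo "no DHCP entries file listed for com port $1 at speed $2" >&2
           return 1 ;;
    esac
}

dhcp_entries_file 2 115200
```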

Installation

For virtual machines, where there is no physical BIOS to change but there is virtual hardware to set up, check Ganeti#Create_a_VM instead.

At this point the host can be installed. From now on the service owner should be able to take over and install the host automatically, asking DC Ops to have a look only if there are issues. As a rule of thumb, if the host is part of a larger cluster/batch order, it should install without issues and the service owner should try this path first. If instead the host is the first of a batch of new hardware, then it is probably better to ask DC Ops to install the first one. Consider it new hardware if it differs from the existing hosts by generation, management card, RAID controller, network cards, BIOS, etc.

Automatic Installation

See the #Reimage section on how to use the reimage script to install a new server. Don't forget to set the --new CLI parameter.

Change the state in Netbox to STAGED. [TODO: to be added to the reimage script]

Manual installation

Warning: if you are rebuilding a pre-existing server (rather than a brand new name), on puppetmaster clear out the old certificate before beginning this process:

 puppetmaster$ sudo puppet cert destroy $server_fqdn

1. Reboot system and boot from network / PXE boot
2. Acquires hostname in DNS
3. Acquires DHCP/autoinstall entries
4. OS installation

Run Puppet for the first time

1. From the cumin host (currently cumin1001) connect to newserver with install_console.

cumin1001:~$  sudo /usr/local/bin/install_console $newserver_fqdn
newserver# puppet agent --test

Exiting; no certificate found and waitforcert is disabled

2. On puppetmaster list all pending certificate signings and sign this server's key

puppetmaster$ sudo puppet cert -l
puppetmaster$ sudo puppet cert -s $newserver_fqdn

3. Back to the newserver, enable puppet and test it

 newserver# puppet agent --enable
 newserver# puppet agent --test

4. After a couple of successful puppet runs, you should reboot newserver just to make sure it comes up clean.
5. The newserver should now appear in puppet and in Icinga.
6. If that is a new server, change the state in Netbox to STAGED

7. Run the Netbox script to update the device with its interfaces and related IP addresses.

Note: If you already began reinstalling the server before destroying its cert on the puppetmaster, you should clean out the Puppet SSL files ON THE newserver (with care):

newserver# find /var/lib/puppet/ssl -type f -exec rm {} \;
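
Because the command above deletes files, it can help to list what would be removed first. The following is a cautious sketch, not part of the documented procedure: it runs the same find expression with -print instead of -exec rm. The function name preview_ssl_cleanup is hypothetical.

```shell
#!/bin/sh
# List the files that the cleanup command would delete, without deleting them.
# Usage: preview_ssl_cleanup [ssl_dir]
# Defaults to the directory used in the command above.
# preview_ssl_cleanup is a hypothetical helper name.
preview_ssl_cleanup() {
    ssl_dir=${1:-/var/lib/puppet/ssl}
    find "$ssl_dir" -type f -print
}

# Example (run on the newserver, as root):
#   preview_ssl_cleanup
```

Review the listed paths, then run the deletion command only if they are what you expect.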

Spare -> Failed & Planned -> Failed & Staged -> Failed

If a device in the Spare, Planned or Staged state has hardware failures it can be marked in Netbox as FAILED.

Spare -> Decommissioned

When a host in the spare pool has reached its end of life and must be unracked.

Staged -> Active

  • When a server is placed into service, documentation of the service (not specifically the server) needs to reflect the new server's state. This includes puppet file references, as well as Wikitech documentation pages.
  • Service owner pools the host back into production.
  • Service owner changes Netbox's state to ACTIVE.

Active -> Staged

This transition should be used when reimaging or when a rollback of the STAGED -> ACTIVE transition is needed.

  • Service owner performs actions to remove it from production, see the #Remove from production section below.
  • Perform the reimage using the available scripts, see the #Reimage section below.
  • Service owner changes Netbox's state to STAGED [TODO: include this step into the wmf-auto-reimage script]

Active -> Failed

When a host fails and requires physical maintenance/debugging by DC Ops:

  • Service owner performs actions to remove it from production, see the #Remove from production section below.
  • Service owner changes Netbox's state to FAILED
  • Once the failure is resolved the host will be put back into STAGED, and not directly into ACTIVE and in production.

Active -> Decommissioned

When the host has completed its life in a given role and should be decommissioned or returned to the spare pool for re-assignment.

Failed -> Spare

When the failure of a Spare device has been fixed it can be set back to INVENTORY in Netbox.

Failed -> Planned

When the failure of a Planned device has been fixed it can be set back to PLANNED in Netbox.

Failed -> Staged

When the failure of an Active or Staged device has been fixed, it goes back to the Staged state. Even if the host was ACTIVE before the failure, it needs to be tested and brought back into production by its service owner.

  • Change Netbox's state to STAGED

Failed -> Decommissioned

When the failure cannot be fixed and the host is no longer usable, it must be decommissioned before being unracked.

Decommissioned -> Spare

When a decommissioned host is going to be part of the spare pool.

Decommissioned -> Staged

When a host is decommissioned from one role and immediately returned to service in a different role, usually with a different hostname. (Ideally it should be wiped too.)

  • Still follow the #Reclaim to Spares OR Decommission steps, first decommissioning and then re-allocating the host, optionally with a new name. This requires some additional manual steps (TBD).
  • Service owner changes Netbox's state to STAGED

Decommissioned -> Unracked

The host has completed its life and is being unracked

Unracked -> Recycled

When the host physically leaves the datacenter.

If Juniper device, fill the "Juniper Networks Service Waiver Policy" and send it to Juniper through a service request so it's removed from Juniper's DB.

Server actions

Reimage

Note that as of 2020, wmf-auto-reimage only works for physical nodes. For VMs, follow the highly simplified manual installation instructions at Ganeti#Reinstall_/_Reimage_a_VM, then go to Server_Lifecycle#Manual_installation for the common manual steps.

The wmf-auto-reimage-host (single host) and wmf-auto-reimage (multiple hosts) scripts automate most of the installation/re-image tasks outlined in this document. They are installed on the cumin masters and must be run in a screen/tmux session with sudo -i (to load conftool authentication).

Read the wmf-auto-reimage -h help page for a full list of options.

Examples:

$ sudo -i wmf-auto-reimage-host  -p T206450 somehost2002.codfw.wmnet
$ sudo -i wmf-auto-reimage -c -a -p T206450 --sequential --sleep 10 host1001.codfw.wmnet host1002.codfw.wmnet host1003.codfw.wmnet

When the tool prompts you for "IPMI Password," find it in the file management in Pwstore.

Actions performed by wmf-auto-reimage:

  • Updates the Phabricator task
  • Validates FQDN of hosts (unless --new or --no-verify are set)
  • Downtimes on Icinga (unless --no-downtime is set)
  • Depool hosts via conftool (if -c, --conftool is set)
  • Sets next boot in PXE mode
  • Power cycles or powers on based on current power state
  • Uses the new hostname (if set). Note: It is essential that the new hostname is already set via DHCP and configured in DNS
  • Runs puppet once to create the certificate and the signing request to the Puppet master
  • Masks all provided systemd units to prevent them from starting automatically during the first Puppet run.
  • Triggers the first Puppet run
  • Runs Puppet on the Icinga host and set it in downtime (sometimes this might fail and some alarms may go off). Note: This is not affected by the --no-downtime flag.
  • Reboots
  • Checks if first puppet run is successful
  • Run the Netbox script to update the device with its interfaces and related IPs
  • Unmasks the masked systemd units
  • Runs httpbb if the -a, --httpbb (formerly --apache) option was used (applies to MediaWiki servers)
  • Print the conftool commands to re-pool the host (if -c )
  • Update the Phabricator task with the result

Post-flight checks:

  • Visit Icinga and search for the (unqualified) hostname. Check for critical alerts.
    • If you see an alert for the service "Check whether microcode mitigations for CPU vulnerabilities are applied": As long as the reimage succeeded, microcode fixes were already applied -- but this check only runs every 24 hours, so the alert will linger. To clear the alert, click on it, then click "Re-schedule the next check for this service," and schedule a forced check for a few minutes past the current time.
    • (MediaWiki hosts only) If you see an alert for the service "mediawiki-installation DSH group": The host is still depooled. Check the wmf-auto-reimage output for the confctl commands, and if repooling is appropriate, run them.

Notes: If something happens during reimaging and you need to restart the process, you will need to add the --no-verify --no-downtime options, for instance:

$ sudo -i wmf-auto-reimage-host -a -p T239054 -c --no-verify --no-downtime somehost.codfw.wmnet
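
The single-host invocations described in this section differ only in a few options: --new for a first install, and --no-verify --no-downtime when restarting an interrupted run. As a sketch, the helper below prints the matching command line instead of running it. The function name reimage_cmd and its mode names are hypothetical; only the options already shown on this page are used, and -a/-c would still need to be added by hand where they apply.

```shell
#!/bin/sh
# Print (do not run) a wmf-auto-reimage-host command line for one host.
# Usage: reimage_cmd <mode> <phab_task> <fqdn>
#   mode: "reimage" (existing host), "new" (first install), or
#         "retry" (restart after an interrupted run).
# reimage_cmd and the mode names are hypothetical, for illustration only.
reimage_cmd() {
    mode=$1
    task=$2
    fqdn=$3
    case "$mode" in
        reimage) extra="" ;;
        new)     extra="--new " ;;
        retry)   extra="--no-verify --no-downtime " ;;
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac
    printf 'sudo -i wmf-auto-reimage-host -p %s %s%s\n' "$task" "$extra" "$fqdn"
}

reimage_cmd retry T239054 somehost.codfw.wmnet
```

Printing the command first makes it easy to review and paste into a screen/tmux session on a cumin master.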

Troubles with the IPMI: Management Interfaces

Remove from production

  • A Phabricator ticket should be created detailing the reinstallation in progress.
  • System services must be confirmed to be offline. Make sure no other services depend on this server.
  • Remove from pybal/LVS (if applicable) - see wmf-auto-reimage option -c/--conftool and consult the LVS page
  • Check if server is part of a service group. For example db class machines are in associated db-X.php, memcached in mc.php.
  • Remove server entry from DSH node groups (if applicable). For example check operations/puppet:hieradata/common/scap/dsh.yaml

Rename while reimaging

This is a hint of a procedure that can be followed to rename a server while reimaging it. It follows the active -> decommissioned -> staged path.

  • Remove the host from production (puppet, dhcp config, etc.. but not DNS)
  • Run decommission cookbook
  • Rename in Netbox both hostname and DNS name of all IPs assigned to it and change its state from DECOMMISSIONING to STAGED.
    • For the IP addresses you can start from https://netbox.wikimedia.org/ipam/ip-addresses/, you'll likely find a mgmt to rename.
    • For the hostname, it is sufficient to do a search in the main page of Netbox. Edit the page with the hostname to rename and add the new one.
  • The decom cookbook wipes the interface IPs and switch interface config from Netbox. Before running the cookbook in the next step, add in netbox the IPs (via the IPAM->IP panel).
  • Run the sre.dns.netbox cookbook: https://wikitech.wikimedia.org/wiki/DNS/Netbox#Update_generated_records
  • Run Homer to remove the switch port configuration.
  • Patch for puppet adjusting install/roles for the new server, hieradata, conftool. Merge it.
  • Patch DHCP entry, partman entry, Merge it. Run puppet on the install server: cumin 'A:installserver' 'run-puppet-agent -q'
  • Run the wmf-auto-reimage-host script with --new
  • Get the physical re-labeling done (open a task for dc-ops)
  • Run Homer to configure the switch interface (description, vlan).
  • Once the host is back in production update its state from STAGED to ACTIVE.

Examples of all of this: phab:T256363

Reclaim to Spares OR Decommission

TODO: this section should be split in three: Wipe, Unrack and Recycle.

Steps for ANY Opsen

  • A Decommission ticket should be created detailing if system is being decommissioned (and removed from datacenter) or reclaimed (wiped of all services/data and set system as spare for reallocation).
  • System services must be confirmed to be offline. Documenting every check needed for this step on this page is not feasible at this time (but we are working to add them all). Please ensure you understand the full service details and what software configuration files must be modified. This document will only list the generic steps required for the majority of servers.
  • If server is part of a service pool, ensure it is set to false or removed completely from pybal/LVS.
    • Instructions on how to do so are listed on the LVS page.
  • If possible, use tcpdump to verify that no production traffic is hitting the services/ports
  • If server is part of a service group, there will be associated files for removal or update. The service in question needs to be understood by tech performing the decommission (to the point they know when they can take things offline.) If assistance is needed, please seek out another operations team member to assist.
    • Example: db class machines are in associated db-X.php, memcached in mc.php.
  • Remove server entry from DSH node groups (if any).
    • If the server is part of a service group, common DSH entries are populated from conftool, unless they're proxies or canaries
    • The list of dsh groups is in operations/puppet:hieradata/common/scap/dsh.yaml.
  • Run the sre.hosts.decommission cookbook, available on the cluster::management hosts (cumin[12]001 as of Oct. 2019). The cookbook is destructive and makes the host unbootable. Unlike wmf-auto-reimage, it works for both physical hosts and virtual machines. The cookbook checks for remaining occurrences of the hostname or IP in any Puppet or DNS files and warns about them. Since the current workflow is to remove the host from site.pp and DHCP only after running the cookbook, warnings about those are normal. You should still check whether the host appears in any other file where it is not expected. The most notable case is an mw appserver that is also an mcrouter proxy, which needs to be replaced before decommissioning. The actions performed by the cookbook are:
    • Downtime the host on Icinga (it will be removed at the next Puppet run on the Icinga host)
    • Detect if Physical or Virtual host based on Netbox data.
    • If virtual host (Ganeti VM)
      • Ganeti shutdown (tries OS shutdown first, pulls the plug after 2 minutes)
      • Force Ganeti->Netbox sync of VMs to update its state and avoid Netbox Report errors
    • If physical host
      • Downtime the management host on Icinga (it will be removed at the next Puppet run on the Icinga host)
      • Wipe bootloaders to prevent it from booting again
      • Pull the plug (IPMI power off without shutdown)
      • Update Netbox state to Decommissioning and delete all device interfaces and related IPs but the mgmt one
      • Disable switch interface and remove vlan config in Netbox
    • Remove it from DebMonitor
    • Remove it from Puppet master and PuppetDB
    • If virtual host (Ganeti VM), issue a VM removal that will destroy the VM. Can take few minutes.
    • Run the sre.dns.netbox cookbook to propagate the DNS changes or prompt the user for a manual patch if needed in order to remove DNS entries for the production network, and the hostname management entries, but leave the asset tag mgmt entries at this stage, servers should keep them until they are wiped and unracked.
    • Remove switch port config, either manually (eqiad) or by running Homer.
    • Update the related Phabricator task
  • Remove all references from Puppet repository:
    • site.pp
    • DHCP config from lease file (modules/install_server/files/dhcpd/linux-host-entries.ttyS... filename changes based on serial console settings)
    • Partman recipe in modules/install_server/files/autoinstall/netboot.cfg
    • All Hiera references both individual and in regex.yaml
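
After removing the references listed above, a quick search of a local checkout of the Puppet repository can confirm nothing was missed. This is a rough manual sketch and not the check performed by the decommission cookbook; the function name leftover_references is hypothetical. It uses grep with -r (recursive), -l (list file names), -w (whole-word match) and -F (fixed string).

```shell
#!/bin/sh
# List files under a checkout that still mention a hostname.
# Usage: leftover_references <hostname> <checkout_dir>
# Exits non-zero when nothing is found (grep's normal behaviour).
# -w avoids matching e.g. mw22880 when searching for mw2288.
# leftover_references is a hypothetical helper name.
leftover_references() {
    grep -rlwF -- "$1" "$2"
}

# Example (the checkout path is an assumption):
#   leftover_references mw2288 ~/operations-puppet
```

Hosts matched only through a regular expression, such as entries in regex.yaml, will not be found this way and still need a manual look.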

Steps for DC-OPS (with network switch access)

  • Confirm all puppet manifest entries removal, DSH removal, Hiera data removal.
  • Remove host's port config on switch either manually (eqiad) or by running Homer (if not already done above).
    • If manual: Move the switch port to interface-range disabled
    • # show interfaces ge-x/y/z | display inheritance helps identify configuration applied to the port
  • Update associated Phabricator ticket, detailing steps taken and resolution.
    • If system is decommissioned by on-site tech, they can resolve the ticket.
    • If system is reclaimed into spares, ticket should be assigned to the HW Allocation Tech so they can update the spares lists for allocation.

Decommission Specific (can be done by DC Ops without network switch access)

  • A Phabricator ticket for the decommission of the system should be placed in the #hardware-request project and the appropriate datacenter-specific ops-* project.
  • All further decommission steps are handled by the on-site technician.
  • Wipe all disks on system via following the directions on Dc-operations/Securely_Erasing_Media
  • Reset all system bios, mgmt bios, & raid bios settings to factory defaults.
  • Unrack system
  • Run the Offline a device with extra actions Netbox script that will set the device in Offline status and delete all its interfaces and associated IP addresses left.
    • To run the script in dry-run mode, uncheck the Commit changes checkbox.
  • Remove its mgmt DNS entries: run the sre.dns.netbox cookbook
  • Unless another system will be placed in the space vacated immediately, please remove all power & network cables from rack.

Network devices specific

  • SRX only: ensure autorecovery is disabled (see Juniper doc)
  • Wipe the configuration
    • By either running the command request system zeroize media
    • Or Pressing the reset button for 15s
  • Confirm the wipe was successful by logging in to the device via console (root/no password)

Position Assignments

The cycle above references specific position/assignments, without referring to name. To keep the document generic, we'll keep the cycle with positions listed, and just list those folks here.

  • Buyer / HW Allocation Tech: Rob H (US), Mark B (EU)
  • On-site Tech EQIAD: Chris J
  • On-site Tech CODFW: Papaul T
  • On-site Tech ULSFO: Rob H
  • Director Technical Operations : Mark B
  • Operations Technical Review: Mark B, Faidon L

See also