Nova Resource:Integration/Setup: Difference between revisions
imported>Addshore (→puppet)
imported>Samtar (update post T252071 completion)
(7 intermediate revisions by 3 users not shown)
Latest revision as of 15:44, 10 June 2022
Roles
integration-agent-{type}-XXXX
Updated September 2019 based on T226233
Updated January 2021
Instances are created via https://horizon.wikimedia.org/project/instances/. You will need to pick a source image and an instance flavor.
- Source: pick the debian-11.0-bullseye image now that T252071 is complete
For the flavor, the important parts are:
- have enough disk space (the docker role notably requests 24G for /var/lib/docker, and you need enough disk remaining for /srv).
- have a 4xiops flavor, which dramatically raises the disk I/O rate limit applied to all WMCS instances.
- Flavor: pick g3.cores8.ram24.disk20.ephemeral40.4xiops
- Create a new instance named integration-agent-{type}-XXXX, where {type} is a role (example: docker) and XXXX increments starting from 1001.
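The naming rule above can be sketched as a small shell function. This is only an illustration: next_agent_name is a hypothetical helper, not an existing Wikimedia tool, and it assumes you pass it the names of the instances that already exist (for example, copied from Horizon).

```shell
# Hypothetical helper: print the next free integration-agent-{type}-XXXX name.
# Usage: next_agent_name TYPE [EXISTING_INSTANCE_NAME...]
next_agent_name() {
    agent_type=$1
    shift
    highest=1000   # numbering starts at 1001, so 1000 means "none yet"
    for name in "$@"; do
        case $name in
            # Only consider names of the same type ending in four digits.
            "integration-agent-$agent_type-"[0-9][0-9][0-9][0-9])
                number=${name##*-}
                if [ "$number" -gt "$highest" ]; then
                    highest=$number
                fi
                ;;
        esac
    done
    echo "integration-agent-$agent_type-$((highest + 1))"
}
```

Called with no existing names, as in next_agent_name docker, it prints integration-agent-docker-1001. Names of other types are ignored, so each type keeps its own sequence.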
Wait a few minutes while the instance is created and initial setup happens. Then connect to the instance over SSH and fix puppet:
- sudo rm -fR /var/lib/puppet/ssl && sudo puppet agent -tv
- If that complains:
  - get the instance's fully qualified domain name (FQDN): hostname --fqdn
  - On integration-puppetmaster-02.integration.eqiad.wmflabs, clean the old and invalid certificate(s): sudo puppet cert clean <FQDN OF INSTANCE HERE>
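The certificate reset above can be wrapped in a function so the fallback hint is printed automatically. This is a minimal sketch, not an official script: reset_puppet_ssl and the RUN dry-run switch are names made up for this example. It relies on the fact that puppet agent -t enables detailed exit codes, so an exit status of 2 means changes were applied and is not an error.

```shell
# RUN is a hypothetical dry-run switch, not part of any Wikimedia tooling:
# with the default "echo" the commands are only printed; set RUN="" to run them.
RUN=${RUN-echo}

reset_puppet_ssl() {
    # Drop the stale client certificates, then ask Puppet for a fresh run.
    $RUN sudo rm -fR /var/lib/puppet/ssl || return 1
    $RUN sudo puppet agent -tv
    status=$?
    # With -t, exit status 0 means "no changes" and 2 means "changes applied".
    case $status in
        0|2) return 0 ;;
    esac
    # Puppet still complains: show what to run on the puppetmaster.
    echo "On integration-puppetmaster-02.integration.eqiad.wmflabs, run:" >&2
    echo "  sudo puppet cert clean $(hostname --fqdn)" >&2
    return "$status"
}
```

Running reset_puppet_ssl with the default RUN only prints the two commands, which makes it easy to review them before executing anything with RUN="".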
Apply the Puppet role:
- https://horizon.wikimedia.org/project/instances/
- Click the instance then head to the tab Puppet
- Pick role::ci::slave::labs::docker
The Docker agent will have a 24G /var/lib/docker partition; the remaining disk space is allocated to /srv.
Run Puppet on the instance (puppet agent -tv) and verify:
- If a Docker agent, make sure there is a /var/lib/docker partition for Docker
- Clean unused packages: apt-get autoremove --purge
- Upgrade packages: apt-get -y dist-upgrade
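The partition check above can be scripted by looking for /var/lib/docker in the kernel's mount table. The following is a sketch under assumptions: is_mount_point and verify_docker_agent are hypothetical helper names, and the optional second argument exists only so the check can be tried against a sample mounts file.

```shell
# is_mount_point PATH [MOUNTS_FILE]
# Succeeds if PATH is listed as a mount target in MOUNTS_FILE
# (default: /proc/mounts, where the second field is the mount point).
is_mount_point() {
    awk -v target="$1" '$2 == target { found = 1 } END { exit !found }' \
        "${2:-/proc/mounts}"
}

# verify_docker_agent [MOUNTS_FILE]
# Reports whether the Docker partition exists; it does not change anything.
verify_docker_agent() {
    if is_mount_point /var/lib/docker "$@"; then
        echo "OK: /var/lib/docker is a separate partition"
    else
        echo "MISSING: /var/lib/docker is not mounted; check the Puppet run" >&2
        return 1
    fi
}
```

Once verify_docker_agent succeeds, the package cleanup and upgrade commands from the list can be run by hand as usual.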
Reboot the instance (before adding it to Jenkins). This cleans state, takes into account the new Linux kernel, if any, and launches daemons. Once it is back, you can add it to Jenkins.
Add the instance to Jenkins
- Create "New Node" in Jenkins management (https://integration.wikimedia.org/ci/computer/)
  - Name: (short hostname of instance)
  - Type: Permanent Agent
  - Executors: 4 for Docker agents, 1 for Qemu agents
  - Remote root directory: /srv/jenkins/workspace
  - Labels:
    - For Docker agents: Docker
    - For Qemu agents: Qemu
  - Usage: EXCLUSIVE (Only build jobs with label restrictions matching this node)
  - Launch method: SSH
    - Host: (internal IP of instance)
    - Credentials: jenkins-deploy (key from role::ci::slave::labs::common)
  - Availability: Always (Keep this slave on-line as much as possible)
The Jenkins master will automatically trust the ssh key upon the first connection.
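As a quick reference while filling in the "New Node" form, the per-type values listed above can be printed by a small function. This is only a convenience sketch: jenkins_node_settings is a hypothetical name, it does not talk to Jenkins, and the lowercase type arguments docker and qemu are an assumption based on the instance naming scheme.

```shell
# jenkins_node_settings TYPE
# Prints the form values for the given agent type (docker or qemu).
jenkins_node_settings() {
    case $1 in
        docker) executors=4; label=Docker ;;
        qemu)   executors=1; label=Qemu ;;
        *)      echo "unknown agent type: $1" >&2; return 1 ;;
    esac
    cat <<EOF
Type: Permanent Agent
Executors: $executors
Remote root directory: /srv/jenkins/workspace
Labels: $label
Usage: EXCLUSIVE
Launch method: SSH
Credentials: jenkins-deploy
Availability: Always
EOF
}
```

The name and host fields are left out on purpose, since they depend on the instance's short hostname and internal IP.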
integration-dev
- Create instance: m1.medium
- Security group: Default
- Wait 10 minutes
- Reconfigure instance from wikitech: Enable role::ci::slave::labs.
- Via SSH, force a puppet run (applies role).
Utilities
puppet
This page may be outdated or contain incorrect details. Please update it if you can.
Use: sudo /usr/local/sbin/puppet-run &
Don't use sudo puppet agent -t, because that is not what cron uses and leads to inconsistencies with e.g. umask and other factors affecting default values used at runtime.