Help:Puppet-compiler
{{Draft}}
{{Shortcut|PCC}}
{{Cloud VPS nav}}
 
==Overview==


You can run '''puppet-compiler''' by hand to get the results of a given puppet configuration without having to deploy it to servers.


This page provides instructions for this process.


== Catalog compiler in integration Jenkins ==


There is a Jenkins job that takes a Gerrit change and runs the compiler.


Steps:
# Push your change to [[gerrit]] using [[git-review]]
# Go to https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/
# Go to "Build with parameters" https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/build
# In the form, fill in the '''change number''' (from Gerrit) and the '''list of nodes'''
# Hit the '''Build''' button
# Wait for the Jenkins job to end
# You can check for results in the Jenkins '''Console output'''
# You can see the compiled catalogs in a web frontend.  The URL structure is https://puppet-compiler.wmflabs.org/compiler_host/build_id, where:
#* '''compiler_host''' is the hostname (without domain name) of the compiler node that Jenkins dispatched the build to.  A current list of possible compiler nodes is available at https://integration.wikimedia.org/ci/label/puppet-compiler-node/
#* '''build_id''' is the unique id of the Jenkins build (changes with every run)
#* ''This link is automatically constructed and can be found at the bottom of the Jenkins console output after each build.''
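For example, a build dispatched to a hypothetical worker named <code>compiler1002</code> with build id <code>30411</code> would be published at a URL of the form:
<syntaxhighlight lang="text">
https://puppet-compiler.wmflabs.org/compiler1002/30411/
</syntaxhighlight>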


=== Host variable override ===
 
The '''list_of_nodes''' parameter supports selecting hosts using the following methods:
 
* an empty list.  In this case pcc attempts to [https://gerrit.wikimedia.org/r/plugins/gitiles/operations/software/puppet-compiler/+/refs/heads/master/puppet_compiler/nodegen.py#8 pick one host for each definition in site.pp]
* a comma-separated list of hosts, e.g. <code>example1001.eqiad.wmnet,example2001.wikimedia.org,example3001.esams.wmnet</code>
* a regular expression - you can use '''re:''' followed by a regular expression to select hosts, e.g. to select all puppetmasters use <code>re:puppetmaster.*wmnet</code>
* simplified Cumin syntax.  You can use the '''P:''', '''C:''' and '''O:''' cumin prefixes to select hosts based on their profile, class or role, e.g.
** select all hosts with the <code>envoy</code> class: <code>C:envoy</code>
** select all hosts with the <code>tlsproxy::envoy</code> profile: <code>P:tlsproxy::envoy</code>
** select all hosts with the <code>mediawiki::appserver</code> role: <code>O:mediawiki::appserver</code>
* cumin puppetdb backend expressions - you can use '''cumin:''' followed by a cumin query using [[Cumin#PuppetDB_host_selection|the puppetdb grammar]].  However, please keep in mind that the puppetdb on the pcc workers only has a subset of hosts.
 
With the simplified cumin syntax we try to select a set of hosts which covers all unique use cases of the class, profile and role, to avoid performing the same test on multiple nodes.  We select this reduced set of hosts based on the host prefix (i.e. '''mw''', '''cp''', '''db''' etc.) and the set of puppet tags applied to the host. [https://gerrit.wikimedia.org/r/plugins/gitiles/operations/software/puppet-compiler/+/refs/heads/master/puppet_compiler/nodegen.py#43 Check the code for further details].  If you pass a cumin query you will receive all hosts in the pcc worker puppetdb which match that query.  For example, at the time of writing, if one used:
* <code>P:idp</code>: the test will run on idp1001.wikimedia.org and idp-test1001.wikimedia.org
* <code>cumin:P:idp</code>: the test '''may''' run on all of idp1001.wikimedia.org, idp2001.wikimedia.org, idp-test1001.wikimedia.org and idp-test2001.wikimedia.org
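As a local illustration of the '''re:''' selector semantics (a sketch only; the hostnames below are hypothetical and this is not the PCC code itself), the regular expression is matched against the known hostnames, and only matching hosts are selected:
<syntaxhighlight lang="bash">
# Hypothetical host list; "re:puppetmaster.*wmnet" keeps only the puppetmasters.
hosts='puppetmaster1001.eqiad.wmnet
mw1001.eqiad.wmnet
puppetmaster2001.codfw.wmnet'
printf '%s\n' "$hosts" | grep -E 'puppetmaster.*wmnet'
</syntaxhighlight>
Running this prints the two puppetmaster hosts and drops mw1001.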
 
=== Gerrit integration ===
 
There is an experimental feature which allows users to specify the '''list_of_nodes''' in the Gerrit commit message.  To do this you need to specify your '''list_of_nodes''' using the keyword '''<code>Hosts: </code>''' followed by your list of hosts or one of the supported overrides listed above.
 
Make sure the list of hosts is part of the footer section (extra new line after the main text, together with '''Bug:''', '''Change-Id:''' and other footers). The commit message validator also has a maximum line width limit, so use multiple '''Hosts:''' lines if needed. [[gerrit:c/operations/puppet/+/701335|Example commit message]].
 
You can provide comments at the end of '''Hosts:''' lines to help identify the sets of machines:<syntaxhighlight lang="text">
Hosts: pc1009.eqiad.wmnet,pc2009.codfw.wmnet # pc3
Hosts: es1027.eqiad.wmnet,es2028.codfw.wmnet #  es4
</syntaxhighlight>Once this is in place you can comment on your change with [https://phabricator.wikimedia.org/T166066#5039654 '''check experimental'''] and Zuul will schedule a PCC run using the correct Gerrit ID and the hosts specified.
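Putting it together, a commit message using this feature might look like the following (the subject, task number and Change-Id here are placeholders):
<syntaxhighlight lang="text">
Example: tune apache settings on the appservers

Longer description of the change goes here.

Bug: T000000
Hosts: O:mediawiki::appserver
Change-Id: I0000000000000000000000000000000000000000
</syntaxhighlight>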
 
== Updating nodes ==
 
There is a mechanism for puppet masters to automatically send their facts data to the compiler hosts.  Configured puppet masters send their facts to the puppet compiler db host (pcc-db1001.puppet-diffs.eqiad1.wikimedia.cloud) using the <code>upload_puppet_facts</code> systemd timer.  The db host processes facts on a daily basis using the <code>pcc_facts_processor</code> systemd timer.
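To check when the processing timer last fired (a sketch, assuming you have ssh access to the db host), something like the following can be used:
<syntaxhighlight lang="console">
$ ssh pcc-db1001.puppet-diffs.eqiad1.wikimedia.cloud systemctl list-timers pcc_facts_processor.timer
</syntaxhighlight>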


=== Manually update production ===

One can manually update the production facts by running the following:

<syntaxhighlight lang="console">
$ ssh puppetmaster1001.eqiad.wmnet sudo /usr/local/sbin/puppet-facts-upload --proxy http://webproxy.eqiad.wmnet:8080
$ ssh pcc-db1001.puppet-diffs.eqiad1.wikimedia.cloud sudo systemctl start pcc_facts_processor.service
</syntaxhighlight>

=== Manually update cloud ===

Projects that use the shared puppet master can update their facts by running the following commands:

<syntaxhighlight lang="console">
$ ssh puppetmaster.cloudinfra.wmflabs.org sudo /usr/local/sbin/puppet-facts-upload
$ ssh pcc-db1001.puppet-diffs.eqiad1.wikimedia.cloud sudo systemctl start pcc_facts_processor.service
</syntaxhighlight>

Projects that have their own puppet master will first need to add the public key of that puppet master to puppet, so that the db server can accept uploads. To do this, add something like the following to <code>hieradata/cloud/eqiad1/puppet-diffs/hosts/pcc-db1001.yaml</code>:

<syntaxhighlight lang="yaml">
puppet_compiler::uploader::realms:
  deployment-prep:  # This should be the name of the horizon project
    # The below key should be the hostname of the puppet master.
    # The value should be the content of the puppet host public key
    # cat $(sudo facter -p puppet_config.hostpubkey)
    deployment-puppetmaster04: |
      -----BEGIN PUBLIC KEY-----
      MIICIjANBgkqhkiG9w0BAQEFAAOCAg8AMIICCgKCAgEApuxohaA21d8YqF5vVEIB
      06kvvEeLYsHdge3CHBwS4JVMspoXkzVDHbjbCLXMRMAJ9xy3HbsGFcE0MSr17oF2
      YMACKUidt0nNdjTUJZ8wYYWa3YqRIfUhV7C7FDCclKw9Vj73Up1BwdJMC0/S1te9
      pfHbo6nRwJDATEA1UyxgWBmUnJmqevLUvygppYeEb6IcjPhJGRia1jnK3VzNgyW8
      vRr6dbx9qZjvoY/KNMCFRrjvIxk7QUJfwxg1ZlJ8drwkm0vgKDmIN8l4zXAdPkgf
      WPRp2lpanS0vqHHILnl1UlHHf4kM7Q3H6y8QQN1OQfx4VuQIOHX5rLb8OPMdkiA4
      NQSMpWiSzJI5uUnyZm0unzu3F8d6VSAN/kgtEMnnpKA7FVCuFThW0zQGtVHz9QQn
      jE1BodAATdGmsOR4cukdfZxtYOuYmWFQsyHmvgcYaO/LXfe4tjpllhWnvPQpz48k
      8TGvctenbQH/HSo/3yFsYKMFoFGTuyWiL68hv2Ot5ZtgmxPhtTCtoEmIajvYe8k1
      EH0CKL44wBQOUmOAlHdROwQauZsqa8bXQTMEzZ8k6lXz06lGY0frhngbR53naEnY
      C0gyRPFAn46QOzOJQgzMneMSVp7IN05i4IYW/1kiQOT7Ks22UEJyZhXYpTkTnuQ6
      2jK3v7JNqnd3yHHg/iCdroUCAwEAAQ==
      -----END PUBLIC KEY-----
</syntaxhighlight>

Once the above has been updated, you can add <code>role::puppetmaster::standalone::upload_facts: true</code> to the hieradata of the project puppet master to enable uploads.  Then you can run the commands at the beginning of this section to upload the first batch of data.
 
== Purging nodes ==
There are a few different things involved in completely purging a node from PCC, and it depends a bit on which host variable override you use.  The main things at play are:
 
# The puppetdb used by <code>compiler-update-facts</code>:
#* the <code>compiler-update-facts</code> script uses the puppetdb API to export a list of active nodes; a node is considered active if it has submitted a report to puppetdb in the last 14 days (this is the production value at least)
# When you last ran <code>compiler-update-facts</code>:
#* <code>compiler-update-facts</code> exports the node data from puppetdb then rsyncs it using <code>--delete</code> to ''/var/lib/catalog-differ/puppet/yaml/'', thus purging any old nodes
# The puppetdb used by the compiler:
#* the puppet compiler also has a puppetdb instance which expires nodes after 7 days of inactivity; however, there are cron jobs which run every night to make sure this is fresh, i.e. unless manually purged, nodes won't get removed from here until 7 days after step 2
 
Which of these constraints you will hit depends on the host variable override you use.
 
* if using an empty <code>Hosts</code> list, PCC calculates the hosts from the ''site.pp'' file, so it shouldn't have any issues with old nodes (as long as they have been removed from site.pp); however, you may need to run <code>compiler-update-facts</code> for new hosts
* if providing an explicit list of hosts, again none of this matters, but you may need to run <code>compiler-update-facts</code> for new hosts
* if using the <code>re:</code> selector, PCC scans ''/var/lib/catalog-differ/puppet/yaml/'' looking for hosts matching the regex, so the actions up to step 2 would be required to purge the node
* if using any of the other selectors, PCC queries puppetdb for matching hosts, so you need to complete all steps to purge the node
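The effect of the <code>--delete</code> rsync in step 2 can be sketched locally (the directories and filenames here are stand-ins, not the real export paths):
<syntaxhighlight lang="bash">
# Stand-in directories for the puppetdb export (src) and the facts dir (dst).
src=$(mktemp -d); dst=$(mktemp -d)
touch "$src/mw1001.eqiad.wmnet.yaml"     # host still active in puppetdb
touch "$dst/decom1001.eqiad.wmnet.yaml"  # stale host, present only in dst
rsync -a --delete "$src/" "$dst/"
ls "$dst"                                # the stale host's yaml is gone
</syntaxhighlight>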
 
== Catalog compiler for CloudVPS ==
 
The [https://integration.wikimedia.org/ci/job/operations-puppet-catalog-compiler/ standard Jenkins-hosted catalog compiler] can now target VPS instances.  Because VMs are frequently created and deleted, it may be necessary to update the facts from whatever puppetmaster is hosting the VM in question. Instructions for doing that can be found at [[Nova Resource:Puppet-diffs]].
 
The hostname to use for the VM is whatever the puppetmaster thinks a host is called, which is usually the output of <code>hostname -f</code>.


== Catalog compiler local run (pcc utility) ==


There is also a tool called '''pcc''' under the ''operations/puppet/utils'' repo. You'll need your Jenkins API token to make it work, retrievable under https://integration.wikimedia.org/ci/user/$YOURUSERNAME/configure.


Example:
Example:
<syntaxhighlight lang="shell-session">
$ ./utils/pcc GERRIT_CHANGE_NUMBER LIST_OF_NODES --username YOUR_USERNAME --api-token 12312312312312313
$ ./utils/pcc 282936 oxygen.eqiad.wmnet --username batman --api-token 12312312312312313
</syntaxhighlight>

<code>--username</code> and <code>--api-token</code> can be omitted if <code>JENKINS_USERNAME</code> and <code>JENKINS_API_TOKEN</code> are set in the current environment.
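For example, with the token exported in the environment (same hypothetical values as above):
<syntaxhighlight lang="console">
$ export JENKINS_USERNAME=batman
$ export JENKINS_API_TOKEN=12312312312312313
$ ./utils/pcc 282936 oxygen.eqiad.wmnet
</syntaxhighlight>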


== Troubleshooting ==

Some common errors and mistakes.


* Catalog for Cloud VPS instances doesn't get any classes/roles.


This happens because '''$::realm''' is not set to '''labs'''.
There are patches in place to fix this, but the puppet-compiler software needs to be released with these patches.


* ERROR: Unable to find facts for host tools-services-01.tools.eqiad1.wikimedia.cloud, skipping
 
If running locally, collect facts by hand from the corresponding puppetmaster. If running in the Jenkins web service for a production host, follow [[Nova_Resource:Puppet-diffs#FAQ|these instructions]].
 
== Limitations ==
 
The puppet-compiler mechanism won't discover all the issues in the resulting catalog. If the catalog was compiled OK by Jenkins, you may still find some issues when running the puppet agent.<br>
 
Some known limitations:
 
* File sources. When declaring a <code>file { '/my/file': }</code> resource, a path given in the <code>source</code> parameter is only resolved at puppet agent runtime, so a bad path will not fail catalog compilation.
* Private Hiera lookups. The way Hiera fetches data may vary between the puppet-compiler process and the final puppet master. Specifically, secrets in the private repo.
* Hiera behavior. Currently, we don't have a way to know concretely how Hiera is behaving when compiling the catalog. See [[phab:T215507 | Phabricator ticket T215507]] for more information.
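For instance, a hypothetical manifest like the following compiles cleanly even if the referenced module file is missing, because the <code>source</code> path is only fetched from the fileserver when the agent runs:
<syntaxhighlight lang="puppet">
# Catalog compilation does not verify that this source path exists;
# the agent resolves it against the fileserver at runtime.
file { '/my/file':
  ensure => file,
  source => 'puppet:///modules/mymodule/myfile',
}
</syntaxhighlight>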
 


{{:Help:Cloud Services communication}}


== See also ==
* [[Puppet migration#Puppet Catalogs compiler]]
* [[Puppet coding/Testing#Puppet Compiler Jenkins Job]]
* [[Nova_Resource:Puppet-diffs#How_to_update_the_compiler's_facts?_(e.g._INFO:_Unable_to_find_facts_for_host_conf2001.codfw.wmnet,_skipping)|How to update the compiler's facts]]
* [https://gerrit.wikimedia.org/r/plugins/gitiles/operations/software/puppet-compiler/+/refs/heads/master/README.md#steps-to-create-testing-environment-linux Linux local Hacking instructions]
* {{phabT|97081}}


[[Category:Puppet]]
[[Category:Documentation]]
[[Category:SRE Infrastructure Foundations]]
[[Category:Cloud VPS]]

Latest revision as of 21:00, 27 May 2022
