You are browsing a read-only backup copy of Wikitech. The live site can be found at wikitech.wikimedia.org

Puppet Hiera: Difference between revisions

Latest revision as of 08:52, 11 July 2022

Hiera is a component of Puppet which allows storing configuration data outside of manifests. See Puppet coding for more general information about how, when, and where to use Puppet and Hiera on the Wikimedia cluster, and the upstream Hiera documentation for more general information about Hiera.

Organization

Every variable that puppet looks up in hiera is searched via one or more backends, as configured in the hiera.yaml for that environment (production, cloud VPS or pontoon).

Depending on the kind of search (string, array or hash), hiera will search hierarchically within the sources using the configured Merge Behaviour. The default merge behaviour is first, which means the first entry found is used. It's also worth noting that the merge behaviour only affects hashes and arrays.
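The effect of the merge behaviour can be illustrated with a minimal Python sketch. This is not how hiera is implemented; it only models the upstream first, unique and hash behaviours over an ordered list of data layers. The keys and values in LAYERS are hypothetical, and the deep behaviour and nested-array flattening are left out.

```python
# Hypothetical layered data, highest priority first
# (think: host file, then role file, then common file).
LAYERS = [
    {"profile::example::servers": ["a"]},
    {"profile::example::servers": ["b", "a"],
     "profile::example::opts": {"x": 1}},
    {"profile::example::opts": {"x": 0, "y": 2},
     "profile::example::port": 80},
]


def lookup(key, layers, merge="first"):
    """Simplified model of a hierarchical lookup with a merge behaviour."""
    # Collect every value for the key, keeping the hierarchy order.
    found = [layer[key] for layer in layers if key in layer]
    if not found:
        return None
    if merge == "first":
        # Default: the highest-priority entry wins, nothing is merged.
        return found[0]
    if merge == "unique":
        # Arrays from all layers are combined and duplicates dropped.
        result = []
        for value in found:
            items = value if isinstance(value, list) else [value]
            for item in items:
                if item not in result:
                    result.append(item)
        return result
    if merge == "hash":
        # Shallow hash merge: apply the lowest priority first so that
        # higher-priority layers overwrite conflicting keys.
        merged = {}
        for value in reversed(found):
            merged.update(value)
        return merged
    raise ValueError(f"unsupported merge behaviour: {merge}")
```

With first, a string such as profile::example::port is simply taken from the first layer that defines it; only array and hash values change when a different behaviour is selected.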

Our production hiera config searches for entries in the following order. The paths listed below are from the production puppetmaster; the hieradata directory referenced below is the hieradata directory from the puppet repo. Please note the structure below is a simplified version of the main hiera file, showing the expanded paths and the hiera backend used. Check the file in git to see the full view.

hierarchy:
  - backend: standard
    path: "/etc/puppet/hieradata/hosts/%{::hostname}.yaml"
  - backend: standard
    path: "/etc/puppet/netbox/hieradata/hosts/%{::hostname}"
  - backend: wmflib::regex
    path: "/etc/puppet/hieradata/hosts/regex.yaml"
  - backend: wmflib::expand_path
    path: "/etc/puppet/private/hieradata/%{::site}"
  - backend: wmflib::expand_path
    path: "/etc/puppet/hieradata/%{::site}"
  - backend: standard
    path: "/etc/puppet/hieradata/role/%{::site}/%{::_role}.yaml"
  - backend: standard
    path: "/etc/puppet/hieradata/role/common/%{::_role}.yaml"
  - backend: standard
    path: "/etc/puppet/private/hieradata/role/%{::site}/%{::_role}.yaml"
  - backend: standard
    path: "/etc/puppet/private/hieradata/role/common/%{::_role}.yaml"
  - backend: wmflib::expand_path
    path: "/etc/puppet/hieradata/common"
  - backend: standard
    path: "/etc/puppet/netbox/hieradata/hosts/%{::hostname}"
  - backend: wmflib::expand_path
    path: "/etc/puppet/private/hieradata/common"


And cloud VPS:

hierarchy:
  - backend: cloudlib::httpyaml
    uri: "http://puppet-enc.cloudinfra.wmcloud.org:8100/v1/%{::labsproject}/node/%{facts.fqdn}"
  - backend: standard
    path: "/etc/puppet/hieradata/cloud/%{::wmcs_deployment}/%{::labsproject}/hosts/%{::hostname}.yaml"
  - backend: standard
    path: "/etc/puppet/hieradata/cloud/%{::wmcs_deployment}/%{::labsproject}/common.yaml"
  - backend: standard
    path: "/etc/puppet/hieradata/cloud/%{::wmcs_deployment}.yaml"
  - backend: standard
    path: "/etc/puppet/hieradata/cloud.yaml"
  - backend: standard
    path: "/etc/puppet/secret/hieradata/hosts/%{::trusted.certname}.yaml"
  - backend: standard
    path: "/etc/puppet/secret/hieradata/%{::labsproject}.yaml"
  - backend: standard
    path: "/etc/puppet/private/hieradata/labs/%{::labsproject}/common.yaml"
  - backend: standard
    path: "/etc/puppet/private/hieradata/%{::labsproject}.yaml"
  - backend: standard
    path: "/etc/puppet/private/hieradata/labs.yaml"
  - backend: standard
    path: "/etc/puppet/hieradata/common.yaml"
  - backend: standard
    path: "/etc/puppet/secret/hieradata/common.yaml"
  - backend: standard
    path: "/etc/puppet/private/hieradata/common.yaml"

The lookup hierarchy

This hierarchy is organized as follows:

  • hieradata/hosts/${::hostname}.yaml. Here you should basically only include host-specific overrides. It is useful for things like testing new features on a single host of a cluster.
  • netbox/hosts/%{::hostname} This contains metadata populated by netbox.
  • hieradata/regex.yaml If one of the entries in this file matches the $::fqdn of the host, hiera will look up keys here. For the format, see below. This is useful if you need to set up a key for multiple hosts. Using this for cluster-wide configurations is deprecated; use the role keyword and the role backend instead.
  • hieradata/${::site}/$classpath.yaml If you need to configure something differently per-site (so, eqiad or codfw) globally it should go here. This is the place for global defaults and, going forward, it should only have keys for the profile namespace. $classpath is computed as in the puppet autoload mechanism, so foo::params::param would be searched inside hieradata/${::site}/foo/params.yaml as the full key (foo::params::param).
  • role/${::site}/$rolepath.yaml where $rolepath is computed like $classpath above, but based on the role declared in site.pp for the current node. Attaching more than one role to a node is very heavily discouraged and is not guaranteed to work.
  • private/hieradata/${::site}/$classpath.yaml where $classpath is evaluated as explained above. Since these data are in the private repository, this is the perfect place to store passwords (without the need to define private classes at all; this should allow us to wipe most of the labs/private repo out in the end). This is the site-specific version.
  • hieradata/common/$classpath.yaml where $classpath is again computed as explained above. This is useful for global configurations that may be overridden at higher levels.
  • private/hieradata/common/$classpath.yaml
  • role/common/$rolepath.yaml where $rolepath is computed like above.
  • private/hieradata/role/${::site}/$rolepath.yaml and private/hieradata/role/common/$rolepath.yaml, which are in the private repository and work exactly as the other uses of role.

Backends

Role-based lookup

Role-based lookups use the standard hiera 5 backend; however, we use a custom role function to populate the $::_role variable. As well as creating the $::_role variable, the function also includes the class role::$argument (this function can be used only at the node scope). When you use the following:

node 'foobar.eqiad.wmnet' {
    role(mediawiki::appserver)
}

you are doing the following:

  • including role::mediawiki::appserver
  • making hiera look up role/eqiad/mediawiki/appserver.yaml and role/common/mediawiki/appserver.yaml

wmflib::regex

Implemented using the wmflib::regex hiera backend. Since we have large clusters of almost-identical servers, instead of having to write out hiera data for each server we added a special file called regex.yaml where variables matching a whole cluster can be assigned; the format of the file is:

LABEL:
  __regex: !ruby/regexp PATTERN
  var1: value1
  var2: value2

where LABEL is a unique identifier of the cluster and PATTERN is a Ruby regular expression that will be matched against the $::fqdn puppet fact. Continuing with the preceding example, we have in regex.yaml:

appservers:
  __regex: !ruby/regexp /^mw1[0-2][0-9]{2}\.eqiad\.wmnet$/
  cluster: hhvm_appservers

to ensure the 'cluster' variable is defined consistently. As of now, the regex.yaml file should be used only rarely, and the role backend should be used instead.
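The matching described above can be sketched in Python. This is only an illustration under assumptions: the real backend uses Ruby regular expressions (Python's re is used here as a stand-in), REGEX_DATA is a hand-written equivalent of the regex.yaml example, and returning the first matching label is an assumption, since the behaviour when several labels match is not described here.

```python
import re

# Python stand-in for the regex.yaml example above: each label holds a
# compiled pattern under "__regex" plus the variables for that cluster.
REGEX_DATA = {
    "appservers": {
        "__regex": re.compile(r"^mw1[0-2][0-9]{2}\.eqiad\.wmnet$"),
        "cluster": "hhvm_appservers",
    },
}


def regex_lookup(key, fqdn, data=REGEX_DATA):
    """Return the value of `key` from the first label whose pattern
    matches `fqdn`, or None when no label matches or defines the key."""
    for entry in data.values():
        if entry["__regex"].search(fqdn) and key in entry:
            return entry[key]
    return None
```

For instance, regex_lookup("cluster", "mw1017.eqiad.wmnet") finds the appservers entry, while a codfw host falls through to the rest of the hierarchy.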

wmflib::expand_path

Implemented using the wmflib::expand_path hiera backend. In order to minimise the size of the common.yaml and "${::site}.yaml" files, and thus reduce the number of merge conflicts, we make use of the wmflib::expand_path backend. This is a simple backend which takes a hiera key and looks in a file relative to the hierarchy path matching the key prefix, i.e. $path/$classpath.yaml, where $classpath equals the hiera key prefix with :: replaced with /. For example, when searching for the hiera value profile::mysuper_cool_module::mysuper_cool_module::this_is_the_key, hiera will look for this key in:

 - ${::site}/profile/mysuper_cool_module/mysuper_cool_module.yaml
 - common/profile/mysuper_cool_module/mysuper_cool_module.yaml

It should also be noted that unqualified global values, i.e. those without a ::, will be looked up in common.yaml and ${::site}.yaml.
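The key-to-file mapping can be written down as a small Python sketch. The function name and its base argument are hypothetical and only mirror the rule described above; the real backend is a Puppet/Ruby hiera backend, not this code.

```python
def expand_path(base, key):
    """Map a hiera key to the file consulted under `base`.

    `base` is the hierarchy path (for example "common" or a site name).
    The key prefix, i.e. everything before the last "::", becomes the
    file path, with "::" replaced by "/".
    """
    prefix, sep, _name = key.rpartition("::")
    if not sep:
        # Unqualified key (no "::"): fall back to <base>.yaml.
        return f"{base}.yaml"
    return f"{base}/{prefix.replace('::', '/')}.yaml"
```

Under this rule the full key is still what is looked up inside the file; only the file location is derived from the prefix.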

Practical example

Say we are searching for the value of profile::admin::always_groups for the node mw1017.eqiad.wmnet, which is defined as follows in site.pp:

node /^mw1(01[7-9]|0[2-9][0-9]|10[0-9]|11[0-3])\.eqiad\.wmnet$/ {
    role (mediawiki::appserver)
}

This will search for the value in the following files:

- hieradata/hosts/mw1017.yaml                            # $::hostname
- hieradata/regex.yaml                                   # where the $::fqdn will be matched to regexes, see above
- hieradata/eqiad/profile/admin.yaml                     # $::site, profile::admin expanded with wmflib::expand_path
- private/hieradata/eqiad/profile/admin.yaml             # $::site, profile::admin expanded with wmflib::expand_path
- hieradata/role/eqiad/mediawiki/appserver.yaml          # $::_role expands to mediawiki/appserver
- hieradata/role/common/mediawiki/appserver.yaml         # $::_role expands to mediawiki/appserver
- private/hieradata/role/eqiad/mediawiki/appserver.yaml  # $::_role expands to mediawiki/appserver
- private/hieradata/role/common/mediawiki/appserver.yaml # $::_role expands to mediawiki/appserver
- hieradata/common/profile/admin.yaml                    # profile::admin expanded with wmflib::expand_path
- private/hieradata/common/profile/admin.yaml            # profile::admin expanded with wmflib::expand_path


If it doesn't find any value in any of those files, puppet will use the class default value for that variable.
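As a rough illustration, the search order from this example and the fallback to the class default can be modelled in Python. This sketch follows the file list exactly as given in this example (so it leaves out the netbox data), only handles keys containing ::, and uses hypothetical function names and in-memory dicts instead of real YAML files.

```python
def search_order(key, hostname, site, role):
    """List the files consulted for `key`, in the order shown above."""
    # Key prefix and role name, with "::" turned into "/".
    cpath = key.rpartition("::")[0].replace("::", "/")
    rpath = role.replace("::", "/")
    return [
        f"hieradata/hosts/{hostname}.yaml",
        "hieradata/regex.yaml",  # only applies if $::fqdn matches an entry
        f"hieradata/{site}/{cpath}.yaml",
        f"private/hieradata/{site}/{cpath}.yaml",
        f"hieradata/role/{site}/{rpath}.yaml",
        f"hieradata/role/common/{rpath}.yaml",
        f"private/hieradata/role/{site}/{rpath}.yaml",
        f"private/hieradata/role/common/{rpath}.yaml",
        f"hieradata/common/{cpath}.yaml",
        f"private/hieradata/common/{cpath}.yaml",
    ]


def resolve(key, order, files, class_default=None):
    """Return the first value found for `key`, else the class default.

    `files` maps a path to the (already parsed) contents of that file.
    """
    for path in order:
        if key in files.get(path, {}):
            return files[path][key]
    return class_default


ORDER = search_order(
    "profile::admin::always_groups", "mw1017", "eqiad", "mediawiki::appserver"
)
```

Passing ORDER to resolve together with a dict of file contents returns the value from the highest-priority file that defines the key, or class_default when none of them does.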

In Wikimedia Cloud Services (formerly known as Labs)

See main article Cloud VPS - Admin - Hiera.