DNS/Netbox: Difference between revisions
imported>Volans (Migrated eqiad primary records) |
imported>Elukey |
Revision as of 18:16, 28 October 2020
IP allocation is moving to Netbox, which will be our IPAM, and DNS records will be generated automatically from Netbox data.
Infrastructure
- IP allocation is done on Netbox.
- Netbox data is exported via Netbox#DNS.
- Netbox data is checked out on the authoritative DNS servers in /srv/git/netbox_dns_snippets.
- When compiling the final gdnsd zones, the Netbox data is copied into /etc/gdnsd/zones/netbox for later inclusion.
- In the actual zonefiles, within an $ORIGIN, the related snippet file is included using the $INCLUDE directive.
IP Allocation
The migration to the automated system requires that we move the allocation of IPs to Netbox, which will gradually become the authoritative source of truth for IPAM.
Cutoff dates
- [Wednesday June 24th 2020 10:00am UTC] All management IP address allocation is performed in Netbox from now on, either via the Add interfaces and IPs to devices Netbox script when provisioning new devices, or manually via the Add an IP Address button in the IP Addresses tab of any IP Prefix, which assigns the first available IP in that subnet. The Offline a device with extra actions Netbox script instead takes care of removing interfaces and IPs when a device is set offline.
- [Monday September 14th 2020 11:00am UTC] All IP address allocation except frack (Fundraising-tech) will be performed in Netbox from now on.
DNS records involved
- Management forward (A) and reverse (PTR) records for both the hostname (foo.mgmt.eqiad.wmnet) and the asset tag (wmf1234.mgmt.eqiad.wmnet)
- Primary IPv4 (A) and IPv6 (AAAA) and related reverse (PTR) records for the hostname (foo.eqiad.wmnet or foo.wikimedia.org)
Active
Management
- ulsfo
- eqsin
- esams
- frack in codfw
- frack in eqiad
- codfw
- eqiad
Primary IPs
- ulsfo
- eqsin
- esams
- eqiad
To be migrated
Management
- NONE, all migrated
Primary IPs
- codfw
Operations
Update generated records
To update the dynamically generated records based on the current Netbox data and deploy them to all the authoritative DNS servers, the sre.dns.netbox cookbook must be run. The cookbook must be run any time records are changed in Netbox. See also Cookbooks#Cookbook_Operations. For example:
sudo cookbook sre.dns.netbox -t T12345 "Add newly racked cp hosts in eqiad"
An Icinga check alerts if changes in Netbox are not committed after a while; see Monitoring/Netbox_DNS_uncommitted_changes for troubleshooting.
If, when running the cookbook, the presented diff shows changes unrelated to your work, follow the instructions in Monitoring/Netbox_DNS_uncommitted_changes#What_to_do.
Force update generated records
It might happen that one or more authdns hosts fail to run authdns-update, leading to an inconsistent state. The sre.dns.netbox cookbook offers a --force option that takes as input the SHA of the git commit that all authdns servers should be synced to. To find the SHA, run:
ssh netbox.wikimedia.org
sudo -i
cd /srv/netbox-exports/dns.git
git log -1
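Once the SHA is known, it is passed to the cookbook. The sketch below only validates the SHA and prints the resulting command instead of running it. The function name build_force_cmd, the requirement of a full 40-character SHA, and the position of --force before the message are assumptions for illustration, not confirmed behaviour of the cookbook.

```shell
# Hypothetical helper: check that the argument looks like a full git SHA-1
# and print (not run) the matching cookbook invocation.
build_force_cmd() {
    sha="$1"
    message="$2"
    case "$sha" in
        "" | *[!0-9a-f]*)
            echo "not a lowercase hex SHA: '$sha'" >&2
            return 1
            ;;
    esac
    if [ "${#sha}" -ne 40 ]; then
        echo "expected the full 40-character SHA printed by git log -1" >&2
        return 1
    fi
    # Assumption: --force takes the SHA as its value, followed by the usual message.
    printf 'sudo cookbook sre.dns.netbox --force %s "%s"\n' "$sha" "$message"
}
```

On the Netbox host the SHA alone can be obtained with git log -1 --format=%H in the directory shown above, and the printed command is then run from wherever cookbooks are normally executed.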
Convert a hardcoded $ORIGIN to Netbox
This is an example patch to convert a hardcoded $ORIGIN to the dynamically generated data.
Atomically deploy auto-generated records and a manual change
In case a change in the generated Netbox data requires a simultaneous change in the manual operations/dns repository, this is the procedure to follow:
- Prepare the patch for the operations/dns repository and send it for review.
  - CI will fail if there is any $INCLUDE of files not yet existing in the generated data; that's expected.
- Modify the data in Netbox.
- Run the sre.dns.netbox cookbook as described above in DNS/Netbox#Update_generated_records, adding the option --skip-authdns-update.
- Comment recheck in the CR sent for the operations/dns repository; CI should now pass.
- Merge and deploy the patch. Once deployed, it will also include the generated data that was pushed but not deployed by the cookbook, making the change atomic from the DNS point of view.
The above procedure should be run in one go, without letting much time pass between steps. It is wise to ask in the various SRE channels that, during this operation, nobody runs authdns-update, the sre.dns.netbox cookbook itself, or any of the cookbooks that in turn run it (as of Oct. 2020, sre.hosts.decommission and sre.ganeti.makevm).
As an example, the above procedure was used when a new prefix was created and, as a result, the generated data got moved from one file to another; see operations/dns/+/632953.
Modify the generated data in an emergency
To modify the generated data in an emergency, run the sre.dns.netbox cookbook as described above in DNS/Netbox#Update_generated_records, adding the option --emergency-manual-edit.
After generating the data, the cookbook stops and prompts the user to make the modifications, with output similar to:
Generated temporary files are available on netbox1001.wikimedia.org:/tmp/dns-c25pcHBldHM-iad8k5x_
SSH there, as root modify any file, git stage them and run "git commit --amend" to commit them
Then run "git log --pretty=oneline -1" and copy the new SHA1 of HEAD
N.B.: any subsequent run of the cookbook will try to revert the manual changes, so make all SREs aware of the emergency situation.
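The manual steps requested by the prompt could look like the sketch below, run as root against the temporary directory printed by the cookbook. The function name amend_generated_data is hypothetical, and the use of git add --update (stage only files git already tracks) and --no-edit (keep the generated commit message) are assumptions that may need adapting to the actual emergency.

```shell
# Hypothetical helper: stage manual edits, fold them into the generated
# commit, and print the new SHA1 of HEAD for the cookbook prompt.
amend_generated_data() {
    repo="$1"
    # Stage modifications and deletions of files git already tracks.
    git -C "$repo" add --update || return 1
    # Rewrite the last commit in place, keeping its message.
    git -C "$repo" commit --quiet --amend --no-edit || return 1
    # Same command as in the prompt; the SHA is the first field.
    git -C "$repo" log --pretty=oneline -1 | cut -d' ' -f1
}
```

It would be called with the path from the prompt as its only argument, after the files there have been edited by hand.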
Transition FAQ
Am I affected?
Whether your workflows are affected by this change depends entirely on your interaction with the operations/dns repository:
- I never read or contribute to this repository:
- you're not affected and nothing will change for you. You can stop reading here.
- I sometimes read or search for things in this repository:
- you're marginally affected, as the manual records will gradually disappear from the operations/dns repository and be replaced by the auto-generated files. You can search directly in Netbox. If you want to see the content of the generated files directly, you can clone the auto-generated repository and read or search in it, following the instructions in Netbox#DNS. You can optionally read the rest of the document.
- I contribute to the repository:
- you're affected and should keep reading this FAQ section and the rest of the document.
What is changing
- IP allocation that is currently done manually as part of the DNS record definition in the DNS repository zone files is moving to Netbox, which will be our IPAM tool. This transition will be done all at once to ensure consistency. Only Fundraising-tech (frack) non-mgmt records will be left out of this transition.
- The cutoff date for all remaining IP allocation except frack to be moved to Netbox is Monday September 14th around 11:00am UTC.
- All existing IPs except for frack ones will be automatically imported into Netbox prior to the cutoff time (a sneak peek can be found in netbox-next.wikimedia.org).
- The changes in the Server Lifecycle procedure are outlined in the Server_Lifecycle/DNS_Transition page and DCOps is up to speed with the process.
- After that date all IPs except frack non-mgmt ones must be allocated in Netbox prior to assigning them a DNS record in the DNS repository.
- Each new host's primary IPv4/IPv6 addresses will be automatically assigned at provision time.
- Additional IPs will require manual allocation in Netbox [see below]
- Right after the cutoff time, all newly allocated IPs will still need a manual patch in the operations/dns repository until their zone has been migrated [see below].
- The automatic DNS record generation (see above DNS/Netbox#Update_generated_records) generates all of the records present in Netbox, but they will be included in the DNS repository, and hence in production, on a per-$ORIGIN basis, which will not be rolled out simultaneously:
  - If a given $ORIGIN has been migrated to the automated zone file, updating Netbox and running the cookbook will change the DNS records.
  - If a given $ORIGIN has not yet been migrated to the automated zone file, a manual change to the DNS repository which adds the record in question is still needed after the Netbox allocation.
  - To check if an $ORIGIN has been migrated, just look for a $INCLUDE netbox/zone_name line right below the $ORIGIN line.
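The check for a migrated $ORIGIN can also be scripted. The following is a minimal sketch, assuming the $INCLUDE directive is the first non-blank, non-comment line after its $ORIGIN; the function name origin_status and its output format are made up for illustration.

```shell
# Hypothetical helper: for every $ORIGIN in a zonefile, report whether the
# next non-blank, non-comment line is a "$INCLUDE netbox/..." directive.
origin_status() {
    awk '
        # Skip blank lines and ";" comment lines.
        NF == 0 || $1 ~ /^;/ { next }
        # The first meaningful line after an $ORIGIN decides its status.
        pending != "" {
            if ($1 == "$INCLUDE" && $2 ~ /^netbox\//) {
                print pending, "migrated"
            } else {
                print pending, "manual"
            }
            pending = ""
        }
        $1 == "$ORIGIN" { pending = $2 }
        END { if (pending != "") print pending, "manual" }
    ' "$1"
}
```

Called with the path to a zonefile from a checkout of the operations/dns repository, it prints one line per $ORIGIN followed by either migrated or manual.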
Who can I ping for questions?
For questions, concerns or comments please get in touch with Cas or Riccardo. If unable to find either of us get in touch with the SRE Infrastructure Foundations team.
What to do if the diff has spurious changes?
Follow the instructions in Monitoring/Netbox_DNS_uncommitted_changes#What_to_do.
How to allocate primary IPs for a server
Physical hosts
The management, primary IPv4 and primary IPv6 for any new physical host will be automatically assigned at provision time by DCOps running a Netbox script, see also Server_Lifecycle/DNS_Transition#Provisioning_2.
Virtual machines
For Ganeti virtual machines the sre.ganeti.makevm cookbook has been updated to take care of the new workflow automatically. During the transition phase, if needed, it will prompt the user to create a manual DNS patch with the newly pre-allocated IPs.
How to manually allocate a special purpose IP address in Netbox
This procedure is meant to be used only to create IPs in Netbox that are not attached to any device's interface because they have special purposes, such as virtual IP addresses (VIPs, which are generally used for service addresses). Depending on real-life use cases, the following procedure might be automated into a Netbox script in the near future.
- Go to the VLANs page in Netbox VLANs and Netbox Prefixes
- Search for the correct VLAN based on datacenter, type, row (if applicable), etc.
- Click on the desired prefix (v4 or v6) in the Prefixes column for that VLAN
- Click on the IP Addresses tab in the prefix page
- Click on the Add an IP Address green button on the top-right, Netbox will automatically select the first available IP in that subnet
- Make sure to change the netmask to /32 in the address field.
- To create an IPv6 that is a mapped version of an existing IPv4, modify the Address field at the top to override the automatically selected address.
- If this is a VIP, make sure you get the same last octet in both eqiad and codfw datacentres
- Select the relevant Role (VIP for LVS, anycast, etc.)
- Set the DNS Name field with the FQDN to assign to this IP
- Select the Tenant if applicable (FR-Tech, RIPE)
- Click on the Create blue button at the bottom
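For the IPv6 step in the procedure above, one way to derive the address is sketched below. It assumes that "mapped" means reusing the four IPv4 octets verbatim as the last four groups of the IPv6 address inside the VLAN's /64. This page does not spell out the convention, so compare with existing records in the same prefix first. The function name mapped_ipv6 and the 2001:db8 documentation prefix are placeholders.

```shell
# Hypothetical helper: append the IPv4 octets, written as-is, to the first
# four groups of an IPv6 /64 prefix.
mapped_ipv6() {
    prefix64="$1"   # first four groups of the /64, e.g. 2001:db8:0:1
    ipv4="$2"       # dotted-quad IPv4 address, e.g. 10.64.0.12
    printf '%s:%s\n' "$prefix64" "$(printf '%s' "$ipv4" | tr '.' ':')"
}

mapped_ipv6 2001:db8:0:1 10.64.0.12   # prints 2001:db8:0:1:10:64:0:12
```

The result is what would be typed into the Address field to override the automatically selected IPv6 address.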
When will the zones be migrated to the new auto-generated files?
Right after the cutoff date the migration will start on a per-$ORIGIN basis. We plan to migrate all related zones within a month after the cutoff date.