DNS/Discovery
DNS Discovery is a simple dynamic service discovery mechanism that returns the closest active endpoint of a service running in multiple datacenters.
This solution is meant only for simple discovery entries; if more complex data needs to be dynamically driven, a Confd / etcd managed configuration is required.
Active/active services
If a service runs in active/active mode, it can be contacted in any datacenter. In this case the entry service-name.discovery.wmnet returns the IP of the endpoint in the same datacenter as the host performing the resolution, provided that endpoint is pooled.
For example, with both datacenters pooled, a host in eqiad that resolves service-name.discovery.wmnet gets the IP of service-name.svc.eqiad.wmnet, while a host in codfw gets the IP of service-name.svc.codfw.wmnet.
If the codfw entry is depooled, a host in codfw gets the IP of the endpoint in eqiad, provided that one is pooled.
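The selection rule described above can be modelled in a few lines of Python. This is only an illustrative sketch, not the code that the DNS servers run: the function name resolve_active_active, the DATACENTERS tuple and the pooled mapping are assumptions made for this example.

```python
from typing import Mapping, Optional

# Datacenter names as used in the examples above.
DATACENTERS = ("eqiad", "codfw")


def resolve_active_active(
    service: str, client_dc: str, pooled: Mapping[str, bool]
) -> Optional[str]:
    """Return the per-datacenter record a client in client_dc is pointed at.

    Hypothetical model: the local datacenter wins if it is pooled,
    otherwise the first other pooled datacenter is used.
    """
    candidates = [client_dc] + [dc for dc in DATACENTERS if dc != client_dc]
    for dc in candidates:
        if pooled.get(dc, False):
            return f"{service}.svc.{dc}.wmnet"
    # No datacenter is pooled: this case is covered under "Failure scenario".
    return None


if __name__ == "__main__":
    both = {"eqiad": True, "codfw": True}
    codfw_depooled = {"eqiad": True, "codfw": False}
    print(resolve_active_active("service-name", "eqiad", both))
    print(resolve_active_active("service-name", "codfw", both))
    print(resolve_active_active("service-name", "codfw", codfw_depooled))
```

With both datacenters pooled each client is sent to its local endpoint; with codfw depooled, the codfw client is sent to eqiad.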
Active/passive services
If a service runs in active/passive mode, it can be contacted only in the primary datacenter and not in the passive one. In this case the entry service-name.discovery.wmnet always returns the IP of the endpoint in the primary datacenter.
Read-only and read-write
If a service can handle reads in an active/active way but writes only in an active/passive way, two DNS Discovery records can be created, service-name-ro and service-name-rw, so that they can be treated as two different services, one active/active and the other active/passive.
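From a client's point of view, the split means choosing a hostname according to the kind of request. A minimal sketch, assuming the two records follow the same service-name.discovery.wmnet pattern shown earlier (the helper discovery_name is hypothetical):

```python
def discovery_name(service: str, write: bool) -> str:
    """Pick the discovery record for a request.

    Assumption for this example: the -ro and -rw records live under
    the same .discovery.wmnet suffix as the plain service record.
    """
    suffix = "rw" if write else "ro"
    return f"{service}-{suffix}.discovery.wmnet"


if __name__ == "__main__":
    # Reads can go to the closest pooled datacenter (active/active).
    print(discovery_name("service-name", write=False))
    # Writes always go to the primary datacenter (active/passive).
    print(discovery_name("service-name", write=True))
```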
Failure scenario
To handle the failure case in which no datacenter is pooled for a given service, a failoid service was created that always closes the connection on any TCP port. DNS Discovery can therefore use the failoid IPs as a fallback and is always able to return an IP, which avoids negative DNS caching and related problems. The failoid service is present in both the eqiad and codfw datacenters, and the IP of the local one is returned.
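The fallback can be pictured as the last step of the lookup. The following is a simplified model under stated assumptions, not the actual implementation: resolve_with_failoid and its return values are invented for this example, and no real failoid hostname or IP is implied.

```python
from typing import Mapping, Tuple

DATACENTERS = ("eqiad", "codfw")


def resolve_with_failoid(
    client_dc: str, pooled: Mapping[str, bool]
) -> Tuple[str, str]:
    """Return (target, datacenter), where target is 'service' or 'failoid'.

    Hypothetical model: try the local datacenter, then the others, and
    fall back to the failoid instance local to the client if none is pooled.
    """
    candidates = [client_dc] + [dc for dc in DATACENTERS if dc != client_dc]
    for dc in candidates:
        if pooled.get(dc, False):
            return ("service", dc)
    # Nothing is pooled: still return an answer, so the client gets a
    # quickly closed TCP connection instead of a negative DNS response.
    return ("failoid", client_dc)


if __name__ == "__main__":
    print(resolve_with_failoid("codfw", {"eqiad": False, "codfw": False}))
```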
How to manage a DNS Discovery service
TODO: Add more details for the Puppet configuration
The DNS configuration is managed in Puppet while the current pooled/depooled state and the TTL are stored in etcd and can be managed via Conftool, either from the CLI or using it as a library. For example:
- Get the current live state of the three main MediaWiki discovery entries:
$ confctl --quiet --object-type discovery select 'dnsdisc=(appservers|api|imagescaler)-rw' get
{"codfw": {"pooled": false, "references": [], "ttl": 300}, "tags": "dnsdisc=imagescaler-rw"}
{"eqiad": {"pooled": true, "references": [], "ttl": 300}, "tags": "dnsdisc=imagescaler-rw"}
{"eqiad": {"pooled": true, "references": [], "ttl": 300}, "tags": "dnsdisc=api-rw"}
{"codfw": {"pooled": false, "references": [], "ttl": 300}, "tags": "dnsdisc=api-rw"}
{"codfw": {"pooled": false, "references": [], "ttl": 300}, "tags": "dnsdisc=appservers-rw"}
{"eqiad": {"pooled": true, "references": [], "ttl": 300}, "tags": "dnsdisc=appservers-rw"}
- Get the current live state of the parsoid entry:
$ confctl --quiet --object-type discovery select 'dnsdisc=parsoid' get
{"eqiad": {"pooled": true, "references": [], "ttl": 300}, "tags": "dnsdisc=parsoid"}
{"codfw": {"pooled": true, "references": [], "ttl": 300}, "tags": "dnsdisc=parsoid"}
- Depool the imagescaler-ro entry in codfw:
$ confctl --object-type discovery select 'dnsdisc=imagescaler-ro,name=codfw' set/pooled=false
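The get commands above print one JSON object per line, so their output is easy to post-process. The sketch below turns it into a per-service mapping of datacenter to pooled state. It relies only on the output format shown in the examples above; parse_discovery_state is a hypothetical helper, not part of Conftool, and SAMPLE_OUTPUT is copied from those examples.

```python
import json
from typing import Dict

# Lines copied from the example outputs above.
SAMPLE_OUTPUT = """\
{"eqiad": {"pooled": true, "references": [], "ttl": 300}, "tags": "dnsdisc=parsoid"}
{"codfw": {"pooled": true, "references": [], "ttl": 300}, "tags": "dnsdisc=parsoid"}
{"codfw": {"pooled": false, "references": [], "ttl": 300}, "tags": "dnsdisc=api-rw"}
{"eqiad": {"pooled": true, "references": [], "ttl": 300}, "tags": "dnsdisc=api-rw"}
"""


def parse_discovery_state(output: str) -> Dict[str, Dict[str, bool]]:
    """Map each discovery entry to {datacenter: pooled}."""
    state: Dict[str, Dict[str, bool]] = {}
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # "tags" has the form "dnsdisc=<entry name>"; every other key
        # is a datacenter name.
        service = record.pop("tags").split("=", 1)[1]
        for dc, attrs in record.items():
            state.setdefault(service, {})[dc] = attrs["pooled"]
    return state


if __name__ == "__main__":
    for service, dcs in parse_discovery_state(SAMPLE_OUTPUT).items():
        print(service, dcs)
```

In practice the text would come from the confctl command itself, for example by piping its output into such a script.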