SRE/Infrastructure naming conventions
This page documents the naming conventions of servers, routers, data center sites, and other infrastructure relevant to Wikimedia Foundation clusters.
Our servers currently fall broadly into two categories:
- Clustered servers: These use numeral sequences with a descriptive prefix (see #Networking and #Servers). For example: db1001.
- Miscellaneous servers: These use unique hostnames (see #Miscellaneous servers). For example: helium.
Name reuse
Historically we have not reused the names of past servers for new servers. For example, after db1001 is decommissioned, no other server will be named db1001. Ganeti VMs sometimes reuse hostnames, but bare-metal hosts typically do not.
The notable exception is networking gear, which is deterministically named by rack. For example, the access switch in eqiad rack A8 is named asw-a8-eqiad. If it is replaced, the new switch takes the same name.
All hardware in the datacenter space is tracked in Netbox, which can be used to check existing hostnames for both bare-metal hardware and Ganeti instances.
Server clusters
Clusters are named as vendor initials (at time of lease signing) followed by the IATA code for a nearby major airport.
For example, our Dallas site is named codfw: the vendor is CyrusOne, and DFW is the nearby major airport. (Technically, Love Field is closer, but it is less well known.)
Cluster | Vendor | Airport Code |
---|---|---|
codfw | CyrusOne | DFW |
drmrs | Digital Realty | MRS |
eqdfw | Equinix | DFW |
eqiad | Equinix | IAD |
eqord | Equinix | ORD |
eqsin | Equinix | SIN |
esams | EvoSwitch | AMS |
knams | Kennisnet | AMS |
ulsfo | United Layer | SFO |
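The rule above amounts to a simple string composition: lowercase the vendor's initials and append the lowercase airport code. The snippet below is a minimal illustrative sketch; the per-vendor initials are inferred from the table rather than taken from any authoritative source.

```python
# Minimal sketch of the rule above: site code = lowercase(vendor initials + IATA code).
# The per-vendor initials are inferred from the table and are illustrative only.
SITES = {
    "codfw": ("CO", "DFW"),  # CyrusOne
    "drmrs": ("DR", "MRS"),  # Digital Realty
    "eqdfw": ("EQ", "DFW"),  # Equinix
    "eqiad": ("EQ", "IAD"),  # Equinix
    "eqord": ("EQ", "ORD"),  # Equinix
    "eqsin": ("EQ", "SIN"),  # Equinix
    "esams": ("ES", "AMS"),  # EvoSwitch
    "knams": ("KN", "AMS"),  # Kennisnet
    "ulsfo": ("UL", "SFO"),  # United Layer
}

def site_code(vendor_initials: str, iata: str) -> str:
    """Compose a site code from vendor initials and an airport code."""
    return (vendor_initials + iata).lower()

# Sanity-check the composition against the table above.
for name, (initials, airport) in SITES.items():
    assert site_code(initials, airport) == name
```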
Networking
Naming for network equipment is based on role and location.
This also applies to power distribution units, serial console servers, and other networking infrastructure.
Name prefix | Role | Example |
---|---|---|
asw | access switch | asw-a1-eqiad |
cr | core router | cr1-eqiad |
mr | management router | mr1-eqiad |
msw | management switch | msw1-eqiad & msw-b2-eqiad |
pfw | payments firewall | pfw1-eqiad |
ps1 / ps2 | power strips/distribution units | ps1-b3-eqiad |
scs | serial console server | scs-a8-eqiad |
fasw | Fundraising access switch | fasw-c-codfw |
cloudsw | Cloud L3 switches | cloudsw1-c8-eqiad |
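These role-plus-location names can be checked mechanically. The sketch below is a rough, assumption-laden illustration: the prefix list and rack-label format are inferred from the examples in the table above, not from an authoritative schema.

```python
import re

# Illustrative patterns inferred from the examples above; not an official schema.
# Rack-bound gear: <prefix>-<rack>-<site>, e.g. asw-a8-eqiad, ps1-b3-eqiad
RACK_BOUND = re.compile(r"^(asw|msw|scs|fasw|cloudsw\d*|ps[12])-[a-z]\d*-[a-z]+$")
# Site-level gear: <prefix><n>-<site>, e.g. cr1-eqiad, mr1-eqiad, pfw1-eqiad
SITE_LEVEL = re.compile(r"^(cr|mr|msw|pfw)\d+-[a-z]+$")

def looks_like_network_device(name: str) -> bool:
    """Return True if the name matches one of the (assumed) patterns above."""
    return bool(RACK_BOUND.match(name) or SITE_LEVEL.match(name))

assert looks_like_network_device("asw-a8-eqiad")
assert looks_like_network_device("cr1-eqiad")
assert looks_like_network_device("cloudsw1-c8-eqiad")
assert not looks_like_network_device("db1001")
```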
OpenStack deployments
Naming pattern: [datacenter site][numeric identifier], with an optional "dev" suffix for non-external, non-customer-facing deployments. The region name appends "-r" to the deployment name, and each availability zone appends a letter to the region name.
- Current Eqiad/Codfw deployments will not fully meet these standards until rebuilt: [eqiad0 (deployment), eqiad (region), nova (AZ)]
Deployment | Region | Availability Zone |
---|---|---|
eqiad0 | eqiad0-r | eqiad0-rb |
eqiad1 | eqiad1-r | eqiad1-rb |
codfw0dev | codfw0dev-r | codfw0dev-rb |
codfw1dev | codfw1dev-r | codfw1dev-rb |
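The scheme above reduces to simple suffixing. The sketch below is a minimal illustration of that pattern; the function names and structure are our own, not an official tool.

```python
# Sketch of the OpenStack naming scheme above; purely illustrative.
def deployment_name(site: str, number: int, dev: bool = False) -> str:
    """e.g. ('eqiad', 1) -> 'eqiad1'; ('codfw', 1, dev=True) -> 'codfw1dev'"""
    return f"{site}{number}{'dev' if dev else ''}"

def region_name(deployment: str) -> str:
    return f"{deployment}-r"          # e.g. 'eqiad1-r'

def availability_zone(region: str, az_letter: str) -> str:
    return f"{region}{az_letter}"     # e.g. 'eqiad1-rb'

assert availability_zone(region_name(deployment_name("eqiad", 1)), "b") == "eqiad1-rb"
assert availability_zone(region_name(deployment_name("codfw", 1, dev=True)), "b") == "codfw1dev-rb"
```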
Disks
- Arrays must use the "Storage array" device role in Netbox.
- Naming follows two conventions (both are illustrated in the sketch below):
  - Array attached to a single host:
    - hostname_of_host_system-arrayN
    - Example: ms2001-array1, ms2001-array2
    - All arrays get a number, even if there is only a single array.
    - Example: dataset1001-array1
  - Array attached to multiple hosts:
    - Labs uses this for labstore: each shelf connects to two different hosts, so the older single-host naming scheme fails.
    - servicehostgroup-arrayN-site
    - Example: labstore-array1-codfw, labstore-array2-codfw
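Both conventions are plain string templates. The sketch below is illustrative only; the host and service-group names are simply the examples from the list above.

```python
# Sketch of the two array-naming conventions above; illustrative only.
def single_host_array(host: str, n: int) -> str:
    """Array attached to one host: <host>-array<N>, e.g. ms2001-array1."""
    return f"{host}-array{n}"

def multi_host_array(service_group: str, n: int, site: str) -> str:
    """Array shared by several hosts: <group>-array<N>-<site>, e.g. labstore-array1-codfw."""
    return f"{service_group}-array{n}-{site}"

assert single_host_array("dataset1001", 1) == "dataset1001-array1"
assert multi_host_array("labstore", 2, "codfw") == "labstore-array2-codfw"
```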
Servers
Any system that runs in a dedicated service cluster with other machines is named after its role or service task. As a rule, we try to name servers after the service, not just the software package. Servers within a group are numbered based on the datacenter they are located in.
Datacenter | Numeral range | Example |
---|---|---|
pmtpa / sdtpa (decommissioned) | 1-999 | cp7 |
eqiad | 1000-1999 | db1001 |
codfw | 2000-2999 | mw2187 |
esams / knams | 3000-3999 | cp3031 |
ulsfo | 4000-4999 | bast4001 |
eqsin | 5000-5999 | dns5001 |
drmrs | 6000-6999 | cp6011 |
When adding a new datacenter, make sure to update the /typos file in operations/puppet.git, which checks hostnames.
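The number embedded in a clustered hostname therefore identifies its datacenter. The sketch below illustrates that lookup; the ranges are copied from the table above, while the hostname parsing is an assumption made purely for illustration.

```python
import re

# Numeral ranges per datacenter, from the table above.
RANGES = [
    (1, 999, "pmtpa/sdtpa (decommissioned)"),
    (1000, 1999, "eqiad"),
    (2000, 2999, "codfw"),
    (3000, 3999, "esams/knams"),
    (4000, 4999, "ulsfo"),
    (5000, 5999, "eqsin"),
    (6000, 6999, "drmrs"),
]

def datacenter_of(hostname: str) -> str:
    """Guess the datacenter of a clustered hostname such as 'db1001' or 'cp6011'."""
    m = re.fullmatch(r"([a-z-]+?)(\d+)", hostname)
    if not m:
        raise ValueError(f"not a clustered hostname: {hostname}")
    number = int(m.group(2))
    for low, high, site in RANGES:
        if low <= number <= high:
            return site
    raise ValueError(f"no datacenter range covers {number}")

assert datacenter_of("db1001") == "eqiad"
assert datacenter_of("mw2187") == "codfw"
assert datacenter_of("cp6011") == "drmrs"
```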
Name prefix | Description | Status | Points of contact |
---|---|---|---|
acmechief | ACME certificate manager | In use | Traffic |
acmechief-test | ACME certificate manager staging environment | In use | Traffic |
alert | Alerting host (Icinga / Alertmanager) | In use | Observability |
amssq | esams caching server | No longer used (deprecated) | |
amslvs | esams LVS | No longer used (deprecated) | |
analytics | analytics nodes (Hadoop, Hive, Impala, and various other things) | Being replaced by an-worker | Data Engineering SREs |
analytics-master | analytics master nodes | Being replaced by an-master | Data Engineering SREs |
analytics-tool | virtual machines in production (Ganeti) running analytics tools/websites | Being replaced by an-tool | Data Engineering SREs |
an-coord | analytics coordination node | In use | Data Engineering SREs |
an-master | analytics master node | In use, replacing analytics-master | Data Engineering SREs |
an-tool | analytics tools node | In use | Data Engineering SREs |
an-test-(coord/master/worker) | analytics hadoop test cluster nodes | In use | Data Engineering SREs |
an-worker | analytics worker node | In use, replacing analyticsNNNN | Data Engineering SREs |
an-scheduler | analytics job scheduler node | In use | Data Engineering SREs |
an-airflow | analytics job scheduler node dedicated to the Discovery team | In use | Data Engineering SREs |
aphlict | notification server for Phabricator | In use | Service Operations |
apt | Advanced Package Tool Repository (Debian APT repo) | In use | Infrastructure Foundations |
aqs | Analytics Query Service | In use | Data Engineering SREs |
archiva | Archiva Artifact Repository | In use | Data Engineering SREs |
auth | Authentication server | In use | Infrastructure Foundations |
authdns | Authoritative DNS (gdnsd) | In use | Traffic
backup | Backup hosts | In use | Data Persistence |
bast | bastion host | In use | Infrastructure Foundations |
censorship | Censorship monitoring databases and scripts | No longer used (deprecated) | Traffic |
centrallog | Centralized syslog | In use | Observability |
certcentral | Central certificates service | No longer used (deprecated) | |
chartmuseum | Helm Chart repository ChartMuseum | In use | Service Operations |
cloud*-dev | Any cloud role + '-dev' = internal deployment (PoC, Staging, etc) | In use | WMCS |
cloudcephmon | Ceph monitor and manager daemon for WMCS | In use | WMCS |
cloudcephosd | Ceph object storage data nodes for WMCS | In use | WMCS |
cloudceph | Converged Ceph object storage and monitor nodes for WMCS (only used for testing) | No longer used | WMCS |
cloudcontrol | OpenStack deployment controller for WMCS | In use | WMCS |
clouddb | Wiki replica servers for WMCS | In use | WMCS, with support from DBAs |
cloudelastic | Replication of ElasticSearch for WMCS | In use | WMCS |
cloudgw | Cloud gateway server for WMCS | In use | WMCS |
cloudmetrics | Monitoring server for WMCS | In use | WMCS |
cloudnet | Network gateway for tenants of WMCS (Neutron l3) | In use | WMCS |
cloudservices | Misc OpenStack components (Designate) for WMCS | In use | WMCS |
cloudvirt | OpenStack Hypervisor (libvirtd + KVM) for WMCS | In use | WMCS |
cloudvirtan | OpenStack Hypervisor (libvirtd + KVM) for WMCS (dedicated to Analytics) | No longer used | WMCS |
cloudstore | Storage system for WMCS | In use | WMCS |
cloudbackup | Backup storage system for WMCS | In use | WMCS |
conf | Configuration system host (etcd, zookeeper...) | In use | Service Operations |
contint | Continuous Integration | In use | Service Operations |
cp | Cache proxy (Varnish) | In use | Traffic |
cumin | Cluster management (cumin/spicerack/debdeploy/etc...) | In use | Infrastructure Foundations |
datahubsearch | DataHub OpenSearch Cluster - used for Data Catalog MVP | In use | Data Engineering SREs |
dataset | dataset dumps storage | No longer used (deprecated) | Service Operations & Platform Engineering |
db | Database host | In use | Data Persistence |
dbmonitor | Database monitoring | In use | Data Persistence |
dborch | Database orchestration (MySQL Orchestrator) | In use | Data Persistence |
dbprov | Database backup generation and data provisioning | In use | Data Persistence |
dbproxy | Database proxy | In use | Data Persistence |
dbstore | Database analytics | In use | Data Engineering SREs & Data Persistence |
debmonitor | Debian packages monitoring | In use | Infrastructure Foundations |
deploy | Deployment hosts | In use | Service Operations |
dns | DNS recursors | In use | Infrastructure Foundations |
doc | Documentation server (CI) | In use | Service Operations (Supportive Services) & Release Engineering |
doh | Wikidough Anycasted | In use | Traffic |
an-druid | Druid Cluster (Analytics). Due to naming legacy, druid100[1-3] are also in this cluster. | In use | Data Engineering SREs |
druid | Druid Cluster (Public) | In use | Data Engineering SREs |
dumpsdata | dataset generation fileset serving to snapshot hosts | In use | Service Operations & Platform Engineering |
durum | Check service for Wikidough | | Traffic
elastic | elasticsearch servers | In use | Search Platform SREs |
es | Database host for MediaWiki external storage (wiki content, compressed) | In use | Data Persistence |
etcd | Etcd server | In use | Service Operations |
etherpad | Etherpad server | In use | Service Operations |
eventlog | EventLogging host | In use | Data Engineering SREs |
flowspec | Network controller | In use (testing) | Infrastructure Foundations |
fr* | Fundraising servers, e.g. frdb, frlog, frpm (puppetmaster) | In use | fr-tech SREs |
ganeti | Ganeti Virtualization Cluster | In use | Infrastructure Foundations |
ganeti-test | Ganeti Virtualization Cluster (test setup) | in use | Infrastructure Foundations |
gerrit | Gerrit code review (gerrit1001 in eqiad is currently used) | In use (deprecated) | Service Operations (Supportive Services) & Release Engineering |
gitlab | GitLab server | In use (phab:T274459) | |
grafana | Grafana server | In use | Observability |
graphite | Graphite server | In use | Observability |
icinga | Icinga servers | In use | Observability |
idp | Identity provider (Apereo CAS) | In use | Infrastructure Foundations |
install | Installation server | In use (rare) | Infrastructure Foundations |
kafka | Kafka brokers | In use | Data Engineering SREs & Infrastructure Foundations |
kafka-jumbo | Large general purpose Kafka cluster | In use | Data Engineering SREs & Infrastructure Foundations |
kafkamon | Kafka monitoring (VMs) | In use | Data Engineering SREs & Infrastructure Foundations |
karapace | DataHub Schema Registry server (standalone) - Used for the Data Catalog MVP | ||
knsq | knams squid | No longer used (deprecated) | |
krb | Kerberos KDC/Kadmin | In use | Infrastructure Foundations & Data Engineering SREs |
kubernetes | Kubernetes cluster (k8s) | In use | Service Operations |
kubestage | Kubernetes staging cluster | In use | Service Operations |
kubestagetcd | Etcd cluster for the Kubernetes staging cluster | In use | Service Operations |
kubetcd | Etcd cluster for the Kubernetes cluster | In use | Service Operations |
lab | labs virtual node | No longer used (deprecated) | WMCS |
labcontrol | Controller node for WMCS (aka "labs") | No longer used (deprecated) | WMCS |
labnet | Networking host for WMCS | No longer used (deprecated) | WMCS |
labnodepool | Dedicated WMCS host for Nodepool (CI) | No longer used (deprecated) | WMCS |
labpuppetmaster | Puppetmasters for WMCS | No longer used (deprecated) | WMCS |
labsdb | Replication of production databases for WMCS | No longer used (deprecated) | WMCS with support from DBAs |
labservices | Services for WMCS | No longer used (deprecated) | WMCS |
labstore | Disk storage for WMCS | In use (deprecated) | WMCS |
labtest* | Test hosts for WMCS | No longer used (deprecated) | WMCS |
labvirt | Virtualization node for WMCS | No longer used (deprecated) | WMCS |
labweb | Management websites for WMCS | In use (deprecated) | WMCS |
lists | Mailing lists running Mailman | In use | Service Operations and Ladsgroup |
logstash | elasticsearch/logstash/kibana node | In use | Observability |
lvs | lvs load balancer | In use | Traffic |
maps | Maps cluster | In use | |
maps-test | maps test cluster | No longer used (deprecated) | |
mc | memcached server | In use | Service Operations |
mc-gp | memcached gutter pool server | In use | Service Operations |
ml-staging | Machine learning staging environment etcd and control plane machines | In use | ML team
ml-serve | Machine learning serving cluster (ml-serve-ctrl* are VMs for k8s control plane) | In use | ML team |
ml-cache | Machine learning caching nodes | In use | ML team
mirror | public mirror, e.g. Debian mirror, Ubuntu mirror | In use | Infrastructure Foundations |
miscweb | miscellaneous web server | planned; to replace krypton | Service Operations |
ms | media storage | No longer used (deprecated) | Data Persistence (Media Storage) |
ms-be | media storage backend | In use | Data Persistence (Media Storage) |
ms-fe | media storage frontend | In use | Data Persistence (Media Storage) |
mw | MediaWiki application server (MediaWiki PHP webservers, api, jobrunners, videoscalers) | In use | Service Operations |
mwdebug | MediaWiki application server for debugging and deployment staging (Ganeti VMs) | In use | Service Operations |
mwlog | MediaWiki logging host | In use | Service Operations |
mwmaint | MediaWiki maintenance host (formerly "terbium") | In use | Service Operations |
mx | Mail relays | In use | Infrastructure Foundations |
nas | NAS boxes (NetApp) | Unused | |
netflow | Network visibility | In use | Infrastructure Foundations |
netmon | Network monitor (smokeping, torrus, librenms, rancid, netbox) | In use | Infrastructure Foundations |
netbox | Netbox front-end instances | In use | Infrastructure Foundations |
netbox-dev | Netbox test instances | In use | Infrastructure Foundations |
netboxdb | Netbox back-end database instances | In use | Infrastructure Foundations |
notebook | Jupyterhub experimental server | Unused | |
nfs | NFS server | Unused | |
peek | Security Team workflow and project management tooling | In use | Security Team |
ocg | offline content generator (PDF) | No longer used (deprecated) | |
ores | ORES cluster | In use | Machine Learning SREs |
orespoolcounter | ORES PoolCounter | In use | Machine Learning SREs |
oresrdb | ORES Redis systems | In use | Machine Learning SREs |
pc | Parser cache database | In use | SRE Data Persistence (DBAs), with support from Platform and Performance |
PDF Collections | | No longer used (deprecated) | |
people | peopleweb (people.wikimedia.org) | In use | Service Operations & Infrastructure Foundations |
parse | parsoid | Soon in use | Service Operations |
phab | Phabricator host (currently iridium is eqiad phab host) | In use | Service Operations |
ping | Ping offload server | In use | Infrastructure Foundations |
planet | Planet server | In use (mistake) | Service Operations |
pki | PKI Server (CFSSL) | In use | Infrastructure Foundations |
pki-root | PKI Root CA Server (CFSSL) | In use | Infrastructure Foundations |
poolcounter | PoolCounter cluster | In use | Service Operations |
prometheus | Prometheus cluster | In use | Observability |
proton | Proton cluster | In use | Service Operations |
puppetboard | PuppetDB Web UI | In use | Service Operations |
puppetdb | PuppetDB cluster | In use | Service Operations |
puppetmaster | Puppet masters | In use | Infrastructure Foundations |
pybal-test | PyBal testing and development | In use | Traffic |
rbf | Redis Bloom Filter server | Unused | Service Operations |
rcs | RCStream server (recent changes stream) | No longer used (deprecated) |
rdb | Redis server | In use | Service Operations |
registry | Docker registries | In use | Service Operations |
releases | Software Releases | In use | Service Operations |
relforge | Discovery's Relevance Forge (see discovery/relevanceForge.git, T131184) | In use | Search Platform SREs |
restbase | RESTBase server | In use | Service Operations |
rpki | RPKI#Validation | In use | Infrastructure Foundations |
sca | Service Cluster A - Includes various services | In use | Service Operations |
scb | Service Cluster B - Includes various services. It's effectively the next generation of the sca cluster above | In use | Service Operations |
schema | Event Schemas HTTP server | In use | Data Engineering SREs & Service Operations |
search-loader | Analytics to Elastic Search model data loader | In use | Search Platform SREs |
sessionstore | Session storage service | | Service Operations
snapshot | Data dump processing node | In use | Service Operations & Platform Engineering |
sq | squid server | No longer used (deprecated) | |
srv | apache server | No longer used (deprecated) | |
stat | statistics computation hosts (see Analytics/Data access) | In use | Data Engineering SREs |
storage | storage host | No longer used (deprecated) | |
testreduce | parsoid visual diff testing | In use | Service Operations |
thanos-be | Prometheus long term storage backend | In use | Observability |
thanos-fe | Prometheus long term storage frontend | In use | Observability |
thumbor | Thumbor | In use | Service Operations (& Performance) |
tmh | MediaWiki videoscaler (TimedMediaHandler). See T105009 and T115950. | No longer used (deprecated) | |
torrelay | Tor relay | No longer used (deprecated) | |
urldownloader | url-downloader | In use (added in T224551) | Service Operations |
virt | labs virtualization nodes | No longer used (deprecated) | |
wdqs | wikidata query service | In use | Search Platform SREs |
webperf | webperf metrics (performance team). See T179036. | In use | Performance & Service Operations |
wtp | wiki-text processor node (parsoid) | In use | Service Operations |
xhgui | A graphical interface for PHP debug profiles built on MongoDB. See Performance/Runbook/XHGui service. | In use | Performance & Service Operations |
dragonfly-supernode | Supernode for Dragonfly P2P network (distributing docker images) (T286054) | In use | Service Operations |
Miscellaneous servers
Any one-off or single-service host. This covers nearly all non-MediaWiki software on the cluster that is not load-balanced across multiple machines, as well as general task machines that can be clustered (to an extent) but require opsen work to do so. These hosts are named based on location, since they tend to do more than one kind of thing or provide more than one particular service or site type. The use of these names is deprecated in favour of the specialized cluster names above, when possible.
Datacenter Site | Convention | Example | Notes |
---|---|---|---|
codfw | Star Names | acamar | Only use modern proper star names that are a single word long and contain no odd characters. The Orion constellation is reserved for fundraising (Alnilam, Alnitak, Bellatrix, ...). |
eqiad | Elements | helium | Next atomic # assignment (incremental by atomic #): 113 |
esams / knams | Notable Dutch people | vandale |