You are browsing a read-only backup copy of Wikitech. The live site can be found at wikitech.wikimedia.org

Difference between revisions of "Portal:Cloud VPS/Infrastructure"

From Wikitech-static
imported>BryanDavis
(Update some things for rebranding)
 
imported>Arturo Borrero Gonzalez
(→‎DNS: add designate link)
 
(5 intermediate revisions by 5 users not shown)
Line 1: Line 1:
{{outdated}}
{{Cloud VPS nav}}


'''Cloud VPS''' is a virtualization cloud that uses [http://www.openstack.org/software/openstack-compute OpenStack Compute].  Base images are managed with [http://docs.openstack.org/developer/glance/ Glance] and authentication uses LDAP-backed [http://docs.openstack.org/developer/keystone/ Keystone].
'''Cloud VPS''' is a virtualization cloud that uses [http://www.openstack.org/software/openstack-compute OpenStack Compute].  Base images are managed with [http://docs.openstack.org/developer/glance/ Glance], and authentication uses LDAP-backed [http://docs.openstack.org/developer/keystone/ Keystone].


Cloud VPS currently runs in a single datacenter in Ashburn, Virginia.  In the future it will span two or more datacenters, with a slightly different configuration in each.
Cloud VPS currently runs in a single data center in Ashburn, Virginia.  In the future, it will span two or more data centers, with a slightly different configuration in each.


For troubleshooting immediate labs issues, visit [[Portal:Cloud_VPS/Admin/Troubleshooting]].
For troubleshooting immediate issues, visit [[Portal:Cloud_VPS/Admin/Troubleshooting]].


[[File:Labs architecture.pdf|thumb|Slides from a brief presentation about Wikimedia Labs architecture]]
[[File:OpenStack_at_WMCS.pdf|thumb|Slides from a brief presentation about WMCS OpenStack architecture]]


== Cloud VPS Eqiad (Ashburn, VA) ==
== Cloud VPS Eqiad (Ashburn, VA) ==


[[File:Labs_cluster.png|thumb|The servers that make Wikimedia Labs]]
=== Regions ===


=== Wikitech/OpenStackManager ===
We have one region: <code>eqiad1-r</code> (also referred to as <code>eqiad1</code>).


In addition to hosting the WMF's technical documentation, the [[wikitech]] web site also runs the OpenStackManager MediaWiki extension that provides a graphical interface for labs.  Wikitech runs on a server internally named Silver.
The <code>eqiad</code> name is based on the [[Infrastructure_naming_conventions#Server_clusters|naming convention]] for clusters.


An alternative, partially-functional labs GUI can be accessed at https://horizon.wikimedia.org/.  It runs the openstack-dashboard project and provides nova API access to project admins.
=== Horizon ===


=== Controller ===
Most users will manage their virtual servers using [[Horizon]].  Horizon is an upstream OpenStack web interface for the OpenStack APIs.  Our Horizon site also includes several custom dashboards to access special WMCS features not available in stock Horizon.


The labs controller box (currently named 'labcontrol1001') runs the Glance and Keystone services, as well as a few nova services (conductor and scheduler.)  Labcontrol1001 also runs a public DNS server (aka labs-ns0) which provides name services for the .wmflabs.org domain.
Horizon is hosted on labweb1001.wikimedia.org and labweb1002.wikimedia.org and can be accessed at https://horizon.wikimedia.org.


A second server, labcontrol1002, serves as a hot spare for labcontrol1001.
Individual user accounts on WMCS can also be created via Striker which is at https://toolsadmin.wikimedia.org.  Currently, any account created there is automatically added to the Tools project.


Another duplicate server, labcontrol2001, runs in codfw and contains a duplicate config.  It is largely vestigial, but does provide backup DNS service via the labs-ns1 service name.
=== Controller ===


=== Network ===
The OpenStack controller box <code>cloudcontrol1003</code> runs the Glance and Keystone services, as well as nova-conductor and nova-scheduler.  It is also the preferred place to access the OpenStack command-line client.


The network node ('labnet1001') hosts the nova-network service.  We currently run a single labs-wide network that supports all lab nodes and projects.  In the future we hope to use [http://docs.openstack.org/developer/neutron/ Openstack Neutron] for our network setup, but it doesn't support our use-case; to use neutron we'll need to switch to one network per project.
A second server, <code>cloudcontrol1004</code> is present as well.


Labnet1001 also hosts the nova-api service.
=== Network ===


Soon an additional server, labnet1002, will provide either redundant network service or as a hot spare. To be determined.
In the <code>eqiad1-r</code> region, we use [http://docs.openstack.org/developer/neutron/ Openstack Neutron] which runs on servers <code>cloudnet1003</code> and <code>cloudnet1004</code>.


=== Virtualization ===
=== Virtualization ===


[[File:Labs_cluster_instance_POV.png|thumb|The servers that a labs instance talks to]]
See the [[Portal:Cloud VPS/Admin/Deployments|deployments]] page for a list of hypervisors per region and their current status.


There are currently thirteen virtualization nodes in labs, named labvirt1001-1013, all running in eqiad:
Cloudvirt hosts (also known as hypervisors) are pooled or depooled using the <code>profile::openstack::eqiad1::nova::scheduler_pool</code> key in Puppet Hiera.
 
* 1001-1009 are high-powered multi-CPU [[HP_DL380p | HP servers]]; each of them hosts dozens of virtual machines.
* 1010-1013 are similar to 1001-1009 but with large SSD raids.
* 1014 is identical to 1012 and 1013 but kept empty as an emergency evacuation node.


=== Storage ===
=== Storage ===


Labs uses shared storage for several purposes:
Most Cloud VPS projects do not use shared NFS storage. If they need NFS, these are the available options:


* Each member of a project has a project-wide shared home directory.
* Each member of a project has a project-wide shared home directory.
* Each project has a public shared volume, generally mounted to /data/project
* The project has a public shared volume, generally mounted to /data/project
 
All of the above are hosted on various NFS servers (labstore* and cloudstore*).


All of the above are hosted on an NFS server named labstore1001.  There's a hot-swappable backup, labstore1002, which is generally turned off.
=== Monitoring ===


=== monitoring ===
Most OpenStack-related services are monitored in Icinga just like other production services.


Labmon1001 runs statsd and graphite. It monitors the state of labs instances and collects stats and sends alerts as needed.
VMs in the <code>tools</code> and <code>deployment-prep</code> projects are monitored with [http://shinken.wmflabs.org Shinken].


=== ldap ===
=== LDAP ===


Ldap is used for services throughout the WMF. The primary ldap server is Neptunium, running in eqiad.  The secondary server is Nembus, running in codfw.
LDAP is used for services throughout the WMF. The same LDAP database keeps track of project management and SSH keys for logins on VPS servers. LDAP is hosted on seaborgium and neptunium; The LDAP server software is OpenLDAP.


The LDAP server software is opendj.  Each labs instance has an /etc/ldap.conf file (managed by puppet) that maintains info about the ldap servers.
Each Cloud VPS instance has an <code>/etc/ldap.conf</code> file (managed by Puppet) with setting on how to access the LDAP servers.


=== dns ===
=== DNS ===


[[Portal:Cloud_VPS/DNS|DNS]] is handled by PowerDNS.  Private DNS entries (e.g. foo.eqiad.wmflabs) are created via Designate Sink and stored in a PDNS server using a mysql backend.  Public DNS entries are created via Horizon and the designate API.
[[Portal:Cloud_VPS/DNS|DNS]] is handled by PowerDNS.  Private DNS entries (e.g. foo.eqiad1.wikimedia.cloud) are created via [[Portal:Cloud_VPS/Admin/DNS/Designate|Designate]] Sink and stored in a PDNS server using a MySQL backend.  Public DNS entries are created via Horizon and the Designate API.


[[File:Labs_dns_simplified.png|thumb|Future, simpler Labs DNS implementation using Horizon]]
[[File:Wmcs dns.pdf|center|750px|DNS in WMCS]]

Latest revision as of 14:53, 5 October 2021

Cloud VPS is a virtualization cloud that uses OpenStack Compute. Base images are managed with Glance, and authentication uses LDAP-backed Keystone.

Cloud VPS currently runs in a single data center in Ashburn, Virginia. In the future, it will span two or more data centers, with a slightly different configuration in each.

For troubleshooting immediate issues, visit Portal:Cloud_VPS/Admin/Troubleshooting.

File:OpenStack at WMCS.pdf

Cloud VPS Eqiad (Ashburn, VA)

Regions

We have one region: eqiad1-r (also referred to as eqiad1).

The eqiad name is based on the naming convention for clusters.

Horizon

Most users will manage their virtual servers using Horizon. Horizon is an upstream OpenStack web interface for the OpenStack APIs. Our Horizon site also includes several custom dashboards to access special WMCS features not available in stock Horizon.

Horizon is hosted on labweb1001.wikimedia.org and labweb1002.wikimedia.org and can be accessed at https://horizon.wikimedia.org.

Individual user accounts on WMCS can also be created via Striker, which is available at https://toolsadmin.wikimedia.org. Currently, any account created there is automatically added to the Tools project.

Controller

The OpenStack controller box cloudcontrol1003 runs the Glance and Keystone services, as well as nova-conductor and nova-scheduler. It is also the preferred place to access the OpenStack command-line client.

A second server, cloudcontrol1004, is present as well.

Network

In the eqiad1-r region, we use OpenStack Neutron, which runs on servers cloudnet1003 and cloudnet1004.

Virtualization

See the deployments page for a list of hypervisors per region and their current status.

Cloudvirt hosts (also known as hypervisors) are pooled or depooled using the profile::openstack::eqiad1::nova::scheduler_pool key in Puppet Hiera.
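As a rough illustration of what pooling and depooling amount to, the sketch below models the Hiera data as a plain Python dictionary. It assumes the key holds a list of hypervisor host names; that shape, the helper names pool_host and depool_host, and the cloudvirt-example-* host names are all hypothetical and are not taken from the real Puppet repository.

```python
# Sketch only: models the Hiera key as a list of host names (an assumption).
SCHEDULER_POOL_KEY = "profile::openstack::eqiad1::nova::scheduler_pool"


def depool_host(hiera, host):
    """Return a copy of the Hiera data with `host` removed from the pool."""
    remaining = [h for h in hiera.get(SCHEDULER_POOL_KEY, []) if h != host]
    return {**hiera, SCHEDULER_POOL_KEY: remaining}


def pool_host(hiera, host):
    """Return a copy of the Hiera data with `host` present in the pool."""
    current = list(hiera.get(SCHEDULER_POOL_KEY, []))
    if host not in current:
        current.append(host)
    return {**hiera, SCHEDULER_POOL_KEY: sorted(current)}


# Hypothetical host names, used only for illustration.
example_hiera = {
    SCHEDULER_POOL_KEY: ["cloudvirt-example-01", "cloudvirt-example-02"],
}
after_depool = depool_host(example_hiera, "cloudvirt-example-01")
after_repool = pool_host(after_depool, "cloudvirt-example-01")
```

Both helpers return a new dictionary rather than mutating the input, which mirrors the idea that a pool change is a reviewed edit to the Hiera data rather than an in-place runtime change.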

Storage

Most Cloud VPS projects do not use shared NFS storage. If they need NFS, these are the available options:

  • Each member of a project has a project-wide shared home directory.
  • The project has a public shared volume, generally mounted at /data/project.

All of the above are hosted on various NFS servers (labstore* and cloudstore*).

Monitoring

Most OpenStack-related services are monitored in Icinga just like other production services.

VMs in the tools and deployment-prep projects are monitored with Shinken.

LDAP

LDAP is used for services throughout the WMF. The same LDAP database keeps track of project management and SSH keys for logins on VPS servers. LDAP is hosted on seaborgium and neptunium; the LDAP server software is OpenLDAP.

Each Cloud VPS instance has an /etc/ldap.conf file (managed by Puppet) with settings for accessing the LDAP servers.
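To inspect such a file from a script, a minimal parser might look like the sketch below. It assumes the common ldap.conf layout of one "keyword value" pair per line with # comments. The sample contents (the uri, base, and ssl values) are hypothetical placeholders, not the settings Puppet actually deploys.

```python
def parse_ldap_conf(text):
    """Parse 'keyword value' lines into a dict, skipping blanks and comments."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first run of whitespace (spaces or tabs).
        parts = line.split(None, 1)
        key = parts[0].lower()
        value = parts[1].strip() if len(parts) > 1 else ""
        settings[key] = value
    return settings


# Hypothetical sample; real values are managed by Puppet.
SAMPLE_LDAP_CONF = """\
# Example only
uri ldap://ldap.example.org
base dc=example,dc=org
ssl start_tls
"""

sample_settings = parse_ldap_conf(SAMPLE_LDAP_CONF)
```

On an instance, the same function could be fed the text of /etc/ldap.conf, for example via open("/etc/ldap.conf").read().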

DNS

DNS is handled by PowerDNS. Private DNS entries (e.g. foo.eqiad1.wikimedia.cloud) are created via Designate Sink and stored in a PDNS server using a MySQL backend. Public DNS entries are created via Horizon and the Designate API.
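The naming pattern in the example above can be checked programmatically. The sketch below takes the zone suffix eqiad1.wikimedia.cloud from that example and treats it as a single private zone, which is an assumption; the helper names is_private_name and host_label are hypothetical and not part of Designate or PowerDNS.

```python
# Zone suffix taken from the example name in the text; treating it as the
# one private zone is an assumption made for illustration.
PRIVATE_ZONE = "eqiad1.wikimedia.cloud"


def _normalize(name):
    """Lower-case a DNS name and drop any trailing root dot."""
    return name.lower().rstrip(".")


def is_private_name(name, zone=PRIVATE_ZONE):
    """Return True if `name` is a record below the private zone."""
    return _normalize(name).endswith("." + zone)


def host_label(name, zone=PRIVATE_ZONE):
    """Return the part of `name` that precedes the private zone suffix."""
    if not is_private_name(name, zone):
        raise ValueError(f"{name!r} is not in zone {zone!r}")
    return _normalize(name)[: -len(zone) - 1]
```

The comparison is done on the full ".zone" suffix so that a name that merely ends with the same characters, without a label boundary, is not mistaken for a private entry.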

File:Wmcs dns.pdf