Portal:Toolforge/Nodes

From Wikitech
Jump to navigation Jump to search
imported>Dvorapa
m (sort)
imported>SRodlund
Line 1: Line 1:
The list of [[Toolforge]] nodes. This documentation provides information about what kind of nodes exists in the cluster, what every node does and a prospective list of canary server for common operations like testing new stuff.
+
{{Template:Toolforge nav}}
 +
 
 +
== Overview ==
 +
 
 +
The list of [[Toolforge]] nodes. This page provides information about what kind of nodes exists in the cluster, what every node does and a prospective list of canary server for common operations like testing new stuff.
  
 
= Complete list of nodes =
 
= Complete list of nodes =
Line 6: Line 10:
  
 
Take this list as an example, and refer to the source of trust for actual/current data.
 
Take this list as an example, and refer to the source of trust for actual/current data.
 
 
{{Collapse top|list of Toolforge nodes}}
 
{{Collapse top|list of Toolforge nodes}}
 
<pre>
 
<pre>
Line 175: Line 178:
 
Nodes that run GridEngine and are the pool to run standard Toolforge jobs workloads.
 
Nodes that run GridEngine and are the pool to run standard Toolforge jobs workloads.
  
There should be like 20 or 30 of them.
+
There should be 20 or 30 of them.
  
 
* Nodes are like: '''tools-exec-1401.tools.eqiad.wmflabs'''
 
* Nodes are like: '''tools-exec-1401.tools.eqiad.wmflabs'''
Line 182: Line 185:
 
== Kubernetes nodes ==
 
== Kubernetes nodes ==
  
Nodes that run kubernetes and hold the k8s workloads (pods).
+
Nodes that run Kubernetes and hold the k8s workloads (pods).
  
 
There should be like 10 or 20 of them.
 
There should be like 10 or 20 of them.
Line 192: Line 195:
 
== PAWS nodes ==
 
== PAWS nodes ==
  
Nodes that run the [[PAWS]] deployment (using kubernetes). For several reasons, PAWS doesn't share the main k8s deployment.
+
Nodes that run the [[PAWS]] deployment (using Kubernetes). For several reasons, PAWS doesn't share the main k8s deployment.
  
 
There should be like 10 or 20 of them.
 
There should be like 10 or 20 of them.
Line 201: Line 204:
 
== Web nodes ==
 
== Web nodes ==
  
Nodes that acts as frontend web hosts for tools in Toolforge. They are also part of the Grid Engine grid.
+
Nodes that act as frontend web hosts for tools in Toolforge. They are also part of the Grid Engine grid.
  
 
There should be like 10 or 20 of them.
 
There should be like 10 or 20 of them.
Line 222: Line 225:
 
{{See also|Portal:Toolforge/Admin#Granting_a_tool_write_access_to_Elasticsearch}}
 
{{See also|Portal:Toolforge/Admin#Granting_a_tool_write_access_to_Elasticsearch}}
  
Elasticsearch cluster for use by tools. All indicies are available for read-only access inside Toolforge. Writing requires a username and password.
+
Elasticsearch cluster for use by tools. All indices are available for read-only access inside Toolforge. Writing requires a username and password.
  
There should be like 3 nodes.
+
There should be 3 nodes.
  
 
* Nodes are like: '''tools-elastic-01.tools.eqiad.wmflabs'''
 
* Nodes are like: '''tools-elastic-01.tools.eqiad.wmflabs'''
Line 232: Line 235:
 
Nodes related to the etcd deployment for the [[flannel]] network overlay.
 
Nodes related to the etcd deployment for the [[flannel]] network overlay.
  
There should be like 3 nodes.
+
There should be 3 nodes.
  
 
* Nodes are like: '''tools-flannel-etcd-01.tools.eqiad.wmflabs'''
 
* Nodes are like: '''tools-flannel-etcd-01.tools.eqiad.wmflabs'''
Line 289: Line 292:
 
Depending on the test or task you are doing, you may need to craft a different list:
 
Depending on the test or task you are doing, you may need to craft a different list:
  
* Different OS (Ubuntu trusty, Debian jessie, Debian stretch)
+
* Different OS (Ubuntu trusty, Debian Jessie, Debian stretch)
 
* By Linux kernel version
 
* By Linux kernel version
 
* By different workload, usage or general load
 
* By different workload, usage or general load
Line 303: Line 306:
 
</syntaxhighlight>
 
</syntaxhighlight>
  
= See also =
+
{{:Help:Cloud Services communication}}
  
Some related information.
+
== See also ==
  
 
* [[Portal:Toolforge/Admin | General Toolforge admin docs ]]
 
* [[Portal:Toolforge/Admin | General Toolforge admin docs ]]
Line 311: Line 314:
  
 
[[Category:Toolforge|Nodes]]
 
[[Category:Toolforge|Nodes]]
 +
[[Category:Toolforge]]
 +
[[Category:Documentation]]
 +
[[Category:Cloud Services]]

Revision as of 20:15, 14 February 2020

Overview

The list of Toolforge nodes. This page describes what kinds of nodes exist in the cluster, what each node does, and suggests a list of canary servers for common operations such as testing changes.

Complete list of nodes

The source of truth for this information is https://tools.wmflabs.org/openstack-browser/project/tools

Treat any list on this page as an example, and refer to the source of truth for current data.
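
If you need this list from a script, one rough option is to scrape the hostnames out of the openstack-browser page. This is only a sketch: the page's HTML layout is an assumption and may change.

user@laptop:~$ curl -s https://tools.wmflabs.org/openstack-browser/project/tools | grep -o 'tools-[a-z0-9.-]*' | sort -u   # extract unique tools-* hostnames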

Purpose of nodes

An explanation of what each type of node is for.

Bastion nodes

Servers that give users SSH access to the cluster. You can submit your grid jobs from here.

They are not meant to run any actual workload. There are usually about 3 of them.

  • Nodes are named like: tools-bastion-01.tools.eqiad.wmflabs
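
Once logged in to a bastion, a typical flow is to switch to your tool account and submit a job to the grid. A minimal sketch, where mytool and task.sh are placeholders:

user@tools-bastion-01:~$ become mytool                           # switch to the tool account
tools.mytool@tools-bastion-01:~$ jsub -N example-job ./task.sh   # submit task.sh as a grid job
tools.mytool@tools-bastion-01:~$ qstat                           # check the job's status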

GridEngine nodes

Nodes that run GridEngine and form the pool that runs standard Toolforge job workloads.

There should be 20 or 30 of them.

  • Nodes are named like: tools-exec-1401.tools.eqiad.wmflabs
  • Master nodes are named like: tools-grid-master.tools.eqiad.wmflabs
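
From a bastion, the grid can be inspected with the standard GridEngine commands (output abridged):

user@tools-bastion-01:~$ qhost          # exec hosts with their load and memory
user@tools-bastion-01:~$ qstat -u '*'   # jobs currently queued or running, for all users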

Kubernetes nodes

Nodes that run Kubernetes and hold the k8s workloads (pods).

There should be about 10 or 20 of them.

  • Nodes are named like: tools-worker-1027.tools.eqiad.wmflabs
  • Master nodes are named like: tools-k8s-master-01.tools.eqiad.wmflabs
  • etcd nodes are named like: tools-k8s-etcd-01.tools.eqiad.wmflabs
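
Tools that run on Kubernetes can inspect their own workloads with kubectl from a bastion. A minimal sketch; listing cluster-wide objects such as nodes usually requires admin credentials:

user@tools-bastion-01:~$ kubectl get pods    # pods in your tool's namespace
user@tools-bastion-01:~$ kubectl get nodes   # whole-cluster node list (admins only)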

PAWS nodes

Nodes that run the PAWS deployment (using Kubernetes). For several reasons, PAWS doesn't share the main k8s deployment.

There should be about 10 or 20 of them.

  • Nodes are named like: tools-paws-worker-1001.tools.eqiad.wmflabs
  • Master nodes are named like: tools-paws-master-01.tools.eqiad.wmflabs

Web nodes

Nodes that act as frontend web hosts for tools in Toolforge. They are also part of the Grid Engine grid.

There should be about 10 or 20 of them.

  • Generic nodes are named like: tools-webgrid-generic-1401.tools.eqiad.wmflabs
  • Nodes running lighttpd are named like: tools-webgrid-lighttpd-1401.tools.eqiad.wmflabs
  • General web proxy nodes: tools-proxy-01.tools.eqiad.wmflabs
  • Static content nodes: tools-static-10.tools.eqiad.wmflabs
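
Web workloads on these nodes are normally managed with the webservice command from a tool account. A sketch, with mytool as a placeholder:

tools.mytool@tools-bastion-01:~$ webservice --backend=gridengine lighttpd start   # start a web server on the web grid
tools.mytool@tools-bastion-01:~$ webservice status                                # see whether it is running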

Docker nodes

These nodes build and distribute Docker images.

There should be about 3 of them.

  • Builder nodes are named like: tools-docker-builder-05.tools.eqiad.wmflabs
  • Docker registry nodes are named like: tools-docker-registry-01.tools.eqiad.wmflabs
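
The registry speaks the standard Docker Registry v2 API, so its contents can be listed with curl. A sketch, assuming the registry is published at docker-registry.tools.wmflabs.org:

user@laptop:~$ curl -s https://docker-registry.tools.wmflabs.org/v2/_catalog   # JSON list of available repositories
user@laptop:~$ docker pull docker-registry.tools.wmflabs.org/<image>:latest    # then pull one of them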

Elasticsearch nodes

Elasticsearch cluster for use by tools. All indices are available for read-only access inside Toolforge. Writing requires a username and password.

There should be 3 nodes.

  • Nodes are named like: tools-elastic-01.tools.eqiad.wmflabs
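
From inside Toolforge, reads need no credentials while writes do. A minimal sketch using the conventional Elasticsearch port; myuser, mypass and myindex are placeholders:

user@tools-bastion-01:~$ curl -s http://tools-elastic-01.tools.eqiad.wmflabs:9200/_cat/indices                     # read-only: list all indices
user@tools-bastion-01:~$ curl -s -u myuser:mypass -XPUT http://tools-elastic-01.tools.eqiad.wmflabs:9200/myindex   # write: create an index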

Flannel nodes

Nodes related to the etcd deployment for the flannel network overlay.

There should be 3 nodes.

  • Nodes are named like: tools-flannel-etcd-01.tools.eqiad.wmflabs
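
Member health can be checked with etcdctl from one of these nodes. A sketch, assuming the etcd v2 tooling these deployments ship with:

user@tools-flannel-etcd-01:~$ etcdctl cluster-health   # reports whether each etcd member is reachable and healthy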

Checker nodes

TODO: fill me

Misc nodes

Various nodes that provide one-off services:

  • Clush master, controls the whole cluster using ClusterShell: tools-clushmaster-02.tools.eqiad.wmflabs
  • Cron master, runs cron jobs submitted by tool users: tools-cron-01.tools.eqiad.wmflabs
  • SMTP email for the cluster: tools-mail.tools.eqiad.wmflabs
  • DEB package builder: tools-package-builder-01.tools.eqiad.wmflabs
  • Prometheus deployment nodes: tools-prometheus-02.tools.eqiad.wmflabs
  • Redis nodes: tools-redis-1001.tools.eqiad.wmflabs
  • Misc services nodes (aptly and others): tools-services-01.tools.eqiad.wmflabs
  • TODO: fill me: tools-logs-02.tools.eqiad.wmflabs
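
On the cron master, tool crontabs conventionally wrap commands in jsub so the actual work runs on the grid rather than on the cron host itself. A sketch of a crontab entry; the script path is a placeholder:

# m h dom mon dow  command
0 * * * * /usr/bin/jsub -once -quiet /data/project/mytool/hourly-task.sh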

Canary nodes

A pre-selected list containing one node of each type, which you can use to test changes before deploying them to the whole cluster.

Depending on the test or task at hand, you may need to craft a different list (see the sketch after this list):

  • Different OS (Ubuntu Trusty, Debian Jessie, Debian Stretch)
  • By Linux kernel version
  • By different workload, usage or general load
  • By other criteria
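
For example, to group hosts by OS release you can query the codename on an ad-hoc set of nodes; lsb_release -sc prints the distribution codename. The host names here are only illustrative:

user@tools-clushmaster-01:~$ clush -w tools-exec-1401.tools.eqiad.wmflabs,tools-worker-1027.tools.eqiad.wmflabs "lsb_release -sc"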

On the Toolforge clushmaster node there is a canary host list ready to use with clush: /etc/clustershell/toolforge_canary_list.txt

It can be used like this:

user@tools-clushmaster-01:~$ clush --hostfile /etc/clustershell/toolforge_canary_list.txt "command"
[...]

Communication and support

We communicate and provide support through several primary channels. Please reach out with questions and to join the conversation.

Communicate with us:

  • Phabricator workboard #Cloud-Services: task tracking and bug reporting
  • IRC channel #wikimedia-cloud: general discussion and support
  • Mailing list cloud@: information about ongoing initiatives, general discussion and support
  • Announcement emails cloud-announce@: information about critical changes (all messages mirrored to cloud@)
  • News wiki page: information about major near-term plans
  • Blog "Clouds & Unicorns": learning more details about some of our work

See also

  • General Toolforge admin docs (Portal:Toolforge/Admin)