You are browsing a read-only backup copy of Wikitech. The live site can be found at wikitech.wikimedia.org

Orchestrator: Difference between revisions

imported>Kormat
(Initial version)
 
imported>Marostegui
(Quick draft of troubleshooting section, this needs more work)

Revision as of 11:07, 18 November 2020

Orchestrator (https://github.com/openark/orchestrator) is a service for managing MySQL cluster replication. The data-persistence SRE team is currently running a proof-of-concept deployment of it within WMF, with the aim of replacing Tendril/Dbtree.

Troubleshooting

Entry in hostname_resolve that maps to a bare hostname

+--------------------+--------------------+---------------------+
| hostname           | resolved_hostname  | resolved_timestamp  |
+--------------------+--------------------+---------------------+
| pc1008.eqiad.wmnet | pc1008             | 2020-11-18 10:11:58 |
+--------------------+--------------------+---------------------+

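An entry like the one above, where a fully qualified hostname resolves to a bare hostname (here pc1008.eqiad.wmnet to pc1008), can cause a 'ghost' cluster to appear, containing the bare-hostname version of the host. To fix this: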

systemctl stop orchestrator
orchestrator -c forget -i <instance>    (repeat for every instance in the ghost cluster)
orchestrator -c reset-hostname-resolve-cache
systemctl start orchestrator

Orchestrator must be stopped first; otherwise the running service reinserts the bad entry into hostname_resolve.
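
The fix sequence above could be wrapped in a small shell function along these lines. This is only a sketch: the function name forget_ghost_cluster and the ORCH and SYSTEMCTL override variables are hypothetical, and the list of ghost instances is assumed to be supplied by hand as arguments.

```shell
#!/bin/sh
# Hypothetical helper: these names are illustrative, not part of orchestrator.
# ORCH and SYSTEMCTL can be overridden, e.g. to dry-run with a stub command.
ORCH="${ORCH:-orchestrator}"
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

# Usage: forget_ghost_cluster <instance> [<instance> ...]
forget_ghost_cluster() {
    if [ "$#" -eq 0 ]; then
        echo "usage: forget_ghost_cluster <instance> [<instance> ...]" >&2
        return 1
    fi

    # Stop the service first so it cannot reinsert the bad entry.
    "$SYSTEMCTL" stop orchestrator || return 1

    # Forget each instance in the ghost cluster, one at a time.
    # On failure the service is left stopped so the operator can inspect it.
    for instance in "$@"; do
        "$ORCH" -c forget -i "$instance" || return 1
    done

    # Clear the cached hostname resolutions, then bring the service back.
    "$ORCH" -c reset-hostname-resolve-cache || return 1
    "$SYSTEMCTL" start orchestrator
}
```

For the example entry above, this would be called as forget_ghost_cluster pc1008, run as a user permitted to stop and start the service.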

The current hostname_resolve entries can be listed with orchestrator -c show-resolve-hosts.
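
To spot problem entries quickly, a filter like the following could flag rows where a fully qualified hostname resolves to a name without a dot. It is a sketch that assumes its input uses the pipe-delimited table layout shown earlier in this section (for example, output of a query against the hostname_resolve table); the output format of show-resolve-hosts is not assumed here. The function name find_bare_resolves is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: reads a pipe-delimited table on stdin and prints
# "hostname -> resolved_hostname" for every row whose hostname contains a dot
# but whose resolved name does not.
find_bare_resolves() {
    awk -F'|' '
        NF >= 4 {
            host = $2
            resolved = $3
            # Trim the padding that the table layout adds around each cell.
            gsub(/^[ \t]+|[ \t]+$/, "", host)
            gsub(/^[ \t]+|[ \t]+$/, "", resolved)
            # Skip the header row and rows with an empty resolved name.
            if (host == "hostname" || resolved == "") next
            if (index(host, ".") > 0 && index(resolved, ".") == 0)
                print host " -> " resolved
        }'
}
```

Border lines such as +-----+ contain no pipe characters, so they are ignored without special handling.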