You are browsing a read-only backup copy of Wikitech. The live site can be found at wikitech.wikimedia.org

Server Admin Log: Difference between revisions

From Wikitech-static
imported>Stashbot
(ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2146 (T312863)', diff saved to https://phabricator.wikimedia.org/P34396 and previous config saved to /var/cache/conftool/dbconfig/20220910-213300-ladsgroup.json)
imported>Stashbot
(ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T314041)', diff saved to https://phabricator.wikimedia.org/P34424 and previous config saved to /var/cache/conftool/dbconfig/20220912-012118-ladsgroup.json)
Line 1:
== 2022-09-12 ==
* 01:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34424 and previous config saved to /var/cache/conftool/dbconfig/20220912-012118-ladsgroup.json
* 00:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2173 ([[phab:T312863|T312863]])', diff saved to https://phabricator.wikimedia.org/P34423 and previous config saved to /var/cache/conftool/dbconfig/20220912-004952-ladsgroup.json
* 00:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2094.codfw.wmnet with reason: Maintenance
* 00:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2094.codfw.wmnet with reason: Maintenance
* 00:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2173.codfw.wmnet with reason: Maintenance
* 00:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2173.codfw.wmnet with reason: Maintenance
* 00:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311 ([[phab:T312863|T312863]])', diff saved to https://phabricator.wikimedia.org/P34422 and previous config saved to /var/cache/conftool/dbconfig/20220912-004915-ladsgroup.json
* 00:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311', diff saved to https://phabricator.wikimedia.org/P34421 and previous config saved to /var/cache/conftool/dbconfig/20220912-003409-ladsgroup.json
* 00:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311', diff saved to https://phabricator.wikimedia.org/P34420 and previous config saved to /var/cache/conftool/dbconfig/20220912-001902-ladsgroup.json
* 00:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311 ([[phab:T312863|T312863]])', diff saved to https://phabricator.wikimedia.org/P34419 and previous config saved to /var/cache/conftool/dbconfig/20220912-000356-ladsgroup.json
== 2022-09-11 ==
* 17:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2170:3311 ([[phab:T312863|T312863]])', diff saved to https://phabricator.wikimedia.org/P34418 and previous config saved to /var/cache/conftool/dbconfig/20220911-175643-ladsgroup.json
* 17:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2170.codfw.wmnet with reason: Maintenance
* 17:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2170.codfw.wmnet with reason: Maintenance
* 17:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311 ([[phab:T312863|T312863]])', diff saved to https://phabricator.wikimedia.org/P34417 and previous config saved to /var/cache/conftool/dbconfig/20220911-175621-ladsgroup.json
* 17:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311', diff saved to https://phabricator.wikimedia.org/P34416 and previous config saved to /var/cache/conftool/dbconfig/20220911-174114-ladsgroup.json
* 17:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311', diff saved to https://phabricator.wikimedia.org/P34415 and previous config saved to /var/cache/conftool/dbconfig/20220911-172608-ladsgroup.json
* 17:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311 ([[phab:T312863|T312863]])', diff saved to https://phabricator.wikimedia.org/P34414 and previous config saved to /var/cache/conftool/dbconfig/20220911-171102-ladsgroup.json
* 13:22 bmansurov@deploy1002: Finished deploy [airflow-dags/research@b9be20d]: (no justification provided) (duration: 00m 09s)
* 13:22 bmansurov@deploy1002: Started deploy [airflow-dags/research@b9be20d]: (no justification provided)
* 12:47 bmansurov@deploy1002: Finished deploy [airflow-dags/research@b9be20d]: (no justification provided) (duration: 00m 08s)
* 12:46 bmansurov@deploy1002: Started deploy [airflow-dags/research@b9be20d]: (no justification provided)
* 12:36 bmansurov@deploy1002: Finished deploy [airflow-dags/research@b9be20d]: (no justification provided) (duration: 00m 09s)
* 12:36 bmansurov@deploy1002: Started deploy [airflow-dags/research@b9be20d]: (no justification provided)
* 12:09 bmansurov@deploy1002: Finished deploy [airflow-dags/research@b9be20d]: (no justification provided) (duration: 00m 08s)
* 12:09 bmansurov@deploy1002: Started deploy [airflow-dags/research@b9be20d]: (no justification provided)
* 11:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3314 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34412 and previous config saved to /var/cache/conftool/dbconfig/20220911-114850-ladsgroup.json
* 11:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 11:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1144.eqiad.wmnet with reason: Maintenance
* 11:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34411 and previous config saved to /var/cache/conftool/dbconfig/20220911-114829-ladsgroup.json
* 11:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P34410 and previous config saved to /var/cache/conftool/dbconfig/20220911-113323-ladsgroup.json
* 11:26 bmansurov@deploy1002: Finished deploy [airflow-dags/research@b9be20d]: (no justification provided) (duration: 00m 09s)
* 11:26 bmansurov@deploy1002: Started deploy [airflow-dags/research@b9be20d]: (no justification provided)
* 11:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P34409 and previous config saved to /var/cache/conftool/dbconfig/20220911-111816-ladsgroup.json
* 11:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34408 and previous config saved to /var/cache/conftool/dbconfig/20220911-110310-ladsgroup.json
* 11:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2167:3311 ([[phab:T312863|T312863]])', diff saved to https://phabricator.wikimedia.org/P34407 and previous config saved to /var/cache/conftool/dbconfig/20220911-110228-ladsgroup.json
* 11:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2167.codfw.wmnet with reason: Maintenance
* 11:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2167.codfw.wmnet with reason: Maintenance
* 11:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153 ([[phab:T312863|T312863]])', diff saved to https://phabricator.wikimedia.org/P34406 and previous config saved to /var/cache/conftool/dbconfig/20220911-110207-ladsgroup.json
* 10:56 bmansurov@deploy1002: Finished deploy [airflow-dags/research@b9be20d]: (no justification provided) (duration: 00m 09s)
* 10:56 bmansurov@deploy1002: Started deploy [airflow-dags/research@b9be20d]: (no justification provided)
* 10:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P34405 and previous config saved to /var/cache/conftool/dbconfig/20220911-104700-ladsgroup.json
* 10:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P34404 and previous config saved to /var/cache/conftool/dbconfig/20220911-103154-ladsgroup.json
* 10:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153 ([[phab:T312863|T312863]])', diff saved to https://phabricator.wikimedia.org/P34403 and previous config saved to /var/cache/conftool/dbconfig/20220911-101647-ladsgroup.json
* 10:06 bmansurov@deploy1002: Finished deploy [airflow-dags/research@b9be20d]: (no justification provided) (duration: 00m 09s)
* 10:06 bmansurov@deploy1002: Started deploy [airflow-dags/research@b9be20d]: (no justification provided)
* 08:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2179 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34402 and previous config saved to /var/cache/conftool/dbconfig/20220911-084529-ladsgroup.json
* 08:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2179.codfw.wmnet with reason: Maintenance
* 08:45 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2179.codfw.wmnet with reason: Maintenance
* 04:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2153 ([[phab:T312863|T312863]])', diff saved to https://phabricator.wikimedia.org/P34401 and previous config saved to /var/cache/conftool/dbconfig/20220911-041936-ladsgroup.json
* 04:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2153.codfw.wmnet with reason: Maintenance
* 04:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2153.codfw.wmnet with reason: Maintenance
* 04:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146 ([[phab:T312863|T312863]])', diff saved to https://phabricator.wikimedia.org/P34400 and previous config saved to /var/cache/conftool/dbconfig/20220911-041914-ladsgroup.json
* 04:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P34399 and previous config saved to /var/cache/conftool/dbconfig/20220911-040407-ladsgroup.json
* 03:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P34398 and previous config saved to /var/cache/conftool/dbconfig/20220911-034901-ladsgroup.json
* 03:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146 ([[phab:T312863|T312863]])', diff saved to https://phabricator.wikimedia.org/P34397 and previous config saved to /var/cache/conftool/dbconfig/20220911-033355-ladsgroup.json
== 2022-09-10 ==
* 21:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2146 ([[phab:T312863|T312863]])', diff saved to https://phabricator.wikimedia.org/P34396 and previous config saved to /var/cache/conftool/dbconfig/20220910-213300-ladsgroup.json

Revision as of 01:21, 12 September 2022

2022-09-12

  • 01:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T314041)', diff saved to https://phabricator.wikimedia.org/P34424 and previous config saved to /var/cache/conftool/dbconfig/20220912-012118-ladsgroup.json
  • 00:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2173 (T312863)', diff saved to https://phabricator.wikimedia.org/P34423 and previous config saved to /var/cache/conftool/dbconfig/20220912-004952-ladsgroup.json
  • 00:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2094.codfw.wmnet with reason: Maintenance
  • 00:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2094.codfw.wmnet with reason: Maintenance
  • 00:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2173.codfw.wmnet with reason: Maintenance
  • 00:49 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2173.codfw.wmnet with reason: Maintenance
  • 00:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311 (T312863)', diff saved to https://phabricator.wikimedia.org/P34422 and previous config saved to /var/cache/conftool/dbconfig/20220912-004915-ladsgroup.json
  • 00:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311', diff saved to https://phabricator.wikimedia.org/P34421 and previous config saved to /var/cache/conftool/dbconfig/20220912-003409-ladsgroup.json
  • 00:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311', diff saved to https://phabricator.wikimedia.org/P34420 and previous config saved to /var/cache/conftool/dbconfig/20220912-001902-ladsgroup.json
  • 00:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2170:3311 (T312863)', diff saved to https://phabricator.wikimedia.org/P34419 and previous config saved to /var/cache/conftool/dbconfig/20220912-000356-ladsgroup.json

2022-09-11

  • 17:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2170:3311 (T312863)', diff saved to https://phabricator.wikimedia.org/P34418 and previous config saved to /var/cache/conftool/dbconfig/20220911-175643-ladsgroup.json
  • 17:56 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2170.codfw.wmnet with reason: Maintenance
  • 17:56 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2170.codfw.wmnet with reason: Maintenance
  • 17:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311 (T312863)', diff saved to https://phabricator.wikimedia.org/P34417 and previous config saved to /var/cache/conftool/dbconfig/20220911-175621-ladsgroup.json
  • 17:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311', diff saved to https://phabricator.wikimedia.org/P34416 and previous config saved to /var/cache/conftool/dbconfig/20220911-174114-ladsgroup.json
  • 17:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311', diff saved to https://phabricator.wikimedia.org/P34415 and previous config saved to /var/cache/conftool/dbconfig/20220911-172608-ladsgroup.json
  • 17:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311 (T312863)', diff saved to https://phabricator.wikimedia.org/P34414 and previous config saved to /var/cache/conftool/dbconfig/20220911-171102-ladsgroup.json
  • 13:22 bmansurov@deploy1002: Finished deploy [airflow-dags/research@b9be20d]: (no justification provided) (duration: 00m 09s)
  • 13:22 bmansurov@deploy1002: Started deploy [airflow-dags/research@b9be20d]: (no justification provided)
  • 12:47 bmansurov@deploy1002: Finished deploy [airflow-dags/research@b9be20d]: (no justification provided) (duration: 00m 08s)
  • 12:46 bmansurov@deploy1002: Started deploy [airflow-dags/research@b9be20d]: (no justification provided)
  • 12:36 bmansurov@deploy1002: Finished deploy [airflow-dags/research@b9be20d]: (no justification provided) (duration: 00m 09s)
  • 12:36 bmansurov@deploy1002: Started deploy [airflow-dags/research@b9be20d]: (no justification provided)
  • 12:09 bmansurov@deploy1002: Finished deploy [airflow-dags/research@b9be20d]: (no justification provided) (duration: 00m 08s)
  • 12:09 bmansurov@deploy1002: Started deploy [airflow-dags/research@b9be20d]: (no justification provided)
  • 11:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3314 (T314041)', diff saved to https://phabricator.wikimedia.org/P34412 and previous config saved to /var/cache/conftool/dbconfig/20220911-114850-ladsgroup.json
  • 11:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1144.eqiad.wmnet with reason: Maintenance
  • 11:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1144.eqiad.wmnet with reason: Maintenance
  • 11:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 (T314041)', diff saved to https://phabricator.wikimedia.org/P34411 and previous config saved to /var/cache/conftool/dbconfig/20220911-114829-ladsgroup.json
  • 11:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P34410 and previous config saved to /var/cache/conftool/dbconfig/20220911-113323-ladsgroup.json
  • 11:26 bmansurov@deploy1002: Finished deploy [airflow-dags/research@b9be20d]: (no justification provided) (duration: 00m 09s)
  • 11:26 bmansurov@deploy1002: Started deploy [airflow-dags/research@b9be20d]: (no justification provided)
  • 11:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142', diff saved to https://phabricator.wikimedia.org/P34409 and previous config saved to /var/cache/conftool/dbconfig/20220911-111816-ladsgroup.json
  • 11:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1142 (T314041)', diff saved to https://phabricator.wikimedia.org/P34408 and previous config saved to /var/cache/conftool/dbconfig/20220911-110310-ladsgroup.json
  • 11:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2167:3311 (T312863)', diff saved to https://phabricator.wikimedia.org/P34407 and previous config saved to /var/cache/conftool/dbconfig/20220911-110228-ladsgroup.json
  • 11:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2167.codfw.wmnet with reason: Maintenance
  • 11:02 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2167.codfw.wmnet with reason: Maintenance
  • 11:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T312863)', diff saved to https://phabricator.wikimedia.org/P34406 and previous config saved to /var/cache/conftool/dbconfig/20220911-110207-ladsgroup.json
  • 10:56 bmansurov@deploy1002: Finished deploy [airflow-dags/research@b9be20d]: (no justification provided) (duration: 00m 09s)
  • 10:56 bmansurov@deploy1002: Started deploy [airflow-dags/research@b9be20d]: (no justification provided)
  • 10:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P34405 and previous config saved to /var/cache/conftool/dbconfig/20220911-104700-ladsgroup.json
  • 10:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P34404 and previous config saved to /var/cache/conftool/dbconfig/20220911-103154-ladsgroup.json
  • 10:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2153 (T312863)', diff saved to https://phabricator.wikimedia.org/P34403 and previous config saved to /var/cache/conftool/dbconfig/20220911-101647-ladsgroup.json
  • 10:06 bmansurov@deploy1002: Finished deploy [airflow-dags/research@b9be20d]: (no justification provided) (duration: 00m 09s)
  • 10:06 bmansurov@deploy1002: Started deploy [airflow-dags/research@b9be20d]: (no justification provided)
  • 08:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2179 (T314041)', diff saved to https://phabricator.wikimedia.org/P34402 and previous config saved to /var/cache/conftool/dbconfig/20220911-084529-ladsgroup.json
  • 08:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2179.codfw.wmnet with reason: Maintenance
  • 08:45 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2179.codfw.wmnet with reason: Maintenance
  • 04:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2153 (T312863)', diff saved to https://phabricator.wikimedia.org/P34401 and previous config saved to /var/cache/conftool/dbconfig/20220911-041936-ladsgroup.json
  • 04:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2153.codfw.wmnet with reason: Maintenance
  • 04:19 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2153.codfw.wmnet with reason: Maintenance
  • 04:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T312863)', diff saved to https://phabricator.wikimedia.org/P34400 and previous config saved to /var/cache/conftool/dbconfig/20220911-041914-ladsgroup.json
  • 04:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P34399 and previous config saved to /var/cache/conftool/dbconfig/20220911-040407-ladsgroup.json
  • 03:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P34398 and previous config saved to /var/cache/conftool/dbconfig/20220911-034901-ladsgroup.json
  • 03:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2146 (T312863)', diff saved to https://phabricator.wikimedia.org/P34397 and previous config saved to /var/cache/conftool/dbconfig/20220911-033355-ladsgroup.json

2022-09-10

  • 21:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2146 (T312863)', diff saved to https://phabricator.wikimedia.org/P34396 and previous config saved to /var/cache/conftool/dbconfig/20220910-213300-ladsgroup.json
  • 21:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2146.codfw.wmnet with reason: Maintenance
  • 21:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2146.codfw.wmnet with reason: Maintenance
  • 21:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T312863)', diff saved to https://phabricator.wikimedia.org/P34395 and previous config saved to /var/cache/conftool/dbconfig/20220910-213238-ladsgroup.json
  • 21:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P34394 and previous config saved to /var/cache/conftool/dbconfig/20220910-211732-ladsgroup.json
  • 21:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P34393 and previous config saved to /var/cache/conftool/dbconfig/20220910-210225-ladsgroup.json
  • 20:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T312863)', diff saved to https://phabricator.wikimedia.org/P34392 and previous config saved to /var/cache/conftool/dbconfig/20220910-204719-ladsgroup.json
  • 19:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1142 (T314041)', diff saved to https://phabricator.wikimedia.org/P34391 and previous config saved to /var/cache/conftool/dbconfig/20220910-191455-ladsgroup.json
  • 19:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1142.eqiad.wmnet with reason: Maintenance
  • 19:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1142.eqiad.wmnet with reason: Maintenance
  • 19:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 (T314041)', diff saved to https://phabricator.wikimedia.org/P34390 and previous config saved to /var/cache/conftool/dbconfig/20220910-191434-ladsgroup.json
  • 18:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P34389 and previous config saved to /var/cache/conftool/dbconfig/20220910-185927-ladsgroup.json
  • 18:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141', diff saved to https://phabricator.wikimedia.org/P34388 and previous config saved to /var/cache/conftool/dbconfig/20220910-184421-ladsgroup.json
  • 18:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 (T314041)', diff saved to https://phabricator.wikimedia.org/P34387 and previous config saved to /var/cache/conftool/dbconfig/20220910-182914-ladsgroup.json
  • 17:41 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 17:41 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2139.codfw.wmnet with reason: Maintenance
  • 17:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314 (T314041)', diff saved to https://phabricator.wikimedia.org/P34386 and previous config saved to /var/cache/conftool/dbconfig/20220910-174141-ladsgroup.json
  • 17:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314', diff saved to https://phabricator.wikimedia.org/P34385 and previous config saved to /var/cache/conftool/dbconfig/20220910-172635-ladsgroup.json
  • 17:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314', diff saved to https://phabricator.wikimedia.org/P34384 and previous config saved to /var/cache/conftool/dbconfig/20220910-171127-ladsgroup.json
  • 16:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2137:3314 (T314041)', diff saved to https://phabricator.wikimedia.org/P34383 and previous config saved to /var/cache/conftool/dbconfig/20220910-165621-ladsgroup.json
  • 14:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2145 (T312863)', diff saved to https://phabricator.wikimedia.org/P34382 and previous config saved to /var/cache/conftool/dbconfig/20220910-145558-ladsgroup.json
  • 14:55 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2145.codfw.wmnet with reason: Maintenance
  • 14:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2145.codfw.wmnet with reason: Maintenance
  • 12:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 12:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 12:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T312863)', diff saved to https://phabricator.wikimedia.org/P34381 and previous config saved to /var/cache/conftool/dbconfig/20220910-121124-ladsgroup.json
  • 11:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P34380 and previous config saved to /var/cache/conftool/dbconfig/20220910-115617-ladsgroup.json
  • 11:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P34379 and previous config saved to /var/cache/conftool/dbconfig/20220910-114111-ladsgroup.json
  • 11:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T312863)', diff saved to https://phabricator.wikimedia.org/P34378 and previous config saved to /var/cache/conftool/dbconfig/20220910-112605-ladsgroup.json
  • 09:37 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2141.codfw.wmnet with reason: Maintenance
  • 09:37 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2141.codfw.wmnet with reason: Maintenance
  • 09:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130 (T312863)', diff saved to https://phabricator.wikimedia.org/P34377 and previous config saved to /var/cache/conftool/dbconfig/20220910-093703-ladsgroup.json
  • 09:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P34376 and previous config saved to /var/cache/conftool/dbconfig/20220910-092156-ladsgroup.json
  • 09:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P34375 and previous config saved to /var/cache/conftool/dbconfig/20220910-090650-ladsgroup.json
  • 08:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130 (T312863)', diff saved to https://phabricator.wikimedia.org/P34374 and previous config saved to /var/cache/conftool/dbconfig/20220910-085143-ladsgroup.json
  • 05:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1186 (T312863)', diff saved to https://phabricator.wikimedia.org/P34373 and previous config saved to /var/cache/conftool/dbconfig/20220910-052410-ladsgroup.json
  • 05:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1186.eqiad.wmnet with reason: Maintenance
  • 05:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1186.eqiad.wmnet with reason: Maintenance
  • 05:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T312863)', diff saved to https://phabricator.wikimedia.org/P34372 and previous config saved to /var/cache/conftool/dbconfig/20220910-052349-ladsgroup.json
  • 05:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P34371 and previous config saved to /var/cache/conftool/dbconfig/20220910-050842-ladsgroup.json
  • 04:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P34370 and previous config saved to /var/cache/conftool/dbconfig/20220910-045336-ladsgroup.json
  • 04:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T312863)', diff saved to https://phabricator.wikimedia.org/P34369 and previous config saved to /var/cache/conftool/dbconfig/20220910-043829-ladsgroup.json
  • 02:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2130 (T312863)', diff saved to https://phabricator.wikimedia.org/P34368 and previous config saved to /var/cache/conftool/dbconfig/20220910-025548-ladsgroup.json
  • 02:55 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2130.codfw.wmnet with reason: Maintenance
  • 02:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2130.codfw.wmnet with reason: Maintenance
  • 02:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116 (T312863)', diff saved to https://phabricator.wikimedia.org/P34367 and previous config saved to /var/cache/conftool/dbconfig/20220910-025526-ladsgroup.json
  • 02:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1141 (T314041)', diff saved to https://phabricator.wikimedia.org/P34366 and previous config saved to /var/cache/conftool/dbconfig/20220910-024401-ladsgroup.json
  • 02:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1141.eqiad.wmnet with reason: Maintenance
  • 02:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1141.eqiad.wmnet with reason: Maintenance
  • 02:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 (T314041)', diff saved to https://phabricator.wikimedia.org/P34365 and previous config saved to /var/cache/conftool/dbconfig/20220910-024339-ladsgroup.json
  • 02:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P34364 and previous config saved to /var/cache/conftool/dbconfig/20220910-024019-ladsgroup.json
  • 02:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P34363 and previous config saved to /var/cache/conftool/dbconfig/20220910-022833-ladsgroup.json
  • 02:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P34362 and previous config saved to /var/cache/conftool/dbconfig/20220910-022513-ladsgroup.json
  • 02:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121', diff saved to https://phabricator.wikimedia.org/P34361 and previous config saved to /var/cache/conftool/dbconfig/20220910-021326-ladsgroup.json
  • 02:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116 (T312863)', diff saved to https://phabricator.wikimedia.org/P34360 and previous config saved to /var/cache/conftool/dbconfig/20220910-021007-ladsgroup.json
  • 01:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1121 (T314041)', diff saved to https://phabricator.wikimedia.org/P34359 and previous config saved to /var/cache/conftool/dbconfig/20220910-015820-ladsgroup.json
  • 00:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2137:3314 (T314041)', diff saved to https://phabricator.wikimedia.org/P34358 and previous config saved to /var/cache/conftool/dbconfig/20220910-005046-ladsgroup.json
  • 00:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2137.codfw.wmnet with reason: Maintenance
  • 00:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2137.codfw.wmnet with reason: Maintenance
  • 00:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106 (T314041)', diff saved to https://phabricator.wikimedia.org/P34357 and previous config saved to /var/cache/conftool/dbconfig/20220910-005025-ladsgroup.json
  • 00:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106', diff saved to https://phabricator.wikimedia.org/P34356 and previous config saved to /var/cache/conftool/dbconfig/20220910-003518-ladsgroup.json
  • 00:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106', diff saved to https://phabricator.wikimedia.org/P34355 and previous config saved to /var/cache/conftool/dbconfig/20220910-002012-ladsgroup.json
  • 00:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2106 (T314041)', diff saved to https://phabricator.wikimedia.org/P34354 and previous config saved to /var/cache/conftool/dbconfig/20220910-000504-ladsgroup.json

2022-09-09

  • 22:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1184 (T312863)', diff saved to https://phabricator.wikimedia.org/P34353 and previous config saved to /var/cache/conftool/dbconfig/20220909-224245-ladsgroup.json
  • 22:42 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1184.eqiad.wmnet with reason: Maintenance
  • 22:42 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1184.eqiad.wmnet with reason: Maintenance
  • 22:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1118 (T312863)', diff saved to https://phabricator.wikimedia.org/P34352 and previous config saved to /var/cache/conftool/dbconfig/20220909-224223-ladsgroup.json
  • 22:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1118', diff saved to https://phabricator.wikimedia.org/P34351 and previous config saved to /var/cache/conftool/dbconfig/20220909-222717-ladsgroup.json
  • 22:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1118', diff saved to https://phabricator.wikimedia.org/P34350 and previous config saved to /var/cache/conftool/dbconfig/20220909-221210-ladsgroup.json
  • 21:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1118 (T312863)', diff saved to https://phabricator.wikimedia.org/P34349 and previous config saved to /var/cache/conftool/dbconfig/20220909-215704-ladsgroup.json
  • 20:27 herron@cumin1001: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=93) for new host dispatch-be1001.eqiad.wmnet
  • 20:27 herron@cumin1001: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) dispatch-be1001.eqiad.wmnet on all recursors
  • 20:27 herron@cumin1001: START - Cookbook sre.dns.wipe-cache dispatch-be1001.eqiad.wmnet on all recursors
  • 20:27 herron@cumin1001: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 20:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2116 (T312863)', diff saved to https://phabricator.wikimedia.org/P34348 and previous config saved to /var/cache/conftool/dbconfig/20220909-201857-ladsgroup.json
  • 20:18 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2116.codfw.wmnet with reason: Maintenance
  • 20:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2116.codfw.wmnet with reason: Maintenance
  • 20:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2112 (T312863)', diff saved to https://phabricator.wikimedia.org/P34347 and previous config saved to /var/cache/conftool/dbconfig/20220909-201835-ladsgroup.json
  • 20:03 herron@cumin1001: START - Cookbook sre.dns.netbox
  • 20:03 herron@cumin1001: START - Cookbook sre.ganeti.makevm for new host dispatch-be1001.eqiad.wmnet
  • 20:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2112', diff saved to https://phabricator.wikimedia.org/P34345 and previous config saved to /var/cache/conftool/dbconfig/20220909-200329-ladsgroup.json
  • 20:02 herron@cumin1001: END (PASS) - Cookbook sre.kafka.roll-restart-brokers (exit_code=0) for Kafka A:kafka-main-eqiad cluster: Roll restart of jvm daemons.
  • 19:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2112', diff saved to https://phabricator.wikimedia.org/P34344 and previous config saved to /var/cache/conftool/dbconfig/20220909-194822-ladsgroup.json
  • 19:46 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 19:45 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 19:45 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 19:44 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 19:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2112 (T312863)', diff saved to https://phabricator.wikimedia.org/P34343 and previous config saved to /var/cache/conftool/dbconfig/20220909-193316-ladsgroup.json
  • 18:21 herron@cumin1001: START - Cookbook sre.kafka.roll-restart-brokers for Kafka A:kafka-main-eqiad cluster: Roll restart of jvm daemons.
  • 16:21 herron@cumin1001: END (PASS) - Cookbook sre.kafka.roll-restart-brokers (exit_code=0) for Kafka A:kafka-main-codfw cluster: Roll restart of jvm daemons.
  • 15:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1118 (T312863)', diff saved to https://phabricator.wikimedia.org/P34342 and previous config saved to /var/cache/conftool/dbconfig/20220909-155234-ladsgroup.json
  • 15:52 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1118.eqiad.wmnet with reason: Maintenance
  • 15:52 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1118.eqiad.wmnet with reason: Maintenance
  • 15:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 (T312863)', diff saved to https://phabricator.wikimedia.org/P34341 and previous config saved to /var/cache/conftool/dbconfig/20220909-155213-ladsgroup.json
  • 15:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P34340 and previous config saved to /var/cache/conftool/dbconfig/20220909-153706-ladsgroup.json
  • 15:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P34339 and previous config saved to /var/cache/conftool/dbconfig/20220909-152159-ladsgroup.json
  • 15:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 (T312863)', diff saved to https://phabricator.wikimedia.org/P34338 and previous config saved to /var/cache/conftool/dbconfig/20220909-150651-ladsgroup.json
  • 14:44 moritzm: imported jenkins 2.346.3 to thirdparty/ci
  • 14:43 dcausse@deploy1002: Synchronized wmf-config/InitialiseSettings.php: T317381: Revert "Disable CirrusSearch completion suggester" (duration: 03m 57s)
  • 14:40 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 14:40 herron@cumin1001: START - Cookbook sre.kafka.roll-restart-brokers for Kafka A:kafka-main-codfw cluster: Roll restart of jvm daemons.
  • 14:36 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 14:36 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 14:35 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 13:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2112 (T312863)', diff saved to https://phabricator.wikimedia.org/P34336 and previous config saved to /var/cache/conftool/dbconfig/20220909-133846-ladsgroup.json
  • 13:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2112.codfw.wmnet with reason: Maintenance
  • 13:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2112.codfw.wmnet with reason: Maintenance
  • 13:33 dcausse: restarting blazegraph on wdqs2003 (BlazegraphFreeAllocatorsDecreasingRapidly)
  • 13:03 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
  • 12:24 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.cf (exit_code=0)
  • 12:24 ayounsi@cumin1001: START - Cookbook sre.network.cf
  • 12:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T312863)', diff saved to https://phabricator.wikimedia.org/P34334 and previous config saved to /var/cache/conftool/dbconfig/20220909-120029-ladsgroup.json
  • 11:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P34333 and previous config saved to /var/cache/conftool/dbconfig/20220909-114522-ladsgroup.json
  • 11:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P34331 and previous config saved to /var/cache/conftool/dbconfig/20220909-113016-ladsgroup.json
  • 11:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T312863)', diff saved to https://phabricator.wikimedia.org/P34330 and previous config saved to /var/cache/conftool/dbconfig/20220909-111509-ladsgroup.json
  • 10:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1137 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34329 and previous config saved to /var/cache/conftool/dbconfig/20220909-103334-root.json
  • 10:31 bmansurov@deploy1002: Finished deploy [airflow-dags/research@b9be20d]: (no justification provided) (duration: 00m 09s)
  • 10:31 bmansurov@deploy1002: Started deploy [airflow-dags/research@b9be20d]: (no justification provided)
  • 10:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1137 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34328 and previous config saved to /var/cache/conftool/dbconfig/20220909-101830-root.json
  • 10:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1137 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34327 and previous config saved to /var/cache/conftool/dbconfig/20220909-100324-root.json
  • 09:53 bmansurov@deploy1002: Finished deploy [airflow-dags/research@b9be20d]: (no justification provided) (duration: 00m 09s)
  • 09:53 bmansurov@deploy1002: Started deploy [airflow-dags/research@b9be20d]: (no justification provided)
  • 09:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1137 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34326 and previous config saved to /var/cache/conftool/dbconfig/20220909-094819-root.json
  • 09:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1137 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34325 and previous config saved to /var/cache/conftool/dbconfig/20220909-093314-root.json
  • 09:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1137 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34323 and previous config saved to /var/cache/conftool/dbconfig/20220909-091809-root.json
  • 08:59 cgoubert@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts wtp[1029-1033].eqiad.wmnet
  • 08:59 cgoubert@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:56 cgoubert@cumin1001: START - Cookbook sre.dns.netbox
  • 08:37 cgoubert@cumin1001: START - Cookbook sre.hosts.decommission for hosts wtp[1029-1033].eqiad.wmnet
  • 08:32 dcausse: rebuilding all completion indices in elastic@codfw
  • 08:16 dcausse: restarting blazegraph on wdqs2002 (BlazegraphFreeAllocatorsDecreasingRapidly)
  • 08:12 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
  • 08:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2106 (T314041)', diff saved to https://phabricator.wikimedia.org/P34322 and previous config saved to /var/cache/conftool/dbconfig/20220909-081103-ladsgroup.json
  • 08:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2106.codfw.wmnet with reason: Maintenance
  • 08:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2106.codfw.wmnet with reason: Maintenance
  • 08:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1121 (T314041)', diff saved to https://phabricator.wikimedia.org/P34321 and previous config saved to /var/cache/conftool/dbconfig/20220909-081042-ladsgroup.json
  • 08:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 08:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 08:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1121.eqiad.wmnet with reason: Maintenance
  • 08:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1121.eqiad.wmnet with reason: Maintenance
  • 08:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2097.codfw.wmnet with reason: Maintenance
  • 08:06 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2097.codfw.wmnet with reason: Maintenance
  • 08:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1106 (T312863)', diff saved to https://phabricator.wikimedia.org/P34320 and previous config saved to /var/cache/conftool/dbconfig/20220909-080609-ladsgroup.json
  • 08:06 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 08:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 08:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1106.eqiad.wmnet with reason: Maintenance
  • 08:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1106.eqiad.wmnet with reason: Maintenance
  • 07:47 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1157 (T312863)', diff saved to https://phabricator.wikimedia.org/P34319 and previous config saved to /var/cache/conftool/dbconfig/20220909-074710-ladsgroup.json
  • 07:47 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1157.eqiad.wmnet with reason: Maintenance
  • 07:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1157.eqiad.wmnet with reason: Maintenance
  • 07:46 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
  • 07:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-tool1007.eqiad.wmnet
  • 07:12 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1137 for upgrade', diff saved to https://phabricator.wikimedia.org/P34318 and previous config saved to /var/cache/conftool/dbconfig/20220909-071255-root.json
  • 07:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host an-tool1007.eqiad.wmnet
  • 05:44 marostegui: dbmaint s8 wikidatawiki eqiad T317349
  • 05:43 marostegui: dbmaint s3 testwikidatawiki eqiad T317349
  • 05:42 marostegui: dbmaint s4 commonswiki eqiad T317349
  • 05:41 marostegui: dbmaint s4 testcommonswiki eqiad T317349
  • 05:19 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 05:18 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 05:18 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 05:11 ebernhardson@deploy1002: Synchronized wmf-config/InitialiseSettings.php: cirrus: Switch all wikis from completion suggester to prefix search, yesterday's completion index builds in codfw weren't all successful and users are getting incomplete results (duration: 04m 01s)
  • 05:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 00:09 bmansurov@deploy1002: Finished deploy [airflow-dags/research@b9be20d]: (no justification provided) (duration: 00m 22s)
  • 00:08 bmansurov@deploy1002: Started deploy [airflow-dags/research@b9be20d]: (no justification provided)

2022-09-08

  • 23:56 bmansurov@deploy1002: Finished deploy [airflow-dags/research@b9be20d]: (no justification provided) (duration: 00m 27s)
  • 23:55 bmansurov@deploy1002: Started deploy [airflow-dags/research@b9be20d]: (no justification provided)
  • 21:09 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 21:09 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 21:09 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 21:08 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.39.0-wmf.28 refs T314189
  • 21:08 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 21:02 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 21:02 TheresNoTime: closing UTC late backport and config training
  • 21:01 samtar@deploy1002: Finished scap: Backport for Fix selser on html endpoints (T317215) (duration: 06m 48s)
  • 20:56 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 20:56 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 20:55 samtar@deploy1002: samtar and arlolra: Backport for Fix selser on html endpoints (T317215) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet
  • 20:55 samtar@deploy1002: Started scap: Backport for Fix selser on html endpoints (T317215)
  • 20:49 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 20:44 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 20:38 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 20:37 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 20:33 samtar@deploy1002: Finished scap: Backport for Fix selser on html endpoints (T317215) (duration: 12m 06s)
  • 20:31 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 20:21 samtar@deploy1002: samtar and arlolra: Backport for Fix selser on html endpoints (T317215) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
  • 20:21 samtar@deploy1002: Started scap: Backport for Fix selser on html endpoints (T317215)
  • 20:00 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 20:00 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 20:00 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 19:59 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 19:56 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: group2 wikis to 1.39.0-wmf.27 refs T314189
  • 19:43 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 19:43 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 19:43 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 19:42 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 19:36 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 19:36 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.39.0-wmf.28 refs T314189
  • 19:36 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 19:36 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 19:35 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 19:15 jhuneidi@deploy1002: Synchronized php: group1 wikis to 1.39.0-wmf.28 refs T314189 (duration: 03m 39s)
  • 19:14 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 19:13 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 19:13 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 19:12 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 19:11 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.39.0-wmf.28 refs T314189
  • 17:33 sukhe: stat1008: sudo ipmitool -I lanplus -H "stat1008.mgmt.eqiad.wmnet" -U root -E chassis power cycle
  • 17:22 bd808@deploy1002: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
  • 17:22 bd808@deploy1002: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
  • 17:22 bd808@deploy1002: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
  • 17:21 bd808@deploy1002: helmfile [codfw] START helmfile.d/services/developer-portal: apply
  • 17:21 bd808@deploy1002: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
  • 17:20 bd808@deploy1002: helmfile [staging] START helmfile.d/services/developer-portal: apply
  • 16:22 pt1979@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1005.mgmt.eqiad.wmnet with reboot policy FORCED
  • 16:04 dancy@deploy1002: Installing scap version "4.17.0" for 566 hosts
  • 15:57 pt1979@cumin1001: START - Cookbook sre.hosts.provision for host kafka-logging1005.mgmt.eqiad.wmnet with reboot policy FORCED
  • 15:55 pt1979@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:52 pt1979@cumin1001: START - Cookbook sre.dns.netbox
  • 15:51 pt1979@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:50 cgoubert@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=codfw,cluster=parsoid
  • 15:49 pt1979@cumin1001: START - Cookbook sre.dns.netbox
  • 15:45 cgoubert@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: wtp: Purge wtp servers following migration to parse (T317025) (duration: 04m 00s)
  • 15:40 pt1979@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:39 pt1979@cumin1001: START - Cookbook sre.dns.netbox
  • 15:36 oblivian@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=codfw,cluster=appserver
  • 15:36 oblivian@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=codfw,cluster=api_appserver
  • 15:35 oblivian@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=codfw,cluster=api-https
  • 15:33 akosiaris: restart etcdmirror on conf2005
  • 15:28 cgoubert@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: wtp: Purge wtp servers following migration to parse (T317025) (duration: 12m 48s)
  • 15:25 btullis@deploy1002: helmfile [eqiad] DONE helmfile.d/services/datahub: sync on main
  • 15:25 btullis@deploy1002: helmfile [eqiad] START helmfile.d/services/datahub: sync on main
  • 15:25 btullis@deploy1002: helmfile [codfw] DONE helmfile.d/services/datahub: sync on main
  • 15:24 btullis@deploy1002: helmfile [codfw] START helmfile.d/services/datahub: sync on main
  • 15:21 btullis@deploy1002: helmfile [staging] DONE helmfile.d/services/datahub: sync on main
  • 15:21 btullis@deploy1002: helmfile [staging] START helmfile.d/services/datahub: sync on main
  • 15:20 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 15:19 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 15:19 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 15:18 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 15:11 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
  • 15:02 moritzm: installing nginx security updates on bullseye
  • 14:58 papaul: maintenance on mr1-codfw complete
  • 14:57 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts wtp[1025-1028,1048].eqiad.wmnet
  • 14:57 cgoubert@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:54 cgoubert@cumin1001: START - Cookbook sre.dns.netbox
  • 14:40 cgoubert@cumin1001: START - Cookbook sre.hosts.decommission for hosts wtp[1025-1028,1048].eqiad.wmnet
  • 14:38 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts wtp[1043-1047].eqiad.wmnet
  • 14:38 cgoubert@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:38 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
  • 14:36 cgoubert@cumin1001: START - Cookbook sre.dns.netbox
  • 14:35 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
  • 14:25 cgoubert@cumin1001: START - Cookbook sre.hosts.decommission for hosts wtp[1043-1047].eqiad.wmnet
  • 14:23 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts wtp[1038-1042].eqiad.wmnet
  • 14:23 cgoubert@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:20 cgoubert@cumin1001: START - Cookbook sre.dns.netbox
  • 14:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1157 (re)pooling @ 100%: Will do maint later', diff saved to https://phabricator.wikimedia.org/P34312 and previous config saved to /var/cache/conftool/dbconfig/20220908-142029-ladsgroup.json
  • 14:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1136 (re)pooling @ 100%: Maint over long time ago', diff saved to https://phabricator.wikimedia.org/P34311 and previous config saved to /var/cache/conftool/dbconfig/20220908-141600-ladsgroup.json
  • 14:07 cgoubert@cumin1001: START - Cookbook sre.hosts.decommission for hosts wtp[1038-1042].eqiad.wmnet
  • 14:06 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts wtp1037.eqiad.wmnet
  • 14:06 cgoubert@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1157 (re)pooling @ 75%: Will do maint later', diff saved to https://phabricator.wikimedia.org/P34310 and previous config saved to /var/cache/conftool/dbconfig/20220908-140524-ladsgroup.json
  • 14:04 cgoubert@cumin1001: START - Cookbook sre.dns.netbox
  • 14:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1136 (re)pooling @ 75%: Maint over long time ago', diff saved to https://phabricator.wikimedia.org/P34309 and previous config saved to /var/cache/conftool/dbconfig/20220908-140055-ladsgroup.json
  • 14:00 papaul: ongoing maintenance on mr1-codfw
  • 13:57 cgoubert@cumin1001: START - Cookbook sre.hosts.decommission for hosts wtp1037.eqiad.wmnet
  • 13:55 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts wtp1036.eqiad.wmnet
  • 13:55 cgoubert@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:53 cgoubert@cumin1001: START - Cookbook sre.dns.netbox
  • 13:51 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
  • 13:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1157 (re)pooling @ 25%: Will do maint later', diff saved to https://phabricator.wikimedia.org/P34307 and previous config saved to /var/cache/conftool/dbconfig/20220908-135019-ladsgroup.json
  • 13:47 cgoubert@cumin1001: START - Cookbook sre.hosts.decommission for hosts wtp1036.eqiad.wmnet
  • 13:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1136 (re)pooling @ 25%: Maint over long time ago', diff saved to https://phabricator.wikimedia.org/P34305 and previous config saved to /var/cache/conftool/dbconfig/20220908-134550-ladsgroup.json
  • 13:43 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts wtp1035.eqiad.wmnet
  • 13:43 cgoubert@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:43 vgutierrez: rolling upgrade to ats 9 in cp drmrs - T309651
  • 13:41 cgoubert@cumin1001: START - Cookbook sre.dns.netbox
  • 13:39 vgutierrez: disable puppet on A:cp-drmrs during the update to ATS 9.1.3 - T309651
  • 13:36 cgoubert@cumin1001: START - Cookbook sre.hosts.decommission for hosts wtp1035.eqiad.wmnet
  • 13:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1157 (re)pooling @ 10%: Will do maint later', diff saved to https://phabricator.wikimedia.org/P34304 and previous config saved to /var/cache/conftool/dbconfig/20220908-133514-ladsgroup.json
  • 13:31 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts wtp1034.eqiad.wmnet
  • 13:31 cgoubert@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1136 (re)pooling @ 10%: Maint over long time ago', diff saved to https://phabricator.wikimedia.org/P34303 and previous config saved to /var/cache/conftool/dbconfig/20220908-133045-ladsgroup.json
  • 13:30 marostegui@cumin1001: dbctl commit (dc=all): 'es1031 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34302 and previous config saved to /var/cache/conftool/dbconfig/20220908-133036-root.json
  • 13:30 marostegui@cumin1001: dbctl commit (dc=all): 'es1030 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34301 and previous config saved to /var/cache/conftool/dbconfig/20220908-133031-root.json
  • 13:30 marostegui@cumin1001: dbctl commit (dc=all): 'es1029 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34300 and previous config saved to /var/cache/conftool/dbconfig/20220908-133024-root.json
  • 13:29 moritzm: installing apache2 security updates on Bullseye
  • 13:28 cgoubert@cumin1001: START - Cookbook sre.dns.netbox
  • 13:23 cgoubert@cumin1001: START - Cookbook sre.hosts.decommission for hosts wtp1034.eqiad.wmnet
  • 13:15 marostegui@cumin1001: dbctl commit (dc=all): 'es1031 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34299 and previous config saved to /var/cache/conftool/dbconfig/20220908-131531-root.json
  • 13:15 marostegui@cumin1001: dbctl commit (dc=all): 'es1030 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34298 and previous config saved to /var/cache/conftool/dbconfig/20220908-131526-root.json
  • 13:15 marostegui@cumin1001: dbctl commit (dc=all): 'es1029 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34297 and previous config saved to /var/cache/conftool/dbconfig/20220908-131519-root.json
  • 13:00 marostegui@cumin1001: dbctl commit (dc=all): 'es1031 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34296 and previous config saved to /var/cache/conftool/dbconfig/20220908-130026-root.json
  • 13:00 marostegui@cumin1001: dbctl commit (dc=all): 'es1030 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34295 and previous config saved to /var/cache/conftool/dbconfig/20220908-130021-root.json
  • 13:00 marostegui@cumin1001: dbctl commit (dc=all): 'es1029 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34294 and previous config saved to /var/cache/conftool/dbconfig/20220908-130014-root.json
  • 12:56 aqu@deploy1002: Finished deploy [airflow-dags/analytics_test@9e4ed94]: (no justification provided) (duration: 00m 09s)
  • 12:56 aqu@deploy1002: Started deploy [airflow-dags/analytics_test@9e4ed94]: (no justification provided)
  • 12:45 marostegui@cumin1001: dbctl commit (dc=all): 'es1031 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34293 and previous config saved to /var/cache/conftool/dbconfig/20220908-124521-root.json
  • 12:45 marostegui@cumin1001: dbctl commit (dc=all): 'es1030 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34292 and previous config saved to /var/cache/conftool/dbconfig/20220908-124516-root.json
  • 12:45 marostegui@cumin1001: dbctl commit (dc=all): 'es1029 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34291 and previous config saved to /var/cache/conftool/dbconfig/20220908-124509-root.json
  • 12:42 aqu@deploy1002: Finished deploy [airflow-dags/analytics_test@9e4ed94]: (no justification provided) (duration: 00m 09s)
  • 12:42 aqu@deploy1002: Started deploy [airflow-dags/analytics_test@9e4ed94]: (no justification provided)
  • 12:39 marostegui@cumin1001: dbctl commit (dc=all): 'Promote es1027 to es1 eqiad master, promote es1026 to es2 eqiad master, promote es1028 to es3 eqiad master', diff saved to https://phabricator.wikimedia.org/P34290 and previous config saved to /var/cache/conftool/dbconfig/20220908-123955-marostegui.json
  • 12:30 marostegui@cumin1001: dbctl commit (dc=all): 'es1031 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34289 and previous config saved to /var/cache/conftool/dbconfig/20220908-123016-root.json
  • 12:30 marostegui@cumin1001: dbctl commit (dc=all): 'es1030 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34288 and previous config saved to /var/cache/conftool/dbconfig/20220908-123011-root.json
  • 12:30 marostegui@cumin1001: dbctl commit (dc=all): 'es1029 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34287 and previous config saved to /var/cache/conftool/dbconfig/20220908-123004-root.json
  • 12:26 cgoubert@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1033.eqiad.wmnet
  • 12:26 cgoubert@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1032.eqiad.wmnet
  • 12:26 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on wtp[1032-1033].eqiad.wmnet with reason: Downtiming replaced wtp servers
  • 12:26 cgoubert@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on wtp[1032-1033].eqiad.wmnet with reason: Downtiming replaced wtp servers
  • 12:25 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on wtp[1032-1033].mgmt with reason: Downtiming replaced wtp servers
  • 12:25 cgoubert@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on wtp[1032-1033].mgmt with reason: Downtiming replaced wtp servers
  • 12:17 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
  • 12:15 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
  • 12:15 marostegui@cumin1001: dbctl commit (dc=all): 'es1031 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34286 and previous config saved to /var/cache/conftool/dbconfig/20220908-121511-root.json
  • 12:15 marostegui@cumin1001: dbctl commit (dc=all): 'es1030 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34285 and previous config saved to /var/cache/conftool/dbconfig/20220908-121506-root.json
  • 12:14 marostegui@cumin1001: dbctl commit (dc=all): 'es1029 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34284 and previous config saved to /var/cache/conftool/dbconfig/20220908-121459-root.json
  • 12:12 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
  • 12:11 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
  • 12:09 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
  • 12:06 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
  • 12:05 marostegui@cumin1001: dbctl commit (dc=all): 'Depool es1029 es1030 es1031 for upgrade', diff saved to https://phabricator.wikimedia.org/P34283 and previous config saved to /var/cache/conftool/dbconfig/20220908-120528-root.json
  • 12:04 marostegui@cumin1001: dbctl commit (dc=all): 'es1028 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34282 and previous config saved to /var/cache/conftool/dbconfig/20220908-120439-root.json
  • 12:04 marostegui@cumin1001: dbctl commit (dc=all): 'es1027 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34281 and previous config saved to /var/cache/conftool/dbconfig/20220908-120435-root.json
  • 12:04 marostegui@cumin1001: dbctl commit (dc=all): 'es1026 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34280 and previous config saved to /var/cache/conftool/dbconfig/20220908-120427-root.json
  • 12:03 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
  • 11:54 marostegui@cumin1001: dbctl commit (dc=all): 'es2025 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34278 and previous config saved to /var/cache/conftool/dbconfig/20220908-115407-root.json
  • 11:54 marostegui@cumin1001: dbctl commit (dc=all): 'es2022 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34277 and previous config saved to /var/cache/conftool/dbconfig/20220908-115401-root.json
  • 11:53 marostegui@cumin1001: dbctl commit (dc=all): 'es1025 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34276 and previous config saved to /var/cache/conftool/dbconfig/20220908-115355-root.json
  • 11:53 marostegui@cumin1001: dbctl commit (dc=all): 'es1022 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34275 and previous config saved to /var/cache/conftool/dbconfig/20220908-115351-root.json
  • 11:49 marostegui@cumin1001: dbctl commit (dc=all): 'es1028 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34274 and previous config saved to /var/cache/conftool/dbconfig/20220908-114934-root.json
  • 11:49 marostegui@cumin1001: dbctl commit (dc=all): 'es1027 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34273 and previous config saved to /var/cache/conftool/dbconfig/20220908-114930-root.json
  • 11:49 marostegui@cumin1001: dbctl commit (dc=all): 'es1026 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34272 and previous config saved to /var/cache/conftool/dbconfig/20220908-114922-root.json
  • 11:39 marostegui@cumin1001: dbctl commit (dc=all): 'es2025 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34271 and previous config saved to /var/cache/conftool/dbconfig/20220908-113902-root.json
  • 11:38 marostegui@cumin1001: dbctl commit (dc=all): 'es2022 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34270 and previous config saved to /var/cache/conftool/dbconfig/20220908-113856-root.json
  • 11:38 marostegui@cumin1001: dbctl commit (dc=all): 'es1025 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34269 and previous config saved to /var/cache/conftool/dbconfig/20220908-113850-root.json
  • 11:38 marostegui@cumin1001: dbctl commit (dc=all): 'es1022 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34268 and previous config saved to /var/cache/conftool/dbconfig/20220908-113846-root.json
  • 11:34 marostegui@cumin1001: dbctl commit (dc=all): 'es1028 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34267 and previous config saved to /var/cache/conftool/dbconfig/20220908-113429-root.json
  • 11:34 marostegui@cumin1001: dbctl commit (dc=all): 'es1027 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34266 and previous config saved to /var/cache/conftool/dbconfig/20220908-113425-root.json
  • 11:34 marostegui@cumin1001: dbctl commit (dc=all): 'es1026 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34265 and previous config saved to /var/cache/conftool/dbconfig/20220908-113417-root.json
  • 11:23 marostegui@cumin1001: dbctl commit (dc=all): 'es2025 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34264 and previous config saved to /var/cache/conftool/dbconfig/20220908-112357-root.json
  • 11:23 marostegui@cumin1001: dbctl commit (dc=all): 'es2022 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34263 and previous config saved to /var/cache/conftool/dbconfig/20220908-112351-root.json
  • 11:23 marostegui@cumin1001: dbctl commit (dc=all): 'es1025 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34262 and previous config saved to /var/cache/conftool/dbconfig/20220908-112345-root.json
  • 11:23 marostegui@cumin1001: dbctl commit (dc=all): 'es1022 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34261 and previous config saved to /var/cache/conftool/dbconfig/20220908-112341-root.json
  • 11:23 marostegui@cumin1001: dbctl commit (dc=all): 'es2034 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34260 and previous config saved to /var/cache/conftool/dbconfig/20220908-112329-root.json
  • 11:23 marostegui@cumin1001: dbctl commit (dc=all): 'es2033 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34259 and previous config saved to /var/cache/conftool/dbconfig/20220908-112324-root.json
  • 11:23 marostegui@cumin1001: dbctl commit (dc=all): 'es2032 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34258 and previous config saved to /var/cache/conftool/dbconfig/20220908-112319-root.json
  • 11:19 marostegui@cumin1001: dbctl commit (dc=all): 'es1028 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34257 and previous config saved to /var/cache/conftool/dbconfig/20220908-111924-root.json
  • 11:19 marostegui@cumin1001: dbctl commit (dc=all): 'es1027 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34256 and previous config saved to /var/cache/conftool/dbconfig/20220908-111920-root.json
  • 11:19 marostegui@cumin1001: dbctl commit (dc=all): 'es1026 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34255 and previous config saved to /var/cache/conftool/dbconfig/20220908-111912-root.json
  • 11:08 marostegui@cumin1001: dbctl commit (dc=all): 'es2025 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34254 and previous config saved to /var/cache/conftool/dbconfig/20220908-110852-root.json
  • 11:08 marostegui@cumin1001: dbctl commit (dc=all): 'es2022 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34253 and previous config saved to /var/cache/conftool/dbconfig/20220908-110846-root.json
  • 11:08 marostegui@cumin1001: dbctl commit (dc=all): 'es1025 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34252 and previous config saved to /var/cache/conftool/dbconfig/20220908-110840-root.json
  • 11:08 marostegui@cumin1001: dbctl commit (dc=all): 'es1022 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34251 and previous config saved to /var/cache/conftool/dbconfig/20220908-110836-root.json
  • 11:08 marostegui@cumin1001: dbctl commit (dc=all): 'es2034 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34250 and previous config saved to /var/cache/conftool/dbconfig/20220908-110825-root.json
  • 11:08 marostegui@cumin1001: dbctl commit (dc=all): 'es2033 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34249 and previous config saved to /var/cache/conftool/dbconfig/20220908-110819-root.json
  • 11:08 marostegui@cumin1001: dbctl commit (dc=all): 'es2032 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34248 and previous config saved to /var/cache/conftool/dbconfig/20220908-110814-root.json
  • 11:04 marostegui@cumin1001: dbctl commit (dc=all): 'es1028 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34247 and previous config saved to /var/cache/conftool/dbconfig/20220908-110419-root.json
  • 11:04 marostegui@cumin1001: dbctl commit (dc=all): 'es1027 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34246 and previous config saved to /var/cache/conftool/dbconfig/20220908-110415-root.json
  • 11:04 marostegui@cumin1001: dbctl commit (dc=all): 'es1026 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34245 and previous config saved to /var/cache/conftool/dbconfig/20220908-110407-root.json
  • 10:53 marostegui@cumin1001: dbctl commit (dc=all): 'es2025 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34244 and previous config saved to /var/cache/conftool/dbconfig/20220908-105347-root.json
  • 10:53 marostegui@cumin1001: dbctl commit (dc=all): 'es2022 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34243 and previous config saved to /var/cache/conftool/dbconfig/20220908-105341-root.json
  • 10:53 marostegui@cumin1001: dbctl commit (dc=all): 'es1025 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34242 and previous config saved to /var/cache/conftool/dbconfig/20220908-105335-root.json
  • 10:53 marostegui@cumin1001: dbctl commit (dc=all): 'es1022 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34241 and previous config saved to /var/cache/conftool/dbconfig/20220908-105331-root.json
  • 10:53 marostegui@cumin1001: dbctl commit (dc=all): 'es2034 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34240 and previous config saved to /var/cache/conftool/dbconfig/20220908-105320-root.json
  • 10:53 marostegui@cumin1001: dbctl commit (dc=all): 'es2033 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34239 and previous config saved to /var/cache/conftool/dbconfig/20220908-105314-root.json
  • 10:53 marostegui@cumin1001: dbctl commit (dc=all): 'es2032 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34238 and previous config saved to /var/cache/conftool/dbconfig/20220908-105309-root.json
  • 10:49 marostegui@cumin1001: dbctl commit (dc=all): 'es1028 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34237 and previous config saved to /var/cache/conftool/dbconfig/20220908-104914-root.json
  • 10:49 marostegui@cumin1001: dbctl commit (dc=all): 'es1027 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34236 and previous config saved to /var/cache/conftool/dbconfig/20220908-104910-root.json
  • 10:49 marostegui@cumin1001: dbctl commit (dc=all): 'es1026 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34235 and previous config saved to /var/cache/conftool/dbconfig/20220908-104902-root.json
  • 10:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depool es1027 es1026 es1028 for upgrade', diff saved to https://phabricator.wikimedia.org/P34234 and previous config saved to /var/cache/conftool/dbconfig/20220908-104152-root.json
  • 10:38 marostegui@cumin1001: dbctl commit (dc=all): 'es2025 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34233 and previous config saved to /var/cache/conftool/dbconfig/20220908-103842-root.json
  • 10:38 marostegui@cumin1001: dbctl commit (dc=all): 'es2022 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34232 and previous config saved to /var/cache/conftool/dbconfig/20220908-103836-root.json
  • 10:38 marostegui@cumin1001: dbctl commit (dc=all): 'es1025 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34231 and previous config saved to /var/cache/conftool/dbconfig/20220908-103830-root.json
  • 10:38 marostegui@cumin1001: dbctl commit (dc=all): 'es1022 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34230 and previous config saved to /var/cache/conftool/dbconfig/20220908-103826-root.json
  • 10:38 marostegui@cumin1001: dbctl commit (dc=all): 'es2034 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34229 and previous config saved to /var/cache/conftool/dbconfig/20220908-103815-root.json
  • 10:38 marostegui@cumin1001: dbctl commit (dc=all): 'es2033 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34228 and previous config saved to /var/cache/conftool/dbconfig/20220908-103809-root.json
  • 10:38 marostegui@cumin1001: dbctl commit (dc=all): 'es2032 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34227 and previous config saved to /var/cache/conftool/dbconfig/20220908-103804-root.json
  • 10:31 mvolz@deploy1002: helmfile [eqiad] DONE helmfile.d/services/zotero: apply
  • 10:30 mvolz@deploy1002: helmfile [eqiad] START helmfile.d/services/zotero: apply
  • 10:29 mvolz@deploy1002: helmfile [codfw] DONE helmfile.d/services/zotero: apply
  • 10:29 marostegui@cumin2002: dbctl commit (dc=all): 'db1127 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34226 and previous config saved to /var/cache/conftool/dbconfig/20220908-102859-root.json
  • 10:28 mvolz@deploy1002: helmfile [codfw] START helmfile.d/services/zotero: apply
  • 10:27 mvolz@deploy1002: helmfile [staging] DONE helmfile.d/services/zotero: apply
  • 10:26 mvolz@deploy1002: helmfile [staging] START helmfile.d/services/zotero: apply
  • 10:23 mvolz@deploy1002: helmfile [staging] DONE helmfile.d/services/zotero: apply
  • 10:23 mvolz@deploy1002: helmfile [staging] START helmfile.d/services/zotero: apply
  • 10:23 marostegui@cumin1001: dbctl commit (dc=all): 'es2034 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34225 and previous config saved to /var/cache/conftool/dbconfig/20220908-102310-root.json
  • 10:23 marostegui@cumin1001: dbctl commit (dc=all): 'es2033 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34224 and previous config saved to /var/cache/conftool/dbconfig/20220908-102304-root.json
  • 10:22 marostegui@cumin1001: dbctl commit (dc=all): 'es2032 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34223 and previous config saved to /var/cache/conftool/dbconfig/20220908-102259-root.json
  • 10:20 marostegui@cumin1001: dbctl commit (dc=all): 'Depool es1022, es1025, es2022, es2025 for upgrade', diff saved to https://phabricator.wikimedia.org/P34222 and previous config saved to /var/cache/conftool/dbconfig/20220908-102040-root.json
  • 10:18 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on 15 hosts with reason: Downtiming replaced wtp servers
  • 10:18 cgoubert@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on 15 hosts with reason: Downtiming replaced wtp servers
  • 10:18 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on 7 hosts with reason: Downtiming replaced wtp servers
  • 10:18 cgoubert@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on 7 hosts with reason: Downtiming replaced wtp servers
  • 10:13 marostegui@cumin2002: dbctl commit (dc=all): 'db1127 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34221 and previous config saved to /var/cache/conftool/dbconfig/20220908-101329-root.json
  • 10:08 marostegui@cumin1001: dbctl commit (dc=all): 'es2034 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34220 and previous config saved to /var/cache/conftool/dbconfig/20220908-100805-root.json
  • 10:08 marostegui@cumin1001: dbctl commit (dc=all): 'es2033 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34219 and previous config saved to /var/cache/conftool/dbconfig/20220908-100759-root.json
  • 10:07 marostegui@cumin1001: dbctl commit (dc=all): 'es2032 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34218 and previous config saved to /var/cache/conftool/dbconfig/20220908-100754-root.json
  • 10:07 XioNoX: re-pool esams after routers upgrade - T295690
  • 10:06 cgoubert@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1029.eqiad.wmnet
  • 10:06 cgoubert@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1031.eqiad.wmnet
  • 10:06 cgoubert@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1030.eqiad.wmnet
  • 10:06 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on wtp[1029-1031].eqiad.wmnet with reason: Downtiming replaced wtp servers
  • 10:05 cgoubert@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on wtp[1029-1031].eqiad.wmnet with reason: Downtiming replaced wtp servers
  • 10:01 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on es[2032-2034].codfw.wmnet with reason: Upgrade
  • 10:01 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on es[2032-2034].codfw.wmnet with reason: Upgrade
  • 10:01 claime: Serving 100% of parsoid traffic with php 7.4 T307219
  • 10:00 claime: depooled wtp1033.eqiad.wmnet from parsoid cluster T307219
  • 10:00 marostegui@cumin1001: dbctl commit (dc=all): 'es2029 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34217 and previous config saved to /var/cache/conftool/dbconfig/20220908-100028-root.json
  • 10:00 marostegui@cumin1001: dbctl commit (dc=all): 'es2030 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34216 and previous config saved to /var/cache/conftool/dbconfig/20220908-100027-root.json
  • 10:00 marostegui@cumin1001: dbctl commit (dc=all): 'es2031 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34215 and previous config saved to /var/cache/conftool/dbconfig/20220908-100025-root.json
  • 10:00 marostegui@cumin1001: dbctl commit (dc=all): 'Depool es2032 es2033 es2034 for upgrade', diff saved to https://phabricator.wikimedia.org/P34214 and previous config saved to /var/cache/conftool/dbconfig/20220908-100014-root.json
  • 09:58 marostegui@cumin2002: dbctl commit (dc=all): 'db1127 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34213 and previous config saved to /var/cache/conftool/dbconfig/20220908-095759-root.json
  • 09:50 claime: pooled parse1024.eqiad.wmnet (php 7.4 only) in parsoid cluster T307219
  • 09:50 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for parse1024.eqiad.wmnet
  • 09:50 cgoubert@cumin1001: START - Cookbook sre.hosts.remove-downtime for parse1024.eqiad.wmnet
  • 09:47 claime: depooled wtp1032.eqiad.wmnet from parsoid cluster T307219
  • 09:45 marostegui@cumin1001: dbctl commit (dc=all): 'es2029 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34212 and previous config saved to /var/cache/conftool/dbconfig/20220908-094523-root.json
  • 09:45 marostegui@cumin1001: dbctl commit (dc=all): 'es2030 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34211 and previous config saved to /var/cache/conftool/dbconfig/20220908-094522-root.json
  • 09:45 marostegui@cumin1001: dbctl commit (dc=all): 'es2031 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34210 and previous config saved to /var/cache/conftool/dbconfig/20220908-094520-root.json
  • 09:42 marostegui@cumin2002: dbctl commit (dc=all): 'db1127 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34209 and previous config saved to /var/cache/conftool/dbconfig/20220908-094229-root.json
  • 09:38 cgoubert@puppetmaster1001: conftool action : set/pooled=no:weight=10; selector: dc=eqiad,cluster=parsoid,name=parse1024.eqiad.wmnet
  • 09:37 claime: pooled parse1023.eqiad.wmnet (php 7.4 only) in parsoid cluster T307219
  • 09:36 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for parse1023.eqiad.wmnet
  • 09:36 cgoubert@cumin1001: START - Cookbook sre.hosts.remove-downtime for parse1023.eqiad.wmnet
  • 09:35 XioNoX: drain traffic from cr3-knams - T295690
  • 09:33 claime: depooled wtp1031.eqiad.wmnet from parsoid cluster T307219
  • 09:31 vgutierrez: rolling restart of purged - T317064
  • 09:31 ayounsi@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cr3-knams,cr3-knams IPv6 with reason: router upgrade
  • 09:31 ayounsi@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cr3-knams,cr3-knams IPv6 with reason: router upgrade
  • 09:31 vgutierrez: upload purged 0.18 to apt.wm.o (buster) - T317064
  • 09:30 marostegui@cumin1001: dbctl commit (dc=all): 'es2029 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34208 and previous config saved to /var/cache/conftool/dbconfig/20220908-093018-root.json
  • 09:30 marostegui@cumin1001: dbctl commit (dc=all): 'es2030 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34207 and previous config saved to /var/cache/conftool/dbconfig/20220908-093017-root.json
  • 09:30 marostegui@cumin1001: dbctl commit (dc=all): 'es2031 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34206 and previous config saved to /var/cache/conftool/dbconfig/20220908-093015-root.json
  • 09:27 marostegui@cumin2002: dbctl commit (dc=all): 'db1127 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34205 and previous config saved to /var/cache/conftool/dbconfig/20220908-092700-root.json
  • 09:24 marostegui@cumin1001: dbctl commit (dc=all): 'Promote es2027 to es3 codfw master', diff saved to https://phabricator.wikimedia.org/P34204 and previous config saved to /var/cache/conftool/dbconfig/20220908-092436-marostegui.json
  • 09:23 marostegui@cumin1001: dbctl commit (dc=all): 'Promote es2026 to es2 codfw master', diff saved to https://phabricator.wikimedia.org/P34203 and previous config saved to /var/cache/conftool/dbconfig/20220908-092346-marostegui.json
  • 09:23 marostegui@cumin1001: dbctl commit (dc=all): 'Promote es2028 to es1 codfw master', diff saved to https://phabricator.wikimedia.org/P34202 and previous config saved to /var/cache/conftool/dbconfig/20220908-092301-marostegui.json
  • 09:21 cgoubert@puppetmaster1001: conftool action : set/pooled=no:weight=10; selector: dc=eqiad,cluster=parsoid,name=parse1023.eqiad.wmnet
  • 09:21 claime: pooled parse1022.eqiad.wmnet (php 7.4 only) in parsoid cluster T307219
  • 09:19 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for parse1022.eqiad.wmnet
  • 09:19 cgoubert@cumin1001: START - Cookbook sre.hosts.remove-downtime for parse1022.eqiad.wmnet
  • 09:18 vgutierrez: testing purged 0.18 in cp4026 and cp4032
  • 09:18 cgoubert@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1025.eqiad.wmnet
  • 09:17 cgoubert@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1026.eqiad.wmnet
  • 09:17 cgoubert@puppetmaster1001: conftool action : set/pooled=no:weight=10; selector: dc=eqiad,cluster=parsoid,name=wtp1029.eqiad.wmnet
  • 09:16 cgoubert@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1029.eqiad.wmnet
  • 09:16 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on wtp[1025-1028].eqiad.wmnet with reason: Downtiming replaced wtp servers
  • 09:16 cgoubert@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1028.eqiad.wmnet
  • 09:16 cgoubert@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on wtp[1025-1028].eqiad.wmnet with reason: Downtiming replaced wtp servers
  • 09:15 marostegui@cumin1001: dbctl commit (dc=all): 'es2029 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34201 and previous config saved to /var/cache/conftool/dbconfig/20220908-091513-root.json
  • 09:15 marostegui@cumin1001: dbctl commit (dc=all): 'es2030 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34200 and previous config saved to /var/cache/conftool/dbconfig/20220908-091512-root.json
  • 09:15 marostegui@cumin1001: dbctl commit (dc=all): 'es2031 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34199 and previous config saved to /var/cache/conftool/dbconfig/20220908-091510-root.json
  • 09:14 claime: depooled wtp1030.eqiad.wmnet from parsoid cluster T307219
  • 09:12 marostegui@cumin2002: dbctl commit (dc=all): 'es2026 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34198 and previous config saved to /var/cache/conftool/dbconfig/20220908-091200-root.json
  • 09:11 marostegui@cumin2002: dbctl commit (dc=all): 'es2028 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34197 and previous config saved to /var/cache/conftool/dbconfig/20220908-091157-root.json
  • 09:11 marostegui@cumin2002: dbctl commit (dc=all): 'es2027 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34196 and previous config saved to /var/cache/conftool/dbconfig/20220908-091151-root.json
  • 09:11 marostegui@cumin2002: dbctl commit (dc=all): 'db1127 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34195 and previous config saved to /var/cache/conftool/dbconfig/20220908-091129-root.json
  • 09:10 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
  • 09:09 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
  • 09:08 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
  • 09:07 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
  • 09:05 cgoubert@puppetmaster1001: conftool action : set/pooled=no:weight=10; selector: dc=eqiad,cluster=parsoid,name=parse1022.eqiad.wmnet
  • 09:04 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
  • 09:04 claime: pooled parse1021.eqiad.wmnet (php 7.4 only) in parsoid cluster T307219
  • 09:03 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
  • 09:03 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for parse1021.eqiad.wmnet
  • 09:03 cgoubert@cumin2002: START - Cookbook sre.hosts.remove-downtime for parse1021.eqiad.wmnet
  • 09:02 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
  • 09:00 marostegui@cumin1001: dbctl commit (dc=all): 'es2029 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34194 and previous config saved to /var/cache/conftool/dbconfig/20220908-090008-root.json
  • 09:00 marostegui@cumin1001: dbctl commit (dc=all): 'es2030 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34193 and previous config saved to /var/cache/conftool/dbconfig/20220908-090007-root.json
  • 09:00 marostegui@cumin1001: dbctl commit (dc=all): 'es2031 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34192 and previous config saved to /var/cache/conftool/dbconfig/20220908-090005-root.json
  • 08:56 marostegui@cumin2002: dbctl commit (dc=all): 'es2026 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34190 and previous config saved to /var/cache/conftool/dbconfig/20220908-085630-root.json
  • 08:56 marostegui@cumin2002: dbctl commit (dc=all): 'es2028 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34189 and previous config saved to /var/cache/conftool/dbconfig/20220908-085627-root.json
  • 08:56 marostegui@cumin2002: dbctl commit (dc=all): 'es2027 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34188 and previous config saved to /var/cache/conftool/dbconfig/20220908-085621-root.json
  • 08:56 marostegui@cumin2002: dbctl commit (dc=all): 'db1127 (re)pooling @ 4%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34187 and previous config saved to /var/cache/conftool/dbconfig/20220908-085559-root.json
  • 08:55 cgoubert@puppetmaster1001: conftool action : set/pooled=no:weight=10; selector: dc=eqiad,cluster=parsoid,name=parse1021.eqiad.wmnet
  • 08:53 claime: depooled wtp1029.eqiad.wmnet from parsoid cluster T307219
  • 08:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-tool1011.eqiad.wmnet
  • 08:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host an-tool1011.eqiad.wmnet
  • 08:45 ayounsi@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cr2-esams,cr2-esams IPv6,re0.cr2-esams.mgmt with reason: router upgrade
  • 08:45 marostegui@cumin1001: dbctl commit (dc=all): 'es2029 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34186 and previous config saved to /var/cache/conftool/dbconfig/20220908-084503-root.json
  • 08:45 marostegui@cumin1001: dbctl commit (dc=all): 'es2030 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34185 and previous config saved to /var/cache/conftool/dbconfig/20220908-084502-root.json
  • 08:45 marostegui@cumin1001: dbctl commit (dc=all): 'es2031 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34184 and previous config saved to /var/cache/conftool/dbconfig/20220908-084500-root.json
  • 08:44 ayounsi@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cr2-esams,cr2-esams IPv6,re0.cr2-esams.mgmt with reason: router upgrade
  • 08:44 XioNoX: reverting cr3-esams changes (JTAC will be needed for a firmware upgrade), and moving on to cr2-esams - T295690
  • 08:41 marostegui@cumin2002: dbctl commit (dc=all): 'db1203 (re)pooling @ 100%: Pooling for the first time in s8', diff saved to https://phabricator.wikimedia.org/P34183 and previous config saved to /var/cache/conftool/dbconfig/20220908-084133-root.json
  • 08:41 claime: pooled parse1020.eqiad.wmnet (php 7.4 only) in parsoid cluster T307219
  • 08:41 marostegui@cumin2002: dbctl commit (dc=all): 'es2026 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34182 and previous config saved to /var/cache/conftool/dbconfig/20220908-084059-root.json
  • 08:40 marostegui@cumin2002: dbctl commit (dc=all): 'es2028 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34181 and previous config saved to /var/cache/conftool/dbconfig/20220908-084057-root.json
  • 08:40 marostegui@cumin2002: dbctl commit (dc=all): 'es2027 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34180 and previous config saved to /var/cache/conftool/dbconfig/20220908-084051-root.json
  • 08:40 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for parse1020.eqiad.wmnet
  • 08:40 cgoubert@cumin2002: START - Cookbook sre.hosts.remove-downtime for parse1020.eqiad.wmnet
  • 08:40 marostegui@cumin2002: dbctl commit (dc=all): 'db1127 (re)pooling @ 3%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34179 and previous config saved to /var/cache/conftool/dbconfig/20220908-084029-root.json
  • 08:40 claime: depooled wtp1028.eqiad.wmnet from parsoid cluster T307219
  • 08:39 marostegui@cumin2002: dbctl commit (dc=all): 'Depool es2029, es2030, es2031', diff saved to https://phabricator.wikimedia.org/P34178 and previous config saved to /var/cache/conftool/dbconfig/20220908-083941-marostegui.json
  • 08:31 cgoubert@puppetmaster1001: conftool action : set/pooled=no:weight=10; selector: dc=eqiad,cluster=parsoid,name=parse1020.eqiad.wmnet
  • 08:26 marostegui@cumin2002: dbctl commit (dc=all): 'db1203 (re)pooling @ 75%: Pooling for the first time in s8', diff saved to https://phabricator.wikimedia.org/P34176 and previous config saved to /var/cache/conftool/dbconfig/20220908-082604-root.json
  • 08:25 marostegui@cumin2002: dbctl commit (dc=all): 'es2026 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34175 and previous config saved to /var/cache/conftool/dbconfig/20220908-082530-root.json
  • 08:25 marostegui@cumin2002: dbctl commit (dc=all): 'es2028 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34174 and previous config saved to /var/cache/conftool/dbconfig/20220908-082528-root.json
  • 08:25 marostegui@cumin2002: dbctl commit (dc=all): 'es2027 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34173 and previous config saved to /var/cache/conftool/dbconfig/20220908-082521-root.json
  • 08:25 marostegui@cumin2002: dbctl commit (dc=all): 'db1127 (re)pooling @ 2%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34172 and previous config saved to /var/cache/conftool/dbconfig/20220908-082500-root.json
  • 08:24 claime: pooled parse1019.eqiad.wmnet (php 7.4 only) in parsoid cluster T307219
  • 08:22 cgoubert@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for parse1019.eqiad.wmnet
  • 08:22 cgoubert@cumin2002: START - Cookbook sre.hosts.remove-downtime for parse1019.eqiad.wmnet
  • 08:12 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin1001.eqiad.wmnet
  • 08:10 ayounsi@cumin2002: END (PASS) - Cookbook sre.network.cf (exit_code=0)
  • 08:10 ayounsi@cumin2002: START - Cookbook sre.network.cf
  • 08:10 marostegui@cumin2002: dbctl commit (dc=all): 'db1203 (re)pooling @ 50%: Pooling for the first time in s8', diff saved to https://phabricator.wikimedia.org/P34171 and previous config saved to /var/cache/conftool/dbconfig/20220908-081034-root.json
  • 08:09 marostegui@cumin2002: dbctl commit (dc=all): 'es2028 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34170 and previous config saved to /var/cache/conftool/dbconfig/20220908-080958-root.json
  • 08:09 marostegui@cumin2002: dbctl commit (dc=all): 'es2027 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34169 and previous config saved to /var/cache/conftool/dbconfig/20220908-080951-root.json
  • 08:09 marostegui@cumin2002: dbctl commit (dc=all): 'es2026 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34168 and previous config saved to /var/cache/conftool/dbconfig/20220908-080946-root.json
  • 08:09 ayounsi@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cr3-esams,cr3-esams IPv6,re0.cr3-esams.mgmt with reason: router upgrade
  • 08:09 ayounsi@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cr3-esams,cr3-esams IPv6,re0.cr3-esams.mgmt with reason: router upgrade
  • 08:08 cgoubert@puppetmaster1001: conftool action : set/pooled=no:weight=10; selector: dc=eqiad,cluster=parsoid,name=parse1019.eqiad.wmnet
  • 08:08 marostegui@cumin2002: dbctl commit (dc=all): 'db1202 (re)pooling @ 100%: Pooling for the first time in s7', diff saved to https://phabricator.wikimedia.org/P34167 and previous config saved to /var/cache/conftool/dbconfig/20220908-080823-root.json
  • 08:07 XioNoX: drain traffic from cr3-esams - T295690
  • 08:00 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host cumin1001.eqiad.wmnet
  • 07:55 marostegui@cumin2002: dbctl commit (dc=all): 'db1203 (re)pooling @ 25%: Pooling for the first time in s8', diff saved to https://phabricator.wikimedia.org/P34166 and previous config saved to /var/cache/conftool/dbconfig/20220908-075504-root.json
  • 07:54 marostegui@cumin2002: dbctl commit (dc=all): 'es2028 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34165 and previous config saved to /var/cache/conftool/dbconfig/20220908-075429-root.json
  • 07:54 marostegui@cumin2002: dbctl commit (dc=all): 'es2027 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34164 and previous config saved to /var/cache/conftool/dbconfig/20220908-075421-root.json
  • 07:54 marostegui@cumin2002: dbctl commit (dc=all): 'es2026 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34163 and previous config saved to /var/cache/conftool/dbconfig/20220908-075416-root.json
  • 07:52 marostegui@cumin2002: dbctl commit (dc=all): 'db1202 (re)pooling @ 75%: Pooling for the first time in s7', diff saved to https://phabricator.wikimedia.org/P34162 and previous config saved to /var/cache/conftool/dbconfig/20220908-075253-root.json
  • 07:41 XioNoX: depool esams for routers upgrade - T295690
  • 07:39 marostegui@cumin2002: dbctl commit (dc=all): 'db1203 (re)pooling @ 10%: Pooling for the first time in s8', diff saved to https://phabricator.wikimedia.org/P34161 and previous config saved to /var/cache/conftool/dbconfig/20220908-073935-root.json
  • 07:39 marostegui@cumin2002: dbctl commit (dc=all): 'es2028 (re)pooling @ 4%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34160 and previous config saved to /var/cache/conftool/dbconfig/20220908-073900-root.json
  • 07:38 marostegui@cumin2002: dbctl commit (dc=all): 'es2027 (re)pooling @ 4%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34159 and previous config saved to /var/cache/conftool/dbconfig/20220908-073851-root.json
  • 07:38 marostegui@cumin2002: dbctl commit (dc=all): 'es2026 (re)pooling @ 4%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34158 and previous config saved to /var/cache/conftool/dbconfig/20220908-073846-root.json
  • 07:37 marostegui@cumin2002: dbctl commit (dc=all): 'db1202 (re)pooling @ 50%: Pooling for the first time in s7', diff saved to https://phabricator.wikimedia.org/P34157 and previous config saved to /var/cache/conftool/dbconfig/20220908-073724-root.json
  • 07:24 marostegui@cumin2002: dbctl commit (dc=all): 'db1203 (re)pooling @ 5%: Pooling for the first time in s8', diff saved to https://phabricator.wikimedia.org/P34156 and previous config saved to /var/cache/conftool/dbconfig/20220908-072405-root.json
  • 07:23 marostegui@cumin2002: dbctl commit (dc=all): 'es2028 (re)pooling @ 3%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34155 and previous config saved to /var/cache/conftool/dbconfig/20220908-072330-root.json
  • 07:23 marostegui@cumin2002: dbctl commit (dc=all): 'es2027 (re)pooling @ 3%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34154 and previous config saved to /var/cache/conftool/dbconfig/20220908-072321-root.json
  • 07:23 marostegui@cumin2002: dbctl commit (dc=all): 'es2026 (re)pooling @ 3%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34153 and previous config saved to /var/cache/conftool/dbconfig/20220908-072316-root.json
  • 07:21 marostegui@cumin2002: dbctl commit (dc=all): 'db1202 (re)pooling @ 25%: Pooling for the first time in s7', diff saved to https://phabricator.wikimedia.org/P34152 and previous config saved to /var/cache/conftool/dbconfig/20220908-072154-root.json
  • 07:08 marostegui@cumin2002: dbctl commit (dc=all): 'db1203 (re)pooling @ 4%: Pooling for the first time in s8', diff saved to https://phabricator.wikimedia.org/P34151 and previous config saved to /var/cache/conftool/dbconfig/20220908-070836-root.json
  • 07:08 marostegui@cumin2002: dbctl commit (dc=all): 'es2028 (re)pooling @ 2%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34150 and previous config saved to /var/cache/conftool/dbconfig/20220908-070800-root.json
  • 07:07 marostegui@cumin2002: dbctl commit (dc=all): 'es2027 (re)pooling @ 2%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34149 and previous config saved to /var/cache/conftool/dbconfig/20220908-070752-root.json
  • 07:07 marostegui@cumin2002: dbctl commit (dc=all): 'es2026 (re)pooling @ 2%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34148 and previous config saved to /var/cache/conftool/dbconfig/20220908-070746-root.json
  • 07:06 marostegui@cumin2002: dbctl commit (dc=all): 'db1202 (re)pooling @ 10%: Pooling for the first time in s7', diff saved to https://phabricator.wikimedia.org/P34147 and previous config saved to /var/cache/conftool/dbconfig/20220908-070625-root.json
  • 07:01 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
  • 07:01 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
  • 07:01 elukey@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
  • 07:00 elukey@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
  • 07:00 elukey@deploy1002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
  • 07:00 elukey@deploy1002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
  • 06:53 marostegui@cumin2002: dbctl commit (dc=all): 'db1203 (re)pooling @ 3%: Pooling for the first time in s8', diff saved to https://phabricator.wikimedia.org/P34146 and previous config saved to /var/cache/conftool/dbconfig/20220908-065306-root.json
  • 06:52 marostegui@cumin2002: dbctl commit (dc=all): 'es2028 (re)pooling @ 1%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34145 and previous config saved to /var/cache/conftool/dbconfig/20220908-065229-root.json
  • 06:52 marostegui@cumin2002: dbctl commit (dc=all): 'es2027 (re)pooling @ 1%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34144 and previous config saved to /var/cache/conftool/dbconfig/20220908-065222-root.json
  • 06:52 marostegui@cumin2002: dbctl commit (dc=all): 'es2026 (re)pooling @ 1%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34143 and previous config saved to /var/cache/conftool/dbconfig/20220908-065216-root.json
  • 06:50 marostegui@cumin2002: dbctl commit (dc=all): 'db1202 (re)pooling @ 5%: Pooling for the first time in s7', diff saved to https://phabricator.wikimedia.org/P34142 and previous config saved to /var/cache/conftool/dbconfig/20220908-065054-root.json
  • 06:44 marostegui@cumin2002: dbctl commit (dc=all): 'Depool es2026, es2027, es2028', diff saved to https://phabricator.wikimedia.org/P34141 and previous config saved to /var/cache/conftool/dbconfig/20220908-064450-marostegui.json
  • 06:37 marostegui@cumin2002: dbctl commit (dc=all): 'db1203 (re)pooling @ 2%: Pooling for the first time in s8', diff saved to https://phabricator.wikimedia.org/P34140 and previous config saved to /var/cache/conftool/dbconfig/20220908-063737-root.json
  • 06:35 marostegui@cumin2002: dbctl commit (dc=all): 'db1202 (re)pooling @ 4%: Pooling for the first time in s7', diff saved to https://phabricator.wikimedia.org/P34139 and previous config saved to /var/cache/conftool/dbconfig/20220908-063525-root.json
  • 06:19 marostegui@cumin2002: dbctl commit (dc=all): 'db1202 (re)pooling @ 3%: Pooling for the first time in s7', diff saved to https://phabricator.wikimedia.org/P34138 and previous config saved to /var/cache/conftool/dbconfig/20220908-061955-root.json
  • 06:14 marostegui@cumin2002: dbctl commit (dc=all): 'Add db1203 to s8, depooled, T316342', diff saved to https://phabricator.wikimedia.org/P34137 and previous config saved to /var/cache/conftool/dbconfig/20220908-061413-marostegui.json
  • 06:09 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1157.eqiad.wmnet with reason: Maintenance
  • 06:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1157.eqiad.wmnet with reason: Maintenance
  • 06:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depool db1157 T316622', diff saved to https://phabricator.wikimedia.org/P34136 and previous config saved to /var/cache/conftool/dbconfig/20220908-060438-ladsgroup.json
  • 06:04 marostegui@cumin2002: dbctl commit (dc=all): 'db1202 (re)pooling @ 2%: Pooling for the first time in s7', diff saved to https://phabricator.wikimedia.org/P34135 and previous config saved to /var/cache/conftool/dbconfig/20220908-060426-root.json
  • 06:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Promote db1123 to s3 primary and set section read-write T316622', diff saved to https://phabricator.wikimedia.org/P34134 and previous config saved to /var/cache/conftool/dbconfig/20220908-060138-ladsgroup.json
  • 06:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Set s3 eqiad as read-only for maintenance - T316622', diff saved to https://phabricator.wikimedia.org/P34133 and previous config saved to /var/cache/conftool/dbconfig/20220908-060110-ladsgroup.json
  • 06:00 Amir1: Starting s3 eqiad failover from db1157 to db1123 - T316622
  • 05:55 marostegui@cumin1001: dbctl commit (dc=all): 'Increase weight for db1194', diff saved to https://phabricator.wikimedia.org/P34132 and previous config saved to /var/cache/conftool/dbconfig/20220908-055546-marostegui.json
  • 05:54 marostegui@cumin1001: dbctl commit (dc=all): 'Pooling db1202 for the first time in s7 T316342', diff saved to https://phabricator.wikimedia.org/P34131 and previous config saved to /var/cache/conftool/dbconfig/20220908-055451-marostegui.json
  • 05:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Pooling back db2140', diff saved to https://phabricator.wikimedia.org/P34130 and previous config saved to /var/cache/conftool/dbconfig/20220908-054921-ladsgroup.json
  • 05:44 marostegui@cumin1001: dbctl commit (dc=all): 'Add db1202 to s7, depooled, T316342', diff saved to https://phabricator.wikimedia.org/P34129 and previous config saved to /var/cache/conftool/dbconfig/20220908-054429-marostegui.json
  • 05:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Set db1123 with weight 0 T316622', diff saved to https://phabricator.wikimedia.org/P34128 and previous config saved to /var/cache/conftool/dbconfig/20220908-051043-ladsgroup.json
  • 05:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 23 hosts with reason: Primary switchover s3 T316622
  • 05:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 23 hosts with reason: Primary switchover s3 T316622
  • 02:04 pt1979@cumin1001: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging1005:
  • 02:04 pt1979@cumin1001: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging1005:
  • 02:02 pt1979@cumin1001: END (ERROR) - Cookbook sre.network.configure-switch-interfaces (exit_code=97) for host db2169
  • 02:02 pt1979@cumin1001: START - Cookbook sre.network.configure-switch-interfaces for host db2169
  • 02:01 pt1979@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 01:59 pt1979@cumin1001: START - Cookbook sre.dns.netbox
  • 01:56 ejegg: re-enabled recurring charge job
  • 01:48 ejegg: updated fundraising civicrm from c1f0e041 to efbbcb57
  • 01:26 ejegg: disabled recurring charge job
  • 00:33 ejegg: updated standalone Smashpig from 11ba0a1b to 88e5e9bb

2022-09-07

  • 22:12 bd808: Attempting to migrate all remaining Striker managed git repos from Diffusion to GitLab (T315706)
  • 21:27 TheresNoTime: closing UTC late backport window, +27m
  • 21:26 samtar@deploy1002: Finished scap: Backport for Respect skin's TOC option (T316947) (duration: 08m 02s)
  • 21:25 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 21:24 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 21:24 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 21:23 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 21:19 samtar@deploy1002: samtar and jdlrobson: Backport for Respect skin's TOC option (T316947) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
  • 21:18 samtar@deploy1002: Started scap: Backport for Respect skin's TOC option (T316947)
  • 21:14 samtar@deploy1002: Finished scap: Backport for Respect skin's TOC option (T316947) (duration: 07m 06s)
  • 21:13 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 21:12 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 21:12 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 21:11 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 21:07 samtar@deploy1002: samtar and jdlrobson: Backport for Respect skin's TOC option (T316947) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet
  • 21:07 samtar@deploy1002: Started scap: Backport for Respect skin's TOC option (T316947)
  • 21:06 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 21:02 samtar@deploy1002: Finished scap: Backport for beta: Remove deployment-parsoid11 from wgLinterSubmitterWhitelist (duration: 04m 36s)
  • 21:02 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 21:02 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 21:01 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 21:01 TheresNoTime: extending UTC late backport window
  • 20:58 samtar@deploy1002: samtar and zabe: Backport for beta: Remove deployment-parsoid11 from wgLinterSubmitterWhitelist synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
  • 20:58 samtar@deploy1002: Started scap: Backport for beta: Remove deployment-parsoid11 from wgLinterSubmitterWhitelist
  • 20:56 samtar@deploy1002: Finished scap: Backport for Enable Extension:Nearby on wikidata (T246493) (duration: 05m 54s)
  • 20:56 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 20:55 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 20:55 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 20:54 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 20:51 samtar@deploy1002: samtar and jdlrobson: Backport for Enable Extension:Nearby on wikidata (T246493) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet
  • 20:50 samtar@deploy1002: Started scap: Backport for Enable Extension:Nearby on wikidata (T246493)
  • 20:49 samtar@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: Wikidata has a wordmark (T315572) (duration: 03m 44s)
  • 20:44 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 20:44 samtar@deploy1002: Synchronized static/images/mobile/copyright/wikidata-en.svg: Config: Wikidata has a wordmark (T315572) (duration: 03m 45s)
  • 20:43 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 20:43 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 20:42 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 20:37 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 20:36 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 20:36 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 20:35 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 20:35 samtar@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: Revert "Enable wgDiscussionToolsEnablePermalinksBackend on all wikis" (duration: 03m 42s)
  • 20:21 samtar@deploy1002: Finished scap: Backport for Enable wgDiscussionToolsEnablePermalinksBackend on all wikis (T315353) (duration: 06m 57s)
  • 20:20 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 20:19 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 20:19 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 20:18 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 20:14 samtar@deploy1002: samtar and matmarex: Backport for Enable wgDiscussionToolsEnablePermalinksBackend on all wikis (T315353) synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
  • 20:14 samtar@deploy1002: Started scap: Backport for Enable wgDiscussionToolsEnablePermalinksBackend on all wikis (T315353)
  • 20:12 mutante: pcc-worker1003 - rm of /srv/jenkins/puppet-compiler/output/36713 and 37153 - /srv is back to 58% usage again
  • 20:10 mutante: integration.wikimedia.org - clicked to delete builds 36713 and 37153 because they were several GB in size
  • 20:08 TheresNoTime: running `extensions/WikimediaMaintenance/createExtensionTables.php discussiontools` on mwmaint1002
  • 20:08 mutante: puppet compiler out of disk space, (pcc-worker1003): identified build 37153 as huge compared to others in the filesystem, then clicked to delete it via integration.wm.org web UI
  • 19:26 herron@cumin1001: END (PASS) - Cookbook sre.kafka.roll-restart-brokers (exit_code=0) for Kafka A:kafka-logging-eqiad cluster: Roll restart of jvm daemons.
  • 18:34 dduvall@deploy1002: Finished deploy [phabricator/deployment@a7616e6]: testing deployment to phab2001 (inactive) (duration: 00m 35s)
  • 18:33 dduvall@deploy1002: Started deploy [phabricator/deployment@a7616e6]: testing deployment to phab2001 (inactive)
  • 18:26 herron@cumin1001: START - Cookbook sre.kafka.roll-restart-brokers for Kafka A:kafka-logging-eqiad cluster: Roll restart of jvm daemons.
  • 18:25 herron@cumin1001: END (PASS) - Cookbook sre.kafka.roll-restart-brokers (exit_code=0) for Kafka A:kafka-logging-codfw cluster: Roll restart of jvm daemons.
  • 17:24 herron@cumin1001: START - Cookbook sre.kafka.roll-restart-brokers for Kafka A:kafka-logging-codfw cluster: Roll restart of jvm daemons.
  • 16:56 ejegg: restarted fundraising scheduled jobs
  • 16:34 ejegg: fundraising civicrm upgraded from 2fcd3bb4 to c1f0e041
  • 16:32 xcollazo@deploy1002: Finished deploy [airflow-dags/platform_eng@9e4ed94]: Update platform_eng Airflow to latest (duration: 00m 10s)
  • 16:31 xcollazo@deploy1002: Started deploy [airflow-dags/platform_eng@9e4ed94]: Update platform_eng Airflow to latest
  • 16:31 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.prepare-upgrade (exit_code=0)
  • 16:30 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.prepare-upgrade (exit_code=0)
  • 16:22 moritzm: installing twisted security updates on bullseye
  • 16:22 cparle@deploy1002: Finished deploy [airflow-dags/platform_eng@9e4ed94]: (no justification provided) (duration: 00m 17s)
  • 16:21 cparle@deploy1002: Started deploy [airflow-dags/platform_eng@9e4ed94]: (no justification provided)
  • 16:16 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.prepare-upgrade (exit_code=0)
  • 16:00 ejegg: fundraising civicrm upgraded from 5aa1309d to 2fcd3bb4
  • 15:55 ejegg: fundraising scheduled jobs disabled for deployment
  • 15:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1173 (re)pooling @ 100%: Pooling after maintenance', diff saved to https://phabricator.wikimedia.org/P34125 and previous config saved to /var/cache/conftool/dbconfig/20220907-153827-root.json
  • 15:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1173 (re)pooling @ 75%: Pooling after maintenance', diff saved to https://phabricator.wikimedia.org/P34124 and previous config saved to /var/cache/conftool/dbconfig/20220907-152322-root.json
  • 15:20 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 100%: Pooling after maintenance', diff saved to https://phabricator.wikimedia.org/P34122 and previous config saved to /var/cache/conftool/dbconfig/20220907-152028-root.json
  • 15:11 cgoubert@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1027.eqiad.wmnet
  • 15:10 cgoubert@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1048.eqiad.wmnet
  • 15:10 cgoubert@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1047.eqiad.wmnet
  • 15:10 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on wtp[1027,1047-1048].eqiad.wmnet with reason: Downtiming replaced wtp servers
  • 15:10 cgoubert@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on wtp[1027,1047-1048].eqiad.wmnet with reason: Downtiming replaced wtp servers
  • 15:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1173 (re)pooling @ 50%: Pooling after maintenance', diff saved to https://phabricator.wikimedia.org/P34121 and previous config saved to /var/cache/conftool/dbconfig/20220907-150817-root.json
  • 15:07 claime: depooled wtp1026.eqiad.wmnet from parsoid cluster T307219
  • 15:07 ayounsi@cumin1001: START - Cookbook sre.network.prepare-upgrade
  • 15:06 ayounsi@cumin1001: START - Cookbook sre.network.prepare-upgrade
  • 15:05 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 75%: Pooling after maintenance', diff saved to https://phabricator.wikimedia.org/P34120 and previous config saved to /var/cache/conftool/dbconfig/20220907-150523-root.json
  • 15:02 ayounsi@cumin1001: START - Cookbook sre.network.prepare-upgrade
  • 14:56 reedy@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Wikimania 2023 setup T316928 (duration: 04m 04s)
  • 14:54 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb1010.eqiad.wmnet with OS bullseye
  • 14:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1173 (re)pooling @ 25%: Pooling after maintenance', diff saved to https://phabricator.wikimedia.org/P34119 and previous config saved to /var/cache/conftool/dbconfig/20220907-145313-root.json
  • 14:52 claime: pooled parse1018.eqiad.wmnet (php 7.4 only) in parsoid cluster T307219
  • 14:50 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for parse1018.eqiad.wmnet
  • 14:50 cgoubert@cumin1001: START - Cookbook sre.hosts.remove-downtime for parse1018.eqiad.wmnet
  • 14:50 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 50%: Pooling after maintenance', diff saved to https://phabricator.wikimedia.org/P34118 and previous config saved to /var/cache/conftool/dbconfig/20220907-145018-root.json
  • 14:48 claime: depooled wtp1025.eqiad.wmnet from parsoid cluster T307219
  • 14:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P34117 and previous config saved to /var/cache/conftool/dbconfig/20220907-144434-ladsgroup.json
  • 14:41 cgoubert@puppetmaster1001: conftool action : set/pooled=no:weight=10; selector: dc=eqiad,cluster=parsoid,name=parse1018.eqiad.wmnet
  • 14:39 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb1010.eqiad.wmnet with reason: host reimage
  • 14:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P34116 and previous config saved to /var/cache/conftool/dbconfig/20220907-143828-ladsgroup.json
  • 14:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1173 (re)pooling @ 10%: Pooling after maintenance', diff saved to https://phabricator.wikimedia.org/P34115 and previous config saved to /var/cache/conftool/dbconfig/20220907-143808-root.json
  • 14:37 claime: pooled parse1017.eqiad.wmnet (php 7.4 only) in parsoid cluster T307219
  • 14:36 jbond@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "sync data - jbond@cumin1001"
  • 14:35 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb1010.eqiad.wmnet with reason: host reimage
  • 14:35 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for parse1017.eqiad.wmnet
  • 14:35 cgoubert@cumin1001: START - Cookbook sre.hosts.remove-downtime for parse1017.eqiad.wmnet
  • 14:35 jbond@cumin1001: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "sync data - jbond@cumin1001"
  • 14:35 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 25%: Pooling after maintenance', diff saved to https://phabricator.wikimedia.org/P34114 and previous config saved to /var/cache/conftool/dbconfig/20220907-143513-root.json
  • 14:32 claime: parsoid eqiad canaries switched to parse1001 and parse1002 T307219
  • 14:29 moritzm: installing runc security updates on k8s servers
  • 14:23 cgoubert@puppetmaster1001: conftool action : set/weight=1; selector: dc=eqiad,cluster=parsoid,service=canary
  • 14:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P34113 and previous config saved to /var/cache/conftool/dbconfig/20220907-142321-ladsgroup.json
  • 14:23 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 14:23 akosiaris@cumin1001: START - Cookbook sre.hosts.reimage for host rdb1010.eqiad.wmnet with OS bullseye
  • 14:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1173 (re)pooling @ 5%: Pooling after maintenance', diff saved to https://phabricator.wikimedia.org/P34112 and previous config saved to /var/cache/conftool/dbconfig/20220907-142303-root.json
  • 14:22 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 14:22 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 14:21 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 14:20 claime: Switching canaries for parsoid eqiad T307219
  • 14:20 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 10%: Pooling after maintenance', diff saved to https://phabricator.wikimedia.org/P34111 and previous config saved to /var/cache/conftool/dbconfig/20220907-142008-root.json
  • 14:08 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host rdb1009.eqiad.wmnet with OS bullseye
  • 14:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T312863)', diff saved to https://phabricator.wikimedia.org/P34110 and previous config saved to /var/cache/conftool/dbconfig/20220907-140813-ladsgroup.json
  • 14:07 marostegui@cumin1001: dbctl commit (dc=all): 'db1173 (re)pooling @ 1%: Pooling after maintenance', diff saved to https://phabricator.wikimedia.org/P34109 and previous config saved to /var/cache/conftool/dbconfig/20220907-140758-root.json
  • 14:05 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 5%: Pooling after maintenance', diff saved to https://phabricator.wikimedia.org/P34108 and previous config saved to /var/cache/conftool/dbconfig/20220907-140503-root.json
  • 14:02 claime: depooled wtp1027.eqiad.wmnet from parsoid cluster T307219
  • 14:01 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 14:00 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 14:00 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 13:59 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 13:57 btullis@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 13:57 btullis@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 13:56 TheresNoTime: UTC afternoon backport window closed
  • 13:55 samtar@deploy1002: Synchronized wmf-config/CommonSettings-labs.php: Config: CommonSettings-labs: Set config to production-esque values (T314294) (duration: 03m 47s)
  • 13:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2140.codfw.wmnet with reason: Maint
  • 13:54 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2140.codfw.wmnet with reason: Maint
  • 13:53 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 13:53 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 13:53 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 13:52 akosiaris@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on rdb1009.eqiad.wmnet with reason: host reimage
  • 13:49 akosiaris@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on rdb1009.eqiad.wmnet with reason: host reimage
  • 13:48 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 13:46 samtar@deploy1002: Finished scap: Backport for private/readme.php: Add $wgPhonosApiKeyGoogle (T315491) (duration: 04m 51s)
  • 13:42 samtar@deploy1002: samtar and samtar: Backport for private/readme.php: Add $wgPhonosApiKeyGoogle (T315491) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet
  • 13:42 samtar@deploy1002: Started scap: Backport for private/readme.php: Add $wgPhonosApiKeyGoogle (T315491)
  • 13:38 samtar@deploy1002: Synchronized php-1.39.0-wmf.27/extensions/GrowthExperiments/modules/ext.growthExperiments.MentorDashboard.Vue/components/MenteeOverview/MenteeFiltersForm.vue: Backport: Mentee overview(vue): prevent clicks on more recent edit buttons to submit the filters (T316926) (duration: 04m 07s)
  • 13:38 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 13:37 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 13:37 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 13:36 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 13:36 akosiaris@cumin1001: START - Cookbook sre.hosts.reimage for host rdb1009.eqiad.wmnet with OS bullseye
  • 13:12 marostegui@cumin1001: dbctl commit (dc=all): 'db2122 (re)pooling @ 100%: Pooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34107 and previous config saved to /var/cache/conftool/dbconfig/20220907-131223-root.json
  • 12:57 marostegui@cumin1001: dbctl commit (dc=all): 'db2122 (re)pooling @ 75%: Pooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34106 and previous config saved to /var/cache/conftool/dbconfig/20220907-125718-root.json
  • 12:57 marostegui@cumin1001: dbctl commit (dc=all): 'db2120 (re)pooling @ 100%: Pooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P34105 and previous config saved to /var/cache/conftool/dbconfig/20220907-125706-root.json
  • 12:42 marostegui@cumin1001: dbctl commit (dc=all): 'db2122 (re)pooling @ 50%: Pooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34104 and previous config saved to /var/cache/conftool/dbconfig/20220907-124213-root.json
  • 12:42 marostegui@cumin1001: dbctl commit (dc=all): 'db2120 (re)pooling @ 75%: Pooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P34103 and previous config saved to /var/cache/conftool/dbconfig/20220907-124201-root.json
  • 12:31 jbond: re-enable puppet
  • 12:27 moritzm: installing runc security updates on codfw staging hosts
  • 12:27 marostegui@cumin1001: dbctl commit (dc=all): 'db2122 (re)pooling @ 25%: Pooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34102 and previous config saved to /var/cache/conftool/dbconfig/20220907-122708-root.json
  • 12:26 marostegui@cumin1001: dbctl commit (dc=all): 'db2120 (re)pooling @ 50%: Pooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P34101 and previous config saved to /var/cache/conftool/dbconfig/20220907-122656-root.json
  • 12:12 marostegui@cumin1001: dbctl commit (dc=all): 'db2122 (re)pooling @ 10%: Pooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34100 and previous config saved to /var/cache/conftool/dbconfig/20220907-121204-root.json
  • 12:11 marostegui@cumin1001: dbctl commit (dc=all): 'db2120 (re)pooling @ 25%: Pooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P34099 and previous config saved to /var/cache/conftool/dbconfig/20220907-121152-root.json
  • 12:08 jbond: disable puppet fleet wide to fix issues
  • 11:56 marostegui@cumin1001: dbctl commit (dc=all): 'db2122 (re)pooling @ 5%: Pooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34098 and previous config saved to /var/cache/conftool/dbconfig/20220907-115659-root.json
  • 11:56 marostegui@cumin1001: dbctl commit (dc=all): 'db2120 (re)pooling @ 10%: Pooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P34097 and previous config saved to /var/cache/conftool/dbconfig/20220907-115647-root.json
  • 11:41 marostegui@cumin1001: dbctl commit (dc=all): 'db2122 (re)pooling @ 1%: Pooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34096 and previous config saved to /var/cache/conftool/dbconfig/20220907-114154-root.json
  • 11:41 marostegui@cumin1001: dbctl commit (dc=all): 'db2120 (re)pooling @ 5%: Pooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P34095 and previous config saved to /var/cache/conftool/dbconfig/20220907-114142-root.json
  • 11:34 jbond: change default puppet file permissions to root:root
  • 11:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1201 (re)pooling @ 100%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34094 and previous config saved to /var/cache/conftool/dbconfig/20220907-111821-root.json
  • 11:05 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/api-gateway: sync
  • 11:04 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/api-gateway: sync
  • 11:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1201 (re)pooling @ 75%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34093 and previous config saved to /var/cache/conftool/dbconfig/20220907-110316-root.json
  • 11:01 btullis@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
  • 11:01 btullis@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
  • 11:01 btullis@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
  • 11:00 btullis@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
  • 11:00 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on 7 hosts with reason: Downtime pending inclusion in production
  • 11:00 cgoubert@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on 7 hosts with reason: Downtime pending inclusion in production
  • 11:00 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: sync
  • 10:59 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/api-gateway: sync
  • 10:59 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/api-gateway: sync
  • 10:59 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/api-gateway: sync
  • 10:53 cgoubert@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1046.eqiad.wmnet
  • 10:53 cgoubert@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1045.eqiad.wmnet
  • 10:53 cgoubert@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1044.eqiad.wmnet
  • 10:52 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on wtp[1044-1046].eqiad.wmnet with reason: Downtiming replaced wtp servers
  • 10:52 cgoubert@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on wtp[1044-1046].eqiad.wmnet with reason: Downtiming replaced wtp servers
  • 10:48 cgoubert@puppetmaster1001: conftool action : set/pooled=no:weight=10; selector: dc=eqiad,cluster=parsoid,name=parse1017.eqiad.wmnet
  • 10:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1201 (re)pooling @ 50%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34092 and previous config saved to /var/cache/conftool/dbconfig/20220907-104811-root.json
  • 10:40 claime: pooled parse1016.eqiad.wmnet (php 7.4 only) in parsoid cluster T307219
  • 10:39 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for parse1016.eqiad.wmnet
  • 10:39 cgoubert@cumin1001: START - Cookbook sre.hosts.remove-downtime for parse1016.eqiad.wmnet
  • 10:36 claime: depooled wtp1048.eqiad.wmnet from parsoid cluster T307219
  • 10:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1201 (re)pooling @ 25%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34091 and previous config saved to /var/cache/conftool/dbconfig/20220907-103306-root.json
  • 10:31 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: Enable sitelinks to redirects on testwikidatawiki (T316637) (duration: 03m 51s)
  • 10:29 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 10:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 10:28 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 10:27 cgoubert@puppetmaster1001: conftool action : set/pooled=no:weight=10; selector: dc=eqiad,cluster=parsoid,name=parse1016.eqiad.wmnet
  • 10:27 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 10:26 claime: pooled parse1015.eqiad.wmnet (php 7.4 only) in parsoid cluster T307219
  • 10:25 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for parse1015.eqiad.wmnet
  • 10:25 cgoubert@cumin1001: START - Cookbook sre.hosts.remove-downtime for parse1015.eqiad.wmnet
  • 10:21 claime: depooled wtp1047.eqiad.wmnet from parsoid cluster T307219
  • 10:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1201 (re)pooling @ 10%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34089 and previous config saved to /var/cache/conftool/dbconfig/20220907-101801-root.json
  • 10:17 marostegui@cumin1001: dbctl commit (dc=all): 'db1200 (re)pooling @ 100%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34088 and previous config saved to /var/cache/conftool/dbconfig/20220907-101734-root.json
  • 10:12 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2120 to clone db2122', diff saved to https://phabricator.wikimedia.org/P34086 and previous config saved to /var/cache/conftool/dbconfig/20220907-101258-root.json
  • 10:12 cgoubert@puppetmaster1001: conftool action : set/pooled=no:weight=10; selector: dc=eqiad,cluster=parsoid,name=parse1015.eqiad.wmnet
  • 10:12 claime: repooled parse1014.eqiad.wmnet (php 7.4 only) in parsoid cluster T307219
  • 10:10 cgoubert@puppetmaster1001: conftool action : set/pooled=no:weight=10; selector: dc=eqiad,cluster=parsoid,name=parse1014.eqiad.wmnet
  • 10:05 claime: pooled parse1014.eqiad.wmnet (php 7.4 only) in parsoid cluster T307219
  • 10:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1201 (re)pooling @ 5%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34085 and previous config saved to /var/cache/conftool/dbconfig/20220907-100257-root.json
  • 10:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1200 (re)pooling @ 75%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34084 and previous config saved to /var/cache/conftool/dbconfig/20220907-100229-root.json
  • 09:57 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for parse1014.eqiad.wmnet
  • 09:57 cgoubert@cumin1001: START - Cookbook sre.hosts.remove-downtime for parse1014.eqiad.wmnet
  • 09:53 cgoubert@puppetmaster1001: conftool action : set/pooled=no:weight=10; selector: dc=eqiad,cluster=parsoid,name=parse1014.eqiad.wmnet
  • 09:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1199 (re)pooling @ 100%: Pooling for the first time in s4', diff saved to https://phabricator.wikimedia.org/P34083 and previous config saved to /var/cache/conftool/dbconfig/20220907-094825-root.json
  • 09:48 topranks: Re-pooling eqsin for user traffic after successful core router upgrades - T295690
  • 09:47 marostegui@cumin1001: dbctl commit (dc=all): 'db1201 (re)pooling @ 4%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34082 and previous config saved to /var/cache/conftool/dbconfig/20220907-094752-root.json
  • 09:47 marostegui@cumin1001: dbctl commit (dc=all): 'db1200 (re)pooling @ 50%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34081 and previous config saved to /var/cache/conftool/dbconfig/20220907-094724-root.json
  • 09:44 claime: depooled wtp1046.eqiad.wmnet from parsoid cluster T307219
  • 09:37 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 100%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P34080 and previous config saved to /var/cache/conftool/dbconfig/20220907-093736-root.json
  • 09:35 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for parse1013.eqiad.wmnet
  • 09:35 cgoubert@cumin1001: START - Cookbook sre.hosts.remove-downtime for parse1013.eqiad.wmnet
  • 09:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1199 (re)pooling @ 75%: Pooling for the first time in s4', diff saved to https://phabricator.wikimedia.org/P34079 and previous config saved to /var/cache/conftool/dbconfig/20220907-093320-root.json
  • 09:32 marostegui@cumin1001: dbctl commit (dc=all): 'db1201 (re)pooling @ 3%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34078 and previous config saved to /var/cache/conftool/dbconfig/20220907-093247-root.json
  • 09:32 marostegui@cumin1001: dbctl commit (dc=all): 'db1200 (re)pooling @ 25%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34077 and previous config saved to /var/cache/conftool/dbconfig/20220907-093219-root.json
  • 09:31 claime: pooled parse1013.eqiad.wmnet (php 7.4 only) in parsoid cluster T307219
  • 09:26 godog: restart swift-proxy and repool ms-fe1012
  • 09:22 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 75%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P34076 and previous config saved to /var/cache/conftool/dbconfig/20220907-092230-root.json
  • 09:20 topranks: rebooting cr3-eqsin to complete JunOS upgrade
  • 09:20 cmooney@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on cr3-eqsin with reason: router upgrade
  • 09:19 cmooney@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on cr3-eqsin with reason: router upgrade
  • 09:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1199 (re)pooling @ 50%: Pooling for the first time in s4', diff saved to https://phabricator.wikimedia.org/P34075 and previous config saved to /var/cache/conftool/dbconfig/20220907-091815-root.json
  • 09:17 marostegui@cumin1001: dbctl commit (dc=all): 'db1201 (re)pooling @ 2%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34074 and previous config saved to /var/cache/conftool/dbconfig/20220907-091740-root.json
  • 09:17 marostegui@cumin1001: dbctl commit (dc=all): 'db1200 (re)pooling @ 10%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34073 and previous config saved to /var/cache/conftool/dbconfig/20220907-091715-root.json
  • 09:10 cgoubert@puppetmaster1001: conftool action : set/pooled=no:weight=10; selector: dc=eqiad,cluster=parsoid,name=parse1013.eqiad.wmnet
  • 09:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1198 (re)pooling @ 100%: Pooling for the first time in s3', diff saved to https://phabricator.wikimedia.org/P34072 and previous config saved to /var/cache/conftool/dbconfig/20220907-090830-root.json
  • 09:07 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 50%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P34071 and previous config saved to /var/cache/conftool/dbconfig/20220907-090725-root.json
  • 09:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1199 (re)pooling @ 25%: Pooling for the first time in s4', diff saved to https://phabricator.wikimedia.org/P34070 and previous config saved to /var/cache/conftool/dbconfig/20220907-090310-root.json
  • 09:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1200 (re)pooling @ 5%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34069 and previous config saved to /var/cache/conftool/dbconfig/20220907-090210-root.json
  • 08:56 marostegui@cumin1001: dbctl commit (dc=all): 'db1174 (re)pooling @ 100%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P34068 and previous config saved to /var/cache/conftool/dbconfig/20220907-085610-root.json
  • 08:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1198 (re)pooling @ 75%: Pooling for the first time in s3', diff saved to https://phabricator.wikimedia.org/P34067 and previous config saved to /var/cache/conftool/dbconfig/20220907-085325-root.json
  • 08:52 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 25%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P34066 and previous config saved to /var/cache/conftool/dbconfig/20220907-085220-root.json
  • 08:51 topranks: rebooting cr2-eqsin to complete JunOS upgrade
  • 08:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1199 (re)pooling @ 10%: Pooling for the first time in s4', diff saved to https://phabricator.wikimedia.org/P34065 and previous config saved to /var/cache/conftool/dbconfig/20220907-084805-root.json
  • 08:47 marostegui@cumin1001: dbctl commit (dc=all): 'db1200 (re)pooling @ 4%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34064 and previous config saved to /var/cache/conftool/dbconfig/20220907-084705-root.json
  • 08:46 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 08:45 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 08:45 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 08:44 marostegui@cumin1001: dbctl commit (dc=all): 'Pooling db1201 for the first time in s6 T316342', diff saved to https://phabricator.wikimedia.org/P34063 and previous config saved to /var/cache/conftool/dbconfig/20220907-084454-marostegui.json
  • 08:44 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 08:42 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1122 (s2 master) from API', diff saved to https://phabricator.wikimedia.org/P34062 and previous config saved to /var/cache/conftool/dbconfig/20220907-084232-root.json
  • 08:42 oblivian@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: Move 50% of traffic to php 7.4 (T271736) (duration: 04m 00s)
  • 08:41 marostegui@cumin1001: dbctl commit (dc=all): 'Remove weight from x1 master', diff saved to https://phabricator.wikimedia.org/P34061 and previous config saved to /var/cache/conftool/dbconfig/20220907-084133-marostegui.json
  • 08:41 marostegui@cumin1001: dbctl commit (dc=all): 'db1174 (re)pooling @ 75%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P34060 and previous config saved to /var/cache/conftool/dbconfig/20220907-084105-root.json
  • 08:40 marostegui@cumin1001: dbctl commit (dc=all): 'Add db1201 to s6, depooled, T316342', diff saved to https://phabricator.wikimedia.org/P34059 and previous config saved to /var/cache/conftool/dbconfig/20220907-084057-marostegui.json
  • 08:39 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1131 (s6 master) from API', diff saved to https://phabricator.wikimedia.org/P34058 and previous config saved to /var/cache/conftool/dbconfig/20220907-083958-root.json
  • 08:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1198 (re)pooling @ 50%: Pooling for the first time in s3', diff saved to https://phabricator.wikimedia.org/P34057 and previous config saved to /var/cache/conftool/dbconfig/20220907-083820-root.json
  • 08:37 marostegui@cumin1001: dbctl commit (dc=all): 'db2146 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34056 and previous config saved to /var/cache/conftool/dbconfig/20220907-083752-root.json
  • 08:37 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 10%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P34055 and previous config saved to /var/cache/conftool/dbconfig/20220907-083715-root.json
  • 08:37 cmooney@cumin1001: END (PASS) - Cookbook sre.network.cf (exit_code=0)
  • 08:37 cmooney@cumin1001: START - Cookbook sre.network.cf
  • 08:35 cmooney@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cr2-eqsin with reason: router upgrade
  • 08:35 cmooney@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cr2-eqsin with reason: router upgrade
  • 08:35 cmooney@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on cr2-eqsin.wikimedia.org with reason: router upgrade
  • 08:35 cmooney@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cr2-eqsin.wikimedia.org with reason: router upgrade
  • 08:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1199 (re)pooling @ 5%: Pooling for the first time in s4', diff saved to https://phabricator.wikimedia.org/P34054 and previous config saved to /var/cache/conftool/dbconfig/20220907-083300-root.json
  • 08:32 marostegui@cumin1001: dbctl commit (dc=all): 'db1200 (re)pooling @ 3%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34053 and previous config saved to /var/cache/conftool/dbconfig/20220907-083200-root.json
  • 08:25 marostegui@cumin1001: dbctl commit (dc=all): 'db1174 (re)pooling @ 50%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P34052 and previous config saved to /var/cache/conftool/dbconfig/20220907-082554-root.json
  • 08:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1198 (re)pooling @ 25%: Pooling for the first time in s3', diff saved to https://phabricator.wikimedia.org/P34051 and previous config saved to /var/cache/conftool/dbconfig/20220907-082315-root.json
  • 08:22 marostegui@cumin1001: dbctl commit (dc=all): 'db2146 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34050 and previous config saved to /var/cache/conftool/dbconfig/20220907-082247-root.json
  • 08:22 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 5%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P34049 and previous config saved to /var/cache/conftool/dbconfig/20220907-082210-root.json
  • 08:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1197 (re)pooling @ 100%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34048 and previous config saved to /var/cache/conftool/dbconfig/20220907-081826-root.json
  • 08:17 marostegui@cumin1001: dbctl commit (dc=all): 'db1199 (re)pooling @ 4%: Pooling for the first time in s4', diff saved to https://phabricator.wikimedia.org/P34047 and previous config saved to /var/cache/conftool/dbconfig/20220907-081756-root.json
  • 08:16 marostegui@cumin1001: dbctl commit (dc=all): 'db1200 (re)pooling @ 2%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34046 and previous config saved to /var/cache/conftool/dbconfig/20220907-081655-root.json
  • 08:10 marostegui@cumin1001: dbctl commit (dc=all): 'db1174 (re)pooling @ 25%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P34045 and previous config saved to /var/cache/conftool/dbconfig/20220907-081049-root.json
  • 08:08 marostegui@cumin1001: dbctl commit (dc=all): 'Pooling db1200 for the first time in s5 T316342', diff saved to https://phabricator.wikimedia.org/P34044 and previous config saved to /var/cache/conftool/dbconfig/20220907-080825-marostegui.json
  • 08:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1198 (re)pooling @ 10%: Pooling for the first time in s3', diff saved to https://phabricator.wikimedia.org/P34043 and previous config saved to /var/cache/conftool/dbconfig/20220907-080810-root.json
  • 08:07 marostegui@cumin1001: dbctl commit (dc=all): 'db2146 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34042 and previous config saved to /var/cache/conftool/dbconfig/20220907-080742-root.json
  • 08:07 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 4%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P34041 and previous config saved to /var/cache/conftool/dbconfig/20220907-080705-root.json
  • 08:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1169 (T312863)', diff saved to https://phabricator.wikimedia.org/P34040 and previous config saved to /var/cache/conftool/dbconfig/20220907-080449-ladsgroup.json
  • 08:04 marostegui@cumin1001: dbctl commit (dc=all): 'db1196 (re)pooling @ 100%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34039 and previous config saved to /var/cache/conftool/dbconfig/20220907-080439-root.json
  • 08:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 08:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 08:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1197 (re)pooling @ 75%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34038 and previous config saved to /var/cache/conftool/dbconfig/20220907-080321-root.json
  • 08:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1199 (re)pooling @ 3%: Pooling for the first time in s4', diff saved to https://phabricator.wikimedia.org/P34037 and previous config saved to /var/cache/conftool/dbconfig/20220907-080251-root.json
  • 07:59 marostegui@cumin1001: dbctl commit (dc=all): 'Add db1200 to s5, depooled, T316342', diff saved to https://phabricator.wikimedia.org/P34036 and previous config saved to /var/cache/conftool/dbconfig/20220907-075919-marostegui.json
  • 07:55 marostegui@cumin1001: dbctl commit (dc=all): 'db1174 (re)pooling @ 10%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P34035 and previous config saved to /var/cache/conftool/dbconfig/20220907-075544-root.json
  • 07:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1198 (re)pooling @ 5%: Pooling for the first time in s3', diff saved to https://phabricator.wikimedia.org/P34034 and previous config saved to /var/cache/conftool/dbconfig/20220907-075305-root.json
  • 07:52 marostegui@cumin1001: dbctl commit (dc=all): 'db2146 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34033 and previous config saved to /var/cache/conftool/dbconfig/20220907-075237-root.json
  • 07:52 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 3%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P34032 and previous config saved to /var/cache/conftool/dbconfig/20220907-075200-root.json
  • 07:49 marostegui@cumin1001: dbctl commit (dc=all): 'db1196 (re)pooling @ 75%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34031 and previous config saved to /var/cache/conftool/dbconfig/20220907-074935-root.json
  • 07:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1197 (re)pooling @ 50%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34030 and previous config saved to /var/cache/conftool/dbconfig/20220907-074816-root.json
  • 07:47 marostegui@cumin1001: dbctl commit (dc=all): 'db1199 (re)pooling @ 2%: Pooling for the first time in s4', diff saved to https://phabricator.wikimedia.org/P34029 and previous config saved to /var/cache/conftool/dbconfig/20220907-074746-root.json
  • 07:46 marostegui@cumin1001: dbctl commit (dc=all): 'db1111 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34028 and previous config saved to /var/cache/conftool/dbconfig/20220907-074636-root.json
  • 07:46 topranks: Depool eqsin from user traffic in advance of core router upgrades - T295690
  • 07:44 jmm@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cumin2002.codfw.wmnet
  • 07:40 marostegui@cumin1001: dbctl commit (dc=all): 'db1174 (re)pooling @ 5%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P34027 and previous config saved to /var/cache/conftool/dbconfig/20220907-074039-root.json
  • 07:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1198 (re)pooling @ 4%: Pooling for the first time in s3', diff saved to https://phabricator.wikimedia.org/P34026 and previous config saved to /var/cache/conftool/dbconfig/20220907-073800-root.json
  • 07:37 marostegui@cumin1001: dbctl commit (dc=all): 'Pooling db1199 for the first time in s4 T316342', diff saved to https://phabricator.wikimedia.org/P34025 and previous config saved to /var/cache/conftool/dbconfig/20220907-073745-marostegui.json
  • 07:37 marostegui@cumin1001: dbctl commit (dc=all): 'db2146 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34024 and previous config saved to /var/cache/conftool/dbconfig/20220907-073732-root.json
  • 07:37 marostegui@cumin1001: dbctl commit (dc=all): 'Add db1199 to s4, depooled, T316342', diff saved to https://phabricator.wikimedia.org/P34023 and previous config saved to /var/cache/conftool/dbconfig/20220907-073727-marostegui.json
  • 07:36 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 2%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P34022 and previous config saved to /var/cache/conftool/dbconfig/20220907-073655-root.json
  • 07:34 marostegui@cumin1001: dbctl commit (dc=all): 'db1196 (re)pooling @ 50%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34021 and previous config saved to /var/cache/conftool/dbconfig/20220907-073430-root.json
  • 07:33 jmm@cumin1001: START - Cookbook sre.hosts.reboot-single for host cumin2002.codfw.wmnet
  • 07:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1197 (re)pooling @ 25%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34020 and previous config saved to /var/cache/conftool/dbconfig/20220907-073311-root.json
  • 07:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2140.codfw.wmnet with reason: Maintenance
  • 07:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2140.codfw.wmnet with reason: Maintenance
  • 07:31 marostegui@cumin1001: dbctl commit (dc=all): 'db1111 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34019 and previous config saved to /var/cache/conftool/dbconfig/20220907-073131-root.json
  • 07:28 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2095.codfw.wmnet with reason: Maintenance
  • 07:28 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2095.codfw.wmnet with reason: Maintenance
  • 07:22 marostegui@cumin1001: dbctl commit (dc=all): 'db1198 (re)pooling @ 3%: Pooling for the first time in s3', diff saved to https://phabricator.wikimedia.org/P34018 and previous config saved to /var/cache/conftool/dbconfig/20220907-072255-root.json
  • 07:22 marostegui@cumin1001: dbctl commit (dc=all): 'db2146 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34017 and previous config saved to /var/cache/conftool/dbconfig/20220907-072214-root.json
  • 07:21 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 1%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P34016 and previous config saved to /var/cache/conftool/dbconfig/20220907-072151-root.json
  • 07:19 marostegui@cumin1001: dbctl commit (dc=all): 'db1196 (re)pooling @ 25%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34015 and previous config saved to /var/cache/conftool/dbconfig/20220907-071925-root.json
  • 07:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1197 (re)pooling @ 10%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34014 and previous config saved to /var/cache/conftool/dbconfig/20220907-071806-root.json
  • 07:17 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2146 and db2122', diff saved to https://phabricator.wikimedia.org/P34013 and previous config saved to /var/cache/conftool/dbconfig/20220907-071744-root.json
  • 07:16 marostegui@cumin1001: dbctl commit (dc=all): 'db1111 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34012 and previous config saved to /var/cache/conftool/dbconfig/20220907-071627-root.json
  • 07:07 marostegui@cumin1001: dbctl commit (dc=all): 'db1198 (re)pooling @ 2%: Pooling for the first time in s3', diff saved to https://phabricator.wikimedia.org/P34011 and previous config saved to /var/cache/conftool/dbconfig/20220907-070750-root.json
  • 07:04 marostegui@cumin1001: dbctl commit (dc=all): 'db1196 (re)pooling @ 10%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34010 and previous config saved to /var/cache/conftool/dbconfig/20220907-070420-root.json
  • 07:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1197 (re)pooling @ 5%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34009 and previous config saved to /var/cache/conftool/dbconfig/20220907-070301-root.json
  • 07:01 marostegui@cumin1001: dbctl commit (dc=all): 'db1111 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34008 and previous config saved to /var/cache/conftool/dbconfig/20220907-070122-root.json
  • 06:49 marostegui@cumin1001: dbctl commit (dc=all): 'db1196 (re)pooling @ 5%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34007 and previous config saved to /var/cache/conftool/dbconfig/20220907-064915-root.json
  • 06:48 marostegui@cumin1001: dbctl commit (dc=all): 'Pooling db1198 for the first time in s3 T316342', diff saved to https://phabricator.wikimedia.org/P34006 and previous config saved to /var/cache/conftool/dbconfig/20220907-064831-marostegui.json
  • 06:47 marostegui@cumin1001: dbctl commit (dc=all): 'db1197 (re)pooling @ 4%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34005 and previous config saved to /var/cache/conftool/dbconfig/20220907-064757-root.json
  • 06:46 marostegui@cumin1001: dbctl commit (dc=all): 'db1111 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34004 and previous config saved to /var/cache/conftool/dbconfig/20220907-064617-root.json
  • 06:45 marostegui@cumin1001: dbctl commit (dc=all): 'Add db1198 to s3, depooled, T316342', diff saved to https://phabricator.wikimedia.org/P34003 and previous config saved to /var/cache/conftool/dbconfig/20220907-064552-marostegui.json
  • 06:34 marostegui@cumin1001: dbctl commit (dc=all): 'db1196 (re)pooling @ 4%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34002 and previous config saved to /var/cache/conftool/dbconfig/20220907-063410-root.json
  • 06:32 marostegui@cumin1001: dbctl commit (dc=all): 'db1197 (re)pooling @ 3%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P34001 and previous config saved to /var/cache/conftool/dbconfig/20220907-063252-root.json
  • 06:31 marostegui@cumin1001: dbctl commit (dc=all): 'db1111 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34000 and previous config saved to /var/cache/conftool/dbconfig/20220907-063112-root.json
  • 06:19 marostegui@cumin1001: dbctl commit (dc=all): 'db1196 (re)pooling @ 3%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P33999 and previous config saved to /var/cache/conftool/dbconfig/20220907-061906-root.json
  • 06:17 marostegui@cumin1001: dbctl commit (dc=all): 'db1197 (re)pooling @ 2%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P33998 and previous config saved to /var/cache/conftool/dbconfig/20220907-061747-root.json
  • 06:16 marostegui@cumin1001: dbctl commit (dc=all): 'db1111 (re)pooling @ 4%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33997 and previous config saved to /var/cache/conftool/dbconfig/20220907-061607-root.json
  • 06:11 marostegui@cumin1001: dbctl commit (dc=all): 'Pooling db1197 for the first time in s2 T316342', diff saved to https://phabricator.wikimedia.org/P33996 and previous config saved to /var/cache/conftool/dbconfig/20220907-061147-marostegui.json
  • 06:08 marostegui@cumin1001: dbctl commit (dc=all): 'Add db1197 to s2, depooled, T316342', diff saved to https://phabricator.wikimedia.org/P33995 and previous config saved to /var/cache/conftool/dbconfig/20220907-060828-marostegui.json
  • 06:04 marostegui@cumin1001: dbctl commit (dc=all): 'db1196 (re)pooling @ 2%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P33994 and previous config saved to /var/cache/conftool/dbconfig/20220907-060401-root.json
  • 06:01 marostegui@cumin1001: dbctl commit (dc=all): 'db1111 (re)pooling @ 3%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33993 and previous config saved to /var/cache/conftool/dbconfig/20220907-060102-root.json
  • 05:52 marostegui@cumin1001: dbctl commit (dc=all): 'Pooling db1196 for the first time in s1 T316342', diff saved to https://phabricator.wikimedia.org/P33992 and previous config saved to /var/cache/conftool/dbconfig/20220907-055201-marostegui.json
  • 05:49 marostegui@cumin1001: dbctl commit (dc=all): 'Add db1196 to s1, depooled, T316342', diff saved to https://phabricator.wikimedia.org/P33991 and previous config saved to /var/cache/conftool/dbconfig/20220907-054910-marostegui.json
  • 05:45 marostegui@cumin1001: dbctl commit (dc=all): 'db1111 (re)pooling @ 2%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33990 and previous config saved to /var/cache/conftool/dbconfig/20220907-054557-root.json
  • 05:33 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1172 T316342', diff saved to https://phabricator.wikimedia.org/P33988 and previous config saved to /var/cache/conftool/dbconfig/20220907-053350-root.json
  • 05:31 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1174 T316342', diff saved to https://phabricator.wikimedia.org/P33986 and previous config saved to /var/cache/conftool/dbconfig/20220907-053154-root.json
  • 05:30 marostegui@cumin1001: dbctl commit (dc=all): 'db1111 (re)pooling @ 1%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33985 and previous config saved to /var/cache/conftool/dbconfig/20220907-053053-root.json
  • 02:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1140.eqiad.wmnet with reason: Maintenance
  • 02:27 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1140.eqiad.wmnet with reason: Maintenance
  • 02:26 ejegg: rolled back civicrm from 2fcd3bb4 to 5aa1309d
  • 02:16 ejegg: civicrm upgraded from 5aa1309d to 2fcd3bb4
  • 01:57 ejegg: updated payments-wiki from 648842f9 to de4b2bb9
  • 01:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2140 (T314041)', diff saved to https://phabricator.wikimedia.org/P33983 and previous config saved to /var/cache/conftool/dbconfig/20220907-015138-ladsgroup.json
  • 01:51 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2140.codfw.wmnet with reason: Maintenance
  • 01:51 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2140.codfw.wmnet with reason: Maintenance
  • 01:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2172 (T314041)', diff saved to https://phabricator.wikimedia.org/P33982 and previous config saved to /var/cache/conftool/dbconfig/20220907-015116-ladsgroup.json
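
Note on the "(re)pooling @ N%" entries above: read oldest to newest, each host appears to ramp through 1%, 2%, 3%, 4%, 5%, 10%, 25%, 50%, 75% and 100%, with roughly 15 minutes between steps (db1111 runs from 05:30 to 07:46). The following is a minimal sketch for pulling that ramp out of the log text. It is not the tooling that produced these entries; the regular expression and the function name `pooling_steps` are assumptions made for illustration, and the sample lines are copied from the entries above but cut off after the commit message.

```python
import re
from collections import defaultdict

# Assumed entry shape, based only on the lines in this log:
#   HH:MM user@host: dbctl commit (dc=all): 'dbNNNN (re)pooling @ N%: reason', ...
POOLING_RE = re.compile(
    r"(?P<time>\d{2}:\d{2}) \S+: dbctl commit \(dc=all\): "
    r"'(?P<host>db\d+) \(re\)pooling @ (?P<pct>\d+)%"
)


def pooling_steps(log_text):
    """Return {host: [(time, percent), ...]} with each host's steps oldest first."""
    steps = defaultdict(list)
    for line in log_text.splitlines():
        match = POOLING_RE.search(line)
        if match:
            steps[match["host"]].append((match["time"], int(match["pct"])))
    # The log lists newest entries first. Sorting HH:MM strings restores
    # chronological order, which is only valid within a single day.
    return {host: sorted(entries) for host, entries in steps.items()}


# Sample lines shortened from the 2022-09-07 entries; the last one is a plain
# depool commit and is intentionally not matched.
sample = """\
  • 07:46 marostegui@cumin1001: dbctl commit (dc=all): 'db1111 (re)pooling @ 100%: Repooling after upgrade'
  • 07:31 marostegui@cumin1001: dbctl commit (dc=all): 'db1111 (re)pooling @ 75%: Repooling after upgrade'
  • 07:21 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 1%: Repooling after cloning another host'
  • 07:16 marostegui@cumin1001: dbctl commit (dc=all): 'db1111 (re)pooling @ 50%: Repooling after upgrade'
  • 05:33 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1172 T316342'
"""

steps = pooling_steps(sample)
for host, entries in sorted(steps.items()):
    print(host, entries)
```

Run per day section rather than over the whole page, since the entries carry no date of their own and the same HH:MM can recur under several date headings.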

2022-09-06

  • 23:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2114 (T314041)', diff saved to https://phabricator.wikimedia.org/P33981 and previous config saved to /var/cache/conftool/dbconfig/20220906-233809-ladsgroup.json
  • 23:07 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6 days, 0:00:00 on phab1004.eqiad.wmnet with reason: new install
  • 23:06 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 6 days, 0:00:00 on phab1004.eqiad.wmnet with reason: new install
  • 22:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2114 (T314041)', diff saved to https://phabricator.wikimedia.org/P33980 and previous config saved to /var/cache/conftool/dbconfig/20220906-222439-ladsgroup.json
  • 22:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2114.codfw.wmnet with reason: Maintenance
  • 22:24 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2114.codfw.wmnet with reason: Maintenance
  • 22:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2117 (T314041)', diff saved to https://phabricator.wikimedia.org/P33979 and previous config saved to /var/cache/conftool/dbconfig/20220906-222418-ladsgroup.json
  • 21:56 milimetric@deploy1002: Finished deploy [analytics/refinery@b14c9f4] (thin): Hotfix for requestctl field (duration: 00m 08s)
  • 21:56 milimetric@deploy1002: Started deploy [analytics/refinery@b14c9f4] (thin): Hotfix for requestctl field
  • 21:56 milimetric@deploy1002: Finished deploy [analytics/refinery@b14c9f4]: Hotfix for requestctl field (duration: 02m 28s)
  • 21:53 milimetric@deploy1002: Started deploy [analytics/refinery@b14c9f4]: Hotfix for requestctl field
  • 21:53 milimetric@deploy1002: Finished deploy [analytics/refinery@b14c9f4]: Hotfix for requestctl field (duration: 03m 28s)
  • 21:49 milimetric@deploy1002: Started deploy [analytics/refinery@b14c9f4]: Hotfix for requestctl field
  • 21:49 milimetric@deploy1002: Finished deploy [analytics/refinery@b14c9f4]: Hotfix for requestctl field (duration: 03m 55s)
  • 21:45 milimetric@deploy1002: Started deploy [analytics/refinery@b14c9f4]: Hotfix for requestctl field
  • 21:45 milimetric@deploy1002: deploy aborted: Hotfix for requestctl field (duration: 32m 09s)
  • 21:41 root@cumin1001: END (PASS) - Cookbook sre.network.prepare-upgrade (exit_code=0)
  • 21:39 mutante: phabricator - passive hosts in codfw switched to readonly DB access (m3-slave, not m3-master) T315713
  • 21:30 root@cumin1001: END (ERROR) - Cookbook sre.network.prepare-upgrade (exit_code=97)
  • 21:13 milimetric@deploy1002: Started deploy [analytics/refinery@b14c9f4]: Hotfix for requestctl field
  • 20:57 milimetric@deploy1002: Finished deploy [analytics/refinery@8a5ce13] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@8a5ce13] (duration: 08m 54s)
  • 20:50 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 20:49 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 20:49 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 20:48 milimetric@deploy1002: Started deploy [analytics/refinery@8a5ce13] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@8a5ce13]
  • 20:48 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 20:48 cjming: end of UTC late backport window
  • 20:47 cjming@deploy1002: Finished scap: Backport for Add localized wordmark for Bengali Wiktionary (T316953) (duration: 05m 24s)
  • 20:45 milimetric@deploy1002: Finished deploy [analytics/refinery@8a5ce13]: Regular analytics weekly train [analytics/refinery@8a5ce13] (duration: 00m 16s)
  • 20:44 milimetric@deploy1002: Started deploy [analytics/refinery@8a5ce13]: Regular analytics weekly train [analytics/refinery@8a5ce13]
  • 20:44 milimetric@deploy1002: deploy aborted: Regular analytics weekly train [analytics/refinery@8a5ce13] (duration: 00m 00s)
  • 20:44 milimetric@deploy1002: Started deploy [analytics/refinery@8a5ce13]: Regular analytics weekly train [analytics/refinery@8a5ce13]
  • 20:44 milimetric@deploy1002: Finished deploy [analytics/refinery@8a5ce13] (thin): Regular analytics weekly train THIN [analytics/refinery@8a5ce13] (duration: 00m 08s)
  • 20:44 milimetric@deploy1002: Started deploy [analytics/refinery@8a5ce13] (thin): Regular analytics weekly train THIN [analytics/refinery@8a5ce13]
  • 20:42 cjming@deploy1002: cjming and mdsshakil: Backport for Add localized wordmark for Bengali Wiktionary (T316953) synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet
  • 20:41 cjming@deploy1002: Started scap: Backport for Add localized wordmark for Bengali Wiktionary (T316953)
  • 20:38 milimetric@deploy1002: Finished deploy [analytics/refinery@8a5ce13]: Regular analytics weekly train [analytics/refinery@8a5ce13] (duration: 03m 15s)
  • 20:35 milimetric@deploy1002: Started deploy [analytics/refinery@8a5ce13]: Regular analytics weekly train [analytics/refinery@8a5ce13]
  • 20:33 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 20:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2117 (T314041)', diff saved to https://phabricator.wikimedia.org/P33978 and previous config saved to /var/cache/conftool/dbconfig/20220906-203258-ladsgroup.json
  • 20:32 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2117.codfw.wmnet with reason: Maintenance
  • 20:32 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2117.codfw.wmnet with reason: Maintenance
  • 20:32 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2124 (T314041)', diff saved to https://phabricator.wikimedia.org/P33977 and previous config saved to /var/cache/conftool/dbconfig/20220906-203236-ladsgroup.json
  • 20:29 cjming@deploy1002: Finished scap: Backport for Ensure namespace filters is passed as a list (duration: 06m 35s)
  • 20:29 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 20:29 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 20:28 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 20:28 milimetric@deploy1002: Finished deploy [analytics/refinery@8a5ce13]: Regular analytics weekly train [analytics/refinery@8a5ce13] (duration: 63m 48s)
  • 20:27 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1139.eqiad.wmnet with reason: Maintenance
  • 20:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1139.eqiad.wmnet with reason: Maintenance
  • 20:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T312863)', diff saved to https://phabricator.wikimedia.org/P33976 and previous config saved to /var/cache/conftool/dbconfig/20220906-202654-ladsgroup.json
  • 20:23 cjming@deploy1002: cjming and ebernhardson: Backport for Ensure namespace filters is passed as a list synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet
  • 20:23 cjming@deploy1002: Started scap: Backport for Ensure namespace filters is passed as a list
  • 20:16 bd808: Forcing puppet runs on cloudweb100[34] to deploy new version of Striker (T296893)
  • 20:13 bd808: Running database migrations for Striker (T296893)
  • 20:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P33975 and previous config saved to /var/cache/conftool/dbconfig/20220906-201148-ladsgroup.json
  • 20:03 inflatador: 'bking@cumin1001 disabling puppet on elastic codfw hosts T313431'
  • 19:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P33974 and previous config saved to /var/cache/conftool/dbconfig/20220906-195642-ladsgroup.json
  • 19:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T312863)', diff saved to https://phabricator.wikimedia.org/P33973 and previous config saved to /var/cache/conftool/dbconfig/20220906-194135-ladsgroup.json
  • 19:24 milimetric@deploy1002: Started deploy [analytics/refinery@8a5ce13]: Regular analytics weekly train [analytics/refinery@8a5ce13]
  • 18:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2124 (T314041)', diff saved to https://phabricator.wikimedia.org/P33972 and previous config saved to /var/cache/conftool/dbconfig/20220906-184515-ladsgroup.json
  • 18:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2124.codfw.wmnet with reason: Maintenance
  • 18:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2124.codfw.wmnet with reason: Maintenance
  • 18:25 cwhite: reduce codfw replicas from 2 to 1 for logstash-(webrequest|k8s) partitions. Make space for failed logstash2027 - T316996
  • 17:50 root@cumin1001: START - Cookbook sre.network.prepare-upgrade
  • 17:48 root@cumin1001: START - Cookbook sre.network.prepare-upgrade
  • 17:23 moritzm: installing dpkg bugfix updates from bullseye point release
  • 17:18 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-logging1004']
  • 17:16 krinkle@deploy1002: Synchronized php-1.39.0-wmf.27/resources/src/: I0516527d5cc0 (duration: 03m 50s)
  • 17:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 17:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 17:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 17:14 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 17:11 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-logging1004']
  • 17:06 krinkle@deploy1002: Synchronized wmf-config/: (no justification provided) (duration: 03m 50s)
  • 17:02 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-logging1004']
  • 17:00 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2141.codfw.wmnet with reason: Maintenance
  • 17:00 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2141.codfw.wmnet with reason: Maintenance
  • 17:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2158 (T314041)', diff saved to https://phabricator.wikimedia.org/P33969 and previous config saved to /var/cache/conftool/dbconfig/20220906-165958-ladsgroup.json
  • 16:58 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 16:57 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 16:57 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 16:56 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 16:55 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-logging1004']
  • 16:51 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 16:50 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 16:50 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 16:50 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 16:47 jelto@cumin1001: END (PASS) - Cookbook sre.gitlab.reboot-runner (exit_code=0) rolling reboot on A:gitlab-runner
  • 16:45 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['kafka-logging1004']
  • 16:44 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-logging1004']
  • 16:44 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['kafka-logging1004']
  • 16:42 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-logging1004']
  • 16:36 pt1979@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts ['kafka-logging1004']
  • 16:25 btullis@deploy1002: helmfile [eqiad] DONE helmfile.d/services/datahub: sync on main
  • 16:24 btullis@deploy1002: helmfile [eqiad] START helmfile.d/services/datahub: apply on main
  • 16:23 btullis@deploy1002: helmfile [codfw] DONE helmfile.d/services/datahub: sync on main
  • 16:22 btullis@deploy1002: helmfile [codfw] START helmfile.d/services/datahub: apply on main
  • 16:22 btullis@deploy1002: helmfile [staging] DONE helmfile.d/services/datahub: sync on main
  • 16:20 btullis@deploy1002: helmfile [staging] START helmfile.d/services/datahub: apply on main
  • 16:18 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['kafka-logging1004']
  • 16:12 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.prepare-upgrade (exit_code=0)
  • 16:12 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host kafka-logging1004.mgmt.eqiad.wmnet with reboot policy FORCED
  • 16:01 jelto@cumin1001: START - Cookbook sre.gitlab.reboot-runner rolling reboot on A:gitlab-runner
  • 15:50 marostegui@cumin1001: dbctl commit (dc=all): 'db1180 (re)pooling @ 100%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P33968 and previous config saved to /var/cache/conftool/dbconfig/20220906-154959-root.json
  • 15:48 ayounsi@cumin1001: START - Cookbook sre.network.prepare-upgrade
  • 15:44 root@cumin1001: END (FAIL) - Cookbook sre.network.prepare-upgrade (exit_code=99)
  • 15:43 root@cumin1001: START - Cookbook sre.network.prepare-upgrade
  • 15:43 root@cumin1001: END (FAIL) - Cookbook sre.network.prepare-upgrade (exit_code=99)
  • 15:43 root@cumin1001: START - Cookbook sre.network.prepare-upgrade
  • 15:34 marostegui@cumin1001: dbctl commit (dc=all): 'db1180 (re)pooling @ 75%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P33967 and previous config saved to /var/cache/conftool/dbconfig/20220906-153454-root.json
  • 15:21 jelto@cumin1001: END (FAIL) - Cookbook sre.gitlab.reboot-runner (exit_code=1) rolling reboot on A:gitlab-runner
  • 15:20 jelto@cumin1001: START - Cookbook sre.gitlab.reboot-runner rolling reboot on A:gitlab-runner
  • 15:19 marostegui@cumin1001: dbctl commit (dc=all): 'db1180 (re)pooling @ 50%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P33966 and previous config saved to /var/cache/conftool/dbconfig/20220906-151950-root.json
  • 15:15 claime: Set wtp10[41-43].eqiad.wmnet inactive pending decommission T317025
  • 15:14 cgoubert@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1043.eqiad.wmnet
  • 15:14 cgoubert@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1042.eqiad.wmnet
  • 15:14 cgoubert@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1041.eqiad.wmnet
  • 15:12 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on wtp[1041-1043].eqiad.wmnet with reason: Downtiming replaced wtp servers
  • 15:12 cgoubert@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on wtp[1041-1043].eqiad.wmnet with reason: Downtiming replaced wtp servers
  • 15:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2158 (T314041)', diff saved to https://phabricator.wikimedia.org/P33965 and previous config saved to /var/cache/conftool/dbconfig/20220906-150953-ladsgroup.json
  • 15:09 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2095.codfw.wmnet with reason: Maintenance
  • 15:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2095.codfw.wmnet with reason: Maintenance
  • 15:09 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2158.codfw.wmnet with reason: Maintenance
  • 15:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2158.codfw.wmnet with reason: Maintenance
  • 15:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3316 (T314041)', diff saved to https://phabricator.wikimedia.org/P33964 and previous config saved to /var/cache/conftool/dbconfig/20220906-150928-ladsgroup.json
  • 15:08 claime: depooled wtp1045.eqiad.wmnet from parsoid cluster T307219
  • 15:04 marostegui@cumin1001: dbctl commit (dc=all): 'db1180 (re)pooling @ 25%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P33963 and previous config saved to /var/cache/conftool/dbconfig/20220906-150445-root.json
  • 14:58 claime: pooled parse1012.eqiad.wmnet (php 7.4 only) in parsoid cluster T307219
  • 14:55 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for parse1012.eqiad.wmnet
  • 14:55 cgoubert@cumin1001: START - Cookbook sre.hosts.remove-downtime for parse1012.eqiad.wmnet
  • 14:53 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host kafka-logging1004.mgmt.eqiad.wmnet with reboot policy FORCED
  • 14:49 marostegui@cumin1001: dbctl commit (dc=all): 'db1180 (re)pooling @ 10%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P33962 and previous config saved to /var/cache/conftool/dbconfig/20220906-144940-root.json
  • 14:46 cgoubert@puppetmaster1001: conftool action : set/pooled=no:weight=10; selector: dc=eqiad,cluster=parsoid,name=parse1012.eqiad.wmnet
  • 14:39 claime: depooled wtp1044.eqiad.wmnet from parsoid cluster T307219
  • 14:39 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:37 pt1979@cumin2002: START - Cookbook sre.dns.netbox
  • 14:36 pt1979@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host kafka-logging1004
  • 14:36 pt1979@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host kafka-logging1004
  • 14:34 marostegui@cumin1001: dbctl commit (dc=all): 'db1180 (re)pooling @ 5%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P33961 and previous config saved to /var/cache/conftool/dbconfig/20220906-143435-root.json
  • 14:30 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/api-gateway: sync
  • 14:30 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/api-gateway: sync
  • 14:29 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: sync
  • 14:29 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/api-gateway: sync
  • 14:29 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/api-gateway: sync
  • 14:29 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/api-gateway: sync
  • 14:28 claime: pooled parse1011.eqiad.wmnet (php 7.4 only) in parsoid cluster T307219
  • 14:27 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for parse1011.eqiad.wmnet
  • 14:27 cgoubert@cumin1001: START - Cookbook sre.hosts.remove-downtime for parse1011.eqiad.wmnet
  • 14:15 jayme@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 14:15 jayme@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 14:08 cgoubert@puppetmaster1001: conftool action : set/pooled=no:weight=10; selector: dc=eqiad,cluster=parsoid,name=parse1011.eqiad.wmnet
  • 13:56 claime: depooled wtp1043.eqiad.wmnet from parsoid cluster T307219
  • 13:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1135 (T312863)', diff saved to https://phabricator.wikimedia.org/P33960 and previous config saved to /var/cache/conftool/dbconfig/20220906-134545-ladsgroup.json
  • 13:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1135.eqiad.wmnet with reason: Maintenance
  • 13:45 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1135.eqiad.wmnet with reason: Maintenance
  • 13:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 (T312863)', diff saved to https://phabricator.wikimedia.org/P33959 and previous config saved to /var/cache/conftool/dbconfig/20220906-134523-ladsgroup.json
  • 13:35 claime: pooled parse1010.eqiad.wmnet (php 7.4 only) in parsoid cluster T307219
  • 13:33 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for parse1010.eqiad.wmnet
  • 13:33 cgoubert@cumin1001: START - Cookbook sre.hosts.remove-downtime for parse1010.eqiad.wmnet
  • 13:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P33958 and previous config saved to /var/cache/conftool/dbconfig/20220906-133017-ladsgroup.json
  • 13:26 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1180 T316342', diff saved to https://phabricator.wikimedia.org/P33956 and previous config saved to /var/cache/conftool/dbconfig/20220906-132627-root.json
  • 13:21 TheresNoTime: closing UTC afternoon backport window
  • 13:19 samtar@deploy1002: Synchronized wmf-config/CommonSettings-labs.php: Config: CommonSettings-labs: Load Phonos extension (T314294) (duration: 04m 05s)
  • 13:17 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 13:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2169:3316 (T314041)', diff saved to https://phabricator.wikimedia.org/P33954 and previous config saved to /var/cache/conftool/dbconfig/20220906-131715-ladsgroup.json
  • 13:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
  • 13:16 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2169.codfw.wmnet with reason: Maintenance
  • 13:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2180 (T314041)', diff saved to https://phabricator.wikimedia.org/P33953 and previous config saved to /var/cache/conftool/dbconfig/20220906-131654-ladsgroup.json
  • 13:16 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 13:16 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 13:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134', diff saved to https://phabricator.wikimedia.org/P33952 and previous config saved to /var/cache/conftool/dbconfig/20220906-131510-ladsgroup.json
  • 13:12 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 13:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1134 (T312863)', diff saved to https://phabricator.wikimedia.org/P33951 and previous config saved to /var/cache/conftool/dbconfig/20220906-130004-ladsgroup.json
  • 12:31 marostegui@cumin1001: dbctl commit (dc=all): 'db1138 (re)pooling @ 100%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P33950 and previous config saved to /var/cache/conftool/dbconfig/20220906-123145-root.json
  • 12:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on puppetdb2002.codfw.wmnet with reason: Temporarily stop puppetdb/postgres
  • 12:29 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 0:15:00 on puppetdb2002.codfw.wmnet with reason: Temporarily stop puppetdb/postgres
  • 12:16 marostegui@cumin1001: dbctl commit (dc=all): 'db1138 (re)pooling @ 75%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P33949 and previous config saved to /var/cache/conftool/dbconfig/20220906-121640-root.json
  • 12:15 XioNoX: repool ulsfo - T295690
  • 12:14 cgoubert@puppetmaster1001: conftool action : set/pooled=no:weight=10; selector: dc=eqiad,cluster=parsoid,name=parse1010.eqiad.wmnet
  • 12:05 claime: Set wtp10[38-40].eqiad.wmnet inactive pending decommission T317025
  • 12:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2180 (T314041)', diff saved to https://phabricator.wikimedia.org/P33948 and previous config saved to /var/cache/conftool/dbconfig/20220906-120433-ladsgroup.json
  • 12:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2180.codfw.wmnet with reason: Maintenance
  • 12:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2180.codfw.wmnet with reason: Maintenance
  • 12:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2171:3316 (T314041)', diff saved to https://phabricator.wikimedia.org/P33947 and previous config saved to /var/cache/conftool/dbconfig/20220906-120412-ladsgroup.json
  • 12:03 cgoubert@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1040.eqiad.wmnet
  • 12:03 cgoubert@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1039.eqiad.wmnet
  • 12:03 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on wtp[1039-1040].eqiad.wmnet with reason: Downtiming replaced wtp servers
  • 12:02 cgoubert@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on wtp[1039-1040].eqiad.wmnet with reason: Downtiming replaced wtp servers
  • 12:01 marostegui@cumin1001: dbctl commit (dc=all): 'db1138 (re)pooling @ 50%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P33946 and previous config saved to /var/cache/conftool/dbconfig/20220906-120135-root.json
  • 12:01 claime: depooled wtp1042.eqiad.wmnet from parsoid cluster T307219
  • 11:46 marostegui@cumin1001: dbctl commit (dc=all): 'db1138 (re)pooling @ 25%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P33945 and previous config saved to /var/cache/conftool/dbconfig/20220906-114631-root.json
  • 11:35 jayme@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 11:34 jayme@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 11:31 marostegui@cumin1001: dbctl commit (dc=all): 'db1138 (re)pooling @ 10%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P33944 and previous config saved to /var/cache/conftool/dbconfig/20220906-113126-root.json
  • 11:27 claime: pooled parse1009.eqiad.wmnet (php 7.4 only) in parsoid cluster T307219
  • 11:26 jbond@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "sync data - jbond@cumin2002"
  • 11:26 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on 12 hosts with reason: Downtime pending inclusion in production
  • 11:26 cgoubert@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on 12 hosts with reason: Downtime pending inclusion in production
  • 11:25 jbond@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "sync data - jbond@cumin2002"
  • 11:17 XioNoX: put cr4-ulsfo back in service - T295690
  • 11:16 marostegui@cumin1001: dbctl commit (dc=all): 'db1138 (re)pooling @ 5%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P33943 and previous config saved to /var/cache/conftool/dbconfig/20220906-111621-root.json
  • 11:12 jayme@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 11:12 jayme@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 11:12 jayme@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 11:11 jayme@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 11:11 moritzm: installing ghostscript updates on stretch
  • 11:06 XioNoX: restart cr4-ulsfo for software upgrade - T295690
  • 11:01 marostegui@cumin1001: dbctl commit (dc=all): 'db1138 (re)pooling @ 4%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P33942 and previous config saved to /var/cache/conftool/dbconfig/20220906-110116-root.json
  • 10:58 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33941 and previous config saved to /var/cache/conftool/dbconfig/20220906-105841-root.json
  • 10:58 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for parse1009.eqiad.wmnet
  • 10:57 cgoubert@cumin1001: START - Cookbook sre.hosts.remove-downtime for parse1009.eqiad.wmnet
  • 10:52 moritzm: uploaded ghostscript 9.26a~dfsg-0+deb9u9+wmf1 to apt.wikimedia.org
  • 10:46 marostegui@cumin1001: dbctl commit (dc=all): 'db1138 (re)pooling @ 3%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P33940 and previous config saved to /var/cache/conftool/dbconfig/20220906-104611-root.json
  • 10:44 btullis@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
  • 10:44 btullis@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
  • 10:43 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33939 and previous config saved to /var/cache/conftool/dbconfig/20220906-104336-root.json
  • 10:42 XioNoX: drain traffic from cr4-ulsfo - T295690
  • 10:40 jayme: switched primary kube-controller-manager from kubemaster1001 to kubemaster1002
  • 10:34 marostegui@cumin1001: dbctl commit (dc=all): 'db1188 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33938 and previous config saved to /var/cache/conftool/dbconfig/20220906-103402-root.json
  • 10:31 marostegui@cumin1001: dbctl commit (dc=all): 'db1138 (re)pooling @ 2%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P33937 and previous config saved to /var/cache/conftool/dbconfig/20220906-103104-root.json
  • 10:30 marostegui@cumin1001: dbctl commit (dc=all): 'db1174 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33936 and previous config saved to /var/cache/conftool/dbconfig/20220906-103017-root.json
  • 10:29 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33935 and previous config saved to /var/cache/conftool/dbconfig/20220906-102919-root.json
  • 10:28 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33934 and previous config saved to /var/cache/conftool/dbconfig/20220906-102831-root.json
  • 10:27 btullis@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
  • 10:27 btullis@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
  • 10:26 XioNoX: put cr3-ulsfo back in service - T295690
  • 10:25 btullis@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
  • 10:25 btullis@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
  • 10:21 marostegui@cumin1001: dbctl commit (dc=all): 'db1103 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33932 and previous config saved to /var/cache/conftool/dbconfig/20220906-102152-root.json
  • 10:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1188 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33931 and previous config saved to /var/cache/conftool/dbconfig/20220906-101858-root.json
  • 10:15 marostegui@cumin1001: dbctl commit (dc=all): 'db1138 (re)pooling @ 1%: Repooling after cloning another host', diff saved to https://phabricator.wikimedia.org/P33930 and previous config saved to /var/cache/conftool/dbconfig/20220906-101559-root.json
  • 10:15 marostegui@cumin1001: dbctl commit (dc=all): 'db1174 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33929 and previous config saved to /var/cache/conftool/dbconfig/20220906-101513-root.json
  • 10:14 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33928 and previous config saved to /var/cache/conftool/dbconfig/20220906-101414-root.json
  • 10:13 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33927 and previous config saved to /var/cache/conftool/dbconfig/20220906-101326-root.json
  • 10:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2171:3316 (T314041)', diff saved to https://phabricator.wikimedia.org/P33926 and previous config saved to /var/cache/conftool/dbconfig/20220906-101129-ladsgroup.json
  • 10:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2171.codfw.wmnet with reason: Maintenance
  • 10:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2171.codfw.wmnet with reason: Maintenance
  • 10:06 marostegui@cumin1001: dbctl commit (dc=all): 'db1107 (re)pooling @ 100%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P33925 and previous config saved to /var/cache/conftool/dbconfig/20220906-100656-root.json
  • 10:06 marostegui@cumin1001: dbctl commit (dc=all): 'db1103 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33924 and previous config saved to /var/cache/conftool/dbconfig/20220906-100647-root.json
  • 10:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1188 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33923 and previous config saved to /var/cache/conftool/dbconfig/20220906-100353-root.json
  • 10:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1174 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33921 and previous config saved to /var/cache/conftool/dbconfig/20220906-100008-root.json
  • 09:59 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33920 and previous config saved to /var/cache/conftool/dbconfig/20220906-095909-root.json
  • 09:58 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33919 and previous config saved to /var/cache/conftool/dbconfig/20220906-095821-root.json
  • 09:57 marostegui@cumin1001: dbctl commit (dc=all): 'db1130 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33918 and previous config saved to /var/cache/conftool/dbconfig/20220906-095722-root.json
  • 09:57 cgoubert@puppetmaster1001: conftool action : set/pooled=no:weight=10; selector: dc=eqiad,cluster=parsoid,name=parse1009.eqiad.wmnet
  • 09:55 claime: depooled wtp1041.eqiad.wmnet from parsoid cluster T307219
  • 09:51 marostegui@cumin1001: dbctl commit (dc=all): 'db1107 (re)pooling @ 75%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P33917 and previous config saved to /var/cache/conftool/dbconfig/20220906-095151-root.json
  • 09:51 marostegui@cumin1001: dbctl commit (dc=all): 'db1103 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33916 and previous config saved to /var/cache/conftool/dbconfig/20220906-095143-root.json
  • 09:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1188 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33915 and previous config saved to /var/cache/conftool/dbconfig/20220906-094848-root.json
  • 09:48 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: sync
  • 09:47 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/api-gateway: sync
  • 09:46 hnowlan@deploy1002: helmfile [codfw] DONE helmfile.d/services/api-gateway: sync
  • 09:45 hnowlan@deploy1002: helmfile [codfw] START helmfile.d/services/api-gateway: sync
  • 09:45 marostegui@cumin1001: dbctl commit (dc=all): 'db1174 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33914 and previous config saved to /var/cache/conftool/dbconfig/20220906-094503-root.json
  • 09:44 claime: pooled parse1008.eqiad.wmnet (php 7.4 only) in parsoid cluster T307219
  • 09:44 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33913 and previous config saved to /var/cache/conftool/dbconfig/20220906-094404-root.json
  • 09:43 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33912 and previous config saved to /var/cache/conftool/dbconfig/20220906-094316-root.json
  • 09:42 marostegui@cumin1001: dbctl commit (dc=all): 'db1130 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33911 and previous config saved to /var/cache/conftool/dbconfig/20220906-094217-root.json
  • 09:40 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for parse1008.eqiad.wmnet
  • 09:40 cgoubert@cumin1001: START - Cookbook sre.hosts.remove-downtime for parse1008.eqiad.wmnet
  • 09:36 marostegui@cumin1001: dbctl commit (dc=all): 'db1107 (re)pooling @ 50%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P33910 and previous config saved to /var/cache/conftool/dbconfig/20220906-093646-root.json
  • 09:36 marostegui@cumin1001: dbctl commit (dc=all): 'db1103 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33909 and previous config saved to /var/cache/conftool/dbconfig/20220906-093638-root.json
  • 09:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1188 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33908 and previous config saved to /var/cache/conftool/dbconfig/20220906-093343-root.json
  • 09:31 cgoubert@puppetmaster1001: conftool action : set/pooled=no:weight=10; selector: dc=eqiad,cluster=parsoid,name=parse1008.eqiad.wmnet
  • 09:29 marostegui@cumin1001: dbctl commit (dc=all): 'db1174 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33907 and previous config saved to /var/cache/conftool/dbconfig/20220906-092958-root.json
  • 09:29 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 09:29 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33906 and previous config saved to /var/cache/conftool/dbconfig/20220906-092900-root.json
  • 09:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 09:28 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 09:28 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 4%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33905 and previous config saved to /var/cache/conftool/dbconfig/20220906-092812-root.json
  • 09:27 marostegui@cumin1001: dbctl commit (dc=all): 'db1130 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33904 and previous config saved to /var/cache/conftool/dbconfig/20220906-092712-root.json
  • 09:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2172 (T314041)', diff saved to https://phabricator.wikimedia.org/P33903 and previous config saved to /var/cache/conftool/dbconfig/20220906-092626-ladsgroup.json
  • 09:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: Maintenance
  • 09:26 btullis@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
  • 09:26 btullis@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
  • 09:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2172.codfw.wmnet with reason: Maintenance
  • 09:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2119 (T314041)', diff saved to https://phabricator.wikimedia.org/P33902 and previous config saved to /var/cache/conftool/dbconfig/20220906-092604-ladsgroup.json
  • 09:25 btullis@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
  • 09:25 btullis@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
  • 09:25 btullis@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
  • 09:24 btullis@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
  • 09:24 btullis@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
  • 09:24 btullis@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
  • 09:24 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 09:22 btullis: installing istio configs to dse-k8s cluster
  • 09:21 marostegui@cumin1001: dbctl commit (dc=all): 'db1107 (re)pooling @ 25%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P33901 and previous config saved to /var/cache/conftool/dbconfig/20220906-092141-root.json
  • 09:21 marostegui@cumin1001: dbctl commit (dc=all): 'db1103 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33900 and previous config saved to /var/cache/conftool/dbconfig/20220906-092133-root.json
  • 09:19 hnowlan@deploy1002: helmfile [staging] DONE helmfile.d/services/api-gateway: sync
  • 09:19 hnowlan@deploy1002: helmfile [staging] START helmfile.d/services/api-gateway: sync
  • 09:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1188 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33899 and previous config saved to /var/cache/conftool/dbconfig/20220906-091838-root.json
  • 09:14 marostegui@cumin1001: dbctl commit (dc=all): 'db1174 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33898 and previous config saved to /var/cache/conftool/dbconfig/20220906-091453-root.json
  • 09:13 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33897 and previous config saved to /var/cache/conftool/dbconfig/20220906-091355-root.json
  • 09:13 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 3%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33896 and previous config saved to /var/cache/conftool/dbconfig/20220906-091307-root.json
  • 09:12 marostegui@cumin1001: dbctl commit (dc=all): 'db1130 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33895 and previous config saved to /var/cache/conftool/dbconfig/20220906-091207-root.json
  • 09:06 marostegui@cumin1001: dbctl commit (dc=all): 'db1107 (re)pooling @ 10%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P33894 and previous config saved to /var/cache/conftool/dbconfig/20220906-090637-root.json
  • 09:06 marostegui@cumin1001: dbctl commit (dc=all): 'db1103 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33893 and previous config saved to /var/cache/conftool/dbconfig/20220906-090628-root.json
  • 09:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1188 (re)pooling @ 4%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33892 and previous config saved to /var/cache/conftool/dbconfig/20220906-090333-root.json
  • 08:59 marostegui@cumin1001: dbctl commit (dc=all): 'db1174 (re)pooling @ 4%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33891 and previous config saved to /var/cache/conftool/dbconfig/20220906-085948-root.json
  • 08:58 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 4%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33890 and previous config saved to /var/cache/conftool/dbconfig/20220906-085850-root.json
  • 08:58 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 2%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33889 and previous config saved to /var/cache/conftool/dbconfig/20220906-085802-root.json
  • 08:57 marostegui@cumin1001: dbctl commit (dc=all): 'db1130 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33888 and previous config saved to /var/cache/conftool/dbconfig/20220906-085703-root.json
  • 08:51 marostegui@cumin1001: dbctl commit (dc=all): 'db1107 (re)pooling @ 5%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P33887 and previous config saved to /var/cache/conftool/dbconfig/20220906-085132-root.json
  • 08:51 marostegui@cumin1001: dbctl commit (dc=all): 'db1103 (re)pooling @ 4%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33886 and previous config saved to /var/cache/conftool/dbconfig/20220906-085123-root.json
  • 08:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1188 (re)pooling @ 3%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33885 and previous config saved to /var/cache/conftool/dbconfig/20220906-084829-root.json
  • 08:44 marostegui@cumin1001: dbctl commit (dc=all): 'db1174 (re)pooling @ 3%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33884 and previous config saved to /var/cache/conftool/dbconfig/20220906-084443-root.json
  • 08:43 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 3%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33883 and previous config saved to /var/cache/conftool/dbconfig/20220906-084345-root.json
  • 08:42 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 1%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33882 and previous config saved to /var/cache/conftool/dbconfig/20220906-084257-root.json
  • 08:42 XioNoX: restart cr3-ulsfo for software upgrade - T295690
  • 08:41 marostegui@cumin1001: dbctl commit (dc=all): 'db1130 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33881 and previous config saved to /var/cache/conftool/dbconfig/20220906-084158-root.json
  • 08:36 marostegui@cumin1001: dbctl commit (dc=all): 'db1107 (re)pooling @ 4%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P33880 and previous config saved to /var/cache/conftool/dbconfig/20220906-083627-root.json
  • 08:36 marostegui@cumin1001: dbctl commit (dc=all): 'db1103 (re)pooling @ 3%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33879 and previous config saved to /var/cache/conftool/dbconfig/20220906-083619-root.json
  • 08:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1188 (re)pooling @ 2%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33878 and previous config saved to /var/cache/conftool/dbconfig/20220906-083324-root.json
  • 08:30 marostegui@cumin1001: dbctl commit (dc=all): 'db1143 (re)pooling @ 100%: Repooling again', diff saved to https://phabricator.wikimedia.org/P33876 and previous config saved to /var/cache/conftool/dbconfig/20220906-083019-root.json
  • 08:30 marostegui@cumin1001: dbctl commit (dc=all): 'db1132 (re)pooling @ 100%: Repooling again', diff saved to https://phabricator.wikimedia.org/P33875 and previous config saved to /var/cache/conftool/dbconfig/20220906-083002-root.json
  • 08:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1138 T316342', diff saved to https://phabricator.wikimedia.org/P33874 and previous config saved to /var/cache/conftool/dbconfig/20220906-082954-root.json
  • 08:29 marostegui@cumin1001: dbctl commit (dc=all): 'db1174 (re)pooling @ 2%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33873 and previous config saved to /var/cache/conftool/dbconfig/20220906-082939-root.json
  • 08:28 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 2%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33872 and previous config saved to /var/cache/conftool/dbconfig/20220906-082841-root.json
  • 08:26 marostegui@cumin1001: dbctl commit (dc=all): 'db1130 (re)pooling @ 1%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33871 and previous config saved to /var/cache/conftool/dbconfig/20220906-082653-root.json
  • 08:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
  • 08:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance
  • 08:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 (T314041)', diff saved to https://phabricator.wikimedia.org/P33870 and previous config saved to /var/cache/conftool/dbconfig/20220906-082507-ladsgroup.json
  • 08:24 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.cf (exit_code=0)
  • 08:23 ayounsi@cumin1001: START - Cookbook sre.network.cf
  • 08:21 marostegui@cumin1001: dbctl commit (dc=all): 'db1107 (re)pooling @ 3%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P33869 and previous config saved to /var/cache/conftool/dbconfig/20220906-082122-root.json
  • 08:21 marostegui@cumin1001: dbctl commit (dc=all): 'db1103 (re)pooling @ 2%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33868 and previous config saved to /var/cache/conftool/dbconfig/20220906-082114-root.json
  • 08:18 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 08:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1188 (re)pooling @ 1%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33867 and previous config saved to /var/cache/conftool/dbconfig/20220906-081819-root.json
  • 08:17 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 08:17 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 08:16 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 08:15 marostegui@cumin1001: dbctl commit (dc=all): 'db1143 (re)pooling @ 75%: Repooling again', diff saved to https://phabricator.wikimedia.org/P33866 and previous config saved to /var/cache/conftool/dbconfig/20220906-081514-root.json
  • 08:14 marostegui@cumin1001: dbctl commit (dc=all): 'db1132 (re)pooling @ 75%: Repooling again', diff saved to https://phabricator.wikimedia.org/P33865 and previous config saved to /var/cache/conftool/dbconfig/20220906-081458-root.json
  • 08:14 marostegui@cumin1001: dbctl commit (dc=all): 'db1174 (re)pooling @ 1%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33864 and previous config saved to /var/cache/conftool/dbconfig/20220906-081434-root.json
  • 08:13 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 1%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33863 and previous config saved to /var/cache/conftool/dbconfig/20220906-081336-root.json
  • 08:11 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 08:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P33862 and previous config saved to /var/cache/conftool/dbconfig/20220906-081001-ladsgroup.json
  • 08:09 jnuche@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.39.0-wmf.28 refs T314189
  • 08:08 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 08:08 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 08:06 marostegui@cumin1001: dbctl commit (dc=all): 'db1107 (re)pooling @ 2%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P33861 and previous config saved to /var/cache/conftool/dbconfig/20220906-080618-root.json
  • 08:06 marostegui@cumin1001: dbctl commit (dc=all): 'db1103 (re)pooling @ 1%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P33860 and previous config saved to /var/cache/conftool/dbconfig/20220906-080609-root.json
  • 08:04 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.prepare-upgrade (exit_code=0)
  • 08:02 marostegui: Set x1 back to binlog_format=ROW
  • 08:01 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 08:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1143 (re)pooling @ 50%: Repooling again', diff saved to https://phabricator.wikimedia.org/P33859 and previous config saved to /var/cache/conftool/dbconfig/20220906-080009-root.json
  • 07:59 marostegui@cumin1001: dbctl commit (dc=all): 'db1132 (re)pooling @ 50%: Repooling again', diff saved to https://phabricator.wikimedia.org/P33858 and previous config saved to /var/cache/conftool/dbconfig/20220906-075953-root.json
  • 07:58 ayounsi@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cr3-ulsfo.wikimedia.org with reason: router upgrade
  • 07:58 ayounsi@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cr3-ulsfo.wikimedia.org with reason: router upgrade
  • 07:58 jnuche@deploy1002: Pruned MediaWiki: 1.39.0-wmf.24, 1.39.0-wmf.26 (duration: 02m 48s)
  • 07:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P33857 and previous config saved to /var/cache/conftool/dbconfig/20220906-075455-ladsgroup.json
  • 07:52 XioNoX: depool ulsfo for routers upgrade - T295690
  • 07:51 marostegui@cumin1001: dbctl commit (dc=all): 'db1107 (re)pooling @ 1%: Pooling for the first time', diff saved to https://phabricator.wikimedia.org/P33856 and previous config saved to /var/cache/conftool/dbconfig/20220906-075113-root.json
  • 07:45 marostegui@cumin1001: dbctl commit (dc=all): 'db1143 (re)pooling @ 25%: Repooling again', diff saved to https://phabricator.wikimedia.org/P33855 and previous config saved to /var/cache/conftool/dbconfig/20220906-074504-root.json
  • 07:44 marostegui@cumin1001: dbctl commit (dc=all): 'db1132 (re)pooling @ 25%: Repooling again', diff saved to https://phabricator.wikimedia.org/P33854 and previous config saved to /var/cache/conftool/dbconfig/20220906-074448-root.json
  • 07:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316 (T314041)', diff saved to https://phabricator.wikimedia.org/P33853 and previous config saved to /var/cache/conftool/dbconfig/20220906-073948-ladsgroup.json
  • 07:34 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1130 T316342', diff saved to https://phabricator.wikimedia.org/P33851 and previous config saved to /var/cache/conftool/dbconfig/20220906-073434-root.json
  • 07:30 marostegui@cumin1001: dbctl commit (dc=all): 'db1143 (re)pooling @ 10%: Repooling again', diff saved to https://phabricator.wikimedia.org/P33850 and previous config saved to /var/cache/conftool/dbconfig/20220906-072959-root.json
  • 07:29 marostegui@cumin1001: dbctl commit (dc=all): 'db1132 (re)pooling @ 10%: Repooling again', diff saved to https://phabricator.wikimedia.org/P33849 and previous config saved to /var/cache/conftool/dbconfig/20220906-072943-root.json
  • 07:26 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.prepare-upgrade (exit_code=0)
  • 07:25 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 07:19 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 07:19 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 07:14 marostegui@cumin1001: dbctl commit (dc=all): 'db1143 (re)pooling @ 5%: Repooling again', diff saved to https://phabricator.wikimedia.org/P33848 and previous config saved to /var/cache/conftool/dbconfig/20220906-071455-root.json
  • 07:14 marostegui@cumin1001: dbctl commit (dc=all): 'db1132 (re)pooling @ 5%: Repooling again', diff saved to https://phabricator.wikimedia.org/P33847 and previous config saved to /var/cache/conftool/dbconfig/20220906-071438-root.json
  • 07:12 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 07:11 oblivian@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: Move 1 of 6 users to php 7.4 (T271736) (duration: 04m 06s)
  • 06:59 marostegui@cumin1001: dbctl commit (dc=all): 'db1143 (re)pooling @ 4%: Repooling again', diff saved to https://phabricator.wikimedia.org/P33846 and previous config saved to /var/cache/conftool/dbconfig/20220906-065950-root.json
  • 06:59 marostegui@cumin1001: dbctl commit (dc=all): 'db1132 (re)pooling @ 4%: Repooling again', diff saved to https://phabricator.wikimedia.org/P33845 and previous config saved to /var/cache/conftool/dbconfig/20220906-065934-root.json
  • 06:53 ayounsi@cumin1001: START - Cookbook sre.network.prepare-upgrade
  • 06:44 marostegui@cumin1001: dbctl commit (dc=all): 'db1143 (re)pooling @ 3%: Repooling again', diff saved to https://phabricator.wikimedia.org/P33844 and previous config saved to /var/cache/conftool/dbconfig/20220906-064445-root.json
  • 06:44 marostegui@cumin1001: dbctl commit (dc=all): 'db1132 (re)pooling @ 3%: Repooling again', diff saved to https://phabricator.wikimedia.org/P33843 and previous config saved to /var/cache/conftool/dbconfig/20220906-064429-root.json
  • 06:40 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1189 T316342', diff saved to https://phabricator.wikimedia.org/P33841 and previous config saved to /var/cache/conftool/dbconfig/20220906-064021-root.json
  • 06:33 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1188 T316342', diff saved to https://phabricator.wikimedia.org/P33839 and previous config saved to /var/cache/conftool/dbconfig/20220906-063322-root.json
  • 06:29 marostegui@cumin1001: dbctl commit (dc=all): 'db1143 (re)pooling @ 2%: Repooling again', diff saved to https://phabricator.wikimedia.org/P33838 and previous config saved to /var/cache/conftool/dbconfig/20220906-062940-root.json
  • 06:29 marostegui@cumin1001: dbctl commit (dc=all): 'db1132 (re)pooling @ 2%: Repooling again', diff saved to https://phabricator.wikimedia.org/P33837 and previous config saved to /var/cache/conftool/dbconfig/20220906-062924-root.json
  • 06:15 ayounsi@cumin1001: START - Cookbook sre.network.prepare-upgrade
  • 06:14 marostegui@cumin1001: dbctl commit (dc=all): 'db1143 (re)pooling @ 1%: Repooling again', diff saved to https://phabricator.wikimedia.org/P33836 and previous config saved to /var/cache/conftool/dbconfig/20220906-061434-root.json
  • 06:14 marostegui@cumin1001: dbctl commit (dc=all): 'db1132 (re)pooling @ 1%: Repooling again', diff saved to https://phabricator.wikimedia.org/P33835 and previous config saved to /var/cache/conftool/dbconfig/20220906-061419-root.json
  • 06:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1134 (T312863)', diff saved to https://phabricator.wikimedia.org/P33833 and previous config saved to /var/cache/conftool/dbconfig/20220906-061150-ladsgroup.json
  • 06:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1134.eqiad.wmnet with reason: Maintenance
  • 06:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1134.eqiad.wmnet with reason: Maintenance
  • 06:10 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1118.eqiad.wmnet with reason: Maintenance
  • 06:10 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1118.eqiad.wmnet with reason: Maintenance
  • 06:08 marostegui@cumin1001: dbctl commit (dc=all): 'Give some weight to current x1 eqiad master', diff saved to https://phabricator.wikimedia.org/P33832 and previous config saved to /var/cache/conftool/dbconfig/20220906-060833-root.json
  • 06:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1103 T316745', diff saved to https://phabricator.wikimedia.org/P33831 and previous config saved to /var/cache/conftool/dbconfig/20220906-060815-root.json
  • 06:06 marostegui@cumin1001: dbctl commit (dc=all): 'Promote db1120 to x1 primary T316745', diff saved to https://phabricator.wikimedia.org/P33830 and previous config saved to /var/cache/conftool/dbconfig/20220906-060602-root.json
  • 06:05 marostegui: Starting x1 eqiad failover from db1103 to db1120 - T316745
  • 06:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depool db1118 T316623', diff saved to https://phabricator.wikimedia.org/P33829 and previous config saved to /var/cache/conftool/dbconfig/20220906-060418-ladsgroup.json
  • 06:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Promote db1163 to s1 primary and set section read-write T316623', diff saved to https://phabricator.wikimedia.org/P33828 and previous config saved to /var/cache/conftool/dbconfig/20220906-060055-ladsgroup.json
  • 06:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Set s1 eqiad as read-only for maintenance - T316623', diff saved to https://phabricator.wikimedia.org/P33827 and previous config saved to /var/cache/conftool/dbconfig/20220906-060032-ladsgroup.json
  • 06:00 Amir1: Starting s1 eqiad failover from db1118 to db1163 - T316623
  • 05:32 marostegui@cumin1001: dbctl commit (dc=all): 'Add db1107 to dbctl depooled T316870', diff saved to https://phabricator.wikimedia.org/P33826 and previous config saved to /var/cache/conftool/dbconfig/20220906-053238-marostegui.json
  • 05:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1098:3316 (T314041)', diff saved to https://phabricator.wikimedia.org/P33825 and previous config saved to /var/cache/conftool/dbconfig/20220906-052609-ladsgroup.json
  • 05:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1098.eqiad.wmnet with reason: Maintenance
  • 05:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1098.eqiad.wmnet with reason: Maintenance
  • 05:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 (T314041)', diff saved to https://phabricator.wikimedia.org/P33824 and previous config saved to /var/cache/conftool/dbconfig/20220906-052547-ladsgroup.json
  • 05:13 marostegui@cumin1001: dbctl commit (dc=all): 'Set db1120 with weight 0 T316745', diff saved to https://phabricator.wikimedia.org/P33823 and previous config saved to /var/cache/conftool/dbconfig/20220906-051304-root.json
  • 05:12 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 10 hosts with reason: Primary switchover x1 T316745
  • 05:12 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 10 hosts with reason: Primary switchover x1 T316745
  • 05:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P33822 and previous config saved to /var/cache/conftool/dbconfig/20220906-051041-ladsgroup.json
  • 05:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Set db1163 with weight 0 T316623', diff saved to https://phabricator.wikimedia.org/P33821 and previous config saved to /var/cache/conftool/dbconfig/20220906-050610-ladsgroup.json
  • 05:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 36 hosts with reason: Primary switchover s1 T316623
  • 05:05 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on 36 hosts with reason: Primary switchover s1 T316623
  • 04:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P33820 and previous config saved to /var/cache/conftool/dbconfig/20220906-045535-ladsgroup.json
  • 04:40 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316 (T314041)', diff saved to https://phabricator.wikimedia.org/P33819 and previous config saved to /var/cache/conftool/dbconfig/20220906-044029-ladsgroup.json
  • 03:54 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 03:47 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 03:47 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 03:40 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 03:38 mwpresync@deploy1002: Finished scap: testwikis wikis to 1.39.0-wmf.28 refs T314189 (duration: 36m 17s)
  • 03:26 TimStarling: multi-DC stage 4: all traffic to appservers-ro, rolling out via puppet 03:24-03:54
  • 03:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 03:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 03:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 03:13 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 03:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 03:06 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 03:06 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 03:03 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 03:02 mwpresync@deploy1002: Started scap: testwikis wikis to 1.39.0-wmf.28 refs T314189
  • 02:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1113:3316 (T314041)', diff saved to https://phabricator.wikimedia.org/P33816 and previous config saved to /var/cache/conftool/dbconfig/20220906-024351-ladsgroup.json
  • 02:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1113.eqiad.wmnet with reason: Maintenance
  • 02:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1113.eqiad.wmnet with reason: Maintenance
  • 02:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 (T314041)', diff saved to https://phabricator.wikimedia.org/P33815 and previous config saved to /var/cache/conftool/dbconfig/20220906-024330-ladsgroup.json
  • 02:33 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 02:32 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 02:32 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 02:31 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 02:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P33814 and previous config saved to /var/cache/conftool/dbconfig/20220906-022824-ladsgroup.json
  • 02:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316', diff saved to https://phabricator.wikimedia.org/P33813 and previous config saved to /var/cache/conftool/dbconfig/20220906-021318-ladsgroup.json
  • 02:11 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 02:10 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 02:10 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 02:07 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 01:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3316 (T314041)', diff saved to https://phabricator.wikimedia.org/P33812 and previous config saved to /var/cache/conftool/dbconfig/20220906-015812-ladsgroup.json
  • 01:03 TimStarling: multi-DC stage 3: 2% of codfw/ulsfo/eqsin traffic going to codfw appservers, rolling out via puppet 00:54-01:24
  • 00:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1133.eqiad.wmnet with reason: Maintenance
  • 00:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1133.eqiad.wmnet with reason: Maintenance

2022-09-05

  • 23:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3316 (T314041)', diff saved to https://phabricator.wikimedia.org/P33811 and previous config saved to /var/cache/conftool/dbconfig/20220905-232237-ladsgroup.json
  • 23:22 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1096.eqiad.wmnet with reason: Maintenance
  • 23:22 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1096.eqiad.wmnet with reason: Maintenance
  • 23:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T314041)', diff saved to https://phabricator.wikimedia.org/P33810 and previous config saved to /var/cache/conftool/dbconfig/20220905-232216-ladsgroup.json
  • 23:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P33809 and previous config saved to /var/cache/conftool/dbconfig/20220905-230709-ladsgroup.json
  • 22:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P33808 and previous config saved to /var/cache/conftool/dbconfig/20220905-225203-ladsgroup.json
  • 22:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T314041)', diff saved to https://phabricator.wikimedia.org/P33807 and previous config saved to /var/cache/conftool/dbconfig/20220905-223657-ladsgroup.json
  • 21:24 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1180 (T314041)', diff saved to https://phabricator.wikimedia.org/P33806 and previous config saved to /var/cache/conftool/dbconfig/20220905-212415-ladsgroup.json
  • 21:24 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
  • 21:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1180.eqiad.wmnet with reason: Maintenance
  • 21:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T314041)', diff saved to https://phabricator.wikimedia.org/P33805 and previous config saved to /var/cache/conftool/dbconfig/20220905-212343-ladsgroup.json
  • 21:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P33804 and previous config saved to /var/cache/conftool/dbconfig/20220905-210837-ladsgroup.json
  • 20:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P33803 and previous config saved to /var/cache/conftool/dbconfig/20220905-205330-ladsgroup.json
  • 20:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T314041)', diff saved to https://phabricator.wikimedia.org/P33802 and previous config saved to /var/cache/conftool/dbconfig/20220905-203824-ladsgroup.json
  • 19:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1165 (T314041)', diff saved to https://phabricator.wikimedia.org/P33801 and previous config saved to /var/cache/conftool/dbconfig/20220905-192554-ladsgroup.json
  • 19:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 19:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 19:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance
  • 19:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance
  • 19:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1165 (re)pooling @ 100%: Maint needs to be redone', diff saved to https://phabricator.wikimedia.org/P33800 and previous config saved to /var/cache/conftool/dbconfig/20220905-191532-ladsgroup.json
  • 19:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1165 (re)pooling @ 75%: Maint needs to be redone', diff saved to https://phabricator.wikimedia.org/P33799 and previous config saved to /var/cache/conftool/dbconfig/20220905-190027-ladsgroup.json
  • 18:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1165 (re)pooling @ 25%: Maint needs to be redone', diff saved to https://phabricator.wikimedia.org/P33798 and previous config saved to /var/cache/conftool/dbconfig/20220905-184522-ladsgroup.json
  • 18:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1165 (re)pooling @ 10%: Maint needs to be redone', diff saved to https://phabricator.wikimedia.org/P33797 and previous config saved to /var/cache/conftool/dbconfig/20220905-183017-ladsgroup.json
  • 18:25 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1132.eqiad.wmnet with reason: Maintenance
  • 18:25 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1132.eqiad.wmnet with reason: Maintenance
  • 18:25 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1128 (T312863)', diff saved to https://phabricator.wikimedia.org/P33796 and previous config saved to /var/cache/conftool/dbconfig/20220905-182510-ladsgroup.json
  • 18:10 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1128', diff saved to https://phabricator.wikimedia.org/P33795 and previous config saved to /var/cache/conftool/dbconfig/20220905-181003-ladsgroup.json
  • 17:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1128', diff saved to https://phabricator.wikimedia.org/P33794 and previous config saved to /var/cache/conftool/dbconfig/20220905-175457-ladsgroup.json
  • 17:54 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2119 (T314041)', diff saved to https://phabricator.wikimedia.org/P33793 and previous config saved to /var/cache/conftool/dbconfig/20220905-175423-ladsgroup.json
  • 17:54 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2119.codfw.wmnet with reason: Maintenance
  • 17:53 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2119.codfw.wmnet with reason: Maintenance
  • 17:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1128 (T312863)', diff saved to https://phabricator.wikimedia.org/P33792 and previous config saved to /var/cache/conftool/dbconfig/20220905-173951-ladsgroup.json
  • 16:27 btullis@deploy1002: helmfile [eqiad] DONE helmfile.d/services/datahub: sync on main
  • 16:26 btullis@deploy1002: helmfile [eqiad] START helmfile.d/services/datahub: sync on main
  • 15:30 cgoubert@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1038.eqiad.wmnet
  • 15:30 moritzm: installing apache2 security updates
  • 15:28 claime: depooled wtp1040.eqiad.wmnet from parsoid cluster T307219
  • 15:19 claime: pooled parse1007.eqiad.wmnet (php 7.4 only) in parsoid cluster T307219
  • 15:16 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for parse1007,parse1007.mgmt
  • 15:16 cgoubert@cumin1001: START - Cookbook sre.hosts.remove-downtime for parse1007,parse1007.mgmt
  • 15:09 cgoubert@puppetmaster1001: conftool action : set/pooled=no:weight=10; selector: dc=eqiad,cluster=parsoid,name=parse1007.eqiad.wmnet
  • 15:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1165 (T314041)', diff saved to https://phabricator.wikimedia.org/P33791 and previous config saved to /var/cache/conftool/dbconfig/20220905-150837-ladsgroup.json
  • 15:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 15:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1015,1019,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 15:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance
  • 15:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1165.eqiad.wmnet with reason: Maintenance
  • 15:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T314041)', diff saved to https://phabricator.wikimedia.org/P33790 and previous config saved to /var/cache/conftool/dbconfig/20220905-150758-ladsgroup.json
  • 15:04 moritzm: updating docker.io on gitlab-runners
  • 14:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P33789 and previous config saved to /var/cache/conftool/dbconfig/20220905-145252-ladsgroup.json
  • 14:48 claime: Set wtp103[6-7].eqiad.wmnet inactive pending decommission T317025
  • 14:47 cgoubert@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1037.eqiad.wmnet
  • 14:46 cgoubert@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1036.eqiad.wmnet
  • 14:40 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on wtp[1036-1038].eqiad.wmnet with reason: Downtiming replace wtp servers
  • 14:40 cgoubert@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on wtp[1036-1038].eqiad.wmnet with reason: Downtiming replace wtp servers
  • 14:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P33788 and previous config saved to /var/cache/conftool/dbconfig/20220905-143746-ladsgroup.json
  • 14:33 claime: depooled wtp1039.eqiad.wmnet from parsoid cluster T307219
  • 14:30 btullis@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
  • 14:30 btullis@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
  • 14:29 btullis@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
  • 14:29 btullis@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
  • 14:28 btullis@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
  • 14:28 btullis@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
  • 14:26 btullis@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
  • 14:26 btullis@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
  • 14:23 claime: pooled parse1006.eqiad.wmnet (php 7.4 only) in parsoid cluster T307219
  • 14:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T314041)', diff saved to https://phabricator.wikimedia.org/P33786 and previous config saved to /var/cache/conftool/dbconfig/20220905-142240-ladsgroup.json
  • 14:21 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for parse1006,parse1006.mgmt
  • 14:21 cgoubert@cumin1001: START - Cookbook sre.hosts.remove-downtime for parse1006,parse1006.mgmt
  • 14:11 cgoubert@puppetmaster1001: conftool action : set/pooled=no:weight=10; selector: dc=eqiad,cluster=parsoid,name=parse1006.eqiad.wmnet
  • 14:02 btullis@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
  • 14:02 btullis@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
  • 14:01 claime: depooled wtp1038.eqiad.wmnet from parsoid cluster T307219
  • 13:51 btullis@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
  • 13:48 claime: pooled parse1005.eqiad.wmnet (php 7.4 only) in parsoid cluster T307219
  • 13:41 btullis@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
  • 13:31 addshore: wdqs1009 sudo systemctl stop wdqs-blazegraph.service
  • 13:13 btullis@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-presto1011.eqiad.wmnet with OS bullseye
  • 13:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:15:00 on puppetdb2002.codfw.wmnet with reason: Temporarily stop puppetdb
  • 13:10 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 0:15:00 on puppetdb2002.codfw.wmnet with reason: Temporarily stop puppetdb
  • 13:10 urbanecm: UTC afternoon B&C window done
  • 13:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1168 (T314041)', diff saved to https://phabricator.wikimedia.org/P33785 and previous config saved to /var/cache/conftool/dbconfig/20220905-130944-ladsgroup.json
  • 13:09 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1168.eqiad.wmnet with reason: Maintenance
  • 13:09 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1168.eqiad.wmnet with reason: Maintenance
  • 13:09 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: edbcee4: Enable partial action blocks on fawiki (T315525) (duration: 03m 34s)
  • 13:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 13:07 moritzm: disabling puppet in codfw and the edges temporarily
  • 13:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 13:07 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 13:06 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 13:05 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-presto1011.eqiad.wmnet with reason: host reimage
  • 13:01 btullis@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on an-presto1011.eqiad.wmnet with reason: host reimage
  • 12:48 btullis@cumin1001: START - Cookbook sre.hosts.reimage for host an-presto1011.eqiad.wmnet with OS bullseye
  • 12:47 btullis@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host an-presto1007.eqiad.wmnet with OS bullseye
  • 12:33 btullis@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host datahubsearch1003.eqiad.wmnet
  • 12:31 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for parse1005,parse1005.mgmt
  • 12:31 cgoubert@cumin1001: START - Cookbook sre.hosts.remove-downtime for parse1005,parse1005.mgmt
  • 12:24 btullis@cumin1001: START - Cookbook sre.hosts.reboot-single for host datahubsearch1003.eqiad.wmnet
  • 12:22 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host datahubsearch1002.eqiad.wmnet
  • 12:20 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on 18 hosts with reason: Downtime pending inclusion in production
  • 12:20 cgoubert@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on 18 hosts with reason: Downtime pending inclusion in production
  • 12:18 btullis@cumin1001: START - Cookbook sre.hosts.reboot-single for host datahubsearch1002.eqiad.wmnet
  • 12:16 btullis@cumin1001: START - Cookbook sre.hosts.reimage for host an-presto1007.eqiad.wmnet with OS bullseye
  • 12:16 cgoubert@puppetmaster1001: conftool action : set/pooled=no:weight=10; selector: dc=eqiad,cluster=parsoid,name=parse1005.eqiad.wmnet
  • 12:14 claime: depooled wtp1037.eqiad.wmnet from parsoid cluster T312638
  • 12:13 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host datahubsearch1001.eqiad.wmnet
  • 12:10 tstarling@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for db[2142-2144].codfw.wmnet
  • 12:10 tstarling@cumin1001: START - Cookbook sre.hosts.remove-downtime for db[2142-2144].codfw.wmnet
  • 12:10 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for parse1004.mgmt
  • 12:10 cgoubert@cumin1001: START - Cookbook sre.hosts.remove-downtime for parse1004.mgmt
  • 12:10 btullis@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-presto1007.eqiad.wmnet with OS bullseye
  • 12:09 btullis@cumin1001: START - Cookbook sre.hosts.reboot-single for host datahubsearch1001.eqiad.wmnet
  • 11:56 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for parse[1001-1004].eqiad.wmnet
  • 11:56 cgoubert@cumin1001: START - Cookbook sre.hosts.remove-downtime for parse[1001-1004].eqiad.wmnet
  • 11:55 TimStarling: on db2142: rejecting inbound mysql traffic T316847
  • 11:55 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host karapace1001.eqiad.wmnet
  • 11:53 claime: pooled parse1004.eqiad.wmnet (php 7.4 only) in parsoid cluster T312638
  • 11:52 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for parse1004.eqiad.wmnet
  • 11:52 cgoubert@cumin1001: START - Cookbook sre.hosts.remove-downtime for parse1004.eqiad.wmnet
  • 11:51 btullis@cumin1001: START - Cookbook sre.hosts.reboot-single for host karapace1001.eqiad.wmnet
  • 11:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1128 (T312863)', diff saved to https://phabricator.wikimedia.org/P33784 and previous config saved to /var/cache/conftool/dbconfig/20220905-114352-ladsgroup.json
  • 11:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1128.eqiad.wmnet with reason: Maintenance
  • 11:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1128.eqiad.wmnet with reason: Maintenance
  • 11:41 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.debug (exit_code=0) for Netbox interface ID cr2-eqiad:xe-4/1/3
  • 11:41 jnuche@deploy1002: Installation of scap version "4.16.0" completed for 584 hosts
  • 11:41 ayounsi@cumin1001: START - Cookbook sre.network.debug for Netbox interface ID cr2-eqiad:xe-4/1/3
  • 11:40 jnuche@deploy1002: Installing scap version "4.16.0" for 584 hosts
  • 11:37 TimStarling: on db2142: dropping inbound mysql traffic T316847
  • 11:36 claime: Set wtp103[4-5].eqiad.wmnet inactive pending decommission https://phabricator.wikimedia.org/T317025
  • 11:34 cgoubert@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1035.eqiad.wmnet
  • 11:34 cgoubert@puppetmaster1001: conftool action : set/pooled=inactive; selector: dc=eqiad,cluster=parsoid,name=wtp1034.eqiad.wmnet
  • 11:32 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on wtp[1034-1036].eqiad.wmnet with reason: Downtiming replaced wtp servers
  • 11:32 cgoubert@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on wtp[1034-1036].eqiad.wmnet with reason: Downtiming replaced wtp servers
  • 11:30 cgoubert@puppetmaster1001: conftool action : set/pooled=no:weight=10; selector: dc=eqiad,cluster=parsoid,name=parse1004.eqiad.wmnet
  • 11:29 TimStarling: on db2142: set master_delay=30 and restarted replication T316847
  • 11:27 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for parse1003.eqiad.wmnet
  • 11:27 cgoubert@cumin1001: START - Cookbook sre.hosts.remove-downtime for parse1003.eqiad.wmnet
  • 11:24 claime: depooled wtp1036.eqiad.wmnet from parsoid cluster https://phabricator.wikimedia.org/T312638
  • 11:23 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1140.eqiad.wmnet with reason: Maintenance
  • 11:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1140.eqiad.wmnet with reason: Maintenance
  • 11:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1187 (T314041)', diff saved to https://phabricator.wikimedia.org/P33783 and previous config saved to /var/cache/conftool/dbconfig/20220905-112308-ladsgroup.json
  • 11:18 TimStarling: on db2142: stopped mariadb replication
  • 11:16 claime: pooled parse1003.eqiad.wmnet (php 7.4 only) in parsoid cluster https://phabricator.wikimedia.org/T312638
  • 11:16 tstarling@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db[2142-2144].codfw.wmnet with reason: T316847 x2 failure test
  • 11:15 tstarling@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db[2142-2144].codfw.wmnet with reason: T316847 x2 failure test
  • 11:15 cgoubert@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=eqiad,cluster=parsoid,name=parse1003.eqiad.wmnet
  • 11:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P33782 and previous config saved to /var/cache/conftool/dbconfig/20220905-110801-ladsgroup.json
  • 11:04 cgoubert@puppetmaster1001: conftool action : set/pooled=no:weight=10; selector: dc=eqiad,cluster=parsoid,name=parse1003.eqiad.wmnet
  • 10:55 Emperor: set thanos ring replicas to 3.90 T311690
  • 10:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P33781 and previous config saved to /var/cache/conftool/dbconfig/20220905-105255-ladsgroup.json
  • 10:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1187 (T314041)', diff saved to https://phabricator.wikimedia.org/P33780 and previous config saved to /var/cache/conftool/dbconfig/20220905-103749-ladsgroup.json
  • 10:36 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-presto1015.eqiad.wmnet
  • 10:35 btullis@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
  • 10:27 btullis@cumin1001: START - Cookbook sre.hosts.reboot-single for host an-presto1015.eqiad.wmnet
  • 10:25 btullis@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
  • 10:24 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-presto1014.eqiad.wmnet
  • 10:17 btullis@cumin1001: START - Cookbook sre.hosts.reboot-single for host an-presto1014.eqiad.wmnet
  • 10:14 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-presto1013.eqiad.wmnet
  • 10:13 XioNoX: upgrade python-pynetbox to 6.6 on netbox frontends - T310745
  • 10:11 hnowlan@deploy1002: Finished deploy [restbase/deploy@79b3cd2]: Add guwwiktionary and bjnwiktionary T309058 T312216 (duration: 15m 05s)
  • 10:05 btullis@cumin1001: START - Cookbook sre.hosts.reboot-single for host an-presto1013.eqiad.wmnet
  • 09:56 hnowlan@deploy1002: Started deploy [restbase/deploy@79b3cd2]: Add guwwiktionary and bjnwiktionary T309058 T312216
  • 09:47 btullis@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
  • 09:39 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-presto1007.eqiad.wmnet with reason: host reimage
  • 09:38 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-presto1012.eqiad.wmnet
  • 09:37 btullis@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
  • 09:35 btullis@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on an-presto1007.eqiad.wmnet with reason: host reimage
  • 09:34 btullis@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
  • 09:29 btullis@cumin1001: START - Cookbook sre.hosts.reboot-single for host an-presto1012.eqiad.wmnet
  • 09:25 btullis: deployed calico to dse-k8s cluster T310174
  • 09:24 btullis@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
  • 09:24 btullis@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
  • 09:24 btullis@deploy1002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
  • 09:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1187 (T314041)', diff saved to https://phabricator.wikimedia.org/P33779 and previous config saved to /var/cache/conftool/dbconfig/20220905-092338-ladsgroup.json
  • 09:23 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1187.eqiad.wmnet with reason: Maintenance
  • 09:23 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1187.eqiad.wmnet with reason: Maintenance
  • 09:23 btullis@cumin1001: START - Cookbook sre.hosts.reimage for host an-presto1007.eqiad.wmnet with OS bullseye
  • 09:22 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-presto1010.eqiad.wmnet
  • 09:17 XioNoX: Squid: permit production networks instead of aggregate_networks - T265864
  • 09:17 moritzm: installing flac security updates
  • 09:14 btullis@cumin1001: START - Cookbook sre.hosts.reboot-single for host an-presto1010.eqiad.wmnet
  • 09:11 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-presto1008.eqiad.wmnet
  • 09:05 hnowlan@deploy1002: Finished deploy [restbase/deploy@a571f9a]: Add pcmwiki T310880 (duration: 01m 06s)
  • 09:04 hnowlan@deploy1002: Started deploy [restbase/deploy@a571f9a]: Add pcmwiki T310880
  • 09:04 btullis@cumin1001: START - Cookbook sre.hosts.reboot-single for host an-presto1008.eqiad.wmnet
  • 09:03 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-presto1006.eqiad.wmnet
  • 08:55 btullis@cumin1001: START - Cookbook sre.hosts.reboot-single for host an-presto1006.eqiad.wmnet
  • 08:48 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host graphite1004.eqiad.wmnet
  • 08:39 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host graphite1004.eqiad.wmnet
  • 08:18 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 08:15 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 08:15 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 08:15 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1173.eqiad.wmnet with reason: Maintenance
  • 08:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1173.eqiad.wmnet with reason: Maintenance
  • 08:14 ladsgroup@cumin1001: END (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 1 day, 0:00:00 on db2147.codfw.wmnet with reason: Maintenance
  • 08:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2147.codfw.wmnet with reason: Maintenance
  • 08:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2095.codfw.wmnet with reason: Maintenance
  • 08:14 ladsgroup@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: Stop writing to old templatelinks fields in s7 (T312865) (duration: 03m 51s)
  • 08:14 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2095.codfw.wmnet with reason: Maintenance
  • 08:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
  • 08:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
  • 08:12 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 08:01 XioNoX: rename Telia to Arelion in Netbox
  • 07:42 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 07:38 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 07:38 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 07:34 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 07:32 ladsgroup@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: Make English Wikipedia read new on templatelinks migration (T306673) (duration: 03m 31s)
  • 07:29 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 07:25 urbanecm@deploy1002: Synchronized wmf-config/logos.php: 739920c: Fix missing logo for mniwiktionary and frwikiquote (T317004) (duration: 03m 36s)
  • 07:25 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 07:25 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 07:22 urbanecm@deploy1002: Synchronized static/images/project-logos/: ff2e108: Upload missing logo for mniwiktionary and frwikiquote (T317004) (duration: 03m 50s)
  • 07:20 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 07:19 moritzm: installing ghostscript security updates
  • 07:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 07:12 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 07:12 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 07:07 oblivian@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: Move 10% of traffic to php 7.4 (T271736) (duration: 03m 50s)
  • 07:07 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 06:28 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.debug (exit_code=0) for Netbox interface ID cr2-eqiad:xe-4/1/3
  • 06:28 ayounsi@cumin1001: START - Cookbook sre.network.debug for Netbox interface ID cr2-eqiad:xe-4/1/3
  • 06:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1119.eqiad.wmnet with reason: Maintenance
  • 06:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1119.eqiad.wmnet with reason: Maintenance
  • 02:46 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2099.codfw.wmnet with reason: Maintenance
  • 02:46 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2099.codfw.wmnet with reason: Maintenance
  • 02:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2138:3314 (T314041)', diff saved to https://phabricator.wikimedia.org/P33778 and previous config saved to /var/cache/conftool/dbconfig/20220905-024602-ladsgroup.json
  • 00:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1107.eqiad.wmnet with reason: Maintenance
  • 00:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1107.eqiad.wmnet with reason: Maintenance
  • 00:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 (T312863)', diff saved to https://phabricator.wikimedia.org/P33777 and previous config saved to /var/cache/conftool/dbconfig/20220905-003619-ladsgroup.json
  • 00:21 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P33776 and previous config saved to /var/cache/conftool/dbconfig/20220905-002112-ladsgroup.json
  • 00:06 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P33775 and previous config saved to /var/cache/conftool/dbconfig/20220905-000606-ladsgroup.json

2022-09-04

  • 23:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 (T312863)', diff saved to https://phabricator.wikimedia.org/P33774 and previous config saved to /var/cache/conftool/dbconfig/20220904-235100-ladsgroup.json
  • 22:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1106 (T312863)', diff saved to https://phabricator.wikimedia.org/P33773 and previous config saved to /var/cache/conftool/dbconfig/20220904-225044-ladsgroup.json
  • 22:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 22:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 22:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1106.eqiad.wmnet with reason: Maintenance
  • 22:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1106.eqiad.wmnet with reason: Maintenance
  • 22:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T312863)', diff saved to https://phabricator.wikimedia.org/P33772 and previous config saved to /var/cache/conftool/dbconfig/20220904-225016-ladsgroup.json
  • 22:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P33771 and previous config saved to /var/cache/conftool/dbconfig/20220904-223510-ladsgroup.json
  • 22:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P33770 and previous config saved to /var/cache/conftool/dbconfig/20220904-222004-ladsgroup.json
  • 22:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 (T312863)', diff saved to https://phabricator.wikimedia.org/P33769 and previous config saved to /var/cache/conftool/dbconfig/20220904-220457-ladsgroup.json
  • 15:51 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3311 (T312863)', diff saved to https://phabricator.wikimedia.org/P33767 and previous config saved to /var/cache/conftool/dbconfig/20220904-155059-ladsgroup.json
  • 15:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1105.eqiad.wmnet with reason: Maintenance
  • 15:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1105.eqiad.wmnet with reason: Maintenance
  • 15:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 (T312863)', diff saved to https://phabricator.wikimedia.org/P33766 and previous config saved to /var/cache/conftool/dbconfig/20220904-155027-ladsgroup.json
  • 15:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P33765 and previous config saved to /var/cache/conftool/dbconfig/20220904-153521-ladsgroup.json
  • 15:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P33764 and previous config saved to /var/cache/conftool/dbconfig/20220904-152015-ladsgroup.json
  • 15:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 (T312863)', diff saved to https://phabricator.wikimedia.org/P33763 and previous config saved to /var/cache/conftool/dbconfig/20220904-150508-ladsgroup.json
  • 12:51 elukey: reset-fail ifup@ens13.service on idp2002
  • 12:50 elukey: reset-fail ifup@ens13.service on netflow4002
  • 12:49 elukey: pkill remaining processes of user effeietsanders on stat1008 to unblock puppet - T314846
  • 10:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2138:3314 (T314041)', diff saved to https://phabricator.wikimedia.org/P33762 and previous config saved to /var/cache/conftool/dbconfig/20220904-103427-ladsgroup.json
  • 10:34 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2138.codfw.wmnet with reason: Maintenance
  • 10:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2138.codfw.wmnet with reason: Maintenance
  • 10:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136 (T314041)', diff saved to https://phabricator.wikimedia.org/P33761 and previous config saved to /var/cache/conftool/dbconfig/20220904-103405-ladsgroup.json
  • 08:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3311 (T312863)', diff saved to https://phabricator.wikimedia.org/P33760 and previous config saved to /var/cache/conftool/dbconfig/20220904-083341-ladsgroup.json
  • 08:33 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1099.eqiad.wmnet with reason: Maintenance
  • 08:33 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1099.eqiad.wmnet with reason: Maintenance

2022-09-03

  • 23:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 (T312863)', diff saved to https://phabricator.wikimedia.org/P33759 and previous config saved to /var/cache/conftool/dbconfig/20220903-235001-ladsgroup.json
  • 23:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P33758 and previous config saved to /var/cache/conftool/dbconfig/20220903-233455-ladsgroup.json
  • 23:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P33757 and previous config saved to /var/cache/conftool/dbconfig/20220903-231949-ladsgroup.json
  • 23:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 (T312863)', diff saved to https://phabricator.wikimedia.org/P33756 and previous config saved to /var/cache/conftool/dbconfig/20220903-230443-ladsgroup.json
  • 22:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1106 (T312863)', diff saved to https://phabricator.wikimedia.org/P33755 and previous config saved to /var/cache/conftool/dbconfig/20220903-220427-ladsgroup.json
  • 22:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 22:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 22:03 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1106.eqiad.wmnet with reason: Maintenance
  • 22:03 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1106.eqiad.wmnet with reason: Maintenance
  • 22:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T312863)', diff saved to https://phabricator.wikimedia.org/P33754 and previous config saved to /var/cache/conftool/dbconfig/20220903-220326-ladsgroup.json
  • 21:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P33753 and previous config saved to /var/cache/conftool/dbconfig/20220903-214820-ladsgroup.json
  • 21:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163', diff saved to https://phabricator.wikimedia.org/P33752 and previous config saved to /var/cache/conftool/dbconfig/20220903-213314-ladsgroup.json
  • 21:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1163 (T312863)', diff saved to https://phabricator.wikimedia.org/P33751 and previous config saved to /var/cache/conftool/dbconfig/20220903-211808-ladsgroup.json
  • 18:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2136 (T314041)', diff saved to https://phabricator.wikimedia.org/P33750 and previous config saved to /var/cache/conftool/dbconfig/20220903-180104-ladsgroup.json
  • 18:00 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2136.codfw.wmnet with reason: Maintenance
  • 18:00 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2136.codfw.wmnet with reason: Maintenance
  • 18:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2147 (T314041)', diff saved to https://phabricator.wikimedia.org/P33749 and previous config saved to /var/cache/conftool/dbconfig/20220903-180042-ladsgroup.json
  • 15:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1163 (T312863)', diff saved to https://phabricator.wikimedia.org/P33748 and previous config saved to /var/cache/conftool/dbconfig/20220903-151224-ladsgroup.json
  • 15:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1163.eqiad.wmnet with reason: Maintenance
  • 15:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1163.eqiad.wmnet with reason: Maintenance
  • 09:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2102.codfw.wmnet with reason: Maintenance
  • 09:45 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2102.codfw.wmnet with reason: Maintenance
  • 01:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2147 (T314041)', diff saved to https://phabricator.wikimedia.org/P33747 and previous config saved to /var/cache/conftool/dbconfig/20220903-015524-ladsgroup.json
  • 01:55 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2147.codfw.wmnet with reason: Maintenance
  • 01:55 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2147.codfw.wmnet with reason: Maintenance
  • 01:55 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2155 (T314041)', diff saved to https://phabricator.wikimedia.org/P33746 and previous config saved to /var/cache/conftool/dbconfig/20220903-015502-ladsgroup.json

2022-09-02

  • 19:03 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 19:03 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 19:03 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 19:03 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 18:58 dancy@deploy1002: Sync cancelled.
  • 18:56 dancy@deploy1002: dancy: testing T299648 synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
  • 18:55 dancy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 18:51 dancy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 18:51 dancy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 18:47 dancy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 18:40 dancy@deploy1002: Started scap: testing T299648
  • 17:58 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1203.eqiad.wmnet with OS bullseye
  • 17:52 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 17:52 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 17:51 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 17:51 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 17:45 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1203.eqiad.wmnet with reason: host reimage
  • 17:41 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1203.eqiad.wmnet with reason: host reimage
  • 17:29 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host db1203.eqiad.wmnet with OS bullseye
  • 17:11 pt1979@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1202.eqiad.wmnet with OS bullseye
  • 17:00 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1201.eqiad.wmnet with OS bullseye
  • 16:56 pt1979@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1202.eqiad.wmnet with reason: host reimage
  • 16:52 pt1979@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1202.eqiad.wmnet with reason: host reimage
  • 16:47 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1201.eqiad.wmnet with reason: host reimage
  • 16:43 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1201.eqiad.wmnet with reason: host reimage
  • 16:40 pt1979@cumin1001: START - Cookbook sre.hosts.reimage for host db1202.eqiad.wmnet with OS bullseye
  • 16:39 pt1979@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1200.eqiad.wmnet with OS bullseye
  • 16:38 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['db1203']
  • 16:31 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host db1201.eqiad.wmnet with OS bullseye
  • 16:30 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db1203']
  • 16:26 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1199.eqiad.wmnet with OS bullseye
  • 16:23 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['db1202']
  • 16:23 pt1979@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1200.eqiad.wmnet with reason: host reimage
  • 16:19 pt1979@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1200.eqiad.wmnet with reason: host reimage
  • 16:17 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['db1203']
  • 16:15 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db1202']
  • 16:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 16:13 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1199.eqiad.wmnet with reason: host reimage
  • 16:11 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 16:11 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 16:10 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db1203']
  • 16:10 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['db1202']
  • 16:09 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1199.eqiad.wmnet with reason: host reimage
  • 16:07 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 16:07 pt1979@cumin1001: START - Cookbook sre.hosts.reimage for host db1200.eqiad.wmnet with OS bullseye
  • 16:05 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['db1201']
  • 16:03 pt1979@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1198.eqiad.wmnet with OS bullseye
  • 16:02 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 16:01 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db1202']
  • 15:58 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 15:58 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 15:57 jayme: repool kubemaster2002
  • 15:57 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db1201']
  • 15:57 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host db1199.eqiad.wmnet with OS bullseye
  • 15:54 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 15:50 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['db1200']
  • 15:49 pt1979@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1198.eqiad.wmnet with reason: host reimage
  • 15:49 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1197.eqiad.wmnet with OS bullseye
  • 15:45 pt1979@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1198.eqiad.wmnet with reason: host reimage
  • 15:42 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db1200']
  • 15:40 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['db1200']
  • 15:39 jayme: depool kubemaster2002
  • 15:37 jayme: repooled kubemaster1001
  • 15:35 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['db1201']
  • 15:35 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db1201']
  • 15:34 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1197.eqiad.wmnet with reason: host reimage
  • 15:34 pt1979@cumin1001: START - Cookbook sre.hosts.reimage for host db1198.eqiad.wmnet with OS bullseye
  • 15:33 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['db1201']
  • 15:32 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db1200']
  • 15:31 pt1979@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1196.eqiad.wmnet with OS bullseye
  • 15:30 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1197.eqiad.wmnet with reason: host reimage
  • 15:28 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['db1198']
  • 15:27 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db1201']
  • 15:23 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['db1199']
  • 15:19 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db1198']
  • 15:18 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host db1197.eqiad.wmnet with OS bullseye
  • 15:16 pt1979@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1196.eqiad.wmnet with reason: host reimage
  • 15:15 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db1199']
  • 15:14 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['db1199']
  • 15:13 pt1979@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1196.eqiad.wmnet with reason: host reimage
  • 15:09 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['db1198']
  • 15:09 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db1199']
  • 15:09 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db1198']
  • 15:04 jayme: depooled kubemaster1001
  • 15:04 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['db1198']
  • 15:03 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db1198']
  • 15:02 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['db1198']
  • 15:01 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['db1197']
  • 15:00 pt1979@cumin1001: START - Cookbook sre.hosts.reimage for host db1196.eqiad.wmnet with OS bullseye
  • 14:58 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts cloudservices1003.wikimedia.org
  • 14:58 andrew@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:57 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db1198']
  • 14:55 andrew@cumin1001: START - Cookbook sre.dns.netbox
  • 14:53 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db1197']
  • 14:51 andrew@cumin1001: START - Cookbook sre.hosts.decommission for hosts cloudservices1003.wikimedia.org
  • 14:49 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts cloudservices1003.wikimedia.org
  • 14:49 andrew@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:47 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['db1197']
  • 14:47 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['db1196']
  • 14:46 andrew@cumin1001: START - Cookbook sre.dns.netbox
  • 14:42 andrew@cumin1001: START - Cookbook sre.hosts.decommission for hosts cloudservices1003.wikimedia.org
  • 14:41 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db1197']
  • 14:39 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db1196']
  • 14:38 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['db1196']
  • 14:32 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db1196']
  • 14:21 pt1979@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts ['db1196']
  • 14:18 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts cloudservices1003
  • 14:18 andrew@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:16 pt1979@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts ['db1197']
  • 14:15 andrew@cumin1001: START - Cookbook sre.dns.netbox
  • 14:11 andrew@cumin1001: START - Cookbook sre.hosts.decommission for hosts cloudservices1003
  • 14:05 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db1197']
  • 14:01 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db1196']
  • 13:57 pt1979@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['db1196']
  • 13:57 pt1979@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db1196']
  • 13:31 pt1979@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['db1196']
  • 13:31 pt1979@cumin1001: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db1196']
  • 13:15 jayme: repooled kubemaster1002
  • 13:10 jayme: redepooled kubemaster1002
  • 13:00 aikochou@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
  • 12:56 aikochou@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
  • 10:35 jmm@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging MMandere out of all services on: 1235 hosts
  • 10:35 jmm@cumin2002: START - Cookbook sre.idm.logout Logging MMandere out of all services on: 1235 hosts
  • 10:34 jmm@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging MMandere out of all services on: 779 hosts
  • 10:34 jmm@cumin2002: START - Cookbook sre.idm.logout Logging MMandere out of all services on: 779 hosts
  • 10:08 jayme: depooled kubemaster1002 for tests
  • 09:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2155 (T314041)', diff saved to https://phabricator.wikimedia.org/P33743 and previous config saved to /var/cache/conftool/dbconfig/20220902-092704-ladsgroup.json
  • 09:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2095.codfw.wmnet with reason: Maintenance
  • 09:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2095.codfw.wmnet with reason: Maintenance
  • 09:26 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
  • 09:26 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2155.codfw.wmnet with reason: Maintenance
  • 08:44 jmm@cumin2002: END (PASS) - Cookbook sre.ldap.roll-restart-reboot-replica (exit_code=0) rolling reboot on A:ldap-replicas-eqiad
  • 08:41 fnegri@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:37 fnegri@cumin1001: START - Cookbook sre.dns.netbox
  • 08:36 jmm@cumin2002: START - Cookbook sre.ldap.roll-restart-reboot-replica rolling reboot on A:ldap-replicas-eqiad
  • 08:34 jmm@cumin2002: END (PASS) - Cookbook sre.ldap.roll-restart-reboot-replica (exit_code=0) rolling reboot on A:ldap-replicas-codfw
  • 08:26 jmm@cumin2002: START - Cookbook sre.ldap.roll-restart-reboot-replica rolling reboot on A:ldap-replicas-codfw
  • 08:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netbox-dev2002.codfw.wmnet
  • 08:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netbox-dev2002.codfw.wmnet
  • 08:13 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host idp2002.wikimedia.org
  • 08:03 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host idp2002.wikimedia.org
  • 08:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test1002.wikimedia.org
  • 07:57 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host idp-test1002.wikimedia.org
  • 07:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test2002.wikimedia.org
  • 07:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host idp-test2002.wikimedia.org
  • 07:17 dcausse: restarting blazegraph on wdqs1016 (BlazegraphFreeAllocatorsDecreasingRapidly)
  • 05:44 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1119 to clone db1107 T316870', diff saved to https://phabricator.wikimedia.org/P33739 and previous config saved to /var/cache/conftool/dbconfig/20220902-054405-root.json
  • 05:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db2149 T316494 ', diff saved to https://phabricator.wikimedia.org/P33738 and previous config saved to /var/cache/conftool/dbconfig/20220902-052841-marostegui.json

2022-09-01
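
Note on the 20:40 confctl entry for this day: the selector `name=elastic20[73-86].*` reportedly selected more hosts than intended. A plausible explanation, assuming the selector is evaluated as a regular expression against the full host name, is that `[73-86]` is a character class (the digits 7, 3 through 8, and 6) rather than a numeric range. The sketch below uses Python's `re` module to illustrate this; the host list, the `matching_hosts` helper, and the tighter `INTENDED` pattern are hypothetical and are not taken from conftool.

```python
import re

# Selector copied from the 20:40 log entry. Treating it as a regular
# expression matched against the whole host name is an assumption.
LOGGED = r"elastic20[73-86].*"

# Hypothetical tighter pattern covering only hosts 2073 through 2086.
INTENDED = r"elastic20(7[3-9]|8[0-6])\..*"


def matching_hosts(pattern, hosts):
    """Return the hosts whose full name matches the pattern."""
    compiled = re.compile(pattern)
    return [h for h in hosts if compiled.fullmatch(h)]


# Illustrative host names only (assumed numbering and domain).
hosts = [f"elastic{n}.codfw.wmnet" for n in range(2025, 2090)]

print(len(matching_hosts(LOGGED, hosts)))    # 60: every host from 2030 to 2089
print(len(matching_hosts(INTENDED, hosts)))  # 14: hosts 2073 to 2086 only
```

The count of 14 for the tighter pattern agrees with the "14 hosts" in the 20:35 downtime entries of the same day.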

  • 20:51 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 20:50 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 20:50 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 20:50 thcipriani@deploy1002: Finished scap: Backport for Remove Vector grid config (T313559), Disable sticky header edit experiment for idwiki, viwiki (T315264) (duration: 05m 44s)
  • 20:49 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 20:44 thcipriani@deploy1002: thcipriani and cjming and bwang: Backport for Remove Vector grid config (T313559), Disable sticky header edit experiment for idwiki, viwiki (T315264) synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet
  • 20:44 thcipriani@deploy1002: Started scap: Backport for Remove Vector grid config (T313559), Disable sticky header edit experiment for idwiki, viwiki (T315264)
  • 20:41 thcipriani@deploy1002: Finished scap: Backport for cirrus: Handle transition to elasticsearch 7.10 (duration: 16m 56s)
  • 20:40 ryankemper: T300943 New hosts are in service and were pooled like so: `sudo confctl select name=elastic20[73-86].* set/weight=10:pooled=yes` (in retrospect that syntax seems to have selected too many hosts, but the final state of pybal is correct per https://config-master.wikimedia.org/pybal/codfw/search)
  • 20:39 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1203.mgmt.eqiad.wmnet with reboot policy FORCED
  • 20:39 pt1979@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1202.mgmt.eqiad.wmnet with reboot policy FORCED
  • 20:37 ryankemper@puppetmaster1001: conftool action : set/weight=10:pooled=yes; selector: name=elastic20[73-86].*
  • 20:35 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 14 hosts with reason: T300943
  • 20:35 ryankemper@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 14 hosts with reason: T300943
  • 20:34 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 20:33 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 20:33 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 20:32 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 20:27 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 20:26 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 20:26 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 20:24 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 20:24 thcipriani@deploy1002: thcipriani and ebernhardson: Backport for cirrus: Handle transition to elasticsearch 7.10 synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet
  • 20:24 thcipriani@deploy1002: Started scap: Backport for cirrus: Handle transition to elasticsearch 7.10
  • 20:20 thcipriani@deploy1002: backport aborted: (duration: 03m 09s)
  • 20:20 thcipriani@deploy1002: backport aborted: (duration: 02m 57s)
  • 20:20 thcipriani@deploy1002: sync-world aborted: Backport for Revert "Deploy Research Incentive Survey to idwiki" (duration: 01m 23s)
  • 20:20 thcipriani@deploy1002: thcipriani and trainbranchbot: Backport for Revert "Deploy Research Incentive Survey to idwiki" synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
  • 20:19 thcipriani@deploy1002: Started scap: Backport for Revert "Deploy Research Incentive Survey to idwiki"
  • 20:14 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host db1203.mgmt.eqiad.wmnet with reboot policy FORCED
  • 20:14 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 20:14 pt1979@cumin1001: START - Cookbook sre.hosts.provision for host db1202.mgmt.eqiad.wmnet with reboot policy FORCED
  • 20:13 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 20:13 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 20:13 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1201.mgmt.eqiad.wmnet with reboot policy FORCED
  • 20:13 pt1979@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1200.mgmt.eqiad.wmnet with reboot policy FORCED
  • 20:13 thcipriani@deploy1002: thcipriani and dani: Backport for Deploy Research Incentive Survey to idwiki (T316466) synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet
  • 20:12 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 20:06 thcipriani@deploy1002: Started scap: Backport for Deploy Research Incentive Survey to idwiki (T316466)
  • 19:58 mutante: otrs1001 - sudo systemctl reset-failed - T316903
  • 19:48 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host db1201.mgmt.eqiad.wmnet with reboot policy FORCED
  • 19:46 pt1979@cumin1001: START - Cookbook sre.hosts.provision for host db1200.mgmt.eqiad.wmnet with reboot policy FORCED
  • 19:41 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1199.mgmt.eqiad.wmnet with reboot policy FORCED
  • 19:41 pt1979@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1198.mgmt.eqiad.wmnet with reboot policy FORCED
  • 19:17 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host db1199.mgmt.eqiad.wmnet with reboot policy FORCED
  • 19:17 pt1979@cumin1001: START - Cookbook sre.hosts.provision for host db1198.mgmt.eqiad.wmnet with reboot policy FORCED
  • 19:16 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1197.mgmt.eqiad.wmnet with reboot policy FORCED
  • 19:16 pt1979@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1196.mgmt.eqiad.wmnet with reboot policy FORCED
  • 18:53 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host db1197.mgmt.eqiad.wmnet with reboot policy FORCED
  • 18:53 pt1979@cumin1001: START - Cookbook sre.hosts.provision for host db1196.mgmt.eqiad.wmnet with reboot policy FORCED
  • 18:52 pt1979@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:51 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 18:51 pt1979@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1203
  • 18:51 pt1979@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host db1203
  • 18:51 pt1979@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1202
  • 18:51 pt1979@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host db1202
  • 18:51 pt1979@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1201
  • 18:50 pt1979@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host db1201
  • 18:50 pt1979@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1200
  • 18:50 pt1979@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host db1200
  • 18:50 pt1979@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1199
  • 18:50 pt1979@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host db1199
  • 18:50 pt1979@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1198
  • 18:50 pt1979@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host db1198
  • 18:50 pt1979@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1197
  • 18:49 pt1979@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host db1197
  • 18:49 pt1979@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db1196
  • 18:49 pt1979@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host db1196
  • 18:48 pt1979@cumin1001: START - Cookbook sre.dns.netbox
  • 18:48 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 18:48 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 18:42 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 18:42 dduvall@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.39.0-wmf.27 refs T314188
  • 17:34 bd808@deploy1002: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
  • 17:33 bd808@deploy1002: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
  • 17:33 bd808@deploy1002: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
  • 17:32 bd808@deploy1002: helmfile [codfw] START helmfile.d/services/developer-portal: apply
  • 17:32 bd808@deploy1002: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
  • 17:31 bd808@deploy1002: helmfile [staging] START helmfile.d/services/developer-portal: apply
  • 17:26 herron: restarted rsyslog on centrallog2002
  • 16:29 topranks: Bringing Lumen Transport CCT 442550294 (cr1-codfw to cr4-ulsfo) back into service following successful hot-cut to lower-latency path with carrier
  • 16:17 hnowlan@puppetmaster1001: conftool action : set/weight=10; selector: name=restbase103[1-3].eqiad.wmnet
  • 15:55 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=restbase103[1-3].eqiad.wmnet
  • 15:35 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 15:35 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 15:34 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 15:34 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 15:21 moritzm: installing usb.ids update from Bullseye 11.4 point release
  • 15:19 moritzm: updating docker.io on ml-serve* to bugfix release from Bullseye 11.4 point release
  • 14:54 topranks: Draining traffic from Lumen Transport CCT 442550294 (cr1-codfw to cr4-ulsfo) ahead of hot-cut to lower-latency path with carrier
  • 14:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard1002.eqiad.wmnet
  • 14:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetboard1002.eqiad.wmnet
  • 14:07 moritzm: installing net-snmp security updates on Buster
  • 14:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb1002.eqiad.wmnet
  • 14:01 marostegui: test T316744
  • 14:01 marostegui: test T316744
  • 14:00 marostegui: Failover m5 from db1107 to db1183 - T316744
  • 13:57 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb1002.eqiad.wmnet
  • 13:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb2002.codfw.wmnet
  • 13:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb2002.codfw.wmnet
  • 13:52 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host netbox1002.eqiad.wmnet
  • 13:43 moritzm: rebooting netbox1002 (running netbox.wikimedia.org)
  • 13:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netbox1002.eqiad.wmnet
  • 13:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netbox2002.codfw.wmnet
  • 13:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netbox2002.codfw.wmnet
  • 13:32 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db[2135,2160].codfw.wmnet,db[1107,1117,1183].eqiad.wmnet with reason: switchover m5 T316744
  • 13:31 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on db[2135,2160].codfw.wmnet,db[1107,1117,1183].eqiad.wmnet with reason: switchover m5 T316744
  • 13:19 jayme@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 13:19 jayme@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 13:19 jayme@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 13:19 jayme@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 13:18 jayme@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 13:18 jayme@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 13:16 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 13:16 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 13:15 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 13:15 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 13:10 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 13:09 oblivian@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: Move 5% of traffic to php 7.4 (T271736) (duration: 03m 45s)
  • 13:09 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 13:09 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 13:08 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 13:00 jayme@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 13:00 jayme@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 13:00 jayme@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 12:59 jayme@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 12:56 jayme@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 12:56 jayme@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 12:29 herron: restarted thanos-query on thanos-fe1001
  • 12:20 cdanis@cumin2002: dbctl commit (dc=all): 'T316482 remove replicas from x2', diff saved to https://phabricator.wikimedia.org/P33736 and previous config saved to /var/cache/conftool/dbconfig/20220901-122026-cdanis.json
  • 12:13 klausman@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ml-serve-ctrl1001.eqiad.wmnet
  • 12:13 klausman@cumin1001: START - Cookbook sre.hosts.remove-downtime for ml-serve-ctrl1001.eqiad.wmnet
  • 12:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 12:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 12:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T314041)', diff saved to https://phabricator.wikimedia.org/P33735 and previous config saved to /var/cache/conftool/dbconfig/20220901-121252-ladsgroup.json
  • 12:05 klausman@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on ml-serve-ctrl1001.eqiad.wmnet with reason: Reboot to pick up kernel 5.10.136 (T316185)
  • 12:05 klausman@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on ml-serve-ctrl1001.eqiad.wmnet with reason: Reboot to pick up kernel 5.10.136 (T316185)
  • 12:03 klausman@cumin1001: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:ml-serve-worker-eqiad
  • 11:59 moritzm: rebalance row B after completed Bullseye updates T311686
  • 11:57 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P33734 and previous config saved to /var/cache/conftool/dbconfig/20220901-115746-ladsgroup.json
  • 11:48 cdanis: root@apt1001:/home/cdanis/build-area# reprepro --ignore=wrongdistribution -C main include bullseye-wikimedia conftool_2.2.2-1_amd64.changes
  • 11:42 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314', diff saved to https://phabricator.wikimedia.org/P33733 and previous config saved to /var/cache/conftool/dbconfig/20220901-114239-ladsgroup.json
  • 11:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3314 (T314041)', diff saved to https://phabricator.wikimedia.org/P33732 and previous config saved to /var/cache/conftool/dbconfig/20220901-112733-ladsgroup.json
  • 11:04 claime: depooled wtp1035.eqiad.wmnet from parsoid cluster https://phabricator.wikimedia.org/T312638
  • 11:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host pki2002.codfw.wmnet
  • 10:58 claime: pooled parse1002.eqiad.wmnet (php 7.4 only) in parsoid cluster https://phabricator.wikimedia.org/T312638
  • 10:56 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for parse1002.eqiad.wmnet
  • 10:56 cgoubert@cumin1001: START - Cookbook sre.hosts.remove-downtime for parse1002.eqiad.wmnet
  • 10:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host pki2002.codfw.wmnet
  • 10:43 claime: depooled wtp1034.eqiad.wmnet from parsoid cluster https://phabricator.wikimedia.org/T312638
  • 10:43 claime: pooled parse1001.eqiad.wmnet (php 7.4 only) in parsoid cluster https://phabricator.wikimedia.org/T312638
  • 10:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki2002.codfw.wmnet
  • 10:40 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for parse1001.eqiad.wmnet
  • 10:40 cgoubert@cumin1001: START - Cookbook sre.hosts.remove-downtime for parse1001.eqiad.wmnet
  • 10:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki2002.codfw.wmnet
  • 10:36 cgoubert@puppetmaster1001: conftool action : set/pooled=no:weight=10; selector: dc=eqiad,cluster=parsoid,name=parse1002.eqiad.wmnet
  • 10:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki1001.eqiad.wmnet
  • 10:29 klausman@cumin1001: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-eqiad
  • 10:26 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki1001.eqiad.wmnet
  • 10:21 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 10:20 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 10:20 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 10:19 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 10:14 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 10:13 marostegui@deploy1002: Synchronized wmf-config/ProductionServices.php: Promote pc1013 back to pc3 master (duration: 03m 43s)
  • 10:13 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 10:13 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 10:12 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 10:02 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 09:58 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 09:58 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 09:58 cgoubert@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: Update wgLinterSubmitterWhitelist (T312638) (duration: 03m 37s)
  • 09:57 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 09:52 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 09:51 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 09:51 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 09:50 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 09:40 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 09:39 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 09:39 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 09:38 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 09:33 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 09:32 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 09:32 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 09:32 marostegui@deploy1002: Synchronized wmf-config/ProductionServices.php: Promote pc1014 to pc3 master (duration: 03m 34s)
  • 09:31 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 08:18 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2015.codfw.wmnet to cluster codfw and group D
  • 08:17 cgoubert@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on parse1002.eqiad.wmnet with reason: Readding downtime removed by reimage
  • 08:17 cgoubert@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on parse1002.eqiad.wmnet with reason: Readding downtime removed by reimage
  • 08:17 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2015.codfw.wmnet to cluster codfw and group D
  • 08:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2015.codfw.wmnet
  • 07:56 oblivian@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Moving 1% of traffic to php 7.4 (duration: 03m 42s)
  • 07:51 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2015.codfw.wmnet
  • 07:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2015.codfw.wmnet with OS bullseye
  • 07:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti2015.codfw.wmnet with reason: host reimage
  • 07:10 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti2015.codfw.wmnet with reason: host reimage
  • 06:50 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2015.codfw.wmnet with OS bullseye
  • 06:37 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1136.eqiad.wmnet with reason: Maintenance
  • 06:37 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1136.eqiad.wmnet with reason: Maintenance
  • 06:25 oblivian@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Reverting to no php 7.4 traffic (duration: 03m 44s)
  • 06:19 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1136.eqiad.wmnet with reason: Maintenance
  • 06:18 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 10:00:00 on db1136.eqiad.wmnet with reason: Maintenance
  • 06:14 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 06:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
  • 06:13 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
  • 06:13 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1136.eqiad.wmnet with reason: Maintenance
  • 06:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1136.eqiad.wmnet with reason: Maintenance
  • 06:12 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
  • 06:10 oblivian@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Moving 1% of users to php 7.4 (duration: 03m 55s)
  • 06:09 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depool db1136 T316111', diff saved to https://phabricator.wikimedia.org/P33729 and previous config saved to /var/cache/conftool/dbconfig/20220901-060923-ladsgroup.json
  • 06:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Promote db1181 to s7 primary and set section read-write T316111', diff saved to https://phabricator.wikimedia.org/P33728 and previous config saved to /var/cache/conftool/dbconfig/20220901-060128-ladsgroup.json
  • 06:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Set s7 eqiad as read-only for maintenance - T316111', diff saved to https://phabricator.wikimedia.org/P33727 and previous config saved to /var/cache/conftool/dbconfig/20220901-060100-ladsgroup.json
  • 06:00 Amir1: Starting s7 eqiad failover from db1136 to db1181 - T316111
  • 05:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Set db1181 with weight 0 T316111', diff saved to https://phabricator.wikimedia.org/P33726 and previous config saved to /var/cache/conftool/dbconfig/20220901-051701-ladsgroup.json
  • 05:16 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 28 hosts with reason: Primary switchover s7 T316111
  • 05:16 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 28 hosts with reason: Primary switchover s7 T316111
  • 01:20 eevans@cumin1001: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching restbase201[3-8].codfw.wmnet: Restart to apply new certificates (T316697) - eevans@cumin1001
  • 00:21 eevans@cumin1001: START - Cookbook sre.cassandra.roll-restart for nodes matching restbase201[3-8].codfw.wmnet: Restart to apply new certificates (T316697) - eevans@cumin1001

Archives

See Server Admin Log/Archives.