You are browsing a read-only backup copy of Wikitech. The live site can be found at wikitech.wikimedia.org

Difference between revisions of "Server Admin Log"

From Wikitech-static
Jump to navigation Jump to search
imported>Stashbot
(mutante: renewing TLS cert for peopleweb.discovery.wmnet, adding *3 hosts)
imported>Stashbot
(legoktm: regenerating pipermail redirects to skip those with duplicate message-ids (T280731))
(48 intermediate revisions by the same user not shown)
Line 1: Line 1:
== 2021-04-26 ==
== 2021-06-17 ==
* 23:28 mutante: renewing TLS cert for peopleweb.discovery.wmnet, adding *3 hosts
* 21:49 legoktm: regenerating pipermail redirects to skip those with duplicate message-ids ([[phab:T280731|T280731]])
* 23:21 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on people1003.eqiad.wmnet with reason: new host
* 18:24 ryankemper: [[phab:T285106|T285106]] [WDQS] `ryankemper@wdqs2001:~$ sudo depool`
* 23:21 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on people1003.eqiad.wmnet with reason: new host
* 18:01 dancy: Deployed latest scap code to beta cluster
* 22:26 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1006.eqiad.wmnet with reason: REIMAGE
* 13:28 ladsgroup@deploy1002: Synchronized php-1.37.0-wmf.9/extensions/Wikibase/client/includes/ClientHooks.php: Backport: [[gerrit:700036{{!}}client: Bring back using the client setting for langlink group (T284854)]] (duration: 00m 58s)
* 22:24 ryankemper@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1006.eqiad.wmnet with reason: REIMAGE
* 13:28 jbond: add prometheus-jmx-exporter to bullseye-wikimedia
* 22:11 ryankemper: [[phab:T280382|T280382]] `sudo -i wmf-auto-reimage-host -p [[phab:T280382|T280382]] --new wdqs1006.eqiad.wmnet` on `ryankemper@cumin1001` tmux session `reimage`
* 12:11 marostegui@cumin1001: dbctl commit (dc=all): 'db1180 (re)pooling @ 100%: Repool db1180 after schema change', diff saved to https://phabricator.wikimedia.org/P16604 and previous config saved to /var/cache/conftool/dbconfig/20210617-121146-root.json
* 21:21 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host people1003.eqiad.wmnet
* 12:01 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 100%: Repool db1096:3316 after schema change', diff saved to https://phabricator.wikimedia.org/P16603 and previous config saved to /var/cache/conftool/dbconfig/20210617-120109-root.json
* 20:48 twentyafterfour: restarting php-fpm on phab1001 to deploy phabricator hotfix {{Gerrit|d238db85b8d8072d99f31805aa4a8a7cf0c09941}}
* 11:56 marostegui@cumin1001: dbctl commit (dc=all): 'db1180 (re)pooling @ 75%: Repool db1180 after schema change', diff saved to https://phabricator.wikimedia.org/P16602 and previous config saved to /var/cache/conftool/dbconfig/20210617-115643-root.json
* 20:35 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm for new host people1003.eqiad.wmnet
* 11:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1144:3315 (re)pooling @ 100%: Repool db1144:3315 after schema change', diff saved to https://phabricator.wikimedia.org/P16601 and previous config saved to /var/cache/conftool/dbconfig/20210617-115319-root.json
* 20:26 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts planet1003.eqiad.wmnet
* 11:46 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 75%: Repool db1096:3316 after schema change', diff saved to https://phabricator.wikimedia.org/P16600 and previous config saved to /var/cache/conftool/dbconfig/20210617-114605-root.json
* 20:15 dzahn@cumin1001: START - Cookbook sre.hosts.decommission for hosts planet1003.eqiad.wmnet
* 11:41 marostegui@cumin1001: dbctl commit (dc=all): 'db1180 (re)pooling @ 50%: Repool db1180 after schema change', diff saved to https://phabricator.wikimedia.org/P16599 and previous config saved to /var/cache/conftool/dbconfig/20210617-114139-root.json
* 19:45 legoktm: uploaded python3-falcon, python3-mimeparse, python3-mujson, openstack-pkg-tools to mailman3 component on apt.wm.o
* 11:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1144:3315 (re)pooling @ 75%: Repool db1144:3315 after schema change', diff saved to https://phabricator.wikimedia.org/P16598 and previous config saved to /var/cache/conftool/dbconfig/20210617-113816-root.json
* 18:51 robh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1003.eqiad.wmnet with reason: REIMAGE
* 11:31 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 50%: Repool db1096:3316 after schema change', diff saved to https://phabricator.wikimedia.org/P16597 and previous config saved to /var/cache/conftool/dbconfig/20210617-113101-root.json
* 18:49 robh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1002.eqiad.wmnet with reason: REIMAGE
* 11:26 marostegui@cumin1001: dbctl commit (dc=all): 'db1180 (re)pooling @ 25%: Repool db1180 after schema change', diff saved to https://phabricator.wikimedia.org/P16596 and previous config saved to /var/cache/conftool/dbconfig/20210617-112635-root.json
* 18:49 robh@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1003.eqiad.wmnet with reason: REIMAGE
* 11:24 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1180', diff saved to https://phabricator.wikimedia.org/P16595 and previous config saved to /var/cache/conftool/dbconfig/20210617-112431-marostegui.json
* 18:47 robh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wcqs1001.eqiad.wmnet with reason: REIMAGE
* 11:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1144:3315 (re)pooling @ 50%: Repool db1144:3315 after schema change', diff saved to https://phabricator.wikimedia.org/P16594 and previous config saved to /var/cache/conftool/dbconfig/20210617-112312-root.json
* 18:47 robh@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1002.eqiad.wmnet with reason: REIMAGE
* 11:15 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 25%: Repool db1096:3316 after schema change', diff saved to https://phabricator.wikimedia.org/P16593 and previous config saved to /var/cache/conftool/dbconfig/20210617-111558-root.json
* 18:45 robh@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on wcqs1001.eqiad.wmnet with reason: REIMAGE
* 11:10 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1096:3316', diff saved to https://phabricator.wikimedia.org/P16592 and previous config saved to /var/cache/conftool/dbconfig/20210617-111026-marostegui.json
* 18:18 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|2d16f6251a67cf13cef02bbdcb3c9f5c1c505d16}}: elwiki: Update Growth experiments configuration ([[phab:T280172|T280172]]) (duration: 00m 58s)
* 11:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1144:3315 (re)pooling @ 25%: Repool db1144:3315 after schema change', diff saved to https://phabricator.wikimedia.org/P16591 and previous config saved to /var/cache/conftool/dbconfig/20210617-110808-root.json
* 18:06 urbanecm@deploy1002: Synchronized multiversion/MWScript.php: {{Gerrit|5ace4e1b806bcfc4ea059f9e9cae9aa94c0bdbd1}}: Fix error message if MWScript.php is run without arguments (duration: 00m 58s)
* 11:06 marostegui@cumin1001: dbctl commit (dc=all): 'db1130 (re)pooling @ 100%: Repool db1130 after schema change', diff saved to https://phabricator.wikimedia.org/P16590 and previous config saved to /var/cache/conftool/dbconfig/20210617-110656-root.json
* 17:28 dduvall@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'blubberoid' for release 'production' .
* 11:02 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1144:3315', diff saved to https://phabricator.wikimedia.org/P16589 and previous config saved to /var/cache/conftool/dbconfig/20210617-110200-marostegui.json
* 17:26 dduvall@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'blubberoid' for release 'production' .
* 10:51 marostegui@cumin1001: dbctl commit (dc=all): 'db1130 (re)pooling @ 75%: Repool db1130 after schema change', diff saved to https://phabricator.wikimedia.org/P16588 and previous config saved to /var/cache/conftool/dbconfig/20210617-105153-root.json
* 17:18 dduvall@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'blubberoid' for release 'staging' .
* 10:36 marostegui@cumin1001: dbctl commit (dc=all): 'db1130 (re)pooling @ 50%: Repool db1130 after schema change', diff saved to https://phabricator.wikimedia.org/P16587 and previous config saved to /var/cache/conftool/dbconfig/20210617-103649-root.json
* 17:06 legoktm: imported postorius_1.3.4-2~bpo10+2 to apt.wm.o
* 10:21 marostegui@cumin1001: dbctl commit (dc=all): 'db1130 (re)pooling @ 25%: Repool db1130 after schema change', diff saved to https://phabricator.wikimedia.org/P16586 and previous config saved to /var/cache/conftool/dbconfig/20210617-102145-root.json
* 16:49 mutante: gerrit - restarted apache (hard) to remove time out from gerrit:682502
* 10:18 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1130', diff saved to https://phabricator.wikimedia.org/P16585 and previous config saved to /var/cache/conftool/dbconfig/20210617-101827-marostegui.json
* 16:40 mutante: gerrit1001 - reload apache2
* 10:04 marostegui@cumin1001: dbctl commit (dc=all): 'db1161 (re)pooling @ 100%: Repool db1161 after schema change', diff saved to https://phabricator.wikimedia.org/P16584 and previous config saved to /var/cache/conftool/dbconfig/20210617-100445-root.json
* 16:36 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc1025.eqiad.wmnet
* 09:49 marostegui@cumin1001: dbctl commit (dc=all): 'db1161 (re)pooling @ 75%: Repool db1161 after schema change', diff saved to https://phabricator.wikimedia.org/P16583 and previous config saved to /var/cache/conftool/dbconfig/20210617-094942-root.json
* 16:30 jiji@cumin1001: START - Cookbook sre.hosts.reboot-single for host mc1025.eqiad.wmnet
* 09:34 marostegui@cumin1001: dbctl commit (dc=all): 'db1161 (re)pooling @ 50%: Repool db1161 after schema change', diff saved to https://phabricator.wikimedia.org/P16582 and previous config saved to /var/cache/conftool/dbconfig/20210617-093438-root.json
* 15:26 jbond@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest1002.eqiad.wmnet with reason: REIMAGE
* 09:20 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 100%: Repool db1110 after schema change', diff saved to https://phabricator.wikimedia.org/P16581 and previous config saved to /var/cache/conftool/dbconfig/20210617-092056-root.json
* 15:24 jbond@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest1002.eqiad.wmnet with reason: REIMAGE
* 09:19 marostegui@cumin1001: dbctl commit (dc=all): 'db1161 (re)pooling @ 25%: Repool db1161 after schema change', diff saved to https://phabricator.wikimedia.org/P16580 and previous config saved to /var/cache/conftool/dbconfig/20210617-091934-root.json
* 15:21 elukey: restart zookeeper on conf2004 to pick up the -javaagent setting for the prometheus exporter
* 09:09 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1161', diff saved to https://phabricator.wikimedia.org/P16579 and previous config saved to /var/cache/conftool/dbconfig/20210617-090947-marostegui.json
* 15:06 moritzm: installing jquery security updates on stretch
* 09:05 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 75%: Repool db1110 after schema change', diff saved to https://phabricator.wikimedia.org/P16578 and previous config saved to /var/cache/conftool/dbconfig/20210617-090552-root.json
* 15:01 hnowlan@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'api-gateway' for release 'production' .
* 08:50 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 50%: Repool db1110 after schema change', diff saved to https://phabricator.wikimedia.org/P16577 and previous config saved to /var/cache/conftool/dbconfig/20210617-085048-root.json
* 15:01 hnowlan@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'api-gateway' for release 'staging' .
* 08:49 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3315 (re)pooling @ 100%: Repool db1096:3315 after schema change', diff saved to https://phabricator.wikimedia.org/P16576 and previous config saved to /var/cache/conftool/dbconfig/20210617-084941-root.json
* 14:54 hnowlan@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'api-gateway' for release 'production' .
* 08:35 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 25%: Repool db1110 after schema change', diff saved to https://phabricator.wikimedia.org/P16575 and previous config saved to /var/cache/conftool/dbconfig/20210617-083545-root.json
* 14:54 hnowlan@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'api-gateway' for release 'staging' .
* 08:34 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3315 (re)pooling @ 75%: Repool db1096:3315 after schema change', diff saved to https://phabricator.wikimedia.org/P16574 and previous config saved to /var/cache/conftool/dbconfig/20210617-083438-root.json
* 14:48 hnowlan@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'staging' .
* 08:30 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1110', diff saved to https://phabricator.wikimedia.org/P16573 and previous config saved to /var/cache/conftool/dbconfig/20210617-083005-marostegui.json
* 14:47 hnowlan@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'api-gateway' for release 'production' .
* 08:29 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1113:3315', diff saved to https://phabricator.wikimedia.org/P16572 and previous config saved to /var/cache/conftool/dbconfig/20210617-082939-marostegui.json
* 14:28 moritzm: installing ldap-replica1003/1004
* 08:28 elukey: upload istioctl 1.6.14-1 to buster-wikimedia
* 14:03 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on conf2001.codfw.wmnet with reason: for zookeeper migration
* 08:24 marostegui@cumin1001: dbctl commit (dc=all): 'db1168 (re)pooling @ 100%: Repool db1168 after schema change', diff saved to https://phabricator.wikimedia.org/P16571 and previous config saved to /var/cache/conftool/dbconfig/20210617-082437-root.json
* 14:03 jayme@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on conf2001.codfw.wmnet with reason: for zookeeper migration
* 08:24 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1113:3315', diff saved to https://phabricator.wikimedia.org/P16570 and previous config saved to /var/cache/conftool/dbconfig/20210617-082409-marostegui.json
* 13:39 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3311 (re)pooling @ 100%: Repool db1105:3311', diff saved to https://phabricator.wikimedia.org/P15537 and previous config saved to /var/cache/conftool/dbconfig/20210426-133922-root.json
* 08:19 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3315 (re)pooling @ 50%: Repool db1096:3315 after schema change', diff saved to https://phabricator.wikimedia.org/P16569 and previous config saved to /var/cache/conftool/dbconfig/20210617-081934-root.json
* 13:39 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 100%: Repool db1105:3312', diff saved to https://phabricator.wikimedia.org/P15536 and previous config saved to /var/cache/conftool/dbconfig/20210426-133905-root.json
* 08:09 marostegui@cumin1001: dbctl commit (dc=all): 'db1168 (re)pooling @ 75%: Repool db1168 after schema change', diff saved to https://phabricator.wikimedia.org/P16568 and previous config saved to /var/cache/conftool/dbconfig/20210617-080933-root.json
* 13:28 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 6 hosts with reason: for zookeeper migration
* 08:04 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3315 (re)pooling @ 25%: Repool db1096:3315 after schema change', diff saved to https://phabricator.wikimedia.org/P16567 and previous config saved to /var/cache/conftool/dbconfig/20210617-080430-root.json
* 13:27 jayme@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on 6 hosts with reason: for zookeeper migration
* 07:58 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1096:3315', diff saved to https://phabricator.wikimedia.org/P16566 and previous config saved to /var/cache/conftool/dbconfig/20210617-075825-marostegui.json
* 13:25 marostegui@cumin1001: dbctl commit (dc=all): 'db1135 (re)pooling @ 100%: Repool db1135', diff saved to https://phabricator.wikimedia.org/P15535 and previous config saved to /var/cache/conftool/dbconfig/20210426-132533-root.json
* 07:54 marostegui@cumin1001: dbctl commit (dc=all): 'db1168 (re)pooling @ 50%: Repool db1168 after schema change', diff saved to https://phabricator.wikimedia.org/P16565 and previous config saved to /var/cache/conftool/dbconfig/20210617-075429-root.json
* 13:24 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3311 (re)pooling @ 75%: Repool db1105:3311', diff saved to https://phabricator.wikimedia.org/P15534 and previous config saved to /var/cache/conftool/dbconfig/20210426-132417-root.json
* 07:39 marostegui@cumin1001: dbctl commit (dc=all): 'db1168 (re)pooling @ 25%: Repool db1168 after schema change', diff saved to https://phabricator.wikimedia.org/P16564 and previous config saved to /var/cache/conftool/dbconfig/20210617-073926-root.json
* 13:24 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 75%: Repool db1105:3312', diff saved to https://phabricator.wikimedia.org/P15533 and previous config saved to /var/cache/conftool/dbconfig/20210426-132402-root.json
* 07:33 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1168', diff saved to https://phabricator.wikimedia.org/P16563 and previous config saved to /var/cache/conftool/dbconfig/20210617-073305-marostegui.json
* 13:14 moritzm: installing ldap-replica2005/2006
* 07:32 marostegui@cumin1001: dbctl commit (dc=all): 'db1098:3316 (re)pooling @ 100%: Repool db1098:3316 after schema change', diff saved to https://phabricator.wikimedia.org/P16562 and previous config saved to /var/cache/conftool/dbconfig/20210617-073229-root.json
* 13:10 marostegui@cumin1001: dbctl commit (dc=all): 'db1135 (re)pooling @ 75%: Repool db1135', diff saved to https://phabricator.wikimedia.org/P15532 and previous config saved to /var/cache/conftool/dbconfig/20210426-131029-root.json
* 07:17 marostegui@cumin1001: dbctl commit (dc=all): 'db1098:3316 (re)pooling @ 75%: Repool db1098:3316 after schema change', diff saved to https://phabricator.wikimedia.org/P16561 and previous config saved to /var/cache/conftool/dbconfig/20210617-071726-root.json
* 13:09 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3311 (re)pooling @ 50%: Repool db1105:3311', diff saved to https://phabricator.wikimedia.org/P15531 and previous config saved to /var/cache/conftool/dbconfig/20210426-130913-root.json
* 07:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1098:3316 (re)pooling @ 50%: Repool db1098:3316 after schema change', diff saved to https://phabricator.wikimedia.org/P16560 and previous config saved to /var/cache/conftool/dbconfig/20210617-070222-root.json
* 13:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 50%: Repool db1105:3312', diff saved to https://phabricator.wikimedia.org/P15530 and previous config saved to /var/cache/conftool/dbconfig/20210426-130858-root.json
* 06:47 marostegui@cumin1001: dbctl commit (dc=all): 'db1098:3316 (re)pooling @ 25%: Repool db1098:3316 after schema change', diff saved to https://phabricator.wikimedia.org/P16559 and previous config saved to /var/cache/conftool/dbconfig/20210617-064717-root.json
* 12:57 moritzm: installing gst-plugins-base1.0 security updates
* 06:31 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1098:3316', diff saved to https://phabricator.wikimedia.org/P16558 and previous config saved to /var/cache/conftool/dbconfig/20210617-063135-marostegui.json
* 12:55 marostegui@cumin1001: dbctl commit (dc=all): 'db1135 (re)pooling @ 50%: Repool db1135', diff saved to https://phabricator.wikimedia.org/P15529 and previous config saved to /var/cache/conftool/dbconfig/20210426-125526-root.json
* 06:25 marostegui@cumin1001: dbctl commit (dc=all): 'db1165 (re)pooling @ 100%: Repool db1165 after schema change', diff saved to https://phabricator.wikimedia.org/P16557 and previous config saved to /var/cache/conftool/dbconfig/20210617-062514-root.json
* 12:54 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3311 (re)pooling @ 25%: Repool db1105:3311', diff saved to https://phabricator.wikimedia.org/P15528 and previous config saved to /var/cache/conftool/dbconfig/20210426-125409-root.json
* 06:10 marostegui@cumin1001: dbctl commit (dc=all): 'db1165 (re)pooling @ 75%: Repool db1165 after schema change', diff saved to https://phabricator.wikimedia.org/P16556 and previous config saved to /var/cache/conftool/dbconfig/20210617-061010-root.json
* 12:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 25%: Repool db1105:3312', diff saved to https://phabricator.wikimedia.org/P15527 and previous config saved to /var/cache/conftool/dbconfig/20210426-125354-root.json
* 05:55 marostegui@cumin1001: dbctl commit (dc=all): 'db1165 (re)pooling @ 50%: Repool db1165 after schema change', diff saved to https://phabricator.wikimedia.org/P16555 and previous config saved to /var/cache/conftool/dbconfig/20210617-055507-root.json
* 12:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1105 for kernel upgrade', diff saved to https://phabricator.wikimedia.org/P15526 and previous config saved to /var/cache/conftool/dbconfig/20210426-124141-marostegui.json
* 05:40 marostegui@cumin1001: dbctl commit (dc=all): 'db1165 (re)pooling @ 25%: Repool db1165 after schema change', diff saved to https://phabricator.wikimedia.org/P16554 and previous config saved to /var/cache/conftool/dbconfig/20210617-054003-root.json
* 12:40 marostegui@cumin1001: dbctl commit (dc=all): 'db1135 (re)pooling @ 25%: Repool db1135', diff saved to https://phabricator.wikimedia.org/P15525 and previous config saved to /var/cache/conftool/dbconfig/20210426-124022-root.json
* 05:34 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1165', diff saved to https://phabricator.wikimedia.org/P16553 and previous config saved to /var/cache/conftool/dbconfig/20210617-053455-marostegui.json
* 12:30 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1135 for kernel upgrade', diff saved to https://phabricator.wikimedia.org/P15524 and previous config saved to /var/cache/conftool/dbconfig/20210426-123020-marostegui.json
* 05:31 marostegui@cumin1001: dbctl commit (dc=all): 'db1180 (re)pooling @ 100%: Repool db1180 after schema change', diff saved to https://phabricator.wikimedia.org/P16552 and previous config saved to /var/cache/conftool/dbconfig/20210617-053105-root.json
* 12:28 akosiaris@cumin1001: conftool action : set/pooled=yes; selector: cluster=jobrunner,service=nginx,name=mw1338.eqiad.wmnet
* 05:16 marostegui@cumin1001: dbctl commit (dc=all): 'db1180 (re)pooling @ 75%: Repool db1180 after schema change', diff saved to https://phabricator.wikimedia.org/P16551 and previous config saved to /var/cache/conftool/dbconfig/20210617-051601-root.json
* 12:27 akosiaris@cumin1001: conftool action : set/pooled=yes; selector: cluster=jobrunner,service=nginx,name=mw1338.eqiad.wmnet
* 05:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1180 (re)pooling @ 50%: Repool db1180 after schema change', diff saved to https://phabricator.wikimedia.org/P16550 and previous config saved to /var/cache/conftool/dbconfig/20210617-050057-root.json
* 12:24 Amir1: cleaning watchlist of QuickStatementsBot in wikidatawiki
* 04:45 marostegui@cumin1001: dbctl commit (dc=all): 'db1180 (re)pooling @ 25%: Repool db1180 after schema change', diff saved to https://phabricator.wikimedia.org/P16549 and previous config saved to /var/cache/conftool/dbconfig/20210617-044554-root.json
* 12:06 akosiaris@cumin1001: conftool action : set/pooled=yes; selector: cluster=jobrunner,service=nginx,name=mw1338.eqiad.wmnet
* 04:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1180', diff saved to https://phabricator.wikimedia.org/P16548 and previous config saved to /var/cache/conftool/dbconfig/20210617-044146-marostegui.json
* 12:05 akosiaris@cumin1001: conftool action : set/pooled=yes; selector: cluster=jobrunner,service=nginx,name=mw1338.eqiad.wmnet
* 04:41 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1113:3316', diff saved to https://phabricator.wikimedia.org/P16547 and previous config saved to /var/cache/conftool/dbconfig/20210617-044132-marostegui.json
* 12:00 marostegui@deploy1002: Synchronized wmf-config/db-eqiad.php: Enable writes on es4 [[phab:T279281|T279281]] (duration: 00m 56s)
* 04:31 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1113:3316', diff saved to https://phabricator.wikimedia.org/P16546 and previous config saved to /var/cache/conftool/dbconfig/20210617-043130-marostegui.json
* 11:57 marostegui: Restart es4 primary master - [[phab:T279281|T279281]]
* 11:55 marostegui@deploy1002: Synchronized wmf-config/db-eqiad.php: Disable writes on es4 [[phab:T279281|T279281]] (duration: 00m 56s)
* 11:51 aborrero@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:49 hashar@deploy1002: Finished deploy [integration/docroot@c2e48c9]: doc: Explain that VE is both stand-alone and integrated into MediaWiki (duration: 00m 13s)
* 11:49 hashar@deploy1002: Started deploy [integration/docroot@c2e48c9]: doc: Explain that VE is both stand-alone and integrated into MediaWiki
* 11:46 Urbanecm: EU B&C done
* 11:45 urbanecm@deploy1002: Synchronized php-1.37.0-wmf.1/extensions/TemplateData/modules/ext.templateDataGenerator.editTemplatePage/Dialog.js: {{Gerrit|a347517f906b07b2503ae559c6cc714e1c50e4aa}}: Fix suggested values not being shown when the params type isnt specified ([[phab:T280688|T280688]]) (duration: 00m 57s)
* 11:31 hoo@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:681137{{!}}Revert "Set wgPageImagesAPIDefaultLicense to 'any' for wikidata"]] (duration: 00m 57s)
* 11:30 aborrero@cumin1001: START - Cookbook sre.dns.netbox
* 11:17 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|2b5b640ad28bce1df20c2ca82654996d9cfc7630}}: Enable ContentTranslation as a default tool for 11 Wikipedias ([[phab:T279422|T279422]]) (duration: 00m 57s)
* 10:58 effie: restarting php-fpm in mw* clusters in codfw to pick up php7.2 update
* 10:46 jdrewniak@deploy1002: Synchronized portals: Wikimedia Portals Update: [[gerrit:682575{{!}} Bumping portals to master (T128546)]] (duration: 00m 57s)
* 10:45 jdrewniak@deploy1002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:682575{{!}} Bumping portals to master (T128546)]] (duration: 00m 57s)
* 10:38 jmm@cumin2001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host ldap-replica1004.wikimedia.org
* 10:37 reedy@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Setup wmgUseFooterCodeOfConductLink for later usage (duration: 00m 57s)
* 10:31 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1125.eqiad.wmnet with reason: REIMAGE
* 10:29 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1125.eqiad.wmnet with reason: REIMAGE
* 10:26 effie: upgrading mw* servers  php7.2  in codfw
* 10:25 marostegui: Deploy schema change on s4 codfw, lag will appear [[phab:T276292|T276292]]
* 10:24 reedy@deploy1002: Synchronized wmf-config/CommonSettings.php: Use wmgUseFooterTechCodeOfConductLink instead of wmgUseFooterCodeOfConductLink (duration: 00m 57s)
* 10:24 jmm@cumin2001: START - Cookbook sre.ganeti.makevm for new host ldap-replica1004.wikimedia.org
* 10:22 reedy@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Add wmgUseFooterTechCodeOfConductLink (duration: 00m 59s)
* 10:22 jmm@cumin2001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host ldap-replica1003.wikimedia.org
* 10:18 moritzm: installing systemd updates from buster 10.9 point release
* 10:07 jmm@cumin2001: START - Cookbook sre.ganeti.makevm for new host ldap-replica1003.wikimedia.org
* 10:00 filippo@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=swift,name=eqiad
* 09:53 jmm@cumin2001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host ldap-replica2006.wikimedia.org
* 09:42 moritzm: installing clamav security updates on otrs1001
* 09:38 godog: reboot ms-be1062, kernel backtrace saved
* 09:26 filippo@cumin1001: conftool action : set/pooled=false; selector: dnsdisc=swift,name=eqiad
* 09:26 jmm@cumin2001: START - Cookbook sre.ganeti.makevm for new host ldap-replica2006.wikimedia.org
* 09:24 jmm@cumin2001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host ldap-replica2005.wikimedia.org
* 09:15 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on conf2005.codfw.wmnet with reason: for initial etcd replication
* 09:15 jayme@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on conf2005.codfw.wmnet with reason: for initial etcd replication
* 09:13 jayme: imported etcd-mirror_0.0.6-2 to buster-wikimedia
* 09:10 jmm@cumin2001: START - Cookbook sre.ganeti.makevm for new host ldap-replica2005.wikimedia.org
* 09:07 jmm@cumin2001: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host ldap-replica2005failoid1002.wikimedia.org
* 09:04 jayme: imported etcd-mirror_0.0.6-1 to buster-wikimedia
* 08:55 jmm@cumin2001: START - Cookbook sre.ganeti.makevm for new host ldap-replica2005failoid1002.wikimedia.org
* 08:49 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: NOOP: {{Gerrit|f01a6dab70f74938dd51668809a181a8f551b6c8}}: GrowthExperiments: Enable community configuration on testwiki ([[phab:T274520|T274520]]) (duration: 00m 57s)
* 08:42 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: NOOP: {{Gerrit|88da8226823e59d1d19db9aeca3b5a5140c0c60c}}: GrowthExperiments: Do not enable community configuration outside of beta wikis ([[phab:T274520|T274520]]) (duration: 00m 59s)
* 08:28 moritzm: update debmonitor to 0.2.9 on remaining hosts [[phab:T281090|T281090]]
* 08:13 moritzm: installing lxml security updates on stretch
* 07:54 jayme@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on conf2005.codfw.wmnet with reason: for initial etcd replication
* 07:54 jayme@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on conf2005.codfw.wmnet with reason: for initial etcd replication
* 07:53 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on thanos-fe1001.eqiad.wmnet with reason: REIMAGE
* 07:51 filippo@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on thanos-fe1001.eqiad.wmnet with reason: REIMAGE
* 07:32 godog: swift eqiad-prod: less weight for ms-be[1019-1026] / more weight to ms-be106[0-3] - [[phab:T272836|T272836]]
* 07:24 moritzm: installing pear security updates
* 07:09 moritzm: removed rawdog from bullseye-wikimedia, needs Py2 [[phab:T280989|T280989]]
* 06:24 elukey: reboot an-coord1001 to pick up kernel security settings (after reimage)
* 05:47 marostegui@cumin1001: dbctl commit (dc=all): 'Add db1158 to dbctl, depooled, [[phab:T258361|T258361]]', diff saved to https://phabricator.wikimedia.org/P15521 and previous config saved to /var/cache/conftool/dbconfig/20210426-054700-marostegui.json
* 05:32 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1124.eqiad.wmnet with reason: REIMAGE
* 05:30 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1124.eqiad.wmnet with reason: REIMAGE
* 03:43 kart_: Updated cxserver to 2021-04-21-044024-production ([[phab:T279045|T279045]])
* 03:41 kartik@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'cxserver' for release 'production' .
* 03:37 kartik@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'cxserver' for release 'production' .
* 03:32 kartik@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'cxserver' for release 'staging' .


== 2021-04-25 ==
== 2021-06-16 ==
* 15:23 Amir1: sudo -u list /var/lib/mailman/bin/change_pw -l wikica-l -p $(pwgen -c1 -s 12) ([[phab:T281066|T281066]])
* 21:35 legoktm@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'shellbox' for release 'main' .
* 21:32 legoktm@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'shellbox' for release 'main' .
* 17:41 dancy: Reverted Scap release on beta
* 16:18 topranks: Resetting metric on Telia CCT IC-331929, cr1-codfw and cr3-eqsin.
* 15:22 dancy: testing upcoming Scap release on beta
* 12:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1131 (re)pooling @ 100%: Repool db1131 after schema change', diff saved to https://phabricator.wikimedia.org/P16545 and previous config saved to /var/cache/conftool/dbconfig/20210616-125329-root.json
* 12:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1131 (re)pooling @ 75%: Repool db1131 after schema change', diff saved to https://phabricator.wikimedia.org/P16544 and previous config saved to /var/cache/conftool/dbconfig/20210616-123826-root.json
* 12:34 kormat: deploying heartbeat service puppet change
* 12:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1131 (re)pooling @ 50%: Repool db1131 after schema change', diff saved to https://phabricator.wikimedia.org/P16543 and previous config saved to /var/cache/conftool/dbconfig/20210616-122322-root.json
* 12:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1131 (re)pooling @ 25%: Repool db1131 after schema change', diff saved to https://phabricator.wikimedia.org/P16541 and previous config saved to /var/cache/conftool/dbconfig/20210616-120818-root.json
* 12:01 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on maps1007.eqiad.wmnet with reason: Reparenting from maps1009
* 12:00 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 3:00:00 on maps1007.eqiad.wmnet with reason: Reparenting from maps1009
* 12:00 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1131', diff saved to https://phabricator.wikimedia.org/P16540 and previous config saved to /var/cache/conftool/dbconfig/20210616-120015-marostegui.json
* 11:21 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 100%: Repool db1096:3316 after schema change', diff saved to https://phabricator.wikimedia.org/P16539 and previous config saved to /var/cache/conftool/dbconfig/20210616-112115-root.json
* 11:20 hnowlan: running `nodetool cleanup` on maps1005
* 11:06 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 75%: Repool db1096:3316 after schema change', diff saved to https://phabricator.wikimedia.org/P16538 and previous config saved to /var/cache/conftool/dbconfig/20210616-110612-root.json
* 10:51 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 50%: Repool db1096:3316 after schema change', diff saved to https://phabricator.wikimedia.org/P16537 and previous config saved to /var/cache/conftool/dbconfig/20210616-105108-root.json
* 10:36 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on maps1007.eqiad.wmnet with reason: REIMAGE
* 10:36 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 25%: Repool db1096:3316 after schema change', diff saved to https://phabricator.wikimedia.org/P16536 and previous config saved to /var/cache/conftool/dbconfig/20210616-103604-root.json
* 10:34 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on maps1007.eqiad.wmnet with reason: REIMAGE
* 10:23 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1096:3316', diff saved to https://phabricator.wikimedia.org/P16535 and previous config saved to /var/cache/conftool/dbconfig/20210616-102349-marostegui.json
* 09:52 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps1007.eqiad.wmnet
* 09:51 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on maps1007.eqiad.wmnet with reason: Reparenting from maps1009
* 09:51 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on maps1007.eqiad.wmnet with reason: Reparenting from maps1009
* 09:50 hnowlan: disabling puppet on maps1* to reparent maps1007 from new master maps1009
* 09:47 kormat: truncating all pc* tables on pc1010 [[phab:T282761|T282761]]
* 09:40 kormat@deploy1002: Synchronized wmf-config/db-eqiad.php: Repool pc1009 as pc3 primary [[phab:T282761|T282761]] (duration: 00m 59s)
* 09:04 kormat: Deploying wmfmariadbpy 0.7.1 [[phab:T284819|T284819]]
* 09:04 kormat: uploaded wmfmariadbpy 0.7.1 to apt.wm.o
* 08:24 Amir1: running "update flaggedrevs set fr_quality = 0 where fr_quality != 0;" on all wikis where flagged revs is enabled ([[phab:T279761|T279761]])
* 07:27 dcausse: cleanup old /var/log/airflow/scheduler logs to reclaim space on an-airflow1001
* 06:55 volans@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:52 volans@cumin1001: START - Cookbook sre.dns.netbox
* 05:06 marostegui: Upgrade clouddb1014


== 2021-04-24 ==
== 2021-06-15 ==
* 22:24 bstorm: Rebooting labstore1007 from ilo after crash
* 17:54 dancy: testing upcoming Scap release on beta
* 17:21 mutante: new Wikimedia language "shi" added - Shilha /ˈʃɪlhə/ is a Berber language native to Shilha people. The endonym is Taclḥit /taʃlʜijt/, and in recent English publications the language is often rendered Tashelhiyt or Tashelhit.
* 17:17 mutante: new Wikimedia language "dag" added - Dagbani (or Dagbane), also known as Dagbanli and Dagbanle, is a Gur language spoken in Ghana.
* 17:11 razzi@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-master1002.eqiad.wmnet with reason: REIMAGE
* 17:09 razzi@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on an-master1002.eqiad.wmnet with reason: REIMAGE
* 16:11 razzi@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 60 days, 0:00:00 on an-master1002.eqiad.wmnet with reason: Update operating system to bullseye
* 16:11 razzi@cumin1001: START - Cookbook sre.hosts.downtime for 60 days, 0:00:00 on an-master1002.eqiad.wmnet with reason: Update operating system to bullseye
* 14:55 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:51 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 14:25 XioNoX: re-enable cr1-codfw:xe-5/1/2
* 13:23 marostegui: Upgrade clouddb1018
* 13:15 effie: enable puppet on canaries
* 13:10 effie: disable puppet on canaries to deploy 699908
* 10:45 XioNoX: re-enable cr1-codfw:xe-5/1/2
* 09:42 XioNoX: cr1-codfw# set interfaces xe-5/1/2 disable
* 09:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db2080', diff saved to https://phabricator.wikimedia.org/P16533 and previous config saved to /var/cache/conftool/dbconfig/20210615-092511-marostegui.json
* 09:24 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db2086:3318, db2082', diff saved to https://phabricator.wikimedia.org/P16532 and previous config saved to /var/cache/conftool/dbconfig/20210615-092409-marostegui.json
* 09:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2086:3318', diff saved to https://phabricator.wikimedia.org/P16531 and previous config saved to /var/cache/conftool/dbconfig/20210615-090802-marostegui.json
* 09:06 marostegui@cumin1001: dbctl commit (dc=all): 'Pool db2083', diff saved to https://phabricator.wikimedia.org/P16530 and previous config saved to /var/cache/conftool/dbconfig/20210615-090650-marostegui.json
* 09:02 marostegui@cumin1001: dbctl commit (dc=all): 'Pool db2084', diff saved to https://phabricator.wikimedia.org/P16529 and previous config saved to /var/cache/conftool/dbconfig/20210615-090243-marostegui.json
* 09:02 marostegui@cumin1001: dbctl commit (dc=all): 'Pool db2081', diff saved to https://phabricator.wikimedia.org/P16528 and previous config saved to /var/cache/conftool/dbconfig/20210615-090206-marostegui.json
* 08:59 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2082', diff saved to https://phabricator.wikimedia.org/P16527 and previous config saved to /var/cache/conftool/dbconfig/20210615-085953-marostegui.json
* 08:59 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db2091', diff saved to https://phabricator.wikimedia.org/P16526 and previous config saved to /var/cache/conftool/dbconfig/20210615-085938-marostegui.json
* 08:32 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2080 db2083 db2084 db2091', diff saved to https://phabricator.wikimedia.org/P16525 and previous config saved to /var/cache/conftool/dbconfig/20210615-083233-marostegui.json
* 08:28 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2081', diff saved to https://phabricator.wikimedia.org/P16524 and previous config saved to /var/cache/conftool/dbconfig/20210615-082857-marostegui.json
* 06:10 XioNoX: roll OSPF link-protection to all routers - [[phab:T167306|T167306]]
* 02:30 eileen: civicrm revision changed from {{Gerrit|d9d61dad0b}} to {{Gerrit|acbcce94a2}}, config revision is {{Gerrit|2aed6ff89b}}
* 01:22 eileen: civicrm revision changed from {{Gerrit|28ace1b86f}} to {{Gerrit|d9d61dad0b}}, config revision is {{Gerrit|2aed6ff89b}}
* 00:37 eileen: civicrm revision changed from {{Gerrit|31d07115a0}} to {{Gerrit|28ace1b86f}}, config revision is {{Gerrit|2aed6ff89b}}


== 2021-04-23 ==
== 2021-06-14 ==
* 21:36 foks: removing 1 file for legal compliance
* 21:40 ebernhardson@deploy1002: Finished deploy [search/mjolnir/deploy@baeee47]: [[phab:T261407|T261407]] bulk_daemon: Deploy prioritized topics (duration: 00m 49s)
* 20:15 mutante: [apt1001:~] $ sudo -i reprepro -C main includedeb bullseye-wikimedia /home/dzahn/rawdog_2.23-2_all.deb ([[phab:T280989|T280989]])
* 21:40 ebernhardson@deploy1002: Started deploy [search/mjolnir/deploy@baeee47]: [[phab:T261407|T261407]] bulk_daemon: Deploy prioritized topics
* 19:41 mutante: [apt1001:~] $ sudo -i reprepro copy bullseye-wikimedia buster-wikimedia envoyproxy - copy envoy package from buster to bullseye [[phab:T280989|T280989]]
* 19:27 razzi@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host an-airflow1003.eqiad.wmnet
* 19:09 ebernhardson: closing duplicate/wrong cluster indices in cloudelastic
* 19:21 twentyafterfour_: applying hotfix for [[phab:T284397|T284397]] and restarting php7.3-fpm on phab1001
* 17:02 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp1087.eqiad.wmnet
* 18:30 razzi@cumin1001: START - Cookbook sre.ganeti.makevm for new host an-airflow1003.eqiad.wmnet
* 16:35 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:05 jforrester@deploy1002: Finished deploy [integration/docroot@22061b6]: Actually add mediawiki/tools/api-testing JSDoc to doc.wikimedia for [[phab:T236915|T236915]] (duration: 00m 07s)
* 16:32 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 17:05 jforrester@deploy1002: Started deploy [integration/docroot@22061b6]: Actually add mediawiki/tools/api-testing JSDoc to doc.wikimedia for [[phab:T236915|T236915]]
* 16:24 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:46 jforrester@deploy1002: Finished deploy [integration/docroot@ca7af97]: Add mediawiki/tools/api-testing JSDoc to doc.wikimedia for [[phab:T236915|T236915]] (duration: 00m 07s)
* 16:19 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 16:46 jforrester@deploy1002: Started deploy [integration/docroot@ca7af97]: Add mediawiki/tools/api-testing JSDoc to doc.wikimedia for [[phab:T236915|T236915]]
* 14:59 jbond@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on theemin.codfw.wmnet with reason: REIMAGE
* 15:56 razzi@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host an-airflow1002.eqiad.wmnet
* 14:59 jbond@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on theemin.codfw.wmnet with reason: REIMAGE
* 15:52 marostegui@cumin1001: dbctl commit (dc=all): 'db1142 (re)pooling @ 100%: Repool db1142 after upgrade', diff saved to https://phabricator.wikimedia.org/P16521 and previous config saved to /var/cache/conftool/dbconfig/20210614-155258-root.json
* 14:25 moritzm: revert back bullseye image to daily build from last week (to rule out potential reimage issue)
* 15:37 marostegui@cumin1001: dbctl commit (dc=all): 'db1142 (re)pooling @ 75%: Repool db1142 after upgrade', diff saved to https://phabricator.wikimedia.org/P16520 and previous config saved to /var/cache/conftool/dbconfig/20210614-153754-root.json
* 13:33 elukey: roll restart of all thanos-swift proxies to pick up new ML account - [[phab:T280773|T280773]]
* 15:24 otto@cumin1001: END (PASS) - Cookbook sre.kafka.roll-restart-brokers (exit_code=0)
* 12:50 jbond42: upload new debmonitor-client packages
* 15:22 marostegui@cumin1001: dbctl commit (dc=all): 'db1142 (re)pooling @ 50%: Repool db1142 after upgrade', diff saved to https://phabricator.wikimedia.org/P16519 and previous config saved to /var/cache/conftool/dbconfig/20210614-152250-root.json
* 11:50 moritzm: installing perf updates from Buster 10.9 point release
* 15:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1005.eqiad.wmnet
* 10:06 moritzm: installing Linux 4.19.181 updates from Buster 10.9 point release (no reboots, just updating the packages)
* 15:07 marostegui@cumin1001: dbctl commit (dc=all): 'db1142 (re)pooling @ 25%: Repool db1142 after upgrade', diff saved to https://phabricator.wikimedia.org/P16518 and previous config saved to /var/cache/conftool/dbconfig/20210614-150747-root.json
* 09:54 moritzm: installing xen security updates
* 15:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1005.eqiad.wmnet
* 09:49 moritzm: installing xorg-server security updates
* 15:04 razzi@cumin1001: START - Cookbook sre.ganeti.makevm for new host an-airflow1002.eqiad.wmnet
* 09:37 marostegui@cumin1001: dbctl commit (dc=all): 'db1079 (re)pooling @ 100%: Repool db1079', diff saved to https://phabricator.wikimedia.org/P15512 and previous config saved to /var/cache/conftool/dbconfig/20210423-093723-root.json
* 15:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1004.eqiad.wmnet
* 09:22 marostegui@cumin1001: dbctl commit (dc=all): 'db1079 (re)pooling @ 75%: Repool db1079', diff saved to https://phabricator.wikimedia.org/P15511 and previous config saved to /var/cache/conftool/dbconfig/20210423-092220-root.json
* 14:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1004.eqiad.wmnet
* 09:07 marostegui@cumin1001: dbctl commit (dc=all): 'db1079 (re)pooling @ 50%: Repool db1079', diff saved to https://phabricator.wikimedia.org/P15510 and previous config saved to /var/cache/conftool/dbconfig/20210423-090716-root.json
* 14:52 marostegui@cumin1001: dbctl commit (dc=all): 'db1142 (re)pooling @ 10%: Repool db1142 after upgrade', diff saved to https://phabricator.wikimedia.org/P16517 and previous config saved to /var/cache/conftool/dbconfig/20210614-145243-root.json
* 08:52 marostegui@cumin1001: dbctl commit (dc=all): 'db1079 (re)pooling @ 25%: Repool db1079', diff saved to https://phabricator.wikimedia.org/P15509 and previous config saved to /var/cache/conftool/dbconfig/20210423-085212-root.json
* 14:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1003.eqiad.wmnet
* 08:27 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1020.eqiad.wmnet
* 14:50 marostegui@cumin1001: dbctl commit (dc=all): 'db1147 (re)pooling @ 100%: Repool db1147 after upgrade', diff saved to https://phabricator.wikimedia.org/P16516 and previous config saved to /var/cache/conftool/dbconfig/20210614-145039-root.json
* 08:21 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host ms-be1020.eqiad.wmnet
* 14:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1003.eqiad.wmnet
* 08:19 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1021.eqiad.wmnet
* 14:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1142 for kernel upgrade', diff saved to https://phabricator.wikimedia.org/P16515 and previous config saved to /var/cache/conftool/dbconfig/20210614-144130-marostegui.json
* 08:13 moritzm: upgrading d-i image for bullseye to RC1 release [[phab:T275873|T275873]]
* 14:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1002.eqiad.wmnet
* 08:12 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host ms-be1021.eqiad.wmnet
* 14:35 marostegui@cumin1001: dbctl commit (dc=all): 'db1147 (re)pooling @ 75%: Repool db1147 after upgrade', diff saved to https://phabricator.wikimedia.org/P16514 and previous config saved to /var/cache/conftool/dbconfig/20210614-143536-root.json
* 08:12 moritzm: upgrading d-i image for bullseye to RC1 release
* 14:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1002.eqiad.wmnet
* 08:12 filippo@cumin1001: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ms-be1019.eqiad.wmnet
* 14:32 marostegui@cumin1001: dbctl commit (dc=all): 'db1170:3317 (re)pooling @ 100%: Repool db1170:3317 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P16513 and previous config saved to /var/cache/conftool/dbconfig/20210614-143224-root.json
* 07:59 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host ms-be1019.eqiad.wmnet
* 14:32 marostegui@cumin1001: dbctl commit (dc=all): 'db1170:3312 (re)pooling @ 100%: Repool db1170:3312 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P16512 and previous config saved to /var/cache/conftool/dbconfig/20210614-143211-root.json
* 07:56 jynus: deleting db1156 s2 database and reloading it from logical backups [[phab:T280492|T280492]]
* 14:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps1001.eqiad.wmnet
* 07:22 Amir1: removing junk bounced email addresses from yahoo from all mailing lists
* 14:27 otto@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Migrate CentralNotice<nowiki>{</nowiki>BannerHistory,Impression<nowiki>}</nowiki> to EventGate on all wikis - [[phab:T271168|T271168]] (duration: 00m 57s)
* 05:40 marostegui: Stop db1079 to clone db1158 (lag will appear on s7 on wiki replicas)
* 14:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps1001.eqiad.wmnet
* 05:39 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1079 to clone db1158 [[phab:T258361|T258361]]', diff saved to https://phabricator.wikimedia.org/P15506 and previous config saved to /var/cache/conftool/dbconfig/20210423-053907-marostegui.json
* 14:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2007.codfw.wmnet
* 14:20 marostegui@cumin1001: dbctl commit (dc=all): 'db1147 (re)pooling @ 50%: Repool db1147 after upgrade', diff saved to https://phabricator.wikimedia.org/P16511 and previous config saved to /var/cache/conftool/dbconfig/20210614-142032-root.json
* 14:20 marostegui@cumin1001: dbctl commit (dc=all): 'es1032 (re)pooling @ 100%: Repool es1032 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P16510 and previous config saved to /var/cache/conftool/dbconfig/20210614-142014-root.json
* 14:17 marostegui@cumin1001: dbctl commit (dc=all): 'db1170:3317 (re)pooling @ 75%: Repool db1170:3317 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P16509 and previous config saved to /var/cache/conftool/dbconfig/20210614-141720-root.json
* 14:17 marostegui@cumin1001: dbctl commit (dc=all): 'db1170:3312 (re)pooling @ 75%: Repool db1170:3312 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P16508 and previous config saved to /var/cache/conftool/dbconfig/20210614-141707-root.json
* 14:17 otto@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Migrate CentralNotice<nowiki>{</nowiki>BannerHistory,Impression<nowiki>}</nowiki> to EventGate on testwiki - [[phab:T271168|T271168]] (duration: 00m 57s)
* 14:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2007.codfw.wmnet
* 14:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2006.codfw.wmnet
* 14:05 marostegui@cumin1001: dbctl commit (dc=all): 'db1147 (re)pooling @ 25%: Repool db1147 after upgrade', diff saved to https://phabricator.wikimedia.org/P16507 and previous config saved to /var/cache/conftool/dbconfig/20210614-140529-root.json
* 14:05 marostegui@cumin1001: dbctl commit (dc=all): 'es1032 (re)pooling @ 75%: Repool es1032 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P16506 and previous config saved to /var/cache/conftool/dbconfig/20210614-140511-root.json
* 14:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1170:3317 (re)pooling @ 50%: Repool db1170:3317 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P16505 and previous config saved to /var/cache/conftool/dbconfig/20210614-140217-root.json
* 14:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1170:3312 (re)pooling @ 50%: Repool db1170:3312 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P16504 and previous config saved to /var/cache/conftool/dbconfig/20210614-140203-root.json
* 14:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2006.codfw.wmnet
* 13:54 marostegui@cumin1001: dbctl commit (dc=all): 'es1033 (re)pooling @ 100%: Repool es1033 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P16503 and previous config saved to /var/cache/conftool/dbconfig/20210614-135456-root.json
* 13:50 marostegui@cumin1001: dbctl commit (dc=all): 'db1147 (re)pooling @ 10%: Repool db1147 after upgrade', diff saved to https://phabricator.wikimedia.org/P16502 and previous config saved to /var/cache/conftool/dbconfig/20210614-135025-root.json
* 13:50 marostegui@cumin1001: dbctl commit (dc=all): 'es1032 (re)pooling @ 50%: Repool es1032 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P16501 and previous config saved to /var/cache/conftool/dbconfig/20210614-135007-root.json
* 13:47 marostegui@cumin1001: dbctl commit (dc=all): 'db1170:3317 (re)pooling @ 25%: Repool db1170:3317 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P16500 and previous config saved to /var/cache/conftool/dbconfig/20210614-134713-root.json
* 13:47 marostegui@cumin1001: dbctl commit (dc=all): 'db1170:3312 (re)pooling @ 25%: Repool db1170:3312 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P16499 and previous config saved to /var/cache/conftool/dbconfig/20210614-134700-root.json
* 13:43 otto@cumin1001: START - Cookbook sre.kafka.roll-restart-brokers
* 13:39 marostegui@cumin1001: dbctl commit (dc=all): 'es1033 (re)pooling @ 75%: Repool es1033 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P16498 and previous config saved to /var/cache/conftool/dbconfig/20210614-133953-root.json
* 13:38 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1147 for kernel upgrade', diff saved to https://phabricator.wikimedia.org/P16497 and previous config saved to /var/cache/conftool/dbconfig/20210614-133801-marostegui.json
* 13:35 marostegui@cumin1001: dbctl commit (dc=all): 'es1032 (re)pooling @ 25%: Repool es1032 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P16496 and previous config saved to /var/cache/conftool/dbconfig/20210614-133503-root.json
* 13:34 marostegui@cumin1001: dbctl commit (dc=all): 'es1034 (re)pooling @ 100%: Repool es1034 after upgrade', diff saved to https://phabricator.wikimedia.org/P16495 and previous config saved to /var/cache/conftool/dbconfig/20210614-133442-root.json
* 13:32 marostegui@cumin1001: dbctl commit (dc=all): 'db1170:3317 (re)pooling @ 10%: Repool db1170:3317 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P16494 and previous config saved to /var/cache/conftool/dbconfig/20210614-133210-root.json
* 13:31 marostegui@cumin1001: dbctl commit (dc=all): 'db1170:3312 (re)pooling @ 10%: Repool db1170:3312 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P16493 and previous config saved to /var/cache/conftool/dbconfig/20210614-133156-root.json
* 13:29 effie: restart memcached on codfw
* 13:24 marostegui@cumin1001: dbctl commit (dc=all): 'es1033 (re)pooling @ 50%: Repool es1033 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P16492 and previous config saved to /var/cache/conftool/dbconfig/20210614-132449-root.json
* 13:22 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1170:3312 db1170:3317 for kernel upgrade', diff saved to https://phabricator.wikimedia.org/P16491 and previous config saved to /var/cache/conftool/dbconfig/20210614-132235-marostegui.json
* 13:20 marostegui@cumin1001: dbctl commit (dc=all): 'es1032 (re)pooling @ 10%: Repool es1032 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P16490 and previous config saved to /var/cache/conftool/dbconfig/20210614-132000-root.json
* 13:19 marostegui@cumin1001: dbctl commit (dc=all): 'es1034 (re)pooling @ 75%: Repool es1034 after upgrade', diff saved to https://phabricator.wikimedia.org/P16489 and previous config saved to /var/cache/conftool/dbconfig/20210614-131938-root.json
* 13:09 marostegui@cumin1001: dbctl commit (dc=all): 'es1033 (re)pooling @ 25%: Repool es1033 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P16488 and previous config saved to /var/cache/conftool/dbconfig/20210614-130946-root.json
* 13:07 marostegui@cumin1001: dbctl commit (dc=all): 'Depool es1032 for kernel upgrade', diff saved to https://phabricator.wikimedia.org/P16487 and previous config saved to /var/cache/conftool/dbconfig/20210614-130723-marostegui.json
* 13:05 marostegui@cumin1001: dbctl commit (dc=all): 'db1174 (re)pooling @ 100%: Repool db1174 after schema change', diff saved to https://phabricator.wikimedia.org/P16486 and previous config saved to /var/cache/conftool/dbconfig/20210614-130547-root.json
* 13:04 marostegui@cumin1001: dbctl commit (dc=all): 'es1034 (re)pooling @ 50%: Repool es1034 after upgrade', diff saved to https://phabricator.wikimedia.org/P16485 and previous config saved to /var/cache/conftool/dbconfig/20210614-130435-root.json
* 12:54 marostegui@cumin1001: dbctl commit (dc=all): 'es1033 (re)pooling @ 10%: Repool es1033 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P16484 and previous config saved to /var/cache/conftool/dbconfig/20210614-125442-root.json
* 12:50 marostegui@cumin1001: dbctl commit (dc=all): 'db1174 (re)pooling @ 75%: Repool db1174 after schema change', diff saved to https://phabricator.wikimedia.org/P16483 and previous config saved to /var/cache/conftool/dbconfig/20210614-125043-root.json
* 12:49 marostegui@cumin1001: dbctl commit (dc=all): 'es1034 (re)pooling @ 25%: Repool es1034 after upgrade', diff saved to https://phabricator.wikimedia.org/P16482 and previous config saved to /var/cache/conftool/dbconfig/20210614-124931-root.json
* 12:37 XioNoX: configure OSPF link-protection on cr3/4-ulsfo - [[phab:T167306|T167306]]
* 12:35 marostegui@cumin1001: dbctl commit (dc=all): 'db1174 (re)pooling @ 50%: Repool db1174 after schema change', diff saved to https://phabricator.wikimedia.org/P16481 and previous config saved to /var/cache/conftool/dbconfig/20210614-123539-root.json
* 12:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depool es1033 for kernel upgrade', diff saved to https://phabricator.wikimedia.org/P16480 and previous config saved to /var/cache/conftool/dbconfig/20210614-123512-marostegui.json
* 12:34 marostegui@cumin1001: dbctl commit (dc=all): 'es1034 (re)pooling @ 10%: Repool es1034 after upgrade', diff saved to https://phabricator.wikimedia.org/P16479 and previous config saved to /var/cache/conftool/dbconfig/20210614-123427-root.json
* 12:23 marostegui@cumin1001: dbctl commit (dc=all): 'Restore es1028 original weight', diff saved to https://phabricator.wikimedia.org/P16478 and previous config saved to /var/cache/conftool/dbconfig/20210614-122322-marostegui.json
* 12:22 marostegui@cumin1001: dbctl commit (dc=all): 'Give some weight to es1028 while es1034 gets upgraded', diff saved to https://phabricator.wikimedia.org/P16477 and previous config saved to /var/cache/conftool/dbconfig/20210614-122242-marostegui.json
* 12:22 dcausse: re-pooling wdqs1012
* 12:22 marostegui@cumin1001: dbctl commit (dc=all): 'Depool es1034 for kernel upgrade', diff saved to https://phabricator.wikimedia.org/P16476 and previous config saved to /var/cache/conftool/dbconfig/20210614-122212-marostegui.json
* 12:20 marostegui@cumin1001: dbctl commit (dc=all): 'db1174 (re)pooling @ 25%: Repool db1174 after schema change', diff saved to https://phabricator.wikimedia.org/P16475 and previous config saved to /var/cache/conftool/dbconfig/20210614-122036-root.json
* 12:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2005.codfw.wmnet
* 12:17 XioNoX: configure OSPF link-protection on cr3-ulsfo:xe-0/1/1 - [[phab:T167306|T167306]]
* 12:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2005.codfw.wmnet
* 12:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1148', diff saved to https://phabricator.wikimedia.org/P16474 and previous config saved to /var/cache/conftool/dbconfig/20210614-121101-marostegui.json
* 12:10 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1174 for kernel upgrade', diff saved to https://phabricator.wikimedia.org/P16473 and previous config saved to /var/cache/conftool/dbconfig/20210614-121031-marostegui.json
* 12:10 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2004.codfw.wmnet
* 12:03 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2004.codfw.wmnet
* 12:01 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1148 for kernel upgrade', diff saved to https://phabricator.wikimedia.org/P16472 and previous config saved to /var/cache/conftool/dbconfig/20210614-120112-marostegui.json
* 11:28 effie: restart memcached on mc2019
* 11:09 effie: restart memcached on codfw memcached gutter pool (mc-gp2* hosts)
* 10:52 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2003.codfw.wmnet
* 10:52 topranks: [[phab:T283163|T283163]]: Adding "metric-out minimum-igp" to all internal/Confed BGP groups on CR routers.
* 10:46 effie: enable puppet on mc*
* 10:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2003.codfw.wmnet
* 10:39 effie: disable puppet on mc* hosts
* 10:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host maps2001.codfw.wmnet
* 10:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host maps2001.codfw.wmnet
* 10:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1131 (re)pooling @ 100%: Repool db1131 after schema change', diff saved to https://phabricator.wikimedia.org/P16471 and previous config saved to /var/cache/conftool/dbconfig/20210614-101839-root.json
* 10:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1131 (re)pooling @ 75%: Repool db1131 after schema change', diff saved to https://phabricator.wikimedia.org/P16469 and previous config saved to /var/cache/conftool/dbconfig/20210614-100336-root.json
* 09:56 jbond@deploy1002: Finished deploy [netbox/deploy@e9f2382]: deploy v2.10.4-wmf4 (duration: 02m 37s)
* 09:54 jbond@deploy1002: Started deploy [netbox/deploy@e9f2382]: deploy v2.10.4-wmf4
* 09:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1131 (re)pooling @ 50%: Repool db1131 after schema change', diff saved to https://phabricator.wikimedia.org/P16467 and previous config saved to /var/cache/conftool/dbconfig/20210614-094832-root.json
* 09:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1131 (re)pooling @ 25%: Repool db1131 after schema change', diff saved to https://phabricator.wikimedia.org/P16466 and previous config saved to /var/cache/conftool/dbconfig/20210614-093329-root.json
* 09:22 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps1008.eqiad.wmnet
* 09:22 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1131 for schema change', diff saved to https://phabricator.wikimedia.org/P16465 and previous config saved to /var/cache/conftool/dbconfig/20210614-092234-marostegui.json
* 09:21 marostegui@cumin1001: dbctl commit (dc=all): 'db1165 (re)pooling @ 100%: Repool db1165 after schema change', diff saved to https://phabricator.wikimedia.org/P16464 and previous config saved to /var/cache/conftool/dbconfig/20210614-092125-root.json
* 09:06 marostegui@cumin1001: dbctl commit (dc=all): 'db1165 (re)pooling @ 75%: Repool db1165 after schema change', diff saved to https://phabricator.wikimedia.org/P16463 and previous config saved to /var/cache/conftool/dbconfig/20210614-090622-root.json
* 08:51 marostegui@cumin1001: dbctl commit (dc=all): 'db1165 (re)pooling @ 50%: Repool db1165 after schema change', diff saved to https://phabricator.wikimedia.org/P16462 and previous config saved to /var/cache/conftool/dbconfig/20210614-085118-root.json
* 08:36 marostegui@cumin1001: dbctl commit (dc=all): 'db1165 (re)pooling @ 25%: Repool db1165 after schema change', diff saved to https://phabricator.wikimedia.org/P16461 and previous config saved to /var/cache/conftool/dbconfig/20210614-083614-root.json
* 08:12 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1165 for schema change', diff saved to https://phabricator.wikimedia.org/P16460 and previous config saved to /var/cache/conftool/dbconfig/20210614-081239-marostegui.json
* 08:10 marostegui@cumin1001: dbctl commit (dc=all): 'db1168 (re)pooling @ 100%: Repool db1168 after schema change', diff saved to https://phabricator.wikimedia.org/P16459 and previous config saved to /var/cache/conftool/dbconfig/20210614-081031-root.json
* 08:05 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db2148', diff saved to https://phabricator.wikimedia.org/P16458 and previous config saved to /var/cache/conftool/dbconfig/20210614-080552-marostegui.json
* 07:55 marostegui@cumin1001: dbctl commit (dc=all): 'db1168 (re)pooling @ 75%: Repool db1168 after schema change', diff saved to https://phabricator.wikimedia.org/P16456 and previous config saved to /var/cache/conftool/dbconfig/20210614-075528-root.json
* 07:51 marostegui: Depool clouddb1013 to upgrade mysql
* 07:40 marostegui@cumin1001: dbctl commit (dc=all): 'db1168 (re)pooling @ 50%: Repool db1168 after schema change', diff saved to https://phabricator.wikimedia.org/P16455 and previous config saved to /var/cache/conftool/dbconfig/20210614-074024-root.json
* 07:30 marostegui: Reboot db2148 [[phab:T284852|T284852]]
* 07:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2148 [[phab:T284852|T284852]]', diff saved to https://phabricator.wikimedia.org/P16454 and previous config saved to /var/cache/conftool/dbconfig/20210614-072930-marostegui.json
* 07:25 marostegui@cumin1001: dbctl commit (dc=all): 'db1168 (re)pooling @ 25%: Repool db1168 after schema change', diff saved to https://phabricator.wikimedia.org/P16453 and previous config saved to /var/cache/conftool/dbconfig/20210614-072520-root.json
* 07:18 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1168 for schema change', diff saved to https://phabricator.wikimedia.org/P16452 and previous config saved to /var/cache/conftool/dbconfig/20210614-071839-marostegui.json
* 07:17 marostegui@cumin1001: dbctl commit (dc=all): 'db1180 (re)pooling @ 100%: Repool db1180 after schema change', diff saved to https://phabricator.wikimedia.org/P16451 and previous config saved to /var/cache/conftool/dbconfig/20210614-071742-root.json
* 07:15 dcausse: restart blazegraph and depool wdqs1012
* 07:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1180 (re)pooling @ 75%: Repool db1180 after schema change', diff saved to https://phabricator.wikimedia.org/P16450 and previous config saved to /var/cache/conftool/dbconfig/20210614-070238-root.json
* 07:01 moritzm: restarting mw canaries to pick up libwebp security updates
* 06:47 marostegui@cumin1001: dbctl commit (dc=all): 'db1180 (re)pooling @ 50%: Repool db1180 after schema change', diff saved to https://phabricator.wikimedia.org/P16449 and previous config saved to /var/cache/conftool/dbconfig/20210614-064734-root.json
* 06:39 moritzm: installing libwep security updates on buster
* 06:32 marostegui@cumin1001: dbctl commit (dc=all): 'db1180 (re)pooling @ 25%: Repool db1180 after schema change', diff saved to https://phabricator.wikimedia.org/P16448 and previous config saved to /var/cache/conftool/dbconfig/20210614-063231-root.json
* 06:25 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1180 for schema change', diff saved to https://phabricator.wikimedia.org/P16447 and previous config saved to /var/cache/conftool/dbconfig/20210614-062554-marostegui.json
* 06:12 marostegui@cumin1001: dbctl commit (dc=all): 'db1113:3316 (re)pooling @ 100%: Repool db1113:3316 after schema change', diff saved to https://phabricator.wikimedia.org/P16446 and previous config saved to /var/cache/conftool/dbconfig/20210614-061226-root.json
* 06:01 marostegui@cumin1001: dbctl commit (dc=all): 'db1099:3311 (re)pooling @ 100%: Repool db1099:3311 after schema change', diff saved to https://phabricator.wikimedia.org/P16445 and previous config saved to /var/cache/conftool/dbconfig/20210614-060119-root.json
* 05:57 marostegui@cumin1001: dbctl commit (dc=all): 'db1113:3316 (re)pooling @ 75%: Repool db1113:3316 after schema change', diff saved to https://phabricator.wikimedia.org/P16444 and previous config saved to /var/cache/conftool/dbconfig/20210614-055723-root.json
* 05:46 marostegui@cumin1001: dbctl commit (dc=all): 'db1099:3311 (re)pooling @ 75%: Repool db1099:3311 after schema change', diff saved to https://phabricator.wikimedia.org/P16443 and previous config saved to /var/cache/conftool/dbconfig/20210614-054615-root.json
* 05:42 marostegui@cumin1001: dbctl commit (dc=all): 'db1113:3316 (re)pooling @ 50%: Repool db1113:3316 after schema change', diff saved to https://phabricator.wikimedia.org/P16442 and previous config saved to /var/cache/conftool/dbconfig/20210614-054219-root.json
* 05:31 marostegui@cumin1001: dbctl commit (dc=all): 'db1099:3311 (re)pooling @ 50%: Repool db1099:3311 after schema change', diff saved to https://phabricator.wikimedia.org/P16441 and previous config saved to /var/cache/conftool/dbconfig/20210614-053112-root.json
* 05:27 marostegui@cumin1001: dbctl commit (dc=all): 'db1113:3316 (re)pooling @ 25%: Repool db1113:3316 after schema change', diff saved to https://phabricator.wikimedia.org/P16440 and previous config saved to /var/cache/conftool/dbconfig/20210614-052715-root.json
* 05:19 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1113:3316 for schema change', diff saved to https://phabricator.wikimedia.org/P16439 and previous config saved to /var/cache/conftool/dbconfig/20210614-051930-marostegui.json
* 05:16 marostegui@cumin1001: dbctl commit (dc=all): 'db1099:3311 (re)pooling @ 25%: Repool db1099:3311 after schema change', diff saved to https://phabricator.wikimedia.org/P16438 and previous config saved to /var/cache/conftool/dbconfig/20210614-051608-root.json
* 05:15 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1099:3311 for schema change', diff saved to https://phabricator.wikimedia.org/P16437 and previous config saved to /var/cache/conftool/dbconfig/20210614-051522-marostegui.json


== 2021-04-22 ==
== 2021-06-12 ==
* 17:26 marostegui: Stop mysql on tendril/dbtree database
* 13:49 rzl@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on 6 hosts with reason: alert noise, no impact, x2 is unused
* 16:33 volker-e@deploy1002: Finished deploy [design/style-guide@e914e8a]: Deploy design/style-guide: {{Gerrit|e914e8a}} icons: Add 'share' icon (#455) (duration: 00m 06s)
* 13:49 rzl@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on 6 hosts with reason: alert noise, no impact, x2 is unused
* 16:32 volker-e@deploy1002: Started deploy [design/style-guide@e914e8a]: Deploy design/style-guide: {{Gerrit|e914e8a}} icons: Add 'share' icon (#455)
* 13:23 marostegui: Tendril and dbtree are up but on a degraded status (slow reponse)
* 13:19 marostegui: Tendril and dbtree are down at the moment
* 12:46 Urbanecm: Start server-side upload for 2 video files ([[phab:T280763|T280763]], [[phab:T280524|T280524]])
* 12:31 marostegui: Restart mysql on db1115 (tendril/dbtree will fail)
* 04:55 eileen: civicrm revision changed from {{Gerrit|42ca3cf65a}} to {{Gerrit|33a63d5789}}, config revision is {{Gerrit|cf07e7ba0b}}
* 02:47 krinkle@deploy1002: Finished deploy [integration/docroot@010e445]: (no justification provided) (duration: 00m 09s)
* 02:47 krinkle@deploy1002: Started deploy [integration/docroot@010e445]: (no justification provided)
* 01:34 eileen: civicrm revision changed from {{Gerrit|35a8dd33ba}} to {{Gerrit|42ca3cf65a}}, config revision is {{Gerrit|cf07e7ba0b}}
* 00:28 legoktm: legoktm@deneb:/var/cache/pbuilder/aptcache$ sudo rm -rf * # Cleaned up 8GB more
* 00:27 legoktm: legoktm@deneb:/var/cache/apt/archives$ sudo rm -rf * # cleaned up 6GB
* 00:03 legoktm: subscribed all list admins to the listadmins@ mailing list ([[phab:T280716|T280716]])


== 2021-04-21 ==
== 2021-06-11 ==
* 23:58 eileen: tools revision changed from {{Gerrit|3d950fffbd}} to {{Gerrit|c26a8c0cb6}}
* 23:37 mutante: removing firewall hole for mgmt networks to install* because it turned out it cant be used for firmware upgrades
* 23:49 legoktm: made myself and Amir1 list admins for the listadmins@lists.wikimedia.org mailing list
* 22:08 brennen: gitlab.wikimedia.org currently up with recommended config applied; test data deleted; users can register but not create projects. brennen, dancy, and thcipriani currently marked as admins. may need to reset data again, but hopefully not.
* 20:32 robh@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudcephosd1017.eqiad.wmnet
* 21:27 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on pc2014.codfw.wmnet with reason: REIMAGE
* 20:21 robh@cumin1001: START - Cookbook sre.hosts.decommission for hosts cloudcephosd1017.eqiad.wmnet
* 21:25 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on pc2014.codfw.wmnet with reason: REIMAGE
* 20:18 robh@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudcephosd1016.eqiad.wmnet
* 21:01 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on pc2013.codfw.wmnet with reason: REIMAGE
* 20:03 robh@cumin1001: START - Cookbook sre.hosts.decommission for hosts cloudcephosd1016.eqiad.wmnet
* 20:59 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on pc2013.codfw.wmnet with reason: REIMAGE
* 19:59 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host planet1003.eqiad.wmnet
* 20:04 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on pc2012.codfw.wmnet with reason: REIMAGE
* 19:52 robh@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:02 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on pc2012.codfw.wmnet with reason: REIMAGE
* 19:48 robh@cumin1001: START - Cookbook sre.dns.netbox
* 19:27 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on pc2011.codfw.wmnet with reason: REIMAGE
* 19:48 robh@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:25 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on pc2011.codfw.wmnet with reason: REIMAGE
* 19:46 mutante: creating a ganeti VM to test bullseye install
* 16:40 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on maps1008.eqiad.wmnet with reason: Reparenting from maps1004
* 19:46 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm for new host planet1003.eqiad.wmnet
* 16:40 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on maps1008.eqiad.wmnet with reason: Reparenting from maps1004
* 19:45 bstorm: manually kicking off a run of update-openstack-mirror on sodium to capture an upstream package update
* 15:01 reedy@deploy1002: Synchronized php-1.37.0-wmf.9/extensions/MediaSearch/extension.json: Make MediaSearch default search experience for all users (duration: 00m 57s)
* 19:15 robh@cumin1001: START - Cookbook sre.dns.netbox
* 15:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1143 (re)pooling @ 100%: Repool db1143 after upgrade', diff saved to https://phabricator.wikimedia.org/P16432 and previous config saved to /var/cache/conftool/dbconfig/20210611-150018-root.json
* 18:46 Urbanecm: Morning B&C done
* 14:45 marostegui@cumin1001: dbctl commit (dc=all): 'db1143 (re)pooling @ 75%: Repool db1143 after upgrade', diff saved to https://phabricator.wikimedia.org/P16431 and previous config saved to /var/cache/conftool/dbconfig/20210611-144514-root.json
* 18:42 urbanecm@deploy1002: Synchronized php-1.37.0-wmf.1/extensions/WikibaseMediaInfo/: {{Gerrit|f831d16e42e712832d683233a5b21ad59f7c73b3}}: Make the logistic regression image search default ([[phab:T271799|T271799]]) (duration: 00m 58s)
* 14:44 mbsantos@deploy1002: Finished deploy [tilerator/deploy@6bfdab5]: (no justification provided) (duration: 00m 05s)
* 18:38 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|f6d076a69607172475a86ba935a273e7519108d1}}: Update $wgGEHomepageNewAccountVariants ([[phab:T278123|T278123]]) (duration: 00m 58s)
* 14:44 mbsantos@deploy1002: Started deploy [tilerator/deploy@6bfdab5]: (no justification provided)
* 18:14 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|1ae5ca5467fad7bfdae8aa94b241fe6c048ab8e5}}: Set wgGEMentorshipMigrationStage to WRITE_BOTH/READ_NEW everywhere ([[phab:T279853|T279853]]) (duration: 00m 59s)
* 14:43 mbsantos@deploy1002: Finished deploy [kartotherian/deploy@5d7c993]: (no justification provided) (duration: 00m 05s)
* 18:05 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|e252de0482c60e87e06d866006bb9ceb186af6cf}}: eswiki: Push Growth features out of dark mode ([[phab:T278235|T278235]]) (duration: 01m 00s)
* 14:42 mbsantos@deploy1002: Started deploy [kartotherian/deploy@5d7c993]: (no justification provided)
* 17:43 jynus: deploy grant changes on m5 backup sources (db1117 and db2078) [[phab:T278614|T278614]]
* 14:36 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on maps1008.eqiad.wmnet with reason: Reparenting from maps1009
* 15:54 legoktm: [[phab:T280744|T280744]]: legoktm@lists1001:~$ sudo chmod 644 /etc/aliases
* 14:36 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 4:00:00 on maps1008.eqiad.wmnet with reason: Reparenting from maps1009
* 15:15 Urbanecm: urbanecm@mwmaint1002:~$ foreachwikiindblist growthexperiments extensions/GrowthExperiments/maintenance/migrateMentorMenteeRelationship.php # [[phab:T279853|T279853]]
* 14:35 jiji@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 15:15 marostegui@cumin1001: dbctl commit (dc=all): 'db1165 (re)pooling @ 100%: Repool db1165', diff saved to https://phabricator.wikimedia.org/P15503 and previous config saved to /var/cache/conftool/dbconfig/20210421-151526-root.json
* 14:35 jiji@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
* 15:02 moritzm: installing jquery security updates on buster
* 14:34 jiji@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 15:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1165 (re)pooling @ 75%: Repool db1165', diff saved to https://phabricator.wikimedia.org/P15502 and previous config saved to /var/cache/conftool/dbconfig/20210421-150023-root.json
* 14:34 jiji@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 14:45 marostegui@cumin1001: dbctl commit (dc=all): 'db1165 (re)pooling @ 50%: Repool db1165', diff saved to https://phabricator.wikimedia.org/P15501 and previous config saved to /var/cache/conftool/dbconfig/20210421-144519-root.json
* 14:34 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps1008.eqiad.wmnet
* 14:30 marostegui@cumin1001: dbctl commit (dc=all): 'db1165 (re)pooling @ 25%: Repool db1165', diff saved to https://phabricator.wikimedia.org/P15500 and previous config saved to /var/cache/conftool/dbconfig/20210421-143015-root.json
* 14:33 jiji@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 14:25 jbond42: upload new version of debmonitor-client to apt
* 14:33 jiji@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 13:54 Urbanecm: [urbanecm@mwmaint1002 ~]$ time mwscript extensions/GrowthExperiments/maintenance/migrateMentorMenteeRelationship.php --wiki=fawiki # [[phab:T279853|T279853]]
* 14:32 jiji@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 13:39 moritzm: upgrading mw1262-1265,mw1277-1279 to PHP 7.2.34
* 14:31 jiji@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 13:18 Urbanecm: [urbanecm@mwmaint1002 ~]$ time mwscript extensions/GrowthExperiments/maintenance/migrateMentorMenteeRelationship.php --wiki=frwiki # [[phab:T279853|T279853]]
* 14:30 marostegui@cumin1001: dbctl commit (dc=all): 'db1143 (re)pooling @ 50%: Repool db1143 after upgrade', diff saved to https://phabricator.wikimedia.org/P16430 and previous config saved to /var/cache/conftool/dbconfig/20210611-143010-root.json
* 13:01 moritzm: upgrading mw1262-1265,mw1277-1279 to PHP 7.2.34
* 14:22 jiji@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 12:21 moritzm: installing failoid2002
* 14:22 jiji@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 12:16 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1158.eqiad.wmnet with reason: REIMAGE
* 14:20 jiji@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 12:14 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1158.eqiad.wmnet with reason: REIMAGE
* 14:20 jiji@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 11:49 jbond@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:17 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps1008.eqiad.wmnet
* 11:46 jbond@cumin1001: START - Cookbook sre.dns.netbox
* 14:15 marostegui@cumin1001: dbctl commit (dc=all): 'db1143 (re)pooling @ 25%: Repool db1143 after upgrade', diff saved to https://phabricator.wikimedia.org/P16429 and previous config saved to /var/cache/conftool/dbconfig/20210611-141506-root.json
* 11:32 awight: EU backport window complete
* 13:53 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps1008.eqiad.wmnet
* 11:31 moritzm: installing failoid1002
* 13:53 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on maps1008.eqiad.wmnet with reason: Reparenting from maps1009
* 11:29 awight@deploy1002: Synchronized php-1.37.0-wmf.1/extensions/WikimediaEvents: Backport: [[gerrit:681334{{!}}Send 0 edits userEditCountBucket for anons (T210106)]] (duration: 00m 59s)
* 13:53 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 4:00:00 on maps1008.eqiad.wmnet with reason: Reparenting from maps1009
* 10:41 jbond42: switch debmonitor-client to cfssl (second try)
* 13:52 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1143 for kernel upgrade', diff saved to https://phabricator.wikimedia.org/P16428 and previous config saved to /var/cache/conftool/dbconfig/20210611-135248-marostegui.json
* 10:37 jbond42: upload golang-cfssl packages for jessi and stretch
* 13:50 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1153', diff saved to https://phabricator.wikimedia.org/P16427 and previous config saved to /var/cache/conftool/dbconfig/20210611-135036-marostegui.json
* 10:33 jmm@cumin2001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host failoid1002.eqiad.wmnet
* 13:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1153 for kernel upgrade', diff saved to https://phabricator.wikimedia.org/P16426 and previous config saved to /var/cache/conftool/dbconfig/20210611-133527-marostegui.json
* 10:29 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host eventlog1002.eqiad.wmnet
* 10:46 oblivian@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'shellbox' for release 'main' .
* 10:23 jmm@cumin2001: START - Cookbook sre.ganeti.makevm for new host failoid1002.eqiad.wmnet
* 07:29 moritzm: restarting archiva to pick up OpenJDK security updates
* 10:22 hnowlan@cumin1001: START - Cookbook sre.hosts.reboot-single for host eventlog1002.eqiad.wmnet
* 07:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mwmaint2002.codfw.wmnet
* 10:21 jmm@cumin2001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host failoid2002.codfw.wmnet
* 07:03 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host mwmaint2002.codfw.wmnet
* 10:21 hnowlan: rebooting eventlog1002 for kernel update
* 06:56 oblivian@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:06 jmm@cumin2001: START - Cookbook sre.ganeti.makevm for new host failoid2002.codfw.wmnet
* 05:56 elukey: rm -rf empty dir /etc/apache2/sites-enabled/.links2 on webperf1001 to avoid puppet changes at every run
* 09:56 jbond42: switch debmonitor-clients to use cfssl
* 05:47 elukey: run systemctl reset-failed ifup@en5.service on doh1001 - [[phab:T273026|T273026]]
* 09:31 marostegui@cumin1001: dbctl commit (dc=all): 'db1074 (re)pooling @ 100%: Repool db1074', diff saved to https://phabricator.wikimedia.org/P15496 and previous config saved to /var/cache/conftool/dbconfig/20210421-093109-root.json
* 01:10 eileen: process-control config revision is {{Gerrit|2aed6ff89b}}
* 09:16 marostegui@cumin1001: dbctl commit (dc=all): 'db1074 (re)pooling @ 75%: Repool db1074', diff saved to https://phabricator.wikimedia.org/P15495 and previous config saved to /var/cache/conftool/dbconfig/20210421-091605-root.json
* 09:08 elukey: upgrade hue on an-tool1009 to 4.9
* 09:05 filippo@deploy1002: Finished deploy [librenms/librenms@692b5d5]: Upgrade LibreNMS to 21.4.0 - [[phab:T266987|T266987]] (duration: 00m 05s)
* 09:05 filippo@deploy1002: Started deploy [librenms/librenms@692b5d5]: Upgrade LibreNMS to 21.4.0 - [[phab:T266987|T266987]]
* 09:03 jiji@cumin1001: conftool action : set/pooled=yes; selector: name=mw2280.codfw.wmnet,service=nginx
* 09:01 marostegui@cumin1001: dbctl commit (dc=all): 'db1074 (re)pooling @ 50%: Repool db1074', diff saved to https://phabricator.wikimedia.org/P15494 and previous config saved to /var/cache/conftool/dbconfig/20210421-090100-root.json
* 09:00 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ores1009.eqiad.wmnet
* 08:58 filippo@deploy1002: Finished deploy [librenms/librenms@692b5d5]: Upgrade LibreNMS to 21.4.0 - [[phab:T266987|T266987]] (duration: 00m 05s)
* 08:58 filippo@deploy1002: Started deploy [librenms/librenms@692b5d5]: Upgrade LibreNMS to 21.4.0 - [[phab:T266987|T266987]]
* 08:58 filippo@deploy1002: Finished deploy [librenms/librenms@692b5d5]: Upgrade LibreNMS to 21.4.0 - [[phab:T266987|T266987]] (duration: 00m 05s)
* 08:58 filippo@deploy1002: Started deploy [librenms/librenms@692b5d5]: Upgrade LibreNMS to 21.4.0 - [[phab:T266987|T266987]]
* 08:56 filippo@deploy1002: Finished deploy [librenms/librenms@692b5d5]: Upgrade LibreNMS to 21.4.0 - [[phab:T266987|T266987]] (duration: 00m 05s)
* 08:55 filippo@deploy1002: Started deploy [librenms/librenms@692b5d5]: Upgrade LibreNMS to 21.4.0 - [[phab:T266987|T266987]]
* 08:55 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single for host ores1009.eqiad.wmnet
* 08:53 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ores1008.eqiad.wmnet
* 08:53 filippo@deploy1002: Finished deploy [librenms/librenms@692b5d5]: Upgrade LibreNMS to 21.4.0 - [[phab:T266987|T266987]] (duration: 00m 05s)
* 08:52 filippo@deploy1002: Started deploy [librenms/librenms@692b5d5]: Upgrade LibreNMS to 21.4.0 - [[phab:T266987|T266987]]
* 08:52 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ores1007.eqiad.wmnet
* 08:50 filippo@deploy1002: Finished deploy [librenms/librenms@692b5d5]: Upgrade LibreNMS to 21.4.0 - [[phab:T266987|T266987]] (duration: 00m 10s)
* 08:50 filippo@deploy1002: Started deploy [librenms/librenms@692b5d5]: Upgrade LibreNMS to 21.4.0 - [[phab:T266987|T266987]]
* 08:47 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single for host ores1008.eqiad.wmnet
* 08:47 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single for host ores1007.eqiad.wmnet
* 08:46 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ores1005.eqiad.wmnet
* 08:46 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ores1006.eqiad.wmnet
* 08:45 marostegui@cumin1001: dbctl commit (dc=all): 'db1074 (re)pooling @ 25%: Repool db1074', diff saved to https://phabricator.wikimedia.org/P15493 and previous config saved to /var/cache/conftool/dbconfig/20210421-084555-root.json
* 08:41 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single for host ores1006.eqiad.wmnet
* 08:41 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single for host ores1005.eqiad.wmnet
* 08:40 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ores1004.eqiad.wmnet
* 08:40 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ores1003.eqiad.wmnet
* 08:33 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single for host ores1004.eqiad.wmnet
* 08:33 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single for host ores1003.eqiad.wmnet
* 08:30 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ores1002.eqiad.wmnet
* 08:22 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ores1001.eqiad.wmnet
* 08:16 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single for host ores1002.eqiad.wmnet
* 08:16 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single for host ores1001.eqiad.wmnet
* 08:10 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ores2009.codfw.wmnet
* 08:05 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single for host ores2009.codfw.wmnet
* 08:05 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ores2008.codfw.wmnet
* 08:05 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ores2007.codfw.wmnet
* 07:59 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single for host ores2008.codfw.wmnet
* 07:59 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single for host ores2007.codfw.wmnet
* 07:58 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ores2006.codfw.wmnet
* 07:58 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ores2005.codfw.wmnet
* 07:53 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single for host ores2006.codfw.wmnet
* 07:52 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single for host ores2005.codfw.wmnet
* 07:52 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-coord1001.eqiad.wmnet with reason: REIMAGE
* 07:52 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ores2003.codfw.wmnet
* 07:52 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ores2004.codfw.wmnet
* 07:50 elukey@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on an-coord1001.eqiad.wmnet with reason: REIMAGE
* 07:46 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single for host ores2004.codfw.wmnet
* 07:46 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single for host ores2003.codfw.wmnet
* 07:44 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ores2002.codfw.wmnet
* 07:44 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ores2001.codfw.wmnet
* 07:39 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single for host ores2002.codfw.wmnet
* 07:38 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single for host ores2001.codfw.wmnet
* 06:49 akosiaris@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 06:49 akosiaris@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 06:42 elukey: upload hue_4.9.0-2+deb10u1 to buster-wikimedia
* 06:11 marostegui: Stop MySQL on db1074 to clone db1156 (there will be lag in s2 in wiki replicas) [[phab:T258361|T258361]]
* 06:10 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1074 to clone db1156 [[phab:T258361|T258361]]', diff saved to https://phabricator.wikimedia.org/P15491 and previous config saved to /var/cache/conftool/dbconfig/20210421-061019-marostegui.json
* 06:07 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2082.codfw.wmnet with reason: REIMAGE
* 06:05 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2077.codfw.wmnet with reason: REIMAGE
* 06:03 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2082.codfw.wmnet with reason: REIMAGE
* 06:03 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2077.codfw.wmnet with reason: REIMAGE
* 05:42 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db1086.eqiad.wmnet
* 05:33 marostegui@cumin1001: START - Cookbook sre.hosts.decommission for hosts db1086.eqiad.wmnet
* 00:38 pt1979@cumin2001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on cloudnet2004-dev.codfw.wmnet with reason: REIMAGE
* 00:36 pt1979@cumin2001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudnet2004-dev.codfw.wmnet with reason: REIMAGE
* 00:15 ryankemper: [WDQS] Pooled `wdqs1003`
* 00:14 ryankemper: [WDQS] Pooled `wdqs2008`
* 00:07 ryankemper: `sudo -i wmf-auto-reimage-host -p [[phab:T280382|T280382]] wdqs1006.eqiad.wmnet`
* 00:04 ryankemper: [WDQS] pooled `wdqs1004`


== 2021-04-20 ==
== 2021-06-10 ==
* 23:46 urbanecm@deploy1002: Synchronized dblists/growthexperiments.dblist: {{Gerrit|73544ccb40d9687b54c039aceb05cd033901d86f}}: urwiki: Enable Growth team features in stealth mode ([[phab:T280067|T280067]]) (duration: 00m 57s)
* 23:29 derick@deploy1002: Synchronized php-1.37.0-wmf.9/extensions/Citoid/modules/ve/ve.ui.CitoidInspector.js: Backport: [[gerrit:699288{{!}}CitoidInspector: rename getParameterNames to getOrderedParameterNames (T284786)]] (duration: 00m 57s)
* 23:44 urbanecm@deploy1002: Synchronized wmf-config/config/urwiki.yaml: {{Gerrit|73544ccb40d9687b54c039aceb05cd033901d86f}}: urwiki: Enable Growth team features in stealth mode ([[phab:T280067|T280067]]) (duration: 00m 57s)
* 21:40 urbanecm: End of urbanecm@mwmaint1002:~$ foreachwiki extensions/WikimediaMaintenance/createExtensionTables.php discussiontools # [[phab:T282699|T282699]]
* 23:42 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|73544ccb40d9687b54c039aceb05cd033901d86f}}: urwiki: Enable Growth team features in stealth mode ([[phab:T280067|T280067]]) (duration: 00m 58s)
* 21:36 urbanecm: Start of urbanecm@mwmaint1002:~$ foreachwiki extensions/WikimediaMaintenance/createExtensionTables.php discussiontools # [[phab:T282699|T282699]]
* 23:38 Urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/WikimediaMaintenance/createExtensionTables.php --wiki=urwiki GrowthExperiments # [[phab:T280067|T280067]]
* 21:33 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/WikimediaMaintenance/createExtensionTables.php --wiki=testwiki discussiontools # [[phab:T282699|T282699]]
* 23:38 urbanecm@deploy1002: Synchronized dblists/growthexperiments.dblist: {{Gerrit|314367bca6e924136704911b55fd3e2c929fa704}}: elwiki: Enable Growth team features in stealth mode ([[phab:T280172|T280172]]; 3/3) (duration: 00m 56s)
* 20:13 mutante: installed tftp client on install1003 for debugging
* 23:36 urbanecm@deploy1002: Synchronized wmf-config/config/elwiki.yaml: {{Gerrit|314367bca6e924136704911b55fd3e2c929fa704}}: elwiki: Enable Growth team features in stealth mode ([[phab:T280172|T280172]]; 2/3) (duration: 00m 57s)
* 20:00 jhuneidi@deploy1002: Pruned MediaWiki: 1.37.0-wmf.5 (duration: 03m 33s)
* 23:35 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|314367bca6e924136704911b55fd3e2c929fa704}}: elwiki: Enable Growth team features in stealth mode ([[phab:T280172|T280172]]; 1/3) (duration: 00m 57s)
* 19:31 ryankemper: [[phab:T265547|T265547]] Cleanup following merge of https://gerrit.wikimedia.org/r/c/operations/puppet/+/698025: `sudo -E cumin -b 5 'P:analytics::cluster::elasticsearch' 'sudo rm -rfv /etc/mjolnir /srv/deployment/search/mjolnir'`
* 23:34 Urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript deleteEqualMessages.php --wiki=hrwiki --delete
* 19:09 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.37.0-wmf.9  refs [[phab:T281150|T281150]]
* 23:32 Urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/WikimediaMaintenance/createExtensionTables.php --wiki=elwiki GrowthExperiments # [[phab:T280172|T280172]]
* 18:49 urbanecm@deploy1002: Synchronized php-1.37.0-wmf.9/extensions/WikimediaMaintenance/dumpInterwiki.php: {{Gerrit|b21904e326e917f5ac6d7129a4d224380c6e4c21}}: Remove sep11 interwiki link from dumpinterwiki.php (duration: 01m 08s)
* 23:31 urbanecm@deploy1002: Synchronized dblists/growthexperiments.dblist: {{Gerrit|425d77b73f48b3e16a5aa2c0086f292d370cd17e}}: cawiki: Enable Growth team features in stealth mode ([[phab:T280673|T280673]]; 3/3) (duration: 00m 57s)
* 18:45 urbanecm@deploy1002: Synchronized wmf-config/interwiki.php: Update interwiki cache (duration: 01m 23s)
* 23:28 Urbanecm: [urbanecm@mwmaint1002 ~]$ foreachwikiindblist growthexperiments sql.php --cluster=extension1 /srv/mediawiki/php-1.37.0-wmf.1/extensions/GrowthExperiments/maintenance/schemas/mysql/growthexperiments_mentee_data.sql # [[phab:T279587|T279587]]
* 18:39 urbanecm@deploy1002: update-interwiki-cache aborted: Update interwiki cache (duration: 00m 03s)
* 23:28 urbanecm@deploy1002: Synchronized wmf-config/config/cawiki.yaml: {{Gerrit|425d77b73f48b3e16a5aa2c0086f292d370cd17e}}: cawiki: Enable Growth team features in stealth mode ([[phab:T280673|T280673]]; 2/3) (duration: 00m 57s)
* 18:38 urbanecm@deploy1002: Synchronized php-1.37.0-wmf.9/extensions/UniversalLanguageSelector/resources/js/ext.uls.launch.js: {{Gerrit|8aeab139879613782548b20fc11af5e66589e30a}}: Fire language change hook ([[phab:T280770|T280770]]) (duration: 01m 07s)
* 23:27 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|425d77b73f48b3e16a5aa2c0086f292d370cd17e}}: cawiki: Enable Growth team features in stealth mode ([[phab:T280673|T280673]]; 1/3) (duration: 00m 57s)
* 18:05 urbanecm@deploy1002: Synchronized wmf-config/CommonSettings.php: {{Gerrit|d26968c1c3b3f3e115ff37a9a138d225cabba25a}}: wgWelcomeSurveyExperimentalGroups: Use new syntax in CS.php ([[phab:T284597|T284597]]; [[phab:T284735|T284735]]) (duration: 01m 08s)
* 23:24 Urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/WikimediaMaintenance/createExtensionTables.php --wiki=cawiki GrowthExperiments # [[phab:T280673|T280673]]
* 17:11 moritzm: updating bullseye installer image to latest daily image (kernel ABI changed again) [[phab:T275873|T275873]]
* 23:11 razzi@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on furud.codfw.wmnet with reason: REIMAGE
* 17:09 jgiannelos@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mobileapps' for release 'production' .
* 23:09 razzi@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on furud.codfw.wmnet with reason: REIMAGE
* 17:06 jgiannelos@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mobileapps' for release 'production' .
* 23:05 razzi@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on flerovium.eqiad.wmnet with reason: REIMAGE
* 17:03 jgiannelos@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'mobileapps' for release 'staging' .
* 23:03 razzi@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on flerovium.eqiad.wmnet with reason: REIMAGE
* 16:53 razzi@cumin1001: END (FAIL) - Cookbook sre.hadoop.roll-restart-masters (exit_code=99)
* 22:14 robh@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:51 moritzm: installing rails security updates
* 22:10 robh@cumin1001: START - Cookbook sre.dns.netbox
* 16:37 krinkle@deploy1002: Synchronized wmf-config/CommonSettings.php: no-op for Beta {{Gerrit|I2a42c222003}} (duration: 01m 07s)
* 21:46 robh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcephosd1016.eqiad.wmnet with reason: REIMAGE
* 16:34 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:44 robh@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephosd1016.eqiad.wmnet with reason: REIMAGE
* 16:29 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 20:52 Urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/GrowthExperiments/maintenance/migrateMentorMenteeRelationship.php --wiki=ruwiki # [[phab:T279853|T279853]]
* 16:24 razzi@cumin1001: START - Cookbook sre.hadoop.roll-restart-masters
* 20:48 robh@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudcephosd1020.wikimedia.org
* 15:09 papaul: power down ms-be2038 for BBU replacement
* 20:41 Urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/GrowthExperiments/maintenance/migrateMentorMenteeRelationship.php --wiki=viwiki # [[phab:T279853|T279853]]
* 12:32 marostegui@cumin1001: dbctl commit (dc=all): 'db1098:3316 (re)pooling @ 100%: Repool db1098:3316', diff saved to https://phabricator.wikimedia.org/P16417 and previous config saved to /var/cache/conftool/dbconfig/20210610-123201-root.json
* 20:36 robh@cumin1001: START - Cookbook sre.hosts.decommission for hosts cloudcephosd1020.wikimedia.org
* 12:16 marostegui@cumin1001: dbctl commit (dc=all): 'db1098:3316 (re)pooling @ 75%: Repool db1098:3316', diff saved to https://phabricator.wikimedia.org/P16416 and previous config saved to /var/cache/conftool/dbconfig/20210610-121657-root.json
* 20:36 Urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/GrowthExperiments/maintenance/migrateMentorMenteeRelationship.php --wiki=ukwiki # [[phab:T279853|T279853]]
* 12:01 marostegui@cumin1001: dbctl commit (dc=all): 'db1098:3316 (re)pooling @ 60%: Repool db1098:3316', diff saved to https://phabricator.wikimedia.org/P16415 and previous config saved to /var/cache/conftool/dbconfig/20210610-120153-root.json
* 20:35 robh@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudcephosd[1017-1019].wikimedia.org
* 11:46 marostegui@cumin1001: dbctl commit (dc=all): 'db1098:3316 (re)pooling @ 50%: Repool db1098:3316', diff saved to https://phabricator.wikimedia.org/P16414 and previous config saved to /var/cache/conftool/dbconfig/20210610-114650-root.json
* 20:34 Urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/GrowthExperiments/maintenance/migrateMentorMenteeRelationship.php --wiki=tewiki # [[phab:T279853|T279853]]
* 11:31 marostegui@cumin1001: dbctl commit (dc=all): 'db1098:3316 (re)pooling @ 40%: Repool db1098:3316', diff saved to https://phabricator.wikimedia.org/P16413 and previous config saved to /var/cache/conftool/dbconfig/20210610-113146-root.json
* 20:32 Urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/GrowthExperiments/maintenance/migrateMentorMenteeRelationship.php --wiki=svwiki # [[phab:T279853|T279853]]
* 11:16 marostegui@cumin1001: dbctl commit (dc=all): 'db1098:3316 (re)pooling @ 30%: Repool db1098:3316', diff saved to https://phabricator.wikimedia.org/P16412 and previous config saved to /var/cache/conftool/dbconfig/20210610-111643-root.json
* 20:30 Urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/GrowthExperiments/maintenance/migrateMentorMenteeRelationship.php --wiki=srwiki # [[phab:T279853|T279853]]
* 11:01 marostegui@cumin1001: dbctl commit (dc=all): 'db1098:3316 (re)pooling @ 20%: Repool db1098:3316', diff saved to https://phabricator.wikimedia.org/P16411 and previous config saved to /var/cache/conftool/dbconfig/20210610-110139-root.json
* 20:29 Urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/GrowthExperiments/maintenance/migrateMentorMenteeRelationship.php --wiki=rowiki # [[phab:T279853|T279853]]
* 11:00 jbond@deploy1002: Finished deploy [netbox/deploy@e9f2382]: deploy v2.10.4-wmf4 to netbox-next (duration: 00m 53s)
* 20:27 Urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/GrowthExperiments/maintenance/migrateMentorMenteeRelationship.php --wiki=hywiki # [[phab:T279853|T279853]]
* 10:59 jbond@deploy1002: Started deploy [netbox/deploy@e9f2382]: deploy v2.10.4-wmf4 to netbox-next
* 20:22 Urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/GrowthExperiments/maintenance/migrateMentorMenteeRelationship.php --wiki=huwiki # [[phab:T279853|T279853]]
* 10:47 topranks: [[phab:T283163|T283163]]: Adding "metric-out minimum-igp" to BGP group Confed_eqord on eqiad, codfw and eqdfw CRs.
* 20:21 Urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/GrowthExperiments/maintenance/migrateMentorMenteeRelationship.php --wiki=hrwiki # [[phab:T279853|T279853]]
* 10:46 marostegui@cumin1001: dbctl commit (dc=all): 'db1098:3316 (re)pooling @ 10%: Repool db1098:3316', diff saved to https://phabricator.wikimedia.org/P16410 and previous config saved to /var/cache/conftool/dbconfig/20210610-104635-root.json
* 20:18 Urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/GrowthExperiments/maintenance/migrateMentorMenteeRelationship.php --wiki=hewiki # [[phab:T279853|T279853]]
* 10:43 urbanecm@deploy1002: Synchronized php-1.37.0-wmf.9/extensions/WikiEditor/modules/jquery.wikiEditor.js: {{Gerrit|8a17c43c5470b84ba58239bb2cf947dbebf1979f}}: Fix call to renamed var ([[phab:T284716|T284716]]) (duration: 01m 25s)
* 20:16 Urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/GrowthExperiments/maintenance/migrateMentorMenteeRelationship.php --wiki=frwiktionary # [[phab:T279853|T279853]]
* 10:31 marostegui@cumin1001: dbctl commit (dc=all): 'db1098:3316 (re)pooling @ 5%: Repool db1098:3316', diff saved to https://phabricator.wikimedia.org/P16409 and previous config saved to /var/cache/conftool/dbconfig/20210610-103132-root.json
* 20:16 robh@cumin1001: START - Cookbook sre.hosts.decommission for hosts cloudcephosd[1017-1019].wikimedia.org
* 10:30 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1113:3316', diff saved to https://phabricator.wikimedia.org/P16408 and previous config saved to /var/cache/conftool/dbconfig/20210610-103032-marostegui.json
* 20:15 Urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/GrowthExperiments/maintenance/migrateMentorMenteeRelationship.php --wiki=euwiki # [[phab:T279853|T279853]]
* 10:29 mvolz@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'citoid' for release 'production' .
* 20:13 Urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/GrowthExperiments/maintenance/migrateMentorMenteeRelationship.php --wiki=bnwiki # [[phab:T279853|T279853]]
* 10:28 kormat: running optimize tables against pc1009 (pc3) [[phab:T282761|T282761]]
* 20:08 robh@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:25 mvolz@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'citoid' for release 'production' .
* 20:03 robh@cumin1001: START - Cookbook sre.dns.netbox
* 10:21 mvolz@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'citoid' for release 'staging' .
* 19:58 robh@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1098:3317 (re)pooling @ 100%: Repool db1098:3317 after schema change', diff saved to https://phabricator.wikimedia.org/P16407 and previous config saved to /var/cache/conftool/dbconfig/20210610-101858-root.json
* 19:56 robh@cumin1001: START - Cookbook sre.dns.netbox
* 10:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1098:3317 (re)pooling @ 75%: Repool db1098:3317 after schema change', diff saved to https://phabricator.wikimedia.org/P16406 and previous config saved to /var/cache/conftool/dbconfig/20210610-100355-root.json
* 19:28 robh@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts cloudcephosd1016.wikimedia.org
* 09:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1098:3317 (re)pooling @ 60%: Repool db1098:3317 after schema change', diff saved to https://phabricator.wikimedia.org/P16405 and previous config saved to /var/cache/conftool/dbconfig/20210610-094851-root.json
* 18:34 Urbanecm: mwscript extensions/GrowthExperiments/maintenance/migrateMentorMenteeRelationship.php --wiki=idwiki # [[phab:T279853|T279853]]
* 09:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1098:3317 (re)pooling @ 50%: Repool db1098:3317 after schema change', diff saved to https://phabricator.wikimedia.org/P16404 and previous config saved to /var/cache/conftool/dbconfig/20210610-093346-root.json
* 18:33 robh@cumin1001: START - Cookbook sre.hosts.decommission for hosts cloudcephosd1016.wikimedia.org
* 09:30 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1113:3316', diff saved to https://phabricator.wikimedia.org/P16402 and previous config saved to /var/cache/conftool/dbconfig/20210610-093003-marostegui.json
* 18:29 urbanecm@deploy1002: Synchronized php-1.37.0-wmf.1/extensions/GrowthExperiments/: {{Gerrit|4d1969d}}: {{Gerrit|1fbb8e9}}: MentorStore: Set wasPosted to true in command line mode ([[phab:T275773|T275773]]) (duration: 00m 59s)
* 09:22 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1098:3316', diff saved to https://phabricator.wikimedia.org/P16401 and previous config saved to /var/cache/conftool/dbconfig/20210610-092246-marostegui.json
* 17:26 XioNoX: boot cr1-codfw:fpc1 - [[phab:T277341|T277341]]
* 09:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1098:3317 (re)pooling @ 40%: Repool db1098:3317 after schema change', diff saved to https://phabricator.wikimedia.org/P16399 and previous config saved to /var/cache/conftool/dbconfig/20210610-091842-root.json
* 17:16 papaul: Adding a MPC7E to cr1-codfw
* 09:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1098:3316 (re)pooling @ 30%: Repool db1098:3316 after schema change', diff saved to https://phabricator.wikimedia.org/P16398 and previous config saved to /var/cache/conftool/dbconfig/20210610-090345-root.json
* 16:32 arturo: merging change to core route firewall https://gerrit.wikimedia.org/r/c/operations/homer/public/+/681316 ([[phab:T272587|T272587]])
* 09:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1098:3317 (re)pooling @ 30%: Repool db1098:3317 after schema change', diff saved to https://phabricator.wikimedia.org/P16397 and previous config saved to /var/cache/conftool/dbconfig/20210610-090339-root.json
* 16:15 andrewbogott: updating core routers config with https://gerrit.wikimedia.org/r/c/operations/homer/public/+/681315
* 08:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1098:3316 (re)pooling @ 20%: Repool db1098:3316 after schema change', diff saved to https://phabricator.wikimedia.org/P16396 and previous config saved to /var/cache/conftool/dbconfig/20210610-084841-root.json
* 15:38 hnowlan@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host eventlog1003.eqiad.wmnet
* 08:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1098:3317 (re)pooling @ 20%: Repool db1098:3317 after schema change', diff saved to https://phabricator.wikimedia.org/P16395 and previous config saved to /var/cache/conftool/dbconfig/20210610-084835-root.json
* 15:22 urbanecm@deploy1002: Synchronized docroot/noc/conf/debug.json: {{Gerrit|dc6647b9c674429c0811116e0caca7639b766e77}}: remove mwdebug1003 from list of debug servers ([[phab:T267248|T267248]]) (duration: 00m 58s)
* 08:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1098:3316 (re)pooling @ 10%: Repool db1098:3316 after schema change', diff saved to https://phabricator.wikimedia.org/P16394 and previous config saved to /var/cache/conftool/dbconfig/20210610-083338-root.json
* 15:20 urbanecm@deploy1002: Synchronized debug.json: {{Gerrit|dc6647b9c674429c0811116e0caca7639b766e77}}: remove mwdebug1003 from list of debug servers ([[phab:T267248|T267248]]) (duration: 00m 57s)
* 08:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1098:3317 (re)pooling @ 10%: Repool db1098:3317 after schema change', diff saved to https://phabricator.wikimedia.org/P16393 and previous config saved to /var/cache/conftool/dbconfig/20210610-083332-root.json
* 15:14 hnowlan@cumin1001: START - Cookbook sre.ganeti.makevm for new host eventlog1003.eqiad.wmnet
* 08:25 volans: uploaded spicerack_0.0.53 to apt.wikimedia.org buster-wikimedia,bullseye-wikimedia
* 15:08 pt1979@cumin2001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1098:3316 (re)pooling @ 5%: Repool db1098:3316 after schema change', diff saved to https://phabricator.wikimedia.org/P16392 and previous config saved to /var/cache/conftool/dbconfig/20210610-081834-root.json
* 15:02 pt1979@cumin2001: START - Cookbook sre.dns.netbox
* 08:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1098:3317 (re)pooling @ 5%: Repool db1098:3317 after schema change', diff saved to https://phabricator.wikimedia.org/P16391 and previous config saved to /var/cache/conftool/dbconfig/20210610-081828-root.json
* 14:59 volker-e@deploy1002: Finished deploy [design/style-guide@c4d8314]: Deploy design/style-guide: {{Gerrit|c4d8314}} “Components”: Fix “Buttons” active states (#460) (duration: 00m 07s)
* 08:17 marostegui: Drop several grants from labswiki (wikitech) [[phab:T282074|T282074]]
* 14:58 volker-e@deploy1002: Started deploy [design/style-guide@c4d8314]: Deploy design/style-guide: {{Gerrit|c4d8314}} “Components”: Fix “Buttons” active states (#460)
* 07:57 jynus: reset-failed on cumin1001 after backup rerun
* 14:40 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'zotero' for release 'production' .
* 07:57 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1098:3317', diff saved to https://phabricator.wikimedia.org/P16389 and previous config saved to /var/cache/conftool/dbconfig/20210610-075702-marostegui.json
* 14:38 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'zotero' for release 'production' .
* 07:52 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1098:3316', diff saved to https://phabricator.wikimedia.org/P16388 and previous config saved to /var/cache/conftool/dbconfig/20210610-075247-marostegui.json
* 14:37 oblivian@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'zotero' for release 'staging' .
* 07:44 jynus: retrying s6 snapshots on eqiad, acking demon failure
* 14:35 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mobileapps' for release 'production' .
* 07:37 marostegui@cumin1001: dbctl commit (dc=all): 'db1098:3316 (re)pooling @ 100%: Repool db1098:3316 after schema change', diff saved to https://phabricator.wikimedia.org/P16387 and previous config saved to /var/cache/conftool/dbconfig/20210610-073727-root.json
* 14:34 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mobileapps' for release 'production' .
* 07:22 marostegui@cumin1001: dbctl commit (dc=all): 'db1098:3316 (re)pooling @ 75%: Repool db1098:3316 after schema change', diff saved to https://phabricator.wikimedia.org/P16386 and previous config saved to /var/cache/conftool/dbconfig/20210610-072224-root.json
* 14:31 oblivian@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'mobileapps' for release 'staging' .
* 07:07 marostegui@cumin1001: dbctl commit (dc=all): 'db1098:3316 (re)pooling @ 50%: Repool db1098:3316 after schema change', diff saved to https://phabricator.wikimedia.org/P16385 and previous config saved to /var/cache/conftool/dbconfig/20210610-070720-root.json
* 14:30 moritzm: installing exim updates from Buster point release
* 06:52 marostegui@cumin1001: dbctl commit (dc=all): 'db1098:3316 (re)pooling @ 25%: Repool db1098:3316 after schema change', diff saved to https://phabricator.wikimedia.org/P16384 and previous config saved to /var/cache/conftool/dbconfig/20210610-065217-root.json
* 14:27 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 06:49 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 100%: Repool db1096:3316 after schema change', diff saved to https://phabricator.wikimedia.org/P16383 and previous config saved to /var/cache/conftool/dbconfig/20210610-064916-root.json
* 14:27 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 06:37 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1098:3316', diff saved to https://phabricator.wikimedia.org/P16382 and previous config saved to /var/cache/conftool/dbconfig/20210610-063745-marostegui.json
* 14:25 otto@deploy1002: Finished deploy [analytics/refinery@fc6767a] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@fc6767a] (duration: 04m 56s)
* 06:34 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 75%: Repool db1096:3316 after schema change', diff saved to https://phabricator.wikimedia.org/P16381 and previous config saved to /var/cache/conftool/dbconfig/20210610-063412-root.json
* 14:25 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 06:19 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 50%: Repool db1096:3316 after schema change', diff saved to https://phabricator.wikimedia.org/P16380 and previous config saved to /var/cache/conftool/dbconfig/20210610-061909-root.json
* 14:24 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 06:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3315 (re)pooling @ 100%: Repool db1096:3315 after schema change', diff saved to https://phabricator.wikimedia.org/P16379 and previous config saved to /var/cache/conftool/dbconfig/20210610-061806-root.json
* 14:22 oblivian@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 06:04 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 25%: Repool db1096:3316 after schema change', diff saved to https://phabricator.wikimedia.org/P16378 and previous config saved to /var/cache/conftool/dbconfig/20210610-060405-root.json
* 14:20 otto@deploy1002: Started deploy [analytics/refinery@fc6767a] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@fc6767a]
* 06:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3315 (re)pooling @ 75%: Repool db1096:3315 after schema change', diff saved to https://phabricator.wikimedia.org/P16377 and previous config saved to /var/cache/conftool/dbconfig/20210610-060302-root.json
* 14:18 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'canary' .
* 05:53 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1096:3316', diff saved to https://phabricator.wikimedia.org/P16376 and previous config saved to /var/cache/conftool/dbconfig/20210610-055327-marostegui.json
* 14:18 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'production' .
* 05:50 marostegui@cumin1001: dbctl commit (dc=all): 'db1130 (re)pooling @ 100%: Repool db1130 after upgrade', diff saved to https://phabricator.wikimedia.org/P16375 and previous config saved to /var/cache/conftool/dbconfig/20210610-055037-root.json
* 14:17 otto@deploy1002: Finished deploy [analytics/refinery@fc6767a] (thin): Regular analytics weekly train THIN [analytics/refinery@fc6767a] (duration: 00m 07s)
* 05:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 50%: Repool db1096:3316 after schema change', diff saved to https://phabricator.wikimedia.org/P16374 and previous config saved to /var/cache/conftool/dbconfig/20210610-054802-root.json
* 14:17 otto@deploy1002: Started deploy [analytics/refinery@fc6767a] (thin): Regular analytics weekly train THIN [analytics/refinery@fc6767a]
* 05:47 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3315 (re)pooling @ 50%: Repool db1096:3315 after schema change', diff saved to https://phabricator.wikimedia.org/P16373 and previous config saved to /var/cache/conftool/dbconfig/20210610-054759-root.json
* 14:16 otto@deploy1002: Finished deploy [analytics/refinery@fc6767a]: Regular analytics weekly train - an-launcher1002 retry [analytics/refinery@fc6767a] (duration: 00m 03s)
* 05:35 marostegui@cumin1001: dbctl commit (dc=all): 'db1130 (re)pooling @ 75%: Repool db1130 after upgrade', diff saved to https://phabricator.wikimedia.org/P16372 and previous config saved to /var/cache/conftool/dbconfig/20210610-053534-root.json
* 14:16 otto@deploy1002: Started deploy [analytics/refinery@fc6767a]: Regular analytics weekly train - an-launcher1002 retry [analytics/refinery@fc6767a]
* 05:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 25%: Repool db1096:3316 after schema change', diff saved to https://phabricator.wikimedia.org/P16371 and previous config saved to /var/cache/conftool/dbconfig/20210610-053259-root.json
* 14:16 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'canary' .
* 05:32 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3315 (re)pooling @ 25%: Repool db1096:3315 after schema change', diff saved to https://phabricator.wikimedia.org/P16370 and previous config saved to /var/cache/conftool/dbconfig/20210610-053255-root.json
* 14:16 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'production' .
* 05:24 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1096:3315', diff saved to https://phabricator.wikimedia.org/P16369 and previous config saved to /var/cache/conftool/dbconfig/20210610-052421-marostegui.json
* 14:16 otto@deploy1002: Finished deploy [analytics/refinery@fc6767a]: Regular analytics weekly train - an-launcher1002 retry\ [analytics/refinery@fc6767a] (duration: 00m 03s)
* 05:20 marostegui@cumin1001: dbctl commit (dc=all): 'db1130 (re)pooling @ 50%: Repool db1130 after upgrade', diff saved to https://phabricator.wikimedia.org/P16368 and previous config saved to /var/cache/conftool/dbconfig/20210610-052030-root.json
* 14:15 otto@deploy1002: Started deploy [analytics/refinery@fc6767a]: Regular analytics weekly train - an-launcher1002 retry\ [analytics/refinery@fc6767a]
* 05:20 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1096:3316', diff saved to https://phabricator.wikimedia.org/P16367 and previous config saved to /var/cache/conftool/dbconfig/20210610-052017-marostegui.json
* 14:15 otto@deploy1002: Finished deploy [analytics/refinery@fc6767a]: Regular analytics weekly train - an-launcher1002 retry\ [analytics/refinery@fc6767a] (duration: 00m 03s)
* 05:05 marostegui@cumin1001: dbctl commit (dc=all): 'db1130 (re)pooling @ 25%: Repool db1130 after upgrade', diff saved to https://phabricator.wikimedia.org/P16366 and previous config saved to /var/cache/conftool/dbconfig/20210610-050526-root.json
* 14:14 otto@deploy1002: Started deploy [analytics/refinery@fc6767a]: Regular analytics weekly train - an-launcher1002 retry\ [analytics/refinery@fc6767a]
* 14:14 otto@deploy1002: Finished deploy [analytics/refinery@fc6767a]: Regular analytics weekly train [analytics/refinery@fc6767a] (duration: 14m 50s)
* 14:11 oblivian@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'production' .
* 14:09 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'production' .
* 14:09 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'canary' .
* 14:06 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'canary' .
* 14:06 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'production' .
* 14:04 oblivian@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'production' .
* 14:01 jiji@cumin1001: conftool action : set/pooled=no; selector: name=mw2280.codfw.wmnet,cluster=videoscaler
* 13:59 otto@deploy1002: Started deploy [analytics/refinery@fc6767a]: Regular analytics weekly train [analytics/refinery@fc6767a]
* 13:42 moritzm: upgrading mw1276 to PHP 7.2.34
* 13:40 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-analytics' for release 'production' .
* 13:40 ayounsi@deploy1002: Finished deploy [homer/deploy@759f82c]: Homer release v0.2.7 (duration: 00m 13s)
* 13:40 ayounsi@deploy1002: Started deploy [homer/deploy@759f82c]: Homer release v0.2.7
* 13:38 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-analytics' for release 'production' .
* 13:36 otto@deploy1002: Finished deploy [analytics/aqs/deploy@ad170d4]: deploy Refactor pageviews per-article endpoint (duration: 05m 17s)
* 13:35 oblivian@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventgate-analytics' for release 'production' .
* 13:35 oblivian@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventgate-analytics' for release 'canary' .
* 13:33 moritzm: upgrading mw1261 to PHP 7.2.34
* 13:31 otto@deploy1002: Started deploy [analytics/aqs/deploy@ad170d4]: deploy Refactor pageviews per-article endpoint
* 13:27 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'similar-users' for release 'main' .
* 13:26 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'similar-users' for release 'main' .
* 13:25 oblivian@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'similar-users' for release 'main' .
* 13:22 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'push-notifications' for release 'main' .
* 13:21 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'push-notifications' for release 'main' .
* 13:19 oblivian@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'push-notifications' for release 'main' .
* 13:15 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'wikifeeds' for release 'production' .
* 13:14 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cumin2002.codfw.wmnet with reason: REIMAGE
* 13:13 urbanecm@deploy1002: Synchronized php-1.37.0-wmf.1/includes/actions/RollbackAction.php: {{Gerrit|ccbfcf28a2f507ed40dcf7af748c30f581b5079f}}: Do not mark rollbacks as bot edits ([[phab:T280655|T280655]]) (duration: 00m 57s)
* 13:12 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'wikifeeds' for release 'production' .
* 13:12 jmm@cumin2001: START - Cookbook sre.hosts.downtime for 2:00:00 on cumin2002.codfw.wmnet with reason: REIMAGE
* 13:09 oblivian@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'wikifeeds' for release 'staging' .
* 13:09 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'proton' for release 'production' .
* 13:07 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'proton' for release 'production' .
* 13:04 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2076.codfw.wmnet with reason: REIMAGE
* 13:03 oblivian@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'proton' for release 'production' .
* 13:02 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2076.codfw.wmnet with reason: REIMAGE
* 12:58 moritzm: reimaging cumin2002 to bullseye [[phab:T276589|T276589]]
* 12:55 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'termbox' for release 'production' .
* 12:54 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'termbox' for release 'production' .
* 12:52 oblivian@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'termbox' for release 'staging' .
* 12:51 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'recommendation-api' for release 'production' .
* 12:49 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'recommendation-api' for release 'production' .
* 12:47 oblivian@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'recommendation-api' for release 'production' .
* 12:42 moritzm: uploaded PHP 7.2.34-18+0~20210223.60+debian10~1.gbpb21322+wmf1 to component/php72
* 12:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1165 to check its tables [[phab:T280492|T280492]]', diff saved to https://phabricator.wikimedia.org/P15483 and previous config saved to /var/cache/conftool/dbconfig/20210420-124118-marostegui.json
* 12:28 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5003.eqsin.wmnet
* 12:28 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventstreams-internal' for release 'main' .
* 12:27 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventstreams-internal' for release 'main' .
* 12:25 oblivian@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventstreams-internal' for release 'main' .
* 12:23 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single for host ganeti5003.eqsin.wmnet
* 12:21 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventstreams' for release 'canary' .
* 12:21 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventstreams' for release 'production' .
* 12:18 CFisch_WMDE: European mid-day backport window done
* 12:05 wmde-fisch@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:681321{{!}}Add NS_PROJECT alias for azwiki (T280577)]] (duration: 00m 57s)
* 12:04 moritzm: drain ganeti5003
* 11:55 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventstreams' for release 'production' .
* 11:55 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventstreams' for release 'canary' .
* 11:54 wmde-fisch@deploy1002: Synchronized php-1.37.0-wmf.1/extensions/DiscussionTools/includes/CommentFormatter.php: Backport: [[gerrit:681153{{!}}CommentFormatter: Add ext-discussiontools-section class instead of overwriting (T280433)]] (duration: 00m 57s)
* 11:47 moritzm: failover ganeti master in eqsin to ganeti5001
* 11:46 oblivian@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventstreams' for release 'production' .
* 11:38 wmde-fisch@deploy1002: Synchronized php-1.37.0-wmf.1/extensions/VisualEditor/modules/ve-mw/ui/pages/ve.ui.MWParameterPage.js: Backport: [[gerrit:679462{{!}}Add filtering for the suggested values combo box (T271898)]] (duration: 00m 58s)
* 11:15 wmde-fisch@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:676930{{!}}Add default import sources (T214139)]] (duration: 00m 58s)
* 11:11 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mathoid' for release 'production' .
* 11:09 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mathoid' for release 'production' .
* 11:07 oblivian@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'mathoid' for release 'staging' .
* 10:49 _joe_: temporary installing some python packages on deploy1002 for testing
* 10:43 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5001.eqsin.wmnet
* 10:34 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single for host ganeti5001.eqsin.wmnet
* 10:20 moritzm: drain ganeti5001
* 10:11 hnowlan: opening access to cassandra on new AQS hosts (aqs101*) to analytics-in4 filter
* 10:05 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host aphlict1001.eqiad.wmnet
* 10:04 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single for host aphlict1001.eqiad.wmnet
* 09:42 volans@cumin2001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cumin2001.codfw.wmnet,cumin1001.eqiad.wmnet
* 09:42 volans@cumin2001: START - Cookbook sre.hosts.remove-downtime for cumin2001.codfw.wmnet,cumin1001.eqiad.wmnet
* 09:42 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudgw1002.eqiad.wmnet with reason: REIMAGE
* 09:40 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudgw1001.eqiad.wmnet with reason: REIMAGE
* 09:38 aborrero@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudgw1002.eqiad.wmnet with reason: REIMAGE
* 09:38 aborrero@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudgw1001.eqiad.wmnet with reason: REIMAGE
* 09:20 kharlan@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'linkrecommendation' for release 'external' .
* 09:20 kharlan@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'linkrecommendation' for release 'internal' .
* 08:58 kharlan@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'linkrecommendation' for release 'external' .
* 08:58 kharlan@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'linkrecommendation' for release 'internal' .
* 08:54 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on thanos-fe1003.eqiad.wmnet with reason: REIMAGE
* 08:51 filippo@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on thanos-fe1003.eqiad.wmnet with reason: REIMAGE
* 08:50 kharlan@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'linkrecommendation' for release 'staging' .
* 08:17 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host orespoolcounter1003.eqiad.wmnet
* 08:15 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single for host orespoolcounter1003.eqiad.wmnet
* 08:14 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host orespoolcounter1004.eqiad.wmnet
* 08:12 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single for host orespoolcounter1004.eqiad.wmnet
* 08:12 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2128.codfw.wmnet with reason: REIMAGE
* 08:10 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host orespoolcounter2004.codfw.wmnet
* 08:10 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2128.codfw.wmnet with reason: REIMAGE
* 08:09 dcaro: reprepro updating thirdparty/ceph-octopus repo
* 08:08 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single for host orespoolcounter2004.codfw.wmnet
* 08:07 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on thanos-fe1002.eqiad.wmnet with reason: REIMAGE
* 08:06 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host orespoolcounter2003.codfw.wmnet
* 08:05 filippo@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on thanos-fe1002.eqiad.wmnet with reason: REIMAGE
* 08:04 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single for host orespoolcounter2003.codfw.wmnet
* 07:59 marostegui@cumin1001: dbctl commit (dc=all): 'Remove db1086 from dbctl [[phab:T278229|T278229]]', diff saved to https://phabricator.wikimedia.org/P15482 and previous config saved to /var/cache/conftool/dbconfig/20210420-075949-marostegui.json
* 07:38 XioNoX: BGP: prioritize directly connected peers - [[phab:T280054|T280054]]
* 07:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1161 (re)pooling @ 100%: Repool db1161', diff saved to https://phabricator.wikimedia.org/P15480 and previous config saved to /var/cache/conftool/dbconfig/20210420-073808-root.json
* 07:35 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on thanos-fe2003.codfw.wmnet with reason: REIMAGE
* 07:33 filippo@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on thanos-fe2003.codfw.wmnet with reason: REIMAGE
* 07:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1161 (re)pooling @ 75%: Repool db1161', diff saved to https://phabricator.wikimedia.org/P15479 and previous config saved to /var/cache/conftool/dbconfig/20210420-072305-root.json
* 07:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1161 (re)pooling @ 50%: Repool db1161', diff saved to https://phabricator.wikimedia.org/P15478 and previous config saved to /var/cache/conftool/dbconfig/20210420-070801-root.json
* 07:05 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2074.codfw.wmnet with reason: REIMAGE
* 07:03 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2074.codfw.wmnet with reason: REIMAGE
* 06:52 marostegui@cumin1001: dbctl commit (dc=all): 'db1161 (re)pooling @ 25%: Repool db1161', diff saved to https://phabricator.wikimedia.org/P15477 and previous config saved to /var/cache/conftool/dbconfig/20210420-065257-root.json
* 06:38 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2127.codfw.wmnet with reason: REIMAGE
* 06:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2127.codfw.wmnet with reason: REIMAGE
* 06:16 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2073.codfw.wmnet with reason: REIMAGE
* 06:14 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2074.codfw.wmnet with reason: REIMAGE
* 06:13 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2073.codfw.wmnet with reason: REIMAGE
* 06:12 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2105.codfw.wmnet with reason: REIMAGE
* 06:11 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2074.codfw.wmnet with reason: REIMAGE
* 06:10 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2105.codfw.wmnet with reason: REIMAGE


== 2021-04-19 ==
== 2021-06-09 ==
* 22:56 robh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1040.eqiad.wmnet with reason: REIMAGE
* 22:12 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host doh1002.wikimedia.org
* 22:53 robh@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1040.eqiad.wmnet with reason: REIMAGE
* 22:03 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm for new host doh1002.wikimedia.org
* 22:37 Trey314159: reindexing wikidata on cloudelastic finished/failed ([[phab:T274200|T274200]])
* 21:59 dzahn@cumin1001: END (ERROR) - Cookbook sre.ganeti.makevm (exit_code=97) for new host doh1002.wikimedia.org
* 22:37 Trey314159: reindexing commons and wikidata on elastic@eqiad finished/failed ([[phab:T274200|T274200]])
* 21:53 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm for new host doh1002.wikimedia.org
* 21:46 robh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcephosd1018.wikimedia.org with reason: REIMAGE
* 21:51 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host doh1001.wikimedia.org
* 21:44 robh@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephosd1018.wikimedia.org with reason: REIMAGE
* 21:42 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm for new host doh1001.wikimedia.org
* 21:03 sbassett: Deployed security patch for [[phab:T280226|T280226]]
* 21:42 ladsgroup@deploy1002: Synchronized php-1.37.0-wmf.9/extensions/DiscussionTools/modules/dt-ve/CommentTargetWidget.less: Backport: [[gerrit:698681{{!}}Update surface styles for VE changes (T284567)]] (duration: 01m 14s)
* 19:56 dcausse: repool wdqs1005
* 21:40 ladsgroup@deploy1002: Synchronized php-1.37.0-wmf.9/includes/language/LanguageConverter.php: Backport: [[gerrit:699014{{!}}Revert "Add type hint to constructor of LanguageConverter" (T284685)]] (duration: 01m 24s)
* 19:15 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thumbor2004.codfw.wmnet
* 21:08 mutante: rsyncing static-bugzilla HTML from miscweb1002 to deploy1002
* 19:04 jiji@cumin1001: START - Cookbook sre.hosts.reboot-single for host thumbor2004.codfw.wmnet
* 21:00 mutante: deploy1002 - creating temp dir /srv/miscweb to rsync static-bugzilla data to, coming from miscweb1002 [[phab:T281538|T281538]]
* 18:57 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thumbor2003.codfw.wmnet
* 20:36 mutante: deployed temp ferm change on deployment servers to let miscweb dump data, puppetized. scap pull from mwdebug1001 works, deployment good to go
* 18:56 ppchelko@deploy1002: Synchronized php-1.37.0-wmf.1/tests: Factor out rollback logic from WikiPage - /tests (duration: 00m 59s)
* 19:08 jhuneidi@deploy1002: Synchronized php: group1 wikis to 1.37.0-wmf.9  refs [[phab:T281150|T281150]] (duration: 01m 07s)
* 18:55 ppchelko@deploy1002: Synchronized php-1.37.0-wmf.1/maintenance: Factor out rollback logic from WikiPage - /maintenance (duration: 00m 57s)
* 19:06 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.37.0-wmf.9  refs [[phab:T281150|T281150]]
* 18:51 ppchelko@deploy1002: Synchronized php-1.37.0-wmf.1/includes/: Factor out rollback logic from WikiPage - /includes (duration: 01m 01s)
* 18:07 Krinkle: krinkle@mwmaint1002$ mwscript deleteEqualMessages.php (foreachwiki)
* 18:47 jiji@cumin1001: START - Cookbook sre.hosts.reboot-single for host thumbor2003.codfw.wmnet
* 17:52 Krinkle: krinkle@mwmaint1002$ mwscript deleteEqualMessages.php --wiki rmywiki
* 18:47 jiji@cumin1001: conftool action : set/pooled=yes; selector: cluster=thumbor,name=thumbor2001.codfw.wmnet
* 17:32 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cloudmetrics1002.eqiad.wmnet
* 18:44 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thumbor2002.codfw.wmnet
* 17:32 aborrero@cumin1001: START - Cookbook sre.hosts.remove-downtime for cloudmetrics1002.eqiad.wmnet
* 18:39 ppchelko@deploy1002: Synchronized wmf-config/CommonSettings.php: [[phab:T274436|T274436]] Math: Enable RESTBase-less Wikidata math validation (duration: 00m 56s)
* 17:30 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on maps2009.codfw.wmnet with reason: Rebuilding as buster master
* 18:34 jiji@cumin1001: START - Cookbook sre.hosts.reboot-single for host thumbor2002.codfw.wmnet
* 17:29 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on maps2009.codfw.wmnet with reason: Rebuilding as buster master
* 18:21 ppchelko@deploy1002: Synchronized wmf-config/CommonSettings.php: [[phab:T249745|T249745]] [EventBus] Make eventage-main timeout consistent with envoy (duration: 00m 56s)
* 17:16 jayme: updated python3-docker-report to 0.0.12 on chartmuseum2001.codfw.wmnet,chartmuseum1001.eqiad.wmnet,deneb.codfw.wmnet,registry[2003-2008].codfw.wmnet,registry[1003-1004].eqiad.wmnet
* 18:13 urbanecm@deploy1002: Synchronized php-1.37.0-wmf.1/extensions/DiscussionTools/: {{Gerrit|66d137b75a7073c7162c443cc8c6ec6f3be714e0}}: Remove <header> tags around headings for compat with MobileFrontend ([[phab:T280433|T280433]]) (duration: 00m 59s)
* 16:35 jayme: import docker-report 0.0.12 into buster-wikimedia
* 18:03 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thumbor2001.codfw.wmnet
* 15:37 hnowlan: rebuilding maps2009 as buster master
* 18:02 urbanecm@deploy1002: Synchronized php-1.37.0-wmf.1/extensions/GrowthExperiments/includes/Mentorship/Store/DatabaseMentorStore.php: {{Gerrit|0233507470377f6ac45768e345cd2e359e5d0e57}}: DatabaseMentorStore: Fix deprecation warning in upsert query ([[phab:T280525|T280525]]) (duration: 00m 57s)
* 15:08 vgutierrez: restarting acme-chief on acmechief1001
* 17:47 jiji@cumin1001: START - Cookbook sre.hosts.reboot-single for host thumbor2001.codfw.wmnet
* 15:02 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on maps2009.codfw.wmnet with reason: Rebuilding as buster master
* 17:40 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thumbor1004.eqiad.wmnet
* 15:02 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on maps2009.codfw.wmnet with reason: Rebuilding as buster master
* 17:23 jiji@cumin1001: START - Cookbook sre.hosts.reboot-single for host thumbor1004.eqiad.wmnet
* 15:01 volans@deploy1002: Finished deploy [netbox/deploy@91fd299]: Release v2.10.4-wmf3 to netbox-next.w.o (duration: 00m 55s)
* 17:20 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thumbor1003.eqiad.wmnet
* 15:00 volans@deploy1002: Started deploy [netbox/deploy@91fd299]: Release v2.10.4-wmf3 to netbox-next.w.o
* 17:11 jiji@cumin1001: START - Cookbook sre.hosts.reboot-single for host thumbor1003.eqiad.wmnet
* 14:57 volans@deploy1002: Finished deploy [netbox/deploy@91fd299]: Release v2.10.4-wmf3 to netbox-next.w.o (duration: 00m 04s)
* 17:08 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thumbor1002.eqiad.wmnet
* 14:57 volans@deploy1002: Started deploy [netbox/deploy@91fd299]: Release v2.10.4-wmf3 to netbox-next.w.o
* 16:57 jiji@cumin1001: START - Cookbook sre.hosts.reboot-single for host thumbor1002.eqiad.wmnet
* 14:51 volans@deploy1002: Finished deploy [netbox/deploy@91fd299]: Release v2.10.4-wmf3 to netbox-next.w.o (duration: 00m 15s)
* 16:57 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thumbor1001.eqiad.wmnet
* 14:50 volans@deploy1002: Started deploy [netbox/deploy@91fd299]: Release v2.10.4-wmf3 to netbox-next.w.o
* 16:48 jiji@cumin1001: START - Cookbook sre.hosts.reboot-single for host thumbor1001.eqiad.wmnet
* 14:45 moritzm: installing postgresql 9.6 security updates on stretch
* 16:25 hoo: Updated the Wikidata property suggester with data from the 2021-04-12 JSON dump (with pre-applied [[phab:T132839|T132839]] workarounds)
* 14:37 otto@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Migrate WMDEBanner* schemas to EventPlatform on all wikis - [[phab:T282562|T282562]] (duration: 01m 06s)
* 16:11 marostegui@cumin1001: dbctl commit (dc=all): 'db1182 (re)pooling @ 100%: Slowly pool db1182 for the first time in s2 [[phab:T275633|T275633]]', diff saved to https://phabricator.wikimedia.org/P15474 and previous config saved to /var/cache/conftool/dbconfig/20210419-161134-root.json
* 14:33 otto@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Migrate LandingPageImpression schema to EventPlatform on all wikis - [[phab:T282855|T282855]] (duration: 01m 06s)
* 15:56 marostegui@cumin1001: dbctl commit (dc=all): 'db1182 (re)pooling @ 90%: Slowly pool db1182 for the first time in s2 [[phab:T275633|T275633]]', diff saved to https://phabricator.wikimedia.org/P15473 and previous config saved to /var/cache/conftool/dbconfig/20210419-155631-root.json
* 14:23 otto@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Migrate LandingPageImpression schema to EventPlatform on testwiki - [[phab:T282855|T282855]] (duration: 01m 07s)
* 15:41 marostegui@cumin1001: dbctl commit (dc=all): 'db1182 (re)pooling @ 80%: Slowly pool db1182 for the first time in s2 [[phab:T275633|T275633]]', diff saved to https://phabricator.wikimedia.org/P15472 and previous config saved to /var/cache/conftool/dbconfig/20210419-154127-root.json
* 14:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 100%: Repool db1166 after schema change', diff saved to https://phabricator.wikimedia.org/P16358 and previous config saved to /var/cache/conftool/dbconfig/20210609-141807-root.json
* 15:26 marostegui@cumin1001: dbctl commit (dc=all): 'db1182 (re)pooling @ 70%: Slowly pool db1182 for the first time in s2 [[phab:T275633|T275633]]', diff saved to https://phabricator.wikimedia.org/P15471 and previous config saved to /var/cache/conftool/dbconfig/20210419-152623-root.json
* 14:08 hnowlan@puppetmaster1001: conftool action : set/weight=0; selector: name=maps2009.codfw.wmnet
* 15:24 volans: reverted debmonitor-client to 0.2.0-1 on apt.w.o for jessie-wikimedia
* 14:08 hnowlan@puppetmaster1001: conftool action : set/weight=10; selector: name=maps2009.codfw.wmnet
* 15:11 marostegui@cumin1001: dbctl commit (dc=all): 'db1182 (re)pooling @ 60%: Slowly pool db1182 for the first time in s2 [[phab:T275633|T275633]]', diff saved to https://phabricator.wikimedia.org/P15470 and previous config saved to /var/cache/conftool/dbconfig/20210419-151119-root.json
* 13:59 otto@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Migrate WMDEBanner* schemas to EventPlatform on testwiki - [[phab:T282562|T282562]] (duration: 01m 08s)
* 14:56 marostegui@cumin1001: dbctl commit (dc=all): 'db1182 (re)pooling @ 50%: Slowly pool db1182 for the first time in s2 [[phab:T275633|T275633]]', diff saved to https://phabricator.wikimedia.org/P15469 and previous config saved to /var/cache/conftool/dbconfig/20210419-145616-root.json
* 13:56 XioNoX: upgrade Routinator 3000 to 0.9.0 on rpki1001 - [[phab:T282469|T282469]]
* 14:53 reedy@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Rename RelatedArticles wmg variables to wg (duration: 00m 56s)
* 13:54 XioNoX: Add Routinator 3000 0.9.0 to the APT repo - [[phab:T282469|T282469]]
* 14:53 jbond42: update debmonitor-client - [[phab:T280484|T280484]]
* 13:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 50%: Repool db1166 after schema change', diff saved to https://phabricator.wikimedia.org/P16356 and previous config saved to /var/cache/conftool/dbconfig/20210609-134800-root.json
* 14:52 reedy@deploy1002: Synchronized wmf-config/CommonSettings.php: Remove RelatedArticles extension function and wmg to wg mapping (duration: 00m 56s)
* 13:32 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 25%: Repool db1166 after schema change', diff saved to https://phabricator.wikimedia.org/P16355 and previous config saved to /var/cache/conftool/dbconfig/20210609-133257-root.json
* 14:48 reedy@deploy1002: Synchronized wmf-config/PoolCounterSettings.php: Use namespaced PoolCounter Client (duration: 00m 57s)
* 13:30 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1166', diff saved to https://phabricator.wikimedia.org/P16354 and previous config saved to /var/cache/conftool/dbconfig/20210609-132958-marostegui.json
* 14:44 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1086 [[phab:T278229|T278229]]', diff saved to https://phabricator.wikimedia.org/P15468 and previous config saved to /var/cache/conftool/dbconfig/20210419-144422-marostegui.json
* 13:12 moritzm: installing nginx security updates
* 14:41 marostegui@cumin1001: dbctl commit (dc=all): 'db1182 (re)pooling @ 40%: Slowly pool db1182 for the first time in s2 [[phab:T275633|T275633]]', diff saved to https://phabricator.wikimedia.org/P15467 and previous config saved to /var/cache/conftool/dbconfig/20210419-144112-root.json
* 13:10 jbond@deploy1002: Finished deploy [netbox/deploy@f94ce0f]: test master with 698968 (duration: 02m 26s)
* 14:41 volans: uploaded debmonitor-client 0.2.8 to apt.w.o for jessie, stretch, buster, bullseye
* 13:07 jbond@deploy1002: Started deploy [netbox/deploy@f94ce0f]: test master with 698968
* 14:29 hnowlan: imported envoyproxy_1.16.3-1 debs to envoy-future component
* 13:07 jbond@deploy1002: Finished deploy [netbox/deploy@f94ce0f]: test master with 698968 (duration: 00m 10s)
* 14:26 marostegui@cumin1001: dbctl commit (dc=all): 'db1182 (re)pooling @ 30%: Slowly pool db1182 for the first time in s2 [[phab:T275633|T275633]]', diff saved to https://phabricator.wikimedia.org/P15466 and previous config saved to /var/cache/conftool/dbconfig/20210419-142608-root.json
* 13:07 jbond@deploy1002: Started deploy [netbox/deploy@f94ce0f]: test master with 698968
* 14:11 marostegui@cumin1001: dbctl commit (dc=all): 'db1182 (re)pooling @ 20%: Slowly pool db1182 for the first time in s2 [[phab:T275633|T275633]]', diff saved to https://phabricator.wikimedia.org/P15465 and previous config saved to /var/cache/conftool/dbconfig/20210419-141105-root.json
* 13:07 jbond@deploy1002: Finished deploy [netbox/deploy@f94ce0f]: test master with 698968 (duration: 01m 14s)
* 13:56 marostegui@cumin1001: dbctl commit (dc=all): 'db1182 (re)pooling @ 15%: Slowly pool db1182 for the first time in s2 [[phab:T275633|T275633]]', diff saved to https://phabricator.wikimedia.org/P15464 and previous config saved to /var/cache/conftool/dbconfig/20210419-135601-root.json
* 13:05 jbond@deploy1002: Started deploy [netbox/deploy@f94ce0f]: test master with 698968
* 13:41 marostegui@cumin1001: dbctl commit (dc=all): 'db1182 (re)pooling @ 10%: Slowly pool db1182 for the first time in s2 [[phab:T275633|T275633]]', diff saved to https://phabricator.wikimedia.org/P15463 and previous config saved to /var/cache/conftool/dbconfig/20210419-134057-root.json
* 13:01 marostegui@cumin1001: dbctl commit (dc=all): 'db1143 (re)pooling @ 100%: Repool db1143 after schema change', diff saved to https://phabricator.wikimedia.org/P16351 and previous config saved to /var/cache/conftool/dbconfig/20210609-130114-root.json
* 13:25 marostegui@cumin1001: dbctl commit (dc=all): 'db1182 (re)pooling @ 5%: Slowly pool db1182 for the first time in s2 [[phab:T275633|T275633]]', diff saved to https://phabricator.wikimedia.org/P15462 and previous config saved to /var/cache/conftool/dbconfig/20210419-132554-root.json
* 12:50 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2038.codfw.wmnet
* 13:19 marostegui@cumin1001: dbctl commit (dc=all): 'Pool db1182 in s2 for the first time with minimal weight [[phab:T275633|T275633]]', diff saved to https://phabricator.wikimedia.org/P15461 and previous config saved to /var/cache/conftool/dbconfig/20210419-131936-marostegui.json
* 12:47 jbond@deploy1002: Finished deploy [netbox/deploy@f94ce0f]: roll back to HEAD~1 (duration: 00m 53s)
* 13:15 marostegui@cumin1001: dbctl commit (dc=all): 'Pool db1182 in s2 for the first time with minimal weight [[phab:T275633|T275633]]', diff saved to https://phabricator.wikimedia.org/P15460 and previous config saved to /var/cache/conftool/dbconfig/20210419-131501-marostegui.json
* 12:46 jbond@deploy1002: Started deploy [netbox/deploy@f94ce0f]: roll back to HEAD~1
* 12:58 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|bd076306c0ae0428ff13743f499b2a02d42b6eab}}: wgGEMentorshipMigrationStage: Set to WRITE_BOTH/READ_OLD everywhere ([[phab:T279853|T279853]]) (duration: 00m 57s)
* 12:46 marostegui@cumin1001: dbctl commit (dc=all): 'db1143 (re)pooling @ 75%: Repool db1143 after schema change', diff saved to https://phabricator.wikimedia.org/P16350 and previous config saved to /var/cache/conftool/dbconfig/20210609-124610-root.json
* 12:56 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1182', diff saved to https://phabricator.wikimedia.org/P15459 and previous config saved to /var/cache/conftool/dbconfig/20210419-125600-marostegui.json
* 12:43 jbond@deploy1002: Finished deploy [netbox/deploy@98cf8df]: (no justification provided) (duration: 00m 28s)
* 12:54 marostegui@cumin1001: dbctl commit (dc=all): 'Pool db1182 in s2 for the first time with minimal weight [[phab:T275633|T275633]]', diff saved to https://phabricator.wikimedia.org/P15458 and previous config saved to /var/cache/conftool/dbconfig/20210419-125407-marostegui.json
* 12:42 jbond@deploy1002: Started deploy [netbox/deploy@98cf8df]: (no justification provided)
* 12:53 marostegui@cumin1001: dbctl commit (dc=all): 'Add db1182 to dbctl [[phab:T275633|T275633]]', diff saved to https://phabricator.wikimedia.org/P15457 and previous config saved to /var/cache/conftool/dbconfig/20210419-125301-marostegui.json
* 12:42 jbond@deploy1002: Finished deploy [netbox/deploy@98cf8df]: (no justification provided) (duration: 01m 08s)
* 12:51 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|ef0f68e2a9c1c638911bb06c47ba6e8ef88ee393}}: testwiki: wgGEMentorshipMigrationStage: Set to WRITE_BOTH/READ_NEW ([[phab:T279853|T279853]]) (duration: 00m 57s)
* 12:41 jbond@deploy1002: Started deploy [netbox/deploy@98cf8df]: (no justification provided)
* 12:38 Urbanecm: mwscript extensions/GrowthExperiments/maintenance/migrateMentorMenteeRelationship.php --wiki=cswiki # [[phab:T279853|T279853]]
* 12:41 jbond@deploy1002: Finished deploy [netbox/deploy@98cf8df]: (no justification provided) (duration: 00m 47s)
* 12:34 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2126.codfw.wmnet with reason: REIMAGE
* 12:40 jbond@deploy1002: Started deploy [netbox/deploy@98cf8df]: (no justification provided)
* 12:34 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|3e3cce192f1e99cbcae739f234271411d10974ac}}: cswiki: wgGEMentorshipMigrationStage: Set to WRITE_BOTH/READ_OLD ([[phab:T279853|T279853]]) (duration: 00m 58s)
* 12:39 jbond@deploy1002: Finished deploy [netbox/deploy@98cf8df]: (no justification provided) (duration: 00m 41s)
* 12:32 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2126.codfw.wmnet with reason: REIMAGE
* 12:39 jbond@deploy1002: Started deploy [netbox/deploy@98cf8df]: (no justification provided)
* 12:26 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2072.codfw.wmnet with reason: REIMAGE
* 12:36 marostegui@cumin1001: dbctl commit (dc=all): 'db1141 (re)pooling @ 100%: Repool db1141 after schema change', diff saved to https://phabricator.wikimedia.org/P16349 and previous config saved to /var/cache/conftool/dbconfig/20210609-123615-root.json
* 12:24 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2072.codfw.wmnet with reason: REIMAGE
* 12:35 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host ms-be2038.codfw.wmnet
* 11:39 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudgw1002.eqiad.wmnet
* 12:33 godog: lists1001:rm /var/lib/prometheus/node.d/mailman_queues.prom
* 11:37 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudgw1001.eqiad.wmnet
* 12:31 marostegui@cumin1001: dbctl commit (dc=all): 'db1143 (re)pooling @ 50%: Repool db1143 after schema change', diff saved to https://phabricator.wikimedia.org/P16348 and previous config saved to /var/cache/conftool/dbconfig/20210609-123106-root.json
* 11:33 moritzm: imported debdeploy 0.0.99.13-1+deb11u1 to bullseye-wikimedia [[phab:T275873|T275873]]
* 12:21 marostegui@cumin1001: dbctl commit (dc=all): 'db1141 (re)pooling @ 75%: Repool db1141 after schema change', diff saved to https://phabricator.wikimedia.org/P16347 and previous config saved to /var/cache/conftool/dbconfig/20210609-122111-root.json
* 11:27 Urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/GrowthExperiments/maintenance/migrateMentorMenteeRelationship.php --wiki=testwiki --force # [[phab:T279853|T279853]]
* 12:18 jbond@deploy1002: Finished deploy [netbox/deploy@98cf8df]: (no justification provided) (duration: 03m 38s)
* 11:11 Urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/GrowthExperiments/maintenance/migrateMentorMenteeRelationship.php --wiki=testwiki # [[phab:T279853|T279853]]
* 12:16 marostegui@cumin1001: dbctl commit (dc=all): 'db1143 (re)pooling @ 25%: Repool db1143 after schema change', diff saved to https://phabricator.wikimedia.org/P16345 and previous config saved to /var/cache/conftool/dbconfig/20210609-121603-root.json
* 11:10 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|03f8ed819091624f5ae4a8d7ed3631dc322fabcd}}: testwiki: wgGEMentorshipMigrationStage: Set to WRITE_BOTH/READ_OLD ([[phab:T279853|T279853]]) (duration: 00m 57s)
* 12:15 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1143', diff saved to https://phabricator.wikimedia.org/P16344 and previous config saved to /var/cache/conftool/dbconfig/20210609-121501-marostegui.json
* 11:05 ladsgroup@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:680871{{!}}Disable legacy javascript variable for the rest of wikis (T72470)]] (duration: 00m 57s)
* 12:14 jbond@deploy1002: Started deploy [netbox/deploy@98cf8df]: (no justification provided)
* 11:02 moritzm: import promethus-rsyslog-exporter for bullseye-wikimedia/main
* 12:13 jbond@deploy1002: Finished deploy [netbox/deploy@98cf8df]: (no justification provided) (duration: 00m 53s)
* 11:01 aborrero@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudgw1002.eqiad.wmnet
* 12:12 jbond@deploy1002: Started deploy [netbox/deploy@98cf8df]: (no justification provided)
* 11:01 aborrero@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudgw1001.eqiad.wmnet
* 12:10 jbond@deploy1002: Finished deploy [netbox/deploy@f94ce0f]: (no justification provided) (duration: 00m 44s)
* 10:46 jdrewniak@deploy1002: Synchronized portals: Wikimedia Portals Update: [[gerrit:681008{{!}} Bumping portals to master (T128546)]] (duration: 00m 57s)
* 12:09 jbond@deploy1002: Started deploy [netbox/deploy@f94ce0f]: (no justification provided)
* 10:45 jdrewniak@deploy1002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:681008{{!}} Bumping portals to master (T128546)]] (duration: 00m 58s)
* 12:09 hnowlan: running `nodetool decommission` on maps2009
* 10:34 jmm@cumin2001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on theemin.codfw.wmnet with reason: REIMAGE
* 12:06 hnowlan: stopped tilerator on maps2009
* 10:32 jmm@cumin2001: START - Cookbook sre.hosts.downtime for 2:00:00 on theemin.codfw.wmnet with reason: REIMAGE
* 12:06 marostegui@cumin1001: dbctl commit (dc=all): 'db1141 (re)pooling @ 50%: Repool db1141 after schema change', diff saved to https://phabricator.wikimedia.org/P16343 and previous config saved to /var/cache/conftool/dbconfig/20210609-120608-root.json
* 10:24 hnowlan: imported 1.16.3 into envoy-future
* 12:05 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on maps2009.codfw.wmnet with reason: Postgis version juggling
* 10:22 moritzm: reimaging theemin to bullseye
* 12:05 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on maps2009.codfw.wmnet with reason: Postgis version juggling
* 10:15 dcausse: depooling wdqs1005
* 12:04 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps2009.codfw.wmnet
* 10:10 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1156.eqiad.wmnet with reason: REIMAGE
* 12:03 jbond@deploy1002: Finished deploy [netbox/deploy@f94ce0f]: (no justification provided) (duration: 00m 06s)
* 10:08 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1156.eqiad.wmnet with reason: REIMAGE
* 12:03 jbond@deploy1002: Started deploy [netbox/deploy@f94ce0f]: (no justification provided)
* 10:05 arturo: aborrero@apt1001:~ $ sudo -i reprepro --component thirdparty/kubeadm-k8s-1-18 update buster-wikimedia
* 12:00 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|ac43baa}}: {{Gerrit|d185728}}: WelcomeSurveyExperimentalGroups: Use new syntax ([[phab:T284599|T284599]]) (duration: 01m 19s)
* 10:04 arturo: aborrero@apt1001:~ $ sudo -i reprepro --delete clearvanished (remove old buster-wikimedia{{!}}thirdparty/kubeadm-k8s-1-15,16 repos and packages)
* 11:59 jbond@deploy1002: Finished deploy [netbox/deploy@f94ce0f]: (no justification provided) (duration: 00m 54s)
* 09:56 ema: cp3051: varnish-frontend-restart to apply exp policy settings changes starting from empty cache [[phab:T275809|T275809]]
* 11:58 jbond@deploy1002: Started deploy [netbox/deploy@f94ce0f]: (no justification provided)
* 09:53 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on thanos-fe2002.codfw.wmnet with reason: REIMAGE
* 11:54 jbond@deploy1002: Finished deploy [netbox/deploy@f94ce0f]: (no justification provided) (duration: 00m 41s)
* 09:51 filippo@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on thanos-fe2002.codfw.wmnet with reason: REIMAGE
* 11:54 jbond@deploy1002: Started deploy [netbox/deploy@f94ce0f]: (no justification provided)
* 09:42 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1161.eqiad.wmnet with reason: REIMAGE
* 11:53 jbond@deploy1002: Finished deploy [netbox/deploy@f94ce0f]: (no justification provided) (duration: 03m 11s)
* 09:39 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1161.eqiad.wmnet with reason: REIMAGE
* 11:51 marostegui@cumin1001: dbctl commit (dc=all): 'db1141 (re)pooling @ 25%: Repool db1141 after schema change', diff saved to https://phabricator.wikimedia.org/P16342 and previous config saved to /var/cache/conftool/dbconfig/20210609-115104-root.json
* 09:22 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 100%: Slowly pool db1179 for the first time in s3 [[phab:T275633|T275633]]', diff saved to https://phabricator.wikimedia.org/P15454 and previous config saved to /var/cache/conftool/dbconfig/20210419-092251-root.json
* 11:50 jbond@deploy1002: Started deploy [netbox/deploy@f94ce0f]: (no justification provided)
* 09:22 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1161 [[phab:T280492|T280492]]', diff saved to https://phabricator.wikimedia.org/P15453 and previous config saved to /var/cache/conftool/dbconfig/20210419-092234-marostegui.json
* 11:49 jbond@deploy1002: Finished deploy [netbox/deploy@f94ce0f]: (no justification provided) (duration: 02m 16s)
* 09:15 marostegui@cumin1001: dbctl commit (dc=all): 'db1079 (re)pooling @ 100%: Repool db1079', diff saved to https://phabricator.wikimedia.org/P15452 and previous config saved to /var/cache/conftool/dbconfig/20210419-091535-root.json
* 11:49 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1141', diff saved to https://phabricator.wikimedia.org/P16341 and previous config saved to /var/cache/conftool/dbconfig/20210609-114944-marostegui.json
* 09:07 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 90%: Slowly pool db1179 for the first time in s3 [[phab:T275633|T275633]]', diff saved to https://phabricator.wikimedia.org/P15451 and previous config saved to /var/cache/conftool/dbconfig/20210419-090747-root.json
* 11:47 jbond@deploy1002: Started deploy [netbox/deploy@f94ce0f]: (no justification provided)
* 09:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1079 (re)pooling @ 75%: Repool db1079', diff saved to https://phabricator.wikimedia.org/P15450 and previous config saved to /var/cache/conftool/dbconfig/20210419-090031-root.json
* 11:47 jbond@deploy1002: Finished deploy [netbox/deploy@f94ce0f]: (no justification provided) (duration: 00m 05s)
* 08:52 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 80%: Slowly pool db1179 for the first time in s3 [[phab:T275633|T275633]]', diff saved to https://phabricator.wikimedia.org/P15449 and previous config saved to /var/cache/conftool/dbconfig/20210419-085243-root.json
* 11:46 jbond@deploy1002: Started deploy [netbox/deploy@f94ce0f]: (no justification provided)
* 08:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1087 [[phab:T272008|T272008]]', diff saved to https://phabricator.wikimedia.org/P15448 and previous config saved to /var/cache/conftool/dbconfig/20210419-084834-marostegui.json
* 11:46 jbond@deploy1002: Finished deploy [netbox/deploy@f94ce0f]: (no justification provided) (duration: 00m 53s)
* 08:45 marostegui@cumin1001: dbctl commit (dc=all): 'db1079 (re)pooling @ 50%: Repool db1079', diff saved to https://phabricator.wikimedia.org/P15447 and previous config saved to /var/cache/conftool/dbconfig/20210419-084528-root.json
* 11:45 jbond@deploy1002: Started deploy [netbox/deploy@f94ce0f]: (no justification provided)
* 08:45 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1087 [[phab:T272008|T272008]]', diff saved to https://phabricator.wikimedia.org/P15446 and previous config saved to /var/cache/conftool/dbconfig/20210419-084523-marostegui.json
* 11:40 jbond@deploy1002: Finished deploy [netbox/deploy@c70df91]: redeploy HEAD~1 (duration: 01m 55s)
* 08:37 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 70%: Slowly pool db1179 for the first time in s3 [[phab:T275633|T275633]]', diff saved to https://phabricator.wikimedia.org/P15445 and previous config saved to /var/cache/conftool/dbconfig/20210419-083740-root.json
* 11:38 jbond@deploy1002: Started deploy [netbox/deploy@c70df91]: redeploy HEAD~1
* 08:35 ema: restart debmonitor-client.service on cp4030, dns5002, an-worker1106 [[phab:T280484|T280484]]
* 11:36 jbond@deploy1002: Finished deploy [netbox/deploy@f94ce0f]: redeploy HEAD~1 (duration: 00m 54s)
* 08:34 marostegui: Testing log
* 11:35 jbond@deploy1002: Started deploy [netbox/deploy@f94ce0f]: redeploy HEAD~1
* 08:30 marostegui@cumin1001: dbctl commit (dc=all): 'db1085 (re)pooling @ 100%: Repool db1085', diff saved to https://phabricator.wikimedia.org/P15444 and previous config saved to /var/cache/conftool/dbconfig/20210419-083021-root.json
* 11:34 jbond@deploy1002: Finished deploy [netbox/deploy@98cf8df]: re-try (duration: 02m 23s)
* 08:30 marostegui@cumin1001: dbctl commit (dc=all): 'db1079 (re)pooling @ 25%: Repool db1079', diff saved to https://phabricator.wikimedia.org/P15443 and previous config saved to /var/cache/conftool/dbconfig/20210419-083018-root.json
* 11:32 jbond@deploy1002: Started deploy [netbox/deploy@98cf8df]: re-try
* 08:26 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1079 [[phab:T272008|T272008]]', diff saved to https://phabricator.wikimedia.org/P15442 and previous config saved to /var/cache/conftool/dbconfig/20210419-082559-marostegui.json
* 11:32 jbond@deploy1002: Finished deploy [netbox/deploy@98cf8df]: re-try (duration: 00m 59s)
* 08:22 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 60%: Slowly pool db1179 for the first time in s3 [[phab:T275633|T275633]]', diff saved to https://phabricator.wikimedia.org/P15441 and previous config saved to /var/cache/conftool/dbconfig/20210419-082236-root.json
* 11:31 jbond@deploy1002: Started deploy [netbox/deploy@98cf8df]: re-try
* 08:20 marostegui@cumin1001: dbctl commit (dc=all): 'db1082 (re)pooling @ 100%: Repool db1082', diff saved to https://phabricator.wikimedia.org/P15440 and previous config saved to /var/cache/conftool/dbconfig/20210419-082000-root.json
* 11:27 jbond: drop keep_env from sudo config - #[[phab:T275852|T275852]]
* 08:15 marostegui@cumin1001: dbctl commit (dc=all): 'db1085 (re)pooling @ 75%: Repool db1085', diff saved to https://phabricator.wikimedia.org/P15439 and previous config saved to /var/cache/conftool/dbconfig/20210419-081517-root.json
* 11:22 jbond@deploy1002: Finished deploy [netbox/deploy@98cf8df]: (no justification provided) (duration: 00m 43s)
* 08:07 dcaro@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on labstore1004.eqiad.wmnet with reason: Restarting mysql
* 11:22 jbond@deploy1002: Started deploy [netbox/deploy@98cf8df]: (no justification provided)
* 08:07 dcaro@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on labstore1004.eqiad.wmnet with reason: Restarting mysql
* 11:21 jbond@deploy1002: Finished deploy [netbox/deploy@98cf8df]: (no justification provided) (duration: 01m 15s)
* 08:07 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 50%: Slowly pool db1179 for the first time in s3 [[phab:T275633|T275633]]', diff saved to https://phabricator.wikimedia.org/P15438 and previous config saved to /var/cache/conftool/dbconfig/20210419-080732-root.json
* 11:20 jbond@deploy1002: Started deploy [netbox/deploy@98cf8df]: (no justification provided)
* 08:07 godog: swift eqiad-prod: less weight for ms-be[1019-1026] / more weight to ms-be106[0-3] - [[phab:T272836|T272836]]
* 11:11 awight: EU deployment window complete
* 08:05 marostegui@cumin1001: dbctl commit (dc=all): 'db1082 (re)pooling @ 75%: Repool db1082', diff saved to https://phabricator.wikimedia.org/P15437 and previous config saved to /var/cache/conftool/dbconfig/20210419-080456-root.json
* 11:10 awight@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:698855{{!}}Set wgAutoConfirmCount to 10 for enwikisource (T284627)]] (duration: 02m 04s)
* 08:04 marostegui@cumin1001: dbctl commit (dc=all): 'db1121 (re)pooling @ 100%: Repool db1121', diff saved to https://phabricator.wikimedia.org/P15436 and previous config saved to /var/cache/conftool/dbconfig/20210419-080454-root.json
* 10:22 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1130.eqiad.wmnet with reason: REIMAGE
* 08:03 moritzm: installing python-bleach security updates
* 10:18 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1130.eqiad.wmnet with reason: REIMAGE
* 08:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1085 (re)pooling @ 50%: Repool db1085', diff saved to https://phabricator.wikimedia.org/P15435 and previous config saved to /var/cache/conftool/dbconfig/20210419-080013-root.json
* 10:15 jbond@deploy1002: Finished deploy [netbox/deploy@c70df91]: Force deploy of gerrit/672831 to netbox-next  (duration: 00m 53s)
* 07:52 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 40%: Slowly pool db1179 for the first time in s3 [[phab:T275633|T275633]]', diff saved to https://phabricator.wikimedia.org/P15434 and previous config saved to /var/cache/conftool/dbconfig/20210419-075229-root.json
* 10:14 jbond@deploy1002: Started deploy [netbox/deploy@c70df91]: Force deploy of gerrit/672831 to netbox-next
* 07:51 moritzm: upgrade mwdebug2002 to PHP 7.2.34
* 10:13 jbond@deploy1002: Finished deploy [netbox/deploy@c70df91]: Force deploy of gerrit/672831 to netbox-next  (duration: 05m 41s)
* 07:49 marostegui@cumin1001: dbctl commit (dc=all): 'db1082 (re)pooling @ 50%: Repool db1082', diff saved to https://phabricator.wikimedia.org/P15433 and previous config saved to /var/cache/conftool/dbconfig/20210419-074953-root.json
* 10:07 jbond@deploy1002: Started deploy [netbox/deploy@c70df91]: Force deploy of gerrit/672831 to netbox-next
* 07:49 marostegui@cumin1001: dbctl commit (dc=all): 'db1121 (re)pooling @ 75%: Repool db1121', diff saved to https://phabricator.wikimedia.org/P15432 and previous config saved to /var/cache/conftool/dbconfig/20210419-074950-root.json
* 10:06 jbond@deploy1002: Finished deploy [netbox/deploy@c70df91]: Force deploy of gerrit/672831 to netbox-next  (duration: 00m 38s)
* 07:45 marostegui@cumin1001: dbctl commit (dc=all): 'db1085 (re)pooling @ 25%: Repool db1085', diff saved to https://phabricator.wikimedia.org/P15431 and previous config saved to /var/cache/conftool/dbconfig/20210419-074510-root.json
* 10:06 jbond@deploy1002: Started deploy [netbox/deploy@c70df91]: Force deploy of gerrit/672831 to netbox-next
* 07:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1085 [[phab:T272008|T272008]]', diff saved to https://phabricator.wikimedia.org/P15430 and previous config saved to /var/cache/conftool/dbconfig/20210419-074155-marostegui.json
* 10:04 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1130 [[phab:T283235|T283235]]', diff saved to https://phabricator.wikimedia.org/P16337 and previous config saved to /var/cache/conftool/dbconfig/20210609-100423-marostegui.json
* 07:37 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 30%: Slowly pool db1179 for the first time in s3 [[phab:T275633|T275633]]', diff saved to https://phabricator.wikimedia.org/P15429 and previous config saved to /var/cache/conftool/dbconfig/20210419-073725-root.json
* 10:00 jbond@deploy1002: Finished deploy [netbox/deploy@c70df91]: Force deploy of gerrit/672831 to netbox-next  (duration: 00m 48s)
* 07:34 marostegui@cumin1001: dbctl commit (dc=all): 'db1082 (re)pooling @ 25%: Repool db1082', diff saved to https://phabricator.wikimedia.org/P15428 and previous config saved to /var/cache/conftool/dbconfig/20210419-073449-root.json
* 09:59 jbond@deploy1002: Started deploy [netbox/deploy@c70df91]: Force deploy of gerrit/672831 to netbox-next
* 07:34 marostegui@cumin1001: dbctl commit (dc=all): 'db1121 (re)pooling @ 50%: Repool db1121', diff saved to https://phabricator.wikimedia.org/P15427 and previous config saved to /var/cache/conftool/dbconfig/20210419-073446-root.json
* 09:58 moritzm: cleanup now unused nginx mods and former deps (various X11 libs and libxslt) on schema* after switch towards nginx-light [[phab:T164456|T164456]]
* 07:34 marostegui@cumin1001: dbctl commit (dc=all): 'db1112 (re)pooling @ 100%: Repool db1112', diff saved to https://phabricator.wikimedia.org/P15426 and previous config saved to /var/cache/conftool/dbconfig/20210419-073425-root.json
* 07:54 oblivian@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 07:22 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 20%: Slowly pool db1179 for the first time in s3 [[phab:T275633|T275633]]', diff saved to https://phabricator.wikimedia.org/P15425 and previous config saved to /var/cache/conftool/dbconfig/20210419-072221-root.json
* 07:16 oblivian@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 07:21 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'cxserver' for release 'production' .
* 06:26 XioNoX: Add 185.71.138.0/24 to network::external and diffscan - [[phab:T252132|T252132]]
* 07:19 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'cxserver' for release 'production' .
* 06:12 oblivian@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 07:19 marostegui@cumin1001: dbctl commit (dc=all): 'db1121 (re)pooling @ 25%: Repool db1121', diff saved to https://phabricator.wikimedia.org/P15424 and previous config saved to /var/cache/conftool/dbconfig/20210419-071943-root.json
* 05:32 marostegui@cumin1001: dbctl commit (dc=all): 'db1135 (re)pooling @ 100%: Repool db1135 after dropping an index', diff saved to https://phabricator.wikimedia.org/P16334 and previous config saved to /var/cache/conftool/dbconfig/20210609-053213-root.json
* 07:19 marostegui@cumin1001: dbctl commit (dc=all): 'db1112 (re)pooling @ 75%: Repool db1112', diff saved to https://phabricator.wikimedia.org/P15423 and previous config saved to /var/cache/conftool/dbconfig/20210419-071921-root.json
* 05:17 marostegui@cumin1001: dbctl commit (dc=all): 'db1135 (re)pooling @ 75%: Repool db1135 after dropping an index', diff saved to https://phabricator.wikimedia.org/P16333 and previous config saved to /var/cache/conftool/dbconfig/20210609-051710-root.json
* 07:17 oblivian@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'cxserver' for release 'staging' .
* 05:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1135 (re)pooling @ 50%: Repool db1135 after dropping an index', diff saved to https://phabricator.wikimedia.org/P16332 and previous config saved to /var/cache/conftool/dbconfig/20210609-050206-root.json
* 07:17 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1082 [[phab:T272008|T272008]]', diff saved to https://phabricator.wikimedia.org/P15422 and previous config saved to /var/cache/conftool/dbconfig/20210419-071701-marostegui.json
* 04:47 marostegui@cumin1001: dbctl commit (dc=all): 'db1135 (re)pooling @ 25%: Repool db1135 after dropping an index', diff saved to https://phabricator.wikimedia.org/P16331 and previous config saved to /var/cache/conftool/dbconfig/20210609-044703-root.json
* 07:07 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 15%: Slowly pool db1179 for the first time in s3 [[phab:T275633|T275633]]', diff saved to https://phabricator.wikimedia.org/P15421 and previous config saved to /var/cache/conftool/dbconfig/20210419-070718-root.json
* 04:44 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1135 to remove rev_page_id index [[phab:T163532|T163532]]', diff saved to https://phabricator.wikimedia.org/P16330 and previous config saved to /var/cache/conftool/dbconfig/20210609-044428-marostegui.json
* 07:04 marostegui@cumin1001: dbctl commit (dc=all): 'db1121 (re)pooling @ 10%: Repool db1121', diff saved to https://phabricator.wikimedia.org/P15420 and previous config saved to /var/cache/conftool/dbconfig/20210419-070439-root.json
* 04:27 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 07:04 marostegui@cumin1001: dbctl commit (dc=all): 'db1112 (re)pooling @ 50%: Repool db1112', diff saved to https://phabricator.wikimedia.org/P15419 and previous config saved to /var/cache/conftool/dbconfig/20210419-070418-root.json
* 03:30 eileen: civicrm revision changed from {{Gerrit|eac772e9c9}} to {{Gerrit|31d07115a0}}, config revision is {{Gerrit|931a941a5e}}
* 07:00 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1121 [[phab:T272008|T272008]]', diff saved to https://phabricator.wikimedia.org/P15418 and previous config saved to /var/cache/conftool/dbconfig/20210419-070035-marostegui.json
* 03:01 Amir1: mwscript extensions/Cognate/maintenance/populateCognateSites.php --wiki=aawiktionary --site-group wiktionary  ([[phab:T284444|T284444]])
* 06:56 marostegui@cumin1001: dbctl commit (dc=all): 'db1074 (re)pooling @ 100%: Repool db1074', diff saved to https://phabricator.wikimedia.org/P15417 and previous config saved to /var/cache/conftool/dbconfig/20210419-065627-root.json
* 02:58 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 06:52 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 10%: Slowly pool db1179 for the first time in s3 [[phab:T275633|T275633]]', diff saved to https://phabricator.wikimedia.org/P15416 and previous config saved to /var/cache/conftool/dbconfig/20210419-065213-root.json
* 02:56 Amir1: clean up of the rest of mbox files (except arbcom) ([[phab:T282303|T282303]])
* 06:49 marostegui@cumin1001: dbctl commit (dc=all): 'db1112 (re)pooling @ 25%: Repool db1112', diff saved to https://phabricator.wikimedia.org/P15415 and previous config saved to /var/cache/conftool/dbconfig/20210419-064914-root.json
* 02:55 ryankemper@cumin1001: END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97)
* 06:46 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1112 [[phab:T272008|T272008]]', diff saved to https://phabricator.wikimedia.org/P15414 and previous config saved to /var/cache/conftool/dbconfig/20210419-064600-marostegui.json
* 02:49 ryankemper: [[phab:T280382|T280382]] `sudo -i cookbook sre.wdqs.data-transfer --source wdqs1010.eqiad.wmnet --dest wdqs1009.eqiad.wmnet --reason "xfer categories following reimage" --blazegraph_instance categories --without-lvs` on `ryankemper@cumin1001` tmux session `wdqs_1009`
* 06:41 marostegui@cumin1001: dbctl commit (dc=all): 'db1074 (re)pooling @ 75%: Repool db1074', diff saved to https://phabricator.wikimedia.org/P15413 and previous config saved to /var/cache/conftool/dbconfig/20210419-064123-root.json
* 02:49 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 06:26 marostegui@cumin1001: dbctl commit (dc=all): 'db1074 (re)pooling @ 50%: Repool db1074', diff saved to https://phabricator.wikimedia.org/P15412 and previous config saved to /var/cache/conftool/dbconfig/20210419-062620-root.json
* 02:39 ryankemper: [[phab:T280382|T280382]] Re-enabled puppet on `wdqs1010`
* 06:17 _joe_: upgrading envoy everywhere in eqiad [[phab:T280317|T280317]]
* 01:20 ryankemper@cumin1001: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99)
* 06:11 marostegui@cumin1001: dbctl commit (dc=all): 'db1074 (re)pooling @ 25%: Repool db1074', diff saved to https://phabricator.wikimedia.org/P15411 and previous config saved to /var/cache/conftool/dbconfig/20210419-061116-root.json
* 00:37 catrope@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:698654{{!}}Enable Wikisource OCR on select Wikisources (T283898)]] (duration: 01m 31s)
* 06:10 _joe_: upgrading envoy everywhere in codfw [[phab:T280317|T280317]]
* 00:00 ryankemper: [[phab:T280382|T280382]] `sudo -i cookbook sre.wdqs.data-transfer --source wdqs1010.eqiad.wmnet --dest wdqs1009.eqiad.wmnet --reason "transferring skolemized wikidata.jnl so we can reimage wdqs1009" --blazegraph_instance blazegraph --without-lvs` on `ryankemper@cumin1001` tmux session `wdqs_1009`
* 06:03 marostegui@cumin1001: dbctl commit (dc=all): 'Pool db1179 in s3 for the first time with minimal weight [[phab:T275633|T275633]]', diff saved to https://phabricator.wikimedia.org/P15410 and previous config saved to /var/cache/conftool/dbconfig/20210419-060321-marostegui.json
* 00:00 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 06:01 _joe_: rolling out further envoy upgrades [[phab:T280317|T280317]]
* 05:56 marostegui@cumin1001: dbctl commit (dc=all): 'db1074 (re)pooling @ 10%: Repool db1074', diff saved to https://phabricator.wikimedia.org/P15409 and previous config saved to /var/cache/conftool/dbconfig/20210419-055613-root.json
* 05:53 marostegui: Stop sanitarium master on s2 (lag will show up on clouddb* labsdb* hosts) [[phab:T272008|T272008]]
* 05:52 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1074 [[phab:T272008|T272008]]', diff saved to https://phabricator.wikimedia.org/P15408 and previous config saved to /var/cache/conftool/dbconfig/20210419-055240-marostegui.json
* 05:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1106', diff saved to https://phabricator.wikimedia.org/P15407 and previous config saved to /var/cache/conftool/dbconfig/20210419-054831-marostegui.json
* 05:42 marostegui: Stop sanitarium master on s1 (lag will show up on clouddb* labsdb* hosts) [[phab:T272008|T272008]]
* 05:42 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1106 [[phab:T272008|T272008]]', diff saved to https://phabricator.wikimedia.org/P15406 and previous config saved to /var/cache/conftool/dbconfig/20210419-054158-marostegui.json
* 05:37 marostegui@cumin1001: dbctl commit (dc=all): 'Pool db1179 in s3 for the first time with minimal weight [[phab:T275633|T275633]]', diff saved to https://phabricator.wikimedia.org/P15405 and previous config saved to /var/cache/conftool/dbconfig/20210419-053730-marostegui.json
* 05:31 marostegui@cumin1001: dbctl commit (dc=all): 'Pool db1179 in s3 for the first time with minimal weight [[phab:T275633|T275633]]', diff saved to https://phabricator.wikimedia.org/P15404 and previous config saved to /var/cache/conftool/dbconfig/20210419-053127-marostegui.json
* 05:30 marostegui@cumin1001: dbctl commit (dc=all): 'Add db1179 to dbctl [[phab:T275633|T275633]]', diff saved to https://phabricator.wikimedia.org/P15403 and previous config saved to /var/cache/conftool/dbconfig/20210419-053050-marostegui.json
* 05:05 marostegui: Restart m2 database master [[phab:T280251|T280251]]


== 2021-04-18 ==
== 2021-06-08 ==
* 06:40 Amir1: cleaning watchlist of User:Mr._Ibrahem in wikidatawiki (in main ns only)
* 22:36 krinkle@deploy1002: Finished deploy [integration/docroot@d4c9e08]: (no justification provided) (duration: 00m 08s)
* 22:36 krinkle@deploy1002: Started deploy [integration/docroot@d4c9e08]: (no justification provided)
* 22:21 ryankemper: [[phab:T284479|T284479]] Block put back in place. We're back to expected traffic levels. We'll need a more granular mitigation in place before we can lift this block going forward.
* 22:15 ryankemper: [[phab:T284479|T284479]] Successful puppet run on `cp3052`, proceeding to rest of `A:cp-text`: `sudo cumin -b 19 'A:cp-text' 'run-puppet-agent -q'`
* 22:14 ryankemper: [[phab:T284479|T284479]] Merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/698850, running puppet on `cp3052.esams.wmnet`
* 22:10 ryankemper: [[phab:T284479|T284479]] Yup more than enough evidence of a strong upward spike now. Proceeding to revert
* 22:10 ryankemper: [[phab:T284479|T284479]] Already starting to see a large upward spike in requests. Doing a quick sanity check to make sure this is out of the ordinary but I'll likely be putting the block back in place shortly
* 22:09 ryankemper: [[phab:T284479|T284479]] Puppet run complete across all of `cp-text`. Monitoring https://grafana.wikimedia.org/d/000000455/elasticsearch-percentiles?viewPanel=47&orgId=1&from=now-1h&to=now over the next few minutes to see if we see a large spike in `full_text` and `entity_full_text` queries
* 22:03 ryankemper: [[phab:T284479|T284479]] Successful puppet run on `cp3052`, proceeding to rest of `A:cp-text`: `sudo cumin -b 15 'A:cp-text' 'run-puppet-agent -q'`
* 22:01 ryankemper: [[phab:T284479|T284479]] Merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/698849, running puppet on `cp3052.esams.wmnet`
* 21:59 ryankemper: [[phab:T284479|T284479]] Prior context: We put a block on a range of Google App Engine IPs yesterday to protect Cirrussearch from a bad actor; now we're going to try lifting the block and seeing if we're still getting slammed with traffic
* 21:44 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1009.eqiad.wmnet with reason: REIMAGE
* 21:42 ryankemper@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1009.eqiad.wmnet with reason: REIMAGE
* 21:29 ryankemper: [[phab:T280382|T280382]] `sudo -i wmf-auto-reimage-host -p [[phab:T280382|T280382]] wdqs1009.eqiad.wmnet` on `ryankemper@cumin1001` tmux session `wdqs_1009`
* 21:27 ryankemper: [[phab:T280382|T280382]] Disabled puppet on `wdqs1010` out of abundance of caution; will re-enable after wdqs1009 is reimaged and xfer back is complete
* 21:12 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 20:38 bblack: authdns1001: update gdnsd to 3.7.0-2~wmf1
* 20:18 bblack: authdns2001: update gdnsd to 3.7.0-2~wmf1
* 19:55 bblack: dns[1235]002: update gdnsd to 3.7.0-2~wmf1
* 19:53 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.37.0-wmf.9  refs [[phab:T281150|T281150]]
* 19:46 bblack: dns[1235]001: update gdnsd to 3.7.0-2~wmf1
* 19:43 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 19:36 ryankemper@cumin1001: END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97)
* 19:36 ryankemper: [[phab:T280382|T280382]] Cancelling the data-transfer run to restart it; realized that the cookbook will start up the `wdqs-updater` again so will locally hack the cookbook on `cumin1001` to prevent that
* 19:32 ladsgroup@deploy1002: Synchronized php-1.37.0-wmf.9/extensions/Echo/modules/nojs/mw.echo.alert.monobook.less: Backport: [[gerrit:698848{{!}}Fix MonoBook orange banner hover styles (T284496)]] (duration: 01m 08s)
* 19:26 bblack: dns400[12]: update gdnsd to 3.7.0-3~wmf1
* 19:25 bblack: apt: update gdnsd package to gdnsd-3.7.0-2~wmf1 (fix systemd reload issues)
* 19:20 ryankemper: [[phab:T280382|T280382]] `sudo -i cookbook sre.wdqs.data-transfer --source wdqs1009.eqiad.wmnet --dest wdqs1010.eqiad.wmnet --reason "transferring skolemized wikidata.jnl so we can reimage wdqs1009" --blazegraph_instance blazegraph --without-lvs` on `ryankemper@cumin1001` tmux session `wdqs_1009`
* 19:20 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 19:19 ryankemper@cumin1001: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99)
* 19:19 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 19:18 ryankemper: [[phab:T280382|T280382]] `sudo systemctl stop wdqs-updater wdqs-blazegraph` on `wdqs1010` in preparation for transfer
* 19:08 ryankemper: [WDQS] `ryankemper@wdqs1005:~$ sudo pool` (all caught up on lag)
* 18:47 bblack: dns4001: update gdnsd to 3.7.0-1~wmf1
* 18:43 bblack: apt: update gdnsd package to gdnsd-3.7.0-1~wmf1
* 17:49 jgiannelos@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'proton' for release 'production' .
* 17:36 jgiannelos@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'proton' for release 'production' .
* 17:25 jgiannelos@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'proton' for release 'production' .
* 17:10 elukey: fix dbstore1007's ip address in analytics-in4 on cr<nowiki>{</nowiki>1,2<nowiki>}</nowiki>-eqiad
* 17:06 jhuneidi@deploy1002: Finished scap: testwikis wikis to 1.37.0-wmf.9  refs [[phab:T281150|T281150]] (duration: 34m 12s)
* 16:32 jhuneidi@deploy1002: Started scap: testwikis wikis to 1.37.0-wmf.9  refs [[phab:T281150|T281150]]
* 16:27 papaul: powerdown  moss-fe2002  for relocation
* 16:06 papaul: powerdown  ms-backup2002  for relocation
* 16:02 oblivian@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:40 papaul: powerdown ms-be2061 for relocation
* 15:40 bblack@cumin1001: conftool action : set/pooled=yes; selector: name=cp203[34].codfw.wmnet
* 15:33 papaul: powerdown thanos-fe2003 for relocation
* 15:23 Krinkle: mwmaint1002: Running purge-parsercache-now.php on server 4/4 (pc1009) ref P16060, [[phab:T280605|T280605]], [[phab:T282761|T282761]].
* 15:19 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on pc2009.codfw.wmnet,pc1009.eqiad.wmnet with reason: Purging parsercache pc3 [[phab:T282761|T282761]]
* 15:19 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 5 days, 0:00:00 on pc2009.codfw.wmnet,pc1009.eqiad.wmnet with reason: Purging parsercache pc3 [[phab:T282761|T282761]]
* 15:13 papaul: powerdown cp2034 for relocation
* 15:04 papaul: powerdown cp2033 for relocation
* 14:59 bblack@cumin1001: conftool action : set/pooled=no; selector: name=cp203[34].codfw.wmnet
* 14:43 moritzm: cleanup now unused nginx mods and former deps (various X11 libs and libxslt) on testreduce1001/scandium after switch towards nginx-light  [[phab:T164456|T164456]]
* 14:08 marostegui: Restart sanitarium hosts (db2094, db2095, db1154, db1155) to pick up new filters [[phab:T284106|T284106]]
* 14:05 kormat@deploy1002: Synchronized wmf-config/db-eqiad.php: Set pc1010 as pc3 master [[phab:T282761|T282761]] (duration: 00m 57s)
* 14:05 kormat: setting pc1010 as pc3 primary [[phab:T282761|T282761]]
* 13:51 jbond@deploy1002: Finished deploy [netbox/deploy@c70df91]: Force deploy of gerrit/672831 to netbox-next  (duration: 00m 42s)
* 13:51 jbond@deploy1002: Started deploy [netbox/deploy@c70df91]: Force deploy of gerrit/672831 to netbox-next
* 13:48 otto@cumin1001: END (PASS) - Cookbook sre.zookeeper.roll-restart-zookeeper (exit_code=0)
* 13:41 otto@cumin1001: START - Cookbook sre.zookeeper.roll-restart-zookeeper
* 13:40 jbond@deploy1002: Finished deploy [netbox/deploy@c70df91]: Force deploy of gerrit/672831 to netbox-next  (duration: 00m 47s)
* 13:39 jbond@deploy1002: Started deploy [netbox/deploy@c70df91]: Force deploy of gerrit/672831 to netbox-next
* 13:36 jbond@deploy1002: Finished deploy [netbox/deploy@c70df91]: Force deploy of gerrit/672831 to netbox-next  (duration: 01m 03s)
* 13:35 jbond@deploy1002: Started deploy [netbox/deploy@c70df91]: Force deploy of gerrit/672831 to netbox-next
* 13:33 otto@cumin1001: END (PASS) - Cookbook sre.presto.roll-restart-workers (exit_code=0) for Presto analytics cluster: Roll restart of all Presto's jvm daemons. - otto@cumin1001
* 13:22 otto@cumin1001: START - Cookbook sre.presto.roll-restart-workers for Presto analytics cluster: Roll restart of all Presto's jvm daemons. - otto@cumin1001
* 12:15 kormat@deploy1002: Synchronized wmf-config/db-eqiad.php: Repool pc1008 as pc2 master [[phab:T282761|T282761]] (duration: 00m 57s)
* 12:14 kormat: setting pc1008 back as pc2 primary [[phab:T282761|T282761]]
* 11:54 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|ef49422b162ab0161bc39da857b3230175ac4492}}: enwiki: Disable indexing on the Book namespace ([[phab:T283522|T283522]]) (duration: 00m 56s)
* 11:46 urbanecm: Start server-side upload for 1 file ([[phab:T283470|T283470]])
* 11:45 moritzm: installing nginx security updates on buster
* 11:43 urbanecm: Start server-side upload for 2 files ([[phab:T283645|T283645]], [[phab:T283583|T283583]])
* 11:39 urbanecm: EU B&C deployment done
* 11:38 kormat@cumin1001: dbctl commit (dc=all): 'db1157 (re)pooling @ 100%: reimaged to buster [[phab:T283131|T283131]]', diff saved to https://phabricator.wikimedia.org/P16329 and previous config saved to /var/cache/conftool/dbconfig/20210608-113857-kormat.json
* 11:38 moritzm: installing ruby-nokogiri security updates
* 11:37 urbanecm@deploy1002: Synchronized php-1.37.0-wmf.7/extensions/WikimediaEvents/: {{Gerrit|b0b46530b731d2a5f17b0aa04a4cf99df175e23d}}: universalLanguageSelector: Add missing properties ([[phab:T280770|T280770]]) (duration: 00m 56s)
* 11:32 urbanecm@deploy1002: Synchronized php-1.37.0-wmf.7/extensions/UniversalLanguageSelector/resources/js/ext.uls.launch.js: {{Gerrit|5df13eeae3b52b98eaf3fdb99ddfa5a0f7b2b1e4}}: Pass context to compact_language_links.open hook ([[phab:T280770|T280770]]) (duration: 00m 57s)
* 11:23 kormat@cumin1001: dbctl commit (dc=all): 'db1157 (re)pooling @ 75%: reimaged to buster [[phab:T283131|T283131]]', diff saved to https://phabricator.wikimedia.org/P16328 and previous config saved to /var/cache/conftool/dbconfig/20210608-112354-kormat.json
* 11:15 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|73dc708efc25caa667be516c685885db3983be73}}: lvwiki: Enable Growth features in dark mode ([[phab:T278191|T278191]]; 3/3) (duration: 00m 58s)
* 11:13 urbanecm@deploy1002: Synchronized wmf-config/config/lvwiki.yaml: {{Gerrit|73dc708efc25caa667be516c685885db3983be73}}: lvwiki: Enable Growth features in dark mode ([[phab:T278191|T278191]]; 2/3) (duration: 00m 56s)
* 11:12 urbanecm@deploy1002: Synchronized dblists/growthexperiments.dblist: {{Gerrit|73dc708efc25caa667be516c685885db3983be73}}: lvwiki: Enable Growth features in dark mode ([[phab:T278191|T278191]]; 1/3) (duration: 00m 57s)
* 11:10 urbanecm: mwscript extensions/WikimediaMaintenance/createExtensionTables.php --wiki=lvwiki growthexperiments # [[phab:T278191|T278191]]
* 11:08 kormat@cumin1001: dbctl commit (dc=all): 'db1157 (re)pooling @ 50%: reimaged to buster [[phab:T283131|T283131]]', diff saved to https://phabricator.wikimedia.org/P16327 and previous config saved to /var/cache/conftool/dbconfig/20210608-110850-kormat.json
* 11:05 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|abd401074247d1f1dd2722c2d4d06747b066d547}}: enwiki: Deploy Growth freatures to 2% of new accounts ([[phab:T281896|T281896]]) (duration: 00m 57s)
* 11:01 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on pc2008.codfw.wmnet,pc1008.eqiad.wmnet with reason: Rebooting pc1008
* 11:01 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on pc2008.codfw.wmnet,pc1008.eqiad.wmnet with reason: Rebooting pc1008
* 10:53 kormat@cumin1001: dbctl commit (dc=all): 'db1157 (re)pooling @ 25%: reimaged to buster [[phab:T283131|T283131]]', diff saved to https://phabricator.wikimedia.org/P16326 and previous config saved to /var/cache/conftool/dbconfig/20210608-105346-kormat.json
* 10:50 jbond@deploy1002: Finished deploy [netbox/deploy@c70df91]: Force deploy of gerrit/672831 to netbox-next (4) (duration: 00m 53s)
* 10:49 jbond@deploy1002: Started deploy [netbox/deploy@c70df91]: Force deploy of gerrit/672831 to netbox-next (4)
* 10:16 liw: testing upcoming Scap release on beta
* 10:01 XioNoX: upgrade Routinator 3000 to 0.9.0 on rpki2001 - [[phab:T282469|T282469]]
* 09:58 jbond@deploy1002: Finished deploy [netbox/deploy@c70df91]: Force deploy of gerrit/672831 to netbox-next (4) (duration: 00m 54s)
* 09:57 jbond@deploy1002: Started deploy [netbox/deploy@c70df91]: Force deploy of gerrit/672831 to netbox-next (4)
* 09:52 oblivian@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 09:04 jayme: removing docker-images from registry: releng/ci-jessie, releng/ci-src-setup, releng/composer-php56, releng/composer-test-php56, releng/npm, releng/npm-test, releng/npm-test-3d2png, releng/npm-test-graphoid, releng/npm-test-librdkafka, releng/npm-test-maps-service, releng/php56, releng/quibble-jessie, releng/quibble-jessie-hhvm, releng/quibble-jessie-php56 - [[phab:T251918|T251918]]
* 08:31 dcausse: depooling wdqs1006 (lag)
* 08:29 dcausse: restarting blazegraph on wdqs1006
* 08:19 elukey@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:13 oblivian@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:13 elukey@cumin1001: START - Cookbook sre.dns.netbox
* 07:49 jmm@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cumin2002.codfw.wmnet
* 07:41 jmm@cumin1001: START - Cookbook sre.hosts.reboot-single for host cumin2002.codfw.wmnet
* 07:40 oblivian@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 07:37 oblivian@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 07:35 oblivian@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 07:29 marostegui@cumin1001: dbctl commit (dc=all): 'db1161 (re)pooling @ 100%: Repool after upgrade', diff saved to https://phabricator.wikimedia.org/P16324 and previous config saved to /var/cache/conftool/dbconfig/20210608-072937-root.json
* 07:14 marostegui@cumin1001: dbctl commit (dc=all): 'db1161 (re)pooling @ 75%: Repool after upgrade', diff saved to https://phabricator.wikimedia.org/P16323 and previous config saved to /var/cache/conftool/dbconfig/20210608-071433-root.json
* 06:59 marostegui@cumin1001: dbctl commit (dc=all): 'db1161 (re)pooling @ 50%: Repool after upgrade', diff saved to https://phabricator.wikimedia.org/P16322 and previous config saved to /var/cache/conftool/dbconfig/20210608-065930-root.json
* 06:52 tgr: [[phab:T283606|T283606]]: running mwscript extensions/GrowthExperiments/maintenance/fixLinkRecommendationData.php --wiki=<nowiki>{</nowiki>ar,bn,cs,vi<nowiki>}</nowiki>wiki --verbose --search-index with gerrit:696307 applied
* 06:44 marostegui@cumin1001: dbctl commit (dc=all): 'db1161 (re)pooling @ 25%: Repool after upgrade', diff saved to https://phabricator.wikimedia.org/P16321 and previous config saved to /var/cache/conftool/dbconfig/20210608-064426-root.json
* 06:40 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1161 for upgrade', diff saved to https://phabricator.wikimedia.org/P16320 and previous config saved to /var/cache/conftool/dbconfig/20210608-064055-marostegui.json
* 06:27 elukey: clean some airflow logs on an-airflow1001 as one off to free space (had a chat with the Search team first)
* 05:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2123.codfw.wmnet with reason: REIMAGE
* 05:44 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2123.codfw.wmnet with reason: REIMAGE
* 05:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2123.codfw.wmnet with reason: REIMAGE
* 05:15 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2123.codfw.wmnet with reason: REIMAGE
* 04:54 marostegui: Repool clouddb1019:3314
* 04:07 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 02:38 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 02:38 ryankemper: [[phab:T284445|T284445]] `sudo -i cookbook sre.wdqs.data-transfer --source wdqs1011.eqiad.wmnet --dest wdqs1012.eqiad.wmnet --reason "repairing overinflated blazegraph journal" --blazegraph_instance blazegraph` on `ryankemper@cumin1001` tmux session `wdqs`
* 02:37 ryankemper: [[phab:T284445|T284445]] after manually stopping blazegraph/wdqs-updater, `sudo rm -fv /srv/wdqs/wikidata.jnl` on `wdqs1012` (clearing old overinflated journal file away before xferring new one)
* 02:34 ryankemper: [WDQS] `ryankemper@wdqs1005:~$ sudo depool` (catching up on ~7h of lag)


== 2021-04-17 ==
== 2021-06-07 ==
* 16:16 Amir1: cleaning SuccuBot's watchlist in wikidatawiki
* 21:26 otto@cumin1001: END (PASS) - Cookbook sre.kafka.roll-restart-brokers (exit_code=0)
* 00:53 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw1307.eqiad.wmnet
* 21:12 sbassett: Deployed security patch for [[phab:T284364|T284364]]
* 00:48 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw1307.eqiad.wmnet
* 19:30 ryankemper: [[phab:T284479|T284479]] [Cirrussearch] We'll keep monitoring. For now this incident is resolved. Glancing at our current volume relative to what we'd expect, the numbers we see match what we'd expect. If we're accidentally banning any innocent requests they must be an incredibly small percentage of the total otherwise we'd see significantly lower volume than expected
* 00:23 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw1402.eqiad.wmnet
* 19:25 ryankemper: [[phab:T284479|T284479]] [Cirrussearch] Seeing the expected drop in `entity_full_text` requests here: https://grafana-rw.wikimedia.org/d/000000455/elasticsearch-percentiles?viewPanel=47&orgId=1&from=now-12h&to=now As a result we're no longer rejecting any requests
* 00:22 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw1403.eqiad.wmnet
* 19:21 ryankemper: [[phab:T284479|T284479]] [Cirrussearch] We're working on rolling out https://gerrit.wikimedia.org/r/698607, which will ban search API requests that match the Google App Engine IP range `2600:1900::0/28` AND whose user agent includes `HeadlessChrome`
* 00:18 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw1403.eqiad.wmnet
* 19:19 cdanis: [[phab:T284479|T284479]] ✔️ cdanis@cumin1001.eqiad.wmnet ~ 🕞🍵 sudo cumin -b16 'A:cp-text' "run-puppet-agent"
* 00:18 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw1402.eqiad.wmnet
* 19:07 andrew@deploy1002: Finished deploy [horizon/deploy@6199b67]: disable shelve/unshelve [[phab:T284462|T284462]] (duration: 04m 53s)
* 00:14 ryankemper: [[phab:T267927|T267927]] `sudo run-puppet-agent` and `sudo pool` on `wdqs2003`
* 19:02 andrew@deploy1002: Started deploy [horizon/deploy@6199b67]: disable shelve/unshelve [[phab:T284462|T284462]]
* 00:11 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1307.eqiad.wmnet with reason: REIMAGE
* 19:01 andrew@deploy1002: Finished deploy [horizon/deploy@6199b67]: disable shelve/unshelve (duration: 02m 01s)
* 00:09 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1307.eqiad.wmnet with reason: REIMAGE
* 18:59 andrew@deploy1002: Started deploy [horizon/deploy@6199b67]: disable shelve/unshelve
* 00:08 ryankemper: [[phab:T267927|T267927]] Reload of `wdqs2003` complete
* 18:57 herron: prometheus3001: moved /srv back to vda1 filesystem [[phab:T243057|T243057]]
* 00:07 ryankemper@cumin2001: END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97)
* 18:26 urbanecm: [urbanecm@mwmaint1002 /srv/mediawiki/php-1.37.0-wmf.7]$ mwscript extensions/GrowthExperiments/maintenance/initWikiConfig.php --wiki=skwiki --phab=[[phab:T284149|T284149]]
* 00:00 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1403.eqiad.wmnet with reason: REIMAGE
* 18:24 urbanecm@deploy1002: Synchronized php-1.37.0-wmf.7/extensions/GrowthExperiments/includes/WelcomeSurvey.php: {{Gerrit|368b5d9}}: {{Gerrit|0e79aee}}: WelcomeSurvey backports ([[phab:T284127|T284127]], [[phab:T284257|T284257]]; 2/2) (duration: 00m 57s)
* 18:22 urbanecm@deploy1002: Synchronized php-1.37.0-wmf.7/extensions/GrowthExperiments/extension.json: {{Gerrit|368b5d9}}: {{Gerrit|0e79aee}}: WelcomeSurvey backports ([[phab:T284127|T284127]], [[phab:T284257|T284257]]; 1/2) (duration: 00m 56s)
* 18:20 urbanecm@deploy1002: Synchronized php-1.37.0-wmf.7/extensions/GrowthExperiments/maintenance/initWikiConfig.php: {{Gerrit|7089728}}: {{Gerrit|b2482fb}}: initWikiConfig GE backports ([[phab:T284072|T284072]]) (duration: 00m 58s)
* 18:16 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|15e09109b7c45de967a496a0eb58ad267dbc5079}}: skwiki: Make Growth features available in dark mode ([[phab:T284149|T284149]]; 3/3) (duration: 00m 56s)
* 18:14 urbanecm@deploy1002: Synchronized dblists/growthexperiments.dblist: {{Gerrit|15e09109b7c45de967a496a0eb58ad267dbc5079}}: skwiki: Make Growth features available in dark mode ([[phab:T284149|T284149]]; 2/3) (duration: 00m 56s)
* 18:14 otto@cumin1001: START - Cookbook sre.kafka.roll-restart-brokers
* 18:14 ottomata: rolling restart of kafka jumbo brokers  - [[phab:T283067|T283067]]
* 18:13 urbanecm@deploy1002: Synchronized wmf-config/config/skwiki.yaml: {{Gerrit|15e09109b7c45de967a496a0eb58ad267dbc5079}}: skwiki: Make Growth features available in dark mode ([[phab:T284149|T284149]]; 1/3) (duration: 00m 59s)
* 18:12 otto@cumin1001: END (PASS) - Cookbook sre.kafka.roll-restart-mirror-maker (exit_code=0)
* 18:04 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/WikimediaMaintenance/createExtensionTables.php --wiki=skwiki growthexperiments # [[phab:T284149|T284149]]
* 18:04 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|5de2f8b27b016a2cd8f424d8e40318edde5e5704}}: Set WelcomeSurveyEnableWithHomepage ([[phab:T281896|T281896]], [[phab:T284257|T284257]]) (duration: 00m 59s)
* 17:53 otto@cumin1001: START - Cookbook sre.kafka.roll-restart-mirror-maker
* 17:53 ottomata: rolling restart of kafka jumbo mirror makers  - [[phab:T283067|T283067]]
* 17:17 ryankemper: [Cirrussearch] We're seeing ~10% of current requests being rejected by poolcounter, due to ~2x expected `eqiad.full_text` query volume and ~30x expected `eqiad.entity_full_text` query volume
* 16:56 ryankemper: [WDQS] `ryankemper@wdqs1005:~$ sudo systemctl restart wdqs-blazegraph` (blazegraph locked up)
* 16:51 razzi: run homer '*.eqiad.wmnet' diff
* 16:49 ottomata: restarting mysqld analytics-meta replica on db1108 to apply config change - [[phab:T272973|T272973]]
* 16:31 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@19313f7]: Bump glent jar to 0.2.6 (duration: 04m 29s)
* 16:27 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@19313f7]: Bump glent jar to 0.2.6
* 16:09 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@f236b95]: Bump glent jar to 0.2.6 (duration: 00m 35s)
* 16:09 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@f236b95]: Bump glent jar to 0.2.6
* 14:57 moritzm: installing remaining lz4 security updates on buster
* 14:35 moritzm: installing isc-dhcp security updates
* 14:17 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1113 (s5,s6) after upgrade', diff saved to https://phabricator.wikimedia.org/P16315 and previous config saved to /var/cache/conftool/dbconfig/20210607-141722-marostegui.json
* 14:13 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1113 (s5,s6) for upgrade', diff saved to https://phabricator.wikimedia.org/P16314 and previous config saved to /var/cache/conftool/dbconfig/20210607-141307-marostegui.json
* 13:35 volans@deploy1002: Finished deploy [netbox/deploy@c70df91]: Force deploy of gerrit/672831 to netbox-next (3) (duration: 00m 52s)
* 13:34 volans@deploy1002: Started deploy [netbox/deploy@c70df91]: Force deploy of gerrit/672831 to netbox-next (3)
* 13:34 moritzm: installing libxml2 security updates on stretch
* 13:32 volans@deploy1002: Finished deploy [netbox/deploy@c70df91]: Force deploy of gerrit/672831 to netbox-next (duration: 01m 14s)
* 13:31 volans@deploy1002: Started deploy [netbox/deploy@c70df91]: Force deploy of gerrit/672831 to netbox-next
* 13:28 volans@deploy1002: Finished deploy [netbox/deploy@c70df91]: Force deploy of gerrit/672831 to netbox-next (duration: 00m 54s)
* 13:27 volans@deploy1002: Started deploy [netbox/deploy@c70df91]: Force deploy of gerrit/672831 to netbox-next
* 12:41 moritzm: removing now obsolete Java 8 packages from gerrit* [[phab:T268225|T268225]]
* 12:36 moritzm: removing now obsolete Java 8 packages from contint* [[phab:T268225|T268225]]
* 12:35 marostegui@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:32 marostegui@cumin1001: START - Cookbook sre.dns.netbox
* 12:25 moritzm: installing nginx security updates on buster
* 12:22 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript namespaceDupes.php --wiki=wikimaniawiki --add-prefix=BROKEN --fix # [[phab:T284442|T284442]]
* 12:22 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript namespaceDupes.php --wiki=wikimaniawiki # [[phab:T284442|T284442]]
* 11:09 Lucas_WMDE: EU backport+config window done
* 11:08 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:697824{{!}}Add 2021 namespaces for wikimania wiki (T284235)]] (duration: 00m 56s)
* 10:48 volans: reset netbox-next DB with the latest prod dump
* 10:42 jdrewniak@deploy1002: Synchronized portals: Wikimedia Portals Update: [[gerrit:698472{{!}} Bumping portals to master (T128546)]] (duration: 00m 56s)
* 10:41 jdrewniak@deploy1002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:698472{{!}} Bumping portals to master (T128546)]] (duration: 00m 58s)
* 10:38 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-replica1004.wikimedia.org
* 10:38 godog: downgrade grafana to 7.4.2 on grafana2001 - [[phab:T282863|T282863]]
* 10:36 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1157.eqiad.wmnet with reason: REIMAGE
* 10:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-replica1004.wikimedia.org
* 10:34 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1157.eqiad.wmnet with reason: REIMAGE
* 10:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-replica1003.wikimedia.org
* 10:30 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-replica1003.wikimedia.org
* 10:28 kormat: reimaging db1157 [[phab:T283131|T283131]]
* 10:24 moritzm: remove now obsolete nginx mods and dependencies on htmldumper1001 [[phab:T164456|T164456]]
* 10:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-replica2006.wikimedia.org
* 10:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-replica2006.wikimedia.org
* 10:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-replica2005.wikimedia.org
* 10:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-replica2005.wikimedia.org
* 10:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host theemin.codfw.wmnet
* 10:08 kormat@cumin1001: dbctl commit (dc=all): 'db1157 depooling: reimage to buster [[phab:T283131|T283131]]', diff saved to https://phabricator.wikimedia.org/P16311 and previous config saved to /var/cache/conftool/dbconfig/20210607-100822-kormat.json
* 09:55 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host theemin.codfw.wmnet
* 09:52 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest1002.eqiad.wmnet
* 09:48 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest1002.eqiad.wmnet
* 09:43 moritzm: upgrading bullseye hosts to latest packages in testing
* 09:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host failoid1002.eqiad.wmnet
* 09:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host failoid1002.eqiad.wmnet
* 09:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host failoid2002.codfw.wmnet
* 09:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host failoid2002.codfw.wmnet
* 09:03 moritzm: installing imagemagick security updates on stretch
* 06:05 marostegui: Upgrade mysql on dbstore1003 [[phab:T283235|T283235]]
* 05:57 marostegui: Stop dbstore1004 to clone dbstore1007 [[phab:T283125|T283125]]
* 05:37 marostegui: Depool clouddb1020 (s5, s8) for upgrade
* 05:29 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2113.codfw.wmnet with reason: REIMAGE
* 05:27 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2113.codfw.wmnet with reason: REIMAGE
* 04:48 marostegui: Depool clouddb1019:3314 (long running alter table)


== 2021-04-16 ==
== 2021-06-05 ==
* 23:58 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mwdebug1003.eqiad.wmnet
* 16:16 Amir1: deleting all private archives of mm2. All are inaccessible now ([[phab:T282303|T282303]])
* 23:58 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw1402.eqiad.wmnet with reason: REIMAGE
* 15:21 Amir1: delete mbox files of group D and E in mm2 ([[phab:T282303|T282303]])
* 23:56 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1403.eqiad.wmnet with reason: REIMAGE
* 14:35 oblivian@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:56 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on mw1402.eqiad.wmnet with reason: REIMAGE
* 00:21 mutante: backup1001 - systemctl baclua-dir works again (restoring backup for non-existing host)
* 23:48 dzahn@cumin1001: conftool action : set/pooled=inactive; selector: name=mwdebug1003.eqiad.wmnet
* 00:18 mutante: backup1001 systemctl reload bacula-dir  fails
* 23:47 dzahn@cumin1001: START - Cookbook sre.hosts.decommission for hosts mwdebug1003.eqiad.wmnet
* 23:47 mutante: decom'ing mwdebug1003, stretch VM created in [[phab:T267248|T267248]]
* 23:39 mutante: reimaging last 3 remaining stretch appservers with buster, mw1307, mw1402, mw1403
* 23:37 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on mw[1402-1403].eqiad.wmnet with reason: reimage
* 23:36 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 3:00:00 on mw[1402-1403].eqiad.wmnet with reason: reimage
* 21:08 ejegg: updated fundraising python tools from {{Gerrit|ef54260b0d}} to {{Gerrit|3d950fffbd}}
* 20:40 Trey314159: reindexing wikidata on cloudelastic... AGAIN ([[phab:T274200|T274200]])
* 17:48 ryankemper: [[phab:T267927|T267927]] Transferring from `wdqs2008`->`wdqs2003` to resolve the data corruption on `wdqs2003`
* 17:47 ryankemper@cumin2001: START - Cookbook sre.wdqs.data-transfer
* 17:41 robh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcephosd1020.wikimedia.org with reason: REIMAGE
* 17:39 robh@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephosd1020.wikimedia.org with reason: REIMAGE
* 17:39 robh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcephosd1019.wikimedia.org with reason: REIMAGE
* 17:37 robh@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephosd1019.wikimedia.org with reason: REIMAGE
* 17:35 robh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcephosd1017.wikimedia.org with reason: REIMAGE
* 17:35 mutante: depooling mwdebug1003 (stretch VM, will be removed), mwdebug1001/1002 (buster) and unchanged
* 17:34 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mwdebug1003.eqiad.wmnet
* 17:33 robh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcephosd1016.wikimedia.org with reason: REIMAGE
* 17:33 robh@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephosd1017.wikimedia.org with reason: REIMAGE
* 17:31 robh@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephosd1016.wikimedia.org with reason: REIMAGE
* 17:03 ryankemper: [[phab:T267927|T267927]] Pooled `wdqs1007`, `wdqs2003`, `wdqs1008`, `wdqs2004`
* 17:00 ryankemper: [[phab:T267927|T267927]] Following data transfers complete: `wdqs1004`->`wdqs1007`, `wdqs2001`->`wdqs2003`, `wdqs1003`->`wdqs1008`, `wdqs2008`->`wdqs2004`
* 17:00 ryankemper@cumin2001: END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97)
* 17:00 ryankemper@cumin2001: END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97)
* 17:00 ryankemper@cumin2001: END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97)
* 16:59 ryankemper@cumin2001: END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97)
* 16:13 otto@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'production' .
* 16:13 otto@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'canary' .
* 16:09 otto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'production' .
* 15:57 otto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'production' .
* 15:43 urbanecm@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'linkrecommendation' for release 'external' .
* 15:43 urbanecm@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'linkrecommendation' for release 'internal' .
* 15:31 urbanecm@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'linkrecommendation' for release 'external' .
* 15:31 urbanecm@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'linkrecommendation' for release 'internal' .
* 15:22 urbanecm@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'linkrecommendation' for release 'staging' .
* 14:59 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase2023.codfw.wmnet
* 14:58 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on restbase-dev1006.eqiad.wmnet with reason: restarting for kernel update
* 14:58 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 0:10:00 on restbase-dev1006.eqiad.wmnet with reason: restarting for kernel update
* 14:52 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on restbase-dev[1005-1006].eqiad.wmnet with reason: restarting for kernel update
* 14:51 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 0:10:00 on restbase-dev[1005-1006].eqiad.wmnet with reason: restarting for kernel update
* 14:50 hnowlan@cumin1001: START - Cookbook sre.hosts.reboot-single for host restbase2023.codfw.wmnet
* 14:49 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase2022.codfw.wmnet
* 14:43 hnowlan@cumin1001: START - Cookbook sre.hosts.reboot-single for host restbase2022.codfw.wmnet
* 14:39 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase2021.codfw.wmnet
* 14:31 hnowlan@cumin1001: START - Cookbook sre.hosts.reboot-single for host restbase2021.codfw.wmnet
* 14:24 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase2020.codfw.wmnet
* 14:18 hnowlan@cumin1001: START - Cookbook sre.hosts.reboot-single for host restbase2020.codfw.wmnet
* 13:07 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase2019.codfw.wmnet
* 12:59 hnowlan@cumin1001: START - Cookbook sre.hosts.reboot-single for host restbase2019.codfw.wmnet
* 12:54 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase2018.codfw.wmnet
* 12:48 hnowlan@cumin1001: START - Cookbook sre.hosts.reboot-single for host restbase2018.codfw.wmnet
* 12:47 jayme@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'citoid' for release 'production' .
* 12:47 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase2017.codfw.wmnet
* 12:41 hnowlan@cumin1001: START - Cookbook sre.hosts.reboot-single for host restbase2017.codfw.wmnet
* 12:37 jayme@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'citoid' for release 'production' .
* 12:25 jayme@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'citoid' for release 'staging' .
* 12:22 jayme: updated envoyproxy to 1.15.4-1 on 'A:mw-canary or A:restbase-canary'
* 11:08 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase2016.codfw.wmnet
* 11:02 moritzm: imported ferm 2.5.1-1+wmf1 to bullseye-wikimedia/main [[phab:T275873|T275873]]
* 11:01 hnowlan@cumin1001: START - Cookbook sre.hosts.reboot-single for host restbase2016.codfw.wmnet
* 10:55 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase2015.codfw.wmnet
* 10:49 hnowlan@cumin1001: START - Cookbook sre.hosts.reboot-single for host restbase2015.codfw.wmnet
* 10:44 arturo: merging homer change to cr-eqiad ([[phab:T279342|T279342]])
* 10:39 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase2014.codfw.wmnet
* 10:33 hnowlan@cumin1001: START - Cookbook sre.hosts.reboot-single for host restbase2014.codfw.wmnet
* 10:28 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase2013.codfw.wmnet
* 10:20 hnowlan@cumin1001: START - Cookbook sre.hosts.reboot-single for host restbase2013.codfw.wmnet
* 10:15 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase2012.codfw.wmnet
* 10:08 hnowlan@cumin1001: START - Cookbook sre.hosts.reboot-single for host restbase2012.codfw.wmnet
* 10:08 jayme: updated envoyproxy to 1.15.4-1 on mw1325.eqiad.wmnet,restbase1026.eqiad.wmnet
* 10:05 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on thanos-fe2001.codfw.wmnet with reason: REIMAGE
* 10:04 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase2011.codfw.wmnet
* 10:03 filippo@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on thanos-fe2001.codfw.wmnet with reason: REIMAGE
* 10:00 jayme: updated envoyproxy to 1.15.4-1 on mwdebug1001.eqiad.wmnet
* 09:57 hnowlan@cumin1001: START - Cookbook sre.hosts.reboot-single for host restbase2011.codfw.wmnet
* 09:55 jayme: imported envoyproxy_1.15.4-1 to stretch-wikimedia - [[phab:T280317|T280317]]
* 09:40 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase2010.codfw.wmnet
* 09:34 marostegui@cumin1001: dbctl commit (dc=all): 'db1169 (re)pooling @ 100%: Repool db1169', diff saved to https://phabricator.wikimedia.org/P15384 and previous config saved to /var/cache/conftool/dbconfig/20210416-093446-root.json
* 09:33 hnowlan@cumin1001: START - Cookbook sre.hosts.reboot-single for host restbase2010.codfw.wmnet
* 09:27 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase2009.codfw.wmnet
* 09:21 hnowlan@cumin1001: START - Cookbook sre.hosts.reboot-single for host restbase2009.codfw.wmnet
* 09:19 marostegui@cumin1001: dbctl commit (dc=all): 'db1169 (re)pooling @ 75%: Repool db1169', diff saved to https://phabricator.wikimedia.org/P15383 and previous config saved to /var/cache/conftool/dbconfig/20210416-091942-root.json
* 09:13 jayme: imported envoyproxy_1.15.4-1 to buster-wikimedia - [[phab:T280317|T280317]]
* 09:12 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single for host sretest1002.eqiad.wmnet
* 09:04 marostegui@cumin1001: dbctl commit (dc=all): 'db1169 (re)pooling @ 50%: Repool db1169', diff saved to https://phabricator.wikimedia.org/P15380 and previous config saved to /var/cache/conftool/dbconfig/20210416-090438-root.json
* 08:49 marostegui@cumin1001: dbctl commit (dc=all): 'db1169 (re)pooling @ 25%: Repool db1169', diff saved to https://phabricator.wikimedia.org/P15374 and previous config saved to /var/cache/conftool/dbconfig/20210416-084935-root.json
* 08:34 marostegui@cumin1001: dbctl commit (dc=all): 'db1169 (re)pooling @ 10%: Repool db1169', diff saved to https://phabricator.wikimedia.org/P15373 and previous config saved to /var/cache/conftool/dbconfig/20210416-083431-root.json
* 07:53 elukey: run reprepro --delete clearvanished on apt1001 to clear all cloudera packages
* 07:41 ema: cp-upload_ulsfo: rolling varnish-frontend-restart to apply exp policy settings changes starting from empty caches [[phab:T275809|T275809]]
* 07:19 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1169', diff saved to https://phabricator.wikimedia.org/P15372 and previous config saved to /var/cache/conftool/dbconfig/20210416-071936-marostegui.json
* 06:58 ariel@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase1030.eqiad.wmnet
* 06:52 ariel@cumin1001: START - Cookbook sre.hosts.reboot-single for host restbase1030.eqiad.wmnet
* 06:48 ariel@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase1029.eqiad.wmnet
* 06:39 ariel@cumin1001: START - Cookbook sre.hosts.reboot-single for host restbase1029.eqiad.wmnet
* 06:27 ariel@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase1028.eqiad.wmnet
* 06:22 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2095.codfw.wmnet with reason: REIMAGE
* 06:20 ariel@cumin1001: START - Cookbook sre.hosts.reboot-single for host restbase1028.eqiad.wmnet
* 06:19 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2095.codfw.wmnet with reason: REIMAGE
* 05:54 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts analytics-tool1001.eqiad.wmnet
* 05:50 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2094.codfw.wmnet with reason: REIMAGE
* 05:48 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2094.codfw.wmnet with reason: REIMAGE
* 05:42 elukey@cumin1001: START - Cookbook sre.hosts.decommission for hosts analytics-tool1001.eqiad.wmnet
* 03:31 ryankemper: [wdqs] `ryankemper@wdqs1013:~$ sudo systemctl restart wdqs-blazegraph`
* 03:26 ryankemper: [[phab:T267927|T267927]] Pooled `wdqs2001`
* 03:22 ryankemper: [[phab:T267927|T267927]] Pooled `wdqs1006` and `wdqs2002`
* 03:09 ryankemper: [[phab:T267927|T267927]] kicked off next round of `data-transfer`s: `wdqs1004`->`wdqs1007`, `wdqs2001`->`wdqs2003`, `wdqs1003`->`wdqs1008`, `wdqs2008`->`wdqs2004`
* 03:09 ryankemper@cumin2001: START - Cookbook sre.wdqs.data-transfer
* 03:09 ryankemper@cumin2001: START - Cookbook sre.wdqs.data-transfer
* 03:09 ryankemper@cumin2001: START - Cookbook sre.wdqs.data-transfer
* 03:09 ryankemper@cumin2001: START - Cookbook sre.wdqs.data-transfer
* 03:05 ryankemper: [[phab:T267927|T267927]] Last round of `data-transfer`s finished successfully, proceeding to next round
* 03:04 ryankemper@cumin2001: END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97)
* 03:04 ryankemper@cumin2001: END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97)
* 03:04 ryankemper@cumin2001: END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97)
* 00:30 Krinkle: Delete old data at doc1001:/srv/doc/cover/PasswordBlacklist (ref [[phab:T254799|T254799]])
* 00:09 jforrester@deploy1002: Finished deploy [integration/docroot@63b6fb6]: Sync with CI updates (no-op) (duration: 00m 08s)
* 00:09 jforrester@deploy1002: Started deploy [integration/docroot@63b6fb6]: Sync with CI updates (no-op)


== 2021-04-15 ==
== 2021-06-04 ==
* 23:37 jforrester@deploy1002: Synchronized php-1.37.0-wmf.1/skins/Vector/skin.json: Backport: [[gerrit:679842{{!}}Adjust floating override (T280260)]] (duration: 00m 56s)
* 22:08 cwhite@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host doh4001.wikimedia.org
* 23:35 jforrester@deploy1002: Synchronized php-1.37.0-wmf.1/skins/Vector/resources/skins.vector.styles.legacy/layouts/screen.less: Backport: [[gerrit:679842{{!}}Adjust floating override (T280260)]] (duration: 00m 56s)
* 21:51 cwhite@cumin1001: START - Cookbook sre.ganeti.makevm for new host doh4001.wikimedia.org
* 23:31 jforrester@deploy1002: Synchronized php-1.37.0-wmf.1/extensions/WikimediaEvents/modules/ext.wikimediaEvents/searchSatisfaction.js: Backport: [[gerrit:679845{{!}}searchSatisfaction: Default userEditBucket back to 0 edits (T280294)]] (duration: 00m 57s)
* 20:59 bblack: repool cp1087 - [[phab:T278729|T278729]]
* 23:17 jforrester@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:679947{{!}}Create Draft namespace on itwiki (T280289)]] (duration: 00m 56s)
* 20:11 bblack@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp1087.eqiad.wmnet with reason: REIMAGE
* 23:09 jforrester@deploy1002: Synchronized wmf-config/logos.php: Config: [[gerrit:678342{{!}}[wikitech] Update logo to mirror the new MediaWiki logo (T279087)]] (duration: 00m 56s)
* 20:09 bblack@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cp1087.eqiad.wmnet with reason: REIMAGE
* 23:08 jforrester@deploy1002: Synchronized static/images/project-logos/wikitech-2x.png: Config: [[gerrit:678342{{!}}[wikitech] Update logo to mirror the new MediaWiki logo (T279087)]] (duration: 00m 56s)
* 19:06 bblack: depool cp1087 - [[phab:T278729|T278729]]
* 23:07 jforrester@deploy1002: Synchronized static/images/project-logos/wikitech-1.5x.png: Config: [[gerrit:678342{{!}}[wikitech] Update logo to mirror the new MediaWiki logo (T279087)]] (duration: 00m 57s)
* 18:21 ayounsi@cumin1001: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 23:06 jforrester@deploy1002: Synchronized static/images/project-logos/wikitech.png: Config: [[gerrit:678342{{!}}[wikitech] Update logo to mirror the new MediaWiki logo (T279087)]] (duration: 00m 57s)
* 17:36 razzi@cumin1001: END (PASS) - Cookbook sre.aqs.roll-restart (exit_code=0)
* 22:56 ryankemper: [[phab:T267927|T267927]] WDQS kicked off next round of `data-transfer`s: `wdqs1004`->`wdqs1006`, `wdqs2001`->`wdqs2002`, `wdqs2008`->`wdqs1003`
* 17:33 razzi@cumin1001: START - Cookbook sre.aqs.roll-restart
* 22:56 ryankemper@cumin2001: START - Cookbook sre.wdqs.data-transfer
* 17:33 razzi@cumin1001: END (FAIL) - Cookbook sre.aqs.roll-restart (exit_code=99)
* 22:56 ryankemper@cumin2001: START - Cookbook sre.wdqs.data-transfer
* 17:33 razzi@cumin1001: START - Cookbook sre.aqs.roll-restart
* 22:55 ryankemper@cumin2001: START - Cookbook sre.wdqs.data-transfer
* 17:28 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1040.eqiad.wmnet with reason: REIMAGE
* 22:48 ryankemper: [[phab:T267927|T267927]] pooled `wdqs1005` (all caught up on lag)
* 17:25 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1040.eqiad.wmnet with reason: REIMAGE
* 22:46 ryankemper: [[phab:T280108|T280108]] [[phab:T267927|T267927]] Manually re-enabled and ran puppet on `wdqs1005` (had closed the tmux pane which terminated the cookbook without letting it do its final cleanup)
* 15:25 topranks: Adding 1:1 NAT configuration for fran2001 / analytics.codfw.wikimedia.org to pfw3-codfw (backup site)
* 22:33 ryankemper: [[phab:T280108|T280108]] [[phab:T267927|T267927]] Data transfers completed successfully; small issue with new `wait_for_updater` logic is preventing termination so I ctrl+c'd manually
* 14:47 krinkle@deploy1002: Synchronized wmf-config/CommonSettings.php: {{Gerrit|I434d9cfa29d84f}} (duration: 00m 56s)
* 22:32 ryankemper@cumin2001: END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97)
* 14:46 krinkle@deploy1002: Synchronized php-1.37.0-wmf.7/extensions/DiscussionTools/extension.json: {{Gerrit|Iea41ab8599ffae}} (duration: 00m 56s)
* 20:03 herron: migrating kafka-logging broker logstash1012 to kafka-logging1003 [[phab:T279342|T279342]]
* 14:44 krinkle@deploy1002: Synchronized php-1.37.0-wmf.7/extensions/DiscussionTools/includes/: {{Gerrit|Iea41ab8599ffae}} (duration: 00m 59s)
* 19:56 Trey314159: reindexing wikidata on cloudelastic finished/failed ([[phab:T274200|T274200]])
* 14:41 krinkle@deploy1002: Scap failed!: 9/9 canaries failed their endpoint checks(https://en.wikipedia.org)
* 19:43 Trey314159: reindexing wikidata on cloudelastic ([[phab:T274200|T274200]])
* 13:39 Krinkle: mwmaint1002: Running purge_parsercache_now.php on pc1008, server 3/4, ref [[phab:T282761|T282761]]
* 19:42 Trey314159: reindexing commons and wikidata on elastic@eqiad ([[phab:T274200|T274200]])
* 13:33 oblivian@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:28 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:46 marostegui: Upgrade mysql on clouddb1016 [[phab:T283235|T283235]]
* 19:24 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 12:27 marostegui: Upgrade mysql on clouddb1015 [[phab:T283235|T283235]]
* 19:14 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.37.0-wmf.1  refs [[phab:T278345|T278345]]
* 11:20 jbond: upload debmonitor-client_0.3.0-1+deb10u3_all.deb to apt
* 18:49 andrew@deploy1002: Finished deploy [horizon/deploy@ec37c43]: test deploy of trove dashboard to codfw1dev (duration: 01m 58s)
* 10:59 topranks: Running homer for Gerrit 698162: Set up BGP peering to doh5001 in eqsin, triggering DoH /24 announcement there.
* 18:47 andrew@deploy1002: Started deploy [horizon/deploy@ec37c43]: test deploy of trove dashboard to codfw1dev
* 09:47 ema: pool cp1087 [[phab:T278729|T278729]]
* 18:39 jdrewniak@deploy1002: Synchronized private/readme.php: Config: [[gerrit:679614{{!}}Add $wgWMEVectorPrefDiffSalt to private/readme (T261842)]] (duration: 01m 08s)
* 09:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host people1003.eqiad.wmnet
* 18:32 jdrewniak@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:679613{{!}}Add mediawiki.pref_diff stream to wgEventLoggingStreamNames/wgEventStreams (T261842)]] (duration: 01m 18s)
* 09:33 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host people1003.eqiad.wmnet
* 17:18 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host people2002.codfw.wmnet
* 17:09 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 09:30 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host people2002.codfw.wmnet
* 16:42 crusnov@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:17 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 100%: Repool db1110', diff saved to https://phabricator.wikimedia.org/P16304 and previous config saved to /var/cache/conftool/dbconfig/20210604-091742-root.json
* 16:34 crusnov@cumin1001: START - Cookbook sre.dns.netbox
* 09:06 ema: reboot cp1087 [[phab:T278729|T278729]]
* 16:27 ariel@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase1027.eqiad.wmnet
* 09:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 75%: Repool db1110', diff saved to https://phabricator.wikimedia.org/P16303 and previous config saved to /var/cache/conftool/dbconfig/20210604-090239-root.json
* 16:21 ryankemper: [[phab:T280108|T280108]] [[phab:T267927|T267927]] Current wdqs transfers in progress: `wqds1004`->`wdqs1005`, `wdqs2008`->`wdqs2001`
* 08:47 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 50%: Repool db1110', diff saved to https://phabricator.wikimedia.org/P16302 and previous config saved to /var/cache/conftool/dbconfig/20210604-084735-root.json
* 16:21 ariel@cumin1001: START - Cookbook sre.hosts.reboot-single for host restbase1027.eqiad.wmnet
* 08:33 marostegui: Upgrade db1110 [[phab:T283235|T283235]]
* 16:17 ryankemper@cumin2001: START - Cookbook sre.wdqs.data-transfer
* 08:32 marostegui@cumin1001: dbctl commit (dc=all): 'db1110 (re)pooling @ 25%: Repool db1110', diff saved to https://phabricator.wikimedia.org/P16301 and previous config saved to /var/cache/conftool/dbconfig/20210604-083232-root.json
* 16:17 ariel@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase1026.eqiad.wmnet
* 08:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1110', diff saved to https://phabricator.wikimedia.org/P16300 and previous config saved to /var/cache/conftool/dbconfig/20210604-082956-marostegui.json
* 16:17 ryankemper@cumin2001: START - Cookbook sre.wdqs.data-transfer
* 08:20 godog: upgrade karma to 0.86-1
* 16:17 ryankemper: [[phab:T280108|T280108]] [[phab:T267927|T267927]] Merged https://gerrit.wikimedia.org/r/c/operations/cookbooks/+/679702 and ran puppet-agent on `cumin2001` before next round of wdqs `data-transfer`s
* 07:38 jynus: stop and upgrade db1150 [[phab:T283235|T283235]]
* 16:12 ariel@cumin1001: START - Cookbook sre.hosts.reboot-single for host restbase1026.eqiad.wmnet
* 07:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3315 (re)pooling @ 100%: Repool db1096:3315', diff saved to https://phabricator.wikimedia.org/P16299 and previous config saved to /var/cache/conftool/dbconfig/20210604-073326-root.json
* 16:08 ariel@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase1025.eqiad.wmnet
* 07:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 100%: Repool db1096:3316', diff saved to https://phabricator.wikimedia.org/P16298 and previous config saved to /var/cache/conftool/dbconfig/20210604-073318-root.json
* 16:02 ariel@cumin1001: START - Cookbook sre.hosts.reboot-single for host restbase1025.eqiad.wmnet
* 07:29 moritzm: cleanup now unused nginx mods and former deps on install* and puppetdb* servers after switch towards nginx-light (various X11 libs and libxslt) [[phab:T164456|T164456]]
* 15:26 otto@deploy1002: Finished deploy [analytics/refinery@497f6a5] (hadoop-test): (no justification provided) (duration: 04m 44s)
* 07:24 moritzm: cleanup now unused nginx mods and former deps on install* servers after switch towards nginx-light (various X11 libs and libxslt)
* 15:21 otto@deploy1002: Started deploy [analytics/refinery@497f6a5] (hadoop-test): (no justification provided)
* 07:19 urbanecm: Password reset for SUL User:Dominic_Mayers  ([[phab:T282656|T282656]])
* 15:09 elukey@deploy1002: Finished deploy [analytics/refinery@497f6a5]: Regular analytics weekly train (duration: 13m 12s)
* 07:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3315 (re)pooling @ 75%: Repool db1096:3315', diff saved to https://phabricator.wikimedia.org/P16297 and previous config saved to /var/cache/conftool/dbconfig/20210604-071823-root.json
* 15:09 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dns1002.wikimedia.org
* 07:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 75%: Repool db1096:3316', diff saved to https://phabricator.wikimedia.org/P16296 and previous config saved to /var/cache/conftool/dbconfig/20210604-071815-root.json
* 15:03 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single for host dns1002.wikimedia.org
* 07:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3315 (re)pooling @ 50%: Repool db1096:3315', diff saved to https://phabricator.wikimedia.org/P16295 and previous config saved to /var/cache/conftool/dbconfig/20210604-070319-root.json
* 14:59 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dns1001.wikimedia.org
* 07:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 50%: Repool db1096:3316', diff saved to https://phabricator.wikimedia.org/P16294 and previous config saved to /var/cache/conftool/dbconfig/20210604-070311-root.json
* 14:56 elukey@deploy1002: Started deploy [analytics/refinery@497f6a5]: Regular analytics weekly train
* 06:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3315 (re)pooling @ 25%: Repool db1096:3315', diff saved to https://phabricator.wikimedia.org/P16293 and previous config saved to /var/cache/conftool/dbconfig/20210604-064815-root.json
* 14:53 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single for host dns1001.wikimedia.org
* 06:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 25%: Repool db1096:3316', diff saved to https://phabricator.wikimedia.org/P16292 and previous config saved to /var/cache/conftool/dbconfig/20210604-064807-root.json
* 14:50 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dns5002.wikimedia.org
* 06:46 ryankemper@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 14:47 jayme: imported etcd-mirror_0.0.5-1 to buster-wikimedia
* 06:42 marostegui: Upgrade mysql on db1096:3315 db1096:3316
* 14:43 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single for host dns5002.wikimedia.org
* 06:42 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1096:3316 db1096:3315', diff saved to https://phabricator.wikimedia.org/P16291 and previous config saved to /var/cache/conftool/dbconfig/20210604-064242-marostegui.json
* 14:41 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dns5001.wikimedia.org
* 05:55 marostegui@cumin1001: dbctl commit (dc=all): 'db1121 (re)pooling @ 100%: Repool db1121', diff saved to https://phabricator.wikimedia.org/P16290 and previous config saved to /var/cache/conftool/dbconfig/20210604-055521-root.json
* 14:37 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wtp1048.eqiad.wmnet with reason: REIMAGE
* 05:40 marostegui@cumin1001: dbctl commit (dc=all): 'db1121 (re)pooling @ 75%: Repool db1121', diff saved to https://phabricator.wikimedia.org/P16289 and previous config saved to /var/cache/conftool/dbconfig/20210604-054017-root.json
* 14:35 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wtp1047.eqiad.wmnet with reason: REIMAGE
* 05:26 oblivian@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:35 jiji@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on wtp1048.eqiad.wmnet with reason: REIMAGE
* 05:25 marostegui@cumin1001: dbctl commit (dc=all): 'db1121 (re)pooling @ 50%: Repool db1121', diff saved to https://phabricator.wikimedia.org/P16288 and previous config saved to /var/cache/conftool/dbconfig/20210604-052514-root.json
* 14:34 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single for host dns5001.wikimedia.org
* 05:24 ryankemper: [[phab:T280382|T280382]] `sudo -i cookbook sre.wdqs.data-transfer --source wdqs2001.codfw.wmnet --dest wdqs2002.codfw.wmnet --reason "transferring fresh wikidata journal following reimage" --blazegraph_instance blazegraph` on `ryankemper@cumin2002` tmux session `wdqs_reimage`
* 14:33 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wtp1046.eqiad.wmnet with reason: REIMAGE
* 05:23 ryankemper@cumin2002: START - Cookbook sre.wdqs.data-transfer
* 14:33 jiji@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on wtp1047.eqiad.wmnet with reason: REIMAGE
* 05:22 ryankemper@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 14:32 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dns2002.wikimedia.org
* 05:17 ryankemper: [[phab:T280382|T280382]] `sudo -i cookbook sre.wdqs.data-transfer --source wdqs2001.codfw.wmnet --dest wdqs2002.codfw.wmnet --reason "transferring fresh categories journal following reimage" --blazegraph_instance categories` on `ryankemper@cumin2002` tmux session `wdqs_reimage`
* 14:31 jiji@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on wtp1046.eqiad.wmnet with reason: REIMAGE
* 05:16 ryankemper@cumin2002: START - Cookbook sre.wdqs.data-transfer
* 14:27 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single for host dns2002.wikimedia.org
* 05:10 marostegui@cumin1001: dbctl commit (dc=all): 'db1121 (re)pooling @ 25%: Repool db1121', diff saved to https://phabricator.wikimedia.org/P16287 and previous config saved to /var/cache/conftool/dbconfig/20210604-051010-root.json
* 14:24 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dns2001.wikimedia.org
* 04:43 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs2002.codfw.wmnet with reason: REIMAGE
* 14:19 ppchelko@deploy1002: Finished deploy [restbase/deploy@4755f50]: [[phab:T271983|T271983]], try again (duration: 07m 45s)
* 04:41 ryankemper@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs2002.codfw.wmnet with reason: REIMAGE
* 14:18 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single for host dns2001.wikimedia.org
* 04:25 ryankemper: [[phab:T280382|T280382]] `sudo -i wmf-auto-reimage-host -p [[phab:T280382|T280382]] wdqs2002.codfw.wmnet` on `ryankemper@cumin2002` tmux session `wdqs_reimage`
* 14:17 ariel@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase1024.eqiad.wmnet
* 04:22 ryankemper: [[phab:T280382|T280382]] `wdqs2001.codfw.wmnet` has been re-imaged and had the appropriate wikidata/categories journal files transferred. `df -h` shows disk space is no longer an issue following the switch to `raid0`: `/dev/md2        2.9T  998G  1.8T  36% /srv`
* 14:12 ppchelko@deploy1002: Started deploy [restbase/deploy@4755f50]: [[phab:T271983|T271983]], try again
* 03:49 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 14:11 ppchelko@deploy1002: Finished deploy [restbase/deploy@4755f50]: [[phab:T271983|T271983]] (duration: 11m 15s)
* 02:42 ryankemper@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 14:09 ariel@cumin1001: START - Cookbook sre.hosts.reboot-single for host restbase1024.eqiad.wmnet
* 02:33 ryankemper: [WDQS] `sudo -i cookbook sre.wdqs.data-transfer --source wdqs1006.eqiad.wmnet --dest wdqs1013.eqiad.wmnet --reason "repair overinflated wikidata jnl" --blazegraph_instance blazegraph`
* 14:00 ppchelko@deploy1002: Started deploy [restbase/deploy@4755f50]: [[phab:T271983|T271983]]
* 02:32 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 13:56 jiji@cumin1001: conftool action : set/pooled=yes; selector: name=wtp104[5-7].eqiad.wmnet
* 02:30 ryankemper: [[phab:T280382|T280382]] `wdqs1005.eqiad.wmnet` has been re-imaged and had the appropriate wikidata/categories journal files transferred. `df -h` shows disk space is no longer an issue following the switch to `raid0`: `/dev/md2        2.9T  998G  1.8T  36% /srv`
* 13:55 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase1023.eqiad.wmnet
* 02:25 ryankemper: [WDQS] `ryankemper@wdqs1012:~$ sudo pool` (caught up on lag)
* 13:54 andrewbogott: upgrading packages and mediawiki on wikitech-static
* 02:09 ryankemper: [[phab:T280382|T280382]] `sudo -i cookbook sre.wdqs.data-transfer --source wdqs2007.codfw.wmnet --dest wdqs2001.codfw.wmnet --reason "transferring fresh wikidata journal following reimage" --blazegraph_instance blazegraph` on `ryankemper@cumin2002` tmux session `wdqs_reimage`
* 13:53 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dns4002.wikimedia.org
* 02:06 ebernhardson: post-deploy restart airflow-(webserver{{!}}scheduer) on an-airflow1001
* 13:48 hnowlan@cumin1001: START - Cookbook sre.hosts.reboot-single for host restbase1023.eqiad.wmnet
* 02:05 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@500179f]: Stop overwriting uploads in swift (duration: 04m 40s)
* 13:46 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single for host dns4002.wikimedia.org
* 02:00 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@500179f]: Stop overwriting uploads in swift
* 13:40 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase1022.eqiad.wmnet
* 01:38 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 13:34 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dns3001.wikimedia.org
* 01:24 ryankemper@cumin2002: START - Cookbook sre.wdqs.data-transfer
* 13:32 hnowlan@cumin1001: START - Cookbook sre.hosts.reboot-single for host restbase1022.eqiad.wmnet
* 00:12 ryankemper@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 13:25 ariel@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase1021.eqiad.wmnet
* 00:08 reedy@deploy1002: Synchronized wmf-config/CommonSettings.php: [[phab:T280886|T280886]] (duration: 00m 57s)
* 13:19 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single for host dns3001.wikimedia.org
* 00:07 ryankemper: [[phab:T280382|T280382]] `sudo -i cookbook sre.wdqs.data-transfer --source wdqs2007.codfw.wmnet --dest wdqs2001.codfw.wmnet --reason "transferring fresh categories journal following reimage" --blazegraph_instance categories` on `ryankemper@cumin2002` tmux session `wdqs_reimage`
* 13:18 ariel@cumin1001: START - Cookbook sre.hosts.reboot-single for host restbase1021.eqiad.wmnet
* 00:06 ryankemper@cumin2002: START - Cookbook sre.wdqs.data-transfer
* 13:13 ariel@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase1020.eqiad.wmnet
* 00:05 ryankemper: [[phab:T280382|T280382]] `sudo -i cookbook sre.wdqs.data-transfer --source wdqs1008.eqiad.wmnet --dest wdqs1005.eqiad.wmnet --reason "transferring fresh wikidata journal following reimage" --blazegraph_instance blazegraph` on `ryankemper@cumin1001` tmux session `wdqs_reimage`
* 13:13 XioNoX: redirect ns2 to dns3002
* 00:05 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 13:11 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dns3002.wikimedia.org
* 00:05 ryankemper@cumin1001: END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97)
* 13:07 ariel@cumin1001: START - Cookbook sre.hosts.reboot-single for host restbase1020.eqiad.wmnet
* 13:02 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single for host dns3002.wikimedia.org
* 13:02 ariel@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase1019.eqiad.wmnet
* 12:59 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wtp1045.eqiad.wmnet with reason: REIMAGE
* 12:57 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wtp1044.eqiad.wmnet with reason: REIMAGE
* 12:57 jiji@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on wtp1045.eqiad.wmnet with reason: REIMAGE
* 12:56 ariel@cumin1001: START - Cookbook sre.hosts.reboot-single for host restbase1019.eqiad.wmnet
* 12:55 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wtp1043.eqiad.wmnet with reason: REIMAGE
* 12:55 jiji@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on wtp1044.eqiad.wmnet with reason: REIMAGE
* 12:54 XioNoX: redirect ns2 to dns3001
* 12:53 jiji@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on wtp1043.eqiad.wmnet with reason: REIMAGE
* 12:50 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host authdns1001.wikimedia.org
* 12:44 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single for host authdns1001.wikimedia.org
* 12:37 XioNoX: redirect ns0 to authdns2001
* 12:32 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host authdns2001.wikimedia.org
* 12:28 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase1018.eqiad.wmnet
* 12:27 jmm@cumin2001: START - Cookbook sre.hosts.reboot-single for host authdns2001.wikimedia.org
* 12:23 jiji@cumin1001: conftool action : set/pooled=yes; selector: name=wtp104[0-2].eqiad.wmnet
* 12:22 hnowlan@cumin1001: START - Cookbook sre.hosts.reboot-single for host restbase1018.eqiad.wmnet
* 12:17 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase1017.eqiad.wmnet
* 12:12 XioNoX: redirect ns1 to authdns1001
* 12:09 hnowlan@cumin1001: START - Cookbook sre.hosts.reboot-single for host restbase1017.eqiad.wmnet
* 11:51 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase1016.eqiad.wmnet
* 11:45 hnowlan@cumin1001: START - Cookbook sre.hosts.reboot-single for host restbase1016.eqiad.wmnet
* 11:45 hnowlan: restarting restbase1016 for kernel update
* 11:35 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wtp1042.eqiad.wmnet with reason: REIMAGE
* 11:33 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wtp1041.eqiad.wmnet with reason: REIMAGE
* 11:33 jiji@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on wtp1042.eqiad.wmnet with reason: REIMAGE
* 11:31 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wtp1040.eqiad.wmnet with reason: REIMAGE
* 11:30 jiji@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on wtp1041.eqiad.wmnet with reason: REIMAGE
* 11:29 jiji@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on wtp1040.eqiad.wmnet with reason: REIMAGE
* 11:25 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:10:00 on restbase-dev1004.eqiad.wmnet with reason: restarting for kernel update
* 11:25 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 0:10:00 on restbase-dev1004.eqiad.wmnet with reason: restarting for kernel update
* 11:20 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|6748a7f28ae9c71b3461b830f8323ded0a024a8e}}: Add *.jfklibrary.org to the wgCopyUploadsDomains allowlist of Wikimedia Commons ([[phab:T279506|T279506]]) (duration: 01m 51s)
* 11:14 arturo: merging homer changes for cr-codgw ([[phab:T280225|T280225]])
* 11:14 arturo: merging homer changes for cr-eqiad ([[phab:T280225|T280225]])
* 10:59 jiji@cumin1001: conftool action : set/pooled=yes; selector: name=wtp103[7-9].eqiad.wmnet
* 10:54 akosiaris@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'linkrecommendation' for release 'internal' .
* 10:53 akosiaris@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'linkrecommendation' for release 'staging' .
* 10:53 akosiaris@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'linkrecommendation' for release 'external' .
* 10:24 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'linkrecommendation' for release 'staging' .
* 10:24 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'linkrecommendation' for release 'external' .
* 10:24 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'linkrecommendation' for release 'internal' .
* 10:21 elukey: Add kafka-logging100<nowiki>{</nowiki>2,3<nowiki>}</nowiki> to the kafka term in the analytics filters on cr1/cr2 eqiad - ref: https://gerrit.wikimedia.org/r/c/operations/homer/public/+/679740
* 10:08 akosiaris@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'linkrecommendation' for release 'internal' .
* 10:08 akosiaris@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'linkrecommendation' for release 'external' .
* 10:08 akosiaris@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'linkrecommendation' for release 'staging' .
* 09:50 marostegui@cumin1001: dbctl commit (dc=all): 'db1146:3312 (re)pooling @ 100%: Repool db1146:3312', diff saved to https://phabricator.wikimedia.org/P15368 and previous config saved to /var/cache/conftool/dbconfig/20210415-095031-root.json
* 09:36 marostegui@cumin1001: dbctl commit (dc=all): 'db1146:3314 (re)pooling @ 100%: Repool db1146:3314 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P15367 and previous config saved to /var/cache/conftool/dbconfig/20210415-093633-root.json
* 09:35 marostegui@cumin1001: dbctl commit (dc=all): 'db1146:3312 (re)pooling @ 75%: Repool db1146:3312', diff saved to https://phabricator.wikimedia.org/P15366 and previous config saved to /var/cache/conftool/dbconfig/20210415-093527-root.json
* 09:21 marostegui@cumin1001: dbctl commit (dc=all): 'db1146:3314 (re)pooling @ 75%: Repool db1146:3314 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P15365 and previous config saved to /var/cache/conftool/dbconfig/20210415-092129-root.json
* 09:20 marostegui@cumin1001: dbctl commit (dc=all): 'db1146:3312 (re)pooling @ 50%: Repool db1146:3312', diff saved to https://phabricator.wikimedia.org/P15364 and previous config saved to /var/cache/conftool/dbconfig/20210415-092024-root.json
* 09:16 ema: cp-upload: varnishadm -n frontend param.set nuke_limit 1000 [[phab:T275809|T275809]]
* 09:06 marostegui@cumin1001: dbctl commit (dc=all): 'db1146:3314 (re)pooling @ 50%: Repool db1146:3314 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P15363 and previous config saved to /var/cache/conftool/dbconfig/20210415-090625-root.json
* 09:05 marostegui@cumin1001: dbctl commit (dc=all): 'db1146:3312 (re)pooling @ 25%: Repool db1146:3312', diff saved to https://phabricator.wikimedia.org/P15362 and previous config saved to /var/cache/conftool/dbconfig/20210415-090520-root.json
* 09:04 moritzm: installing tomcat security updates
* 08:51 marostegui@cumin1001: dbctl commit (dc=all): 'db1146:3314 (re)pooling @ 25%: Repool db1146:3314 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P15361 and previous config saved to /var/cache/conftool/dbconfig/20210415-085122-root.json
* 08:50 marostegui@cumin1001: dbctl commit (dc=all): 'db1146:3312 (re)pooling @ 10%: Repool db1146:3312', diff saved to https://phabricator.wikimedia.org/P15360 and previous config saved to /var/cache/conftool/dbconfig/20210415-085017-root.json
* 08:48 godog: free space and bounce thanos-compact
* 08:36 marostegui@cumin1001: dbctl commit (dc=all): 'db1146:3314 (re)pooling @ 10%: Repool db1146:3314 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P15359 and previous config saved to /var/cache/conftool/dbconfig/20210415-083618-root.json
* 08:21 marostegui@cumin1001: dbctl commit (dc=all): 'db1146:3314 (re)pooling @ 5%: Repool db1146:3314 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P15358 and previous config saved to /var/cache/conftool/dbconfig/20210415-082115-root.json
* 08:11 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1146:3314', diff saved to https://phabricator.wikimedia.org/P15357 and previous config saved to /var/cache/conftool/dbconfig/20210415-081127-marostegui.json
* 08:09 marostegui@cumin1001: dbctl commit (dc=all): 'db1146:3314 (re)pooling @ 100%: Repool db1146:3314 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P15356 and previous config saved to /var/cache/conftool/dbconfig/20210415-080947-root.json
* 07:57 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 100%: Repool db1166 after cloning db1179', diff saved to https://phabricator.wikimedia.org/P15355 and previous config saved to /var/cache/conftool/dbconfig/20210415-075718-root.json
* 07:54 marostegui@cumin1001: dbctl commit (dc=all): 'db1146:3314 (re)pooling @ 75%: Repool db1146:3314 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P15354 and previous config saved to /var/cache/conftool/dbconfig/20210415-075444-root.json
* 07:42 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 75%: Repool db1166 after cloning db1179', diff saved to https://phabricator.wikimedia.org/P15353 and previous config saved to /var/cache/conftool/dbconfig/20210415-074214-root.json
* 07:39 marostegui@cumin1001: dbctl commit (dc=all): 'db1146:3314 (re)pooling @ 50%: Repool db1146:3314 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P15352 and previous config saved to /var/cache/conftool/dbconfig/20210415-073940-root.json
* 07:27 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 50%: Repool db1166 after cloning db1179', diff saved to https://phabricator.wikimedia.org/P15351 and previous config saved to /var/cache/conftool/dbconfig/20210415-072711-root.json
* 07:24 marostegui@cumin1001: dbctl commit (dc=all): 'db1146:3314 (re)pooling @ 25%: Repool db1146:3314 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P15350 and previous config saved to /var/cache/conftool/dbconfig/20210415-072436-root.json
* 07:16 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1146 (s2,s4) to upgrade kernel', diff saved to https://phabricator.wikimedia.org/P15348 and previous config saved to /var/cache/conftool/dbconfig/20210415-071600-marostegui.json
* 07:12 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 25%: Repool db1166 after cloning db1179', diff saved to https://phabricator.wikimedia.org/P15347 and previous config saved to /var/cache/conftool/dbconfig/20210415-071207-root.json
* 06:57 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 10%: Repool db1166 after cloning db1179', diff saved to https://phabricator.wikimedia.org/P15346 and previous config saved to /var/cache/conftool/dbconfig/20210415-065704-root.json
* 06:33 ryankemper: [[phab:T280108|T280108]] [[phab:T267927|T267927]] `data-transfer` to `wdqs1004` was successful; cookbook failed due to a newly introduced minor type error that didn't effect the transfer itself
* 06:32 elukey: move hue.wikimedia.org to an-tool1009 (from analytics-tool1001)
* 06:00 ryankemper@cumin2001: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99)
* 05:54 Amir1: end of cleaning archive of pywikibot-bugs and wikidata-bugs [[phab:T262773|T262773]]
* 05:44 Amir1: start deleting archive of wikidata-bugs [[phab:T262773|T262773]]
* 05:02 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1166 to clone db1179 [[phab:T275633|T275633]]', diff saved to https://phabricator.wikimedia.org/P15344 and previous config saved to /var/cache/conftool/dbconfig/20210415-050239-marostegui.json
* 04:14 ryankemper@cumin2001: START - Cookbook sre.wdqs.data-transfer
* 04:14 ryankemper: [[phab:T280108|T280108]] [[phab:T267927|T267927]]  `wdqs2008` (source) caught up on lag, xfering to `wdqs1004`: `sudo -i cookbook sre.wdqs.data-transfer --source wdqs2008.codfw.wmnet --dest wdqs1004.eqiad.wmnet --reason "transferring wikidata journal following reload from dumps" --blazegraph_instance blazegraph --task-id [[phab:T267927|T267927]]`
* 04:06 ryankemper: [[phab:T280108|T280108]] [[phab:T267927|T267927]] Merged https://gerrit.wikimedia.org/r/c/operations/cookbooks/+/679320, will verify correct behavior of `data-transfer` cookbook
* 01:19 Amir1: mwscript extensions/Wikibase/repo/maintenance/changePropertyDataType.php wikidatawiki --property-id P8671 --new-data-type external-id ([[phab:T278427|T278427]])
* 00:50 ejegg: updated fundraising CiviCRM from {{Gerrit|c3342aa4ea}} to {{Gerrit|35a8dd33ba}}


== 2021-04-14 ==
== 2021-06-03 ==
* 23:27 ladsgroup@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:679350{{!}}Disable legacy javascript global variables in ruwiki (T72470)]] (duration: 01m 16s)
* 23:41 reedy@deploy1002: Synchronized wmf-config/CommonSettings.php: [[phab:T280886|T280886]] (duration: 00m 56s)
* 21:44 legoktm: manually started debmonitor-client.service on ml-serve2004 after 502 Bad gateway error
* 23:40 reedy@deploy1002: Synchronized wmf-config/InitialiseSettings.php: [[phab:T280886|T280886]] (duration: 00m 57s)
* 20:55 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on wtp[1037-1039].eqiad.wmnet with reason: reimage
* 23:33 mutante: installing OS on fresh VM doh5001
* 20:54 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on wtp[1037-1039].eqiad.wmnet with reason: reimage
* 23:30 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs2001.codfw.wmnet with reason: REIMAGE
* 20:38 mutante: wtp1037, wtp1038, wtp1039 - scap pull
* 23:28 ryankemper@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs2001.codfw.wmnet with reason: REIMAGE
* 19:52 dzahn@cumin1001: conftool action : set/weight=20; selector: name=mw2395.codfw.wmnet,cluster=jobrunner
* 23:09 thcipriani@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:694686{{!}}Restrict changetags to sysops and bots on meta]] [[phab:T283625|T283625]] (duration: 00m 58s)
* 19:52 dzahn@cumin1001: conftool action : set/weight=20; selector: name=mw2394.codfw.wmnet,cluster=jobrunner
* 22:41 ryankemper: [[phab:T280382|T280382]] `sudo -i wmf-auto-reimage-host -p [[phab:T280382|T280382]] wdqs2001.codfw.wmnet` on `ryankemper@cumin2002` tmux session `wdqs_reimage`
* 19:51 dzahn@cumin1001: conftool action : set/weight=20; selector: name=mw2410.codfw.wmnet,cluster=videoscaler
* 22:39 ryankemper: [[phab:T280382|T280382]] `sudo -i cookbook sre.wdqs.data-transfer --source wdqs1008.eqiad.wmnet --dest wdqs1005.eqiad.wmnet --reason "transferring fresh wikidata journal following reimage" --blazegraph_instance blazegraph` on `ryankemper@cumin1001` tmux session `wdqs_reimage`
* 19:51 dzahn@cumin1001: conftool action : set/weight=20; selector: name=mw2411.codfw.wmnet,cluster=videoscaler
* 22:39 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 19:50 cstone: civicrm revision changed from {{Gerrit|ec2a3bcff6}} to {{Gerrit|c3342aa4ea}}
* 22:36 ryankemper: [[phab:T280382|T280382]] Cancelled transfer to `wdqs1005`; the source host `wdqs1013` has a `wikidata.jnl` that is 80% too big; will transfer from different node -> `wdqs1005` and then fix the journal on `wdqs1013` after
* 19:50 dzahn@cumin1001: conftool action : set/weight=15; selector: name=mw2411.codfw.wmnet,cluster=videoscaler
* 22:36 ryankemper@cumin1001: END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97)
* 19:50 dzahn@cumin1001: conftool action : set/weight=15; selector: name=mw2410.codfw.wmnet,cluster=videoscaler
* 22:35 ryankemper: [[phab:T280382|T280382]] `wdqs2005.codfw.wmnet` has been re-imaged and had the appropriate wikidata/categories journal files transferred. `df -h` shows disk space is no longer an issue following the switch to `raid0`: `/dev/md2        2.6T  998G  1.5T  40% /srv`
* 19:49 dzahn@cumin1001: conftool action : set/weight=10; selector: name=mw2395.codfw.wmnet,cluster=videoscaler
* 22:28 robh@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:48 dzahn@cumin1001: conftool action : set/weight=10; selector: name=mw2394.codfw.wmnet,cluster=videoscaler
* 22:15 robh@cumin1001: START - Cookbook sre.dns.netbox
* 19:47 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw2411.codfw.wmnet,cluster=jobrunner
* 21:55 ryankemper@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 19:42 herron: migrating kafka-logging broker logstash1011 to kafka-logging1002 [[phab:T279342|T279342]]
* 20:54 shdubsh: restart kafka on kafka-logging to take new retention config
* 19:06 jhuneidi@deploy1002: Synchronized php: group1 wikis to 1.37.0-wmf.1  refs [[phab:T278345|T278345]] (duration: 02m 03s)
* 20:47 sbassett: Deployed security patch for [[phab:T282932|T282932]]
* 19:04 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.37.0-wmf.1  refs [[phab:T278345|T278345]]
* 20:37 ebernhardson: restart mjolnir-kafka-bulk-daemon on search-loader[12]001
* 18:58 mutante: urldownloader1002 - icinga alerted about disk space, ran 'apt-get clean' which is my usual go to in that case. it reduced usage from 97% to 89%
* 20:35 ebernhardson@deploy1002: Finished deploy [search/mjolnir/deploy@1c40c83]: bulk daemon: accept events for search_updates swift container (duration: 01m 00s)
* 17:56 urbanecm@deploy1002: Synchronized php-1.37.0-wmf.1/extensions/GrowthExperiments/: {{Gerrit|ce44792}}: {{Gerrit|84107c5}}: GrowthExperiments backports related to DatabaseMentorStore ([[phab:T279957|T279957]]; [[phab:T279959|T279959]]) (duration: 01m 55s)
* 20:34 ryankemper: [[phab:T280382|T280382]] `sudo -i cookbook sre.wdqs.data-transfer --source wdqs1013.eqiad.wmnet --dest wdqs1005.eqiad.wmnet --reason "transferring fresh wikidata journal following reimage" --blazegraph_instance blazegraph` on `ryankemper@cumin1001` tmux session `wdqs_reimage`
* 15:00 shdubsh: run new curator actions on codfw - [[phab:T274394|T274394]]
* 20:34 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 14:48 shdubsh: O:logstash::elasticsearch7 update elasticsearch-curator to 5.8.1
* 20:34 ebernhardson@deploy1002: Started deploy [search/mjolnir/deploy@1c40c83]: bulk daemon: accept events for search_updates swift container
* 14:13 rzl: mcrouter cert renewal complete, puppet re-enabled [[phab:T276029|T276029]]
* 20:34 ryankemper: [[phab:T280382|T280382]] `sudo -i cookbook sre.wdqs.data-transfer --source wdqs2001.codfw.wmnet --dest wdqs2005.codfw.wmnet --reason "transferring fresh wikidata journal following reimage" --blazegraph_instance blazegraph` on `ryankemper@cumin2002` tmux session `wdqs_reimage`
* 14:11 zpapierski@deploy1002: Finished deploy [wikimedia/discovery/analytics@8ae53e3]: [[phab:T273847|T273847]] export queries to relforge dag deployment - start date update (duration: 02m 14s)
* 20:34 ryankemper@cumin2002: START - Cookbook sre.wdqs.data-transfer
* 14:11 moritzm: installing intel-microcode updates on Buster
* 19:58 mutante: [mwmaint1002:~] $ /usr/local/bin/systemd-timer-mail-wrapper -T root@mwmaint1002.eqiad.wmnet --only-on-error /usr/local/bin/cross-validate-accounts
* 14:09 zpapierski@deploy1002: Started deploy [wikimedia/discovery/analytics@8ae53e3]: [[phab:T273847|T273847]] export queries to relforge dag deployment - start date update
* 19:56 mutante: [mwmaint1002:~] $ sudo systemctl start daily_account_consistency_check.service
* 13:48 rzl: disabling puppet on C:mcrouter for cert renewal [[phab:T276029|T276029]]
* 19:41 dzahn@cumin1001: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host doh5002.wikimedia.org
* 13:43 marostegui@cumin1001: dbctl commit (dc=all): 'Remove weight from es5 master', diff saved to https://phabricator.wikimedia.org/P15342 and previous config saved to /var/cache/conftool/dbconfig/20210414-134331-marostegui.json
* 19:41 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm for new host doh5002.wikimedia.org
* 13:34 marostegui@cumin1001: dbctl commit (dc=all): 'es1025 (re)pooling @ 100%: Repool es1025 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P15341 and previous config saved to /var/cache/conftool/dbconfig/20210414-133411-root.json
* 19:39 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@339d402]: ship pip and wheel packages for virtualenvs (duration: 04m 27s)
* 13:29 zpapierski@deploy1002: Finished deploy [wikimedia/discovery/analytics@825c60a]: [[phab:T273847|T273847]] export queries to relforge dag deployment - schedule change (duration: 02m 08s)
* 19:37 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host doh5001.wikimedia.org
* 13:27 zpapierski@deploy1002: Started deploy [wikimedia/discovery/analytics@825c60a]: [[phab:T273847|T273847]] export queries to relforge dag deployment - schedule change
* 19:34 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@339d402]: ship pip and wheel packages for virtualenvs
* 13:19 marostegui@cumin1001: dbctl commit (dc=all): 'es1025 (re)pooling @ 75%: Repool es1025 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P15340 and previous config saved to /var/cache/conftool/dbconfig/20210414-131908-root.json
* 19:33 mutante: [deneb:~] $ sudo systemctl start docker-reporter-releng-images - [[phab:T251918|T251918]] - icinga-wm> RECOVERY - Check systemd state on deneb is OK
* 13:12 moritzm: installing OpenSSL updates on buster
* 19:33 ryankemper@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 13:12 hnowlan@cumin1001: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0)
* 19:32 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 13:04 marostegui@cumin1001: dbctl commit (dc=all): 'es1025 (re)pooling @ 50%: Repool es1025 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P15339 and previous config saved to /var/cache/conftool/dbconfig/20210414-130404-root.json
* 19:32 mutante: [deneb:~] $ sudo systemctl start docker-reporter-releng-images
* 13:02 hnowlan@cumin1001: START - Cookbook sre.cassandra.roll-restart
* 19:28 ryankemper: [[phab:T280382|T280382]] `sudo -i cookbook sre.wdqs.data-transfer --source wdqs2001.codfw.wmnet --dest wdqs2005.codfw.wmnet --reason "transferring fresh categories journal following reimage" --blazegraph_instance categories` on `ryankemper@cumin2002` tmux session `wdqs_reimage`
* 13:01 hnowlan@cumin1001: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0)
* 19:27 ryankemper@cumin2002: START - Cookbook sre.wdqs.data-transfer
* 13:01 godog: extend prometheus global @ codfw by 100G
* 19:27 ryankemper: [[phab:T280382|T280382]] `sudo -i cookbook sre.wdqs.data-transfer --source wdqs1013.eqiad.wmnet --dest wdqs1005.eqiad.wmnet --reason "transferring fresh categories journal following reimage" --blazegraph_instance categories` on `ryankemper@cumin1001` tmux session `wdqs_reimage`
* 12:49 marostegui@cumin1001: dbctl commit (dc=all): 'es1025 (re)pooling @ 25%: Repool es1025 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P15338 and previous config saved to /var/cache/conftool/dbconfig/20210414-124901-root.json
* 19:27 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 12:39 elukey: update kafka term for analytics-in<nowiki>{</nowiki>4,6<nowiki>}</nowiki> on cr<nowiki>{</nowiki>1,2<nowiki>}</nowiki>-eqiad to include kafka-logging1001 - ref: https://gerrit.wikimedia.org/r/c/operations/homer/public/+/679296
* 19:23 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm for new host doh5001.wikimedia.org
* 12:36 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wtp1039.eqiad.wmnet with reason: REIMAGE
* 19:14 mutante: install1003 - restarting nginx after we switched from nginx-full to nginx-light package, same on other install servers [[phab:T164456|T164456]]
* 12:34 jiji@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on wtp1039.eqiad.wmnet with reason: REIMAGE
* 19:05 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs2005.codfw.wmnet with reason: REIMAGE
* 12:34 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wtp1038.eqiad.wmnet with reason: REIMAGE
* 19:03 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1005.eqiad.wmnet with reason: REIMAGE
* 12:33 marostegui@cumin1001: dbctl commit (dc=all): 'es1025 (re)pooling @ 10%: Repool es1025 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P15337 and previous config saved to /var/cache/conftool/dbconfig/20210414-123357-root.json
* 19:03 ryankemper@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs2005.codfw.wmnet with reason: REIMAGE
* 12:32 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wtp1037.eqiad.wmnet with reason: REIMAGE
* 19:01 ryankemper@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1005.eqiad.wmnet with reason: REIMAGE
* 12:31 jiji@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on wtp1038.eqiad.wmnet with reason: REIMAGE
* 18:52 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@f40d41a]: resolve npe in datawriter (duration: 00m 31s)
* 12:30 jiji@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on wtp1037.eqiad.wmnet with reason: REIMAGE
* 18:51 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@f40d41a]: resolve npe in datawriter
* 12:27 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 100%: Repool db1096:3316 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P15336 and previous config saved to /var/cache/conftool/dbconfig/20210414-122727-root.json
* 18:46 ryankemper: [[phab:T280382|T280382]] `sudo -i wmf-auto-reimage-host -p [[phab:T280382|T280382]] wdqs2005.codfw.wmnet` on `ryankemper@cumin2002` tmux session `wdqs_reimage`
* 12:21 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3315 (re)pooling @ 100%: Repool db1096:3315 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P15335 and previous config saved to /var/cache/conftool/dbconfig/20210414-122108-root.json
* 18:46 ryankemper: [[phab:T280382|T280382]] `sudo -i wmf-auto-reimage-host -p [[phab:T280382|T280382]] wdqs1005.eqiad.wmnet` on `ryankemper@cumin1001` tmux session `wdqs_reimage`
* 12:12 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 75%: Repool db1096:3316 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P15334 and previous config saved to /var/cache/conftool/dbconfig/20210414-121223-root.json
* 18:39 ryankemper: [WDQS] depooled `wdqs1012` (has ~15 hours of lag to catch up on)
* 12:07 marostegui@cumin1001: dbctl commit (dc=all): 'Depool es1025 for kernel and mysql upgrade [[phab:T279281|T279281]]', diff saved to https://phabricator.wikimedia.org/P15333 and previous config saved to /var/cache/conftool/dbconfig/20210414-120724-marostegui.json
* 18:37 ryankemper: [WDQS] `ryankemper@wdqs1012:~$ sudo systemctl restart wdqs-blazegraph` (blazegraph on the host has been locked up for ~16 hours based off of https://grafana.wikimedia.org/d/000000489/wikidata-query-service?orgId=1&var-cluster_name=wdqs&from=1622683465757&to=1622745461547)
* 12:06 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3315 (re)pooling @ 75%: Repool db1096:3315 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P15332 and previous config saved to /var/cache/conftool/dbconfig/20210414-120604-root.json
* 18:37 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp1087.eqiad.wmnet with reason: replaced DIMM https://phabricator.wikimedia.org/T278729
* 12:03 marostegui: Upgrade mysql on db1080 [[phab:T279281|T279281]]
* 18:37 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on cp1087.eqiad.wmnet with reason: replaced DIMM https://phabricator.wikimedia.org/T278729
* 11:57 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 50%: Repool db1096:3316 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P15331 and previous config saved to /var/cache/conftool/dbconfig/20210414-115720-root.json
* 18:28 mutante: temp. disabling puppet on install* servers. switching nginx to light variant ([[phab:T164456|T164456]])
* 11:53 jiji@cumin1001: conftool action : set/pooled=yes; selector: name=(wtp1034{{!}}wtp1035{{!}}wtp1036).eqiad.wmnet
* 18:16 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@659a8e4]: resolve npe in datawriter (duration: 00m 15s)
* 11:51 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3315 (re)pooling @ 50%: Repool db1096:3315 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P15330 and previous config saved to /var/cache/conftool/dbconfig/20210414-115101-root.json
* 18:16 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@659a8e4]: resolve npe in datawriter
* 11:42 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3316 (re)pooling @ 25%: Repool db1096:3316 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P15329 and previous config saved to /var/cache/conftool/dbconfig/20210414-114216-root.json
* 17:49 robh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on moss-be1002.eqiad.wmnet with reason: REIMAGE
* 11:41 hnowlan@cumin1001: START - Cookbook sre.cassandra.roll-restart
* 17:47 robh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on moss-be1001.eqiad.wmnet with reason: REIMAGE
* 11:39 hnowlan@cumin1001: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0)
* 17:47 robh@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on moss-be1002.eqiad.wmnet with reason: REIMAGE
* 11:37 marostegui@cumin1001: dbctl commit (dc=all): 'db1177 (re)pooling @ 100%: Slowly pool db1177 for the first time in s8 [[phab:T275633|T275633]]', diff saved to https://phabricator.wikimedia.org/P15328 and previous config saved to /var/cache/conftool/dbconfig/20210414-113714-root.json
* 17:45 robh@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on moss-be1001.eqiad.wmnet with reason: REIMAGE
* 11:35 marostegui@cumin1001: dbctl commit (dc=all): 'db1096:3315 (re)pooling @ 25%: Repool db1096:3315 after kernel upgrade', diff saved to https://phabricator.wikimedia.org/P15327 and previous config saved to /var/cache/conftool/dbconfig/20210414-113557-root.json
* 17:37 brennen: gitlab1001: re-running install-gitlab-server.sh
* 11:31 marostegui: Upgrade kernel on db1096 (s5, s6)
* 17:16 urandom: remove dropped Cassandra keyspace snapshots -- [[phab:T258414|T258414]]
* 11:29 hnowlan@cumin1001: START - Cookbook sre.cassandra.roll-restart
* 16:55 ejegg: updated payments-wiki from {{Gerrit|6fac77f60e}} to {{Gerrit|7be0534b91}}
* 11:26 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1096 (s5,s6) kernel upgrade', diff saved to https://phabricator.wikimedia.org/P15326 and previous config saved to /var/cache/conftool/dbconfig/20210414-112619-marostegui.json
* 16:23 ayounsi@cumin1001: START - Cookbook sre.dns.netbox
* 11:25 hnowlan: regenerated certificates for restbase1019/restbase102[0-7]
* 15:49 topranks: Gerrit 697993: Change BGP peer IP for doh3002 on esams CRs.
* 11:22 marostegui@cumin1001: dbctl commit (dc=all): 'db1177 (re)pooling @ 90%: Slowly pool db1177 for the first time in s8 [[phab:T275633|T275633]]', diff saved to https://phabricator.wikimedia.org/P15325 and previous config saved to /var/cache/conftool/dbconfig/20210414-112211-root.json
* 15:27 papaul: pdu  replacement  complete
* 11:07 marostegui@cumin1001: dbctl commit (dc=all): 'db1177 (re)pooling @ 80%: Slowly pool db1177 for the first time in s8 [[phab:T275633|T275633]]', diff saved to https://phabricator.wikimedia.org/P15323 and previous config saved to /var/cache/conftool/dbconfig/20210414-110706-root.json
* 15:25 moritzm: upgrading gitlab to 13.11.5
* 11:06 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wtp1036.eqiad.wmnet with reason: REIMAGE
* 15:08 papaul: disconnect ps2-d8-codfw for replacement
* 11:06 akosiaris@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'linkrecommendation' for release 'external' .
* 14:55 oblivian@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:06 akosiaris@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'linkrecommendation' for release 'internal' .
* 14:54 topranks: Gerrit 697970: Add Wikidough BGP peerings on esams CRs for doh3001 and doh3002.
* 11:06 akosiaris@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'linkrecommendation' for release 'staging' .
* 14:23 moritzm: installing nginx security updates on buster
* 11:06 akosiaris@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'linkrecommendation' for release 'production' .
* 14:12 moritzm: installing postgresql-9.6 security updates
* 11:04 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wtp1035.eqiad.wmnet with reason: REIMAGE
* 13:55 oblivian@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:04 akosiaris@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'linkrecommendation' for release 'external' .
* 13:25 oblivian@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:04 akosiaris@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'linkrecommendation' for release 'internal' .
* 13:19 oblivian@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:04 akosiaris@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'linkrecommendation' for release 'staging' .
* 13:17 oblivian@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:03 akosiaris@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'linkrecommendation' for release 'production' .
* 13:01 marostegui@cumin1001: dbctl commit (dc=all): 'db1112 (re)pooling @ 100%: Repool db1112', diff saved to https://phabricator.wikimedia.org/P16285 and previous config saved to /var/cache/conftool/dbconfig/20210603-130059-root.json
* 11:03 jiji@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on wtp1036.eqiad.wmnet with reason: REIMAGE
* 12:45 marostegui@cumin1001: dbctl commit (dc=all): 'db1112 (re)pooling @ 75%: Repool db1112', diff saved to https://phabricator.wikimedia.org/P16284 and previous config saved to /var/cache/conftool/dbconfig/20210603-124556-root.json
* 11:03 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'linkrecommendation' for release 'external' .
* 12:32 marostegui@cumin1001: dbctl commit (dc=all): 'db1157 (re)pooling @ 100%: Repool db1157', diff saved to https://phabricator.wikimedia.org/P16283 and previous config saved to /var/cache/conftool/dbconfig/20210603-123243-root.json
* 11:03 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'linkrecommendation' for release 'staging' .
* 12:30 marostegui@cumin1001: dbctl commit (dc=all): 'db1112 (re)pooling @ 50%: Repool db1112', diff saved to https://phabricator.wikimedia.org/P16282 and previous config saved to /var/cache/conftool/dbconfig/20210603-123052-root.json
* 11:02 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'linkrecommendation' for release 'internal' .
* 12:17 marostegui@cumin1001: dbctl commit (dc=all): 'db1157 (re)pooling @ 75%: Repool db1157', diff saved to https://phabricator.wikimedia.org/P16281 and previous config saved to /var/cache/conftool/dbconfig/20210603-121739-root.json
* 11:02 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'linkrecommendation' for release 'production' .
* 12:15 marostegui@cumin1001: dbctl commit (dc=all): 'db1112 (re)pooling @ 25%: Repool db1112', diff saved to https://phabricator.wikimedia.org/P16280 and previous config saved to /var/cache/conftool/dbconfig/20210603-121548-root.json
* 11:02 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wtp1034.eqiad.wmnet with reason: REIMAGE
* 12:12 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1112', diff saved to https://phabricator.wikimedia.org/P16279 and previous config saved to /var/cache/conftool/dbconfig/20210603-121205-marostegui.json
* 11:01 jiji@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on wtp1035.eqiad.wmnet with reason: REIMAGE
* 12:11 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 100%: Repool db1166', diff saved to https://phabricator.wikimedia.org/P16278 and previous config saved to /var/cache/conftool/dbconfig/20210603-121133-root.json
* 10:59 jiji@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on wtp1034.eqiad.wmnet with reason: REIMAGE
* 12:06 moritzm: restarting FPM on mw canaries to pick up lz4 update
* 10:52 marostegui@cumin1001: dbctl commit (dc=all): 'db1177 (re)pooling @ 70%: Slowly pool db1177 for the first time in s8 [[phab:T275633|T275633]]', diff saved to https://phabricator.wikimedia.org/P15322 and previous config saved to /var/cache/conftool/dbconfig/20210414-105202-root.json
* 12:03 moritzm: installing lz4 security updates on buster
* 10:37 marostegui@cumin1001: dbctl commit (dc=all): 'db1177 (re)pooling @ 60%: Slowly pool db1177 for the first time in s8 [[phab:T275633|T275633]]', diff saved to https://phabricator.wikimedia.org/P15321 and previous config saved to /var/cache/conftool/dbconfig/20210414-103659-root.json
* 12:02 marostegui@cumin1001: dbctl commit (dc=all): 'db1157 (re)pooling @ 50%: Repool db1157', diff saved to https://phabricator.wikimedia.org/P16277 and previous config saved to /var/cache/conftool/dbconfig/20210603-120235-root.json
* 10:30 marostegui: Failover m1 from db1080 to db1159 - [[phab:T276448|T276448]]
* 11:56 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 75%: Repool db1166', diff saved to https://phabricator.wikimedia.org/P16276 and previous config saved to /var/cache/conftool/dbconfig/20210603-115628-root.json
* 10:25 dcaro@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on 6 hosts with reason: Upgrading ceph to octopus
* 11:53 moritzm: installing curl security updates on stretch
* 10:25 dcaro@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on 6 hosts with reason: Upgrading ceph to octopus
* 11:47 marostegui@cumin1001: dbctl commit (dc=all): 'db1157 (re)pooling @ 25%: Repool db1157', diff saved to https://phabricator.wikimedia.org/P16275 and previous config saved to /var/cache/conftool/dbconfig/20210603-114731-root.json
* 10:21 marostegui@cumin1001: dbctl commit (dc=all): 'db1177 (re)pooling @ 50%: Slowly pool db1177 for the first time in s8 [[phab:T275633|T275633]]', diff saved to https://phabricator.wikimedia.org/P15320 and previous config saved to /var/cache/conftool/dbconfig/20210414-102153-root.json
* 11:45 marostegui@cumin1001: dbctl commit (dc=all): 'db1175 (re)pooling @ 100%: Repool db1175', diff saved to https://phabricator.wikimedia.org/P16274 and previous config saved to /var/cache/conftool/dbconfig/20210603-114503-root.json
* 10:06 marostegui@cumin1001: dbctl commit (dc=all): 'db1177 (re)pooling @ 40%: Slowly pool db1177 for the first time in s8 [[phab:T275633|T275633]]', diff saved to https://phabricator.wikimedia.org/P15319 and previous config saved to /var/cache/conftool/dbconfig/20210414-100649-root.json
* 11:43 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1157', diff saved to https://phabricator.wikimedia.org/P16273 and previous config saved to /var/cache/conftool/dbconfig/20210603-114325-marostegui.json
* 09:51 marostegui@cumin1001: dbctl commit (dc=all): 'db1177 (re)pooling @ 30%: Slowly pool db1177 for the first time in s8 [[phab:T275633|T275633]]', diff saved to https://phabricator.wikimedia.org/P15318 and previous config saved to /var/cache/conftool/dbconfig/20210414-095146-root.json
* 11:41 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 50%: Repool db1166', diff saved to https://phabricator.wikimedia.org/P16272 and previous config saved to /var/cache/conftool/dbconfig/20210603-114124-root.json
* 09:37 ryankemper@cumin2001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 11:30 marostegui@cumin1001: dbctl commit (dc=all): 'db1175 (re)pooling @ 75%: Repool db1175', diff saved to https://phabricator.wikimedia.org/P16271 and previous config saved to /var/cache/conftool/dbconfig/20210603-113000-root.json
* 09:36 marostegui@cumin1001: dbctl commit (dc=all): 'db1177 (re)pooling @ 20%: Slowly pool db1177 for the first time in s8 [[phab:T275633|T275633]]', diff saved to https://phabricator.wikimedia.org/P15317 and previous config saved to /var/cache/conftool/dbconfig/20210414-093642-root.json
* 11:26 marostegui@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 25%: Repool db1166', diff saved to https://phabricator.wikimedia.org/P16270 and previous config saved to /var/cache/conftool/dbconfig/20210603-112620-root.json
* 09:33 marostegui@cumin1001: dbctl commit (dc=all): 'Pool db1177 with minimal weight on s8 for the first time [[phab:T275633|T275633]]', diff saved to https://phabricator.wikimedia.org/P15316 and previous config saved to /var/cache/conftool/dbconfig/20210414-093305-marostegui.json
* 11:22 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1166', diff saved to https://phabricator.wikimedia.org/P16269 and previous config saved to /var/cache/conftool/dbconfig/20210603-112243-marostegui.json
* 09:29 gehel: depooling wdqs1004 - corrupted data after data reload
* 11:14 marostegui@cumin1001: dbctl commit (dc=all): 'db1175 (re)pooling @ 50%: Repool db1175', diff saved to https://phabricator.wikimedia.org/P16268 and previous config saved to /var/cache/conftool/dbconfig/20210603-111456-root.json
* 09:27 effie: disable puppet on all mediawiki servers to merge 676580
* 11:10 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|e84096857c8a2f753e077aa6c3e37b910b9e1fcd}}: jawiki: extended confirmed should be 120 days since first edit, not registration ([[phab:T284212|T284212]]) (duration: 00m 58s)
* 09:24 urbanecm@deploy1002: Synchronized php-1.37.0-wmf.1/extensions/DiscussionTools/includes/Hooks/HookUtils.php: {{Gerrit|e4b2d93dcf86a336314ed09fd37844edb16f4f30}}: Dont allow query and cookie hacks to enable topic subscriptions ([[phab:T280082|T280082]]) (duration: 01m 24s)
* 11:09 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 100%: Repool db1179', diff saved to https://phabricator.wikimedia.org/P16267 and previous config saved to /var/cache/conftool/dbconfig/20210603-110906-root.json
* 09:23 gehel: repooling wdqs1013, catched up on lag
* 10:59 marostegui@cumin1001: dbctl commit (dc=all): 'db1175 (re)pooling @ 25%: Repool db1175', diff saved to https://phabricator.wikimedia.org/P16266 and previous config saved to /var/cache/conftool/dbconfig/20210603-105953-root.json
* 09:22 gehel: depooling wdqs1003 - corrupted data after data reload
* 10:55 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1175', diff saved to https://phabricator.wikimedia.org/P16265 and previous config saved to /var/cache/conftool/dbconfig/20210603-105536-marostegui.json
* 09:19 jmm@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts kraz.wikimedia.org
* 10:54 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 75%: Repool db1179', diff saved to https://phabricator.wikimedia.org/P16264 and previous config saved to /var/cache/conftool/dbconfig/20210603-105402-root.json
* 09:16 gehel: restarting blazegraph on wdqs1003
* 10:52 jiji@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 09:12 ryankemper: [[phab:T267927|T267927]] depooled `wdqs1004` following data transfer (catching up on lag), current round of data transfers is done so there shouldn't be any left to depool
* 10:41 godog: test librenms/AM paging
* 09:10 ryankemper@cumin2001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 10:40 jiji@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 09:09 jmm@cumin1001: START - Cookbook sre.hosts.decommission for hosts kraz.wikimedia.org
* 10:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 50%: Repool db1179', diff saved to https://phabricator.wikimedia.org/P16263 and previous config saved to /var/cache/conftool/dbconfig/20210603-103858-root.json
* 09:06 ryankemper: [[phab:T267927|T267927]] depool `wdqs2001` following data transfer (catching up on lag)
* 10:28 jiji@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 09:03 jmm@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts bast1002.wikimedia.org
* 10:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1179 (re)pooling @ 25%: Repool db1179', diff saved to https://phabricator.wikimedia.org/P16262 and previous config saved to /var/cache/conftool/dbconfig/20210603-102354-root.json
* 09:03 ryankemper@cumin2001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 10:21 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on pc2008.codfw.wmnet,pc1008.eqiad.wmnet with reason: Purging parsercache [[phab:T282761|T282761]]
* 08:53 jmm@cumin1001: START - Cookbook sre.hosts.decommission for hosts bast1002.wikimedia.org
* 10:21 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 5 days, 0:00:00 on pc2008.codfw.wmnet,pc1008.eqiad.wmnet with reason: Purging parsercache [[phab:T282761|T282761]]
* 08:44 Urbanecm: Run scap pull on mwdebug1002
* 10:19 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1179', diff saved to https://phabricator.wikimedia.org/P16261 and previous config saved to /var/cache/conftool/dbconfig/20210603-101950-marostegui.json
* 08:40 Urbanecm: Stagging on mwdebug1002
* 10:13 kormat@deploy1002: Synchronized wmf-config/db-eqiad.php: Set pc1010 as pc2 primary [[phab:T282761|T282761]] (duration: 00m 58s)
* 08:20 akosiaris@cumin1001: conftool action : set/weight=10; selector: cluster=videoscaler,service=apache2,name=mw2394.codfw.wmnet
* 09:38 marostegui: Deploy schema change on s3 codfw master (with replication) - [[phab:T282373|T282373]] [[phab:T282372|T282372]] [[phab:T282371|T282371]]
* 08:20 akosiaris@cumin1001: conftool action : set/weight=10; selector: cluster=videoscaler,service=apache2,name=mw2395.codfw.wmnet
* 09:37 moritzm: upgrading eqiad to debmonitor-client 0.3.0 (along with deleting/recreating system user within 100-499 range) [[phab:T235162|T235162]]
* 08:16 jiji@cumin1001: conftool action : set/pooled=yes; selector: name=(wtp1033.eqiad.wmnet{{!}}wtp1032.eqiad.wmnet)
* 08:55 moritzm: uploading gitlab-ce 13.11.5-ce to apt.wikimedia.org thirdparty/gitlab
* 08:07 jayme: updated chartmuseum to 0.13.1 on charmuseum1001, chartmuseum2001
* 08:43 oblivian@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:06 jayme@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=helm-charts,name=eqiad
* 08:37 moritzm: upgrading codfw to debmonitor-client 0.3.0 (along with deleting/recreating system user within 100-499 range) [[phab:T235162|T235162]]
* 08:05 gehel: depooling wdqs2004 - catching up on lag
* 08:23 oblivian@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:01 ryankemper@cumin2001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 08:19 oblivian@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 07:59 gehel: depooling wdqs2001 - catching up on lag
* 08:09 moritzm: upgrading esams/eqsin to debmonitor-client 0.3.0 (along with deleting/recreating system user within 100-499 range)
* 07:57 gehel: depooling wdqs1013 - catching up on lag
* 07:52 ryankemper: [WDQS] Pooled `wdqs1008` and `wdqs2006` (all caught up on lag)
* 07:56 ryankemper@cumin2001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 07:48 moritzm: uploaded debmonitor-client 0.3.0-1+deb10u2 to apt.wikimedia.org
* 07:55 gehel: restarting blazegraph + updater on wdqs1013
* 06:24 ryankemper: [WDQS] De-pooled `wdqs1008` and `wdqs2006` (~1 hour of lag to catch up on)
* 07:51 jayme@cumin1001: conftool action : set/pooled=false; selector: dnsdisc=helm-charts,name=eqiad
* 06:23 ryankemper: [[phab:T280382|T280382]] `wdqs2006.codfw.wmnet` has been re-imaged and had the appropriate wikidata/categories journal files transferred. `df -h` shows disk space is no longer an issue following the switch to `raid0`: `/dev/md2        2.6T  998G  1.5T  40% /srv`
* 07:51 jayme@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=helm-charts,name=codfw
* 06:23 ryankemper: [[phab:T280382|T280382]] `wdqs1008.eqiad.wmnet` has been re-imaged and had the appropriate wikidata/categories journal files transferred. `df -h` shows disk space is no longer an issue following the switch to `raid0`: `/dev/md2        2.6T  998G  1.5T  40% /srv`
* 07:42 jayme@cumin1001: conftool action : set/pooled=false; selector: dnsdisc=helm-charts,name=codfw
* 06:07 ryankemper@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 07:42 akosiaris@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'linkrecommendation' for release 'staging' .
* 06:05 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 07:42 akosiaris@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'linkrecommendation' for release 'production' .
* 05:20 marostegui: Deploy schema change on db1121, lag will appear on s4 (commonswiki) wiki replicas - [[phab:T266486|T266486]] [[phab:T268392|T268392]] [[phab:T273360|T273360]]
* 07:42 akosiaris@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'linkrecommendation' for release 'external' .
* 05:18 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1121', diff saved to https://phabricator.wikimedia.org/P16259 and previous config saved to /var/cache/conftool/dbconfig/20210603-051853-marostegui.json
* 07:42 jayme: imported chartmuseum_0.13.1-1 to buster-wikimedia
* 05:14 marostegui@cumin1001: dbctl commit (dc=all): 'db1144:3314 (re)pooling @ 100%: Repool db1144:3314', diff saved to https://phabricator.wikimedia.org/P16258 and previous config saved to /var/cache/conftool/dbconfig/20210603-051402-root.json
* 07:41 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'linkrecommendation' for release 'staging' .
* 04:58 marostegui@cumin1001: dbctl commit (dc=all): 'db1144:3314 (re)pooling @ 75%: Repool db1144:3314', diff saved to https://phabricator.wikimedia.org/P16257 and previous config saved to /var/cache/conftool/dbconfig/20210603-045859-root.json
* 07:41 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'linkrecommendation' for release 'external' .
* 04:43 marostegui@cumin1001: dbctl commit (dc=all): 'db1144:3314 (re)pooling @ 50%: Repool db1144:3314', diff saved to https://phabricator.wikimedia.org/P16256 and previous config saved to /var/cache/conftool/dbconfig/20210603-044355-root.json
* 07:41 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'linkrecommendation' for release 'production' .
* 04:37 ryankemper: [[phab:T280382|T280382]] `sudo -i cookbook sre.wdqs.data-transfer --source wdqs1005.eqiad.wmnet --dest wdqs1008.eqiad.wmnet --reason "transferring fresh wikidata journal following reimage" --blazegraph_instance blazegraph` on `ryankemper@cumin1001` tmux session `wdqs_reimage`
* 07:41 godog: swift eqiad-prod: less weight for ms-be[1019-1026] / more weight to ms-be106[0-3] - [[phab:T272836|T272836]]
* 04:36 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 07:41 akosiaris@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'linkrecommendation' for release 'production' .
* 04:36 ryankemper: [[phab:T280382|T280382]] `sudo -i cookbook sre.wdqs.data-transfer --source wdqs2004.codfw.wmnet --dest wdqs2006.codfw.wmnet --reason "transferring fresh wikidata journal following reimage" --blazegraph_instance blazegraph` on `ryankemper@cumin2002` tmux session `wdqs_reimage`
* 07:40 akosiaris@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'linkrecommendation' for release 'external' .
* 04:36 ryankemper@cumin2002: START - Cookbook sre.wdqs.data-transfer
* 07:40 akosiaris@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'linkrecommendation' for release 'staging' .
* 04:35 ryankemper@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 07:22 XioNoX: push pfw policy - [[phab:T280059|T280059]]
* 04:34 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 06:47 eileen: civicrm revision changed from {{Gerrit|649e415c07}} to {{Gerrit|ec2a3bcff6}}, config revision is {{Gerrit|c5fc1b91e0}}
* 04:30 ryankemper: [[phab:T280382|T280382]] `sudo -i cookbook sre.wdqs.data-transfer --source wdqs2004.codfw.wmnet --dest wdqs2006.codfw.wmnet --reason "transferring fresh categories journal following reimage" --blazegraph_instance categories` on `ryankemper@cumin2002` tmux session `wdqs_reimage`
* 06:25 marostegui@cumin1001: dbctl commit (dc=all): 'Pool db1177 with minimal weight on s8 for the first time [[phab:T275633|T275633]]', diff saved to https://phabricator.wikimedia.org/P15314 and previous config saved to /var/cache/conftool/dbconfig/20210414-062549-marostegui.json
* 04:29 ryankemper@cumin2002: START - Cookbook sre.wdqs.data-transfer
* 05:30 marostegui@cumin1001: dbctl commit (dc=all): 'Pool db1177 with minimal weight on s8 for the first time [[phab:T275633|T275633]]', diff saved to https://phabricator.wikimedia.org/P15313 and previous config saved to /var/cache/conftool/dbconfig/20210414-052959-marostegui.json
* 04:29 ryankemper: [[phab:T280382|T280382]] `sudo -i cookbook sre.wdqs.data-transfer --source wdqs1005.eqiad.wmnet --dest wdqs1008.eqiad.wmnet --reason "transferring fresh categories journal following reimage" --blazegraph_instance categories` on `ryankemper@cumin1001` tmux session `wdqs_reimage`
* 05:14 root@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts db1076.eqiad.wmnet
* 04:29 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 05:08 ryankemper@cumin2001: START - Cookbook sre.wdqs.data-transfer
* 04:28 marostegui@cumin1001: dbctl commit (dc=all): 'db1144:3314 (re)pooling @ 25%: Repool db1144:3314', diff saved to https://phabricator.wikimedia.org/P16255 and previous config saved to /var/cache/conftool/dbconfig/20210603-042851-root.json
* 05:08 ryankemper@cumin2001: START - Cookbook sre.wdqs.data-transfer
* 02:22 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1008.eqiad.wmnet with reason: REIMAGE
* 05:07 ryankemper@cumin2001: START - Cookbook sre.wdqs.data-transfer
* 02:20 ryankemper@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1008.eqiad.wmnet with reason: REIMAGE
* 05:07 ryankemper@cumin2001: START - Cookbook sre.wdqs.data-transfer
* 02:09 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs2006.codfw.wmnet with reason: REIMAGE
* 05:04 root@cumin1001: START - Cookbook sre.hosts.decommission for hosts db1076.eqiad.wmnet
* 02:07 ryankemper: [[phab:T280382|T280382]] `sudo -i wmf-auto-reimage-host -p [[phab:T280382|T280382]] wdqs1008.eqiad.wmnet` on `ryankemper@cumin1001` tmux session `wdqs_reimage`
* 04:50 ryankemper@cumin2001: START - Cookbook sre.wdqs.data-transfer
* 02:07 ryankemper@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs2006.codfw.wmnet with reason: REIMAGE
* 04:42 eileen: civicrm revision changed from {{Gerrit|a4c1a7b842}} to {{Gerrit|649e415c07}}, config revision is {{Gerrit|c5fc1b91e0}}
* 02:05 ryankemper: [[phab:T280382|T280382]] `wdqs1003.eqiad.wmnet` has been re-imaged and had the appropriate wikidata/categories journal files transferred. `df -h` shows disk space is no longer an issue following the switch to `raid0`: `/dev/md2        2.9T  998G  1.8T  36% /srv`
* 02:54 andrew@deploy1002: Finished deploy [horizon/deploy@ef844a1]: fix for [[phab:T276963|T276963]] (duration: 04m 10s)
* 02:04 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 02:49 andrew@deploy1002: Started deploy [horizon/deploy@ef844a1]: fix for [[phab:T276963|T276963]]
* 01:51 ryankemper: [[phab:T280382|T280382]] `sudo -i wmf-auto-reimage-host -p [[phab:T280382|T280382]] wdqs2006.codfw.wmnet` on `ryankemper@cumin2002` tmux session `wdqs_reimage`
* 00:11 legoktm@cumin1001: conftool action : set/pooled=yes; selector: cluster=jobrunner,name=mw2411.codfw.wmnet
* 01:47 ryankemper: [[phab:T280382|T280382]] `wdqs2003.codfw.wmnet` has been re-imaged and had the appropriate wikidata/categories journal files transferred. `df -h` shows disk space is no longer an issue following the switch to `raid0`: `/dev/md2        2.9T  998G  1.8T  36% /srv`
* 00:10 legoktm@cumin1001: conftool action : set/pooled=yes; selector: cluster=jobrunner,name=mw2411.codfw.wmnet
* 01:43 ryankemper: [WDQS] Pooled `wdqs1004` (caught up on lag)
* 00:08 legoktm@cumin1001: conftool action : set/pooled=no; selector: cluster=jobrunner,name=mw2411.codfw.wmnet
* 01:25 ryankemper@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 00:08 legoktm@cumin1001: conftool action : set/pooled=no; selector: cluster=jobrunner,name=mw2410.codfw.wmnet
* 00:40 ladsgroup@deploy1002: Synchronized php-1.37.0-wmf.7/extensions/Gadgets: Backport: [[gerrit:697816{{!}}Reduce message parse in GadgetHooks::getPreferences (second time) (T58633 T278650)]], Try II (duration: 00m 57s)
* 00:36 ladsgroup@deploy1002: Synchronized php-1.37.0-wmf.7/includes/user/UserOptionsManager.php: Backport: [[gerrit:697818{{!}}user: Accept options-messages for multiselect user options (T58633 T278650)]] (duration: 00m 57s)
* 00:35 ryankemper: [[phab:T280382|T280382]] `sudo -i cookbook sre.wdqs.data-transfer --source wdqs1007.eqiad.wmnet --dest wdqs1003.eqiad.wmnet --reason "transferring fresh wikidata journal following reimage" --blazegraph_instance blazegraph` on `ryankemper@cumin1001` tmux session `wdqs_reimage`
* 00:35 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 00:23 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 00:18 ryankemper: [[phab:T280382|T280382]] `sudo -i cookbook sre.wdqs.data-transfer --source wdqs1007.eqiad.wmnet --dest wdqs1003.eqiad.wmnet --reason "transferring fresh categories journal following reimage" --blazegraph_instance categories` on `ryankemper@cumin1001` tmux session `wdqs_reimage`
* 00:18 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 00:18 ryankemper@cumin1001: END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97)


== 2021-04-13 ==
== 2021-06-02 ==
* 23:27 legoktm@deploy1002: Synchronized wmf-config/ProductionServices.php: Config: [[gerrit:677806{{!}}Broadcast IRC events to irc1001 instead of kraz (T224579)]] (duration: 01m 06s)
* 23:57 ryankemper: [[phab:T280382|T280382]] `sudo -i cookbook sre.wdqs.data-transfer --source wdqs2007.codfw.wmnet --dest wdqs2003.codfw.wmnet --reason "transferring fresh wikidata journal following reimage" --blazegraph_instance blazegraph` on `ryankemper@cumin2002` tmux session `wdqs_reimage`
* 23:15 legoktm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:678628{{!}}Unset $wmgUseWikimediaShopLink for ptwiki (T279877)]] (duration: 01m 06s)
* 23:57 ryankemper@cumin2002: START - Cookbook sre.wdqs.data-transfer
* 23:10 legoktm@deploy1002: Synchronized wmf-config/CommonSettings.php: Config: [[gerrit:678146{{!}}ExtensionDistributor: Add REL1_36]] (duration: 02m 03s)
* 23:56 ryankemper: [[phab:T280382|T280382]] `sudo -i cookbook sre.wdqs.data-transfer --source wdqs1004.eqiad.wmnet --dest wdqs1003.eqiad.wmnet --reason "transferring fresh categories journal following reimage" --blazegraph_instance categories` on `ryankemper@cumin1001` tmux session `wdqs_reimage`
* 22:41 mutante: welcome new deployer Silvan Heintze (sihe) ([[phab:T279764|T279764]])
* 23:56 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 22:40 cstone: civicrm revision changed from {{Gerrit|76bd8ff009}} to {{Gerrit|a4c1a7b842}}
* 23:53 ryankemper@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 22:08 ejegg: updated payments-wiki from {{Gerrit|70f5163816}} to {{Gerrit|9a4eef1375}}
* 23:47 ryankemper: [[phab:T280382|T280382]] `wdqs1004.eqiad.wmnet` has been re-imaged and had the appropriate wikidata/categories journal files transferred. `df -h` shows disk space is no longer an issue following the switch to `raid0`: `/dev/md2        2.9T  998G  1.8T  36% /srv`
* 22:06 dzahn@cumin1001: conftool action : set/weight=15; selector: name=mw2395.codfw.wmnet,cluster=jobrunner
* 23:41 ladsgroup@deploy1002: scap failed: average error rate on 4/9 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/83629bcb5560d11e61d3085c89dd9ed6 for details)
* 22:06 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw2395.codfw.wmnet,cluster=jobrunner
* 23:38 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 22:04 Urbanecm: [urbanecm@mwmaint1002 /srv/mediawiki]$ foreachwikiindblist growthexperiments sql.php php-1.37.0-wmf.1/extensions/GrowthExperiments/maintenance/schemas/mysql/growthexperiments_mentor_mentee.sql # [[phab:T278573|T278573]]
* 23:28 ryankemper: [[phab:T280382|T280382]] `sudo -i cookbook sre.wdqs.data-transfer --source wdqs2007.codfw.wmnet --dest wdqs2003.codfw.wmnet --reason "transferring fresh categories journal following reimage" --blazegraph_instance categories` on `ryankemper@cumin2002` tmux session `wdqs_reimage`
* 21:50 dzahn@cumin1001: conftool action : set/weight=15; selector: name=mw2394.codfw.wmnet,cluster=jobrunner
* 23:28 ryankemper@cumin2002: START - Cookbook sre.wdqs.data-transfer
* 21:49 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw2394.codfw.wmnet,cluster=jobrunner
* 23:26 ryankemper: [[phab:T280382|T280382]] `wdqs2007.codfw.wmnet` has been re-imaged and had the appropriate wikidata/categories journal files transferred. `df -h` shows disk space is no longer an issue following the switch to `raid10`: `/dev/mapper/vg0-srv  2.7T  998G  1.6T  39% /srv`
* 21:48 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw2394.codfw.wmnet,service=jobrunner
* 23:24 ryankemper@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 21:45 mutante: mw2394, mw2395 - scap pull
* 23:18 ladsgroup@deploy1002: Synchronized php-1.37.0-wmf.7/includes: Backport: [[gerrit:697817{{!}}Allow html form field option 'options-messages' to get parsed (T58633)]] (duration: 01m 01s)
* 21:42 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw2395.codfw.wmnet
* 22:56 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs2003.codfw.wmnet with reason: REIMAGE
* 21:35 mutante: mw2394 - rebooting
* 22:54 ryankemper@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs2003.codfw.wmnet with reason: REIMAGE
* 21:34 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw2394.codfw.wmnet
* 22:48 ladsgroup@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:697855{{!}}Enable wgVectorConsolidateUserLinks on the beta cluster (T266536)]] (duration: 00m 57s)
* 21:23 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2395.codfw.wmnet with reason: REIMAGE
* 22:39 ryankemper: [[phab:T280382|T280382]] `sudo -i wmf-auto-reimage-host -p [[phab:T280382|T280382]] --new wdqs2003.codfw.wmnet` on `ryankemper@cumin2002` tmux session `wdqs_reimage_2`
* 21:21 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2395.codfw.wmnet with reason: REIMAGE
* 22:34 ryankemper: [[phab:T280382|T280382]] Cleaned up no-longer-needed files removed in https://gerrit.wikimedia.org/r/c/operations/puppet/+/697832 => `ryankemper@cumin1001:~$ sudo -E cumin -b 2 'P<nowiki>{</nowiki>apt*<nowiki>}</nowiki>' 'sudo rm -rfv /srv/tftpboot/buster-raid0-installer/pxelinux.cfg'`
* 21:15 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw2394.codfw.wmnet with reason: REIMAGE
* 22:30 ryankemper: [[phab:T280382|T280382]] Cleaned up no-longer-needed files removed in https://gerrit.wikimedia.org/r/c/operations/puppet/+/697832 => `ryankemper@cumin1001:~$ sudo -E cumin -b 6 'P<nowiki>{</nowiki>install*<nowiki>}</nowiki>' 'sudo rm -fv /srv/tftpboot/buster-raid0-installer/pxelinux.cfg'`
* 21:13 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on mw2394.codfw.wmnet with reason: REIMAGE
* 22:27 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1003.eqiad.wmnet with reason: REIMAGE
* 20:58 mutante: mw2395, mw2395 - reimaging as jobrunners ([[phab:T279100|T279100]])
* 22:25 ryankemper@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1003.eqiad.wmnet with reason: REIMAGE
* 20:56 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mw[2394-2395].codfw.wmnet with reason: reimage
* 22:19 Amir1: setting charset of all tables in wikitech to binary ([[phab:T284108|T284108]] [[phab:T269348|T269348]])
* 20:56 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on mw[2394-2395].codfw.wmnet with reason: reimage
* 22:11 ryankemper: [[phab:T280382|T280382]] `sudo -i wmf-auto-reimage-host -p [[phab:T280382|T280382]] --new wdqs1003.eqiad.wmnet` on `ryankemper@cumin1001` tmux session `wdqs_reimage_2`
* 20:47 mutante: [kubemaster1001:~] $ sudo kubectl delete pod linkrecommendation-production-load-datasets-1618311600-hn6k8 -n linkrecommendation ([[phab:T280076|T280076]])
* 22:08 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 19:35 kharlan@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'linkrecommendation' for release 'production' .
* 22:07 ryankemper@cumin2002: START - Cookbook sre.wdqs.data-transfer
* 19:35 kharlan@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'linkrecommendation' for release 'external' .
* 22:07 ryankemper@puppetmaster1001: conftool action : set/pooled=yes; selector: name=wdqs1004.eqiad.wmnet
* 19:32 kharlan@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'linkrecommendation' for release 'production' .
* 22:07 ryankemper@puppetmaster1001: conftool action : set/pooled=yes; selector: name=wdqs2007.codfw.wmnet
* 19:32 kharlan@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'linkrecommendation' for release 'external' .
* 22:05 ryankemper@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 19:29 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wtp1033.eqiad.wmnet with reason: REIMAGE
* 22:01 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 19:29 kharlan@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'linkrecommendation' for release 'staging' .
* 21:59 ryankemper: [[phab:T280382|T280382]] `sudo -i cookbook sre.wdqs.data-transfer --source wdqs2008.codfw.wmnet --dest wdqs2007.codfw.wmnet --reason "transferring fresh categories journal following reimage" --blazegraph_instance categories` on `ryankemper@cumin2002` tmux session `wdqs_reimage`
* 19:27 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wtp1032.eqiad.wmnet with reason: REIMAGE
* 21:59 ryankemper@cumin2002: START - Cookbook sre.wdqs.data-transfer
* 19:27 jiji@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on wtp1033.eqiad.wmnet with reason: REIMAGE
* 21:56 ryankemper: [[phab:T280382|T280382]] `sudo -i cookbook sre.wdqs.data-transfer --source wdqs1006.eqiad.wmnet --dest wdqs1004.eqiad.wmnet --reason "transferring fresh categories journal following reimage" --blazegraph_instance categories` on `ryankemper@cumin1001` tmux session `wdqs_reimage`
* 19:25 jiji@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on wtp1032.eqiad.wmnet with reason: REIMAGE
* 21:55 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 19:17 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse2020.codfw.wmnet with reason: REIMAGE
* 21:39 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1004.eqiad.wmnet with reason: REIMAGE
* 19:15 jiji@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on parse2020.codfw.wmnet with reason: REIMAGE
* 21:38 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host doh3002.wikimedia.org
* 19:11 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.37.0-wmf.1
* 21:37 ryankemper@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1004.eqiad.wmnet with reason: REIMAGE
* 18:45 jhuneidi@deploy1002: Pruned MediaWiki: 1.36.0-wmf.37 (duration: 03m 16s)
* 21:32 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs2007.codfw.wmnet with reason: REIMAGE
* 18:11 jhuneidi@deploy1002: Finished scap: testwikis wikis to 1.37.0-wmf.1 (duration: 30m 36s)
* 21:30 ryankemper@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs2007.codfw.wmnet with reason: REIMAGE
* 18:01 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wtp1031.eqiad.wmnet with reason: REIMAGE
* 21:28 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm for new host doh3002.wikimedia.org
* 17:59 jiji@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on wtp1031.eqiad.wmnet with reason: REIMAGE
* 21:21 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host doh3001.wikimedia.org
* 17:58 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wtp1030.eqiad.wmnet with reason: REIMAGE
* 21:19 ryankemper@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=wdqs2007.codfw.wmnet
* 17:56 jiji@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on wtp1030.eqiad.wmnet with reason: REIMAGE
* 21:17 ryankemper: `ryankemper@wdqs1013:~$ sudo depool`  (catching up on 17.9h lag)
* 17:54 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse2019.codfw.wmnet with reason: REIMAGE
* 21:12 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm for new host doh3001.wikimedia.org
* 17:54 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 21:10 ryankemper: [[phab:T280382|T280382]] [[phab:T281437|T281437]] `sudo -i wmf-auto-reimage-host -p [[phab:T280382|T280382]] --new wdqs2007.codfw.wmnet` on `ryankemper@cumin2002` tmux session `wdqs_reimage`
* 17:54 ayounsi@cumin1001: START - Cookbook sre.network.cf
* 21:10 ryankemper: [[phab:T280382|T280382]] `sudo -i wmf-auto-reimage-host -p [[phab:T280382|T280382]] --new wdqs1004.eqiad.wmnet` on `ryankemper@cumin1001` tmux session `wdqs_reimage`
* 17:52 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse2018.codfw.wmnet with reason: REIMAGE
* 20:58 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts doh3001.wikimedia.org
* 17:52 jiji@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on parse2019.codfw.wmnet with reason: REIMAGE
* 20:49 dzahn@cumin1001: START - Cookbook sre.hosts.decommission for hosts doh3001.wikimedia.org
* 17:50 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on parse2017.codfw.wmnet with reason: REIMAGE
* 20:27 dzahn@cumin1001: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host doh3002.wikimedia.org
* 17:50 jiji@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on parse2018.codfw.wmnet with reason: REIMAGE
* 20:21 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm for new host doh3002.wikimedia.org
* 17:48 jiji@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on parse2017.codfw.wmnet with reason: REIMAGE
* 20:00 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host doh3001.wikimedia.org
* 17:41 jhuneidi@deploy1002: Started scap: testwikis wikis to 1.37.0-wmf.1
* 19:42 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm for new host doh3001.wikimedia.org
* 17:29 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 18:37 urbanecm@