You are browsing a read-only backup copy of Wikitech. The primary site can be found at wikitech.wikimedia.org

Server Admin Log: Difference between revisions

From Wikitech-static
Jump to navigation Jump to search
imported>Stashbot
(robh@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4033.ulsfo.wmnet with OS buster)
imported>Stashbot
(pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host logstash2037.codfw.wmnet with OS buster)
 
(312 intermediate revisions by 2 users not shown)
Line 1: Line 1:
== 2021-10-28 ==
== 2022-09-29 ==
* 23:50 robh@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4033.ulsfo.wmnet with OS buster
* 01:01 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host logstash2037.codfw.wmnet with OS buster
* 23:50 robh@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4034.ulsfo.wmnet with OS buster
* 00:46 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on logstash2037.codfw.wmnet with reason: host reimage
* 23:46 robh@cumin1001: START - Cookbook sre.hosts.reimage for host cp4034.ulsfo.wmnet with OS buster
* 00:43 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on logstash2037.codfw.wmnet with reason: host reimage
* 23:43 robh@cumin1001: START - Cookbook sre.hosts.reimage for host cp4033.ulsfo.wmnet with OS buster
* 23:33 brennen: end of UTC late backport & config window
* 23:28 cjming@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:733403{{!}}Add User and User talk to $wgExemptFromUserRobotsControl on zhwiki (T288947)]] (duration: 00m 56s)
* 23:21 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:18 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:16 mutante: mwdebug1001 - letting puppet remove all mediawiki font packages using new Hiera key 'profile::mediawiki::webserver::install_fonts: false' to make sure we really don't need them ([[phab:T294378|T294378]])
* 22:06 mutante: releases1002 - releases2002, something broke puppet here about 27.5 hours ago.  lookup() did not find a value for the name 'profile::docker::storage::physical_volumes'
* 22:03 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:01 mutante: re-enabled puppet on deploy*, parse* and thenr everything else
* 21:59 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:56 mutante: re-enabled puppet on mw-app-canary, mwmaint, labweb1002,..
* 21:53 mutante: re-enabled puppet on mw-api-canary
* 21:48 arnoldokoth: restore script scheduled to run on gitlab2001 ([[phab:T285867|T285867]])
* 21:46 arnoldokoth: restoreccccccvkvhgbvtklgce kkbeuvvuskljihickdbgcunljcr scheduled to run on gitlab2001 ([[phab:T285867|T285867]])
* 21:42 mutante: sudo cumin 'C:profile::mediawiki::common' "disable-puppet 'gerrit:734798 - $<nowiki>{</nowiki>USER<nowiki>}</nowiki>'"
* 20:37 mutante: ensured gitlab restore timer is running only on passive server and re-enabled it - https://gerrit.wikimedia.org/r/c/operations/puppet/+/735437 [[phab:T274463|T274463]]
* 20:30 mutante: gitlab2001 - re-enabled gitlab-restore-from-backup service
* 20:19 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:16 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:13 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|c23e7ab0f88a64b8f656e06949518fb816b2dd56}}: foundationwiki: Set wmgLocalAuthLoginOnly=false temporarily ([[phab:T205347|T205347]]) (duration: 00m 55s)
* 20:06 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:03 twentyafterfour@deploy1002: rebuilt and synchronized wikiversions files: group2 wikis to 1.38.0-wmf.6  refs [[phab:T293947|T293947]]
* 20:03 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:53 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:52 twentyafterfour@deploy1002: Synchronized php: group1 wikis to 1.38.0-wmf.6  refs [[phab:T293947|T293947]] (duration: 01m 02s)
* 19:51 twentyafterfour@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.38.0-wmf.6  refs [[phab:T293947|T293947]]
* 19:50 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:43 legoktm@deploy1002: Synchronized php-1.38.0-wmf.6/includes/libs/http/MultiHttpClient.php: Hack: temporarily log headers in MultiHttpClient (duration: 01m 02s)
* 19:41 robh@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:40 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:40 legoktm@deploy1002: Synchronized php-1.38.0-wmf.5/includes/libs/http/MultiHttpClient.php: Hack: temporarily log headers in MultiHttpClient (duration: 01m 02s)
* 19:37 robh@cumin1001: START - Cookbook sre.dns.netbox
* 19:37 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:36 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.6/includes/GlobalFunctions.php: {{Gerrit|b517ebd29396c23f66806cb0dca1d1b330c6e5be}}: Revert "wfParseUrl: rely on parse_url for proto-relative urls" ([[phab:T294559|T294559]]) (duration: 01m 03s)
* 19:29 mutante: [netflow4001:~] $ sudo systemctl reset-failed
* 19:23 legoktm: manually restarted nginx on ncredir4002 after accidental ganeti reboot
* 19:23 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:19 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:15 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.6/extensions/FlaggedRevs/: {{Gerrit|2cd2a4e62af1b66ad29adf27d761129e8fea388a}}: REST: Avoid making wpaccuracy required in API requests ([[phab:T294544|T294544]]) (duration: 01m 03s)
* 19:12 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|60413dcece4a9e5ec125be49315dc22ec3b85cc7}}: foundationwiki: Use shared OAuth tables ([[phab:T205347|T205347]]) (duration: 01m 04s)
* 19:10 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:07 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:57 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:54 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:53 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|4e0200e3f5ad9c7514ddbeda6f7ee4f8b5ed2ec7}}: emailuser ratelimit: Use user-global rather than user ([[phab:T293866|T293866]]) (duration: 01m 04s)
* 17:23 cstone: civicrm revision changed from {{Gerrit|dba74c443b}} to {{Gerrit|403be9ce05}}
* 16:53 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:50 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:44 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: REVERT: {{Gerrit|78783a7624ddf4e5bfac1e255f466d3c3e36016d}}: foundationwiki: Enable Translate extension ([[phab:T205349|T205349]]) (duration: 01m 04s)
* 16:31 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|78783a7624ddf4e5bfac1e255f466d3c3e36016d}}: foundationwiki: Enable Translate extension ([[phab:T205349|T205349]]) (duration: 01m 25s)
* 16:30 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:27 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:24 urbanecm: foundationwiki: Create DB tables for translate extension ([[phab:T205349|T205349]])
* 16:17 urbanecm: Attach BStorm (WMF)@foundationwiki to SUL ([[phab:T205347|T205347]])
* 15:59 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcephosd1021.eqiad.wmnet with OS bullseye
* 15:47 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:44 ayounsi@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:44 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:43 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|bcc910fffbad3a776aa2465740ac42e9e8ffa26c}}: foundationwiki: Revoke editsitejson and editinterface from users ([[phab:T205347|T205347]]) (duration: 01m 04s)
* 15:41 ayounsi@cumin1001: START - Cookbook sre.dns.netbox
* 15:33 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcephosd1021.eqiad.wmnet with OS bullseye
* 15:24 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1021.eqiad.wmnet with OS bullseye
* 15:24 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcephosd1021.eqiad.wmnet with OS bullseye
* 15:22 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1021.eqiad.wmnet with OS bullseye
* 15:22 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcephosd1021.eqiad.wmnet with OS bullseye
* 15:05 volans@cumin2002: END (PASS) - Cookbook sre.hosts.ipmi-password-reset (exit_code=0)
* 15:04 volans@cumin2002: Updating IPMI password on 8 hosts - volans@cumin2002 - [[phab:T283050|T283050]]
* 15:03 volans@cumin2002: START - Cookbook sre.hosts.ipmi-password-reset
* 14:50 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1021.eqiad.wmnet with OS bullseye
* 14:50 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcephosd1021.eqiad.wmnet with OS bullseye
* 14:49 volans@cumin2002: END (PASS) - Cookbook sre.hosts.ipmi-password-reset (exit_code=0)
* 14:46 volans@cumin2002: Updating IPMI password on 16 hosts - volans@cumin2002 - [[phab:T283050|T283050]]
* 14:46 volans@cumin2002: START - Cookbook sre.hosts.ipmi-password-reset
* 14:25 urbanecm: [urbanecm@mwmaint1002 ~/foundationwiki-sul]$ mwscript extensions/CentralAuth/maintenance/attachAccount.php --wiki=foundationwiki --userlist users.txt # [[phab:T205347|T205347]], users.txt P17630
* 14:14 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:10 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:09 twentyafterfour@deploy1002: Synchronized php: group1 wikis to 1.38.0-wmf.5  refs [[phab:T293947|T293947]] (duration: 01m 03s)
* 14:08 twentyafterfour@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.38.0-wmf.5  refs [[phab:T293947|T293947]]
* 14:05 twentyafterfour: rolling back group1 wikis to 1.38.0-wmf.5 ([[phab:T293947|T293947]]) due to UBN [[phab:T294559|T294559]]
* 14:00 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 13:59 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 13:42 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1021.eqiad.wmnet with OS bullseye
* 13:42 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcephosd1021.eqiad.wmnet with OS bullseye
* 13:40 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1021.eqiad.wmnet with OS bullseye
* 13:40 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcephosd1021.eqiad.wmnet with OS bullseye
* 13:40 bblack: esams: switching unified TLS cert to digicert-2021 (natural rollout over next ~30 mins)
* 13:33 XioNoX: rollback prepend our AS on all es/knams uplinks (except NaWas) - [[phab:T288505|T288505]]
* 13:23 volans@cumin2002: END (PASS) - Cookbook sre.hosts.ipmi-password-reset (exit_code=0)
* 13:23 volans@cumin2002: Updating IPMI password on 1 hosts - volans@cumin2002 - [[phab:T283050|T283050]]
* 13:22 volans@cumin2002: START - Cookbook sre.hosts.ipmi-password-reset
* 12:48 urbanecm: [urbanecm@mwmaint1002 ~/foundationwiki-sul]$ mwscript extensions/CentralAuth/maintenance/migrateAccount.php --wiki=foundationwiki --safe --auto --userlist users.txt # [[phab:T205347|T205347]], users.txt P17629
* 12:45 urbanecm: [urbanecm@mwmaint1002 ~/foundationwiki-sul]$ mwscript extensions/CentralAuth/maintenance/migrateAccount.php --wiki=foundationwiki --safe --auto --userlist users.txt # [[phab:T205347|T205347]], users.txt is at P17628
* 12:40 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host cloudcephosd1021.eqiad.wmnet with OS bullseye
* 12:35 urbanecm: [urbanecm@mwmaint1002 ~/foundationwiki-sul]$ mwscript extensions/CentralAuth/maintenance/attachAccount.php --wiki=foundationwiki --userlist users.txt # [[phab:T205347|T205347]]; users.txt is at P17627
* 12:27 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/CentralAuth/maintenance/migrateAccount.php --wiki=foundationwiki --safe --auto --userlist users.txt # [[phab:T205347|T205347]], users.txt is P17626
* 12:25 XioNoX: prepend our AS on all es/knams uplinks (except NaWas) - [[phab:T288505|T288505]]
* 12:20 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:17 urbanecm@deploy1002: Synchronized wmf-config/config/foundationwiki.yaml: {{Gerrit|6f557db4ad82aa3ec80550a423d538099bf305fa}}: Connect foundationwiki to SUL ([[phab:T205347|T205347]]; 3/3) (duration: 01m 02s)
* 12:17 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:16 urbanecm@deploy1002: Synchronized dblists/fishbowl.dblist: {{Gerrit|6f557db4ad82aa3ec80550a423d538099bf305fa}}: Connect foundationwiki to SUL ([[phab:T205347|T205347]]; 2/3) (duration: 01m 03s)
* 12:14 urbanecm@deploy1002: Synchronized wmf-config/CommonSettings.php: {{Gerrit|6f557db4ad82aa3ec80550a423d538099bf305fa}}: Connect foundationwiki to SUL ([[phab:T205347|T205347]]; 1/3) (duration: 01m 03s)
* 12:11 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|6f557db4ad82aa3ec80550a423d538099bf305fa}}: Connect foundationwiki to SUL ([[phab:T205347|T205347]]) (duration: 01m 03s)
* 12:07 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:04 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:03 urbanecm: urbanecm@mwmaint1002:~$ mwscript extensions/CentralAuth/maintenance/migratePass0.php --wiki=foundationwiki # [[phab:T205347|T205347]], with 717506 pulled to mwmaint1002
* 12:02 XioNoX: advertise esams prefix to NaWas - [[phab:T288505|T288505]]
* 11:45 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:41 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:40 volans@cumin2002: END (PASS) - Cookbook sre.hosts.ipmi-password-reset (exit_code=0)
* 11:39 volans@cumin2002: Updating IPMI password on 1 hosts - volans@cumin2002 - [[phab:T283050|T283050]]
* 11:39 volans@cumin2002: START - Cookbook sre.hosts.ipmi-password-reset
* 11:37 volans@cumin2002: END (FAIL) - Cookbook sre.hosts.ipmi-password-reset (exit_code=1)
* 11:37 volans@cumin2002: START - Cookbook sre.hosts.ipmi-password-reset
* 11:32 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:29 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|77df86abd173cec7f1a87c85fc65ce5faca329cb}}: Add permissions to eleminators on viwiki ([[phab:T294530|T294530]]) (duration: 01m 04s)
* 11:29 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:06 volans@cumin2002: END (PASS) - Cookbook sre.hosts.ipmi-password-reset (exit_code=0)
* 11:06 volans@cumin2002: Updating IPMI password on 0 hosts - volans@cumin2002 - [[phab:T283050|T283050]]
* 11:05 volans@cumin2002: START - Cookbook sre.hosts.ipmi-password-reset
* 11:05 volans@cumin2002: END (FAIL) - Cookbook sre.hosts.ipmi-password-reset (exit_code=99)
* 11:05 volans@cumin2002: START - Cookbook sre.hosts.ipmi-password-reset
* 09:28 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 09:25 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 09:19 urbanecm: Wiki creation done. pwnwiki, amiwiki and lmowiktionary got created.
* 09:19 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|532f8e5d476a1e2d5d953371a07f0aeb8c01bbaf}}: lmowiktionary: Create Appendix namespace ([[phab:T291390|T291390]]) (duration: 01m 03s)
* 09:15 urbanecm@deploy1002: Synchronized wmf-config/interwiki.php: Update interwiki cache (duration: 02m 06s)
* 09:15 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 09:13 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Creating lmowiktionary ([[phab:T291390|T291390]]) (duration: 01m 02s)
* 09:13 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/CirrusSearch/maintenance/UpdateSearchIndexConfig.php --wiki=lmowiktionary --cluster=all # [[phab:T291390|T291390]]
* 09:12 urbanecm@deploy1002: Synchronized wmf-config/logos.php: Creating lmowiktionary ([[phab:T291390|T291390]]) (duration: 01m 02s)
* 09:12 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 09:11 urbanecm@deploy1002: Synchronized static/images/project-logos/: Creating lmowiktionary ([[phab:T291390|T291390]]) (duration: 01m 02s)
* 09:10 urbanecm@deploy1002: rebuilt and synchronized wikiversions files: Creating lmowiktionary ([[phab:T291390|T291390]])
* 09:09 urbanecm@deploy1002: Synchronized dblists: Creating lmowiktionary ([[phab:T291390|T291390]]) (duration: 01m 02s)
* 09:08 urbanecm@deploy1002: Synchronized wmf-config/db-codfw.php: Creating lmowiktionary ([[phab:T291390|T291390]]) (duration: 01m 02s)
* 09:07 urbanecm@deploy1002: Synchronized wmf-config/db-eqiad.php: Creating lmowiktionary ([[phab:T291390|T291390]]) (duration: 01m 02s)
* 09:02 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:59 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:52 urbanecm@deploy1002: Synchronized langlist: Creating amiwiki ([[phab:T292414|T292414]]) (duration: 01m 02s)
* 08:51 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Creating amiwiki ([[phab:T292414|T292414]]) (duration: 01m 02s)
* 08:50 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/CirrusSearch/maintenance/UpdateSearchIndexConfig.php --wiki=amiwiki --cluster=all # [[phab:T292414|T292414]]
* 08:50 urbanecm@deploy1002: rebuilt and synchronized wikiversions files: Creating amiwiki ([[phab:T292414|T292414]])
* 08:50 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:49 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/CirrusSearch/maintenance/UpdateSearchIndexConfig.php --wiki=pwnwiki --cluster=all # [[phab:T292415|T292415]]
* 08:49 urbanecm@deploy1002: Synchronized dblists: Creating amiwiki ([[phab:T292414|T292414]]) (duration: 01m 02s)
* 08:48 urbanecm@deploy1002: Synchronized wmf-config/db-codfw.php: Creating amiwiki ([[phab:T292414|T292414]]) (duration: 01m 02s)
* 08:47 urbanecm@deploy1002: Synchronized wmf-config/db-eqiad.php: Creating amiwiki ([[phab:T292414|T292414]]) (duration: 01m 02s)
* 08:46 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:43 urbanecm@deploy1002: Synchronized langlist: Creating pwnwiki ([[phab:T292415|T292415]]) (duration: 01m 02s)
* 08:41 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Creating pwnwiki ([[phab:T292415|T292415]]) (duration: 01m 03s)
* 08:40 urbanecm@deploy1002: rebuilt and synchronized wikiversions files: Creating pwnwiki ([[phab:T292415|T292415]])
* 08:39 urbanecm@deploy1002: Synchronized dblists: Creating pwnwiki ([[phab:T292415|T292415]]) (duration: 01m 02s)
* 08:38 urbanecm@deploy1002: Synchronized wmf-config/db-codfw.php: Creating pwnwiki ([[phab:T292415|T292415]]) (duration: 01m 03s)
* 08:37 urbanecm@deploy1002: Synchronized wmf-config/db-eqiad.php: Creating pwnwiki ([[phab:T292415|T292415]]) (duration: 01m 03s)
* 08:37 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:34 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:30 ema: A:cp start rolling varnish upgrades to 6.0.8-1wm2 [[phab:T293879|T293879]]
* 08:20 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 08:20 ayounsi@cumin1001: START - Cookbook sre.network.cf
* 07:32 ema@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp5011.eqsin.wmnet
* 06:38 elukey: depool cp5011 and restart varnish-frontend (ABI errors while reloading after digicert changes)
* 06:17 XioNoX: Remove GRE tunnel between cr4-ulsfo and cr2-eqsin - [[phab:T273308|T273308]]
* 05:00 marostegui@cumin1001: dbctl commit (dc=all): 'Remove contributions replicas from s6 eqiad [[phab:T263127|T263127]]', diff saved to https://phabricator.wikimedia.org/P17622 and previous config saved to /var/cache/conftool/dbconfig/20211028-050052-marostegui.json


== 2021-10-27 ==
== 2022-09-28 ==
* 23:55 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:53 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host logstash2037.codfw.wmnet with OS buster
* 23:51 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:52 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['logstash2037']
* 23:42 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:51 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['logstash2037']
* 23:40 catrope@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Allow upload by URL for Wikisources ([[phab:T293205|T293205]]), and enable it on enwikisource for autoconfirmed ([[phab:T294447|T294447]]) (duration: 01m 03s)
* 23:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1134 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35103 and previous config saved to /var/cache/conftool/dbconfig/20220928-231719-ladsgroup.json
* 23:39 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1134.eqiad.wmnet with reason: Maintenance
* 23:29 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1134.eqiad.wmnet with reason: Maintenance
* 23:28 catrope@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:734361{{!}}Add mobile wordmark for Meetei (Manipuri) Wikipedia to config (T294189)]] (duration: 01m 02s)
* 22:20 ejegg: updated fundraising CiviCRM from {{Gerrit|d31c19a0}} to {{Gerrit|f3461a44}}
* 23:27 catrope@deploy1002: Synchronized static/images/mobile/copyright/wikipedia-wordmark-mni.svg: Config: [[gerrit:734361{{!}}Add mobile wordmark for Meetei (Manipuri) Wikipedia to config (T294189)]] (duration: 01m 03s)
* 21:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2170:3311 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35102 and previous config saved to /var/cache/conftool/dbconfig/20220928-213701-ladsgroup.json
* 23:26 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:36 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2170.codfw.wmnet with reason: Maintenance
* 22:41 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:36 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2170.codfw.wmnet with reason: Maintenance
* 22:38 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:36 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3311 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P35101 and previous config saved to /var/cache/conftool/dbconfig/20220928-213640-ladsgroup.json
* 22:13 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:21 ladsgroup@cumin1001
* 22:10 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:06 reedy@deploy1002: Synchronized php-1.38.0-wmf.6/extensions/SecurePoll/includes/Crypt/GpgCrypt.php: [[phab:T294489|T294489]] (duration: 01m 15s)
* 21:42 urbanecm: [urbanecm@mwmaint1002 ~]$ foreachwikiindblist wikipedia namespaceDupes.php --fix {{!}} tee namespacedupes-wikipedia-real.log # run namespaceDupes.php for all Wikipedias
* 21:38 urbanecm: run namespaceDupes.php for a bunch of Wikipedias
* 20:55 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:52 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:51 reedy@deploy1002: Synchronized wmf-config/CommonSettings.php: [[phab:T294489|T294489]] (duration: 01m 59s)
* 20:51 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw2255.codfw.wmnet
* 20:47 mutante: mw2255 - scap pull, repooling - after DRAC firmware was upgraded - [[phab:T283582|T283582]]
* 20:47 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw2255.codfw.wmnet
* 19:53 bblack: cp5xxx: switching unified cert to digicert-2021
* 19:49 bblack: cp5007: switching unified cert to digicert-2021
* 19:42 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:39 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:36 twentyafterfour@deploy1002: Synchronized php: group1 wikis to 1.38.0-wmf.6  refs [[phab:T293947|T293947]] (duration: 01m 47s)
* 19:34 twentyafterfour@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.38.0-wmf.6  refs [[phab:T293947|T293947]]
* 19:28 bblack: cp5001: switching unified cert to digicert-2021
* 19:24 bblack: cp5xxx: disabling puppet ahead of digicert unified certificate update rollout
* 18:46 legoktm: installing python-swiftclient on mw1305 for debugging
* 18:29 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:26 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 18:25 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:23 bd808@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 18:22 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 18:16 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:15 catrope@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:734697{{!}}Disable Education Program namespaces in eswiki (T294365)]] (duration: 01m 04s)
* 18:13 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:10 catrope@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:734451{{!}}Temporarily change the votewiki lang to fa (T292685)]] (duration: 01m 04s)
* 17:40 otto@deploy1002: Finished deploy [analytics/refinery@0d79e18] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@0d79e18


== 2021-10-09 ==
== 2022-09-27 ==
* 05:01 jiji@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:16 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-wf1002.eqiad.wmnet with OS bullseye
* 04:28 jiji@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:13 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc-wf1001.eqiad.wmnet with OS bullseye
* 01:32 ryankemper@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 22:02 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-wf1002.eqiad.wmnet with reason: host reimage
* 00:46 mutante: ms-be2045 - started systemd-timedated which had been killed by something
* 21:58 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc-wf1001.eqiad.wmnet with reason: host reimage
* 00:28 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 21:58 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-wf1002.eqiad.wmnet with reason: host reimage
* 00:24 ryankemper@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.force-unfreeze (exit_code=99)
* 21:55 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on mc-wf1001.eqiad.wmnet with reason: host reimage
* 00:23 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.force-unfreeze
* 21:47 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host mc-wf1002.eqiad.wmnet with OS bullseye
* 00:13 ryankemper: [[phab:T292814|T292814]] Write queue stuck at 133 events in partition 1 of topic `codfw.cpjobqueue.partitioned.mediawiki.job.cirrusSearchElasticaWrite`, will try again at another time
* 21:44 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host mc-wf1001.eqiad.wmnet with OS bullseye
* 00:12 ryankemper@cumin1001: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 21:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1119 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34971 and previous config saved to /var/cache/conftool/dbconfig/20220927-213028-ladsgroup.json
* 21:30 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1119.eqiad.wmnet with reason: Maintenance
* 21:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1119.eqiad.wmnet with reason: Maintenance
* 21:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1118 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34970 and previous config saved to /var/cache/conftool/dbconfig/20220927-213006-ladsgroup.json
* 21:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 21:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1118', diff saved to https://phabricator.wikimedia.org/P34969 and previous config saved to /var/cache/conftool/dbconfig/20220927-211500-ladsgroup.json
* 21:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 21:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:14 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 21:12 TheresNoTime: closing UTC late backport window
* 21:10 samtar@deploy1002: Finished scap: Backport for [[gerrit:835593{{!}}Remove figures from text extracts (T318727)]] (duration: 04m 53s)
* 21:09 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 21:08 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 21:08 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:06 samtar@deploy1002: samtar and ssastry: Backport for [[gerrit:835593{{!}}Remove figures from text extracts (T318727)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 21:06 samtar@deploy1002: Started scap: Backport for [[gerrit:835593{{!}}Remove figures from text extracts (T318727)]]
* 21:06 samtar@deploy1002: Finished scap: Backport for [[gerrit:835594{{!}}Remove figures from text extracts (T318727)]] (duration: 06m 58s)
* 21:03 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1118', diff saved to https://phabricator.wikimedia.org/P34968 and previous config saved to /var/cache/conftool/dbconfig/20220927-205953-ladsgroup.json
* 20:59 TheresNoTime: extending UTC late backport window
* 20:58 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:58 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc-wf1001.mgmt.eqiad.wmnet with reboot policy FORCED
* 20:58 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mc-wf1002.mgmt.eqiad.wmnet with reboot policy FORCED
* 20:58 samtar@deploy1002: samtar and ssastry: Backport for [[gerrit:835594{{!}}Remove figures from text extracts (T318727)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet
* 20:58 samtar@deploy1002: Started scap: Backport for [[gerrit:835594{{!}}Remove figures from text extracts (T318727)]]
* 20:57 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:57 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:56 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:53 samtar@deploy1002: Finished scap: Backport for [[gerrit:835681{{!}}romdwikimedia: Enable subpages in NS0 (T318491)]] (duration: 05m 29s)
* 20:51 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:50 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:50 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:48 samtar@deploy1002: samtar and stang: Backport for [[gerrit:835681{{!}}romdwikimedia: Enable subpages in NS0 (T318491)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
* 20:48 samtar@deploy1002: Started scap: Backport for [[gerrit:835681{{!}}romdwikimedia: Enable subpages in NS0 (T318491)]]
* 20:46 samtar@deploy1002: Finished scap: Backport for [[gerrit:833860{{!}}elastic: rebalance enwiki_content shard counts (T318270)]] (duration: 05m 14s)
* 20:45 cmjohnson@cumin1001: START - Cookbook sre.hosts.provision for host mc-wf1002.mgmt.eqiad.wmnet with reboot policy FORCED
* 20:45 cmjohnson@cumin1001: START - Cookbook sre.hosts.provision for host mc-wf1001.mgmt.eqiad.wmnet with reboot policy FORCED
* 20:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1118 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34967 and previous config saved to /var/cache/conftool/dbconfig/20220927-204446-ladsgroup.json
* 20:43 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:41 samtar@deploy1002: samtar and ryankemper: Backport for [[gerrit:833860{{!}}elastic: rebalance enwiki_content shard counts (T318270)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
* 20:41 samtar@deploy1002: Started scap: Backport for [[gerrit:833860{{!}}elastic: rebalance enwiki_content shard counts (T318270)]]
* 20:38 samtar@deploy1002: Finished scap: Backport for [[gerrit:835689{{!}}Add wmgMFDefaultEditor back in for future use]] (duration: 06m 02s)
* 20:38 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:35 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:34 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:34 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:33 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:33 samtar@deploy1002: samtar and kemayo: Backport for [[gerrit:835689{{!}}Add wmgMFDefaultEditor back in for future use]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
* 20:32 samtar@deploy1002: Started scap: Backport for [[gerrit:835689{{!}}Add wmgMFDefaultEditor back in for future use]]
* 20:30 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 20:28 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:24 samtar@deploy1002: Started scap: Backport for [[gerrit:835206{{!}}Disable MobileFrontend default editor a/b test (T302356)]]
* 20:24 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:24 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:22 samtar@deploy1002: Started scap: Backport for [[gerrit:835206{{!}}Disable MobileFrontend default editor a/b test (T302356)]]
* 20:20 samtar@deploy1002: Finished scap: Backport for [[gerrit:835648{{!}}Enable DiscussionTools reply button visual enhancements on cswiki+huwiki (T315626)]] (duration: 04m 58s)
* 20:20 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:15 samtar@deploy1002: samtar and kemayo: Backport for [[gerrit:835648{{!}}Enable DiscussionTools reply button visual enhancements on cswiki+huwiki (T315626)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 20:15 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host centrallog1002.eqiad.wmnet with OS bullseye
* 20:15 samtar@deploy1002: Started scap: Backport for [[gerrit:835648{{!}}Enable DiscussionTools reply button visual enhancements on cswiki+huwiki (T315626)]]
* 20:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:13 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:10 samtar@deploy1002: Finished scap: Backport for [[gerrit:835635{{!}}MobileWebUIActions sample rate to 1 on testwiki (T302108)]] (duration: 05m 46s)
* 20:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:07 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:06 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:04 samtar@deploy1002: samtar and kemayo: Backport for [[gerrit:835635{{!}}MobileWebUIActions sample rate to 1 on testwiki (T302108)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet
* 20:04 samtar@deploy1002: Started scap: Backport for [[gerrit:835635{{!}}MobileWebUIActions sample rate to 1 on testwiki (T302108)]]
* 20:02 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on centrallog1002.eqiad.wmnet with reason: host reimage
* 19:59 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on centrallog1002.eqiad.wmnet with reason: host reimage
* 19:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2145 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34966 and previous config saved to /var/cache/conftool/dbconfig/20220927-194908-ladsgroup.json
* 19:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2145.codfw.wmnet with reason: Maintenance
* 19:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2145.codfw.wmnet with reason: Maintenance
* 19:48 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host centrallog1002.eqiad.wmnet with OS bullseye
* 18:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 18:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 18:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 18:09 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 18:09 brennen@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.40.0-wmf.3  refs [[phab:T314192|T314192]]
* 18:02 brennen: 1.40.0-wmf.3 ([[phab:T314192|T314192]]) no current blockers, promoting to group0
* 17:50 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cloudvirt-wdqs1001.eqiad.wmnet
* 17:50 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cloudvirt-wdqs1002.eqiad.wmnet
* 17:49 dduvall@deploy1002: helmfile [eqiad] DONE helmfile.d/services/blubberoid: apply
* 17:48 dduvall@deploy1002: helmfile [eqiad] START helmfile.d/services/blubberoid: apply
* 17:48 dduvall@deploy1002: helmfile [codfw] DONE helmfile.d/services/blubberoid: apply
* 17:48 dduvall@deploy1002: helmfile [codfw] START helmfile.d/services/blubberoid: apply
* 17:47 dduvall@deploy1002: helmfile [staging] DONE helmfile.d/services/blubberoid: apply
* 17:47 dduvall@deploy1002: helmfile [staging] START helmfile.d/services/blubberoid: apply
* 17:39 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudvirt-wdqs1001.eqiad.wmnet
* 17:38 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudvirt-wdqs1002.eqiad.wmnet
* 17:38 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cloudvirt-wdqs1003.eqiad.wmnet
* 17:29 jbond@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts sretest[1001-1002].eqiad.wmnet
* 17:28 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest[1001-1002].eqiad.wmnet
* 17:26 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudvirt-wdqs1003.eqiad.wmnet
* 17:19 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cloudvirt-wdqs1003.eqiad.wmnet
* 17:08 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudvirt-wdqs1003.eqiad.wmnet
* 14:56 mforns@deploy1002: Finished deploy [airflow-dags/analytics@25dda27]: (no justification provided) (duration: 00m 11s)
* 14:56 mforns@deploy1002: Started deploy [airflow-dags/analytics@25dda27]: (no justification provided)
* 14:38 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2141.codfw.wmnet with reason: Maintenance
* 14:38 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2141.codfw.wmnet with reason: Maintenance
* 14:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34958 and previous config saved to /var/cache/conftool/dbconfig/20220927-143831-ladsgroup.json
* 14:35 pt1979@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host logstash2036.codfw.wmnet with OS buster
* 14:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1118 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34957 and previous config saved to /var/cache/conftool/dbconfig/20220927-143109-ladsgroup.json
* 14:31 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 14:30 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1118.eqiad.wmnet with reason: Maintenance
* 14:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1107 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34956 and previous config saved to /var/cache/conftool/dbconfig/20220927-143047-ladsgroup.json
* 14:26 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host logstash2036.codfw.wmnet with OS buster
* 14:25 Lucas_WMDE: END lucaswerkmeister-wmde@mwmaint1002:~$ PHP=php7.4 mwscript updateCollation.php incubatorwiki --force # [[phab:T315552|T315552]], 710183 rows done
* 14:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P34955 and previous config saved to /var/cache/conftool/dbconfig/20220927-142324-ladsgroup.json
* 14:23 mforns@deploy1002: Finished deploy [airflow-dags/analytics@66dfa44]: (no justification provided) (duration: 00m 46s)
* 14:22 mforns@deploy1002: Started deploy [airflow-dags/analytics@66dfa44]: (no justification provided)
* 14:17 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 14:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1107', diff saved to https://phabricator.wikimedia.org/P34954 and previous config saved to /var/cache/conftool/dbconfig/20220927-141541-ladsgroup.json
* 14:13 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 14:13 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 14:13 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 14:11 Lucas_WMDE: BEGIN lucaswerkmeister-wmde@mwmaint1002:~$ PHP=php7.4 mwscript updateCollation.php incubatorwiki --force # [[phab:T315552|T315552]]
* 14:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130', diff saved to https://phabricator.wikimedia.org/P34953 and previous config saved to /var/cache/conftool/dbconfig/20220927-140817-ladsgroup.json
* 14:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 14:06 taavi@deploy1002: Finished scap: Backport for [[gerrit:835590{{!}}Track use of Searchbox footer on Wikidata (T306933)]], [[gerrit:835591{{!}}Track use of Searchbox footer on Wikidata (T306933)]] (duration: 06m 59s)
* 14:04 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 14:04 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 14:00 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1107', diff saved to https://phabricator.wikimedia.org/P34952 and previous config saved to /var/cache/conftool/dbconfig/20220927-140034-ladsgroup.json
* 14:00 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:59 taavi@deploy1002: taavi and migr: Backport for [[gerrit:835590{{!}}Track use of Searchbox footer on Wikidata (T306933)]], [[gerrit:835591{{!}}Track use of Searchbox footer on Wikidata (T306933)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet
* 13:59 taavi@deploy1002: Started scap: Backport for [[gerrit:835590{{!}}Track use of Searchbox footer on Wikidata (T306933)]], [[gerrit:835591{{!}}Track use of Searchbox footer on Wikidata (T306933)]]
* 13:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2130 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34951 and previous config saved to /var/cache/conftool/dbconfig/20220927-135310-ladsgroup.json
* 13:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1107 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34950 and previous config saved to /var/cache/conftool/dbconfig/20220927-134528-ladsgroup.json
* 12:42 klausman@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 12:36 klausman@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 12:31 klausman@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 12:28 klausman@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 12:26 klausman@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 12:23 klausman@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 12:20 klausman@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 12:18 klausman@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:15 klausman@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 11:58 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 11:57 jbond: upload new wmf-laptop_0.5.4 package
* 11:52 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 11:51 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 11:45 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 11:40 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 11:39 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 11:39 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 11:38 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 11:28 volans@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host logstash2037.mgmt.codfw.wmnet with reboot policy FORCED
* 10:58 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ms-be[1028-1033,1035-1039].eqiad.wmnet
* 10:58 mvernon@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:57 mvernon@cumin1001: START - Cookbook sre.dns.netbox
* 10:55 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ms-be[2028-2039].codfw.wmnet
* 10:55 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:52 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 10:38 jbond@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts sretest1002.eqiad.wmnet
* 10:38 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1002.eqiad.wmnet
* 10:16 mvernon@cumin1001: START - Cookbook sre.hosts.decommission for hosts ms-be[1028-1033,1035-1039].eqiad.wmnet
* 10:14 mvernon@cumin2002: START - Cookbook sre.hosts.decommission for hosts ms-be[2028-2039].codfw.wmnet
* 10:11 jbond@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts sretest1002.eqiad.wmnet
* 10:11 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1002.eqiad.wmnet
* 10:10 mvernon@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99) for hosts ms-be[1028-1033,1035-1039].eqiad.wmnet
* 10:06 mvernon@cumin1001: START - Cookbook sre.hosts.decommission for hosts ms-be[1028-1033,1035-1039].eqiad.wmnet
* 10:03 moritzm: rebalance ganeti/codfw row D after completed Bullseye update [[phab:T311686|T311686]]
* 09:14 volans@cumin2002: START - Cookbook sre.hosts.provision for host logstash2037.mgmt.codfw.wmnet with reboot policy FORCED
* 09:13 volans@cumin2002: END (ERROR) - Cookbook sre.hosts.provision (exit_code=97) for host logstash2037.mgmt.codfw.wmnet with reboot policy FORCED
* 09:12 volans@cumin2002: START - Cookbook sre.hosts.provision for host logstash2037.mgmt.codfw.wmnet with reboot policy FORCED
* 08:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2130 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34942 and previous config saved to /var/cache/conftool/dbconfig/20220927-082023-ladsgroup.json
* 08:20 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2130.codfw.wmnet with reason: Maintenance
* 08:20 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2130.codfw.wmnet with reason: Maintenance
* 08:20 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34941 and previous config saved to /var/cache/conftool/dbconfig/20220927-082001-ladsgroup.json
* 08:15 moritzm: restarting apache/FPM on mw canaries to pick up Expat security updates
* 08:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P34938 and previous config saved to /var/cache/conftool/dbconfig/20220927-080454-ladsgroup.json
* 08:00 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.thumbor (exit_code=0) rolling restart_daemons on A:thumbor-eqiad
* 07:58 jmm@cumin2002: START - Cookbook sre.misc-clusters.thumbor rolling restart_daemons on A:thumbor-eqiad
* 07:57 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.thumbor (exit_code=0) rolling restart_daemons on A:thumbor-codfw
* 07:54 jmm@cumin2002: START - Cookbook sre.misc-clusters.thumbor rolling restart_daemons on A:thumbor-codfw
* 07:52 XioNoX: upgrade python3-pynetbox to 6.6.0 on cumin1001 - [[phab:T310745|T310745]]
* 07:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116', diff saved to https://phabricator.wikimedia.org/P34937 and previous config saved to /var/cache/conftool/dbconfig/20220927-074948-ladsgroup.json
* 07:49 XioNoX: upgrade python3-pynetbox to 6.6.0 on cumin2002 - [[phab:T310745|T310745]]
* 07:48 moritzm: installing expat security updates on stretch/buster/bullseye
* 07:39 moritzm: uploaded expat 2.2.0-2+deb9u5+wmf1 to apt.wikimedia.org/stretch-wikimedia
* 07:36 jayme: published image docker-registry.discovery.wmnet/golang1.18:1.18-1
* 07:35 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1107 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34936 and previous config saved to /var/cache/conftool/dbconfig/20220927-073523-ladsgroup.json
* 07:35 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1107.eqiad.wmnet with reason: Maintenance
* 07:34 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1107.eqiad.wmnet with reason: Maintenance
* 07:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34935 and previous config saved to /var/cache/conftool/dbconfig/20220927-073451-ladsgroup.json
* 07:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2116 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34934 and previous config saved to /var/cache/conftool/dbconfig/20220927-073441-ladsgroup.json
* 07:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P34933 and previous config saved to /var/cache/conftool/dbconfig/20220927-071938-ladsgroup.json
* 07:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P34932 and previous config saved to /var/cache/conftool/dbconfig/20220927-070431-ladsgroup.json
* 06:59 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'show' for AS: 8220
* 06:58 ayounsi@cumin1001: START - Cookbook sre.network.peering with action 'show' for AS: 8220
* 06:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34930 and previous config saved to /var/cache/conftool/dbconfig/20220927-064925-ladsgroup.json
* 05:28 marostegui: Install 10.6.10 on db1124, db1125, pc1014, pc2014 [[phab:T318128|T318128]]
* 03:57 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 03:51 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 03:51 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 03:43 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 03:40 mwpresync@deploy1002: Pruned MediaWiki: 1.40.0-wmf.1 (duration: 02m 03s)
* 03:38 mwpresync@deploy1002: Finished scap: testwikis wikis to 1.40.0-wmf.3  refs [[phab:T314192|T314192]] (duration: 36m 01s)
* 03:07 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 03:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 03:07 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 03:06 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 03:02 mwpresync@deploy1002: Started scap: testwikis wikis to 1.40.0-wmf.3  refs [[phab:T314192|T314192]]
* 02:35 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 02:34 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 02:34 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 02:32 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 02:06 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 02:05 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 02:05 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 02:04 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 02:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2116 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34928 and previous config saved to /var/cache/conftool/dbconfig/20220927-020124-ladsgroup.json
* 02:01 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2116.codfw.wmnet with reason: Maintenance
* 02:01 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2116.codfw.wmnet with reason: Maintenance
* 02:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34927 and previous config saved to /var/cache/conftool/dbconfig/20220927-020103-ladsgroup.json
* 01:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103', diff saved to https://phabricator.wikimedia.org/P34926 and previous config saved to /var/cache/conftool/dbconfig/20220927-014556-ladsgroup.json
* 01:30 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103', diff saved to https://phabricator.wikimedia.org/P34925 and previous config saved to /var/cache/conftool/dbconfig/20220927-013050-ladsgroup.json
* 01:17 eileen: civicrm upgraded from {{Gerrit|dcef393d}} to {{Gerrit|e198fb4c}}
* 01:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2103 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34924 and previous config saved to /var/cache/conftool/dbconfig/20220927-011543-ladsgroup.json
* 00:50 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1007.wikimedia.org
* 00:42 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cloudcontrol1006.wikimedia.org
* 00:40 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1007.wikimedia.org
* 00:32 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cloudcontrol1005.wikimedia.org
* 00:31 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1006.wikimedia.org
* 00:16 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1005.wikimedia.org
* 00:15 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host cloudnet1005.eqiad.wmnet
* 00:15 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudnet1005.eqiad.wmnet
* 00:13 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host cloudnet1005.eqiad.wmnet
* 00:13 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudnet1005.eqiad.wmnet
* 00:05 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1106 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34923 and previous config saved to /var/cache/conftool/dbconfig/20220927-000525-ladsgroup.json
* 00:05 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 00:04 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices1005.wikimedia.org
* 00:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on clouddb[1013,1017,1021].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 00:04 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1106.eqiad.wmnet with reason: Maintenance
* 00:04 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1106.eqiad.wmnet with reason: Maintenance
* 00:04 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34922 and previous config saved to /var/cache/conftool/dbconfig/20220927-000434-ladsgroup.json


== 2021-10-08 ==
== 2022-09-26 ==
* 23:16 legoktm: sudo cumin -b 10 C:mediawiki::packages 'apt-get purge lilypond-data -y'
* 23:56 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudservices1005.wikimedia.org
* 23:10 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 23:49 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P34921 and previous config saved to /var/cache/conftool/dbconfig/20220926-234928-ladsgroup.json
* 21:38 mutante: mwmaint2002 - disable-puppet, stop bacula-fd, recovery in progress
* 23:34 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P34920 and previous config saved to /var/cache/conftool/dbconfig/20220926-233422-ladsgroup.json
* 21:34 mutante: disabling puppet on bacula - going through a restore https://wikitech.wikimedia.org/wiki/Bacula#Restore_from_a_non-existent_host_(missing_private_key)
* 23:34 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cloudservices1004.wikimedia.org
* 21:30 legoktm: running puppet across C:mediawiki::packages to uninstall lilypond and ploticus: legoktm@cumin1001:~$ sudo cumin -b 4 C:mediawiki::packages 'run-puppet-agent'
* 23:21 andrew@cumin1001: START - Cookbook sre.hosts.reboot-single for host cloudservices1004.wikimedia.org
* 20:12 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes1018.eqiad.wmnet with reason: REIMAGE
* 23:19 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34919 and previous config saved to /var/cache/conftool/dbconfig/20220926-231915-ladsgroup.json
* 20:10 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubestage1004.eqiad.wmnet with reason: REIMAGE
* 23:14 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2032.codfw.wmnet with OS bullseye
* 20:08 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes1018.eqiad.wmnet with reason: REIMAGE
* 22:59 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti2032.codfw.wmnet with reason: host reimage
* 20:08 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubestage1003.eqiad.wmnet with reason: REIMAGE
* 22:56 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti2032.codfw.wmnet with reason: host reimage
* 20:06 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubestage1004.eqiad.wmnet with reason: REIMAGE
* 22:37 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2032.codfw.wmnet with OS bullseye
* 20:05 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubestage1003.eqiad.wmnet with reason: REIMAGE
* 22:33 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2031.codfw.wmnet with OS bullseye
* 19:46 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes1020.eqiad.wmnet with reason: REIMAGE
* 22:18 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti2031.codfw.wmnet with reason: host reimage
* 19:45 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes1019.eqiad.wmnet with reason: REIMAGE
* 22:14 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti2031.codfw.wmnet with reason: host reimage
* 19:43 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes1020.eqiad.wmnet with reason: REIMAGE
* 21:39 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2031.codfw.wmnet with OS bullseye
* 19:42 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes1019.eqiad.wmnet with reason: REIMAGE
* 21:06 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host centrallog1002.mgmt.eqiad.wmnet with reboot policy FORCED
* 19:42 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes1018.eqiad.wmnet with reason: REIMAGE
* 20:41 cmjohnson@cumin1001: START - Cookbook sre.hosts.provision for host centrallog1002.mgmt.eqiad.wmnet with reboot policy FORCED
* 19:39 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes1018.eqiad.wmnet with reason: REIMAGE
* 20:39 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:15 cstone: civicrm revision changed from {{Gerrit|5cb7d487cb}} to {{Gerrit|598b59b0ee}}
* 20:37 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 16:19 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/GrowthExperiments/maintenance/updateMenteeData.php --wiki=enwiki --force # to measure performance on a large wiki
* 20:31 TheresNoTime: closing UTC late backport window
* 15:48 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 20:18 samtar@deploy1002: Finished scap: Backport for [[gerrit:835255{{!}}Fix VisualEditor on wikis where RESTBase was never set up (T318325)]] (duration: 06m 52s)
* 15:48 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 20:16 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 15:29 jelto: enable puppet on gitlab1001 again for [[phab:T283076|T283076]]
* 20:15 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 14:05 jiji@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:15 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 14:01 jiji@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:14 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 09:49 Amir1: wikiadmin@10.64.16.85(wikidatawiki)> delete from wb_changes_subscription where cs_subscriber_id in ('testcommonswiki', 'mowiki');
* 20:13 cmjohnson@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-logging1004.eqiad.wmnet with OS bullseye
* 09:39 Emperor: installing stress on ms-be2045 given recent h/w issues [[phab:T290881|T290881]]
* 20:11 samtar@deploy1002: samtar and matmarex: Backport for [[gerrit:835255{{!}}Fix VisualEditor on wikis where RESTBase was never set up (T318325)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet
* 08:20 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:11 samtar@deploy1002: Started scap: Backport for [[gerrit:835255{{!}}Fix VisualEditor on wikis where RESTBase was never set up (T318325)]]
* 08:12 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:10 samtar@deploy1002: Finished scap: Backport for [[gerrit:835245{{!}}wgMFMobileFormatterOptions: Set maxImages and maxHeadings to very high values (T317070)]] (duration: 06m 13s)
* 08:04 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/GrowthExperiments/maintenance/updateMenteeData.php --wiki=frwiki --force
* 20:09 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:43 Emperor: reboot ms-be2045 [[phab:T290881|T290881]]
* 20:08 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:41 gehel: manually resuming the data reloads on wdqs1009 and wdqs2008
* 20:07 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 06:42 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 20:06 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 06:42 ayounsi@cumin1001: START - Cookbook sre.network.cf
* 20:06 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['logstash2036']
* 06:28 ayounsi@cumin2002: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 20:06 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['logstash2036']
* 06:28 ayounsi@cumin2002: START - Cookbook sre.network.cf
* 20:06 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['logstash2036']
* 05:35 ryankemper@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 20:06 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['logstash2036']
* 04:56 ryankemper: [WDQS Deploy] Deploy complete. Successful test query placed on query.wikidata.org, there's no relevant criticals in Icinga, and Grafana looks good
* 20:05 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['ganeti2032']
* 04:32 ryankemper: [[phab:T292814|T292814]] Beginning rolling restart of `cloudelastic`: `sudo -i cookbook sre.elasticsearch.rolling-operation cloudelastic "cloudelastic restart" --nodes-per-run 1 --start-datetime 2021-10-08T03:53:49 --task-id [[phab:T292814|T292814]]` on `ryankemper@cumin1001` tmux `elastic`
* 20:05 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti2032']
* 04:31 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 20:05 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['ganeti2032']
* 04:29 ryankemper: [WDQS Deploy] Restarting `wdqs-categories` across lvs-managed hosts, one node at a time: `sudo -E cumin -b 1 'A:wdqs-all and not A:wdqs-test' 'depool && sleep 45 && systemctl restart wdqs-categories && sleep 45 && pool'`
* 20:05 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti2032']
* 04:28 ryankemper: [WDQS Deploy] Restarted `wdqs-categories` across both test hosts simultaneously: `sudo -E cumin 'A:wdqs-test' 'systemctl restart wdqs-categories'`
* 20:04 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['ganeti2031']
* 04:28 ryankemper: [WDQS Deploy] Restarted `wdqs-updater` across all hosts, 4 hosts at a time: `sudo -E cumin -b 4 'A:wdqs-all' 'systemctl restart wdqs-updater'`
* 20:04 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti2031']
* 04:23 ryankemper@deploy1002: Finished deploy [wdqs/wdqs@8f57a56]: 0.3.89 (duration: 08m 22s)
* 20:04 samtar@deploy1002: samtar and matmarex: Backport for [[gerrit:835245{{!}}wgMFMobileFormatterOptions: Set maxImages and maxHeadings to very high values (T317070)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet
* 04:20 ryankemper@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 20:03 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['ganeti2031']
* 04:20 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 20:03 samtar@deploy1002: Started scap: Backport for [[gerrit:835245{{!}}wgMFMobileFormatterOptions: Set maxImages and maxHeadings to very high values (T317070)]]
* 04:18 gehel@cumin1001: END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99)
* 20:03 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ganeti2031']
* 04:17 gehel@cumin1001: END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99)
* 19:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2103 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34918 and previous config saved to /var/cache/conftool/dbconfig/20220926-195019-ladsgroup.json
* 04:15 ryankemper: [WDQS Deploy] Tests passing following deploy of `0.3.89` on canary `wdqs1003`; proceeding to rest of fleet
* 19:50 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2103.codfw.wmnet with reason: Maintenance
* 04:14 ryankemper@deploy1002: Started deploy [wdqs/wdqs@8f57a56]: 0.3.89
* 19:50 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2103.codfw.wmnet with reason: Maintenance
* 04:14 ryankemper: [WDQS Deploy] Gearing up for deploy of wdqs `0.3.89`. Pre-deploy tests passing on canary `wdqs1003`
* 19:42 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host kafka-logging1004.eqiad.wmnet with OS bullseye
* 03:58 ryankemper@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 19:40 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kafka-logging1004.eqiad.wmnet with OS bullseye
* 03:58 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 19:40 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host kafka-logging1004.eqiad.wmnet with OS bullseye
* 02:04 Krinkle: krinkle@deploy1002$ echo 'https://en.wikipedia.org/static/images/project-logos/jvwiktionary.png' {{!}} mwscript purgeList.php , ref [[phab:T287425|T287425]], [[phab:T292810|T292810]]
* 19:04 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2184.codfw.wmnet with OS bullseye
* 00:07 tgr_: deploy window over
* 18:51 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2184.codfw.wmnet with reason: host reimage
* 00:05 tgr@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/GrowthExperiments: Backport: [[gerrit:727498{{!}}Mentee overview: Make UncachedMenteeOverviewDataProvider::getBlocksForUsers faster (T290609)]] (duration: 00m 56s)
* 18:49 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti2032.mgmt.codfw.wmnet with reboot policy FORCED
* 18:47 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2184.codfw.wmnet with reason: host reimage
* 18:29 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host db2184.codfw.wmnet with OS bullseye
* 18:27 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2183.codfw.wmnet with OS bullseye
* 18:18 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ganeti2032.mgmt.codfw.wmnet with reboot policy FORCED
* 18:17 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ganeti2031.mgmt.codfw.wmnet with reboot policy FORCED
* 18:13 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2183.codfw.wmnet with reason: host reimage
* 18:10 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on db2183.codfw.wmnet with reason: host reimage
* 17:57 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ganeti2031.mgmt.codfw.wmnet with reboot policy FORCED
* 17:53 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host logstash2036.mgmt.codfw.wmnet with reboot policy FORCED
* 17:42 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host db2183.codfw.wmnet with OS bullseye
* 17:31 volans@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti2032.mgmt.codfw.wmnet with reboot policy FORCED
* 17:30 volans@cumin2002: START - Cookbook sre.hosts.provision for host ganeti2032.mgmt.codfw.wmnet with reboot policy FORCED
* 17:30 volans@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti2031.mgmt.codfw.wmnet with reboot policy FORCED
* 17:29 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host logstash2036.mgmt.codfw.wmnet with reboot policy FORCED
* 17:29 volans@cumin2002: START - Cookbook sre.hosts.provision for host ganeti2031.mgmt.codfw.wmnet with reboot policy FORCED
* 17:28 volans@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host logstash2037.mgmt.codfw.wmnet with reboot policy FORCED
* 17:27 volans@cumin2002: START - Cookbook sre.hosts.provision for host logstash2037.mgmt.codfw.wmnet with reboot policy FORCED
* 17:27 volans@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host logstash2036.mgmt.codfw.wmnet with reboot policy FORCED
* 17:26 volans@cumin2002: START - Cookbook sre.hosts.provision for host logstash2036.mgmt.codfw.wmnet with reboot policy FORCED
* 17:16 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['db2184']
* 17:16 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2184']
* 17:15 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['db2183']
* 17:15 pt1979@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2183']
* 17:10 pt1979@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host logstash2037
* 17:09 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ganeti2031.mgmt.codfw.wmnet with reboot policy FORCED
* 17:08 pt1979@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host logstash2037
* 17:08 pt1979@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host logstash2036
* 17:07 pt1979@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host logstash2036
* 17:07 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:07 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host ganeti2031.mgmt.codfw.wmnet with reboot policy FORCED
* 17:05 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2184.mgmt.codfw.wmnet with reboot policy FORCED
* 17:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1105:3311 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34914 and previous config saved to /var/cache/conftool/dbconfig/20220926-170213-ladsgroup.json
* 17:02 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 17:01 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1105.eqiad.wmnet with reason: Maintenance
* 17:01 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34913 and previous config saved to /var/cache/conftool/dbconfig/20220926-170151-ladsgroup.json
* 17:01 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 17:00 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:57 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 16:56 pt1979@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti2032
* 16:56 pt1979@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti2032
* 16:55 pt1979@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ganeti2031
* 16:55 pt1979@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ganeti2031
* 16:52 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host db2184.mgmt.codfw.wmnet with reboot policy FORCED
* 16:46 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P34912 and previous config saved to /var/cache/conftool/dbconfig/20220926-164645-ladsgroup.json
* 16:35 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2183.mgmt.codfw.wmnet with reboot policy FORCED
* 16:31 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311', diff saved to https://phabricator.wikimedia.org/P34911 and previous config saved to /var/cache/conftool/dbconfig/20220926-163138-ladsgroup.json
* 16:26 volans@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2184.mgmt.codfw.wmnet with reboot policy FORCED
* 16:25 volans@cumin2002: START - Cookbook sre.hosts.provision for host db2184.mgmt.codfw.wmnet with reboot policy FORCED
* 16:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 100%: Maint Done', diff saved to https://phabricator.wikimedia.org/P34910 and previous config saved to /var/cache/conftool/dbconfig/20220926-162322-ladsgroup.json
* 16:22 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host db2183.mgmt.codfw.wmnet with reboot policy FORCED
* 16:16 elukey@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 16:16 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3311 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34909 and previous config saved to /var/cache/conftool/dbconfig/20220926-161632-ladsgroup.json
* 16:15 volans@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2183.mgmt.codfw.wmnet with reboot policy FORCED
* 16:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 75%: Maint Done', diff saved to https://phabricator.wikimedia.org/P34908 and previous config saved to /var/cache/conftool/dbconfig/20220926-160817-ladsgroup.json
* 16:07 elukey@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 16:04 volans@cumin2002: START - Cookbook sre.hosts.provision for host db2183.mgmt.codfw.wmnet with reboot policy FORCED
* 16:03 volans@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2183.mgmt.codfw.wmnet with reboot policy FORCED
* 15:58 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 15:57 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 15:57 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 15:55 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 15:54 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 15:53 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 15:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 25%: Maint Done', diff saved to https://phabricator.wikimedia.org/P34907 and previous config saved to /var/cache/conftool/dbconfig/20220926-155312-ladsgroup.json
* 15:52 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 15:51 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 15:47 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 15:43 volans@cumin2002: START - Cookbook sre.hosts.provision for host db2183.mgmt.codfw.wmnet with reboot policy FORCED
* 15:40 ladsgroup@deploy1002: Synchronized portals: Migrate wikiversity.org to the modern portals (duration: 03m 36s)
* 15:38 ladsgroup@cumin1001: dbctl commit (dc=all): 'db1166 (re)pooling @ 10%: Maint Done', diff saved to https://phabricator.wikimedia.org/P34906 and previous config saved to /var/cache/conftool/dbconfig/20220926-153807-ladsgroup.json
* 15:37 ladsgroup@deploy1002: Synchronized portals/wikipedia.org/assets: Migrate wikiversity.org to the modern portals (duration: 03m 49s)
* 14:49 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2102.codfw.wmnet with reason: Maintenance
* 14:48 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2102.codfw.wmnet with reason: Maintenance
* 13:59 aqu@deploy1002: Finished deploy [airflow-dags/analytics_test@a69b031]: Make Airflow jobs use Spark 3 on anlytics_test [airflow-dags@a69b031] (duration: 00m 09s)
* 13:59 aqu@deploy1002: Started deploy [airflow-dags/analytics_test@a69b031]: Make Airflow jobs use Spark 3 on anlytics_test [airflow-dags@a69b031]
* 13:56 moritzm: installing mako security updates
* 13:47 aqu@deploy1002: Finished deploy [airflow-dags/analytics@a69b031]: Make Airflow jobs use Spark 3 on anlytics [airflow-dags@a69b031] (duration: 00m 10s)
* 13:46 aqu@deploy1002: Started deploy [airflow-dags/analytics@a69b031]: Make Airflow jobs use Spark 3 on anlytics [airflow-dags@a69b031]
* 13:45 Lucas_WMDE: UTC afternoon backport+config window done
* 13:41 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.40.0-wmf.2/extensions/WikimediaIncubator/extension.json: Backport: [[gerrit:835130{{!}}Set default sortkey for prefixed pages (T315551)]] (2/2) (duration: 03m 39s)
* 13:40 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:39 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:39 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:38 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:37 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.40.0-wmf.2/extensions/WikimediaIncubator/includes/WikimediaIncubator.php: Backport: [[gerrit:835130{{!}}Set default sortkey for prefixed pages (T315551)]] (1/2) (duration: 03m 51s)
* 13:33 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:31 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:31 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:30 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:835127{{!}}Enable wgCiteResponsiveReferences on etwiki (T318530)]] (duration: 03m 53s)
* 13:30 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 12:59 awight@deploy1002: Finished deploy [kartotherian/deploy@d1bd7dc]: Enable geopoints on production (duration: 02m 40s)
* 12:56 awight@deploy1002: Started deploy [kartotherian/deploy@d1bd7dc]: Enable geopoints on production
* 12:54 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 12:53 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 12:53 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 12:51 moritzm: installing bind9 security updates on Bullseye
* 12:51 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 12:51 ladsgroup@deploy1002: Finished scap: Backport for [[gerrit:835169{{!}}Bump portals to HEAD (T273179)]] (duration: 06m 05s)
* 12:45 ladsgroup@deploy1002: ladsgroup and ladsgroup: Backport for [[gerrit:835169{{!}}Bump portals to HEAD (T273179)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet
* 12:44 ladsgroup@deploy1002: Started scap: Backport for [[gerrit:835169{{!}}Bump portals to HEAD (T273179)]]
* 12:25 moritzm: installing unzip security updates
* 10:43 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 10:43 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 10:25 elukey@deploy1002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 10:24 elukey@deploy1002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 10:04 btullis@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM matomo1002.eqiad.wmnet
* 09:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1166 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34904 and previous config saved to /var/cache/conftool/dbconfig/20220926-094812-ladsgroup.json
* 09:48 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 09:47 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 09:45 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2097.codfw.wmnet with reason: Maintenance
* 09:45 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1099:3311 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34903 and previous config saved to /var/cache/conftool/dbconfig/20220926-094502-ladsgroup.json
* 09:44 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1099.eqiad.wmnet with reason: Maintenance
* 09:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2097.codfw.wmnet with reason: Maintenance
* 09:44 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1099.eqiad.wmnet with reason: Maintenance
* 09:39 btullis@cumin1001: START - Cookbook sre.ganeti.reboot-vm for VM matomo1002.eqiad.wmnet
* 08:58 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|033ab75917932a6b6e1cda8cc26f5f069448e3b9}}: arwiki: Properly grant enrollasmentor to editor ([[phab:T310905|T310905]]) (duration: 03m 46s)
* 08:58 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 08:56 btullis: adding 80GB of virtual disk to matomo1002
* 08:55 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 08:55 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 08:54 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 08:49 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 08:48 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 08:48 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 08:47 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 08:47 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|0a5486780a0543d7fb1c637d2abe48855e753d13}}: arwiki: Grant enrollasmentor to editor ([[phab:T310905|T310905]]) (duration: 03m 40s)
* 08:39 elukey@deploy1002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 08:38 elukey@deploy1002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 08:07 godog: upgrade grafana to 8.5.13
* 08:04 godog: add 20G to prometheus/analytics in codfw
* 07:31 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:31 oblivian@deploy1002: Finished scap: Backport for [[gerrit:823681{{!}}Move 100% of cookie-accepting clients to php 7.4 (T271736)]] (duration: 05m 31s)
* 07:29 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:29 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:28 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:26 oblivian@deploy1002: oblivian and oblivian: Backport for [[gerrit:823681{{!}}Move 100% of cookie-accepting clients to php 7.4 (T271736)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 07:26 oblivian@deploy1002: Started scap: Backport for [[gerrit:823681{{!}}Move 100% of cookie-accepting clients to php 7.4 (T271736)]]
* 07:23 urbanecm@deploy1002: Synchronized wmf-config/InterwikiSortOrders.php: {{Gerrit|620bb80e3534c812d7f4de25547d92104b8609a0}}: Add ami, bjn, blk, dag, guw, ig, kcg, lmo, pcm, pwn, and  shi to InterwikiSortOrders (duration: 03m 40s)
* 07:23 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:20 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:20 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:18 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:12 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:11 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|81f66621e923cd2ee3aac6f8b5be0ba2e85fb51d}}: Add wordmark and tagline for mnwiki ([[phab:T318478|T318478]]) (duration: 03m 46s)
* 07:08 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:08 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:07 urbanecm@deploy1002: Synchronized static/images/mobile/copyright/: {{Gerrit|81f66621e923cd2ee3aac6f8b5be0ba2e85fb51d}}: Add wordmark and tagline for mnwiki ([[phab:T318478|T318478]]; 1/2) (duration: 03m 40s)
* 07:04 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 06:49 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 06:45 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 06:45 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 06:41 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 06:36 elukey: clean up my old home dir on matomo1002, ran `apt-get clean` + some other clean up steps on matomo1002 to free space on the root partition
* 06:32 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|d2d2c08fc6e0dd5c0c85fbe31f85201721871aa9}}: eswiki: Enable structured mentor list ([[phab:T310905|T310905]]) (duration: 04m 30s)
* 06:31 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 06:30 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 06:30 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 06:29 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply


== 2021-10-07 ==
== 2022-09-25 ==
* 23:43 thcipriani@deploy1002: Synchronized wmf-config/logos.php: Config: [[gerrit:708065{{!}}Change Javanese Wiktionary logo (T287425)]] part 3/3 (duration: 00m 55s)
* 17:29 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1053.eqiad.wmnet with OS bullseye
* 23:41 thcipriani@deploy1002: Synchronized logos/config.yaml: Config: [[gerrit:708065{{!}}Change Javanese Wiktionary logo (T287425)]] part 2/3 (duration: 00m 55s)
* 17:08 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1053.eqiad.wmnet with reason: host reimage
* 23:40 thcipriani@deploy1002: Synchronized static/images/project-logos: Config: [[gerrit:708065{{!}}Change Javanese Wiktionary logo (T287425)]] part 1/3 (duration: 00m 56s)
* 17:05 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1053.eqiad.wmnet with reason: host reimage
* 23:30 thcipriani@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:704170{{!}}Adding and use wordmark in trwikiquote (T286133)]] Part 2/2 (duration: 00m 56s)
* 16:51 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1053.eqiad.wmnet with OS bullseye
* 23:28 thcipriani@deploy1002: Synchronized static/images/mobile/copyright/wikiquote-wordmark-tr.svg: Config: [[gerrit:704170{{!}}Adding and use wordmark in trwikiquote (T286133)]] Part 1/2 (duration: 00m 57s)
* 16:49 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1052.eqiad.wmnet with OS bullseye
* 21:35 urbanecm: Password reset for SUL User:LA2-bot ([[phab:T292793|T292793]])
* 16:23 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1052.eqiad.wmnet with reason: host reimage
* 20:43 brennen@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.38.0-wmf.3
* 16:20 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1052.eqiad.wmnet with reason: host reimage
* 20:37 brennen@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.38.0-wmf.2  refs [[phab:T281167|T281167]]
* 16:06 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1052.eqiad.wmnet with OS bullseye
* 20:35 cmooney@cumin1001: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 15:59 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1052.eqiad.wmnet with OS bullseye
* 20:35 cmooney@cumin1001: START - Cookbook sre.network.cf
* 15:31 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudvirt1052.eqiad.wmnet with reason: host reimage
* 20:23 krinkle@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/Gadgets/: {{Gerrit|I7c858b8c4bc}} (duration: 00m 56s)
* 15:26 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1052.eqiad.wmnet with reason: host reimage
* 20:01 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/Echo/: {{Gerrit|8a7ff05ba28f302adb581bf430a868bb815b4ffd}}: Revert "Use namespaced CentralAuthSessionProvider" (duration: 00m 57s)
* 15:26 taavi@deploy1002: Finished deploy [horizon/deploy@9d02cd6]: wmf-proxy-dashboard now uses the dynamicproxy api to fetch zone data (duration: 02m 44s)
* 19:45 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/CentralAuth/: {{Gerrit|c01c2e4983bad8582ddd62aeb35ac9be852d493b}}: Revert "Namespace session providers" (duration: 00m 57s)
* 15:23 taavi@deploy1002: Started deploy [horizon/deploy@9d02cd6]: wmf-proxy-dashboard now uses the dynamicproxy api to fetch zone data
* 19:44 urbanecm: Backporting https://gerrit.wikimedia.org/r/c/mediawiki/extensions/CentralAuth/+/727489, https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Echo/+/727487 in an unsafe way -- exceptions at testwikis expected, wmf.3 is not deployed elsewhere, so this should be ok
* 15:22 taavi@deploy1002: Finished deploy [horizon/deploy@9d02cd6] (dev): wmf-proxy-dashboard now uses the dynamicproxy api to fetch zone data (duration: 01m 11s)
* 19:37 brennen@deploy1002: rebuilt and synchronized wikiversions files: Revert all wikis to 1.38.0-wmf.2 ([[phab:T281167|T281167]])
* 15:20 taavi@deploy1002: Started deploy [horizon/deploy@9d02cd6] (dev): wmf-proxy-dashboard now uses the dynamicproxy api to fetch zone data
* 19:33 brennen: 1.38.0-wmf.3 train ([[phab:T281167|T281167]]): variously blocked, rolling back to testwikis for safe deploy of backports
* 15:15 taavi@deploy1002: Finished deploy [horizon/deploy@9d02cd6] (dev): wmf-proxy-dashboard now uses the dynamicproxy api to fetch zone data (duration: 01m 10s)
* 19:14 brennen@deploy1002: rebuilt and synchronized wikiversions files: Revert group2 wikis to 1.38.0-wmf.2
* 15:14 taavi@deploy1002: Started deploy [horizon/deploy@9d02cd6] (dev): wmf-proxy-dashboard now uses the dynamicproxy api to fetch zone data
* 19:07 brennen@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.38.0-wmf.3  refs [[phab:T281167|T281167]]
* 15:13 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1052.eqiad.wmnet with OS bullseye
* 19:03 brennen: 1.38.0-wmf.3 train ([[phab:T281167|T281167]]): unblocked, rolling to all wikis
* 18:50 urbanecm: [urbanecm@mwmaint1002 /srv/mediawiki/php]$ mwscript extensions/GrowthExperiments/maintenance/initWikiConfig.php --wiki=test2wiki
* 18:46 sukhe: running authdns-update for [[phab:T292537|T292537]]
* 18:29 urbanecm: Morning B&C window done
* 18:28 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|4a946c046ae17a520f8d3463a16b1435ceb4856c}}: Deploy Growth mentor dashboard to pilot wikis ([[phab:T278920|T278920]]) (duration: 01m 04s)
* 18:23 urbanecm@deploy1002: Synchronized dblists/growthexperiments.dblist: {{Gerrit|87e300137c14451949fac12c3ec89319305a423e}}: Deploy Growth features to test2wiki (duration: 01m 03s)
* 18:21 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|87e300137c14451949fac12c3ec89319305a423e}}: Deploy Growth features to test2wiki (duration: 01m 04s)
* 18:20 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|31770f2b3660e7d7490c0a9ab66285c1f069732d}}: shwiki: Deploy Growth features to newcomers ([[phab:T278240|T278240]]) (duration: 01m 04s)
* 18:15 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|33526dfed148068585289f5ac501feda72068fd9}}: Stream config changes for android_daily_stats schema ([[phab:T286000|T286000]]) (duration: 01m 06s)
* 18:10 ejegg: updated payments-wiki from {{Gerrit|6d3560d083}} to {{Gerrit|030b11da1a}}
* 18:07 arnoldokoth: gitlab2001 re-image complete ([[phab:T283076|T283076]])
* 17:30 mutante: rebooting gitlab2001.wikimedia.org
* 16:56 arnoldokoth: down timing gitlab2001 for re-imaging ([[phab:T283076|T283076]])
* 16:47 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on gitlab2001.wikimedia.org with reason: reimage
* 16:47 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on gitlab2001.wikimedia.org with reason: reimage
* 16:32 hnowlan: roll restarting maps cassandra instances for java updates
* 16:19 ayounsi@cumin2002: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 16:19 ayounsi@cumin2002: START - Cookbook sre.network.cf
* 16:18 ayounsi@cumin1001: END (FAIL) - Cookbook sre.network.cf (exit_code=99)
* 16:18 ayounsi@cumin1001: START - Cookbook sre.network.cf
* 16:18 ayounsi@cumin1001: END (FAIL) - Cookbook sre.network.cf (exit_code=99)
* 16:18 ayounsi@cumin1001: START - Cookbook sre.network.cf
* 15:07 hashar@deploy1002: Finished deploy [gerrit/gerrit@13cef9f]: Gerrit to 3.3.6 on gerrit1001 (duration: 00m 08s)
* 15:07 hashar@deploy1002: Started deploy [gerrit/gerrit@13cef9f]: Gerrit to 3.3.6 on gerrit1001
* 14:50 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:49 hashar@deploy1002: Finished deploy [gerrit/gerrit@13cef9f]: Gerrit to 3.3.6 on gerrit2001 (duration: 00m 10s)
* 14:49 hashar@deploy1002: Started deploy [gerrit/gerrit@13cef9f]: Gerrit to 3.3.6 on gerrit2001
* 14:48 hashar: Upgrading Gerrit replica to 3.3.6 # [[phab:T290236|T290236]]
* 14:48 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:34 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:31 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:56 jiji@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 13:46 jiji@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 13:39 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:30 jiji@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 13:29 jiji@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
* 13:29 hashar: restarting CI Jenkins for git plugin update
* 13:19 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 13:15 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:14 hashar: Upgraded CI Jenkins on contint2001
* 13:14 jiji@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 13:13 jiji@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 13:10 jiji@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 13:09 jiji@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 13:06 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 13:06 volans@cumin2002: END (PASS) - Cookbook sre.experimental.reimage (exit_code=0) for host sretest1001.eqiad.wmnet
* 13:05 jiji@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 13:05 jiji@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 12:56 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 12:56 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:51 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 12:40 volans@cumin2002: START - Cookbook sre.experimental.reimage for host sretest1001.eqiad.wmnet
* 12:16 moritzm: installing testvm2005
* 11:59 moritzm: installing openssl security updates for stretch (buster/bullseye already fixed)
* 11:52 Lucas_WMDE: EU backport+config window (aka UTC morning) done
* 11:52 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:50 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:725858{{!}}Enable Content and Section Translation to Kurdish WP (T290238)]] (duration: 01m 04s)
* 11:49 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:44 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/WikidataPageBanner/includes/WikidataPageBannerFunctions.php: Backport: [[gerrit:727188{{!}}Change PropertyId to NumericPropertyId (T289125, T292667)]] (duration: 01m 05s)
* 11:41 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:38 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:10 jbond: update puppet stdlib gerrit:726872
* 09:36 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:31 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:27 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts testvm2004.codfw.wmnet
* 09:26 mvernon@cumin2002: END (PASS) - Cookbook sre.experimental.reimage (exit_code=0) for host ms-be2045.codfw.wmnet
* 09:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host testvm2005.codfw.wmnet
* 09:19 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2004.codfw.wmnet
* 09:08 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2005.codfw.wmnet
* 08:49 mvernon@cumin2002: START - Cookbook sre.experimental.reimage for host ms-be2045.codfw.wmnet
* 08:36 moritzm: imported jenkins 2.303.2 to thirdparty/ci component for buster-wikimedia
* 07:57 Emperor: re-enabling puppet on ms-be2045 after hw work [[phab:T290881|T290881]]
* 07:39 jayme@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'sync'.
* 07:39 jayme@deploy1002: helmfile [eqiad] START helmfile.d/admin 'sync'.
* 07:38 jayme@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 07:37 jayme@deploy1002: helmfile [codfw] START helmfile.d/admin 'sync'.
* 07:34 jayme@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
* 07:33 jayme@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
* 07:32 jayme@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
* 07:31 jayme@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
* 06:21 ryankemper: [Elastic] Restart of `relforge` complete
* 06:05 ryankemper: [Elastic] Cluster in green status, proceeding to next and final node => `ryankemper@relforge1003:~$ sudo systemctl restart elasticsearch_6@relforge-eqiad-small-alpha.service && sudo systemctl restart elasticsearch_6@relforge-eqiad.service`
* 05:53 ryankemper: [Elastic] `ryankemper@relforge1004:~$ sudo systemctl restart elasticsearch_6@relforge-eqiad-small-alpha.service && sudo systemctl restart elasticsearch_6@relforge-eqiad.service`
* 05:48 ryankemper: [Elastic] Performing rolling restarts of `relforge`. `relforge1003` is the master so I'll restart `relforge1004` first to minimize disruption
* 03:00 ejegg: updated payments-wiki from {{Gerrit|23d0ffac66}} to {{Gerrit|6d3560d083}}
* 02:30 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:28 tstarling@deploy1002: Synchronized wmf-config/CommonSettings.php: enable Parsoid API everywhere (duration: 01m 04s)
* 02:27 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:11 mutante: [grafana2001:~] $ sudo systemctl start rsync-var-lib-grafana  because of "PROBLEM - Check systemd state on grafana2001 is CRITICAL: CRITICAL - degraded" because of some race condition where a file vanished during sync


== 2021-10-06 ==
== 2022-09-23 ==
* 23:57 mutante: releases2002 - rm /srv/org/wikimedia/reprepro/conf/distributions - contains only jessie-mediawiki - see 725670 and EOL of MediaWiki 1.31
* 19:10 mforns@deploy1002: Finished deploy [airflow-dags/analytics@4c973d6]: (no justification provided) (duration: 00m 12s)
* 23:32 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:10 mforns@deploy1002: Started deploy [airflow-dags/analytics@4c973d6]: (no justification provided)
* 23:29 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:49 nokafor@deploy1002: Finished deploy [airflow-dags/analytics@7620b25]: (no justification provided) (duration: 00m 10s)
* 23:21 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:48 nokafor@deploy1002: Started deploy [airflow-dags/analytics@7620b25]: (no justification provided)
* 23:21 jforrester@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:726955{{!}}Adding and use wordmark in ckbwiki (T288368)]] (duration: 01m 04s)
* 13:39 hashar@deploy1002: Finished scap: Backport for [[gerrit:834531{{!}}Stop using Elastica::Type and set the target indices (T318356)]] (duration: 07m 10s)
* 23:20 jforrester@deploy1002: Synchronized static/images/mobile/copyright/wikipedia-wordmark-ckb.svg: Config: [[gerrit:726955{{!}}Adding and use wordmark in ckbwiki (T288368)]] (duration: 01m 04s)
* 13:37 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 23:18 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:36 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 23:16 jforrester@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:726603{{!}}Enable NewUserMessage for ptwikivoyage (T290820)]] (duration: 01m 05s)
* 13:36 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 22:30 mutante: re-enabling puppet on mw*, an-worker* after deploying gerrit:726954. no issue this time
* 13:35 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 22:23 mutante: temp. disabling puppet on an-worker*, mw*
* 13:32 hashar@deploy1002: hashar and hashar: Backport for [[gerrit:834531{{!}}Stop using Elastica::Type and set the target indices (T318356)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet
* 20:50 mutante: global puppet failure - revert is merged, puppet run will recover on next run everywhere. partially forcing with cumin, partially letting it recover naturally
* 13:31 hashar@deploy1002: Started scap: Backport for [[gerrit:834531{{!}}Stop using Elastica::Type and set the target indices (T318356)]]
* 20:43 mutante: [cumin1001:~] $ sudo cumin -b 5 -p 95 'mw2*' 'run-puppet-agent -q --failed-only'
* 13:29 taavi@deploy1002: Finished deploy [horizon/deploy@9d02cd6]: wmf-proxy-dashboard improved error handling (duration: 03m 06s)
* 19:08 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:26 taavi@deploy1002: Started deploy [horizon/deploy@9d02cd6]: wmf-proxy-dashboard improved error handling
* 19:06 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:24 taavi@deploy1002: Finished deploy [horizon/deploy@9d02cd6] (dev): wmf-proxy-dashboard improved error handling (duration: 01m 11s)
* 19:05 brennen@deploy1002: Synchronized php: group1 wikis to 1.38.0-wmf.3  refs [[phab:T281167|T281167]] (duration: 01m 03s)
* 13:23 taavi@deploy1002: Started deploy [horizon/deploy@9d02cd6] (dev): wmf-proxy-dashboard improved error handling
* 19:04 brennen@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.38.0-wmf.3  refs [[phab:T281167|T281167]]
* 09:26 jynus: stopping db1117:s3 for maintenance [[phab:T315713|T315713]]
* 19:01 brennen: 1.38.0-wmf.3 train ([[phab:T281167|T281167]]): still unblocked after triage meeting, rolling to group1
* 08:51 Emperor: rebalance ms-eqiad swift rings [[phab:T294550|T294550]]
* 18:52 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 07:36 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db[2134,2160].codfw.wmnet,db[1117,1159].eqiad.wmnet with reason: Grants fixing
* 18:50 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 07:36 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 4:00:00 on db[2134,2160].codfw.wmnet,db[1117,1159].eqiad.wmnet with reason: Grants fixing
* 18:44 legoktm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Revert disabling static mapframes on eswiki (duration: 01m 14s)
* 06:10 marostegui: Shutdown db1189 [[phab:T317662|T317662]]
* 18:36 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 06:09 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on db1189.eqiad.wmnet with reason: on site maintenance
* 18:34 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 06:09 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 4 days, 0:00:00 on db1189.eqiad.wmnet with reason: on site maintenance
* 18:31 legoktm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: eswiki: Disable static mapframes ([[phab:T291736|T291736]]) (duration: 01m 17s)
* 18:25 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:22 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:22 legoktm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: viwikibooks: Set $wgRestrictDisplayTitle to false ([[phab:T289837|T289837]]) (duration: 01m 21s)
* 17:16 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:10 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:53 bd808@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 16:47 brennen@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.38.0-wmf.3  refs [[phab:T281167|T281167]]
* 16:47 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 16:43 brennen: 1.38.0-wmf.3 train ([[phab:T281167|T281167]]): unblocked, rolling to group0
* 16:41 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 16:35 brennen@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/Scribunto/includes/engines/LuaCommon/LanguageLibrary.php: Backport: [[gerrit:726596{{!}}Replace deprecated ParserOptions::getUser with ::getUserIdentity (T292589)]] (duration: 01m 04s)
* 16:35 jynus: stopping db1127 for hw maintenance [[phab:T292366|T292366]]
* 16:31 jynus@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1127.eqiad.wmnet with reason: hw maintenance
* 16:31 jynus@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1127.eqiad.wmnet with reason: hw maintenance
* 16:28 brennen@deploy1002: Synchronized php-1.38.0-wmf.2/extensions/Scribunto/includes/engines/LuaCommon/LanguageLibrary.php: Backport: [[gerrit:726597{{!}}Replace deprecated ParserOptions::getUser with ::getUserIdentity (T292589)]] (duration: 01m 10s)
* 16:22 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:19 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:11 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:08 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:01 volans@cumin2002: END (PASS) - Cookbook sre.experimental.reimage (exit_code=0) for host sretest1002.eqiad.wmnet
* 15:45 brennen: 1.38.0-wmf.3 train ([[phab:T281167|T281167]]): proceeding to deploy backports for [[phab:T292589|T292589]]
* 15:37 volans@cumin2002: START - Cookbook sre.experimental.reimage for host sretest1002.eqiad.wmnet
* 15:35 volans: installer spicerack 1.0.4 on cumin2002
* 12:50 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:48 volans: uploaded spicerack_1.0.4 to apt.wikimedia.org buster-wikimedia,bullseye-wikimedia
* 12:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host testvm2004.codfw.wmnet
* 12:23 jgiannelos@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 12:18 effie: pool mw1455 mw1422
* 12:17 urbanecm: wikiadmin@10.64.0.164(viwiki)> delete from growthexperiments_mentee_data; # cleanup after disabling mentor dashboard backend
* 12:16 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2004.codfw.wmnet
* 12:13 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|1aa67d4846f39f59127a835cb7a8ed2974506025}}: viwiki: Disable mentor dashboard backend ([[phab:T278920|T278920]]) (duration: 01m 06s)
* 12:02 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:59 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 11:56 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts testvm2003.codfw.wmnet
* 11:55 XioNoX: esams - Advertise 185.15.59.0/24 instead of 185.15.58.0/23 - [[phab:T288505|T288505]] - [[phab:T283050|T283050]]
* 11:46 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2003.codfw.wmnet
* 10:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2025.codfw.wmnet
* 10:52 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2025.codfw.wmnet
* 10:50 jelto: disable puppet on gitlab1001 to test puppetized code on GitLab replica - [[phab:T283076|T283076]]
* 10:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-replica1003.wikimedia.org
* 10:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-replica1003.wikimedia.org
* 10:06 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:04 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:04 urbanecm@deploy1002: Synchronized wmf-config/: {{Gerrit|01633739462f3bf09ae4e50b955454921ea4fbf9}}: Delete gettingstarted-with-category-suggestions dblist ([[phab:T235752|T235752]]; 2/2) (duration: 01m 05s)
* 10:01 urbanecm@deploy1002: Synchronized dblists/: {{Gerrit|01633739462f3bf09ae4e50b955454921ea4fbf9}}: Delete gettingstarted-with-category-suggestions dblist ([[phab:T235752|T235752]]; 1/2) (duration: 01m 04s)
* 09:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host copernicium.wikimedia.org
* 09:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host copernicium.wikimedia.org
* 09:19 jbond: update ipaddress6 fact - https://gerrit.wikimedia.org/r/c/operations/puppet/+/726625
* 09:15 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 09:13 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.2/extensions/Wikibase/repo/includes/ChangeModification/DispatchChangesJob.php: Backport: [[gerrit:725923{{!}}Don't fail job if subscribed wiki is unknown (T292446 T292440)]] (duration: 01m 15s)
* 09:13 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:29 volans@cumin2002: END (PASS) - Cookbook sre.experimental.reimage (exit_code=0) for host sretest1001.eqiad.wmnet
* 08:21 XioNoX: add ROAs for 185.15.58.0/24 and 185.15.59.0/24 - [[phab:T288505|T288505]] - [[phab:T283050|T283050]]
* 08:04 volans@cumin2002: START - Cookbook sre.experimental.reimage for host sretest1001.eqiad.wmnet
* 07:56 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript namespaceDupes.php plwikinews --fix # [[phab:T291344|T291344]]
* 07:56 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript namespaceDupes.php plwikinews # [[phab:T291344|T291344]]
* 07:55 urbanecm: mwdebug1001: scap pull ([[phab:T291344|T291344]] fix done)
* 07:51 urbanecm: Staging at mwdebug1001 for [[phab:T291344|T291344]]
* 05:53 kart_: Updated cxserver to use nodejs12 ([[phab:T290754|T290754]])
* 05:47 kartik@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'cxserver' for release 'production' .
* 05:39 kartik@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'cxserver' for release 'production' .
* 05:36 Amir1: start of mwscript extensions/Wikibase/repo/maintenance/pruneChanges.php --wiki wikidatawiki --number-of-days=2
* 05:31 kartik@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'cxserver' for release 'staging' .
* 04:54 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 04:49 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 04:29 ryankemper: [WDQS] `wdqs1012` is back up after restarting blazegraph (blazegraph was locked up)
* 04:27 ryankemper: [WDQS] `ryankemper@wdqs1012:~$ sudo systemctl restart wdqs-blazegraph` (attempting to bring downed `wdqs1012` back into health)
* 04:25 ryankemper: [WDQS] Repooling eqiad hosts following the brief outage from earlier: `wdqs1004`, `wdqs1006`, `wdqs1007`
* 03:19 eileen: civicrm revision changed from {{Gerrit|b6f5f71c18}} to {{Gerrit|82efd2e195}}, config revision is {{Gerrit|f4c57d4733}}
* 03:11 tstarling@deploy1002: Synchronized php-1.38.0-wmf.3/includes/CommentFormatter/RowCommentIterator.php: fix UBN [[phab:T292590|T292590]] (duration: 01m 04s)
* 01:39 legoktm: legoktm@mwmaint1002:~$ echo "https://en.wikiversity.org/static/images/mobile/copyright/wikiversity.svg" {{!}}mwscript purgeList.php
* 01:17 arlolra@deploy1002: Synchronized php-1.38.0-wmf.2/extensions/GlobalUserPage/includes/GlobalUserPage.php: Bump GlobalUserPage::PARSED_CACHE_VERSION for media DOM changes (duration: 01m 03s)
* 01:12 arlolra@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/GlobalUserPage/includes/GlobalUserPage.php: Bump GlobalUserPage::PARSED_CACHE_VERSION for media DOM changes (duration: 01m 17s)
* 00:59 arlolra@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Enable legacy media dom on metawiki (duration: 01m 05s)
* 00:37 arlolra@deploy1002: Synchronized php-1.38.0-wmf.2/includes/resourceloader/ResourceLoaderSkinModule.php: Add a separate config for content.media.less 2/2 (duration: 01m 03s)
* 00:35 arlolra@deploy1002: Synchronized php-1.38.0-wmf.2/includes/DefaultSettings.php: Add a separate config for content.media.less 1/2 (duration: 01m 03s)
* 00:32 arlolra@deploy1002: Synchronized php-1.38.0-wmf.3/includes/resourceloader/ResourceLoaderSkinModule.php: Add a separate config for content.media.less 2/2 (duration: 01m 03s)
* 00:29 arlolra@deploy1002: Synchronized php-1.38.0-wmf.3/includes/DefaultSettings.php: Add a separate config for content.media.less 1/2 (duration: 01m 04s)
* 00:16 mutante: puppetmasters: rm /etc/logrotate.d/geoipupdate && systemctl start logrotate && puppet agent -tv
* 00:14 mutante: puppetmaster2002 - rm /etc/logrotate.d/geoipupdate (not managed by puppet anymore but not removed, caused duplicate logrotate config, made logrotate service fail), start logrotate
* 00:08 cstone: civicrm revision changed from {{Gerrit|34d3c3aae8}} to {{Gerrit|b6f5f71c18}}
* 00:01 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:725132{{!}}Add WN as an alias to project namespace in Polish Wikinews (T291344)]] (duration: 01m 04s)


== 2021-10-05 ==
== 2022-09-22 ==
* 23:54 tgr@deploy1002: Synchronized static/images/mobile/copyright/wikiversity.svg: Config: [[gerrit:725413{{!}}Wikiversity Logo Update for 2017 Logo Version (T292109)]] (duration: 01m 03s)
* 22:20 joal@deploy1002: Finished deploy [airflow-dags/analytics@901f810]: (no justification provided) (duration: 00m 11s)
* 23:47 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:704376{{!}}Adding and use wordmark in azwiki (T284877)]] (duration: 01m 04s)
* 22:19 joal@deploy1002: Started deploy [airflow-dags/analytics@901f810]: (no justification provided)
* 23:44 tgr@deploy1002: Synchronized static/images/mobile/copyright/wikipedia-wordmark-az.svg: Config: [[gerrit:704376{{!}}Adding and use wordmark in azwiki (T284877)]] (duration: 01m 23s)
* 21:29 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 23:16 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:725386{{!}}Add image_suggestion_interaction event stream]] (duration: 01m 12s)
* 21:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 23:02 legoktm: deleting old stretch docker images from the registry for [[phab:T292485|T292485]]
* 21:28 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 22:24 brennen@deploy1002: rebuilt and synchronized wikiversions files: Revert group0 wikis to 1.38.0-wmf.2
* 21:27 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 22:20 brennen: 1.38.0-wmf.3 ([[phab:T281167|T281167]]) rolling back to testwikis for the day; will revisit in US-morning
* 21:23 dancy@deploy1002: backport aborted: (duration: 00m 05s)
* 20:47 brennen@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.38.0-wmf.3  refs [[phab:T281167|T281167]]
* 20:56 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:44 brennen@deploy1002: Synchronized php-1.38.0-wmf.3/includes/page: Backport: [[gerrit:726594{{!}}Pre-format comments for non-local files too]] ([[phab:T292570|T292570]]) (duration: 01m 04s)
* 20:56 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:18 mutante: puppetmaster1003 et al - converting maxmind geoip database fetching from cron to timers
* 20:56 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:06 mutante: cumin 'puppetmaster*' "disable-puppet '[[phab:T288844|T288844]] - [[phab:T273673|T273673]] - gerrit:721595 - $<nowiki>{</nowiki>USER<nowiki>}</nowiki>'"
* 20:55 brennen: end of utc late backport & config window
* 19:30 mutante: restoring /home/amire80 from and to mwmaint2002 via Bacula bconsole ([[phab:T292573|T292573]])
* 20:55 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 19:09 brennen@deploy1002: rebuilt and synchronized wikiversions files: Revert group0 wikis to 1.38.0-wmf.2
* 20:54 brennen@deploy1002: Finished scap: Backport for [[gerrit:834364{{!}}Restrict figure to the size of the media (T305357 T318300)]], [[gerrit:834366{{!}}Fix media alignment since disabling wgParserEnableLegacyMediaDOM (T318300)]] (duration: 06m 33s)
* 19:03 brennen@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.38.0-wmf.3  refs [[phab:T281167|T281167]]
* 20:53 joal@deploy1002: Finished deploy [airflow-dags/analytics@6c81e6f]: (no justification provided) (duration: 00m 10s)
* 18:26 brennen@deploy1002: Pruned MediaWiki: 1.37.0-wmf.23 (duration: 01m 57s)
* 20:53 joal@deploy1002: Started deploy [airflow-dags/analytics@6c81e6f]: (no justification provided)
* 18:23 brennen@deploy1002: Pruned MediaWiki: 1.37.0-wmf.21 (duration: 04m 20s)
* 20:48 brennen@deploy1002: brennen and arlolra: Backport for [[gerrit:834364{{!}}Restrict figure to the size of the media (T305357 T318300)]], [[gerrit:834366{{!}}Fix media alignment since disabling wgParserEnableLegacyMediaDOM (T318300)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 18:21 brennen: 1.38.0-wmf.3 ([[phab:T281167|T281167]]): pruning old branches, starting with 1.37.0-wmf.21, proceeeding to 1.37.0-wmf.23 if time allows
* 20:47 brennen@deploy1002: Started scap: Backport for [[gerrit:834364{{!}}Restrict figure to the size of the media (T305357 T318300)]], [[gerrit:834366{{!}}Fix media alignment since disabling wgParserEnableLegacyMediaDOM (T318300)]]
* 18:11 ppchelko@deploy1002: Synchronized wmf-config: Remove mb_strtoupper overrides for HHVM [[phab:T219279|T219279]] Php72ToUpper.php removal (duration: 01m 06s)
* 20:36 brennen@deploy1002: backport aborted:  (duration: 02m 16s)
* 18:04 ppchelko@deploy1002: Synchronized wmf-config/CommonSettings.php: Remove mb_strtoupper overrides for HHVM [[phab:T219279|T219279]] CS.php (duration: 01m 06s)
* 20:34 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 17:55 brennen@deploy1002: Finished scap: testwikis wikis to 1.38.0-wmf.3  refs [[phab:T281167|T281167]] (duration: 45m 59s)
* 20:34 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 17:12 btullis@cumin1001: END (PASS) - Cookbook sre.aqs.roll-restart (exit_code=0) for AQS aqs cluster: Roll restart of all AQS's nodejs daemons. - btullis@cumin1001
* 20:34 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 17:09 brennen@deploy1002: Started scap: testwikis wikis to 1.38.0-wmf.3  refs [[phab:T281167|T281167]]
* 20:32 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 17:03 btullis@cumin1001: START - Cookbook sre.aqs.roll-restart for AQS aqs cluster: Roll restart of all AQS's nodejs daemons. - btullis@cumin1001
* 20:27 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 17:02 btullis@cumin1001: END (FAIL) - Cookbook sre.aqs.roll-restart (exit_code=99) for AQS aqs cluster: Roll restart of all AQS's nodejs daemons. - btullis@cumin1001
* 20:26 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 17:02 btullis@cumin1001: START - Cookbook sre.aqs.roll-restart for AQS aqs cluster: Roll restart of all AQS's nodejs daemons. - btullis@cumin1001
* 20:26 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 16:56 brennen: successfully applied security patches for 1.38.0-wmf.3 train ([[phab:T281167|T281167]])
* 20:25 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 16:47 brennen: coordinated with deployment backup and starting train prep for 1.38.0-wmf.3 ([[phab:T281167|T281167]]), branched at {{Gerrit|65279490f82c785181b8b6961e40901a4aaafca4}}
* 20:25 brennen@deploy1002: Finished scap: Backport for [[gerrit:833817{{!}}Drops JS-side creation of "Source" link (T318266)]] (duration: 06m 09s)
* 15:57 jbond@cumin2002: END (PASS) - Cookbook sre.puppet.renew-cert (exit_code=0) for puppetboard2002.codfw.wmnet: Renew puppet certificate - jbond@cumin2002
* 20:19 brennen@deploy1002: brennen and tpt: Backport for [[gerrit:833817{{!}}Drops JS-side creation of "Source" link (T318266)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 15:57 jbond@cumin2002: START - Cookbook sre.puppet.renew-cert for puppetboard2002.codfw.wmnet: Renew puppet certificate - jbond@cumin2002
* 20:19 brennen@deploy1002: Started scap: Backport for [[gerrit:833817{{!}}Drops JS-side creation of "Source" link (T318266)]]
* 15:38 jbond: reimage puppetboard2002
* 20:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 15:15 jbond@cumin1001: END (PASS) - Cookbook sre.puppet.renew-cert (exit_code=0) for puppetboard1002.eqiad.wmnet: Renew puppet certificate - jbond@cumin1001
* 20:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 15:15 jbond@cumin1001: START - Cookbook sre.puppet.renew-cert for puppetboard1002.eqiad.wmnet: Renew puppet certificate - jbond@cumin1001
* 20:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 15:10 moritzm: imported routinator 0.10.1-1bullseye to thirdparty/routinator for bullseye-wikimedia [[phab:T292503|T292503]]
* 20:13 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 14:58 jbond: reimage puppetboard1002
* 19:45 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99)
* 14:40 effie: depool  mw1455 and mw1422
* 18:38 jhuneidi@deploy1002: Started scap: testing
* 14:30 Pchelolo: run foreachwiki uppercaseTitlesForUnicodeTransition.php --charmap current_to_php7_overrides.php [[phab:T219279|T219279]]
* 18:38 dancy@deploy1002: Started scap: testing
* 13:51 reedy@deploy1002: Synchronized wmf-config/CommonSettings.php: ExtensionDistributor - Drop REL1_31, start REL1_37 (duration: 00m 57s)
* 18:37 jhuneidi@deploy1002: Started scap: testing
* 13:46 Pchelolo: run renameInvalidUsernames.php --wiki loginwiki --list /tmp/rename_users_for_uppercase_all.txt [[phab:T219279|T219279]]
* 18:34 aqu@deploy1002: Finished deploy [airflow-dags/analytics_test@265686e]: (no justification provided) (duration: 00m 13s)
* 13:39 elukey@cumin1001: END (FAIL) - Cookbook sre.aqs.roll-restart (exit_code=99) for AQS aqs cluster: Roll restart of all AQS's nodejs daemons. - elukey@cumin1001
* 18:33 aqu@deploy1002: Started deploy [airflow-dags/analytics_test@265686e]: (no justification provided)
* 13:39 elukey@cumin1001: START - Cookbook sre.aqs.roll-restart for AQS aqs cluster: Roll restart of all AQS's nodejs daemons. - elukey@cumin1001
* 18:29 dancy@deploy1002: rebuilt and synchronized wikiversions files: group2 wikis to 1.40.0-wmf.2  refs [[phab:T314191|T314191]]
* 13:23 ppchelko@deploy1002: Synchronized php-1.38.0-wmf.2/maintenance/uppercaseTitlesForUnicodeTransition.php: Backport uppercaseTitlesForUnicodeTransition.php maintenance script improvements [[phab:T219279|T219279]] (duration: 00m 58s)
* 18:23 dancy@deploy1002: Unlocked for deployment [ALL REPOSITORIES]: testing (duration: 00m 02s)
* 12:53 ema: upload varnish 6.0.8-1wm1 to apt.wikimedia.org [[phab:T292290|T292290]]
* 18:23 dancy@deploy1002: Locking from deployment [ALL REPOSITORIES]: testing (planned duration: 60m 00s)
* 12:43 elukey: import AMD ROCm 4.2 to buster-wikimedia's thirdparty/amd-rocm42 - [[phab:T287267|T287267]]
* 18:22 dancy@deploy1002: Installation of scap version "4.22.0" completed for 561 hosts
* 12:24 ema: deployment-cache-text06: upgrade varnish to 6.0.8-1wm1 [[phab:T292290|T292290]]
* 18:22 dancy@deploy1002: Installing scap version "4.22.0" for 561 hosts
* 11:58 hnowlan: reverted restbase2023 to use CN=hostname certificate due to loading errors
* 18:17 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 11:57 hnowlan@cumin1001: END (FAIL) - Cookbook sre.cassandra.roll-restart (exit_code=99) for nodes matching restbase2023.codfw.wmnet: Switching over to using FQDN certificate - hnowlan@cumin1001
* 18:16 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 11:57 hnowlan@cumin1001: START - Cookbook sre.cassandra.roll-restart for nodes matching restbase2023.codfw.wmnet: Switching over to using FQDN certificate - hnowlan@cumin1001
* 18:16 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 11:37 hnowlan@cumin1001: END (FAIL) - Cookbook sre.cassandra.roll-restart (exit_code=99) for nodes matching restbase2023.codfw.wmnet: Switching over to using FQDN certificate - hnowlan@cumin1001
* 18:15 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 11:28 hnowlan@cumin1001: START - Cookbook sre.cassandra.roll-restart for nodes matching restbase2023.codfw.wmnet: Switching over to using FQDN certificate - hnowlan@cumin1001
* 16:44 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 11:17 hnowlan_: disabling puppet on cassandra nodes for rollout of 724061 - defaulting to cn=fqdn certificates
* 16:43 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 11:15 effie: upgrade scap to 4.0.2 - [[phab:T291095|T291095]]
* 16:43 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 11:12 urbanecm@deploy1002: Synchronized dblists/commonsuploads.dblist: {{Gerrit|04524992865b0ae5750eb6fb0a374aa74a65b383}}: Enable local uploads for tcywiki ([[phab:T166763|T166763]]) (duration: 00m 59s)
* 16:42 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 10:11 vgutierrez: update acme-chief to version 0.32 on acmechief hosts - [[phab:T290249|T290249]]
* 16:39 dancy@deploy1002: Sync cancelled.
* 10:09 vgutierrez: update acme-chief to version 0.32 on acmechief-test hosts - [[phab:T290249|T290249]]
* 16:39 dancy@deploy1002: dancy and dancy: Backport for [[gerrit:834352{{!}}InitialiseSettings-labs.php: Added test text (to be reverted) (T317242)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
* 10:06 vgutierrez: upload acme-chief 0.32 to apt.wm.o (buster) - [[phab:T290249|T290249]]
* 16:38 dancy@deploy1002: Started scap: Backport for [[gerrit:834352{{!}}InitialiseSettings-labs.php: Added test text (to be reverted) (T317242)]]
* 09:46 hnowlan_: generated cassandra certificate using FQDN for restbase2023
* 13:24 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 09:09 topranks: updating routinator on rpki2001 ([[phab:T291543|T291543]])
* 13:23 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 08:59 dcausse: depool and restart blazegraph on wdqs1007
* 13:23 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 08:51 moritzm: installing openssl security updates for stretch (buster/bullseye already fixed)
* 13:22 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:58 moritzm: installing apache security updates
* 13:17 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:57 elukey: upgrade GPU drivers (AMD ROCm 4.3.1) on an-worker1[096-101]
* 13:16 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:27 ladsgroup@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 13:16 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:26 ladsgroup@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 13:15 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:26 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=wdqs1004.wmnet
* 13:14 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|dcf37106d32ddda58948dbd6bc7ef3eb823a8e3d}}: Remove Research Incentive survey on idwiki ([[phab:T316466|T316466]]) (duration: 03m 50s)
* 06:38 elukey: reboot an-worker1096 after installing new GPU drivers
* 13:10 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 04:20 eileen: civicrm revision changed from {{Gerrit|d74e9aa0a1}} to {{Gerrit|34d3c3aae8}}, config revision is {{Gerrit|cae09f7691}}
* 13:09 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:09 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:09 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|ff867a48d617bc556be23ac595c4e3c5466f69c1}}: Add wgMetaNamespace for knwiktionary and knwikiquote ([[phab:T318318|T318318]]) (duration: 03m 57s)
* 13:08 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 12:38 dcausse@deploy1002: helmfile [codfw] DONE helmfile.d/services/rdf-streaming-updater: apply
* 12:37 dcausse@deploy1002: helmfile [codfw] START helmfile.d/services/rdf-streaming-updater: apply
* 12:24 dcausse@deploy1002: helmfile [codfw] DONE helmfile.d/services/rdf-streaming-updater: apply
* 12:24 dcausse@deploy1002: helmfile [codfw] START helmfile.d/services/rdf-streaming-updater: apply
* 12:22 dcausse@deploy1002: helmfile [codfw] DONE helmfile.d/services/rdf-streaming-updater: apply
* 12:22 dcausse@deploy1002: helmfile [codfw] START helmfile.d/services/rdf-streaming-updater: apply
* 12:21 dcausse@deploy1002: helmfile [codfw] START helmfile.d/services/rdf-streaming-updater: apply
* 07:35 apergos: UTC morning backport and config training deployment window closed a bit belatedly
* 07:14 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:13 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:09 kartik@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:833885{{!}}Enable Content and Section Translation in Bhojpuri Wikipedia (T313296)]] (duration: 04m 03s)
* 07:08 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply


== 2021-10-04 ==
== 2022-09-21 ==
* 23:30 foks: resetting some emails used for abuse by a globally-banned user
* 20:51 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 23:19 jdrewniak@deploy1002: Synchronized portals: Wikimedia Portals Update: [[gerrit:726084{{!}} Bumping portals to master (T128546)]] (duration: 00m 59s)
* 20:50 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 23:18 jdrewniak@deploy1002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:726084{{!}} Bumping portals to master (T128546)]] (duration: 00m 59s)
* 20:50 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 23:13 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|75645c9cc59b37dbf59942eabbc014b7dc147626}}: Add explicit config for licensing/copyright message overrides ([[phab:T284097|T284097]]) (duration: 00m 59s)
* 20:50 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 23:05 mutante: [deneb:~] $ sudo systemctl start docker-reporter-releng-images
* 20:46 tgr_: UTC late deploys done
* 22:54 mutante: puppetmaster2001 - rm /etc/logrotate.d/geoipupdate_ipinfo  and geoipupdate_ipinfo ; running puppet, starting logrotate service
* 20:45 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 18:13 jiji@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:44 tgr@deploy1002: Synchronized php-1.40.0-wmf.2/extensions/WikimediaEvents/includes/BlockMetrics/BlockMetricsHooks.php: Backport: [[gerrit:833810{{!}}Block metrics: Bump schema to un-require some fields (T317343)]] (duration: 03m 42s)
* 16:51 bblack: rolling restart of haproxy for DoTLS on dns300[12],authdns1001,authdns2001 to recycle connections
* 20:44 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 15:24 vgutierrez: pool cp5006
* 20:43 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 15:17 ladsgroup@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 20:39 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 15:16 ladsgroup@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 20:36 tgr@deploy1002: Synchronized php-1.40.0-wmf.1/extensions/WikimediaEvents/includes/BlockMetrics/BlockMetricsHooks.php: Backport: [[gerrit:833809{{!}}Block metrics: Bump schema to un-require some fields (T317343)]] (duration: 03m 55s)
* 14:50 phuedx: phuedx@mwmaint1002:~$ mwscript extensions/SecurePoll/cli/purgeDecryptionKeys.php --wiki=votewiki --before="20210101000000"
* 20:29 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 14:46 ladsgroup@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 20:28 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 14:46 effie: uploading scap 4.0.2 - [[phab:T291095|T291095]]
* 20:28 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 14:45 ladsgroup@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 20:27 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 14:39 brennen: gitlab: upgrade to 14.3.2 (note there was an additional patch release on 2021-10-01) complete ([[phab:T292256|T292256]])
* 20:25 samtar@deploy1002: Finished scap: Backport for [[gerrit:833463{{!}}cirrus: Limit shard count to 1 in deployment-prep (T316711)]] (duration: 04m 19s)
* 14:25 Amir1: cleaning up wb_changes_subscription rows from closed wikis ([[phab:T292440|T292440]])
* 20:22 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 14:24 brennen: gitlab: downtime for upgrade to 14.3.1
* 20:21 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 14:19 elukey: import AMD ROCm 4.3.1 packages in buster-wikimedia's thirdparty/amd-rocm431 - [[phab:T287267|T287267]]
* 20:21 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 14:19 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:21 samtar@deploy1002: samtar and ebernhardson: Backport for [[gerrit:833463{{!}}cirrus: Limit shard count to 1 in deployment-prep (T316711)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
* 14:17 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:20 samtar@deploy1002: Started scap: Backport for [[gerrit:833463{{!}}cirrus: Limit shard count to 1 in deployment-prep (T316711)]]
* 14:13 ladsgroup@deploy1002: Synchronized wmf-config/Wikibase.php: Config: [[gerrit:725905{{!}}Explicitly enable dispatching and pruning for wikidata (T48643)]] (duration: 00m 58s)
* 20:20 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 14:08 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:17 samtar@deploy1002: Finished scap: Backport for [[gerrit:833837{{!}}Enable DiscussionTools visual enhancements as beta on en/dewiki (T315625)]] (duration: 05m 31s)
* 14:06 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 14:03 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on gitlab2001.wikimedia.org with reason: upgrade gitlab2001 to new version https://phabricator.wikmiedia.org/T292256
* 20:12 samtar@deploy1002: samtar and kemayo: Backport for [[gerrit:833837{{!}}Enable DiscussionTools visual enhancements as beta on en/dewiki (T315625)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 14:03 jelto@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on gitlab2001.wikimedia.org with reason: upgrade gitlab2001 to new version https://phabricator.wikmiedia.org/T292256
* 20:11 samtar@deploy1002: Started scap: Backport for [[gerrit:833837{{!}}Enable DiscussionTools visual enhancements as beta on en/dewiki (T315625)]]
* 14:01 ladsgroup@deploy1002: Synchronized wmf-config: Config: [[gerrit:725502{{!}}Enable dispatching via jobs everywhere (T48643)]] (duration: 01m 00s)
* 20:11 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 12:57 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:11 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 12:56 ladsgroup@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:725785{{!}}Enable dispatching for wikidatawiki and commonswiki (T292088)]] (duration: 01m 00s)
* 20:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 12:54 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:09 samtar@deploy1002: Finished scap: Backport for [[gerrit:833830{{!}}Remove deployment-db08 (T318126)]] (duration: 05m 16s)
* 12:08 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:04 samtar@deploy1002: samtar and zabe: Backport for [[gerrit:833830{{!}}Remove deployment-db08 (T318126)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet
* 12:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on ganeti2026.codfw.wmnet with reason: Ganeti tests
* 20:04 samtar@deploy1002: Started scap: Backport for [[gerrit:833830{{!}}Remove deployment-db08 (T318126)]]
* 12:02 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on ganeti2026.codfw.wmnet with reason: Ganeti tests
* 19:33 nokafor@deploy1002: Finished deploy [airflow-dags/analytics@ce20ecd]: (no justification provided) (duration: 00m 10s)
* 12:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on ganeti2025.codfw.wmnet with reason: Ganeti tests
* 19:33 nokafor@deploy1002: Started deploy [airflow-dags/analytics@ce20ecd]: (no justification provided)
* 12:02 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on ganeti2025.codfw.wmnet with reason: Ganeti tests
* 19:09 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 12:01 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:08 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 11:55 urbanecm: EU B&C window done
* 19:08 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 11:55 urbanecm@deploy1002: Synchronized multiversion/MWWikiversions.php: {{Gerrit|508cf5cc6d213373f7c9ba1cdef142ebc8398022}}: Let DB expressions intersect DB lists ([[phab:T290609|T290609]]) (duration: 00m 58s)
* 19:07 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 11:50 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|a855078cf52d88cc2cd27a0adc7c6a680c80dd39}}: dewiki, nlwiki: Bump Growth features to 80% ([[phab:T288420|T288420]], [[phab:T285254|T285254]]) (duration: 00m 58s)
* 19:04 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|b8b2ebd3933cb891b62bb6aea01b2342c017cec8}}: Growth: Switch pilot wikis to structured mentor list ([[phab:T310905|T310905]]) (duration: 03m 59s)
* 11:46 urbanecm@deploy1002: Synchronized private/PrivateSettings.php: {{Gerrit|5728376}}: Update [[phab:T250887|T250887]] mitigations (duration: 00m 58s)
* 19:02 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 11:44 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|b0a96bed4562bcc975187b1d34626201d407404b}}: Undeploy GettingStarted V: Remove now-obsolete logging channels ([[phab:T235752|T235752]]) (duration: 00m 59s)
* 19:01 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 11:42 urbanecm@deploy1002: Synchronized wmf-config/extension-list: {{Gerrit|9709bcfc8dacbcd1704471df08c31cec0711bea6}}: Undeploy GettingStarted IV: Dont build i18n ([[phab:T235752|T235752]]) (duration: 00m 58s)
* 19:01 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 11:39 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|d60f332785868797e7ecc9b5e410616d5604b392}}: Undeploy getting started III: Dont set wmgUseGettingStarted, now ignored ([[phab:T235752|T235752]]) (duration: 00m 58s)
* 19:00 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 11:37 urbanecm@deploy1002: Synchronized wmf-config/CommonSettings.php: {{Gerrit|9eaf960c4b7c304be57dfc8d248aca0c6501d04c}}: Undeploy GettingStarted II: Dont load regardless of config ([[phab:T235752|T235752]]) (duration: 00m 58s)
* 18:55 nokafor@deploy1002: Finished deploy [analytics/refinery@91d0cf8] (thin): Regular analytics weekly train THIN [analytics/refinery@91d0cf8] (duration: 00m 08s)
* 11:35 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|1c7405ad1eb323a8da524819f17d6f1a66afaa57}}: Undeploy GettingStarted I: Disable on all wikis ([[phab:T235752|T235752]]) (duration: 00m 58s)
* 18:55 nokafor@deploy1002: Started deploy [analytics/refinery@91d0cf8] (thin): Regular analytics weekly train THIN [analytics/refinery@91d0cf8]
* 11:31 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:724992{{!}}Remove deprecated SectionTranslationTargetLanguage config (T290302)]] (duration: 00m 58s)
* 18:44 nokafor@deploy1002: Finished deploy [analytics/refinery@91d0cf8]: Regular analytics weekly train [analytics/refinery@91d0cf8] (duration: 05m 40s)
* 11:22 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:725042{{!}}Add wikisource-bot.toolforge.org to Commons copy upload list (T292213)]] (duration: 00m 59s)
* 18:38 nokafor@deploy1002: Started deploy [analytics/refinery@91d0cf8]: Regular analytics weekly train [analytics/refinery@91d0cf8]
* 11:16 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:720058{{!}}Add IA-Upload tool domains to Commons wgCopyUploadsDomains (T287241)]] (duration: 00m 59s)
* 14:56 Emperor: set thanos ring replicas to 3.75 [[phab:T311690|T311690]]
* 11:12 akosiaris@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mathoid' for release 'production' .
* 14:50 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/db-labs.php: Config: [[gerrit:833783{{!}}Pool deployment-db09, depool deployment-db08 (T318126)]] (Beta-only, exchange one replica for another) [*actually* sync it this time since I forgot to git rebase before the last sync 🤦] (duration: 03m 41s)
* 11:10 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mathoid' for release 'production' .
* 14:47 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 11:07 jiji@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:46 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 11:06 akosiaris@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'mathoid' for release 'staging' .
* 14:46 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 11:04 effie: depool  wtp1026 for tests
* 14:45 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 11:04 effie: pool  wtp1025
* 14:44 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/db-labs.php: Config: [[gerrit:833783{{!}}Pool deployment-db09, depool deployment-db08 (T318126)]] (Beta-only, exchange one replica for another) (duration: 03m 48s)
* 10:59 jiji@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:00 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 09:13 akosiaris: hbal -L -G row_C -X on ganeti01.svc.eqiad.wmnet
* 13:59 Lucas_WMDE: UTC afternoon backport+config window done
* 08:59 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@071f7c3] (eqiad): Increase mirrored traffic to 100% for eqiad (duration: 00m 54s)
* 13:59 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 08:58 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@071f7c3] (eqiad): Increase mirrored traffic to 100% for eqiad
* 13:59 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:37 joal@deploy1002: Finished deploy [analytics/refinery@38f3adc] (hadoop-test): Hotfix analytics deploy TEST [analytics/refinery@38f3adc] (duration: 06m 14s)
* 13:58 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:31 joal@deploy1002: Started deploy [analytics/refinery@38f3adc] (hadoop-test): Hotfix analytics deploy TEST [analytics/refinery@38f3adc]
* 13:57 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/db-labs.php: Config: [[gerrit:833776{{!}}Add back deployment-db08 (T318126)]] (Beta-only, restore old replica) (duration: 03m 48s)
* 07:30 joal@deploy1002: Finished deploy [analytics/refinery@38f3adc] (thin): Hotfix analytics deploy THIN [analytics/refinery@38f3adc] (duration: 00m 06s)
* 13:43 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:30 joal@deploy1002: Started deploy [analytics/refinery@38f3adc] (thin): Hotfix analytics deploy THIN [analytics/refinery@38f3adc]
* 13:42 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:29 joal@deploy1002: Finished deploy [analytics/refinery@38f3adc]: Hotfix analytics deploy [analytics/refinery@38f3adc] (duration: 19m 18s)
* 13:42 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:19 dcausse: restarting blazegraph on wdqs2001 & wdqs2004 (allocators burning too quickly)
* 13:37 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:18 elukey: depool + restart blazegraph + restart updater for wdqs1006
* 13:32 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:18 elukey@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=wdqs1006.wmnet
* 13:32 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/db-labs.php: Config: [[gerrit:833461{{!}}Replace deployment-db08 with deployment-db09 (T318126)]] (Beta-only, replace one replica with another) (duration: 03m 56s)
* 07:18 elukey@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=wdqs1004.wmnet
* 13:31 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:10 joal@deploy1002: Started deploy [analytics/refinery@38f3adc]: Hotfix analytics deploy [analytics/refinery@38f3adc]
* 13:31 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:02 godog: swift eqiad-prod: add weight to ms-be10[64-67] - [[phab:T290546|T290546]]
* 13:30 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 06:44 elukey: depool + restart blazegraph + restart updater on wdqs1004
* 13:20 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 05:50 ladsgroup@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 13:18 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:830817{{!}}Add editcontentmodel right for metawiki translation administrators (T311587)]] (duration: 03m 50s)
* 05:49 ladsgroup@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 13:17 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 05:47 ladsgroup@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'staging' .
* 13:17 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:16 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:11 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:10 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:10 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:09 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:830707{{!}}Disable wgParserEnableLegacyMediaDOM on enwikivoyage (T314318)]] (turning on new-style media output) (duration: 04m 03s)
* 13:09 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 08:25 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 08:22 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 08:22 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 08:21 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 08:19 jnuche@deploy1002: Synchronized php: group1 wikis to 1.40.0-wmf.2  refs [[phab:T314191|T314191]] (duration: 04m 02s)
* 08:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 08:15 jnuche@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.40.0-wmf.2  refs [[phab:T314191|T314191]]
* 08:15 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 08:15 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 08:14 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 08:07 hashar: Restarting Gerrit to clear stalled sockets in Zuul


== 2021-10-03 ==
== 2022-09-20 ==
* 14:45 _joe_: restarting acmechief on acmechief1001
* 20:19 cjming: end of UTC late backport window
* 12:55 kormat@cumin1001: dbctl commit (dc=all): 'Depool db1127, bad ram', diff saved to https://phabricator.wikimedia.org/P17414 and previous config saved to /var/cache/conftool/dbconfig/20211003-125530-kormat.json
* 20:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 08:24 elukey: powercycle cp5006 (unresponsive to ssh, remote tty available but not able to login as root, no prometheus metrics in hours)
* 20:13 cjming@deploy1002: Finished scap: Backport for [[gerrit:833435{{!}}Enable Nearby everywhere (T246493)]] (duration: 09m 02s)
* 08:23 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp5006.eqsin.wmnet
* 20:11 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:11 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:05 mforns@deploy1002: Finished deploy [analytics/refinery@62d8262] (thin): Regular analytics weekly train THIN [analytics/refinery@62d8262] (duration: 00m 07s)
* 20:05 mforns@deploy1002: Started deploy [analytics/refinery@62d8262] (thin): Regular analytics weekly train THIN [analytics/refinery@62d8262]
* 20:05 cjming@deploy1002: cjming and jdlrobson: Backport for [[gerrit:833435{{!}}Enable Nearby everywhere (T246493)]] synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug1002.eqiad.wmnet
* 20:04 mforns@deploy1002: Finished deploy [analytics/refinery@62d8262]: Regular analytics weekly train [analytics/refinery@62d8262] (duration: 08m 00s)
* 20:04 cjming@deploy1002: Started scap: Backport for [[gerrit:833435{{!}}Enable Nearby everywhere (T246493)]]
* 20:02 gmodena@deploy1002: helmfile [staging] DONE helmfile.d/services/eventstreams-internal: apply
* 20:02 gmodena@deploy1002: helmfile [staging] START helmfile.d/services/eventstreams-internal: apply
* 20:01 eileen: civicrm upgraded from {{Gerrit|e82d9cd0}} to {{Gerrit|dcef393d}}
* 19:56 mforns@deploy1002: Started deploy [analytics/refinery@62d8262]: Regular analytics weekly train [analytics/refinery@62d8262]
* 19:05 bking@cumin2002: START - Cookbook sre.wdqs.data-reload
* 18:50 jynus: restart db2100:s7 to apply new config
* 18:48 tchin@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: apply
* 18:47 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99)
* 18:47 bking@cumin2002: START - Cookbook sre.wdqs.data-reload
* 18:47 tchin@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-main: apply
* 18:47 tchin@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-main: apply
* 18:46 tchin@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-main: apply
* 18:46 tchin@deploy1002: helmfile [staging] DONE helmfile.d/services/eventgate-main: apply
* 18:45 cstone: payments-wiki upgraded from {{Gerrit|de4b2bb9}} to {{Gerrit|0456850e}}
* 18:45 tchin@deploy1002: helmfile [staging] START helmfile.d/services/eventgate-main: apply
* 18:44 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 18:40 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 18:40 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 18:39 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 18:36 dancy@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.40.0-wmf.2  refs [[phab:T314191|T314191]]
* 18:33 tchin@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics-external: apply
* 18:33 tchin@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-analytics-external: apply
* 18:32 tchin@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply
* 18:31 tchin@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply
* 18:31 tchin@deploy1002: helmfile [staging] DONE helmfile.d/services/eventgate-analytics-external: apply
* 18:30 tchin@deploy1002: helmfile [staging] START helmfile.d/services/eventgate-analytics-external: apply
* 18:29 tchin@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: apply
* 18:28 tchin@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-analytics: apply
* 18:28 tchin@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: apply
* 18:27 tchin@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-analytics: apply
* 18:27 tchin@deploy1002: helmfile [staging] DONE helmfile.d/services/eventgate-analytics: apply
* 18:26 tchin@deploy1002: helmfile [staging] START helmfile.d/services/eventgate-analytics: apply
* 18:23 tchin@deploy1002: helmfile [eqiad] DONE helmfile.d/services/eventgate-logging-external: apply
* 18:22 tchin@deploy1002: helmfile [eqiad] START helmfile.d/services/eventgate-logging-external: apply
* 18:22 tchin@deploy1002: helmfile [codfw] DONE helmfile.d/services/eventgate-logging-external: apply
* 18:21 tchin@deploy1002: helmfile [codfw] START helmfile.d/services/eventgate-logging-external: apply
* 18:20 tchin@deploy1002: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
* 18:19 tchin@deploy1002: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
* 16:42 dancy@deploy1002: Sync cancelled.
* 16:42 dancy@deploy1002: dancy: testing, disregard synced to the testservers: mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet
* 16:41 dancy@deploy1002: Started scap: testing, disregard
* 16:09 awight@deploy1002: backport aborted:  (duration: 00m 33s)
* 16:04 awight@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:833411{{!}}Disable Tech Wishes survey on dewiki (T316676)]] (take 2) (duration: 03m 42s)
* 15:55 awight@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:833411{{!}}Disable Tech Wishes survey on dewiki (T316676)]] (duration: 03m 53s)
* 14:16 jbond@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts sretest1002.eqiad.wmnet
* 14:10 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1002.eqiad.wmnet
* 14:00 nokafor@deploy1002: Finished deploy [airflow-dags/analytics@1a7c3b9]: (no justification provided) (duration: 00m 15s)
* 14:00 nokafor@deploy1002: Started deploy [airflow-dags/analytics@1a7c3b9]: (no justification provided)
* 13:50 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depool db1189', diff saved to https://phabricator.wikimedia.org/P34884 and previous config saved to /var/cache/conftool/dbconfig/20220920-135006-ladsgroup.json
* 13:46 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:45 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:45 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:44 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:43 urbanecm@deploy1002: Synchronized php-1.40.0-wmf.2/extensions/GrowthExperiments/extension.json: {{Gerrit|1ac09d4709c645558f644a885fadc49c05cc04b9}}: Update HomepageModule schema version ([[phab:T310320|T310320]]) (duration: 03m 39s)
* 13:39 urbanecm@deploy1002: Synchronized php-1.40.0-wmf.1/extensions/GrowthExperiments/extension.json: {{Gerrit|1a27e05a7ca53a063d5f9e284d6a09546ac8691c}}: Update HomepageModule schema version ([[phab:T310320|T310320]]) (duration: 03m 52s)
* 13:39 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:38 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:38 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:37 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:25 nokafor@deploy1002: Finished deploy [airflow-dags/analytics@0e9fb6b]: (no justification provided) (duration: 00m 11s)
* 13:25 nokafor@deploy1002: Started deploy [airflow-dags/analytics@0e9fb6b]: (no justification provided)
* 13:17 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:16 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:16 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:09 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:08 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|0b55db6f80df5f4c89f969332a6b31077a7172c4}}: Enable Tech Wishes survey on dewiki ([[phab:T316676|T316676]]) (duration: 04m 12s)
* 09:58 jbond@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts sretest1002.eqiad.wmnet
* 09:27 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1002.eqiad.wmnet
* 08:46 awight@deploy1002: Finished deploy [kartotherian/deploy@4759a78]: Merge "Update kartotherian to e3f3854" (duration: 02m 27s)
* 08:43 awight@deploy1002: Started deploy [kartotherian/deploy@4759a78]: Merge "Update kartotherian to e3f3854"
* 08:35 hashar: Restarted CI Jenkins for plugin update
* 08:33 jbond@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts sretest1002.eqiad.wmnet
* 08:33 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1002.eqiad.wmnet
* 07:18 kartik@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:832993{{!}}testwiki: Enable Section Translation on haw, la, ps and, xh Wikipedias (T317289)]] (duration: 03m 46s)
* 07:15 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:13 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:10 kart_: Updated cxserver to 2022-09-15-113346-production ([[phab:T317289|T317289]], [[phab:T315209|T315209]])
* 07:08 kartik@deploy1002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 07:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:07 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:07 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:07 kartik@deploy1002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 07:06 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:06 kartik@deploy1002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 07:05 kartik@deploy1002: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 07:03 kartik@deploy1002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 07:02 kartik@deploy1002: helmfile [staging] START helmfile.d/services/cxserver: apply
* 04:09 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 04:03 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 04:03 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 03:56 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 03:51 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 03:45 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 03:45 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 03:40 mwpresync@deploy1002: Pruned MediaWiki: 1.39.0-wmf.28 (duration: 02m 02s)
* 03:38 mwpresync@deploy1002: Finished scap: testwikis wikis to 1.40.0-wmf.2  refs [[phab:T314191|T314191]] (duration: 36m 08s)
* 03:38 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 03:07 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 03:06 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 03:06 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 03:05 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 03:02 mwpresync@deploy1002: Started scap: testwikis wikis to 1.40.0-wmf.2  refs [[phab:T314191|T314191]]
* 02:42 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99)
* 02:35 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 02:34 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 02:34 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 02:34 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 02:08 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 02:08 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 02:08 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 02:07 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply


== 2021-10-02 ==
== 2022-09-19 ==
* 17:28 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 22:59 ebernhardson: [[phab:T317200|T317200]] start cirrussearch in-place reindex process for eqiad, codfw and cloudelastic
* 16:10 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 21:21 maryum: Deployed security patch for [[phab:T302479|T302479]]
* 21:21 mstyles@deploy1002: Synchronized php-1.40.0-wmf.1/extensions/Translate/src/: (no justification provided) (duration: 03m 40s)
* 21:15 sbassett: Deployed security patch for [[phab:T312820|T312820]]
* 21:03 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 21:03 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 21:03 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 21:00 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:59 cjming: end of UTC late backport window
* 20:59 ebernhardson@deploy1002: Synchronized php-1.40.0-wmf.1/extensions/CirrusSearch/includes/Maintenance/MappingConfigBuilder.php: Backport: [[gerrit:833031{{!}}Add token_count subfield to outgoing_link (T317546)]] (duration: 03m 51s)
* 20:55 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:54 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:54 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:51 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:31 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:30 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:30 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:27 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:22 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:21 jforrester@deploy1002: Synchronized wmf-config/CommonSettings.php: Config: [[gerrit:820459{{!}}Wikifunctions: Drop two config items moved to docker]] (duration: 03m 38s)
* 20:21 bking@cumin2002: START - Cookbook sre.wdqs.data-reload
* 20:20 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:20 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:17 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:16 jforrester@deploy1002: Synchronized wmf-config/CommonSettings.php: Config: [[gerrit:829877{{!}}ExtensionDistributor: Add REL1_39 (T313925)]] (duration: 03m 38s)
* 20:12 cjming@deploy1002: Finished scap: Backport for [[gerrit:832715{{!}}Disable wgParserEnableLegacyMediaDOM on cswiki (T314318)]] (duration: 06m 31s)
* 20:12 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:11 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:11 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:06 cjming@deploy1002: cjming and arlolra: Backport for [[gerrit:832715{{!}}Disable wgParserEnableLegacyMediaDOM on cswiki (T314318)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet
* 20:06 cjming@deploy1002: Started scap: Backport for [[gerrit:832715{{!}}Disable wgParserEnableLegacyMediaDOM on cswiki (T314318)]]
* 19:33 bking@cumin2002: END (ERROR) - Cookbook sre.wdqs.data-reload (exit_code=97)
* 19:33 bking@cumin2002: START - Cookbook sre.wdqs.data-reload
* 19:33 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99)
* 19:30 bking@cumin2002: START - Cookbook sre.wdqs.data-reload
* 19:30 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99)
* 19:30 bking@cumin2002: START - Cookbook sre.wdqs.data-reload
* 17:43 dancy@deploy1002: Installation of scap version "4.21.0" completed for 561 hosts
* 17:42 dancy@deploy1002: Installing scap version "4.21.0" for 561 hosts
* 17:36 dancy@deploy1002: Sync cancelled.
* 17:36 dancy@deploy1002: dancy: testing, disregard synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 17:36 dancy@deploy1002: Started scap: testing, disregard
* 14:03 urbanecm: Purge https://en.wikipedia.org/static/images/project-logos/ukwikivoyage<nowiki>{</nowiki>.png,-1.5x.png,-2x.png<nowiki>}</nowiki> ([[phab:T317718|T317718]])
* 14:02 urbanecm@deploy1002: Synchronized static/images/project-logos/: {{Gerrit|6c7151d969b6997bd9cce042b7bc78c282dd9b26}}: Regenerate ukwikivoyage logo ([[phab:T317718|T317718]]) (duration: 03m 46s)
* 14:00 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:59 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:59 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:58 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:18 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:17 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|cbf161d148228e0e706813f923ab1a5d4b42757a}}: GrowthExperiments: Enable image recommendations for el/pl/zh/id/ro ([[phab:T314518|T314518]]) (duration: 04m 01s)
* 13:14 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:14 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:30 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:26 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:26 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:22 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:16 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|4a6c1ddf5cd1a46ab05f5d6fda4b938a3ee37238}}: Remove unnecessary wgNamespaceAliases from bnwiki ([[phab:T318003|T318003]]) (duration: 04m 16s)
* 07:12 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:11 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:11 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:10 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply


== 2021-10-01 ==
== 2022-09-17 ==
* 23:19 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 12:17 Emperor: set thanos ring replicas to 3.80 [[phab:T311690|T311690]]
* 22:27 mutante: puppetmaster2001 - systemctl reset-failed
* 10:39 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34879 and previous config saved to /var/cache/conftool/dbconfig/20220917-103903-ladsgroup.json
* 22:16 mutante: puppetmaster2001 systemctl disable geoip_update_ipinfo.timer
* 10:23 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105', diff saved to https://phabricator.wikimedia.org/P34878 and previous config saved to /var/cache/conftool/dbconfig/20220917-102356-ladsgroup.json
* 22:15 mutante: puppetmaster2001 - sudo /usr/local/bin/geoipupdate_job after adding new shell command and timer - succesfully downloaded enterprise database for [[phab:T288844|T288844]]
* 10:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105', diff saved to https://phabricator.wikimedia.org/P34877 and previous config saved to /var/cache/conftool/dbconfig/20220917-100850-ladsgroup.json
* 21:56 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 09:53 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2105 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34876 and previous config saved to /var/cache/conftool/dbconfig/20220917-095344-ladsgroup.json
* 21:44 mutante: puppetmasters - temp. disabling puppet one more time, now for a different deploy, to fetch an additional MaxMind database - [[phab:T288844|T288844]]
* 09:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34875 and previous config saved to /var/cache/conftool/dbconfig/20220917-094856-ladsgroup.json
* 21:19 mutante: puppetmaster2001 - puppet removed cron sync_volatile and cron sync_ca - starting and verifying new timers: 'systemctl status sync-puppet-volatile', 'systemctl status sync-puppet-ca' [[phab:T273673|T273673]]
* 09:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P34874 and previous config saved to /var/cache/conftool/dbconfig/20220917-093349-ladsgroup.json
* 21:12 mutante: puppetmaster1002, puppetmaster1003, puppetmaster2002, puppetmaster2003: re-enabled puppet, they are backends. backends don't have the sync cron/job/timer, so noop as well, just like 1004/1005/2004/2005. this just leaves the actual change on 2001  - [[phab:T273673|T273673]]
* 09:18 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127', diff saved to https://phabricator.wikimedia.org/P34873 and previous config saved to /var/cache/conftool/dbconfig/20220917-091843-ladsgroup.json
* 21:07 mutante: puppetmaster1004, puppetmaster1005, puppetmaster2004, puppetmaster2005: re-enabled puppet, they are "insetup" role
* 09:03 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1127 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34872 and previous config saved to /var/cache/conftool/dbconfig/20220917-090336-ladsgroup.json
* 21:06 mbsantos@deploy1002: Finished deploy [kartotherian/deploy@d309a6e] (eqiad): tegola: reduce load to 50% during the weekend (duration: 00m 54s)
* 07:48 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2129 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34871 and previous config saved to /var/cache/conftool/dbconfig/20220917-074806-ladsgroup.json
* 21:05 mbsantos@deploy1002: Started deploy [kartotherian/deploy@d309a6e] (eqiad): tegola: reduce load to 50% during the weekend
* 07:33 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2129', diff saved to https://phabricator.wikimedia.org/P34870 and previous config saved to /var/cache/conftool/dbconfig/20220917-073300-ladsgroup.json
* 21:05 mutante: puppetmaster1001 - re-enabled puppet, noop as expected, the passive host pulls from the active one, so only 2001 has the cron/job/timer
* 07:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2129', diff saved to https://phabricator.wikimedia.org/P34869 and previous config saved to /var/cache/conftool/dbconfig/20220917-071753-ladsgroup.json
* 21:05 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 07:02 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2129 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34868 and previous config saved to /var/cache/conftool/dbconfig/20220917-070247-ladsgroup.json
* 21:02 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 05:17 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2105 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34867 and previous config saved to /var/cache/conftool/dbconfig/20220917-051719-ladsgroup.json
* 21:01 legoktm@deploy1002: Synchronized wmf-config/CommonSettings.php: Revert "Have PdfHandler use Shellbox on Commons for 10% of requests" (duration: 00m 59s)
* 05:17 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2105.codfw.wmnet with reason: Maintenance
* 20:58 mutante: temp disabling puppet on puppetmasters - deploying gerrit:724115 (gerrit:723310) [[phab:T273673|T273673]]
* 05:17 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2105.codfw.wmnet with reason: Maintenance
* 18:58 robh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-db1002.eqiad.wmnet with reason: REIMAGE
* 05:15 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2129 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34866 and previous config saved to /var/cache/conftool/dbconfig/20220917-051527-ladsgroup.json
* 18:56 robh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-db1001.eqiad.wmnet with reason: REIMAGE
* 05:15 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2129.codfw.wmnet with reason: Maintenance
* 18:55 robh@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on an-db1002.eqiad.wmnet with reason: REIMAGE
* 05:15 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db2129.codfw.wmnet with reason: Maintenance
* 18:53 robh@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on an-db1001.eqiad.wmnet with reason: REIMAGE
* 05:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1127 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34865 and previous config saved to /var/cache/conftool/dbconfig/20220917-051203-ladsgroup.json
* 18:07 robh@cumin1001: END (ERROR) - Cookbook sre.experimental.reimage (exit_code=97) for host an-db1001.eqiad.wmnet
* 05:11 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1127.eqiad.wmnet with reason: Maintenance
* 18:05 robh@cumin1001: START - Cookbook sre.experimental.reimage for host an-db1001.eqiad.wmnet
* 05:11 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1127.eqiad.wmnet with reason: Maintenance
* 17:58 effie: depool mw1025, mw1319, mw1312 for test
* 16:20 dancy: testing upcoming Scap 4.0.2 release on beta
* 14:04 bblack: C:envoyproxy (appservers and others): restarting envoyproxy
* 14:04 bblack: C:envoyproxy (appservers and others): ca-certificates updated via cumin to workaround [[phab:T292291|T292291]] issues
* 13:45 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 13:45 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 13:23 bblack: manually trying LE expired root workaround on mwdebug1001 with puppet disabled ...
* 13:12 gehel@cumin1001: START - Cookbook sre.wdqs.data-reload
* 13:11 gehel@cumin1001: END (ERROR) - Cookbook sre.wdqs.data-reload (exit_code=97)
* 13:11 gehel@cumin1001: START - Cookbook sre.wdqs.data-reload
* 13:10 gehel@cumin1001: START - Cookbook sre.wdqs.data-reload
* 11:42 jgiannelos@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 11:11 jynus: manually migrating some vms out of ganeti1009 to avoid excessive memory pressure
* 10:58 marostegui@cumin1001: dbctl commit (dc=all): 'db1164 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17413 and previous config saved to /var/cache/conftool/dbconfig/20211001-105849-root.json
* 10:57 marostegui@cumin1001: dbctl commit (dc=all): 'db1177 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17412 and previous config saved to /var/cache/conftool/dbconfig/20211001-105735-root.json
* 10:43 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@d4caf6d] (eqiad): Increase mirrored traffic to 100% for eqiad (duration: 00m 49s)
* 10:43 marostegui@cumin1001: dbctl commit (dc=all): 'db1164 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17411 and previous config saved to /var/cache/conftool/dbconfig/20211001-104345-root.json
* 10:43 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@d4caf6d] (eqiad): Increase mirrored traffic to 100% for eqiad
* 10:42 marostegui@cumin1001: dbctl commit (dc=all): 'db1177 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17410 and previous config saved to /var/cache/conftool/dbconfig/20211001-104232-root.json
* 10:28 marostegui@cumin1001: dbctl commit (dc=all): 'db1164 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17409 and previous config saved to /var/cache/conftool/dbconfig/20211001-102841-root.json
* 10:27 marostegui@cumin1001: dbctl commit (dc=all): 'db1177 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17408 and previous config saved to /var/cache/conftool/dbconfig/20211001-102728-root.json
* 10:13 marostegui@cumin1001: dbctl commit (dc=all): 'db1164 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17407 and previous config saved to /var/cache/conftool/dbconfig/20211001-101338-root.json
* 10:12 marostegui@cumin1001: dbctl commit (dc=all): 'db1177 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17406 and previous config saved to /var/cache/conftool/dbconfig/20211001-101224-root.json
* 10:00 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@c123ab9] (eqiad): Increase mirrored traffic to 80% for eqiad (duration: 00m 51s)
* 09:59 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@c123ab9] (eqiad): Increase mirrored traffic to 80% for eqiad
* 09:58 marostegui@cumin1001: dbctl commit (dc=all): 'db1164 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17405 and previous config saved to /var/cache/conftool/dbconfig/20211001-095834-root.json
* 09:57 marostegui@cumin1001: dbctl commit (dc=all): 'db1177 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17404 and previous config saved to /var/cache/conftool/dbconfig/20211001-095720-root.json
* 09:55 marostegui: Upgrade db1164 and db1177
* 09:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1177 and db1164 for upgrade', diff saved to https://phabricator.wikimedia.org/P17403 and previous config saved to /var/cache/conftool/dbconfig/20211001-095433-marostegui.json
* 09:49 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17402 and previous config saved to /var/cache/conftool/dbconfig/20211001-094913-root.json
* 09:49 marostegui@cumin1001: dbctl commit (dc=all): 'db1135 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17401 and previous config saved to /var/cache/conftool/dbconfig/20211001-094902-root.json
* 09:38 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/GrowthExperiments/maintenance/updateMenteeData.php --wiki=frwiki --force # to get an idea about timing for [[phab:T290609|T290609]], runs in a tmux session under my account
* 09:34 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17400 and previous config saved to /var/cache/conftool/dbconfig/20211001-093410-root.json
* 09:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1135 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17399 and previous config saved to /var/cache/conftool/dbconfig/20211001-093358-root.json
* 09:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host testvm2002.codfw.wmnet
* 09:19 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17398 and previous config saved to /var/cache/conftool/dbconfig/20211001-091906-root.json
* 09:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1135 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17397 and previous config saved to /var/cache/conftool/dbconfig/20211001-091854-root.json
* 09:04 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17396 and previous config saved to /var/cache/conftool/dbconfig/20211001-090402-root.json
* 09:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1135 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17395 and previous config saved to /var/cache/conftool/dbconfig/20211001-090351-root.json
* 09:02 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2002.codfw.wmnet
* 09:00 _joe_: restarting pybal low-traffic in eqiad to pick up the drop of proxyfetch to kubernetes services
* 08:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17394 and previous config saved to /var/cache/conftool/dbconfig/20211001-084859-root.json
* 08:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1135 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17393 and previous config saved to /var/cache/conftool/dbconfig/20211001-084847-root.json
* 08:44 marostegui: Upgrade db1135 and db1172
* 08:44 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1172 for upgrade', diff saved to https://phabricator.wikimedia.org/P17392 and previous config saved to /var/cache/conftool/dbconfig/20211001-084435-marostegui.json
* 08:44 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1135 for upgrade', diff saved to https://phabricator.wikimedia.org/P17391 and previous config saved to /var/cache/conftool/dbconfig/20211001-084411-marostegui.json
* 08:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db2080 [[phab:T290868|T290868]]', diff saved to https://phabricator.wikimedia.org/P17390 and previous config saved to /var/cache/conftool/dbconfig/20211001-084345-marostegui.json
* 08:17 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2026.codfw.wmnet
* 08:15 _joe_: restarting pybal in codfw to pick up config changes
* 08:09 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2026.codfw.wmnet
* 08:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on testvm[2001,2003].codfw.wmnet with reason: Ganeti tests
* 08:03 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 8:00:00 on testvm[2001,2003].codfw.wmnet with reason: Ganeti tests
* 06:28 marostegui@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17388 and previous config saved to /var/cache/conftool/dbconfig/20211001-062846-root.json
* 06:27 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 06:24 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17387 and previous config saved to /var/cache/conftool/dbconfig/20211001-062453-root.json
* 06:13 marostegui@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17386 and previous config saved to /var/cache/conftool/dbconfig/20211001-061342-root.json
* 06:13 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 06:09 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17385 and previous config saved to /var/cache/conftool/dbconfig/20211001-060949-root.json
* 05:58 marostegui@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17384 and previous config saved to /var/cache/conftool/dbconfig/20211001-055838-root.json
* 05:54 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17383 and previous config saved to /var/cache/conftool/dbconfig/20211001-055445-root.json
* 05:43 marostegui@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17382 and previous config saved to /var/cache/conftool/dbconfig/20211001-054335-root.json
* 05:39 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17381 and previous config saved to /var/cache/conftool/dbconfig/20211001-053942-root.json
* 05:28 marostegui@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17380 and previous config saved to /var/cache/conftool/dbconfig/20211001-052831-root.json
* 05:26 marostegui: Upgrade db1114
* 05:25 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1114 for upgrade', diff saved to https://phabricator.wikimedia.org/P17379 and previous config saved to /var/cache/conftool/dbconfig/20211001-052509-marostegui.json
* 05:24 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17378 and previous config saved to /var/cache/conftool/dbconfig/20211001-052438-root.json
* 05:22 marostegui: Upgrade db1119
* 05:21 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1119 for upgrade', diff saved to https://phabricator.wikimedia.org/P17377 and previous config saved to /var/cache/conftool/dbconfig/20211001-052133-marostegui.json
* 04:00 legoktm@deploy1002: Synchronized wmf-config/CommonSettings.php: Have PdfHandler use Shellbox on Commons for 10% of requests ([[phab:T289228|T289228]]) (duration: 00m 59s)
* 04:00 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 03:57 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 03:24 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 03:15 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .


== 2021-09-30 ==
== 2022-09-16 ==
* 23:57 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:29 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 23:54 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:29 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 23:51 reedy@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Put a https protocol into values (duration: 01m 00s)
* 21:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34864 and previous config saved to /var/cache/conftool/dbconfig/20220916-212905-ladsgroup.json
* 23:48 dpifke@deploy1002: Finished deploy [statsv/statsv@afeff42]: Deploy statsv with Kafka TLS support (not yet enabled) [[phab:T290131|T290131]] (duration: 00m 05s)
* 21:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P34863 and previous config saved to /var/cache/conftool/dbconfig/20220916-211358-ladsgroup.json
* 23:48 dpifke@deploy1002: Started deploy [statsv/statsv@afeff42]: Deploy statsv with Kafka TLS support (not yet enabled) [[phab:T290131|T290131]]
* 20:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P34862 and previous config saved to /var/cache/conftool/dbconfig/20220916-205852-ladsgroup.json
* 23:41 dpifke@deploy1002: Finished deploy [performance/coal@1be49f8]: Deploy Coal with Kafka TLS support (not yet enabled) [[phab:T290131|T290131]] (duration: 01m 07s)
* 20:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1198 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34861 and previous config saved to /var/cache/conftool/dbconfig/20220916-204345-ladsgroup.json
* 23:40 dpifke@deploy1002: Started deploy [performance/coal@1be49f8]: Deploy Coal with Kafka TLS support (not yet enabled) [[phab:T290131|T290131]]
* 19:16 mutante: cp1081 /usr/local/sbin/update-ocsp-all
* 23:39 dpifke@deploy1002: Finished deploy [performance/navtiming@29264fb]: Deploy Navtiming with Kafka TLS support (not yet enabled) [[phab:T290131|T290131]] (duration: 00m 05s)
* 17:01 mutante: gitlab-runner*: deployed gerrit:832584 and systemctl restart buildkitd on 6 hosts for [[phab:T317904|T317904]]
* 23:39 dpifke@deploy1002: Started deploy [performance/navtiming@29264fb]: Deploy Navtiming with Kafka TLS support (not yet enabled) [[phab:T290131|T290131]]
* 16:56 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2183.mgmt.codfw.wmnet with reboot policy FORCED
* 23:34 ejegg: updated Fundraising CiviCRM from {{Gerrit|d4da344274}} to {{Gerrit|d74e9aa0a1}}
* 16:55 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host db2183.mgmt.codfw.wmnet with reboot policy FORCED
* 22:09 legoktm@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'shellbox-media' for release 'main' .
* 16:55 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2183.mgmt.codfw.wmnet with reboot policy FORCED
* 22:07 legoktm@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'shellbox-media' for release 'main' .
* 16:53 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host db2183.mgmt.codfw.wmnet with reboot policy FORCED
* 22:06 legoktm@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'shellbox-media' for release 'main' .
* 16:53 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2183.mgmt.codfw.wmnet with reboot policy FORCED
* 21:53 legoktm@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'shellbox-timeline' for release 'main' .
* 16:46 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host db2183.mgmt.codfw.wmnet with reboot policy FORCED
* 21:06 eileen: civicrm revision changed from {{Gerrit|2ecb8f0bcd}} to {{Gerrit|d4da344274}}, config revision is {{Gerrit|77cb7ec866}}
* 16:45 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:54 ryankemper: [WCQS] `ryankemper@wcqs1003:~$ sudo pool` (merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/725110 to unbreak readiness probe)
* 16:43 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 20:54 topranks: Routinator on rpki1001 upgraded to  0.10.0 and working again after force refresh.
* 16:42 pt1979@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db2184
* 20:49 brennen: gitlab1001: upgrade to 14.2.5 complete
* 16:42 pt1979@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host db2184
* 20:32 brennen: gitlab2001, gitlab1001: downtime for upgrades to 14.2.5
* 16:42 pt1979@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db2183
* 20:18 ryankemper: [WCQS] `ryankemper@wcqs1003:~$ sudo depool` (not sure why pybal can't depool it, the other 2 servers are pooled)
* 16:41 pt1979@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host db2183
* 19:51 topranks: Updating routinator on rpki1001 [[phab:T291543|T291543]]
* 16:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1198 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34860 and previous config saved to /var/cache/conftool/dbconfig/20220916-161409-ladsgroup.json
* 19:39 legoktm@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'shellbox-timeline' for release 'main' .
* 16:14 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1198.eqiad.wmnet with reason: Maintenance
* 19:38 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:13 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1198.eqiad.wmnet with reason: Maintenance
* 19:37 legoktm@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'shellbox-timeline' for release 'main' .
* 16:13 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34859 and previous config saved to /var/cache/conftool/dbconfig/20220916-161346-ladsgroup.json
* 19:36 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P34858 and previous config saved to /var/cache/conftool/dbconfig/20220916-155840-ladsgroup.json
* 19:26 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:52 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 19:21 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:52 dancy@deploy1002: Installation of scap version "4.20.0" completed for 561 hosts
* 19:14 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.38.0-wmf.2  refs [[phab:T281166|T281166]]
* 15:51 dancy@deploy1002: Installing scap version "4.20.0" for 561 hosts
* 19:11 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:51 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 19:08 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:44 dancy@deploy1002: Finished scap: testing (duration: 04m 53s)
* 19:07 thcipriani@deploy1002: Synchronized php-1.38.0-wmf.2/extensions/MobileFrontend: Backport: [[gerrit:724979{{!}}Fix search within pages alignment (T292107)]] (duration: 01m 09s)
* 15:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P34857 and previous config saved to /var/cache/conftool/dbconfig/20220916-154333-ladsgroup.json
* 19:05 thcipriani@deploy1002: Synchronized php-1.38.0-wmf.1/extensions/EventBus/includes/EventBus.php: Backport: [[gerrit:724481{{!}}Guard against undefined index notice when setting x-client-ip (T288853)]] (duration: 01m 09s)
* 15:39 dancy@deploy1002: Started scap: testing
* 19:04 thcipriani@deploy1002: Synchronized php-1.38.0-wmf.2/extensions/EventBus/includes/EventBus.php: Backport: [[gerrit:724480{{!}}Guard against undefined index notice when setting x-client-ip (T288853)]] (duration: 01m 09s)
* 15:28 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1189 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34856 and previous config saved to /var/cache/conftool/dbconfig/20220916-152827-ladsgroup.json
* 18:58 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:06 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 18:58 thcipriani@deploy1002: Synchronized php-1.38.0-wmf.2/skins/Vector/resources/skins.vector.styles.legacy/components/MenuDropdown.less: Backport: [[gerrit:724798{{!}}Restore original more menu padding in legacy Vector (T289163)]] (duration: 01m 08s)
* 15:05 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 18:54 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:05 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 18:44 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:04 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 18:43 thcipriani@deploy1002: Scap failed!: Call to mwscript eval.php stderr: not empty
* 15:03 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 18:42 moritzm: imported gitlab 14.2.5 to thirdparty/gitlab [[phab:T292219|T292219]]
* 15:02 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 18:41 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:02 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 18:38 thcipriani@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:704167{{!}}Use Wikimania's logo in a new vector (T286405)]] Part III (duration: 01m 07s)
* 15:01 jbond@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts sretest1002.eqiad.wmnet
* 18:37 thcipriani@deploy1002: Synchronized static/images/mobile/copyright/wikimania-wordmark.svg: Config: [[gerrit:704167{{!}}Use Wikimania's logo in a new vector (T286405)]] Part II (duration: 01m 07s)
* 15:01 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1002.eqiad.wmnet
* 18:35 thcipriani@deploy1002: Synchronized static/images/mobile/copyright/wikimania.svg: Config: [[gerrit:704167{{!}}Use Wikimania's logo in a new vector (T286405)]] part I (duration: 01m 07s)
* 15:01 jbond@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts sretest1002.eqiad.wmnet
* 18:32 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:58 elukey@deploy1002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 18:31 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:58 elukey@deploy1002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 18:31 thcipriani@deploy1002: Synchronized wmf-config: Config: [[gerrit:724514{{!}}Enable sticky header on beta cluster (T289721)]] (duration: 01m 08s)
* 14:57 elukey@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 18:29 cmooney@cumin1001: START - Cookbook sre.dns.netbox
* 14:57 elukey@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 18:28 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:48 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 18:27 otto@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts thorium.eqiad.wmnet
* 14:47 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 18:22 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 14:45 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 18:20 thcipriani@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:724861{{!}}Disable legacy media dom on a few more wikis (T51097)]] (duration: 01m 08s)
* 14:45 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 18:07 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 14:42 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 17:49 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 14:39 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 17:49 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 14:23 elukey@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 17:49 otto@cumin1001: START - Cookbook sre.hosts.decommission for hosts thorium.eqiad.wmnet
* 14:22 elukey@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 17:42 bstorm: updating packages for thirdparty/kubeadm-k8s-1-20 and thirdparty/kubeadm-k8s-1-19 in stretch-wikimedia on apt1001 [[phab:T292131|T292131]]
* 14:22 elukey@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 17:09 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@8fbf87c] (eqiad): Increase mirrored traffic to 50% for eqiad (duration: 00m 55s)
* 14:17 godog: add 100G to prometheus/eqiad instance k8s-mlserve
* 17:08 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@8fbf87c] (eqiad): Increase mirrored traffic to 50% for eqiad
* 13:54 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 17:03 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@8fbf87c] (eqiad): Increase mirrored traffic to 50% for eqiad (duration: 00m 08s)
* 13:54 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 17:02 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@8fbf87c] (eqiad): Increase mirrored traffic to 50% for eqiad
* 13:52 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 17:02 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 13:52 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 17:00 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@8fbf87c] (eqiad): Increase mirrored traffic to 50% for eqiad (duration: 00m 11s)
* 13:51 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 17:00 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@8fbf87c] (eqiad): Increase mirrored traffic to 50% for eqiad
* 13:51 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 16:49 sukhe: restart dnsdist.service on doh[1001-1002,2001-2002,3001-3002,4001-4002,5001-5002].wikimedia.org
* 13:50 elukey@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 16:43 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@67a4d22]: Increase mirrored traffic to 10% (duration: 02m 33s)
* 13:50 elukey@deploy1002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 16:40 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@67a4d22]: Increase mirrored traffic to 10%
* 13:49 elukey@deploy1002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 16:38 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@67a4d22] (eqiad): Increase mirrored traffic to 10% (duration: 00m 40s)
* 13:19 marostegui@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34855 and previous config saved to /var/cache/conftool/dbconfig/20220916-131902-root.json
* 16:37 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@67a4d22] (eqiad): Increase mirrored traffic to 10%
* 13:04 marostegui@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34854 and previous config saved to /var/cache/conftool/dbconfig/20220916-130357-root.json
* 16:37 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:58 marostegui@cumin1001: dbctl commit (dc=all): 'db1134 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34853 and previous config saved to /var/cache/conftool/dbconfig/20220916-125841-root.json
* 16:33 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34852 and previous config saved to /var/cache/conftool/dbconfig/20220916-124850-root.json
* 16:32 hnowlan: Ran `GRANT pg_monitor TO prometheus` for maps in eqiad and codfw to fix empty prometheus connection metrics
* 12:43 marostegui@cumin1001: dbctl commit (dc=all): 'db1134 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34851 and previous config saved to /var/cache/conftool/dbconfig/20220916-124336-root.json
* 16:30 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@67a4d22] (eqiad): Increase mirrored traffic to 10% (duration: 00m 16s)
* 12:43 jbond@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts sretest1002.eqiad.wmnet
* 16:30 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@67a4d22] (eqiad): Increase mirrored traffic to 10%
* 12:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34850 and previous config saved to /var/cache/conftool/dbconfig/20220916-123346-root.json
* 16:24 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:28 marostegui@cumin1001: dbctl commit (dc=all): 'db1134 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34849 and previous config saved to /var/cache/conftool/dbconfig/20220916-122831-root.json
* 16:20 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34848 and previous config saved to /var/cache/conftool/dbconfig/20220916-121841-root.json
* 16:11 ladsgroup@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:725032{{!}}Disable jQuery migrate in metawiki (T280944)]] (duration: 01m 09s)
* 12:13 marostegui@cumin1001: dbctl commit (dc=all): 'db1134 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34847 and previous config saved to /var/cache/conftool/dbconfig/20220916-121326-root.json
* 16:08 ladsgroup@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:725019{{!}}Enable dispatching via job to 10 prod wikis]] (duration: 01m 09s)
* 12:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34846 and previous config saved to /var/cache/conftool/dbconfig/20220916-120336-root.json
* 15:55 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 11:58 marostegui@cumin1001: dbctl commit (dc=all): 'db1134 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34845 and previous config saved to /var/cache/conftool/dbconfig/20220916-115821-root.json
* 14:36 elukey: drop /etc/helmfile-defaults/private/backup_old_paths from deploy1002 (old data not needed anymore)
* 11:49 marostegui@cumin1001: dbctl commit (dc=all): 'db2180 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34844 and previous config saved to /var/cache/conftool/dbconfig/20220916-114935-root.json
* 14:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1111 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17374 and previous config saved to /var/cache/conftool/dbconfig/20210930-143325-root.json
* 11:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 3%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34843 and previous config saved to /var/cache/conftool/dbconfig/20220916-114831-root.json
* 14:30 marostegui@cumin1001: dbctl commit (dc=all): 'db1134 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17373 and previous config saved to /var/cache/conftool/dbconfig/20210930-143044-root.json
* 11:43 marostegui@cumin1001: dbctl commit (dc=all): 'db1134 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34842 and previous config saved to /var/cache/conftool/dbconfig/20220916-114316-root.json
* 14:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1111 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17372 and previous config saved to /var/cache/conftool/dbconfig/20210930-141822-root.json
* 11:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1134', diff saved to https://phabricator.wikimedia.org/P34841 and previous config saved to /var/cache/conftool/dbconfig/20220916-113543-root.json
* 14:15 marostegui@cumin1001: dbctl commit (dc=all): 'db1134 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17370 and previous config saved to /var/cache/conftool/dbconfig/20210930-141540-root.json
* 11:34 marostegui@cumin1001: dbctl commit (dc=all): 'db2180 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34840 and previous config saved to /var/cache/conftool/dbconfig/20220916-113431-root.json
* 14:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1111 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17369 and previous config saved to /var/cache/conftool/dbconfig/20210930-140318-root.json
* 11:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 1%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34839 and previous config saved to /var/cache/conftool/dbconfig/20220916-113325-root.json
* 14:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1134 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17368 and previous config saved to /var/cache/conftool/dbconfig/20210930-140037-root.json
* 11:27 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1114', diff saved to https://phabricator.wikimedia.org/P34838 and previous config saved to /var/cache/conftool/dbconfig/20220916-112750-root.json
* 13:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1111 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17367 and previous config saved to /var/cache/conftool/dbconfig/20210930-134815-root.json
* 11:19 marostegui@cumin1001: dbctl commit (dc=all): 'db2180 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34837 and previous config saved to /var/cache/conftool/dbconfig/20220916-111925-root.json
* 13:45 marostegui@cumin1001: dbctl commit (dc=all): 'db1134 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17366 and previous config saved to /var/cache/conftool/dbconfig/20210930-134533-root.json
* 11:04 marostegui@cumin1001: dbctl commit (dc=all): 'db2180 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34836 and previous config saved to /var/cache/conftool/dbconfig/20220916-110420-root.json
* 13:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host testvm2001.codfw.wmnet
* 10:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1189 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34835 and previous config saved to /var/cache/conftool/dbconfig/20220916-105819-ladsgroup.json
* 13:40 jgiannelos@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 10:58 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1189.eqiad.wmnet with reason: Maintenance
* 13:38 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 10:58 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1189.eqiad.wmnet with reason: Maintenance
* 13:38 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 10:58 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34834 and previous config saved to /var/cache/conftool/dbconfig/20220916-105809-ladsgroup.json
* 13:37 akosiaris@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 10:49 marostegui@cumin1001: dbctl commit (dc=all): 'db2180 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34832 and previous config saved to /var/cache/conftool/dbconfig/20220916-104916-root.json
* 13:36 akosiaris@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
* 10:43 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P34831 and previous config saved to /var/cache/conftool/dbconfig/20220916-104303-ladsgroup.json
* 13:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1111 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17365 and previous config saved to /var/cache/conftool/dbconfig/20210930-133311-root.json
* 10:34 marostegui@cumin1001: dbctl commit (dc=all): 'db2180 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34830 and previous config saved to /var/cache/conftool/dbconfig/20220916-103411-root.json
* 13:30 marostegui@cumin1001: dbctl commit (dc=all): 'db1134 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17364 and previous config saved to /var/cache/conftool/dbconfig/20210930-133029-root.json
* 10:27 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P34829 and previous config saved to /var/cache/conftool/dbconfig/20220916-102756-ladsgroup.json
* 13:29 marostegui: Upgrade db1111
* 10:19 marostegui@cumin1001: dbctl commit (dc=all): 'db2180 (re)pooling @ 3%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34828 and previous config saved to /var/cache/conftool/dbconfig/20220916-101905-root.json
* 13:28 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1111 for upgrade', diff saved to https://phabricator.wikimedia.org/P17363 and previous config saved to /var/cache/conftool/dbconfig/20210930-132831-marostegui.json
* 10:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34827 and previous config saved to /var/cache/conftool/dbconfig/20220916-101250-ladsgroup.json
* 13:27 marostegui: Upgrade db1134
* 10:04 marostegui@cumin1001: dbctl commit (dc=all): 'db2180 (re)pooling @ 1%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34826 and previous config saved to /var/cache/conftool/dbconfig/20220916-100400-root.json
* 13:27 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1134 (s1) for upgrade', diff saved to https://phabricator.wikimedia.org/P17362 and previous config saved to /var/cache/conftool/dbconfig/20210930-132700-marostegui.json
* 09:36 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 100%: After being recloned', diff saved to https://phabricator.wikimedia.org/P34825 and previous config saved to /var/cache/conftool/dbconfig/20220916-093635-root.json
* 13:26 marostegui: Upgrade db1133
* 09:31 marostegui@cumin1001: dbctl commit (dc=all): 'db1198 (re)pooling @ 100%: Repooling after cloning db1189', diff saved to https://phabricator.wikimedia.org/P34824 and previous config saved to /var/cache/conftool/dbconfig/20220916-093121-root.json
* 13:26 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2001.codfw.wmnet
* 09:21 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 75%: After being recloned', diff saved to https://phabricator.wikimedia.org/P34823 and previous config saved to /var/cache/conftool/dbconfig/20220916-092130-root.json
* 13:02 urbanecm: Start server-side upload for 2 video files ([[phab:T292096|T292096]], [[phab:T291492|T291492]])
* 09:16 marostegui@cumin1001: dbctl commit (dc=all): 'db1198 (re)pooling @ 75%: Repooling after cloning db1189', diff saved to https://phabricator.wikimedia.org/P34822 and previous config saved to /var/cache/conftool/dbconfig/20220916-091616-root.json
* 13:01 marostegui@cumin1001: dbctl commit (dc=all): 'db1099:3311 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17361 and previous config saved to /var/cache/conftool/dbconfig/20210930-130116-root.json
* 09:12 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1166 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34821 and previous config saved to /var/cache/conftool/dbconfig/20220916-091234-ladsgroup.json
* 13:01 marostegui@cumin1001: dbctl commit (dc=all): 'db1099:3318 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17360 and previous config saved to /var/cache/conftool/dbconfig/20210930-130109-root.json
* 09:12 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 12:46 marostegui@cumin1001: dbctl commit (dc=all): 'db1099:3311 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17359 and previous config saved to /var/cache/conftool/dbconfig/20210930-124612-root.json
* 09:12 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1166.eqiad.wmnet with reason: Maintenance
* 12:46 marostegui@cumin1001: dbctl commit (dc=all): 'db1099:3318 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17358 and previous config saved to /var/cache/conftool/dbconfig/20210930-124606-root.json
* 09:06 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 50%: After being recloned', diff saved to https://phabricator.wikimedia.org/P34820 and previous config saved to /var/cache/conftool/dbconfig/20220916-090625-root.json
* 12:31 Reedy: downloading files for [[phab:T290900|T290900]] in screen on mwmaint1002
* 09:01 marostegui@cumin1001: dbctl commit (dc=all): 'db1198 (re)pooling @ 50%: Repooling after cloning db1189', diff saved to https://phabricator.wikimedia.org/P34819 and previous config saved to /var/cache/conftool/dbconfig/20220916-090111-root.json
* 12:31 marostegui@cumin1001: dbctl commit (dc=all): 'db1099:3311 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17357 and previous config saved to /var/cache/conftool/dbconfig/20210930-123109-root.json
* 08:51 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 25%: After being recloned', diff saved to https://phabricator.wikimedia.org/P34818 and previous config saved to /var/cache/conftool/dbconfig/20220916-085120-root.json
* 12:31 marostegui@cumin1001: dbctl commit (dc=all): 'db1099:3318 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17356 and previous config saved to /var/cache/conftool/dbconfig/20210930-123101-root.json
* 08:46 marostegui@cumin1001: dbctl commit (dc=all): 'db1198 (re)pooling @ 25%: Repooling after cloning db1189', diff saved to https://phabricator.wikimedia.org/P34817 and previous config saved to /var/cache/conftool/dbconfig/20220916-084607-root.json
* 12:18 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@35b9174]: tegola: remove mirror_threshold variable because of parsing errors (duration: 00m 17s)
* 08:36 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 10%: After being recloned', diff saved to https://phabricator.wikimedia.org/P34816 and previous config saved to /var/cache/conftool/dbconfig/20220916-083615-root.json
* 12:18 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@35b9174]: tegola: remove mirror_threshold variable because of parsing errors
* 08:31 marostegui@cumin1001: dbctl commit (dc=all): 'db1198 (re)pooling @ 10%: Repooling after cloning db1189', diff saved to https://phabricator.wikimedia.org/P34815 and previous config saved to /var/cache/conftool/dbconfig/20220916-083102-root.json
* 12:17 moritzm: adapted MX records to point to both mx1001.wikimedia.org and mx2001.wikimedia.org with equal weights [[phab:T286911|T286911]]
* 08:22 elukey@deploy1002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:17 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@35b9174]: tegola: remove mirror_threshold variable because of parsing errors (duration: 00m 16s)
* 08:21 elukey@deploy1002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:16 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@35b9174]: tegola: remove mirror_threshold variable because of parsing errors
* 08:21 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 5%: After being recloned', diff saved to https://phabricator.wikimedia.org/P34814 and previous config saved to /var/cache/conftool/dbconfig/20220916-082110-root.json
* 12:16 marostegui@cumin1001: dbctl commit (dc=all): 'db1099:3311 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17355 and previous config saved to /var/cache/conftool/dbconfig/20210930-121605-root.json
* 08:15 marostegui@cumin1001: dbctl commit (dc=all): 'db1198 (re)pooling @ 5%: Repooling after cloning db1189', diff saved to https://phabricator.wikimedia.org/P34813 and previous config saved to /var/cache/conftool/dbconfig/20220916-081557-root.json
* 12:15 marostegui@cumin1001: dbctl commit (dc=all): 'db1099:3318 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17354 and previous config saved to /var/cache/conftool/dbconfig/20210930-121558-root.json
* 08:06 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 3%: After being recloned', diff saved to https://phabricator.wikimedia.org/P34812 and previous config saved to /var/cache/conftool/dbconfig/20220916-080605-root.json
* 12:14 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@35b9174]: tegola: remove mirror_threshold variable because of parsing errors (duration: 00m 15s)
* 08:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1198 (re)pooling @ 3%: Repooling after cloning db1189', diff saved to https://phabricator.wikimedia.org/P34811 and previous config saved to /var/cache/conftool/dbconfig/20220916-080052-root.json
* 12:13 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@35b9174]: tegola: remove mirror_threshold variable because of parsing errors
* 07:51 marostegui@cumin1001: dbctl commit (dc=all): 'db1189 (re)pooling @ 1%: After being recloned', diff saved to https://phabricator.wikimedia.org/P34810 and previous config saved to /var/cache/conftool/dbconfig/20220916-075100-root.json
* 12:13 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@35b9174]: tegola: remove mirror_threshold variable because of parsing errors (duration: 00m 15s)
* 07:45 marostegui@cumin1001: dbctl commit (dc=all): 'db1198 (re)pooling @ 1%: Repooling after cloning db1189', diff saved to https://phabricator.wikimedia.org/P34809 and previous config saved to /var/cache/conftool/dbconfig/20220916-074548-root.json
* 12:13 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@35b9174]: tegola: remove mirror_threshold variable because of parsing errors
* 07:42 marostegui@cumin1001: dbctl commit (dc=all): 'db1168 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34808 and previous config saved to /var/cache/conftool/dbconfig/20220916-074251-root.json
* 12:11 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@35b9174]: tegola: remove mirror_threshold variable because of parsing errors (duration: 00m 10s)
* 07:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2180', diff saved to https://phabricator.wikimedia.org/P34807 and previous config saved to /var/cache/conftool/dbconfig/20220916-072958-root.json
* 12:10 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@35b9174]: tegola: remove mirror_threshold variable because of parsing errors
* 07:27 marostegui@cumin1001: dbctl commit (dc=all): 'db1168 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34806 and previous config saved to /var/cache/conftool/dbconfig/20220916-072746-root.json
* 12:10 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@35b9174]: tegola: remove mirror_threshold variable because of parsing errors (duration: 00m 01s)
* 07:26 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 12:10 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@35b9174]: tegola: remove mirror_threshold variable because of parsing errors
* 07:25 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 12:03 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 07:25 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 12:01 marostegui@cumin1001: dbctl commit (dc=all): 'db1099:3311 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17353 and previous config saved to /var/cache/conftool/dbconfig/20210930-120102-root.json
* 07:21 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 12:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1099:3318 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17352 and previous config saved to /var/cache/conftool/dbconfig/20210930-120054-root.json
* 07:12 marostegui@cumin1001: dbctl commit (dc=all): 'db1168 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34805 and previous config saved to /var/cache/conftool/dbconfig/20220916-071241-root.json
* 12:00 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 06:57 marostegui@cumin1001: dbctl commit (dc=all): 'db1168 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34804 and previous config saved to /var/cache/conftool/dbconfig/20220916-065737-root.json
* 11:58 hnowlan: imported wikidiff2_1.13.0-1/php-wikidiff2_1.13.0-1_amd64.deb to buster-wikimedia component/php72
* 06:42 marostegui@cumin1001: dbctl commit (dc=all): 'db1168 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34803 and previous config saved to /var/cache/conftool/dbconfig/20220916-064232-root.json
* 11:56 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1099 (s1 and s8) for upgrade', diff saved to https://phabricator.wikimedia.org/P17351 and previous config saved to /var/cache/conftool/dbconfig/20210930-115631-marostegui.json
* 06:27 marostegui@cumin1001: dbctl commit (dc=all): 'db1168 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34802 and previous config saved to /var/cache/conftool/dbconfig/20220916-062727-root.json
* 11:51 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 06:12 marostegui@cumin1001: dbctl commit (dc=all): 'db1168 (re)pooling @ 3%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34801 and previous config saved to /var/cache/conftool/dbconfig/20220916-061222-root.json
* 11:47 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@0a38bc5]: tegola: use eqiad discovery endpoin (duration: 00m 03s)
* 05:57 marostegui@cumin1001: dbctl commit (dc=all): 'db1168 (re)pooling @ 1%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34800 and previous config saved to /var/cache/conftool/dbconfig/20220916-055717-root.json
* 11:47 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@0a38bc5]: tegola: use eqiad discovery endpoin
* 05:55 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1168', diff saved to https://phabricator.wikimedia.org/P34799 and previous config saved to /var/cache/conftool/dbconfig/20220916-055542-root.json
* 11:47 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@0a38bc5]: tegola: use eqiad discovery endpoin (duration: 00m 01s)
* 05:54 marostegui@cumin1001: dbctl commit (dc=all): 'db1168 (re)pooling @ 1%: After upgrade', diff saved to https://phabricator.wikimedia.org/P34798 and previous config saved to /var/cache/conftool/dbconfig/20220916-055424-root.json
* 11:47 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@0a38bc5]: tegola: use eqiad discovery endpoin
* 05:51 marostegui: Install 10.6 on db1168 [[phab:T301879|T301879]]
* 11:47 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 05:50 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1168', diff saved to https://phabricator.wikimedia.org/P34797 and previous config saved to /var/cache/conftool/dbconfig/20220916-055031-root.json
* 11:46 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@0a38bc5]: tegola: use eqiad discovery endpoin (duration: 00m 01s)
* 05:44 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1198', diff saved to https://phabricator.wikimedia.org/P34795 and previous config saved to /var/cache/conftool/dbconfig/20220916-054438-root.json
* 11:46 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@0a38bc5]: tegola: use eqiad discovery endpoin
* 01:57 bmansurov@deploy1002: Finished deploy [airflow-dags/research@b9be20d]: (no justification provided) (duration: 00m 09s)
* 11:44 effie: downgrading scap to 3.17.1-1 on maps* hosts - [[phab:T291990|T291990]]
* 01:57 bmansurov@deploy1002: Started deploy [airflow-dags/research@b9be20d]: (no justification provided)
* 11:43 kartik@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:724732{{!}}Make reply tool available as opt-out almost everywhere (phase 3) (T288485)]] (duration: 01m 07s)
* 01:54 bmansurov@deploy1002: Finished deploy [airflow-dags/research@b9be20d]: (no justification provided) (duration: 00m 10s)
* 11:37 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 01:54 bmansurov@deploy1002: Started deploy [airflow-dags/research@b9be20d]: (no justification provided)
* 11:35 kartik@deploy1002: Synchronized php-1.38.0-wmf.2/extensions/DiscussionTools: Backport: [[gerrit:724789{{!}}Add a link to preferences within the Reply and New Discussion Tools (T291002)]] (duration: 01m 08s)
* 00:14 bmansurov@deploy1002: Finished deploy [airflow-dags/research@b9be20d]: (no justification provided) (duration: 00m 17s)
* 11:34 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:14 bmansurov@deploy1002: Started deploy [airflow-dags/research@b9be20d]: (no justification provided)
* 11:30 kartik@deploy1002: Synchronized php-1.38.0-wmf.1/extensions/DiscussionTools: Backport: [[gerrit:724788{{!}}Add a link to preferences within the Reply and New Discussion Tools (T291002)]] (duration: 01m 09s)
* 11:25 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:21 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:14 kartik@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:724458{{!}}Enable SectionTranslation in Igbo, Hausa, Yoruba Wikipedias (T290175)]] (duration: 01m 08s)
* 11:05 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 11:05 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 10:13 akosiaris: upgrade znuny to 6.0.37
* 10:06 godog: test bounce logstash on logstash1023
* 08:21 moritzm: installing nettle security updates on stretch
* 08:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host testvm2003.codfw.wmnet
* 07:49 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2003.codfw.wmnet
* 07:31 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@0a38bc5]: tegola: use eqiad discovery endpoin (duration: 00m 06s)
* 07:31 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@0a38bc5]: tegola: use eqiad discovery endpoin
* 07:03 legoktm@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'shellbox-syntaxhighlight' for release 'main' .
* 06:58 legoktm@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'shellbox-syntaxhighlight' for release 'main' .
* 06:56 legoktm@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'shellbox-syntaxhighlight' for release 'main' .
* 06:48 marostegui: Deploy schema change on s8 codfw (lag will show up) [[phab:T270620|T270620]]
* 06:01 marostegui: Deploy schema change on s1 codfw (lag will show up) [[phab:T270620|T270620]]
* 05:53 marostegui: Deploy schema change on s3 codfw (lag will show up) [[phab:T270620|T270620]]
* 05:52 marostegui: Deploy schema change on s7 codfw (lag will show up) [[phab:T270620|T270620]]
* 05:47 marostegui: Deploy schema change on s5 codfw (lag will show up) [[phab:T270620|T270620]]
* 05:45 marostegui: Deploy schema change on s4 codfw (lag will show up) [[phab:T270620|T270620]]
* 05:45 marostegui: Deploy schema change on s2 codfw (lag will show up) [[phab:T270620|T270620]]


== 2021-09-29 ==
== 2022-09-15 ==
* 23:20 bd808@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 23:51 mutante: gerrit1001 - disabled puppet - gerrit:832411
* 23:05 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 22:01 bking@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on wcqs2001.codfw.wmnet with reason: [[phab:T316236|T316236]]
* 23:02 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 22:01 bking@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on wcqs2001.codfw.wmnet with reason: [[phab:T316236|T316236]]
* 22:10 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:30 ebernhardson: depool wcqs2001 for [[phab:T316236|T316236]]
* 22:07 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:25 thcipriani@deploy1002: Finished scap: Backport for [[gerrit:832526{{!}}Increase coverage of Research Incentive Survey on idwiki (T316466)]] (duration: 07m 06s)
* 21:57 legoktm@deploy1002: Synchronized php-1.38.0-wmf.2/extensions/timeline/includes/Timeline.php: Catch TimelineException from fixMap() ([[phab:T292126|T292126]]) (duration: 01m 07s)
* 20:18 thcipriani@deploy1002: thcipriani and dani: Backport for [[gerrit:832526{{!}}Increase coverage of Research Incentive Survey on idwiki (T316466)]] synced to the testservers: mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet
* 21:48 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:18 thcipriani@deploy1002: Started scap: Backport for [[gerrit:832526{{!}}Increase coverage of Research Incentive Survey on idwiki (T316466)]]
* 21:44 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:15 thcipriani@deploy1002: Finished scap: Backport for [[gerrit:832323{{!}}Revert "cirrus: Handle transition to elasticsearch 7.10" (T308676)]] (duration: 07m 39s)
* 21:37 legoktm@deploy1002: Synchronized php-1.38.0-wmf.2/extensions/timeline/includes/Timeline.php: Bump Timeline::CACHE_VERSION (duration: 01m 08s)
* 20:08 thcipriani@deploy1002: thcipriani and dcausse: Backport for [[gerrit:832323{{!}}Revert "cirrus: Handle transition to elasticsearch 7.10" (T308676)]] synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 20:34 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:07 thcipriani@deploy1002: Started scap: Backport for [[gerrit:832323{{!}}Revert "cirrus: Handle transition to elasticsearch 7.10" (T308676)]]
* 20:31 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:26 ebernhardson: pool'd wdqs2001, some blockers before reload can start [[phab:T316236|T316236]]
* 20:22 jhuneidi@deploy1002: Synchronized php: group1 wikis to 1.38.0-wmf.2  refs [[phab:T281166|T281166]] (duration: 01m 08s)
* 18:45 dancy@deploy1002: rebuilt and synchronized wikiversions files: group2 wikis to 1.40.0-wmf.1 refs [[phab:T314190|T314190]]
* 20:21 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.38.0-wmf.2 refs [[phab:T281166|T281166]]
* 18:39 dancy@deploy1002: Finished scap: Backport for [[gerrit:832547{{!}}Use more permissive match for TOC_PLACEHOLDER in parser output (T317857)]] (duration: 09m 53s)
* 20:20 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:38 cwhite: restart thanos-compact (thanos-fe2001) and swift_ring_manager (thanos-fe1001)
* 20:16 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:29 dancy@deploy1002: dancy and cscott: Backport for [[gerrit:832547{{!}}Use more permissive match for TOC_PLACEHOLDER in parser output (T317857)]] synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug2001.codfw.wmnet
* 20:16 jhuneidi@deploy1002: Finished scap: Fix pywikibot feature detection (duration: 13m 38s)
* 18:29 dancy@deploy1002: Started scap: Backport for [[gerrit:832547{{!}}Use more permissive match for TOC_PLACEHOLDER in parser output (T317857)]]
* 20:02 jhuneidi@deploy1002: Started scap: Fix pywikibot feature detection
* 18:17 cwhite@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) thanos-fe2003.codfw.wmnet on all recursors
* 19:17 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:17 cwhite@cumin2002: START - Cookbook sre.dns.wipe-cache thanos-fe2003.codfw.wmnet on all recursors
* 19:13 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:17 cwhite@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) thanos-fe2002.codfw.wmnet on all recursors
* 19:06 legoktm@deploy1002: Synchronized php-1.38.0-wmf.2/extensions/timeline/scripts/renderTimeline.sh: Fix passing temp directory to EasyTimeline.pl (duration: 01m 07s)
* 18:17 cwhite@cumin2002: START - Cookbook sre.dns.wipe-cache thanos-fe2002.codfw.wmnet on all recursors
* 18:59 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:17 cwhite@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) thanos-fe2001.codfw.wmnet on all recursors
* 18:55 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:16 cwhite@cumin2002: START - Cookbook sre.dns.wipe-cache thanos-fe2001.codfw.wmnet on all recursors
* 18:52 dancy@deploy1002: Synchronized php-1.38.0-wmf.2/skins/MinervaNeue/resources/skins.minerva.base.styles/ui.less: Backport: [[gerrit:724787{{!}}Search header should be vertically centered, not top aligned(take 2) (T292071)]] (duration: 01m 08s)
* 18:16 cwhite@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) thanos-fe1003.eqiad.wmnet on all recursors
* 17:26 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:16 cwhite@cumin2002: START - Cookbook sre.dns.wipe-cache thanos-fe1003.eqiad.wmnet on all recursors
* 17:22 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:16 cwhite@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) thanos-fe1002.eqiad.wmnet on all recursors
* 17:14 ladsgroup@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:724776{{!}}Fully enable change dispatching via jobs on test wikis]], Part I (duration: 01m 09s)
* 18:16 cwhite@cumin2002: START - Cookbook sre.dns.wipe-cache thanos-fe1002.eqiad.wmnet on all recursors
* 17:13 ladsgroup@deploy1002: Synchronized wmf-config/Wikibase.php: Config: [[gerrit:724776{{!}}Fully enable change dispatching via jobs on test wikis]], Part I (duration: 01m 07s)
* 18:16 cwhite@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) thanos-fe1001.eqiad.wmnet on all recursors
* 16:52 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:16 cwhite@cumin2002: START - Cookbook sre.dns.wipe-cache thanos-fe1001.eqiad.wmnet on all recursors
* 16:47 pt1979@cumin2002: END (PASS) - Cookbook sre.experimental.reimage (exit_code=0) for host thumbor2006.codfw.wmnet
* 18:15 ebernhardson: depool wcqs2001 for [[phab:T316236|T316236]]
* 16:47 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:15 cwhite@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:43 akosiaris: start hbal -L -G row_B -X on ganeti01.svc.codfw.wmnet . Rows C and D are fine
* 18:13 cwhite@cumin2002: START - Cookbook sre.dns.netbox
* 16:42 akosiaris: start hbal -L -G row_A -X on ganeti01.svc.codfw.wmnet
* 18:07 godog: restart envoyproxy on thanos-fe*
* 16:40 akosiaris: migrate kubemaster2001 off ganeti2007 and to ganeti2008 due to memory starvation on ganeti2007
* 18:06 cwhite@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) thanos-fe2002.codfw.wmnet on all recursors
* 16:37 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:06 cwhite@cumin2002: START - Cookbook sre.dns.wipe-cache thanos-fe2002.codfw.wmnet on all recursors
* 16:34 pt1979@cumin2002: START - Cookbook sre.experimental.reimage for host thumbor2006.codfw.wmnet
* 17:39 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 16:33 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:39 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 16:25 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.2/extensions/WikimediaBadges/: Backport: [[gerrit:724561{{!}}Handle missing items in WikibaseClientSiteLinksForItemHandler (T291953)]] (duration: 01m 08s)
* 16:17 andrew@cumin1001: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging BryanDavis out of all services on: 2047 hosts
* 16:24 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.1/extensions/WikimediaBadges/: Backport: [[gerrit:724560{{!}}Handle missing items in WikibaseClientSiteLinksForItemHandler (T291953)]] (duration: 01m 10s)
* 16:16 andrew@cumin1001: START - Cookbook sre.idm.logout Logging BryanDavis out of all services on: 2047 hosts
* 15:58 pt1979@cumin2002: END (FAIL) - Cookbook sre.experimental.reimage (exit_code=99) for host thumbor2006.codfw.wmnet
* 15:39 cwhite@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:53 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:37 cwhite@cumin2002: START - Cookbook sre.dns.netbox
* 15:50 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:28 hnowlan@puppetmaster1001: conftool action : set/pooled=true; selector: dnsdisc=sessionstore,name=eqiad
* 15:45 Amir1: disabled cron dispatching for mediawikiwiki
* 15:27 hnowlan@deploy1002: helmfile [eqiad] DONE helmfile.d/services/sessionstore: sync
* 15:44 ladsgroup@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:724765{{!}}Enable change dispatching via jobs in wikidatawiki (T48643)]] (duration: 01m 08s)
* 15:27 hnowlan@deploy1002: helmfile [eqiad] START helmfile.d/services/sessionstore: sync
* 15:44 pt1979@cumin2002: START - Cookbook sre.experimental.reimage for host thumbor2006.codfw.wmnet
* 15:22 hnowlan: starting cassandra on sessionstore1001-a
* 15:39 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2035.codfw.wmnet
* 15:18 hnowlan@puppetmaster1001: conftool action : set/pooled=false; selector: dnsdisc=sessionstore,name=eqiad
* 15:39 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.2/extensions/Wikibase/client: Backport: [[gerrit:724558{{!}}Track time until dispatched recent changes are inserted (T291962)]] (duration: 01m 10s)
* 15:11 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1190 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34792 and previous config saved to /var/cache/conftool/dbconfig/20220915-151131-ladsgroup.json
* 15:24 pt1979@cumin2002: END (PASS) - Cookbook sre.experimental.reimage (exit_code=0) for host thumbor2006.codfw.wmnet
* 14:56 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P34791 and previous config saved to /var/cache/conftool/dbconfig/20220915-145625-ladsgroup.json
* 15:22 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host ms-be2035.codfw.wmnet
* 14:41 moritzm: installing libtirpc security updates
* 15:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2026.codfw.wmnet
* 14:41 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P34790 and previous config saved to /var/cache/conftool/dbconfig/20220915-144118-ladsgroup.json
* 15:12 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2026.codfw.wmnet
* 14:26 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1190 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34789 and previous config saved to /var/cache/conftool/dbconfig/20220915-142612-ladsgroup.json
* 15:02 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 14:01 sukhe: retarting bird.service on A:dns-auth for zlib update
* 14:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2025.codfw.wmnet
* 14:00 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|6b9784a0708cf1e7762034ccfba7e5604b2f6dc2}}: Enable the Vue version of the mentee overview in pilot wikis ([[phab:T300532|T300532]]) (duration: 03m 45s)
* 14:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2025.codfw.wmnet
* 13:58 aqu@deploy1002: Finished deploy [airflow-dags/analytics@b9be20d]: Regular analytics weekly train [airflow-dags@b9be20d] (duration: 00m 09s)
* 14:38 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:58 aqu@deploy1002: Started deploy [airflow-dags/analytics@b9be20d]: Regular analytics weekly train [airflow-dags@b9be20d]
* 14:35 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 13:57 sukhe: retarting haproxy.service on A:dns-auth for zlib update
* 14:25 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts testvm2001.codfw.wmnet
* 13:57 aqu@deploy1002: Finished deploy [airflow-dags/analytics_test@b9be20d]: Regular analytics weekly train TEST [airflow-dags@b9be20d] (duration: 00m 10s)
* 14:17 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2001.codfw.wmnet
* 13:56 aqu@deploy1002: Started deploy [airflow-dags/analytics_test@b9be20d]: Regular analytics weekly train TEST [airflow-dags@b9be20d]
* 14:08 dcausse@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'rdf-streaming-updater' for release 'main' .
* 13:51 jayme: updated rsyslog to 8.2208.0-1~bpo11+1 on all kubernetes masters and nodes - [[phab:T289766|T289766]]
* 14:07 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 13:47 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 14:07 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 13:47 aqu@deploy1002: Finished deploy [analytics/refinery@278c383] (hadoop-test): Regular analytics weekly train TEST (second try after freeing up some disk space) [analytics/refinery@278c383] (duration: 06m 01s)
* 14:04 pt1979@cumin2002: START - Cookbook sre.experimental.reimage for host thumbor2006.codfw.wmnet
* 13:43 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 14:01 dcausse@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'rdf-streaming-updater' for release 'main' .
* 13:43 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:38 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 13:42 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:38 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 13:41 aqu@deploy1002: Started deploy [analytics/refinery@278c383] (hadoop-test): Regular analytics weekly train TEST (second try after freeing up some disk space) [analytics/refinery@278c383]
* 13:34 volans@cumin2002: END (PASS) - Cookbook sre.experimental.reimage (exit_code=0) for host sretest1001.eqiad.wmnet
* 13:38 sukhe: restarting bird.service on A:dns-rec for zlib update
* 13:31 dcausse@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'rdf-streaming-updater' for release 'main' .
* 13:37 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:11 otto@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-analytics' for release 'production' .
* 13:36 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:11 otto@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-analytics' for release 'canary' .
* 13:36 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:09 volans@cumin2002: START - Cookbook sre.experimental.reimage for host sretest1001.eqiad.wmnet
* 13:35 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:09 otto@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-analytics' for release 'canary' .
* 13:33 sukhe: restarting pdns-recursor on A:dns-rec for zlib update
* 13:09 otto@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-analytics' for release 'production' .
* 13:33 urbanecm@deploy1002: Synchronized php-1.39.0-wmf.28/extensions/GrowthExperiments/: {{Gerrit|f592e85858d17a2de99cde93627054ee4972c2bd}}: Mentee overview: avoid requiring the non-vue mentee overview script when loading the Vue one ([[phab:T300532|T300532]]) (duration: 04m 05s)
* 13:08 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 12:50 elukey@deploy1002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 13:08 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 12:50 elukey@deploy1002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 13:04 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:46 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on sessionstore1001.eqiad.wmnet with reason: temporarily disabled due to sessionstore issues
* 13:04 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:46 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on sessionstore1001.eqiad.wmnet with reason: temporarily disabled due to sessionstore issues
* 12:09 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:25 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sessionstore1001.eqiad.wmnet with OS buster
* 12:06 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:17 jayme: fleet wide update of prometheus-rsyslog-exporter to 0.0.0+git20201008-4 - [[phab:T289766|T289766]]
* 11:56 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:10 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sessionstore1001.eqiad.wmnet with reason: host reimage
* 11:52 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:06 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on sessionstore1001.eqiad.wmnet with reason: host reimage
* 11:48 Lucas_WMDE: EU backport+config window done
* 12:00 marostegui@cumin1001: dbctl commit (dc=all): 'db2131 (re)pooling @ 100%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34787 and previous config saved to /var/cache/conftool/dbconfig/20220915-120013-root.json
* 11:48 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.2/skins/MinervaNeue/skinStyles/mobile.startup/Overlay.less: Backport: [[gerrit:724553{{!}}Revert "Search header should be vertically centered, not top aligned." (T292030)]] (duration: 01m 07s)
* 11:51 hnowlan@cumin1001: START - Cookbook sre.hosts.reimage for host sessionstore1001.eqiad.wmnet with OS buster
* 11:43 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.2/extensions/Wikibase/repo/includes/Store/Sql/SqlSiteLinkConflictLookup.php: Backport: [[gerrit:724371{{!}}Use CONN_TRX_AUTOCOMMIT in SqlSiteLinkConflictLookup (T291377)]] (duration: 01m 07s)
* 11:50 hnowlan@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sessionstore1001.eqiad.wmnet with OS buster
* 11:43 volans@cumin2002: END (PASS) - Cookbook sre.experimental.reimage (exit_code=0) for host sretest1001.eqiad.wmnet
* 11:45 hnowlan@cumin1001: START - Cookbook sre.hosts.reimage for host sessionstore1001.eqiad.wmnet with OS buster
* 11:42 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:45 marostegui@cumin1001: dbctl commit (dc=all): 'db2131 (re)pooling @ 75%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34786 and previous config saved to /var/cache/conftool/dbconfig/20220915-114508-root.json
* 11:38 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:44 hnowlan@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host sessionstore1001.eqiad.wmnet with OS buster
* 11:26 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:43 moritzm: restart exim on lists1001 to pick up zlib security updates
* 11:26 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:722279{{!}}Enable line numbering on all namespaces (pilot wikis) (T280027)]] (duration: 01m 09s)
* 11:30 marostegui@cumin1001: dbctl commit (dc=all): 'db2131 (re)pooling @ 50%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34785 and previous config saved to /var/cache/conftool/dbconfig/20220915-113003-root.json
* 11:22 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:22 jayme: importing prometheus-rsyslog-exporter 0.0.0+git20201008-4 to stretch-wikimedia, buster-wikimedia, bullseye-wikimedia - [[phab:T289766|T289766]]
* 11:20 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.1/extensions/DiscussionTools/modules/dt.ui.ReplyWidget.js: Backport: [[gerrit:724378{{!}}Fix almost all errors codes being logged as `http-0` (T290514)]] (duration: 01m 09s)
* 11:22 hnowlan@cumin1001: START - Cookbook sre.hosts.reimage for host sessionstore1001.eqiad.wmnet with OS buster
* 11:17 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.2/extensions/DiscussionTools/modules/dt.ui.ReplyWidget.js: Backport: [[gerrit:724379{{!}}Fix almost all errors codes being logged as `http-0` (T290514)]] (duration: 01m 09s)
* 11:17 jmm@cumin2002: END (PASS) - Cookbook sre.wdqs.restart-nginx (exit_code=0) rolling restart_daemons on A:wcqs-public
* 11:16 volans@cumin2002: START - Cookbook sre.experimental.reimage for host sretest1001.eqiad.wmnet
* 11:15 jmm@cumin2002: START - Cookbook sre.wdqs.restart-nginx rolling restart_daemons on A:wcqs-public
* 11:15 volans@cumin2002: END (ERROR) - Cookbook sre.experimental.reimage (exit_code=97) for host sretest1001.eqiad.wmnet
* 11:14 marostegui@cumin1001: dbctl commit (dc=all): 'db2131 (re)pooling @ 25%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34784 and previous config saved to /var/cache/conftool/dbconfig/20220915-111458-root.json
* 10:35 volans@cumin2002: START - Cookbook sre.experimental.reimage for host sretest1001.eqiad.wmnet
* 11:12 hnowlan: sessionstore1001: c-foreach-nt drain
* 10:34 volans@cumin2002: END (ERROR) - Cookbook sre.experimental.reimage (exit_code=97) for host sretest1001.eqiad.wmnet
* 11:10 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on sessionstore1001.eqiad.wmnet with reason: Testing reimage
* 10:24 volans@cumin2002: START - Cookbook sre.experimental.reimage for host sretest1001.eqiad.wmnet
* 11:10 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on sessionstore1001.eqiad.wmnet with reason: Testing reimage
* 10:02 volans@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:05:00 on cumin1001.eqiad.wmnet with reason: testing latest change
* 11:04 marostegui@cumin1001: dbctl commit (dc=all): 'pool db2129 into s6 API', diff saved to https://phabricator.wikimedia.org/P34783 and previous config saved to /var/cache/conftool/dbconfig/20220915-110453-root.json
* 10:00 volans@cumin1001: START - Cookbook sre.hosts.downtime for 0:05:00 on cumin1001.eqiad.wmnet with reason: testing latest change
* 10:59 marostegui@cumin1001: dbctl commit (dc=all): 'db2131 (re)pooling @ 10%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34782 and previous config saved to /var/cache/conftool/dbconfig/20220915-105953-root.json
* 09:54 godog: bounce mtail on centrallog* - [[phab:T246470|T246470]]
* 10:44 marostegui@cumin1001: dbctl commit (dc=all): 'db2131 (re)pooling @ 5%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34781 and previous config saved to /var/cache/conftool/dbconfig/20220915-104448-root.json
* 09:47 dcausse@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'rdf-streaming-updater' for release 'main' .
* 10:36 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 09:40 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@0a38bc5]: tegola: use eqiad discovery endpoin (duration: 00m 11s)
* 10:29 marostegui@cumin1001: dbctl commit (dc=all): 'db2131 (re)pooling @ 3%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34780 and previous config saved to /var/cache/conftool/dbconfig/20220915-102943-root.json
* 09:39 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@0a38bc5]: tegola: use eqiad discovery endpoin
* 10:14 marostegui@cumin1001: dbctl commit (dc=all): 'db2131 (re)pooling @ 1%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34779 and previous config saved to /var/cache/conftool/dbconfig/20220915-101438-root.json
* 08:58 dcausse@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'rdf-streaming-updater' for release 'main' .
* 10:14 marostegui@cumin1001: dbctl commit (dc=all): 'db2129 (re)pooling @ 100%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34778 and previous config saved to /var/cache/conftool/dbconfig/20220915-101425-root.json
* 08:39 dcausse@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'rdf-streaming-updater' for release 'main' .
* 10:04 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7:00:00 on db2131.codfw.wmnet with reason: reboot
* 08:22 ema: fleet-wide rm /etc/rsyslog.d/00-abort-unclean-config.conf && systemctl restart rsyslog
* 10:03 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 7:00:00 on db2131.codfw.wmnet with reason: reboot
* 07:51 godog: fail sdg on be2036 - [[phab:T291988|T291988]]
* 10:02 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2131', diff saved to https://phabricator.wikimedia.org/P34777 and previous config saved to /var/cache/conftool/dbconfig/20220915-100212-root.json
* 07:25 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db2081 [[phab:T290868|T290868]]', diff saved to https://phabricator.wikimedia.org/P17345 and previous config saved to /var/cache/conftool/dbconfig/20210929-072520-marostegui.json
* 09:59 marostegui@cumin1001: dbctl commit (dc=all): 'db2129 (re)pooling @ 75%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34775 and previous config saved to /var/cache/conftool/dbconfig/20220915-095920-root.json
* 07:15 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 09:58 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 06:15 marostegui: Deploy schema change on s8 codfw (lag will show up) [[phab:T283499|T283499]]
* 09:58 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 06:10 ryankemper: [[phab:T289517|T289517]] Ran puppet across query_service fleet `sudo cumin -b 6 'P<nowiki>{</nowiki>w*qs*<nowiki>}</nowiki>' 'sudo run-puppet-agent'`
* 09:57 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 06:09 ryankemper: [[phab:T289517|T289517]] Merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/720746 (fix dcat-ap loading)
* 09:57 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 05:56 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db2103 [[phab:T290865|T290865]]', diff saved to https://phabricator.wikimedia.org/P17344 and previous config saved to /var/cache/conftool/dbconfig/20210929-055645-marostegui.json
* 09:56 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 04:50 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2081 [[phab:T290868|T290868]]', diff saved to https://phabricator.wikimedia.org/P17342 and previous config saved to /var/cache/conftool/dbconfig/20210929-045033-marostegui.json
* 09:44 marostegui@cumin1001: dbctl commit (dc=all): 'db2129 (re)pooling @ 50%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34774 and previous config saved to /var/cache/conftool/dbconfig/20220915-094415-root.json
* 03:18 eileen: civicrm revision changed from {{Gerrit|a0bc324a61}} to {{Gerrit|2ecb8f0bcd}}, config revision is {{Gerrit|77cb7ec866}}
* 09:38 aqu@deploy1002: Finished deploy [analytics/refinery@278c383] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@278c383] (duration: 14m 21s)
* 03:01 eileen: civicrm revision changed from {{Gerrit|1b7bae4033}} to {{Gerrit|a0bc324a61}}, config revision is {{Gerrit|77cb7ec866}}
* 09:29 marostegui@cumin1001: dbctl commit (dc=all): 'db2129 (re)pooling @ 25%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34773 and previous config saved to /var/cache/conftool/dbconfig/20220915-092910-root.json
* 03:00 eileen: civicrm revision changed from {{Gerrit|a480bf03c9}} to {{Gerrit|1b7bae4033}}, config revision is {{Gerrit|77cb7ec866}}
* 09:23 aqu@deploy1002: Started deploy [analytics/refinery@278c383] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@278c383]
* 02:36 legoktm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Have PdfHandler/PagedTiffHandler use Shellbox on all wikis but Commons (duration: 01m 07s)
* 09:14 marostegui@cumin1001: dbctl commit (dc=all): 'db2129 (re)pooling @ 10%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34772 and previous config saved to /var/cache/conftool/dbconfig/20220915-091405-root.json
* 00:52 eileen: civicrm revision changed from {{Gerrit|a1929b3dfd}} to {{Gerrit|a480bf03c9}}, config revision is {{Gerrit|77cb7ec866}}
* 09:13 aqu@deploy1002: Finished deploy [analytics/refinery@278c383] (thin): Regular analytics weekly train THIN [analytics/refinery@278c383] (duration: 00m 08s)
* 00:27 legoktm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Have SyntaxHighlight use Shellbox on all wikis (duration: 01m 18s)
* 09:13 aqu@deploy1002: Started deploy [analytics/refinery@278c383] (thin): Regular analytics weekly train THIN [analytics/refinery@278c383]
* 00:21 ryankemper: [[phab:T280001|T280001]] `ryankemper@authdns1001:~$ sudo -i authdns-update` following merge of https://gerrit.wikimedia.org/r/c/operations/dns/+/724538
* 09:12 aqu@deploy1002: Finished deploy [analytics/refinery@278c383]: Regular analytics weekly train [analytics/refinery@278c383] (duration: 27m 31s)
* 00:19 ryankemper: [[phab:T280001|T280001]] Okay now we're clear to proceed to https://wikitech.wikimedia.org/wiki/LVS#For_active/active_services; merging https://gerrit.wikimedia.org/r/c/operations/dns/+/724538
* 08:59 marostegui@cumin1001: dbctl commit (dc=all): 'db2129 (re)pooling @ 5%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34770 and previous config saved to /var/cache/conftool/dbconfig/20220915-085900-root.json
* 00:15 ryankemper: [[phab:T280001|T280001]] `ryankemper@cumin1001:~$ sudo cumin 'A:icinga or A:dns-auth' run-puppet-agent` per https://wikitech.wikimedia.org/wiki/LVS#Make_the_service_page,_add_discovery_resources
* 08:49 apergos: UTC backport training window closed at lsat
* 00:14 ryankemper: [[phab:T280001|T280001]] Moving wcqs state from `monitoring_setup` to `production`; merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/724536
* 08:46 btullis@cumin1001: END (PASS) - Cookbook sre.aqs.roll-restart (exit_code=0) for AQS aqs cluster: Roll restart of all AQS's nodejs daemons.
* 08:45 aqu@deploy1002: Started deploy [analytics/refinery@278c383]: Regular analytics weekly train [analytics/refinery@278c383]
* 08:43 marostegui@cumin1001: dbctl commit (dc=all): 'db2129 (re)pooling @ 3%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34768 and previous config saved to /var/cache/conftool/dbconfig/20220915-084355-root.json
* 08:43 aqu: about to deploy analytics/refinery
* 08:40 marostegui@cumin1001: dbctl commit (dc=all): 'db2115 (re)pooling @ 100%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34767 and previous config saved to /var/cache/conftool/dbconfig/20220915-084046-root.json
* 08:34 btullis@cumin1001: START - Cookbook sre.aqs.roll-restart for AQS aqs cluster: Roll restart of all AQS's nodejs daemons.
* 08:28 marostegui@cumin1001: dbctl commit (dc=all): 'db2129 (re)pooling @ 1%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34766 and previous config saved to /var/cache/conftool/dbconfig/20220915-082851-root.json
* 08:26 tsepothoabala@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:832339{{!}}Enable action blocks on ptwiki (T317157)]] (duration: 04m 07s)
* 08:25 marostegui@cumin1001: dbctl commit (dc=all): 'db2115 (re)pooling @ 75%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34765 and previous config saved to /var/cache/conftool/dbconfig/20220915-082541-root.json
* 08:21 marostegui@cumin1001: dbctl commit (dc=all): 'db2105 (re)pooling @ 100%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34764 and previous config saved to /var/cache/conftool/dbconfig/20220915-082112-root.json
* 08:16 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2129 [[phab:T317850|T317850]]', diff saved to https://phabricator.wikimedia.org/P34763 and previous config saved to /var/cache/conftool/dbconfig/20220915-081627-root.json
* 08:15 marostegui@cumin1001: dbctl commit (dc=all): 'Promote db2114 to s6 codfw master [[phab:T317850|T317850]]', diff saved to https://phabricator.wikimedia.org/P34762 and previous config saved to /var/cache/conftool/dbconfig/20220915-081517-marostegui.json
* 08:14 marostegui: Starting s6 codfw failover from db2129 to db2114 - [[phab:T317850|T317850]]
* 08:10 marostegui@cumin1001: dbctl commit (dc=all): 'db2115 (re)pooling @ 50%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34761 and previous config saved to /var/cache/conftool/dbconfig/20220915-081036-root.json
* 08:06 marostegui@cumin1001: dbctl commit (dc=all): 'db2105 (re)pooling @ 75%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34760 and previous config saved to /var/cache/conftool/dbconfig/20220915-080607-root.json
* 08:01 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2114 from API [[phab:T317850|T317850]]', diff saved to https://phabricator.wikimedia.org/P34759 and previous config saved to /var/cache/conftool/dbconfig/20220915-080157-root.json
* 08:01 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 26 hosts with reason: Primary codfw s6 [[phab:T317850|T317850]]
* 08:01 marostegui@cumin1001: dbctl commit (dc=all): 'Set db2114 with weight 0 [[phab:T317850|T317850]]', diff saved to https://phabricator.wikimedia.org/P34758 and previous config saved to /var/cache/conftool/dbconfig/20220915-080122-root.json
* 08:01 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 26 hosts with reason: Primary codfw s6 [[phab:T317850|T317850]]
* 07:56 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:55 marostegui@cumin1001: dbctl commit (dc=all): 'db2115 (re)pooling @ 25%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34757 and previous config saved to /var/cache/conftool/dbconfig/20220915-075531-root.json
* 07:51 marostegui@cumin1001: dbctl commit (dc=all): 'db2105 (re)pooling @ 50%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34756 and previous config saved to /var/cache/conftool/dbconfig/20220915-075102-root.json
* 07:50 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:50 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:47 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7:00:00 on db[2132,2160].codfw.wmnet with reason: reboot
* 07:47 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 7:00:00 on db[2132,2160].codfw.wmnet with reason: reboot
* 07:47 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7:00:00 on db[2133,2160].codfw.wmnet with reason: reboot
* 07:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 7:00:00 on db[2133,2160].codfw.wmnet with reason: reboot
* 07:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7:00:00 on db[2134,2160].codfw.wmnet with reason: reboot
* 07:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 7:00:00 on db[2134,2160].codfw.wmnet with reason: reboot
* 07:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7:00:00 on db[2135,2160].codfw.wmnet with reason: reboot
* 07:46 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 7:00:00 on db[2135,2160].codfw.wmnet with reason: reboot
* 07:43 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 07:40 marostegui@cumin1001: dbctl commit (dc=all): 'db2115 (re)pooling @ 10%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34755 and previous config saved to /var/cache/conftool/dbconfig/20220915-074026-root.json
* 07:36 marostegui@cumin1001: dbctl commit (dc=all): 'db2105 (re)pooling @ 25%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34754 and previous config saved to /var/cache/conftool/dbconfig/20220915-073557-root.json
* 07:25 marostegui@cumin1001: dbctl commit (dc=all): 'db2115 (re)pooling @ 5%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34753 and previous config saved to /var/cache/conftool/dbconfig/20220915-072520-root.json
* 07:20 marostegui@cumin1001: dbctl commit (dc=all): 'db2105 (re)pooling @ 10%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34752 and previous config saved to /var/cache/conftool/dbconfig/20220915-072053-root.json
* 07:17 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on db2151.codfw.wmnet with reason: reboot
* 07:17 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on db2151.codfw.wmnet with reason: reboot
* 07:14 moritzm: installing zlib security updates
* 07:13 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-docker-registry (exit_code=0) rolling restart_daemons on A:docker-registry
* 07:13 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 07:12 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 07:12 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 07:11 jmm@cumin2002: START - Cookbook sre.misc-clusters.roll-restart-reboot-docker-registry rolling restart_daemons on A:docker-registry
* 07:10 marostegui@cumin1001: dbctl commit (dc=all): 'db2115 (re)pooling @ 3%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34751 and previous config saved to /var/cache/conftool/dbconfig/20220915-071015-root.json
* 07:09 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.roll-restart-reboot-docker-registry (exit_code=0) rolling restart_daemons on A:docker-registry
* 07:06 jmm@cumin2002: START - Cookbook sre.misc-clusters.roll-restart-reboot-docker-registry rolling restart_daemons on A:docker-registry
* 07:05 marostegui@cumin1001: dbctl commit (dc=all): 'db2105 (re)pooling @ 5%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34750 and previous config saved to /var/cache/conftool/dbconfig/20220915-070548-root.json
* 07:05 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 06:55 marostegui@cumin1001: dbctl commit (dc=all): 'db2115 (re)pooling @ 1%: Repooling after upgrade', diff saved to https://phabricator.wikimedia.org/P34749 and previous config saved to /var/cache/conftool/dbconfig/20220915-065510-root.json
* 06:50 marostegui@cumin1001: dbctl commit (dc=all): 'db2105 (re)pooling @ 3%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34748 and previous config saved to /var/cache/conftool/dbconfig/20220915-065043-root.json
* 06:47 marostegui@cumin1001: dbctl commit (dc=all): 'Give some weight to db2096 [[phab:T317842|T317842]]', diff saved to https://phabricator.wikimedia.org/P34747 and previous config saved to /var/cache/conftool/dbconfig/20220915-064750-marostegui.json
* 06:46 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2115 [[phab:T317842|T317842]]', diff saved to https://phabricator.wikimedia.org/P34746 and previous config saved to /var/cache/conftool/dbconfig/20220915-064635-marostegui.json
* 06:45 marostegui@cumin1001: dbctl commit (dc=all): 'Promote db2096 to x1 primary and set section read-write [[phab:T317842|T317842]]', diff saved to https://phabricator.wikimedia.org/P34745 and previous config saved to /var/cache/conftool/dbconfig/20220915-064525-root.json
* 06:44 marostegui: Starting x1 codfw failover from db2115 to db2096 - [[phab:T317842|T317842]]
* 06:40 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 10 hosts with reason: Primary switchover x1 [[phab:T317842|T317842]]
* 06:40 marostegui@cumin1001: dbctl commit (dc=all): 'Set db2096 with weight 0 [[phab:T317842|T317842]]', diff saved to https://phabricator.wikimedia.org/P34744 and previous config saved to /var/cache/conftool/dbconfig/20220915-064014-root.json
* 06:40 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 10 hosts with reason: Primary switchover x1 [[phab:T317842|T317842]]
* 06:35 marostegui@cumin1001: dbctl commit (dc=all): 'db2105 (re)pooling @ 1%: Repooling for warm up after upgrade', diff saved to https://phabricator.wikimedia.org/P34743 and previous config saved to /var/cache/conftool/dbconfig/20220915-063538-root.json
* 06:14 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2105 [[phab:T317839|T317839]]', diff saved to https://phabricator.wikimedia.org/P34742 and previous config saved to /var/cache/conftool/dbconfig/20220915-061421-root.json
* 06:13 marostegui@cumin1001: dbctl commit (dc=all): 'Promote db2127 to s3 codfw [[phab:T317839|T317839]]', diff saved to https://phabricator.wikimedia.org/P34741 and previous config saved to /var/cache/conftool/dbconfig/20220915-061317-marostegui.json
* 06:12 marostegui: Starting s3 codfw failover from db2105 to db2127 - [[phab:T317839|T317839]]
* 06:03 marostegui@cumin1001: dbctl commit (dc=all): 'Set db2127 with weight 0 [[phab:T317839|T317839]]', diff saved to https://phabricator.wikimedia.org/P34740 and previous config saved to /var/cache/conftool/dbconfig/20220915-060307-root.json
* 06:02 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 23 hosts with reason: Codfw switchover s3 [[phab:T317839|T317839]]
* 06:02 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 23 hosts with reason: Codfw switchover s3 [[phab:T317839|T317839]]
* 05:32 marostegui@cumin1001: END (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 7 days, 0:00:00 on db1189.eqiad.wmnet with reason: down [[phab:T317662|T317662]]
* 05:32 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on db1189.eqiad.wmnet with reason: down [[phab:T317662|T317662]]
* 05:12 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db1189.eqiad.wmnet with reason: down [[phab:T317662|T317662]]
* 05:12 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 7 days, 0:00:00 on db1189.eqiad.wmnet with reason: down [[phab:T317662|T317662]]


== 2021-09-28 ==
== 2022-09-14 ==
* 23:53 ryankemper: [[phab:T280001|T280001]] New icinga checks are green, will proceed to next step of moving wcqs state from `monitoring_setup` -> `production`
* 22:08 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1190 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34739 and previous config saved to /var/cache/conftool/dbconfig/20220914-220822-ladsgroup.json
* 22:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1190.eqiad.wmnet with reason: Maintenance
* 22:08 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1190.eqiad.wmnet with reason: Maintenance
* 22:08 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 22:07 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 22:07 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34738 and previous config saved to /var/cache/conftool/dbconfig/20220914-220744-ladsgroup.json
* 21:52 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P34737 and previous config saved to /var/cache/conftool/dbconfig/20220914-215238-ladsgroup.json
* 21:38 dduvall@deploy1002: Finished deploy [phabricator/deployment@3137c92]: testing phabricator deployment to phab2002 (duration: 01m 48s)
* 21:37 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149', diff saved to https://phabricator.wikimedia.org/P34736 and previous config saved to /var/cache/conftool/dbconfig/20220914-213732-ladsgroup.json
* 21:37 dduvall@deploy1002: Started deploy [phabricator/deployment@3137c92]: testing phabricator deployment to phab2002
* 21:36 dduvall: testing phabricator deployment to phab2002. should have no production impact (not serving traffic, no access to r/w db)
* 21:35 dduvall@deploy1002: Installation of scap version "4.19.1" completed for 561 hosts
* 21:35 dduvall@deploy1002: Installing scap version "4.19.1" for 561 hosts
* 21:34 dduvall: Deploying scap 4.19.1 (https://gerrit.wikimedia.org/r/c/mediawiki/tools/scap/+/832297/1/changelog)
* 21:22 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34735 and previous config saved to /var/cache/conftool/dbconfig/20220914-212225-ladsgroup.json
* 20:47 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:47 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:47 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:47 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:44 dancy@deploy1002: Sync cancelled.
* 20:44 dancy@deploy1002: dancy: testing synced to the testservers: mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet, mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet
* 20:44 dancy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:44 dancy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:40 dancy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:40 dancy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:39 dancy@deploy1002: Started scap: testing
* 20:38 dancy@deploy1002: Synchronized php: group1 wikis to 1.40.0-wmf.1  refs [[phab:T314190|T314190]] (duration: 05m 49s)
* 20:34 dancy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:34 dancy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:34 dancy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:33 dancy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:32 dancy@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.40.0-wmf.1  refs [[phab:T314190|T314190]]
* 20:28 dancy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:24 dancy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:24 dancy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:21 dancy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:19 dancy@deploy1002: deploy-promote aborted:  (duration: 08m 52s)
* 20:19 dancy@deploy1002: sync-file aborted: group1 wikis to 1.40.0-wmf.1  refs [[phab:T314190|T314190]] (duration: 01m 24s)
* 20:18 dancy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:18 dancy@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.40.0-wmf.1  refs [[phab:T314190|T314190]]
* 20:14 dancy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:13 dancy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:13 dancy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:12 dancy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:09 dancy@deploy1002: Sync cancelled.
* 20:09 dancy@deploy1002: dancy: testing synced to the testservers: mwdebug1001.eqiad.wmnet, mwdebug2001.codfw.wmnet, mwdebug1002.eqiad.wmnet, mwdebug2002.codfw.wmnet
* 20:09 dancy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 20:06 dancy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 20:02 dancy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 20:02 dancy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 20:02 dancy@deploy1002: Started scap: testing
* 20:01 TheresNoTime: Nothing to deploy in this UTC late backport window
* 19:57 dancy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: sync
* 19:57 dancy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: sync
* 19:55 dancy@deploy1002: scap failed: CalledProcessError Command '['helmfile', '-e', 'eqiad', 'apply']' returned non-zero exit status 1. (duration: 07m 12s)
* 19:55 dancy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 19:51 dancy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 19:51 dancy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 19:49 dancy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 19:49 dancy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 19:49 dancy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 19:48 dancy@deploy1002: Started scap: testing
* 19:46 dancy@deploy1002: scap failed: CalledProcessError Command '['helmfile', '-e', 'eqiad', 'apply']' returned non-zero exit status 1. (duration: 07m 23s)
* 19:46 dancy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 19:39 dancy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 19:39 dancy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 19:39 dancy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 19:38 dancy@deploy1002: Started scap: testing
* 19:38 dancy@deploy1002: sync-world aborted: testing (duration: 13m 25s)
* 19:35 dancy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 19:26 dancy: dancy@deploy1002 touch /var/lib/deploy-mwdebug/pause
* 19:24 dancy@deploy1002: Started scap: testing
* 19:22 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 19:17 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@48e506e]: drop-snapshots: Remove directory handling (duration: 02m 03s)
* 19:15 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 19:15 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 19:15 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@48e506e]: drop-snapshots: Remove directory handling
* 19:08 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 19:03 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 19:02 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 19:02 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 19:01 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 18:59 dancy@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.40.0-wmf.1  refs [[phab:T314190|T314190]]
* 18:50 ebernhardson@deploy1002: Finished deploy [wikimedia/discovery/analytics@e358893]: drop-snapshots: tables are partitioned by wiki (duration: 02m 05s)
* 18:48 ebernhardson@deploy1002: Started deploy [wikimedia/discovery/analytics@e358893]: drop-snapshots: tables are partitioned by wiki
* 18:41 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 18:40 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 18:40 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 18:39 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 18:36 dancy@deploy1002: Finished scap: testwikis wikis to 1.40.0-wmf.1  refs [[phab:T314190|T314190]] (duration: 04m 41s)
* 18:34 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 18:33 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 18:33 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 18:33 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 18:31 dancy@deploy1002: Started scap: testwikis wikis to 1.40.0-wmf.1  refs [[phab:T314190|T314190]]
* 16:47 aikochou@deploy1002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 16:18 jclark@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:17 jclark@cumin1001: START - Cookbook sre.dns.netbox
* 16:10 cwhite@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:08 cwhite@cumin2002: START - Cookbook sre.dns.netbox
* 16:05 volans@cumin1001: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet,cumin1001.eqiad.wmnet with reason: Release v0.6.1 - volans@cumin1001
* 16:04 volans@cumin1001: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet,cumin1001.eqiad.wmnet with reason: Release v0.6.1 - volans@cumin1001
* 15:58 cwhite@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) logstash2024.codfw.wmnet on all recursors
* 15:58 cwhite@cumin2002: START - Cookbook sre.dns.wipe-cache logstash2024.codfw.wmnet on all recursors
* 15:58 cwhite@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) logstash2002.codfw.wmnet on all recursors
* 15:58 cwhite@cumin2002: START - Cookbook sre.dns.wipe-cache logstash2002.codfw.wmnet on all recursors
* 15:58 cwhite@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) logstash1026.eqiad.wmnet on all recursors
* 15:57 cwhite@cumin2002: START - Cookbook sre.dns.wipe-cache logstash1026.eqiad.wmnet on all recursors
* 15:57 cwhite@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) logstash1029.eqiad.wmnet on all recursors
* 15:57 cwhite@cumin2002: START - Cookbook sre.dns.wipe-cache logstash1029.eqiad.wmnet on all recursors
* 15:57 cwhite@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) logstash1028.eqiad.wmnet on all recursors
* 15:57 cwhite@cumin2002: START - Cookbook sre.dns.wipe-cache logstash1028.eqiad.wmnet on all recursors
* 15:57 cwhite@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) logstash1027.eqiad.wmnet on all recursors
* 15:57 cwhite@cumin2002: START - Cookbook sre.dns.wipe-cache logstash1027.eqiad.wmnet on all recursors
* 15:55 cwhite@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) logstash2001.codfw.wmnet on all recursors
* 15:55 cwhite@cumin2002: START - Cookbook sre.dns.wipe-cache logstash2001.codfw.wmnet on all recursors
* 15:50 dduvall@deploy1002: Finished deploy [phabricator/deployment@3137c92]: testing phabricator deployment to phab2002 (duration: 00m 39s)
* 15:49 dduvall@deploy1002: Started deploy [phabricator/deployment@3137c92]: testing phabricator deployment to phab2002
* 15:48 dduvall: testing phabricator deployment to phab2002. should have no production impact (not serving traffic, no access to r/w db)
* 15:24 cwhite@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:22 cwhite@cumin2002: START - Cookbook sre.dns.netbox
* 15:22 cwhite@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 15:17 cwhite@cumin2002: START - Cookbook sre.dns.netbox
* 15:05 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 15:04 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 15:04 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 15:03 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 14:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34732 and previous config saved to /var/cache/conftool/dbconfig/20220914-145956-ladsgroup.json
* 14:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P34731 and previous config saved to /var/cache/conftool/dbconfig/20220914-144449-ladsgroup.json
* 14:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P34730 and previous config saved to /var/cache/conftool/dbconfig/20220914-142941-ladsgroup.json
* 14:14 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34729 and previous config saved to /var/cache/conftool/dbconfig/20220914-141434-ladsgroup.json
* 14:06 ladsgroup@cumin1001: conftool action : set/pooled=inactive; selector: cluster=wikireplicas-a,name=dbproxy1019.eqiad.wmnet
* 14:05 ladsgroup@cumin1001: conftool action : set/pooled=yes; selector: cluster=wikireplicas-a,name=dbproxy1018.eqiad.wmnet
* 14:01 jmm@cumin2002: END (PASS) - Cookbook sre.misc-clusters.thumbor (exit_code=0) rolling restart_daemons on A:thumbor-codfw
* 13:59 jmm@cumin2002: START - Cookbook sre.misc-clusters.thumbor rolling restart_daemons on A:thumbor-codfw
* 13:48 moritzm: imported zlib 1:1.2.8.dfsg-5+deb9u1+wmf1 to apt.wikimedia.org
* 13:40 Lucas_WMDE: UTC afternoon backport+config window done
* 13:37 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:37 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ mwscript namespaceDupes.php bnwiktionary --fix # [[phab:T317745|T317745]] – dry run result: 6043 links to fix, 6043 were resolvable, 0 were deleted
* 13:36 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:36 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:36 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:35 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:831970{{!}}Move namespace in the Bengali Wiktionary: উইকিসরাস → পরিশিষ্ট and set wgNamespaceAliases for newly created namespaces (T317745)]] (duration: 03m 41s)
* 13:28 topranks: upgrading routinator on rpki2002
* 13:20 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:20 kartik@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:832145{{!}}Enable Content/Section translation on WPs with new MT support from Google (T313296)]] (duration: 03m 39s)
* 13:19 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:19 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:18 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 13:13 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
* 13:10 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply
* 13:10 mwdebug-deploy@deploy1002: helmfile [eqiad] DONE helmfile.d/services/mwdebug: apply
* 13:09 kartik@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:831872{{!}}Enable Section Translation in Odia Wikipedia (T313300)]] (duration: 03m 55s)
* 13:09 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply
* 12:54 jayme: imported rsyslog 8.2208.0-1~bpo11+1 into bullseye-wikimedia component/rsyslog-k8s - [[phab:T289766|T289766]]
* 11:59 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1189.eqiad.wmnet with reason: Maintenance
* 11:59 ladsgroup@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1189.eqiad.wmnet with reason: Maintenance
* 11:59 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179 ([[phab:T314041|T314041]])', diff saved to https://phabricator.wikimedia.org/P34725 and previous config saved to /var/cache/conftool/dbconfig/20220914-115920-ladsgroup.json
* 11:49 cmooney@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cr2-eqdfw,cr2-eqdfw IPv6
* 11:49 cmooney@cumin1001: START - Cookbook sre.hosts.remove-downtime for cr2-eqdfw,cr2-eqdfw IPv6
* 11:44 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P34723 and previous config saved to /var/cache/conftool/dbconfig/20220914-114413-ladsgroup.json
* 11:29 topranks: rebooting cr2-eqdfw to complete upgrade
* 11:29 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1179', diff saved to https://phabricator.wikimedia.org/P34721 and previous config saved to /var/cache/conftool/dbconfig/20220914-112907-ladsgroup.json
* 11:14 topranks: Shutting do