You are browsing a read-only backup copy of Wikitech. The live site can be found at wikitech.wikimedia.org

Difference between revisions of "Server Admin Log"

From Wikitech-static
Jump to navigation Jump to search
imported>Labslogbot
(lvchange -ay labstore/others on labstore1002 (godog))
imported>Stashbot
(catrope@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Create alias for Appendix and Appendix_talk namespaces on mywiktionary (T291146) (duration: 00m 55s))
Line 1: Line 1:
== 2015-08-30 ==
== 2021-10-25 ==
* 12:58 godog: lvchange -ay labstore/others on labstore1002
* 23:12 catrope@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Create alias for Appendix and Appendix_talk namespaces on mywiktionary ([[phab:T291146|T291146]]) (duration: 00m 55s)
* 12:52 godog: start-nfs on labstore1002
* 23:10 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:31 godog: lvchange -ay labstore/tools on labstore1002
* 23:07 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:30 godog: also disabled puppet on labstore1002 while investigating
* 22:57 ryankemper: [wcqs] Downtimed `wcqs*` until roughly a week from now (while we setup oauth)
* 12:15 godog: trying to manually assemble missing raid on labstore1002 with mdadm --assemble /dev/md/slice51 --uuid 0747643d:b89b36ff:57156095:c33694fc --verbose
* 22:53 legoktm: uploaded PHP 7.4.25 to apt.wm.o (DSA-4992-1)
* 11:19 YuviPanda: powered labstore1002 back up
* 22:44 ryankemper@deploy1002: Started deploy [wdqs/wdqs@e908052] (wcqs): Deploy 0.3.90 to WCQS
* 11:17 YuviPanda: shut down labstore1002, going to powercycle from mgmt
* 22:30 ryankemper@deploy1002: Finished deploy [wdqs/wdqs@13448f1] (wcqs): Deploy 0.3.90 to WCQS (duration: 03m 04s)
* 10:34 YuviPanda: disabled backups on labstore1002 to prevent overwriting of good backups on 2001
* 22:27 ryankemper@deploy1002: Started deploy [wdqs/wdqs@13448f1] (wcqs): Deploy 0.3.90 to WCQS
* 10:08 YuviPanda: rebooted labstore1002
* 21:53 mutante: new project language "pwn" added - Paiwan is a native language of Taiwan, spoken by the Paiwan, a Taiwanese indigenous people. [[phab:T292415|T292415]]
* 04:16 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sun Aug 30 04:16:17 UTC 2015 (duration 16m 16s)
* 21:52 mutante: new project language "ami" added - Sowal no 'Amis is the Formosan language of the 'Amis (or Ami), an indigenous people living along the east coast of Taiwan. - [[phab:T292414|T292414]]
* 02:23 logmsgbot: l10nupdate@tin LocalisationUpdate completed (1.26wmf20) at 2015-08-30 02:23:07+00:00
* 21:50 mutante: log authdns1001 (DNS) - sudo authdns-update, add new project language "ami" (Amis) for [[phab:T292414|T292414]] - edited langlist.tmpl which regenerates all project zones
* 02:20 logmsgbot: l10nupdate@tin Synchronized php-1.26wmf20/cache/l10n: l10nupdate for 1.26wmf20 (duration: 05m 36s)
* 21:40 mutante: authdns1001 (DNS) - sudo authdns-update, add new project language "pwn" (Paiwan) for [[phab:T292415|T292415]]
* 19:47 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on mw2255.codfw.wmnet with reason: DRAC upgrade
* 19:47 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on mw2255.codfw.wmnet with reason: DRAC upgrade
* 19:47 mutante: mw2255 - depooled=inactive (incl "dsh groups"), shut down physically for [[phab:T283582|T283582]] - can be worked on anytime
* 19:45 dzahn@cumin1001: conftool action : set/pooled=inactive; selector: name=mw2255.codfw.wmnet
* 19:45 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw2255.codfw.wmnet
* 19:42 mutante: icinga - ACKing all unhandled CRIT alerts on hosts with "dev" or "test" in their name, regardless of notifications being disabled or not. just so that we get more signal than noise in actual unhandled CRITs in web UI
* 19:40 mutante: cumin2002 - sudo systemctl reset-failed to clear Icinga alert about failed but (now) non-existing service database-backups-snapshots.service, assuming it's a case of "only in active DC"
* 19:12 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1112.eqiad.wmnet with reason: hardware fail
* 19:12 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 4:00:00 on db1112.eqiad.wmnet with reason: hardware fail
* 19:07 kormat@cumin1001: dbctl commit (dc=all): 'Temporarily move mw groups to db1123 [[phab:T294295|T294295]]', diff saved to https://phabricator.wikimedia.org/P17597 and previous config saved to /var/cache/conftool/dbconfig/20211025-190717-kormat.json
* 19:06 mutante: db1112 - powercycling
* 19:04 legoktm@cumin1001: dbctl commit (dc=all): 'Depool db1112 ([[phab:T294295|T294295]])', diff saved to https://phabricator.wikimedia.org/P17596 and previous config saved to /var/cache/conftool/dbconfig/20211025-190436-legoktm.json
* 18:41 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:40 jforrester@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/timeline/includes/Timeline.php: Backport: [[gerrit:734312{{!}}Input may be null when rendering a self-closing tag `<timeline />` (T294020)]] (duration: 00m 55s)
* 18:38 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:28 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:25 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:24 jforrester@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:732971{{!}}Fix some easy codestyle issues]] (duration: 00m 55s)
* 18:22 jforrester@deploy1002: Synchronized w/static.php: Config: [[gerrit:732971{{!}}Fix some easy codestyle issues]] (duration: 00m 54s)
* 18:19 jforrester@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:732840{{!}}Fix array declaration of NS_USER_TALK abbreviation on ruwikiquote (T197058)]] (duration: 00m 55s)
* 18:16 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:15 jforrester@deploy1002: Synchronized wmf-config/flaggedrevs.php: Config: [[gerrit:732836{{!}}flaggedrevs: Drop legacy wgFlaggedRevsStatsAge config, no longer read]] (duration: 00m 55s)
* 18:13 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:12 jforrester@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:732254{{!}}Make reply tool available as opt-out on frwiki (T293687)]] (duration: 00m 56s)
* 17:41 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw2253.codfw.wmnet
* 17:40 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw2253.codfw.wmnet
* 17:39 mutante: mw2253 - scap pull after hw maintenance is over
* 17:32 bd808@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 17:26 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 17:24 mmandere@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:23 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 17:22 XioNoX: update core routers ACLs
* 17:20 mmandere@cumin2002: START - Cookbook sre.dns.netbox
* 16:49 XioNoX: update management routers ACLs
* 16:36 XioNoX: DNS - Add eqsin-ulsfo transport v6 prefix - [[phab:T273308|T273308]]
* 16:31 mmandere@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:28 mmandere@cumin2002: START - Cookbook sre.dns.netbox
* 16:25 accraze@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 16:25 mmandere@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 16:21 mmandere@cumin2002: START - Cookbook sre.dns.netbox
* 16:12 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:10 dzahn@cumin1001: conftool action : set/pooled=inactive; selector: name=mw2253.codfw.wmnet
* 16:09 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:08 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings-labs.php: Config: [[gerrit:734298{{!}}Empty wikibase disabled access entity types on Beta (T294159)]] (beta-only) (duration: 01m 47s)
* 16:04 mmandere@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:01 mmandere@cumin2002: START - Cookbook sre.dns.netbox
* 15:57 jdrewniak@deploy1002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:734328{{!}} Bumping portals to master (T128546)]] (duration: 01m 52s)
* 15:55 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:52 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:49 jdrewniak@deploy1002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:734328{{!}} Bumping portals to master (T128546)]] (duration: 01m 54s)
* 15:46 jbond: upgrade cas/idp to 6.4.2
* 14:56 mutante: mw2253 - shut down and downtimed for 2 days
* 14:50 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on mw2253.codfw.wmnet with reason: DRAC upgrade
* 14:50 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on mw2253.codfw.wmnet with reason: DRAC upgrade
* 14:49 mutante: depooling mw2253 for DRAC upgrade ([[phab:T283582|T283582]])
* 14:48 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw2253.codfw.wmnet
* 14:45 jbond: update cas package
* 14:31 marostegui: Deploy schema change on s3 codfw - [[phab:T291719|T291719]]
* 12:04 ema: cp3062: upgrade varnish to 6.0.8-1wm2 [[phab:T293879|T293879]]
* 11:57 ema: deployment-cache-text06: upgrade varnish to 6.0.8-1wm2 [[phab:T293879|T293879]]
* 11:40 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:36 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:24 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:24 Lucas_WMDE: UTC morning backport+config window done
* 11:22 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/Wikibase.php: Config: [[gerrit:732969{{!}}Remove dispatchLagToMaxLagFactor Wikibase setting (T292604)]] (duration: 00m 54s)
* 11:20 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:18 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/Wikibase.php: Config: [[gerrit:732951{{!}}Remove wikibaseDispatchRedisLockManager config (T292604)]] (duration: 00m 54s)
* 11:14 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:732950{{!}}Remove wmg variables for dispatchChanges.php Wikibase settings (T292604)]] (duration: 00m 55s)
* 11:10 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:09 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/Wikibase.php: Config: [[gerrit:732949{{!}}Remove dispatchChanges.php-related Wikibase settings (T292604)]] (duration: 00m 55s)
* 11:07 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:05 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/Wikibase.php: Config: [[gerrit:732372{{!}}Remove dispatchViaJobs-related Wikibase settings (T291828)]] (duration: 00m 56s)
* 09:52 godog: bounce uwsgi graphite web on graphite2003 - [[phab:T294220|T294220]]
* 09:52 volans@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:48 volans@cumin1001: START - Cookbook sre.dns.netbox
* 09:43 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings-labs.php: Config: [[gerrit:733089{{!}}[BETA CLUSTER] Enable WikibaseLexeme Scribunto access (T294159)]] (merged on Friday, syncing now to avoid outdated files even if it’s just -labs.php) (duration: 00m 55s)
* 09:18 godog: bounce graphite-web on graphite2003 to test timeout bump - [[phab:T294220|T294220]]
* 08:08 XioNoX: merge DNS changes to add drmrs
* 07:50 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 07:50 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 05:47 oblivian@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=eqiad,cluster=parsoid,name=wtp1026.*
* 05:43 _joe_: pooling wtp1042 [[phab:T294212|T294212]]
* 05:26 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1109.eqiad.wmnet with OS buster
* 05:01 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1109.eqiad.wmnet with OS buster
* 04:30 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1109 (s8) for reimage [[phab:T290868|T290868]]', diff saved to https://phabricator.wikimedia.org/P17590 and previous config saved to /var/cache/conftool/dbconfig/20211025-043028-marostegui.json


== 2015-08-29 ==
== 2021-10-23 ==
* 15:26 jynus: killing idle mysql connections from phabricator and setting wait and interactive timeout to 60
* 16:40 dcausse: restarting blazegraph on wdqs1004 and wdqs1006 (free allocators alert)
* 09:30 jynus: SCAP failed, cannot depool db1028
* 15:45 urbanecm: Start server-side upload for 1 video file ([[phab:T289781|T289781]]), testing whether [[phab:T291137|T291137]] is still an issue
* 09:28 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Depool db1028, return ES servers back from maintenance (duration: 00m 03s)
* 09:28 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Depool db1028, return ES servers back from maintenance (duration: 00m 03s)
* 09:05 jynus: about to depool db1028 due to disk issue
* 04:17 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sat Aug 29 04:17:55 UTC 2015 (duration 17m 54s)
* 02:24 logmsgbot: l10nupdate@tin LocalisationUpdate completed (1.26wmf20) at 2015-08-29 02:24:01+00:00
* 02:21 logmsgbot: l10nupdate@tin Synchronized php-1.26wmf20/cache/l10n: l10nupdate for 1.26wmf20 (duration: 05m 48s)


== 2015-08-28 ==
== 2021-10-22 ==
* 23:45 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/234679/ (duration: 06m 56s)
* 23:17 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:51 logmsgbot: bd808@tin Synchronized wmf-config/CommonSettings-labs.php: Use ffmpeg instead of avconv on labs beta (I250fe33) (duration: 06m 05s)
* 23:13 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:05 ori: disabling puppet on tin for a few minutes to test an ssh-agent-proxy change
* 20:57 bblack: re-pooling eqiad in DNS
* 20:04 logmsgbot: catrope@tin Synchronized php-1.26wmf20/resources/src/mediawiki.legacy/shared.css: T110716 (duration: 00m 12s)
* 20:54 legoktm: <XioNoX> I disabled the interface on cr1, going to re-enabled the active on on cr2
* 18:09 robh: updating ldap-codfw cert
* 20:48 legoktm: bblack has temporarily depooled eqiad https://gerrit.wikimedia.org/r/733043
* 17:10 logmsgbot: catrope@tin Synchronized php-1.26wmf20/extensions/Flow/includes/Parsoid/Utils.php: T110676 (duration: 00m 13s)
* 20:41 XioNoX: disable sessions to equinix eqiad IXP
* 17:08 urandom: bouncing Cassandra on restbase1001 to apply default (puppet-managed) settings
* 19:17 urbanecm: Start server-side upload of 1 video file ([[phab:T294134|T294134]])
* 16:03 chasemp: ferm for elasticsearch10(0[8-9|1[0-13])
* 15:06 jbond: upload puppetboard_3.1.0-1_all.deb to ullseye-wikimedia
* 15:31 awight: updated crm from fc0fcc8f5af262b56392d3f4f5998f8ea08c99a8 to 0fc8474338e7a31fdde79287bd667b98cd96a252
* 13:42 ema: deployment-cache-upload06: restart varnish-frontend, package got upgraded to 6.0.8 [[phab:T294116|T294116]]
* 15:23 chasemp: ferm for elasticsearch10[14-17]
* 13:30 jbond: upload python3-pypuppetdb_2.4.0-1_all.deb to bullseye
* 11:09 logmsgbot: aude@tin Synchronized php-1.26wmf20/extensions/Wikidata/Wikidata.php: Sync entry point - updated to work on Jenkins together with ContentTranslation (duration: 00m 12s)
* 10:46 jbond: upload cas_6.4.2-1+wmf10u1
* 10:29 godog: reenable puppet on ms-fe1, ferm changes will go out on monday
* 10:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2026.codfw.wmnet with OS buster
* 09:48 jynus: Cloning es1001 database into es1012
* 10:05 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2026.codfw.wmnet with OS buster
* 09:45 moritzm: enabled ferm for swift on esams
* 09:11 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ mwscript extensions/Wikibase/repo/maintenance/ResubmitChanges.php wikidatawiki --minimum-age $((60*60*12)) # [[phab:T294029|T294029]]
* 09:28 moritzm: enabled ferm on strontium puppetmaster backend
* 09:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2025.codfw.wmnet with OS buster
* 09:00 moritzm: enabled ferm on rhodium puppetmaster backend
* 08:36 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2025.codfw.wmnet with OS buster
* 08:29 moritzm: uploaded debdeploy 0.0.3 to carbon
* 08:27 ema@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp3062.esams.wmnet,service=(varnish-fe{{!}}ats-tls)
* 08:23 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Depool es1001, increas weight of es1011, pool es1014 for the first time (duration: 00m 13s)
* 08:24 ema@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp3062.esams.wmnet,service=(varnish-fe{{!}}ats-tls)
* 05:59 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Fri Aug 28 05:59:09 UTC 2015 (duration 59m 8s)
* 08:23 ema: cp3062: test 0008-vsl_check_e_inval_assertion.patch https://gerrit.wikimedia.org/r/c/operations/debs/varnish4/+/732913/ [[phab:T293879|T293879]]
* 04:58 logmsgbot: ori@tin Synchronized php-1.26wmf20/includes/parser/Parser.php: 754b222daf: Add ParserOutput cache and expiry times to NewPP report (duration: 00m 13s)
* 08:00 ema: deployment-cache-text06: test 0008-vsl_check_e_inval_assertion.patch https://gerrit.wikimedia.org/r/c/operations/debs/varnish4/+/732913/ [[phab:T293879|T293879]]
* 02:41 logmsgbot: l10nupdate@tin LocalisationUpdate completed (1.26wmf20) at 2015-08-28 02:41:26+00:00
* 05:54 marostegui@cumin1001: dbctl commit (dc=all): 'db1126 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17580 and previous config saved to /var/cache/conftool/dbconfig/20211022-055403-root.json
* 02:35 logmsgbot: l10nupdate@tin Synchronized php-1.26wmf20/cache/l10n: l10nupdate for 1.26wmf20 (duration: 10m 47s)
* 05:39 marostegui@cumin1001: dbctl commit (dc=all): 'db1126 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17579 and previous config saved to /var/cache/conftool/dbconfig/20211022-053900-root.json
* 01:59 Tim: on ruthenium: started parsoid_vd which was previously killed by oom-killer
* 05:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1126 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17578 and previous config saved to /var/cache/conftool/dbconfig/20211022-052356-root.json
* 01:58 Tim: on ruthenium, reduced parsoid-rt-client concurrency from 16 to 8 since it was OOM and oom-killer was killing random things
* 05:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1126 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17577 and previous config saved to /var/cache/conftool/dbconfig/20211022-050852-root.json
* 01:37 Tim: on ruthenium restarted parsoid-rt-client and parsoid-vd-client
* 04:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1126 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17576 and previous config saved to /var/cache/conftool/dbconfig/20211022-045349-root.json
* 00:24 mutante: powercycled mw2027
* 04:46 marostegui_: Deploy schema change on s8 codfw - [[phab:T291719|T291719]]
* 00:19 logmsgbot: krenair@tin Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/234450/ (duration: 01m 14s)
* 04:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1126 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17575 and previous config saved to /var/cache/conftool/dbconfig/20211022-043845-root.json
* 00:06 logmsgbot: krenair@tin Synchronized wmf-config/mobile.php: live hack to make previous commit work (duration: 01m 14s)
* 02:59 ejegg: updated payments-wiki from {{Gerrit|088a8cda1e}} to {{Gerrit|6e810fb401}}
* 00:05 Krenair: Another codfw host broke: mw2027
* 00:01 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/234330/ (duration: 00m 13s)


== 2015-08-27 ==
== 2021-10-21 ==
* 23:58 logmsgbot: krenair@tin Synchronized php-1.26wmf20/extensions/MobileFrontend/includes/MobileFormatter.php: https://gerrit.wikimedia.org/r/#/c/234331/1 (duration: 00m 12s)
* 23:40 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:57 logmsgbot: krenair@tin Synchronized php-1.26wmf20/extensions/MobileFrontend/includes/config/Experimental.php: https://gerrit.wikimedia.org/r/#/c/234331/1 (duration: 00m 14s)
* 23:38 jforrester@deploy1002: Synchronized w/fatal-error.php: Config: [[gerrit:730038{{!}}build: Upgrade composer testing stack to latest as used Wikimedia-wide]] (duration: 00m 54s)
* 23:55 logmsgbot: krenair@tin Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/233439/ (duration: 00m 12s)
* 23:37 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:30 logmsgbot: krenair@tin Synchronized php-1.26wmf20/extensions/Gadgets/extension.json: touch (duration: 00m 13s)
* 23:37 jforrester@deploy1002: Synchronized w/static.php: Config: [[gerrit:730038{{!}}build: Upgrade composer testing stack to latest as used Wikimedia-wide]] (duration: 00m 54s)
* 23:24 logmsgbot: krenair@tin Synchronized php-1.26wmf20/includes/DefaultSettings.php: https://gerrit.wikimedia.org/r/#/c/234328/ (duration: 00m 12s)
* 23:36 jforrester@deploy1002: Synchronized multiversion/: Config: [[gerrit:730038{{!}}build: Upgrade composer testing stack to latest as used Wikimedia-wide]] (duration: 00m 55s)
* 23:24 logmsgbot: krenair@tin Synchronized php-1.26wmf20/includes/registration/ExtensionProcessor.php: https://gerrit.wikimedia.org/r/#/c/234328/ (duration: 00m 12s)
* 23:34 jforrester@deploy1002: Synchronized docroot/noc/conf/index.php: Config: [[gerrit:730038{{!}}build: Upgrade composer testing stack to latest as used Wikimedia-wide]] (duration: 00m 54s)
* 23:23 logmsgbot: krenair@tin Synchronized php-1.26wmf20/includes/MWNamespace.php: https://gerrit.wikimedia.org/r/#/c/234328/ (duration: 00m 13s)
* 23:33 jforrester@deploy1002: Synchronized wmf-config: Config: [[gerrit:730038{{!}}build: Upgrade composer testing stack to latest as used Wikimedia-wide]] (duration: 00m 55s)
* 23:15 logmsgbot: krenair@tin Synchronized wmf-config/throttle.php: https://gerrit.wikimedia.org/r/#/c/234009/ (duration: 00m 13s)
* 23:32 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 23:04 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/233100/ (duration: 00m 12s)
* 23:28 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:11 chasemp: ferm setup on elasticsearch10(1[8-9|2[0-3])
* 23:25 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:06 logmsgbot: twentyafterfour@tin rebuilt wikiversions.cdb and synchronized wikiversions files: wikipedia wikis to 1.26wmf20
* 23:25 thcipriani@deploy1002: Synchronized wmf-config: Config: [[gerrit:730946{{!}}CommonSettings: Drop legacy CentralAuth config flag, never read (T277932)]] (duration: 00m 55s)
* 19:57 logmsgbot: twentyafterfour@tin Synchronized php-1.26wmf20/includes/media/XMP.php: deploy fix for T89532 on 1.26wmf20 (duration: 00m 13s)
* 23:18 thcipriani@deploy1002: Synchronized tests/multiversion/StaticSettingsTest.php: Config: [[gerrit:720362{{!}}Add new config names for CentralAuth denylist controls (T277932)]] (duration: 00m 55s)
* 18:16 chasemp: setting up ferm on elastic1027-31
* 23:15 thcipriani@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:720362{{!}}Add new config names for CentralAuth denylist controls (T277932)]] (duration: 00m 55s)
* 17:47 logmsgbot: krenair@tin Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/234320/ (duration: 00m 13s)
* 23:10 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:43 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/234320/2 (duration: 00m 13s)
* 23:07 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:37 urandom: ack'd Cassandra process alert on restbase1001; temporary command args have pushed the class name beyond the limit
* 22:42 mutante: [[phab:T294038|T294038]] [krb1001:~] $ sudo manage_principals.py create effeietsanders ... Principal successfully created.  . .Successfully sent email
* 17:34 logmsgbot: krenair@tin Synchronized multiversion/MWMultiVersion.php: (no message) (duration: 00m 12s)
* 21:44 ebernhardson@deploy1002: Finished deploy [wdqs/wdqs@13448f1] (wcqs): Deploy 0.3.90 to WCQS (duration: 02m 47s)
* 17:24 logmsgbot: krenair@tin Synchronized multiversion/MWMultiVersion.php: https://gerrit.wikimedia.org/r/#/c/234320/ (duration: 00m 12s)
* 21:41 ebernhardson@deploy1002: Started deploy [wdqs/wdqs@13448f1] (wcqs): Deploy 0.3.90 to WCQS
* 17:08 urandom: bouncing Cassandra on restbase1001 to apply temporary GC settings
* 20:54 ebernhardson@deploy1002: Finished deploy [wdqs/wdqs@1309a97] (wcqs): dry run wcqs deploy (duration: 00m 13s)
* 16:51 moritzm: ferm rules on logstash100[1-3] have been amended to allow grafana from reading dashboard configs
* 20:53 ebernhardson@deploy1002: Started deploy [wdqs/wdqs@1309a97] (wcqs): dry run wcqs deploy
* 16:39 bd808: new ferm rules on logstash100[1-3] are blocking grafana from reading dashboard configs.
* 20:53 ebernhardson@deploy1002: Finished deploy [wdqs/wdqs@1309a97] (wcqs): dry run wcqs deploy (duration: 00m 35s)
* 16:22 moritzm: ferm enabled on logstash1003
* 20:52 ebernhardson@deploy1002: Started deploy [wdqs/wdqs@1309a97] (wcqs): dry run wcqs deploy
* 16:18 moritzm: ferm enabled on logstash1002
* 20:04 otto@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 16:16 bd808: ferm enabled on logstash1001
* 20:04 otto@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 16:06 bd808: logstash1001 back up after system reboot; we applied a default drop rule without applying the other iptables changes; will try again
* 20:02 otto@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 15:58 chasemp: rebooting logstash1001.mgmt.eqiad.wmnet for moritz as it is having issues
* 20:02 otto@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 15:47 bblack: killed hung ubuntu mirror rsync commands on carbon, from Jul 10
* 19:46 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:45 bd808: logstash1001 not responding over ssh following ferm rules application; moritzm investigating
* 19:43 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:30 bd808: Disabled puppet on logstash100[1-3] prior to trying to enable ferm
* 19:42 legoktm@deploy1002: Synchronized wmf-config/CommonSettings.php: Update $wgTimelineFonts for new path to unifont in Shellbox container ([[phab:T293050|T293050]]) (duration: 00m 55s)
* 15:11 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable newarticle campaign in itwiki [[gerrit:234223]] (duration: 01m 52s)
* 19:38 legoktm@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'shellbox-timeline' for release 'main' .
* 14:52 bblack: re-imaging lvs200[123]
* 19:35 legoktm@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'shellbox-timeline' for release 'main' .
* 14:47 godog: reenable puppet on ms-be1*
* 19:31 otto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 14:22 godog: disable puppet on ms-fe1 / ms-be1 in prepration for puppet work
* 19:23 legoktm@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'shellbox-timeline' for release 'main' .
* 14:15 godog: reenable puppet on ms-fe2*
* 19:10 ebernhardson@deploy1002: Finished deploy [wdqs/wdqs@b2912b7]: deploy 0.3.90, incl oauth, to wcqs (duration: 00m 23s)
* 13:47 bblack: re-imaging lvs2004 + lvs2005
* 19:09 ebernhardson@deploy1002: Started deploy [wdqs/wdqs@b2912b7]: deploy 0.3.90, incl oauth, to wcqs
* 13:29 ottomata: doing rolling restart of kafka brokers to apply auto_create_topics change
* 19:07 ebernhardson@deploy1002: Finished deploy [wdqs/wdqs@b2912b7]: (no justification provided) (duration: 00m 08s)
* 13:21 godog: enable puppet on ms-be2*
* 19:07 ebernhardson@deploy1002: Started deploy [wdqs/wdqs@b2912b7]: (no justification provided)
* 13:21 ottomata: stopping kafka on analytics1021, it is no longer a kafka broker.
* 18:53 urbanecm: Deploy security patch for [[phab:T285116|T285116]] (wmf.4, wmf.5)
* 13:09 godog: disable puppet on ms-be2* in preparation for firewall changes
* 18:53 mutante: dumpsdata1003 - sudo systemctl reset-failed to clear Icinga alert about failed cleanup_tmpdumps.service
* 13:09 jynus: cloning es1008 into es1014
* 17:55 mutante: that's a key for https://www.worldcat.org/whatis/default.jsp btw for those wondering
* 13:04 ottomata: running leader election now that all topics and partitions are rebalanced across new kafka nodes
* 17:53 mutante: citoid - replaced "wskey" for worldcat in private repo as requested on [[phab:T294010|T294010]] (is in 4 places, 3 for deployment_server/k8s and one remnant for scb)
* 12:46 bblack: re-imaging lvs2006
* 17:53 mvolz@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'citoid' for release 'production' .
* 12:45 andrewbogott: re-imaging labnet1001 (I hope)
* 17:52 mvolz@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'citoid' for release 'production' .
* 11:33 _joe_: restarted hhvm on mw1143, locked in __lll_lock_wait for stat_cache deadlock
* 17:50 mvolz@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'citoid' for release 'staging' .
* 11:10 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Pool es1011 for the first time, depool es1008 (duration: 00m 12s)
* 16:17 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 09:27 jynus: installing and configuring servers es1012-es1019
* 16:14 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 06:39 ostriches: tin: dropped useless "gerrit" remote from /srv/mediawiki-staging (uses ssh, lol), pointed {origin,readonly} at the actual repo instead of a redirect.
* 16:13 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 06:00 _joe_: powercycling mw2140, not responding to ping, blank console
* 16:12 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 03:17 awight: deploy config cleanup for paymentswiki
* 16:07 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/repo/tests/: Backport: [[gerrit:732669{{!}}Remove dispatchViaJobs repo setting (T292604)]] (3/3) (duration: 00m 56s)
* 02:38 logmsgbot: l10nupdate@tin Synchronized php-1.26wmf19/cache/l10n: l10nupdate for 1.26wmf19 (duration: 10m 44s)
* 16:06 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/repo/config/: Backport: [[gerrit:732669{{!}}Remove dispatchViaJobs repo setting (T292604)]] (2/3) (duration: 00m 54s)
* 02:16 awight: push config change to the payments orphan slayer: explitly give stomp port to work around strict notice, clean up unused globals. T109911
* 16:05 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 01:32 ejegg: updated payments from 8ba4b5299f195cf48e6809b18a21e2d53f6eec1b to 6ac552f280fb839069d117386c4ecbe9e52f90a8
* 16:04 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/repo/includes/: Backport: [[gerrit:732669{{!}}Remove dispatchViaJobs repo setting (T292604)]] (1/3) (duration: 00m 56s)
* 00:31 twentyafterfour: finished phabricator upgrade, everything appears to be working
* 16:03 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 00:24 logmsgbot: aaron@tin Synchronized php-1.26wmf19/extensions/CentralAuth: 47e181adb2898977b146de7398eaa35aebb870e3 (duration: 01m 13s)
* 16:02 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:22 logmsgbot: aaron@tin Synchronized php-1.26wmf20/extensions/CentralAuth: 47e181adb2898977b146de7398eaa35aebb870e3 (duration: 01m 13s)
* 16:01 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/repo/tests/: Backport: [[gerrit:732668{{!}}Remove dispatchViaJobsPruneChangesTableInJobEnabled repo setting (T292604)]] (3/3) (duration: 00m 56s)
* 00:20 twentyafterfour: taking phabricator offline for scheduled upgrade
* 15:59 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/repo/config/: Backport: [[gerrit:732668{{!}}Remove dispatchViaJobsPruneChangesTableInJobEnabled repo setting (T292604)]] (2/3) (duration: 00m 55s)
* 15:58 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/repo/includes/: Backport: [[gerrit:732668{{!}}Remove dispatchViaJobsPruneChangesTableInJobEnabled repo setting (T292604)]] (1/3) (duration: 00m 57s)
* 15:43 robh@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:21 robh@cumin1001: START - Cookbook sre.dns.netbox
* 15:14 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/repo/tests/: Backport: [[gerrit:732667{{!}}Remove dispatchViaJobsAllowedClients repo setting (T292604)]] (3/3) (duration: 00m 56s)
* 15:13 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:13 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/repo/config/: Backport: [[gerrit:732667{{!}}Remove dispatchViaJobsAllowedClients repo setting (T292604)]] (1/3) (duration: 00m 54s)
* 15:12 Lucas_WMDE: my next message accidentally says 1/3 again but it’s 2/3, sorry
* 15:11 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/repo/includes/: Backport: [[gerrit:732667{{!}}Remove dispatchViaJobsAllowedClients repo setting (T292604)]] (1/3) (duration: 00m 56s)
* 15:10 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:56 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest1001.eqiad.wmnet with OS buster
* 14:42 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/repo/config/Wikibase.default.php: Backport: [[gerrit:732666{{!}}Enable dispatching via jobs by default (T291828)]] (duration: 00m 55s)
* 14:41 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:39 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/client/: Backport: [[gerrit:732674{{!}}Fix ExternalUserNames service wiring for local database]] (duration: 00m 57s)
* 14:38 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:33 volans@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1001.eqiad.wmnet with OS buster
* 14:26 otto@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 14:26 otto@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 14:19 otto@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 14:19 otto@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 13:56 otto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 13:55 otto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 13:49 otto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 13:49 otto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 13:34 volans: uploaded spicerack_1.0.6 to apt.wikimedia.org buster-wikimedia,bullseye-wikimedia
* 13:08 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:05 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:04 hashar@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.38.0-wmf.5  refs [[phab:T281169|T281169]]
* 12:56 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 7 hosts with reason: Schema change s3 [[phab:T278619|T278619]]
* 12:56 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on 7 hosts with reason: Schema change s3 [[phab:T278619|T278619]]
* 12:52 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 14 hosts with reason: Schema change s1 [[phab:T278619|T278619]]
* 12:52 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on 14 hosts with reason: Schema change s1 [[phab:T278619|T278619]]
* 12:48 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 13 hosts with reason: Schema change s4 [[phab:T278619|T278619]]
* 12:48 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on 13 hosts with reason: Schema change s4 [[phab:T278619|T278619]]
* 12:43 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 9 hosts with reason: Schema change s2 [[phab:T278619|T278619]]
* 12:43 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 9 hosts with reason: Schema change s2 [[phab:T278619|T278619]]
* 12:34 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 11 hosts with reason: Schema change s7 [[phab:T278619|T278619]]
* 12:34 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 11 hosts with reason: Schema change s7 [[phab:T278619|T278619]]
* 11:55 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 9 hosts with reason: Schema change s5 [[phab:T278619|T278619]]
* 11:54 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 9 hosts with reason: Schema change s5 [[phab:T278619|T278619]]
* 11:47 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 9 hosts with reason: Schema change s6 [[phab:T278619|T278619]]
* 11:47 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 9 hosts with reason: Schema change s6 [[phab:T278619|T278619]]
* 11:13 Lucas_WMDE: UTC morning backport+config window done
* 11:10 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ mwscript extensions/Wikibase/repo/maintenance/ResubmitChanges.php wikidatawiki --minimum-age $((60*60*12)) # [[phab:T294008|T294008]]
* 11:10 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:07 jgiannelos@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:730848{{!}}Configure event stream for map tiles state change (T289771)]] (duration: 01m 04s)
* 11:07 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:48 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 10:48 ayounsi@cumin1001: START - Cookbook sre.network.cf
* 10:48 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 10:47 ayounsi@cumin1001: START - Cookbook sre.network.cf
* 10:14 jbond: mergeing refactor of P:base Gerrit:714975
* 09:54 ayounsi@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:49 ayounsi@cumin1001: START - Cookbook sre.dns.netbox
* 08:56 urbanecm@deploy1002: Synchronized private/PrivateSettings.php: Update [[phab:T250887|T250887]] mitigations (duration: 01m 03s)
* 08:33 ema@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp3062.esams.wmnet,service=(varnish-fe{{!}}ats-tls)
* 08:26 ema@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp3062.esams.wmnet,service=(varnish-fe{{!}}ats-tls)
* 08:25 ema: cp3062: revert vsl_space experiment [[phab:T293879|T293879]]
* 08:24 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host graphite1004.eqiad.wmnet with OS bullseye
* 08:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17563 and previous config saved to /var/cache/conftool/dbconfig/20211021-080330-root.json
* 07:56 filippo@cumin1001: START - Cookbook sre.hosts.reimage for host graphite1004.eqiad.wmnet with OS bullseye
* 07:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17562 and previous config saved to /var/cache/conftool/dbconfig/20211021-074826-root.json
* 07:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17561 and previous config saved to /var/cache/conftool/dbconfig/20211021-073323-root.json
* 07:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17560 and previous config saved to /var/cache/conftool/dbconfig/20211021-071819-root.json
* 07:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17559 and previous config saved to /var/cache/conftool/dbconfig/20211021-070315-root.json
* 06:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17558 and previous config saved to /var/cache/conftool/dbconfig/20211021-064812-root.json
* 06:35 elukey: `systemctl reload nginx` on cloudelastic100[5,6] to pick up the new TLS certificate and clear alerts - [[phab:T293826|T293826]]
* 04:47 marostegui: Deploy schema change on s5 codfw - [[phab:T291719|T291719]]
* 04:37 marostegui: Deploy schema change on s6 codfw - [[phab:T291719|T291719]]
* 04:04 legoktm: restarted apache on lists1001 so it only uses new TLS cert ([[phab:T293826|T293826]])
* 03:29 eileen: civicrm revision changed from {{Gerrit|e889831012}} to {{Gerrit|733a8fceda}}, config revision is {{Gerrit|eed79486d5}}
* 00:06 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:01 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .


== 2015-08-26 ==
== 2021-10-20 ==
* 23:59 Krinkle: mwscript deleteEqualMessages.php --wiki rowiki
* 23:56 thcipriani@deploy1002: Finished scap: Backport: [[gerrit:732336{{!}}Restore title to mobile skin without logo (T290525)]] (duration: 11m 41s)
* 23:57 yurik: git deployed tilerator - had the 4/5 issue - https://phabricator.wikimedia.org/T110434
* 23:44 thcipriani@deploy1002: Started scap: Backport: [[gerrit:732336{{!}}Restore title to mobile skin without logo (T290525)]]
* 23:46 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/234072/ (duration: 01m 12s)
* 23:42 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:37 logmsgbot: krenair@tin Synchronized php-1.26wmf20/maintenance/deleteEqualMessages.php: https://gerrit.wikimedia.org/r/#/c/234038/ (duration: 01m 12s)
* 23:39 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:35 logmsgbot: krenair@tin Synchronized php-1.26wmf19/maintenance/deleteEqualMessages.php: https://gerrit.wikimedia.org/r/#/c/234037/1 (duration: 01m 12s)
* 23:30 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:27 yurik: deployed kartotherian
* 23:29 tstarling@deploy1002: Synchronized wmf-config/CommonSettings.php: fawiki require login for creation of pages in the draft namespace [[phab:T291018|T291018]] (duration: 01m 02s)
* 23:21 jynus: cloning es1005 into es1011, ETA 9 hours
* 23:27 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:41 ori: armed keyholder on tin
* 23:27 tstarling@deploy1002: Synchronized wmf-config/InitialiseSettings.php: fawiki require login to edit main namespace [[phab:T291018|T291018]] (duration: 01m 04s)
* 22:40 ori: Disabled Puppet on mw1017 for 2hrs and applied I059b0c96c9 for testing.
* 22:13 dancy@deploy1002: Synchronized README: testing (4/4) (duration: 02m 52s)
* 21:55 logmsgbot: krinkle@tin Synchronized php-1.26wmf19/includes/poolcounter/PoolWorkArticleView.php: (no message) (duration: 01m 12s)
* 22:00 dancy@deploy1002: Synchronized README: testing (3/4) (duration: 02m 57s)
* 21:48 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Depool es1005 (duration: 01m 12s)
* 21:54 dancy@deploy1002: Synchronized README: testing (2) (duration: 01m 02s)
* 21:40 logmsgbot: krinkle@tin Synchronized php-1.26wmf20/includes/poolcounter/PoolWorkArticleView.php: (no message) (duration: 01m 12s)
* 21:52 dancy@deploy1002: Synchronized README: (no justification provided) (duration: 01m 03s)
* 21:32 ori: Disabling Puppet on tin again to test an ssh-agent-proxy change
* 21:50 dancy: Testing a series of one-file scap sync-file runs
* 20:30 logmsgbot: ori@tin Synchronized README: testing ssh-agent-proxy changes (duration: 00m 13s)
* 21:22 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:25 ori: Disabling puppet on tin and hacking some debug logging into ssh-agent-proxy
* 21:19 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:24 ori: armed ssh-agent key on mira
* 21:10 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:21 logmsgbot: krinkle@tin Synchronized php-1.26wmf20/includes/poolcounter/PoolWorkArticleView.php: (no message) (duration: 00m 03s)
* 21:08 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|b9cf996a38d82fdd67e600a5a951e88423957e8d}}: Promote Growth features out of darkmode on several wikis  ([[phab:T291826|T291826]], [[phab:T255037|T255037]], [[phab:T287878|T287878]]) (duration: 01m 04s)
* 20:11 subbu: deployed parsoid version 44d657de
* 21:07 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:52 logmsgbot: krenair@tin Synchronized php-1.26wmf20/extensions/Echo/includes/mapper/EventMapper.php: https://gerrit.wikimedia.org/r/#/c/234082/ (duration: 00m 12s)
* 20:38 eileen: civicrm revision changed from {{Gerrit|9b5e0d015b}} to {{Gerrit|e889831012}}, config revision is {{Gerrit|eed79486d5}}
* 19:47 mutante: sodium - deleting shunted messages older than 7 days
* 20:25 legoktm: uploaded php7.4 on buster to apt.wm.o ([[phab:T293449|T293449]])
* 19:23 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/234042/ (duration: 00m 12s)
* 19:24 ebernhardson@deploy1002: Finished deploy [search/mjolnir/deploy@985a139]: bulk_daemon: detect cross-cluste config from old and new locations (duration: 00m 46s)
* 19:22 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings-labs.php: https://gerrit.wikimedia.org/r/#/c/234024/ (duration: 00m 12s)
* 19:24 ebernhardson@deploy1002: Started deploy [search/mjolnir/deploy@985a139]: bulk_daemon: detect cross-cluste config from old and new locations
* 19:20 logmsgbot: krenair@tin Synchronized multiversion/MWWikiversions.php: https://gerrit.wikimedia.org/r/#/c/232672/ (duration: 00m 12s)
* 19:09 mutante: disabling puppet on mw* for a minute to deploy a change
* 18:50 logmsgbot: krinkle@tin Synchronized php-1.26wmf20/maintenance/deleteEqualMessages.php: (no message) (duration: 00m 11s)
* 18:41 otto@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 18:50 logmsgbot: krinkle@tin Synchronized php-1.26wmf19/maintenance/deleteEqualMessages.php: (no message) (duration: 00m 13s)
* 18:41 otto@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 18:38 twentyafterfour: ^ stupid typo. That sync was group1 to 1.26wmf20
* 18:31 otto@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 18:37 logmsgbot: twentyafterfour@tin rebuilt wikiversions.cdb and synchronized wikiversions files: tig
* 18:30 otto@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 18:31 logmsgbot: ori@tin Synchronized w/404.php: Ided1facc0: Remove auto-redirection from 404 page. (duration: 00m 13s)
* 18:24 otto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 17:51 ejegg: updated SmashPig from 258f2c917b1ae50b01231927bcd6f58ecaa8940b to fdb053efa617162ac9f695e493c390987a069140
* 17:28 mutante: [krb1001:~] $ sudo manage_principals.py create statwithlatte --email_address=naray-ctr@wikimedia.org -  [[phab:T293810|T293810]]
* 17:30 urandom: bouncing Cassandra on restbase1001 to apply temporary GC setting
* 17:27 mutante: [krb1001:~] $ sudo manage_principals.py create statwithlatte --email_address=naray-ctr@wikimedia.org
* 17:12 andrewbogott: ok, /now/ I’m running a dist-upgrade on labcontrol1001, to sort out weird oslo dependencies
* 17:11 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:09 chasemp: adding firewall to elasticsearch2[4-6] (3 was just done as a pilot)
* 17:05 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:03 andrewbogott: upgraded labnet1002 nova services to Juno
* 17:01 razzi@deploy1002: Finished deploy [analytics/refinery@9e3295f]: Regular analytics weekly train [analytics/refinery@9e3295f] (duration: 23m 42s)
* 16:34 andrewbogott: stopping keystone, updating db, restarting
* 17:00 hashar@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/client: Update deprecated calls to ParserOutput in ShortDescHandler - [[phab:T293860|T293860]] (duration: 01m 03s)
* 16:18 andrewbogott: switching labcontrol1001 hiera to Juno which will add the cloud-archive repo for Juno.
* 16:56 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:11 andrewbogott: backing up labs openstack databases into /home/andrew/openstackdbbackups on db1009
* 16:53 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:11 andrewbogott: starting labs openstack update to Juno
* 16:53 hashar@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/LiquidThreads/pages/LqtDiscussionPager.php: Remove deprecated usage of setProperty - [[phab:T293895|T293895]] (duration: 01m 03s)
* 15:53 moritzm: ferm enabled on elastic1023
* 16:49 hashar@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/GeoCrumbs: Replace use of deprecated ParserOutput:getProperty() - [[phab:T293894|T293894]] (duration: 01m 09s)
* 15:45 godog: repool restbase1009 in pybal
* 16:44 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:28 logmsgbot: thcipriani@tin Synchronized php-1.26wmf20/extensions/Wikidata: SWAT: Update Wikidata - wrap usage tracking batch updates in transaction [[gerrit:233970]] (duration: 00m 23s)
* 16:41 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:47 andrewbogott: rebooting/reimaging labnet1001
* 16:37 razzi@deploy1002: Started deploy [analytics/refinery@9e3295f]: Regular analytics weekly train [analytics/refinery@9e3295f]
* 13:11 mobrovac: restbase deploying 1dfba85
* 16:36 razzi: deploy refinery change for https://phabricator.wikimedia.org/T287084
* 12:54 yurik: git synced kartotherian
* 16:13 jbond: upload cas_6.4.2-1_amd64.deb
* 11:02 jynus: dropping optin_survey_old table on all wikis
* 15:42 volans@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:33 godog: reenable puppet on ms-fe/ms-be, base::firewall still not enabled
* 15:39 volans@cumin2002: START - Cookbook sre.dns.netbox
* 09:58 godog: test-reboot ms-be2001
* 14:57 moritzm: installing modsecurity-crs security updates on Buster
* 08:17 godog: disable puppet on ms-be/ms-fe in preparation for merging firewall changes
* 14:48 moritzm: installing xmlgraphics-commons security updates on Buster
* 07:53 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Wed Aug 26 07:53:31 UTC 2015 (duration 53m 30s)
* 14:46 moritzm: installing irssi security updates on Buster
* 07:01 jynus: restarting mw1239 HHVM, which is unresponsive
* 14:44 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 04:47 logmsgbot: ori@tin Synchronized wmf-config: I73721936: Enable ParsoidBatchAPI everywhere (duration: 00m 13s)
* 14:44 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 03:11 logmsgbot: l10nupdate@tin LocalisationUpdate completed (1.26wmf20) at 2015-08-26 03:11:29+00:00
* 14:35 moritzm: installing commons-io security updates on Buster
* 03:06 logmsgbot: awight@tin Synchronized wmf-config/InitialiseSettings-labs.php: Push labs config to keep in sync with master (duration: 00m 13s)
* 14:27 ema: cp3062: test higher vsl_space values [[phab:T293879|T293879]]
* 03:05 logmsgbot: l10nupdate@tin Synchronized php-1.26wmf20/cache/l10n: l10nupdate for 1.26wmf20 (duration: 10m 45s)
* 14:27 kevinbazira@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 02:37 logmsgbot: l10nupdate@tin LocalisationUpdate completed (1.26wmf19) at 2015-08-26 02:37:51+00:00
* 14:12 moritzm: installing ruby2.3 security updates
* 02:34 logmsgbot: l10nupdate@tin Synchronized php-1.26wmf19/cache/l10n: l10nupdate for 1.26wmf19 (duration: 06m 29s)
* 13:40 moritzm: installing apache2 security updates on buster
* 02:00 ottomata: kafka topic webrequest_upload has finished rebalancing across new brokers. starting move of last topic webrequest_text
* 13:27 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 01:50 logmsgbot: mattflaschen@tin Synchronized php-1.26wmf19/extensions/Flow/: Sync Flow for reply fix (duration: 00m 15s)
* 13:24 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:28 logmsgbot: ori@tin Synchronized php-1.26wmf20/extensions/Scribunto/engines/LuaCommon/LuaCommon.php: (no message) (duration: 00m 13s)
* 13:21 hashar@deploy1002: Synchronized php: group1 wikis to 1.38.0-wmf.5  refs [[phab:T281169|T281169]] (duration: 01m 02s)
* 00:26 logmsgbot: ori@tin Synchronized php-1.26wmf19/extensions/Scribunto/engines/LuaCommon/LuaCommon.php: (no message) (duration: 00m 13s)
* 13:20 hashar@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.38.0-wmf.5  refs [[phab:T281169|T281169]]
* 00:26 Danny_B: 2586dd1c7c obviously broke many pages
* 13:11 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on 7 hosts with reason: Schema change s3 [[phab:T277116|T277116]]
* 00:19 logmsgbot: ori@tin Synchronized php-1.26wmf19/extensions/Scribunto/engines/LuaCommon/LuaCommon.php: (no message) (duration: 00m 14s)
* 13:11 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 3:00:00 on 7 hosts with reason: Schema change s3 [[phab:T277116|T277116]]
* 00:14 logmsgbot: ori@tin Synchronized wmf-config/CommonSettings.php: I79ffa78fa: Collection/OCG: Turn on plain text output format in Book Creator (duration: 00m 12s)
* 13:04 ema@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp3062.esams.wmnet,service=ats-tls
* 00:12 logmsgbot: ori@tin Synchronized php-1.26wmf20/extensions/Scribunto/engines/LuaCommon/LuaCommon.php: 2586dd1c7c: Updated mediawiki/core Project: mediawiki/extensions/Scribunto (duration: 00m 13s)
* 13:04 ema@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp3062.esams.wmnet,service=varnish-fe
* 12:51 ema: cp3062: bump vsl_space from 80M (default) to 512M [[phab:T293879|T293879]] - varnish restart needed
* 12:37 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on 14 hosts with reason: Schema change s1 [[phab:T277116|T277116]]
* 12:36 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 3:00:00 on 14 hosts with reason: Schema change s1 [[phab:T277116|T277116]]
* 12:17 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:09 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:02 urbanecm@deploy1002: Finished scap: {{Gerrit|802d3b7}}: {{Gerrit|e4f7f85}}: CreateAccountCampaign: Support for recurring donors ([[phab:T293699|T293699]]) (duration: 25m 19s)
* 11:57 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:49 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:46 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2007.codfw.wmnet
* 11:40 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2007.codfw.wmnet
* 11:37 btullis@cumin1001: END (PASS) - Cookbook sre.hadoop.roll-restart-masters (exit_code=0) restart masters for Hadoop test cluster: Restart of jvm daemons. - btullis@cumin1001
* 11:37 urbanecm@deploy1002: Started scap: {{Gerrit|802d3b7}}: {{Gerrit|e4f7f85}}: CreateAccountCampaign: Support for recurring donors ([[phab:T293699|T293699]])
* 11:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2005.codfw.wmnet
* 11:21 moritzm: installing ffmpeg security updates
* 11:15 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|e520fc57411bb19123766192cd636396ea6fc59d}}: GrowthExperiments: Add campaign pattern for enwiki ([[phab:T293699|T293699]]) (duration: 01m 22s)
* 11:11 btullis@cumin1001: START - Cookbook sre.hadoop.roll-restart-masters restart masters for Hadoop test cluster: Restart of jvm daemons. - btullis@cumin1001
* 11:10 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:07 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:57 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2005.codfw.wmnet
* 10:13 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 13 hosts with reason: Schema change s4 [[phab:T277116|T277116]]
* 10:13 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 13 hosts with reason: Schema change s4 [[phab:T277116|T277116]]
* 09:59 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 9 hosts with reason: Schema change s2 [[phab:T277116|T277116]]
* 09:59 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 9 hosts with reason: Schema change s2 [[phab:T277116|T277116]]
* 09:52 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 11 hosts with reason: Schema change s7 [[phab:T277116|T277116]]
* 09:52 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 11 hosts with reason: Schema change s7 [[phab:T277116|T277116]]
* 09:05 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 9 hosts with reason: Schema change s5 [[phab:T277116|T277116]]
* 09:04 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 9 hosts with reason: Schema change s5 [[phab:T277116|T277116]]
* 08:50 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 9 hosts with reason: Schema change s6 [[phab:T277116|T277116]]
* 08:50 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 9 hosts with reason: Schema change s6 [[phab:T277116|T277116]]
* 08:01 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 08:01 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 07:16 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1118.eqiad.wmnet with OS buster
* 07:09 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 06:49 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1118.eqiad.wmnet with OS buster
* 06:45 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1118 (s1) for reimage [[phab:T290865|T290865]]', diff saved to https://phabricator.wikimedia.org/P17552 and previous config saved to /var/cache/conftool/dbconfig/20211020-064529-marostegui.json
* 06:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1126.eqiad.wmnet with OS buster
* 06:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1106 (s1) after upgrade', diff saved to https://phabricator.wikimedia.org/P17551 and previous config saved to /var/cache/conftool/dbconfig/20211020-063926-marostegui.json
* 06:35 marostegui: Upgrade db1106
* 06:34 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1106 (s1) for upgrade', diff saved to https://phabricator.wikimedia.org/P17550 and previous config saved to /var/cache/conftool/dbconfig/20211020-063431-marostegui.json
* 06:31 dcausse: restarting blazegraph on wdqs1012
* 06:28 elukey: reboot analytics1066 - OS showing CPU soft lockups, tons of defunct processes (including node manager) and high CPU usage
* 06:21 marostegui: Depool clouddb1013 for upgrade
* 06:14 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1126.eqiad.wmnet with OS buster
* 06:12 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 06:12 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1126 (s8) for upgrade', diff saved to https://phabricator.wikimedia.org/P17549 and previous config saved to /var/cache/conftool/dbconfig/20211020-061202-marostegui.json
* 06:06 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 06:05 XioNoX: put transport link between ulsfo and eqsin in service - [[phab:T273308|T273308]]
* 05:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2112.codfw.wmnet with OS buster
* 05:26 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2112.codfw.wmnet with OS buster
* 04:44 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 04:42 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 04:40 legoktm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Enable $wgLocalHTTPProxy on group0 wikis ([[phab:T288848|T288848]]) (duration: 01m 05s)
* 01:31 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 01:28 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:03 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:00 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:00 tgr: west coast evening deploys done


== 2015-08-25 ==
== 2021-10-19 ==
* 23:39 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/233860/ (duration: 00m 12s)
* 23:59 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:732103{{!}}Reorder some wikis at wgExtraNamespaces and wmgVisualEditorAvailableNamespaces (T293846)]] (duration: 01m 02s)
* 23:16 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/233872/ (duration: 00m 13s)
* 23:51 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:13 logmsgbot: krenair@tin Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/232963/ (duration: 00m 12s)
* 23:48 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:12 logmsgbot: krenair@tin Synchronized wmf-config/extension-list: https://gerrit.wikimedia.org/r/#/c/232963/ (duration: 00m 12s)
* 23:47 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:732053{{!}}ruwikiversity: Add 'portal' and 'faculty' namespaces (T293545)]] (duration: 01m 03s)
* 23:10 logmsgbot: krenair@tin Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/232962/ (duration: 00m 12s)
* 23:40 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:10 logmsgbot: krenair@tin Synchronized wmf-config/extension-list: https://gerrit.wikimedia.org/r/#/c/232962/ (duration: 00m 12s)
* 23:37 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:05 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/233781/ (duration: 00m 12s)
* 23:36 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:710565{{!}}Set the project namespace and sitename for Javanese Wikipedia and Wiktionary (T287437)]] (duration: 01m 02s)
* 22:20 cscott: updated Parsoid to version c3b037b0
* 23:23 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:731953{{!}}Create Portal and Portal talk namespace for shiwiki (T288909)]] (duration: 01m 03s)
* 22:10 ejegg: disabled paypal audit downloader and parser due to them warning of incorrect data
* 23:23 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:16 logmsgbot: ori@tin Synchronized php-1.26wmf19/extensions/AbuseFilter: I15f5b5b6 & I9c23b607 (duration: 00m 13s)
* 23:15 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:13 logmsgbot: ori@tin Synchronized php-1.26wmf19/extensions/Cite/modules/ext.cite.styles.css: 7344e02216: Updated mediawiki/core Project: mediawiki/extensions/Cite (duration: 00m 12s)
* 23:13 tgr@deploy1002: Synchronized static: Config: [[gerrit:731231{{!}}Repair the size of the logo of Kashmiri Wikipedia (T293342)]] (duration: 02m 14s)
* 21:09 logmsgbot: ori@tin Synchronized php-1.26wmf20/extensions/AbuseFilter: I15f5b5b6 & I9c23b607 (duration: 00m 14s)
* 21:34 mutante: mwmaint1002 - delete large files over 100MB from puppet clientbucket. sudo /usr/bin/find /var/lib/puppet/clientbucket/ -type f -size +100M -delete {{!}} fixed Icinga alert: RECOVERY - Check for large files in client bucket on mwmaint1002 is OK: OK: [[phab:T165885|T165885]]
* 20:54 tgr: finished OAuth migration
* 21:32 mutante: mwmaint1002 - delete large files over 100MB from puppet clientbucket. sudo /usr/bin/find /var/lib/puppet/clientbucket/ -type f -size +100M -delete
* 20:34 logmsgbot: tgr@tin Synchronized wmf-config/CommonSettings.php: make OAuth DB writable again T108648 (duration: 00m 12s)
* 20:56 ejegg: updated payments-wiki from {{Gerrit|0f48acea49}} to {{Gerrit|30e596903d}}
* 20:32 logmsgbot: tgr@tin Synchronized wmf-config/CommonSettings.php: change wgMWOAuthCentralWiki mediawikiwiki -> metawiki T108648 (duration: 00m 12s)
* 19:03 hashar@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.38.0-wmf.5  refs [[phab:T281169|T281169]]
* 20:24 logmsgbot: tgr@tin Synchronized wmf-config/CommonSettings.php: set OAuth to readonly for DB migration T108648 (duration: 00m 13s)
* 18:46 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/MediaSearch/: {{Gerrit|a84a675}}: {{Gerrit|3231578}}: MediaSearch backports ([[phab:T291392|T291392]], [[phab:T293335|T293335]], [[phab:T291392|T291392]], [[phab:T291622|T291622]], [[phab:T293554|T293554]]) (duration: 01m 03s)
* 20:13 subbu: deployed parsoid version 759916fc
* 18:45 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/MediaSearch/: {{Gerrit|694580a}}: {{Gerrit|c02e301}}: MediaSearch backports([[phab:T291392|T291392]], [[phab:T293335|T293335]], [[phab:T291392|T291392]], [[phab:T291622|T291622]], [[phab:T293554|T293554]]) (duration: 01m 03s)
* 19:24 logmsgbot: twentyafterfour@tin rebuilt wikiversions.cdb and synchronized wikiversions files: group0 wikis to 1.26wmf20
* 18:37 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudmetrics1003.eqiad.wmnet with OS bullseye
* 19:21 logmsgbot: twentyafterfour@tin Finished scap: testwiki to 1.26wmf20 (duration: 50m 12s)
* 18:30 foks: deleting 1 more email with deleteUserEmail.php
* 18:31 logmsgbot: twentyafterfour@tin Started scap: testwiki to 1.26wmf20
* 18:17 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|1476a2d93}}: {{Gerrit|dd8393c1a0}}: foundationwiki: Restrict sensitive namespaces to editor group ([[phab:T205350|T205350]]) (duration: 01m 03s)
* 17:11 YuviPanda: run authdns-update on radon (ns0.wikimedia.org)
* 18:12 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host cloudmetrics1003.eqiad.wmnet with OS bullseye
* 17:10 urandom: bouncing Cassandra on restbase1001 to apply temporary GC settings
* 18:12 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|9a2893c7190e615a247674dbf7f87348bf43b91c}}: Enable topic subscriptions as a beta feature on all remaining projects ([[phab:T287802|T287802]]) (duration: 01m 04s)
* 16:58 Krinkle: mwscript deleteEqualMessages.php --wiki kawiki
* 18:00 legoktm@deploy1002: Synchronized wmf-config/: Add framework for setting $wgLocalHTTPProxy ([[phab:T288848|T288848]]) (2/2) (duration: 01m 06s)
* 16:56 andrewbogott: restarting pdns on labcontrol1001 and labcontrol2001 to handle a nembus reboot
* 17:59 legoktm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Add framework for setting $wgLocalHTTPProxy ([[phab:T288848|T288848]]) (1/2) (duration: 01m 05s)
* 16:53 Krinkle: mwscript deleteEqualMessages.php --wiki huwiki
* 17:57 foks: removing six email addresses on request (with deleteUserEmail.php)
* 16:31 Krinkle: mwscript deleteEqualMessages.php --wiki frwiki
* 17:37 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudmetrics1004.eqiad.wmnet with OS bullseye
* 16:17 Krinkle: mwscript deleteEqualMessages.php --wiki frpwiki
* 17:25 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudmetrics1003.eqiad.wmnet with OS bullseye
* 15:50 godog: powercycle ms-be1004, likely xfs
* 17:11 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host cloudmetrics1004.eqiad.wmnet with OS bullseye
* 15:44 andrewbogott: dist-upgrade and rebooting nembus in an attempt to resolve this acpi_pad issue
* 17:09 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host cloudmetrics1003.eqiad.wmnet with OS bullseye
* 15:36 Krinkle: mwscript deleteEqualMessages.php --wiki euwiki (T45917)
* 16:48 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 15:29 Krinkle: mwscript deleteEqualMessages.php --wiki eowiki (T45917)
* 16:46 bd808@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 15:07 logmsgbot: krenair@tin Synchronized php-1.26wmf19/extensions/Flow: https://gerrit.wikimedia.org/r/#/c/233718/ (duration: 00m 16s)
* 16:41 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 13:56 jynus: dropping old tables on s7 - T5493
* 16:12 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on 7 hosts with reason: Schema change s3 [[phab:T277118|T277118]]
* 13:48 jynus: dropping old tables on s6 - T54932
* 16:11 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 3:00:00 on 7 hosts with reason: Schema change s3 [[phab:T277118|T277118]]
* 12:53 Jeff_Green: authdns-update to change bismuth's IP
* 16:09 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 14 hosts with reason: Schema change s1 [[phab:T277118|T277118]]
* 11:16 jynus: dropping old tables on s3 - T54932
* 16:09 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 14 hosts with reason: Schema change s1 [[phab:T277118|T277118]]
* 10:46 jynus: dropping old tables on s2 - T54932
* 16:06 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 13 hosts with reason: Schema change s4 [[phab:T277118|T277118]]
* 10:05 YuviPanda: restart puppetmaster on labcontrol1001 for https://gerrit.wikimedia.org/r/#/c/233184/
* 16:06 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 13 hosts with reason: Schema change s4 [[phab:T277118|T277118]]
* 07:35 _joe_: stopping redis, wiping aof, restarting redis on rdb100{1,2} - snapshot saved on rdb1002:/root
* 16:00 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 11 hosts with reason: Schema change s7 [[phab:T277118|T277118]]
* 07:12 _joe_: stopping redis on rdb1003,4, wiping AOF, restarting
* 16:00 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 11 hosts with reason: Schema change s7 [[phab:T277118|T277118]]
* 06:38 jynus: performing schema change on officewiki, mediawikiwiki and metawiki
* 15:46 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 9 hosts with reason: Schema change s2 [[phab:T277118|T277118]]
* 02:21 logmsgbot: l10nupdate@tin Synchronized php-1.26wmf19/cache/l10n: l10nupdate for 1.26wmf19 (duration: 06m 26s)
* 15:46 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 9 hosts with reason: Schema change s2 [[phab:T277118|T277118]]
* 01:48 ottomata: starting move of kafka partitions for topic webrequest_upload to new brokers.  this will take a while!
* 15:40 otto@deploy1002: Synchronized wmf-config/InitialiseSettings.php: wgEventStreams - remove now redundant stream setting - [[phab:T277193|T277193]] (duration: 01m 04s)
* 01:44 ottomata: restarting kafka on new brokers kafka1013,1014,1020 to apply increase in num.replica.fetchers
* 15:35 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 9 hosts with reason: Schema change s5 [[phab:T277118|T277118]]
* 15:35 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 9 hosts with reason: Schema change s5 [[phab:T277118|T277118]]
* 15:34 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on 9 hosts with reason: Schema change s6 [[phab:T277118|T277118]]
* 15:34 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on 9 hosts with reason: Schema change s6 [[phab:T277118|T277118]]
* 15:32 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 9 hosts with reason: Schema change s6 [[phab:T277118|T277118]]
* 15:32 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 9 hosts with reason: Schema change s6 [[phab:T277118|T277118]]
* 15:30 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 15:28 bd808@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 15:26 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 15:17 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2025.codfw.wmnet
* 15:12 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2025.codfw.wmnet
* 14:34 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:29 jbond: disable puppet on lvs, cp, authdns, mc, mw-be and wcqs to while i merge G:662699
* 14:15 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:11 hashar@deploy1002: Finished scap: testwikis wikis to 1.38.0-wmf.5  refs [[phab:T281169|T281169]] (duration: 45m 13s)
* 13:52 kevinbazira@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 13:45 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:31 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:28 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:26 hashar@deploy1002: Started scap: testwikis wikis to 1.38.0-wmf.5  refs [[phab:T281169|T281169]]
* 13:19 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3311 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17547 and previous config saved to /var/cache/conftool/dbconfig/20211019-131927-root.json
* 13:16 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17546 and previous config saved to /var/cache/conftool/dbconfig/20211019-131651-root.json
* 13:04 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3311 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17545 and previous config saved to /var/cache/conftool/dbconfig/20211019-130424-root.json
* 13:01 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17544 and previous config saved to /var/cache/conftool/dbconfig/20211019-130147-root.json
* 12:49 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3311 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17543 and previous config saved to /var/cache/conftool/dbconfig/20211019-124920-root.json
* 12:46 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17542 and previous config saved to /var/cache/conftool/dbconfig/20211019-124644-root.json
* 12:40 moritzm: installing aftpd security updates
* 12:34 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3311 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17541 and previous config saved to /var/cache/conftool/dbconfig/20211019-123416-root.json
* 12:34 marostegui: Upgrade dbstore1003
* 12:31 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17540 and previous config saved to /var/cache/conftool/dbconfig/20211019-123140-root.json
* 12:19 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3311 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17539 and previous config saved to /var/cache/conftool/dbconfig/20211019-121913-root.json
* 12:16 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17538 and previous config saved to /var/cache/conftool/dbconfig/20211019-121636-root.json
* 12:12 XioNoX: push anycast tuning to all Lumen and NTT transit links - [[phab:T288843|T288843]]
* 12:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1167 (s8) after upgrade', diff saved to https://phabricator.wikimedia.org/P17537 and previous config saved to /var/cache/conftool/dbconfig/20211019-120918-marostegui.json
* 12:04 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1167 (s8) for upgrade', diff saved to https://phabricator.wikimedia.org/P17536 and previous config saved to /var/cache/conftool/dbconfig/20211019-120458-marostegui.json
* 12:04 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3311 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17535 and previous config saved to /var/cache/conftool/dbconfig/20211019-120409-root.json
* 12:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3318 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17534 and previous config saved to /var/cache/conftool/dbconfig/20211019-120348-root.json
* 12:01 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/WikibaseMediaInfo/: {{Gerrit|ec0125770775c1a1a54c3b592d86d287fd9e3ad6}}: Escape captions when writing stored data into js state ([[phab:T293556|T293556]]) (duration: 00m 55s)
* 12:01 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17533 and previous config saved to /var/cache/conftool/dbconfig/20211019-120132-root.json
* 12:00 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/WikibaseMediaInfo/: {{Gerrit|79808a90a95dd5dac2b532b87fb7ec1a490ea0f0}}: Escape captions when writing stored data into js state ([[phab:T293556|T293556]]) (duration: 00m 56s)
* 12:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3317 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17532 and previous config saved to /var/cache/conftool/dbconfig/20211019-120024-root.json
* 11:58 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:56 XioNoX: push anycast tuning to Tele2, Init7, DT transit links - [[phab:T288843|T288843]]
* 11:55 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3318 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17531 and previous config saved to /var/cache/conftool/dbconfig/20211019-114844-root.json
* 11:46 marostegui: Upgrade db1105 (s1,s2)
* 11:46 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1105 (s1,s2) for upgrade', diff saved to https://phabricator.wikimedia.org/P17530 and previous config saved to /var/cache/conftool/dbconfig/20211019-114649-marostegui.json
* 11:45 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3317 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17529 and previous config saved to /var/cache/conftool/dbconfig/20211019-114520-root.json
* 11:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3318 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17527 and previous config saved to /var/cache/conftool/dbconfig/20211019-113340-root.json
* 11:30 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3317 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17526 and previous config saved to /var/cache/conftool/dbconfig/20211019-113017-root.json
* 11:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3318 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17525 and previous config saved to /var/cache/conftool/dbconfig/20211019-111837-root.json
* 11:15 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3317 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17524 and previous config saved to /var/cache/conftool/dbconfig/20211019-111513-root.json
* 11:11 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:09 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:08 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|7c31b04e50101a60db7ae8acae64bc031f5e1007}}: DPL: Explicitly note it is not possible to enable DPL on any more wikis (duration: 00m 55s)
* 11:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3318 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17523 and previous config saved to /var/cache/conftool/dbconfig/20211019-110333-root.json
* 11:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2025.codfw.wmnet
* 11:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3317 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17522 and previous config saved to /var/cache/conftool/dbconfig/20211019-110009-root.json
* 10:56 marostegui: Upgrade clouddb1021
* 10:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2025.codfw.wmnet
* 10:51 moritzm: failover master in ganeti-test to ganeti2026
* 10:50 godog: bounce superset on an-tool1005 to pick up statsd changes - [[phab:T247963|T247963]]
* 10:49 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2112.codfw.wmnet with OS stretch
* 10:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3318 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17521 and previous config saved to /var/cache/conftool/dbconfig/20211019-104829-root.json
* 10:45 godog: bounce navtiming on webperf1001 to pick up statsd changes - [[phab:T247963|T247963]]
* 10:45 godog: bounce superset on an-tool1010 to pick up statsd changes - [[phab:T247963|T247963]]
* 10:45 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3317 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17520 and previous config saved to /var/cache/conftool/dbconfig/20211019-104506-root.json
* 10:38 oblivian@deploy1002: Synchronized w/static.php: Config: [[gerrit:730182{{!}}static.php: Add support for /static/current rewrites (take 2) (T285232)]] (duration: 00m 55s)
* 10:38 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2026.codfw.wmnet
* 10:37 marostegui: Upgrade db1101 (s7,s8)
* 10:36 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1101 (s7,s8) for upgrade', diff saved to https://phabricator.wikimedia.org/P17519 and previous config saved to /var/cache/conftool/dbconfig/20211019-103634-marostegui.json
* 10:34 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:31 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:29 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 10:28 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 10:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2026.codfw.wmnet
* 10:22 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:22 oblivian@deploy1002: Synchronized tests/WmfConfigServicesTest.php: Config: [[gerrit:731918{{!}}ProductionServices: use graphite2003 for statsd (T247963)]] (duration: 00m 54s)
* 10:22 godog: flip mw statsd traffic with https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/731918 - [[phab:T247963|T247963]]
* 10:21 oblivian@deploy1002: Synchronized wmf-config/ProductionServices.php: Config: [[gerrit:731918{{!}}ProductionServices: use graphite2003 for statsd (T247963)]] (duration: 00m 54s)
* 10:20 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:18 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2112.codfw.wmnet with OS stretch
* 10:16 marostegui@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db2112.codfw.wmnet with OS buster
* 09:52 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2112.codfw.wmnet with OS buster
* 09:50 marostegui@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db2112.codfw.wmnet with OS buster
* 09:44 hashar@deploy1002: Pruned MediaWiki: 1.38.0-wmf.3 (duration: 01m 39s)
* 09:42 hashar@deploy1002: Pruned MediaWiki: 1.38.0-wmf.2 (duration: 16m 06s)
* 09:37 godog: move graphite/statsd writes to graphite2003 - [[phab:T247963|T247963]]
* 09:34 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2112.codfw.wmnet with OS buster
* 09:27 hashar: sap clean --delete 1.38.0-wmf.2 && scap clean --delete 1.38.0-wmf.3  # [[phab:T281169|T281169]]
* 09:27 hashar: Cloned and applied security patches for 1.38.0-wmf.5 # [[phab:T281169|T281169]]
* 09:19 marostegui: Stop slave on db2112 [[phab:T290865|T290865]]
* 09:18 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 14 hosts with reason: Schema change s1 [[phab:T281058|T281058]]
* 09:18 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on 14 hosts with reason: Schema change s1 [[phab:T281058|T281058]]
* 09:03 XioNoX: push anycast tuning to all Telia transit links - [[phab:T288843|T288843]]
* 08:50 godog: point graphite.discovery.wmnet to graphite2003 - [[phab:T247963|T247963]]
* 08:40 XioNoX: push prep-work for anycast tuning to all sites - [[phab:T288843|T288843]]
* 08:33 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 13 hosts with reason: Schema change s8 [[phab:T281058|T281058]]
* 08:33 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on 13 hosts with reason: Schema change s8 [[phab:T281058|T281058]]
* 08:32 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript namespaceDupes.php hrwiki --fix
* 08:17 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:07 mvernon@cumin2002: conftool action : set/pooled=false; selector: name=codfw,dnsdisc=swift
* 08:07 mvernon@cumin2002: conftool action : set/pooled=false; selector: name=codfw,dnsdisc=swift-ro
* 08:03 XioNoX: push prep-work for anycast tuning in ulsfo (try 2) - [[phab:T288843|T288843]]
* 08:01 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 07:32 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 07:24 ema: A:cp start rolling varnish upgrades to 6.0.8-1wm1 [[phab:T292290|T292290]]
* 07:21 marostegui@cumin1001: dbctl commit (dc=all): 'db1178 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17517 and previous config saved to /var/cache/conftool/dbconfig/20211019-072111-root.json
* 07:15 marostegui@cumin1001: dbctl commit (dc=all): 'db1184 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17516 and previous config saved to /var/cache/conftool/dbconfig/20211019-071519-root.json
* 07:06 marostegui@cumin1001: dbctl commit (dc=all): 'db1178 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17515 and previous config saved to /var/cache/conftool/dbconfig/20211019-070607-root.json
* 07:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1184 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17514 and previous config saved to /var/cache/conftool/dbconfig/20211019-070016-root.json
* 06:51 marostegui@cumin1001: dbctl commit (dc=all): 'db1178 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17513 and previous config saved to /var/cache/conftool/dbconfig/20211019-065104-root.json
* 06:45 marostegui@cumin1001: dbctl commit (dc=all): 'db1184 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17512 and previous config saved to /var/cache/conftool/dbconfig/20211019-064512-root.json
* 06:38 marostegui@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db2112.codfw.wmnet with OS buster
* 06:36 marostegui@cumin1001: dbctl commit (dc=all): 'db1178 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17511 and previous config saved to /var/cache/conftool/dbconfig/20211019-063559-root.json
* 06:30 marostegui@cumin1001: dbctl commit (dc=all): 'db1184 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17510 and previous config saved to /var/cache/conftool/dbconfig/20211019-063008-root.json
* 06:20 marostegui@cumin1001: dbctl commit (dc=all): 'db1178 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17509 and previous config saved to /var/cache/conftool/dbconfig/20211019-062054-root.json
* 06:15 marostegui@cumin1001: dbctl commit (dc=all): 'db1184 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17508 and previous config saved to /var/cache/conftool/dbconfig/20211019-061505-root.json
* 06:06 marostegui: Upgrade dbstore1005
* 06:05 marostegui@cumin1001: dbctl commit (dc=all): 'db1178 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17507 and previous config saved to /var/cache/conftool/dbconfig/20211019-060551-root.json
* 06:04 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 06:03 marostegui: Upgrade db1184, db1178
* 06:01 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1178 for upgrade', diff saved to https://phabricator.wikimedia.org/P17506 and previous config saved to /var/cache/conftool/dbconfig/20211019-060123-marostegui.json
* 06:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1184 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17505 and previous config saved to /var/cache/conftool/dbconfig/20211019-060001-root.json
* 05:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1184 for upgrade', diff saved to https://phabricator.wikimedia.org/P17504 and previous config saved to /var/cache/conftool/dbconfig/20211019-055429-marostegui.json
* 05:51 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2112.codfw.wmnet with OS buster
* 05:46 marostegui: Reimage db2112 (s1 codfw master) [[phab:T290865|T290865]]
* 04:36 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 03:49 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 02:36 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:34 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:21 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 02:18 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 02:09 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:06 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:38 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer


== 2015-08-24 ==
== 2021-10-18 ==
* 23:46 logmsgbot: mattflaschen@tin Synchronized wmf-config: Remove wgFlowOccupyPages (duration: 00m 12s)
* 23:40 hoo: Updated the Wikidata property suggester with data from the 2021-10-04 JSON dump (with pre-applied [[phab:T132839|T132839]] workarounds)
* 23:38 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/233636/ (duration: 00m 12s)
* 23:16 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|b654980240d51fff3c6e9c48f7076d4609c2560f}}: Create an alias for the Draft namespace on hrwiki ([[phab:T291755|T291755]]) (duration: 00m 56s)
* 22:16 logmsgbot: tgr@tin Synchronized wmf-config/CommonSettings-labs.php: change OAuth DB on beta +enable writes (duration: 00m 12s)
* 23:16 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:55 logmsgbot: tgr@tin Synchronized wmf-config/CommonSettings-labs.php: set beta OAuth to readonly (duration: 00m 13s)
* 23:13 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:54 logmsgbot: tgr@tin Synchronized wmf-config/CommonSettings-labs.php: set beta OAuth to readonly (duration: 00m 13s)
* 23:12 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript namespaceDupes.php --wiki=thwiktionary --fix # [[phab:T291761|T291761]]
* 21:42 akosiaris: enabled puppet on maps-test200{1,2,3,4}.codfw.wmnet
* 23:10 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|abe777d28594da852e49ccb1c1597b2598f3e483}}: Create Rhymes namespace for thwiktionary ([[phab:T291761|T291761]]) (duration: 00m 57s)
* 20:21 arlolra: updated Parsoid to version 0b2fbae7
* 23:04 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:58 bblack: reloading primary LVS pybals for BlankPage change ( https://gerrit.wikimedia.org/r/#/c/233053/ ) + ulimit fixup ( https://gerrit.wikimedia.org/r/#/c/233484/ )
* 23:01 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:31 bblack: reloading backup LVS pybals for BlankPage change ( https://gerrit.wikimedia.org/r/#/c/233053/ )
* 22:56 legoktm@deploy1002: Synchronized php-1.38.0-wmf.4/includes/http/MWHttpRequest.php: Allow using a reverse proxy for local HTTP requests ([[phab:T288848|T288848]]) (duration: 00m 56s)
* 17:19 urandom: bouncing Cassandra on restbase1001 to apply temporary GC settings
* 22:06 maryum: deployed security patch for [[phab:T293589|T293589]]
* 16:23 logmsgbot: bd808@tin Purged l10n cache for 1.26wmf18
* 21:23 maryum: deployed security patch for [[phab:T293556|T293556]]
* 16:23 logmsgbot: bd808@tin Purged l10n cache for 1.26wmf17
* 21:05 mutante: mwmaint1002 - sudo -u www-data /usr/local/bin/mw-cli-wrapper /usr/local/bin/mwscript extensions/TranslationNotifications/scripts/DigestEmailer.php --wiki mediawikiwiki {{!}} Fatal error: Uncaught Error: Class 'MediaWiki\MediaWikiServices' not found
* 16:05 andrewbogott: rebooting labnet1001
* 20:58 mutante: mwmaint1002 - attempt to start mediawiki_job_translationnotifications-mediawikiwiki which was alerting as failed
* 15:53 _joe_: restarted nutcracker on mw1010, holding a 150 GB deleted logfile
* 20:41 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:47 Krenair: running sync-common on mw1010 to bring it up to date after clearing some space
* 20:38 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:44 logmsgbot: krenair@tin Purged l10n cache for 1.26wmf16
* 19:46 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:41 logmsgbot: krenair@tin Purged l10n cache for 1.26wmf15
* 19:42 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 15:38 logmsgbot: krenair@tin Synchronized php-1.26wmf19/extensions/Wikidata: https://gerrit.wikimedia.org/r/#/c/233411/1 (duration: 00m 49s)
* 19:29 mutante: LDAP: removed non-existent user gerrit2 from group labsadminbots ([[phab:T160122|T160122]])
* 15:37 hashar: stopped and restarted Zuul
* 19:29 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/MediaSearch/resources/store/state.js: {{Gerrit|ac7b4fc2ccc69589e00a42f49d18a8f6d71777f2}}: Revert 727328 ([[phab:T293554|T293554]]) (duration: 00m 56s)
* 15:31 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/232919/ and https://gerrit.wikimedia.org/r/#/c/232915/ (duration: 01m 34s)
* 19:29 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:29 logmsgbot: krenair@tin Synchronized w/static/images/project-logos/knwikiquote.png: https://gerrit.wikimedia.org/r/#/c/232919/ (duration: 02m 04s)
* 19:26 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:19 Krenair: No space left on mw1010, cannot ping or ssh to mw2180
* 19:12 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:16 logmsgbot: krenair@tin Synchronized docroot/noc/db.php: https://gerrit.wikimedia.org/r/#/c/232920/ (duration: 01m 34s)
* 19:09 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:14 hashar: apt-get upgrade on gallium
* 18:45 otto@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Convert $wgEventStreams to be an associative array - [[phab:T277193|T277193]] (duration: 00m 57s)
* 14:48 andrewbogott: forcing wikitech logouts in order to flush everyone’s service catalog
* 18:45 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:18 ottomata: starting to move kafka topic-partitions to new brokers (and off of analytics1021)
* 18:42 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:12 yurik: git deploy synced kartotherian
* 18:07 mutante: gerrit - removed tonina from wmde-mediawiki gerrit group ([[phab:T293621|T293621]])
* 13:55 akosiaris: disable puppet on fermium preparing for reinstallation
* 17:51 mutante: puppet run on all bastion hosts via cumin
* 13:55 akosiaris: disable puppet on fermium
* 15:32 mvernon@cumin2002: END (FAIL) - Cookbook sre.discovery.service-route (exit_code=99)
* 12:54 akosiaris: stop etcd on etcd1002.eqiad.wmnet. Already removed from the cluster
* 15:32 mvernon@cumin2002: START - Cookbook sre.discovery.service-route
* 11:58 _joe_: stopping etcd on etcd1001
* 15:23 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 18:00:00 on 7 hosts with reason: Schema change s3 [[phab:T281058|T281058]]
* 11:50 _joe_: restarting etcd on etcd1001
* 15:23 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 18:00:00 on 7 hosts with reason: Schema change s3 [[phab:T281058|T281058]]
* 09:00 YuviPanda: starting up replicate for tools on labstore1002
* 15:16 herron: reprepro copied anycast-healthchecker, python3-json-logger and python3-anycast-healthchecker from buster-wikimedia to bullseye-wikimedia [[phab:T292196|T292196]]
* 09:00 YuviPanda: cleaning up lockdir on labstore for maps and tools
* 15:16 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on 13 hosts with reason: Schema change s4 [[phab:T281058|T281058]]
* 09:00 YuviPanda: others replication on labstore1002 completed successfuly
* 15:16 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 3:00:00 on 13 hosts with reason: Schema change s4 [[phab:T281058|T281058]]
* 08:31 YuviPanda: cleaned up others lockdir for replication on labstore1002 and started it manually
* 14:59 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 11 hosts with reason: Schema change s7 [[phab:T281058|T281058]]
* 06:43 jynus: reloading dbproxy1003 service
* 14:59 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 11 hosts with reason: Schema change s7 [[phab:T281058|T281058]]
* 02:21 logmsgbot: l10nupdate@tin Synchronized php-1.26wmf19/cache/l10n: l10nupdate for 1.26wmf19 (duration: 06m 36s)
* 14:54 herron: rebuilt and uploaded kafkatee for bullseye [[phab:T292196|T292196]]
* 14:50 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:45 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 14:36 phuedx@deploy1002: Synchronized wmf-config/InitialiseSettings-labs.php: Config: [[gerrit:731346{{!}}[beta] Rename $wgIPInfoGeoIP2Path to $wgIPInfoGeoIP2Prefix (T289361)]] (duration: 00m 56s)
* 14:36 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:33 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:15 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:09 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:54 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:51 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:48 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings-labs.php: Config: [[gerrit:731015{{!}}Remove wmg variables for dispatch via jobs (T291828)]] (2/2) (duration: 00m 56s)
* 13:47 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:731015{{!}}Remove wmg variables for dispatch via jobs (T291828)]] (1/2) (duration: 00m 56s)
* 13:37 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:35 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/Wikibase.php: Config: [[gerrit:731014{{!}}Unconditionally enable Wikibase dispatching via jobs (T291828)]] (duration: 00m 56s)
* 13:34 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:22 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2079.codfw.wmnet with OS buster
* 12:04 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:02 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:55 Lucas_WMDE: UTC morning backport window done
* 11:55 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings-labs.php: Config: [[gerrit:730748{{!}}Remove $wmgWikibaseDispatchViaJobsAllowedClients (T291828)]] (2/2) (duration: 00m 56s)
* 11:54 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:730748{{!}}Remove $wmgWikibaseDispatchViaJobsAllowedClients (T291828)]] (1/2) (duration: 00m 56s)
* 11:53 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:51 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2079.codfw.wmnet with OS buster
* 11:50 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:49 marostegui: Reimage db2079 (codfw s8 master) [[phab:T290868|T290868]]
* 11:48 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/Wikibase.php: Config: [[gerrit:730747{{!}}Set dispatchViaJobsAllowedClients to null everywhere (T291828)]] (duration: 00m 56s)
* 11:41 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:38 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:37 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/Wikibase/repo/includes/ChangeModification/DispatchChangesJob.php: Backport: [[gerrit:731239{{!}}Make deduplication actually work for DispatchChangesJob (T291118)]] (duration: 00m 55s)
* 11:10 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/Wikibase/repo/includes/Hooks/RecentChangeSaveHookHandler.php: Backport: [[gerrit:731238{{!}}Create DispatchChangesJob without change id (T291118)]] (2/2) (duration: 00m 56s)
* 11:09 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:09 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/Wikibase/repo/includes/ChangeModification/DispatchChangesJob.php: Backport: [[gerrit:731238{{!}}Create DispatchChangesJob without change id (T291118)]] (duration: 00m 56s)
* 11:07 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:55 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:51 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:47 moritzm: copied wmf-certificates from buster-wikimedia to stretch-wikimedia in reprepro
* 10:38 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/Wikibase/repo/: Backport: [[gerrit:731237{{!}}Don't filter by change Id when dispatching to client wikis ()]] (duration: 00m 59s)
* 09:48 moritzm: installing node-tar security updates on buster
* 09:39 vgutierrez: updating acme-chief to version 0.34 on acmechief instances - [[phab:T292619|T292619]]
* 09:38 godog: sync metrics from graphite1004 to graphite2003 - [[phab:T247963|T247963]]
* 09:13 moritzm: installing apr security updates on bullseye
* 08:57 godog: cleanup graphite metrics not modified for >= ~3yr (1024 days)
* 07:34 ema: cp3060 (text), cp3061 (upload): upgrade varnish to 6.0.8 [[phab:T292290|T292290]]
* 07:34 elukey: depool + restart blazegraph on wdqs1013
* 07:01 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 06:31 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 06:09 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .


== 2015-08-23 ==
== 2021-10-16 ==
* 16:54 urandom: bouncing Cassandra on restbase1001 to apply temporary GC settings
* 03:56 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 02:20 logmsgbot: l10nupdate@tin Synchronized php-1.26wmf19/cache/l10n: l10nupdate for 1.26wmf19 (duration: 06m 23s)
* 02:19 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 01:30 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)


== 2015-08-22 ==
== 2021-10-15 ==
* 23:08 logmsgbot: krenair@tin Synchronized php-1.26wmf19/extensions/AbuseFilter/maintenance/addMissingLoggingEntries.php: (no message) (duration: 01m 05s)
* 23:48 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 19:41 YuviPanda: manually remove old snapshots from labstore1002
* 23:27 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 17:28 chasemp: tweaking apache on iridum T109941
* 23:23 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 16:45 chasemp: scratch that as we have mpm_prefork enabled :)
* 22:38 mutante: apt1001 - removing nginx package, accidentally installed, should just be nginx-light of course, running puppet
* 16:33 chasemp: raising values in mpm_worker.conf for iridium to to debug and hopefully head off further crashing
* 22:36 mutante: apt2001 - removing nginx package, accidentally installed, should just be nginx-light of course, running puppet
* 14:44 twentyafterfour: restarted apache2 on iridium. Segfault again. This time I at least got one clue in the log: "zend_mm_heap corrupted"
* 22:34 mutante: apt2001 - upgraded nginx
* 09:18 twentyafterfour: phabricator seems stable now, restarting apache2 on iridium did the trick, unfortunately we didn't learn why
* 22:18 dzahn@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 08:36 twentyafterfour: restarted phd on iridium
* 22:14 dzahn@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 08:36 twentyafterfour: restarted apache2 on iridium
* 22:05 dpifke@deploy1002: Finished deploy [performance/arc-lamp@40cb764]: Revert problematic arclamp patch to fix daemon crashes (duration: 00m 05s)
* 02:20 logmsgbot: l10nupdate@tin Synchronized php-1.26wmf19/cache/l10n: l10nupdate for 1.26wmf19 (duration: 06m 09s)
* 22:05 dpifke@deploy1002: Started deploy [performance/arc-lamp@40cb764]: Revert problematic arclamp patch to fix daemon crashes
* 00:26 mutante: deleting blog.sh and blog_pageviews crontab from stat1003
* 21:51 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 21:44 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 21:44 dzahn@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 21:36 dzahn@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 20:09 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 18:44 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 17:20 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 17:17 mutante: gitlab1001 - disabling puppet for debugging
* 17:05 mutante: gitlab2001 - temp stopped puppet - debugging gitlab restore script with Arnold - [[phab:T283076|T283076]]
* 17:01 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 16:50 mutante: gitlab2001 - temp stopped puppet - debugging gitlab restore script with Arnold
* 16:46 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 16:44 jayme@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'shellbox-constraints' for release 'main' .
* 15:23 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 15:23 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 15:08 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 15:08 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 14:48 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:31 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:15 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:32 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 13:32 ryankemper@cumin1001: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99)
* 13:32 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 13:30 elukey: start topic rebalancing for kafka main-eqiad (long maintenance, it will last a couple of days)
* 13:24 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 13:21 vgutierrez: updating acme-chief to version 0.34 on acmechief-test instances - [[phab:T292619|T292619]]
* 13:19 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 13:14 vgutierrez: upload acme-chief 0.34 to apt.wikimedia.org (buster) - [[phab:T292619|T292619]]
* 11:55 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:49 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 11:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host testvm2007.codfw.wmnet
* 11:45 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:33 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:24 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2007.codfw.wmnet
* 11:14 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:46 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 09:15 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 09:06 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 08:58 jelto: jelto@gitlab1001:~$ sudo disable-puppet "disable puppet on gitlab1001 to test 728380 on GitLab replica - [[phab:T283076|T283076]]"
* 07:41 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 06:20 urbanecm: Start server-side upload for 1 video file
* 02:14 ryankemper: [[phab:T288231|T288231]] `wdqs2006` data transfer complete and all tests passing on the host. All of `codfw wdqs-internal` is on the new streaming updater
* 00:09 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 00:07 brennen: end of UTC late backport & config training window


== 2015-08-21 ==
== 2021-10-14 ==
* 23:34 urandom: restarting Cassandra on restbase1001 to restore baseline settings
* 23:59 cjming@deploy1002: Synchronized wmf-config/logos.php: Config: [[gerrit:730737{{!}}Change Kashmiri Wikipedia logo (T293342)]] (duration: 00m 55s)
* 23:11 yurik: synced kartotherian
* 23:58 cjming@deploy1002: Synchronized logos/config.yaml: Config: [[gerrit:730737{{!}}Change Kashmiri Wikipedia logo (T293342)]] (duration: 00m 55s)
* 22:35 mutante: deleting held messages on mailman that are older than 1 year
* 23:56 cjming@deploy1002: Synchronized static/images/project-logos: Config: [[gerrit:730737{{!}}Change Kashmiri Wikipedia logo (T293342)]] (duration: 00m 56s)
* 21:56 awight: increasing paymentswiki orphan gc-cc-limbo expiry time to 30 days
* 23:49 cjming@deploy1002: Synchronized wmf-config/logos.php: Config: [[gerrit:730736{{!}}Change Kashmiri Wiktionary logo (T293373)]] (duration: 00m 55s)
* 21:45 mutante: had to reset list creator password for mailman - ask me if you think you should have it and don't (this is not the master pass)
* 23:48 cjming@deploy1002: Synchronized logos/config.yaml: Config: [[gerrit:730736{{!}}Change Kashmiri Wiktionary logo (T293373)]] (duration: 00m 55s)
* 20:37 logmsgbot: ori@tin Synchronized php-1.26wmf19/includes: I1eb8dfc: Revert Count API and hook calls, with 1:1000 sampling (duration: 01m 09s)
* 23:46 cjming@deploy1002: Synchronized static/images/project-logos: Config: [[gerrit:730736{{!}}Change Kashmiri Wiktionary logo (T293373)]] (duration: 00m 56s)
* 19:43 awight: update paymentswiki from 2b08853c977eee0fd17bf00a673a3bbf2a146554 to 8ba4b5299f195cf48e6809b18a21e2d53f6eec1b
* 23:43 ejegg: updated payments-wiki from {{Gerrit|19d18c1852}} to {{Gerrit|0f48acea49}}
* 18:58 awight: disabling Amazon gateway
* 23:34 cjming@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/WikimediaEvents/includes/VectorPrefDiffInstrumentation.php: Backport: [[gerrit:730733{{!}}Change VectorPrefDiffInstrumentation stream name to `mediawiki.skin_diff` (T289622)]] (duration: 00m 56s)
* 18:52 awight: updated paymentswiki from 049ad15323564fd5cd7f5efcadddb532a3590cef to 2b08853c977eee0fd17bf00a673a3bbf2a146554
* 23:24 cjming@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:730936{{!}}allow sysops to add and remove users to other groups on ptwikivoyage (T292806)]] (duration: 00m 56s)
* 16:06 jynus: checksumming dewiki database, higher write rate/dbstore lag expected temporarily
* 23:21 ryankemper@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 15:10 ottomata: rebooting kafka broker analytics1021 to hopefully reload /dev/sdg with new disk, also will turn on hyperthreading
* 23:11 brennen@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:730933{{!}}Add americanantiquarian.org to the wgCopyUploadsDomains allowlist of Wikimedia Commons (T292918)]] (duration: 00m 57s)
* 14:13 ottomata: rebooting analytics1056 after upgrading kernel to linux-image-3.13.0-61-generic
* 23:11 mutante: mw1452 - re-pooled, scap pull
* 13:58 urandom: restarting restbase1001 to apply temporary GC setting
* 23:09 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 13:34 ottomata: stopping kafka broker on analytics1021 due to bad disk.
* 22:35 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 13:30 bblack: wiped ganglia apache access log on uranium, to free up half of the (full) rootfs
* 22:35 ryankemper: [[phab:T288231|T288231]] Ran puppet on `wdqs2006`, now back to the cookbook run
* 10:07 godog: enable puppet on ms-fe1/ms-be1
* 22:33 ryankemper: [[phab:T288231|T288231]] Forgot about running puppet-agent on `wdqs2006`; aborted cookbook run
* 09:49 godog: disable puppet on ms-fe1/ms-be1 before merging https://gerrit.wikimedia.org/r/#/c/231240/
* 22:33 ryankemper@cumin1001: END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97)
* 07:06 _joe_: restarting gitblit, because it will be decommissioned "soon"...
* 22:33 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 02:34 logmsgbot: l10nupdate@tin Synchronized php-1.26wmf19/cache/l10n: l10nupdate for 1.26wmf19 (duration: 11m 19s)
* 22:32 ryankemper: [[phab:T288231|T288231]] Merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/730795; proceeding to data-transfer on `wdqs2006`: `sudo rm -fv /srv/wdqs/data_loaded` on `wdqs2006` followed by `ryankemper@cumin1001:~$ sudo cookbook sre.wdqs.data-transfer --source wdqs2008.codfw.wmnet --dest wdqs2006.codfw.wmnet --reason "streaming updater cutover for wdqs2005" --blazegraph_instance blazegraph --task-id [[phab:T288231|T288231]]`
* 22:31 mutante: depooling mw1452 for testig
* 22:28 ryankemper: [[phab:T288231|T288231]] `ryankemper@wdqs2005:~$ sudo pool`: transfer completed successfully; tests passing on host (used `ssh -L 9999:localhost:80 wdqs2005.codfw.wmnet` to establish tunnel)
* 22:23 dpifke@deploy1002: Finished deploy [performance/arc-lamp@84fe496]: New flamegraph.pl from upstream [[phab:T291898|T291898]] (duration: 00m 05s)
* 22:23 dpifke@deploy1002: Started deploy [performance/arc-lamp@84fe496]: New flamegraph.pl from upstream [[phab:T291898|T291898]]
* 22:17 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 22:07 eileen: civicrm revision changed from {{Gerrit|018d3b19fe}} to {{Gerrit|9b5e0d015b}}, config revision is {{Gerrit|781d6a1b1f}}
* 21:34 robh@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:25 robh@cumin1001: START - Cookbook sre.dns.netbox
* 21:10 robh@cumin1001: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 21:06 robh@cumin1001: START - Cookbook sre.dns.netbox
* 19:45 dancy@deploy1002: rebuilt and synchronized wikiversions files: group2 wikis to 1.38.0-wmf.4  refs [[phab:T281168|T281168]]
* 19:23 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 19:05 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 18:53 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 18:53 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript namespaceDupes.php --wiki=dagwiki --fix
* 18:47 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=frwiktionary --logwiki=metawiki 'TURK FASTER' 'ARTHUR MORGAN'
* 18:42 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=metawiki --logwiki=metawiki 'George Dum Fulton' 'George Fulton' # [[phab:T293403|T293403]]
* 18:41 urbanecm: UTC evening B&C done
* 18:40 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/MediaSearch/extension.json: {{Gerrit|6da3523daaba85a4199721980c0a9c96b20697e7}}: Fix assessment quickview labels ([[phab:T292596|T292596]]) (duration: 01m 03s)
* 18:37 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|c8dffefd0d095abe3709dcc962d5d24f27b55869}}: Create Salima namespace for dagwiki ([[phab:T289911|T289911]]) (duration: 01m 04s)
* 18:30 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 18:25 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|0bccd4bc45498db8628567574d0bb3a23f8fb378}}: Add $wgSitename and $wgMetaNamespace for kswiki and kswiktionary ([[phab:T289752|T289752]], [[phab:T289767|T289767]]) (duration: 01m 04s)
* 18:17 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 18:14 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|262e588b44f126fb9e1aa933a3ca59b191b42bd7}}: Enable Growth mentor dashboard backend on all wikis ([[phab:T278920|T278920]]) (duration: 01m 05s)
* 18:07 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|41baa8c41d64510986f009b9be2d70dad0915f8c}}: Add new mediawiki.skin_diff event logging stream ([[phab:T289622|T289622]]) (duration: 01m 05s)
* 18:03 addshore@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 18:02 addshore@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 18:01 addshore@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'staging' .
* 17:54 bd808@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 17:52 rzl: repooled mw1452 (with `sudo pool` so no auto log from conftool)
* 17:47 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 17:45 rzl@cumin1001: conftool action : set/pooled=no; selector: name=mw1452.eqiad.wmnet
* 17:42 rzl: depool mw1452 for training
* 17:32 addshore@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 17:31 addshore@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 17:29 addshore@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'staging' .
* 16:44 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 16:44 ryankemper: [[phab:T288231|T288231]] Manually killed dangling `pigz` / `nc` processes on `wdqs2008` (and `wdqs2005` implicitly). Should be in the right state to re-start the `data-transfer` cookbook from again
* 16:41 ryankemper@cumin1001: END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97)
* 16:37 elukey: drop kubeflow-kfserving* docker images from deneb
* 16:36 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 16:34 ryankemper@cumin1001: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99)
* 16:33 moritzm: installing node-ansi-regex security updates
* 16:28 mbsantos@deploy1002: Finished deploy [kartotherian/deploy@4bff2d1]: Force mirrored traffic to 0% for everywhere (duration: 02m 24s)
* 16:25 mbsantos@deploy1002: Started deploy [kartotherian/deploy@4bff2d1]: Force mirrored traffic to 0% for everywhere
* 16:24 dancy@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/Collection/includes/CollectionHooks.php: Backport: [[gerrit:730580{{!}}Check that the timestamp  key/value is set to avoid undefined offset (T293300)]] (duration: 01m 04s)
* 16:16 mbsantos@deploy1002: Finished deploy [kartotherian/deploy@071f7c3]: Increase mirrored traffic to 100% for eqiad (duration: 02m 41s)
* 16:14 mbsantos@deploy1002: Started deploy [kartotherian/deploy@071f7c3]: Increase mirrored traffic to 100% for eqiad
* 16:08 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 16:07 ryankemper@cumin1001: END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97)
* 16:07 ryankemper: [[phab:T288231|T288231]] About to ctrl+c out of ongoing data transfer because puppet run following merge of https://gerrit.wikimedia.org/r/c/operations/puppet/+/730794 restarted blazegraph; we'll manually disable updater and kick off the transfer again
* 16:04 ryankemper: [[phab:T288231|T288231]] `ryankemper@wdqs2005:~$ sudo run-puppet-agent --force`
* 15:56 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 15:54 ryankemper: [[phab:T288231|T288231]] `ryankemper@wdqs2008:~$ sudo depool`
* 15:52 ryankemper: [[phab:T288231|T288231]] `ryankemper@wdqs2005:~$ sudo depool`
* 15:22 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2026.codfw.wmnet to ganeti-test01.svc.codfw.wmnet
* 15:20 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2026.codfw.wmnet to ganeti-test01.svc.codfw.wmnet
* 15:13 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 15:06 dancy@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/VisualEditor/includes/VisualEditorHooks.php: Backport: [[gerrit:730729{{!}}Fix value of 'namespacesWithSubpages' in wgVisualEditorConfig (T293310)]] (duration: 01m 04s)
* 15:02 dancy@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/Collection/includes/CollectionHooks.php: Backport: [[gerrit:730580{{!}}Check that the timestamp  key/value is set to avoid undefined offset (T293300)]] (duration: 01m 03s)
* 15:00 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti2026.codfw.wmnet to ganeti-test01.svc.codfw.wmnet
* 14:59 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2026.codfw.wmnet to ganeti-test01.svc.codfw.wmnet
* 14:53 kormat: upgrading orchestrator.wm.o to 3.2.6-1 [[phab:T275784|T275784]]
* 14:49 jbond@cumin1001: conftool action : set/pooled=true; selector: name=eqiad,dnsdisc=apt
* 14:43 jbond: migrate apt.w.o to a dns active/passiev discovery address (cc moritzm)
* 14:23 moritzm: installing krb5 security updates on KDCs
* 14:19 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 14:10 urbanecm@deploy1002: Synchronized dblists/growthexperiments.dblist: {{Gerrit|b35adfc59eec9c19b509bb9439cdfe33978a4f8b}}: Deploy Growth wikis to 4 wikis in dark mode ([[phab:T291826|T291826]]; 2/2) (duration: 01m 03s)
* 14:07 urbanecm: Run extensions/GrowthExperiments/initWikiConfig.php for ganwiki, iuwiki, tgwiki ([[phab:T291826|T291826]])
* 14:07 urbanecm: Create growthexperiments DB tables for ganwiki, iuwiki, tgwiki ([[phab:T291826|T291826]])
* 14:06 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 14:05 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 14:05 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 14:04 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|b35adfc59eec9c19b509bb9439cdfe33978a4f8b}}: Deploy Growth wikis to 4 wikis in dark mode ([[phab:T291826|T291826]]; 1/2) (duration: 01m 04s)
* 14:03 urbanecm@deploy1002: Synchronized dblists/visualeditor-nondefault.dblist: {{Gerrit|82d0a4bf45126ecba2cfcd1a0c2081a00f58dca3}}: Enable VE by default on 4 more wikis ([[phab:T290614|T290614]]) (duration: 01m 05s)
* 13:56 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 13:55 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 13:54 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 13:54 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 13:54 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 13:52 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 13:52 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 13:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2026.codfw.wmnet
* 13:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2026.codfw.wmnet
* 13:33 ayounsi@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:14 kormat: uploaded orchestrator 3.2.6-1 packages to apt.wm.o (buster) [[phab:T275784|T275784]]
* 12:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2026.codfw.wmnet with OS buster
* 12:44 ayounsi@cumin1001: START - Cookbook sre.dns.netbox
* 12:42 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10 days, 0:00:00 on cloudbackup2002.codfw.wmnet with reason: working on cinder backupse
* 12:42 aborrero@cumin1001: START - Cookbook sre.hosts.downtime for 10 days, 0:00:00 on cloudbackup2002.codfw.wmnet with reason: working on cinder backupse
* 12:19 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/Wikibase.php: Config: [[gerrit:730746{{!}}Untangle “dispatch via jobs” settings in Wikibase.php (T291828)]] (no-op) (duration: 01m 04s)
* 12:12 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:730725{{!}}Set wmgWikibaseDispatchViaJobsPruneChangesTableInJobEnabled for wikidatawiki (T291828)]] (no-op) (duration: 01m 05s)
* 11:47 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2026.codfw.wmnet with OS buster
* 11:17 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2006.codfw.wmnet
* 11:10 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2006.codfw.wmnet
* 11:10 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:01 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:52 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2002.codfw.wmnet
* 10:38 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2002.codfw.wmnet
* 10:38 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2001.codfw.wmnet
* 10:35 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/GrowthExperiments/: {{Gerrit|1f33fc3}}, {{Gerrit|e0ea1b8}}, {{Gerrit|cba2ac9}}: GrowthExperiments backports ([[phab:T290609|T290609]]) (duration: 01m 05s)
* 10:33 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/GrowthExperiments/: {{Gerrit|465b564}}, {{Gerrit|a8cc98b}}, {{Gerrit|6e95c48}}: GrowthExperiments backports ([[phab:T290609|T290609]]) (duration: 01m 06s)
* 10:32 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2001.codfw.wmnet
* 09:20 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 09:20 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 09:19 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 09:19 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 09:19 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 09:19 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 09:18 volans@deploy1002: Finished deploy [debmonitor/deploy@ab62ac5]: Release v0.3.1 (duration: 00m 50s)
* 09:17 volans@deploy1002: Started deploy [debmonitor/deploy@ab62ac5]: Release v0.3.1
* 09:04 volans@deploy1002: Finished deploy [debmonitor/deploy@444b931]: Release v0.3.1 (duration: 00m 45s)
* 09:03 volans@deploy1002: Started deploy [debmonitor/deploy@444b931]: Release v0.3.1
* 09:02 volans@deploy1002: Finished deploy [debmonitor/deploy@444b931]: Release v0.3.1 (duration: 00m 23s)
* 09:02 volans@deploy1002: Started deploy [debmonitor/deploy@444b931]: Release v0.3.1
* 08:52 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 08:52 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 08:51 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 08:51 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 08:22 volans: rolling out debmonitor-client upgrade to 0.3.1 across the fleet
* 07:25 oblivian@cumin1001: END (FAIL) - Cookbook sre.discovery.service-route (exit_code=99)
* 07:25 oblivian@cumin1001: START - Cookbook sre.discovery.service-route
* 07:25 oblivian@cumin1001: END (FAIL) - Cookbook sre.discovery.service-route (exit_code=99)
* 07:25 oblivian@cumin1001: START - Cookbook sre.discovery.service-route
* 07:24 oblivian@cumin1001: END (FAIL) - Cookbook sre.discovery.service-route (exit_code=99)
* 07:24 oblivian@cumin1001: START - Cookbook sre.discovery.service-route
* 07:18 filippo@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=swift-ro,name=eqiad
* 07:18 filippo@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=swift,name=eqiad
* 07:17 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 06:37 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 01:52 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 01:50 foks: changing user email for "Region of Peel Archives"
* 01:41 ejegg: updated payments-wiki from {{Gerrit|b329d2dea2}} to {{Gerrit|19d18c1852}}
* 01:35 bd808@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 01:31 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .


== 2015-08-20 ==
== 2021-10-13 ==
* 23:40 logmsgbot: ebernhardson@tin Synchronized php-1.26wmf19/extensions/CirrusSearch/: Fix some cirrussearch logspam (duration: 00m 13s)
* 23:37 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 23:30 logmsgbot: ebernhardson@tin Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 12s)
* 23:36 eileen: civicrm revision changed from {{Gerrit|946dfb6c5a}} to {{Gerrit|018d3b19fe}}, config revision is {{Gerrit|85277466ed}}
* 23:29 logmsgbot: ebernhardson@tin Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 13s)
* 23:36 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:730575{{!}}Create an alias for the project namespace on kswiki (T291740)]] (duration: 01m 05s)
* 23:23 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/232854/ (duration: 00m 13s)
* 22:30 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 23:22 logmsgbot: krenair@tin Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/232671/ (duration: 00m 12s)
* 22:01 dancy@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/Collection/includes/Specials/SpecialCollection.php: Backport: [[gerrit:730578{{!}}Api: Avoid trying to access undefined offset in a user's collection (T293261)]] (duration: 01m 04s)
* 23:15 logmsgbot: krenair@tin Synchronized php-1.26wmf19/extensions/LiquidThreads/classes/Hooks.php: https://gerrit.wikimedia.org/r/#/c/232783/ (duration: 00m 12s)
* 21:50 dancy@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/Collection: Backport: [[gerrit:730577{{!}}Api: Avoid trying to access undefined offset in a user's collection (T293261)]] (duration: 01m 04s)
* 23:13 logmsgbot: krinkle@tin Synchronized php-1.26wmf19/includes/resourceloader/ResourceLoaderFileModule.php: T102578 (duration: 00m 13s)
* 21:47 foks: removing 8 files for legal compliance
* 23:08 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/232782/ (duration: 00m 12s)
* 21:03 foks: removing 2 files for legal compliance
* 22:48 logmsgbot: ori@tin Synchronized php-1.26wmf19/includes/libs/CSSMin.php: Icc1c23a2: CSSMin: remove dot segments in relative local URLs (duration: 00m 12s)
* 21:00 mbsantos@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 21:36 cscott: updated Parsoid to version db6e6404f67a9f971b4fbefe9de239735426c738
* 20:50 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 21:25 matt_flaschen: Ran FlowUpdateRevContentModelFromOccupyPages.php on all wikis
* 20:49 brennen@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/Collection/includes/Api/ApiGetBookCreatorBoxContent.php: Backport: [[gerrit:730574{{!}}Fall back to main page if given title is invalid (T293299)]] (duration: 01m 04s)
* 20:41 twentyafterfour: scap failed to sync to mw2180.codwf.wmnet
* 20:46 bd808@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 20:41 logmsgbot: twentyafterfour@tin rebuilt wikiversions.cdb and synchronized wikiversions files: wikipedia wikis to 1.26wmf19
* 20:40 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 20:38 logmsgbot: twentyafterfour@tin Synchronized php-1.26wmf19: Silence the undefined index error in CirrusSearch (duration: 06m 24s)
* 20:31 mbsantos@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 19:40 chasemp: moving enwiki_content_1432182861 elastic shard from 1022 to 1004 due to space (1022 is at 91%)
* 20:27 robh@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubernetes1021.eqiad.wmnet with OS stretch
* 20:57 mutante: no log bot
* 20:04 robh@cumin1001: START - Cookbook sre.hosts.reimage for host kubernetes1021.eqiad.wmnet with OS stretch
* 18:56 mutante: labvirt1007 "only" 29G space left - but since we have 2.2T there that means 99% full
* 20:03 robh@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kubernetes1021.eqiad.wmnet with OS stretch
* 17:39 ottomata: stopping kafka on analytics1018 and bringing it down for reinstall as kafka1018 with Jessie
* 20:01 robh@cumin1001: START - Cookbook sre.hosts.reimage for host kubernetes1021.eqiad.wmnet with OS stretch
* 16:38 YuviPanda: puppet swat done
* 19:18 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 15:44 akosiaris: uploaded to apt.wikimedia.org jessie-wikimedia: etherpad-lite_1.5.7-1
* 19:16 mutante: gitlab2001 - status before was that "gitlab-ctl status" showed components "gitlab-workhorse" and "postgres-exporter" as "down". this was either pre-broken or caused by the restore process. after manually 'gitlab-ctl start gitlab-workhorse' all of the components are in "run" and https://gitlab-replica.wikimedia.org is up ( [[phab:T285867|T285867]])
* 15:43 logmsgbot: krenair@tin Synchronized php-1.26wmf18/extensions/ContentTranslation/modules/tools/ext.cx.tools.reference.js: https://gerrit.wikimedia.org/r/#/c/232729/ (duration: 00m 12s)
* 19:08 mutante: gitl1b2001 - started workhorse which was for some reason marked as down after restore command ran
* 15:42 logmsgbot: krenair@tin Synchronized php-1.26wmf19/extensions/ContentTranslation/modules/tools/ext.cx.tools.reference.js: https://gerrit.wikimedia.org/r/#/c/232730/ (duration: 00m 13s)
* 19:08 mutante: [gitlab2001:~] $ sudo /usr/bin/gitlab-ctl start gitlab-workhorse
* 15:39 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/206480/ (duration: 00m 13s)
* 19:06 dancy@deploy1002: Synchronized php: group1 wikis to 1.38.0-wmf.4  refs [[phab:T281168|T281168]] (duration: 01m 03s)
* 15:38 logmsgbot: krenair@tin Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/206480/ (duration: 00m 13s)
* 19:05 dancy@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.38.0-wmf.4  refs [[phab:T281168|T281168]]
* 15:32 logmsgbot: krenair@tin Synchronized php-1.26wmf18/extensions/ContentTranslation/api/ApiContentTranslationPublish.php: https://gerrit.wikimedia.org/r/#/c/232687/ (duration: 00m 13s)
* 19:02 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|87879865c35edab3ead523027681146e00d6fc02}}: Create Translation namespace for viwikisource ([[phab:T290691|T290691]]) (duration: 01m 04s)
* 15:31 logmsgbot: krenair@tin Synchronized php-1.26wmf19/extensions/ContentTranslation/api/ApiContentTranslationPublish.php: https://gerrit.wikimedia.org/r/#/c/232688/ (duration: 00m 11s)
* 18:39 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|06fd0f225575448771cdba0d4e6bf36bb6715bc1}}: add extendedconfimed for autoreview group on ptwiki ([[phab:T292912|T292912]]) (duration: 01m 04s)
* 15:27 greg-g: on mw2187: rsync: failed to set times on "/srv/mediawiki/wmf-config": Read-only file system (30)
* 18:37 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript initSiteStats.php --wiki=ptwiki --update
* 15:25 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/231464/ (duration: 00m 13s)
* 18:33 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript emptyUserGroup.php --wiki=ptwiki extendedconfirmed
* 15:08 urandom: restarting restbase1001 to apply temporary heap size of 12G
* 18:31 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|0bb2b388217aa91a39ed3684f87fdf7edb06fd81}}:  Set autoconfirmedextended and confirmedextended for ptwiki ([[phab:T292915|T292915]]) (duration: 01m 04s)
* 15:02 jynus: performing online schema change on wikidata
* 18:16 urbanecm@deploy1002: Synchronized static/images/project-logos: {{Gerrit|694bc234ab5dbb9a2387a6129998d45a53ac0ab3}}: Remove an old dawiki temporary logo (duration: 01m 04s)
* 15:00 andrewbogott: rebooting labvirt1008
* 18:15 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|224e2a374b1cc6327e9d8c2bca576091ce4efc74}}: Add NS_MAIN back to wgExtraSignatureNamespaces for mediawikiwiki ([[phab:T291630|T291630]]) (duration: 01m 05s)
* 12:48 jynus: restarted nutcracker on mw1142
* 18:12 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 12:08 godog: reenable puppet on ms-fe1/ms-be1
* 18:12 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 12:04 godog: repool ms-fe1001
* 18:11 urbanecm@deploy1002: Synchronized static/images/project-logos/: {{Gerrit|1b96f54a518620b0dc6a0ab63b402d0ea2c6bf70}}: Update logo for liwiktionary ([[phab:T291479|T291479]]) (duration: 01m 14s)
* 11:53 godog: depool ms-fe1001 to test a reboot
* 18:10 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 11:45 godog: disable puppet on ms-fe/be1 in preparation to apply https://gerrit.wikimedia.org/r/#/c/231237
* 18:10 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 12:08 kart_: Updated cxserver to e221462
* 18:09 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 03:00 logmsgbot: l10nupdate@tin Synchronized php-1.26wmf19/cache/l10n: l10nupdate for 1.26wmf19 (duration: 06m 27s)
* 18:09 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 02:45 logmsgbot: l10nupdate@tin LocalisationUpdate completed (1.26wmf18) at 2015-08-20 02:45:14+00:00
* 18:08 volans: uploaded debmonitor-client_0.3.1 to apt.wikimedia.org stretch-wikimedia,buster-wikimedia,bullseye-wikimedia
* 02:38 logmsgbot: l10nupdate@tin Synchronized php-1.26wmf18/cache/l10n: l10nupdate for 1.26wmf18 (duration: 10m 41s)
* 17:14 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/GrowthExperiments/maintenance/initWikiConfig.php: {{Gerrit|dd7a3314602ffddc5b917cccc71c917301639388}}: initWikiConfig: Fix loading difficulty/group from SUGGESTED_EDITS_TASK_TYPES ([[phab:T293219|T293219]]) (duration: 01m 04s)
* 17:13 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/GrowthExperiments/maintenance/initWikiConfig.php: {{Gerrit|5c27154cf434bebc37f5e98e2ad1b5cea7cde1d4}}: initWikiConfig: Fix loading difficulty/group from SUGGESTED_EDITS_TASK_TYPES ([[phab:T293219|T293219]]) (duration: 01m 15s)
* 16:57 mutante: stat1008 - short on disk space, mostly used in /tmp, high CPU usage by R proccess, sent a message about it to all shell users via wall
* 16:50 mutante: stat1008 - apt-get clean - freed 1.3 GB disk space - was alerting in Icinga because / was 97% full
* 16:37 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 16:37 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 16:23 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 16:23 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 15:29 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 15:28 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 15:26 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 15:26 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 15:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2026.codfw.wmnet
* 15:13 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 15:13 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 15:12 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 15:12 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 15:09 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2026.codfw.wmnet
* 15:04 jgiannelos@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 15:03 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 15:03 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 15:01 jgiannelos@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 15:01 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 15:01 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 14:59 jgiannelos@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 14:59 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 14:59 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 14:57 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 14:56 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 14:56 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 14:56 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 14:54 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 14:54 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 14:52 ema: repool cp4021, further testing can be performed on sretest1001 [[phab:T201317|T201317]]
* 14:51 volans: restarting ircecho.service on alert1001 to get back icinga-wm without the underscore
* 14:50 elukey: restart pybal on lvs1015 (low-traffic primary) to pick up new config for inference.discovery.wmnet - [[phab:T289835|T289835]]
* 14:48 moritzm: reverted to clean package state on deneb
* 14:44 elukey@puppetmaster1001: conftool action : ge; selector: cluster=ml_serve,service=inference
* 14:36 elukey: restart pybal on lvs1016 (low-traffic secondary) to pick up new config for inference.discovery.wmnet - [[phab:T289835|T289835]]
* 14:27 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 14:27 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 14:25 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 14:25 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 14:21 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 14:21 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 14:20 moritzm: temporarily downgrade sphinx packages on deneb to 1.7.9-1~bpo9+1 to build a Ganeti 2.16 stretch backport with delicate toolchain needs
* 14:13 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 14:13 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 14:10 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 14:10 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 14:10 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 14:10 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 13:59 XioNoX: push prep-work for anycast tuning in ulsfo - [[phab:T288843|T288843]]
* 13:38 jayme: imported helm-diff_3.1.3-2 to buster-wikimedia (https://gerrit.wikimedia.org/r/c/operations/debs/helm-diff/+/730509)
* 13:37 jayme@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'shellbox' for release 'main' .
* 13:34 ema@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4021.ulsfo.wmnet with OS buster
* 12:13 Lucas_WMDE: UTC morning backport+config window done
* 12:12 kharlan@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/GrowthExperiments/includes: Backport: [[gerrit:730370{{!}}Add Link: Do not log "no suggestion found" errors in production log (T291251)]] (duration: 01m 04s)
* 12:11 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/GrowthExperiments/maintenance/initWikiConfig.php --wiki=itwiki --phab='[[phab:T255037|T255037]]'  # after applying 730512 at mwmaint1002 to workaround [[phab:T293219|T293219]] # [[phab:T255037|T255037]]
* 12:11 kharlan@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/GrowthExperiments/modules: Backport: [[gerrit:730371{{!}}Suggested Edits: Update local config.presets when topics/difficulty presets change (T292536)]] (duration: 01m 07s)
* 11:56 urbanecm@deploy1002: Synchronized wmf-config/config/itwiki.yaml: {{Gerrit|38a019d4fd6ff8e7cf92f5e7c6a899c336f20235}}: itwiki: Deploy Growth features in dark mode ([[phab:T255037|T255037]]) (duration: 01m 04s)
* 11:55 urbanecm: mwscript extensions/Translate/scripts/moveTranslatablePage.php --wiki=mediawikiwiki "Growth/Communities/How to introduce yourself as a mentor" "Growth/Communities/How to configure the mentors' list" "Martin Urbanec (WMF)" --reason '[[:phab:T293184]]' # [[phab:T293184|T293184]]
* 11:55 urbanecm@deploy1002: Synchronized dblists/growthexperiments.dblist: {{Gerrit|38a019d4fd6ff8e7cf92f5e7c6a899c336f20235}}: Deploy Growth features in dark mode ([[phab:T255037|T255037]]; 2/3) (duration: 01m 04s)
* 11:54 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|38a019d4fd6ff8e7cf92f5e7c6a899c336f20235}}: itwiki: Deploy Growth features in dark mode ([[phab:T255037|T255037]]; 1/3) (duration: 01m 05s)
* 11:50 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/GrowthExperiments/maintenance/initWikiConfig.php --wiki=itwiki --phab='[[phab:T255037|T255037]]' # [[phab:T255037|T255037]]
* 11:49 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/WikimediaMaintenance/createExtensionTables.php --wiki=itwiki growthexperiments # [[phab:T255037|T255037]]
* 11:48 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/Wikibase/repo/: Backport: [[gerrit:730380{{!}}Instantiate ItemId for SiteLinkConflictLookup results (T293104)]] (duration: 01m 07s)
* 11:43 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/Wikibase/repo/: Backport: [[gerrit:730385{{!}}Instantiate ItemId for SiteLinkConflictLookup results (T293104)]] (duration: 01m 18s)
* 11:33 ema@cumin2002: START - Cookbook sre.hosts.reimage for host cp4021.ulsfo.wmnet with OS buster
* 11:19 ema: pool cp4021 after reimage [[phab:T201317|T201317]]
* 11:05 ema@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4021.ulsfo.wmnet with OS buster
* 10:15 jgiannelos@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 10:10 ayounsi@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:09 phuedx@deploy1002: Synchronized wmf-config/InitialiseSettings-labs.php: Config: [[gerrit:728490{{!}}Add more types of QuickSurveys on beta cluster (T292459)]] (duration: 01m 53s)
* 10:06 ayounsi@cumin1001: START - Cookbook sre.dns.netbox
* 09:22 ema@cumin2002: START - Cookbook sre.hosts.reimage for host cp4021.ulsfo.wmnet with OS buster
* 08:35 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:28 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:21 elukey: run kafka preferred-replica-election on kafka-main1001 to rebalance partition leaders - [[phab:T288825|T288825]]
* 08:15 godog: bounce graphite on graphite1004 to apply new config
* 07:33 elukey: increase kafka topic partition size of the top 4 high traffic topics of main-eqiad as described in https://phabricator.wikimedia.org/T288825#7422726
* 07:13 XioNoX: provision new eqsin-ulsfo link - [[phab:T273308|T273308]]
* 06:26 elukey: `kafka topics --alter --topic <nowiki>{</nowiki>eqiad,codfw<nowiki>}</nowiki>.change-prop.transcludes.resource-change --partitions 3` on kafka-main2001 - [[phab:T288825|T288825]]
* 00:38 ejegg: updated payments-wiki from {{Gerrit|030b11da1a}} to {{Gerrit|b329d2dea2}}


== 2015-08-19 ==
== 2021-10-12 ==
* 23:27 logmsgbot: rmoen@tin Synchronized wmf-config/InitialiseSettings.php: Remove reference to lost wikitech apple touch icon file (duration: 00m 13s)
* 23:48 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 23:21 logmsgbot: rmoen@tin Synchronized php-1.26wmf19/extensions/TimedMediaHandler/: Re-disable 2-pass Theora encoding temporarily
* 23:16 urbanecm: UTC late B&C window done
* 23:16 logmsgbot: rmoen@tin Synchronized php-1.26wmf18/extensions/Flow: Add debugging code to detect and workaround type hint failure (duration: 00m 14s)
* 23:15 urbanecm@deploy1002: Synchronized wmf-config/logos.php: {{Gerrit|59c31d9046a68e73b07d8179ac569425d18dcf73}}: Change logo in astwiki ([[phab:T292742|T292742]]) (duration: 01m 04s)
* 21:20 robh: livehack reverted sodium back to normal, testing done
* 23:12 urbanecm@deploy1002: Synchronized static/images/project-logos/: {{Gerrit|59c31d9046a68e73b07d8179ac569425d18dcf73}}: Change logo in astwiki ([[phab:T292742|T292742]]) (duration: 02m 09s)
* 21:08 robh: disabled puppet on sodium for livehacking tests for T109609
* 23:05 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 21:04 andrewbogott: disabling puppeton labnet1001 and labnet1002
* 22:53 urbanecm: [urbanecm@labweb1001 ~]$ mwscript extensions/OATHAuth/maintenance/disableOATHAuthForUser.php --wiki=labswiki Jamesmontalvo3 #
* 20:46 urandom: restarting Cassandra on restbase1001 to enable -XX:+PrintAdaptiveSizePolicy
* 22:51 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 20:43 urandom: disabling puppet on restbase1001 to temporarily enable additional GC logging
* 20:21 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 20:17 subbu: deployed parsoid version 8d617c99
* 19:31 dancy@deploy1002: Pruned MediaWiki: 1.38.0-wmf.1 (duration: 04m 02s)
* 20:00 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/232580/ (duration: 00m 12s)
* 19:13 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:18 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/232565/ (duration: 00m 12s)
* 19:08 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 19:04 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/232543/ (duration: 00m 13s)
* 19:02 dancy@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.38.0-wmf.4  refs [[phab:T281168|T281168]]
* 18:59 logmsgbot: twentyafterfour@tin Synchronized php-1.26wmf19: deploy hotfix for Wikidata: https://gerrit.wikimedia.org/r/#/c/232556/ (duration: 02m 39s)
* 18:47 dancy@deploy1002: Finished scap: testwikis wikis to 1.38.0-wmf.4  refs [[phab:T281168|T281168]] (duration: 45m 36s)
* 18:30 ottomata: starting reinsatll of analytics1022 -> kafka1022 as jessie
* 18:12 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest1001.eqiad.wmnet with OS buster
* 18:06 logmsgbot: twentyafterfour@tin rebuilt wikiversions.cdb and synchronized wikiversions files: group1 wikis to 1.26wmf19
* 18:01 dancy@deploy1002: Started scap: testwikis wikis to 1.38.0-wmf.4  refs [[phab:T281168|T281168]]
* 16:26 ottomata: added analytics105[012456] into hadoop cluster as worker nodes
* 17:58 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:02 logmsgbot: thcipriani@tin Synchronized php-1.26wmf18/extensions/ContentTranslation/api/ApiContentTranslationPublish.php: SWAT: Temp disable notifications for cx [[gerrit:232504]] (duration: 00m 13s)
* 17:56 dancy@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/CentralNotice: Backport: [[gerrit:730141]] (duration: 00m 59s)
* 15:57 logmsgbot: thcipriani@tin Synchronized php-1.26wmf19/extensions/ContentTranslation/api/ApiContentTranslationPublish.php: SWAT: Temporarily disable notifications for cx [[gerrit:232505]] (duration: 00m 12s)
* 17:55 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:51 logmsgbot: thcipriani@tin Synchronized php-1.26wmf18/includes/changetags/ChangeTags.php: SWAT: Avoid full RC table scans in ChangeTags::updateTags() [[gerrit:232484]] (duration: 00m 12s)
* 17:46 volans@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1001.eqiad.wmnet with OS buster
* 15:35 logmsgbot: thcipriani@tin Synchronized php-1.26wmf19/includes/changetags/ChangeTags.php: SWAT: Avoid full RC table scans in ChangeTags::updateTags() [[gerrit:232485]] (duration: 00m 13s)
* 17:43 jgiannelos@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 15:25 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Adjust mediawiki.org RSS whitelist to allow technology blog feeds [[gerrit:118956]] (duration: 00m 13s)
* 17:41 jgiannelos@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 15:00 andrewbogott: rebooting labvirt1007
* 17:41 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:58 godog: stop puppet on ms-fe1* while merging swift refactoring
* 17:38 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:58 godog: stop puppet on ms-be1* while merging swift refactoring
* 17:32 dancy@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/SyntaxHighlight_GeSHi/includes/ResourceLoaderPygmentsModule.php: Backport: [[gerrit:730233{{!}}Include generated styles before Mediawiki overrides (T292736)]] (duration: 00m 57s)
* 09:51 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Repool db1018 for normal traffic only (duration: 00m 13s)
* 17:30 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 09:49 logmsgbot: reedy@tin Synchronized wmf-config/InitialiseSettings.php: Add *.webarchive.org.uk to wgCopyUploadsDomains whitelist (duration: 00m 12s)
* 17:27 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:28 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Wed Aug 19 08:28:19 UTC 2015 (duration 28m 18s)
* 17:23 dancy@deploy1002: Synchronized php-1.38.0-wmf.4/includes/actions/pagers/HistoryPager.php: Backport: [[gerrit:730236{{!}}Fix history page iteration in backwards mode (T292791)]] (duration: 00m 57s)
* 07:39 jynus: About to perform a schema change on flowdb
* 17:19 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 03:44 logmsgbot: mattflaschen@tin Synchronized php-1.26wmf19/includes/rcfeed/RCFeedFormatter.php: Fix Flow RC regression in 1.26wmf19 (duration: 00m 12s)
* 17:16 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 03:43 logmsgbot: mattflaschen@tin Synchronized php-1.26wmf19/includes/changes/RecentChange.php: Fix Flow RC regression in 1.26wmf19 (duration: 00m 12s)
* 17:16 dancy@deploy1002: Synchronized php-1.38.0-wmf.3/includes/actions/pagers/HistoryPager.php: Backport: [[gerrit:730235{{!}}Fix history page iteration in backwards mode (T292791)]] (duration: 00m 57s)
* 03:43 logmsgbot: mattflaschen@tin Synchronized php-1.26wmf18/includes/rcfeed/RCFeedFormatter.php: Fix Flow RC regression in 1.26wmf18 (duration: 00m 12s)
* 17:12 moritzm: installing rsync bugfix updates
* 03:42 logmsgbot: mattflaschen@tin Synchronized php-1.26wmf18/includes/changes/RecentChange.php: Fix Flow RC regression in 1.26wmf18 (duration: 00m 12s)
* 17:09 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 03:14 logmsgbot: l10nupdate@tin LocalisationUpdate completed (1.26wmf19) at 2015-08-19 03:14:06+00:00
* 16:56 bd808@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 03:07 logmsgbot: l10nupdate@tin Synchronized php-1.26wmf19/cache/l10n: l10nupdate for 1.26wmf19 (duration: 10m 48s)
* 16:55 volans@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2009.codfw.wmnet
* 02:39 logmsgbot: l10nupdate@tin LocalisationUpdate completed (1.26wmf18) at 2015-08-19 02:39:54+00:00
* 16:53 moritzm: failed over ganeti master for test cluster to ganeti2025
* 02:36 logmsgbot: l10nupdate@tin Synchronized php-1.26wmf18/cache/l10n: l10nupdate for 1.26wmf18 (duration: 07m 51s)
* 16:50 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 01:10 yurik: updated kartotherian
* 16:48 volans@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2009.codfw.wmnet
* 00:40 logmsgbot: mattflaschen@tin Synchronized php-1.26wmf19/extensions/Flow/: Sync Flow 1.26wmf19 for RC insert failure. (duration: 00m 15s)
* 16:32 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:39 logmsgbot: mattflaschen@tin Synchronized php-1.26wmf18/extensions/Flow/: Sync Flow 1.26wmf18 for RC insert failure. (duration: 00m 14s)
* 16:30 volans@cumin2002: END (ERROR) - Cookbook sre.hosts.decommission (exit_code=97) for hosts testvm2009.codfw.wmnet
* 16:30 volans@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2009.codfw.wmnet
* 16:29 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:26 volans@cumin2002: END (ERROR) - Cookbook sre.ganeti.makevm (exit_code=97) for new host testvm2009.codfw.wmnet
* 16:26 dancy@deploy1002: Synchronized php-1.38.0-wmf.4/includes: Backport: [[gerrit:730226{{!}}Pre-format comments for non-local files too (T292570)]] (duration: 01m 15s)
* 16:17 volans@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2009.codfw.wmnet
* 16:16 volans@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts testvm2009.codfw.wmnet
* 16:11 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:10 volans@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2009.codfw.wmnet
* 16:09 volans@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2009.codfw.wmnet
* 16:08 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:06 dancy@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/SecurePoll/includes/Hooks/HookRunner.php: Backport: [[gerrit:730231{{!}}Fix wrong var being passed (T289950 T293102)]] (duration: 00m 57s)
* 16:00 volans@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2009.codfw.wmnet
* 15:59 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:58 dancy@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/SecurePoll/includes/Hooks/HookRunner.php: Backport: [[gerrit:730230{{!}}Fix wrong var being passed (T289950 T293102)]] (duration: 02m 13s)
* 15:57 volans@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host testvm2009.codfw.wmnet
* 15:57 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:51 jgiannelos@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 15:49 volans@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2009.codfw.wmnet
* 15:48 volans@cumin2002: END (ERROR) - Cookbook sre.ganeti.makevm (exit_code=97) for new host testvm2009.codfw.wmnet
* 15:48 volans@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2009.codfw.wmnet
* 15:41 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for analytics1069.eqiad.wmnet
* 15:41 btullis@cumin1001: START - Cookbook sre.hosts.remove-downtime for analytics1069.eqiad.wmnet
* 15:02 volans@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:50 volans@cumin2002: START - Cookbook sre.dns.netbox
* 13:49 jgiannelos@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 13:40 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host testvm2006.codfw.wmnet
* 13:25 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2006.codfw.wmnet
* 13:21 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:19 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:14 godog: add 50G to prometheus/k8s in eqiad
* 13:13 otto@deploy1002: Synchronized wmf-config/CommonSettings.php: Enable x_client_ip_forwarding_enabled for eventgate-analytics and eventgate-analytics-external - [[phab:T288853|T288853]] (duration: 00m 56s)
* 13:11 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on analytics1069.eqiad.wmnet with reason: draining flea power [[phab:T291732|T291732]]
* 13:11 btullis@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on analytics1069.eqiad.wmnet with reason: draining flea power [[phab:T291732|T291732]]
* 13:05 volans: upgraed spicerack to 1.0.5 on cumin hosts
* 12:25 volans: uploaded spicerack_1.0.5 to apt.wikimedia.org buster-wikimedia,bullseye-wikimedia
* 12:15 elukey: `kafka topics --alter --topic codfw.mediawiki.job.cirrusSearchElasticaWrite --partitions 5` - [[phab:T288825|T288825]]
* 12:15 elukey: `kafka topics --alter --topic eqiad.mediawiki.job.cirrusSearchElasticaWrite --partitions 5` - [[phab:T288825|T288825]]
* 12:10 elukey: `kafka topics --alter --topic codfw.cpjobqueue.partitioned.mediawiki.job.cirrusSearchElasticaWrite --partitions 5` - [[phab:T288825|T288825]]
* 12:09 elukey: `kafka topics --alter --topic eqiad.cpjobqueue.partitioned.mediawiki.job.cirrusSearchElasticaWrite --partitions 5` - [[phab:T288825|T288825]]
* 11:58 elukey: `kafka topics --alter --topic codfw.resource-purge --partitions 5` on kafka-main2001 - [[phab:T288825|T288825]]
* 11:49 elukey: `kafka topics --alter --topic eqiad.resource-purge --partitions 5` on kafka-main2001 - [[phab:T288825|T288825]]
* 11:46 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2025.codfw.wmnet
* 11:44 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:42 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2025.codfw.wmnet
* 11:34 urbanecm: UTC morning B&C window done
* 11:33 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:32 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|860ea0944d6dc1e6b5061eb84eec378eb5ac8441}}: Remove NS_MAIN from wgExtraSignatureNamespaces on most special wikis ([[phab:T291630|T291630]]) (duration: 00m 57s)
* 11:30 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:22 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:19 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:14 jgiannelos@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 11:11 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:08 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:06 urbanecm@deploy1002: Synchronized w/static.php: {{Gerrit|e77ae17efb34723598fc69e87109944384df442a}}: static.php: correctly report a bad request (duration: 00m 57s)
* 11:02 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host testvm2003.codfw.wmnet
* 10:58 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2003.codfw.wmnet
* 10:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on testvm[2001-2002,2005].codfw.wmnet with reason: Ganeti tests
* 10:53 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on testvm[2001-2002,2005].codfw.wmnet with reason: Ganeti tests
* 10:30 ema: apply https://gerrit.wikimedia.org/r/726912 to all A:cp nodes [[phab:T288106|T288106]]
* 10:24 ema@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4028.ulsfo.wmnet,service=ats-be
* 10:23 ema: depool/repool ats-be on cp4028 to verify updates to /etc/varnish/directors.frontend.vcl on cp4027 keep on working fine [[phab:T288106|T288106]]
* 10:23 jgiannelos@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 10:22 ema@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4028.ulsfo.wmnet,service=ats-be
* 10:16 ema: cp4027: enable and run puppet to test https://gerrit.wikimedia.org/r/726912 [[phab:T288106|T288106]]
* 10:12 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ganeti2025.codfw.wmnet with OS buster
* 09:16 kormat@cumin1001: dbctl commit (dc=all): 'db1127 (re)pooling @ 100%: repool db1127 [[phab:T292956|T292956]]', diff saved to https://phabricator.wikimedia.org/P17456 and previous config saved to /var/cache/conftool/dbconfig/20211012-091614-kormat.json
* 09:01 kormat@cumin1001: dbctl commit (dc=all): 'db1127 (re)pooling @ 75%: repool db1127 [[phab:T292956|T292956]]', diff saved to https://phabricator.wikimedia.org/P17455 and previous config saved to /var/cache/conftool/dbconfig/20211012-090111-kormat.json
* 08:46 kormat@cumin1001: dbctl commit (dc=all): 'db1127 (re)pooling @ 50%: repool db1127 [[phab:T292956|T292956]]', diff saved to https://phabricator.wikimedia.org/P17454 and previous config saved to /var/cache/conftool/dbconfig/20211012-084607-kormat.json
* 08:31 kormat@cumin1001: dbctl commit (dc=all): 'db1127 (re)pooling @ 25%: repool db1127 [[phab:T292956|T292956]]', diff saved to https://phabricator.wikimedia.org/P17453 and previous config saved to /var/cache/conftool/dbconfig/20211012-083103-kormat.json
* 08:03 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:00 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 07:58 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/GrowthExperiments/: {{Gerrit|17dc3aa}}, {{Gerrit|e0ca905}}, {{Gerrit|c0f4f4e}}: GrowthExperiments backports ([[phab:T292224|T292224]], [[phab:T290609|T290609]], [[phab:T290609|T290609]]) (duration: 00m 59s)
* 07:40 elukey: run kafka preferred-replica-election on kafka-main2001 to rebalance partition leaders after the last topic moves - [[phab:T288825|T288825]]
* 07:33 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2025.codfw.wmnet with OS buster
* 07:22 moritzm: installing RT security updates
* 04:43 eileen: civicrm revision changed from {{Gerrit|96090e4bd2}} to {{Gerrit|946dfb6c5a}}, config revision is {{Gerrit|85277466ed}}
* 03:56 kart_: cxserver: Remove Matxin Key from Production ([[phab:T292635|T292635]])
* 03:54 kartik@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'cxserver' for release 'production' .
* 03:48 kartik@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'cxserver' for release 'production' .
* 03:45 kartik@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'cxserver' for release 'staging' .
* 02:44 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:41 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:28 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:22 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:11 eileen: civicrm revision changed from {{Gerrit|598b59b0ee}} to {{Gerrit|96090e4bd2}}, config revision is {{Gerrit|85277466ed}}


== 2015-08-18 ==
== 2021-10-11 ==
* 23:59 logmsgbot: krenair@tin Synchronized wmf-config/wikitech.php: T59040 (duration: 00m 12s)
* 21:25 btullis@cumin1001: END (PASS) - Cookbook sre.hadoop.roll-restart-masters (exit_code=0) restart masters for Hadoop analytics cluster: Restart of jvm daemons. - btullis@cumin1001
* 23:37 mutante: added papaul (pt1979) to WMF LDAP group
* 20:58 btullis@cumin1001: START - Cookbook sre.hadoop.roll-restart-masters restart masters for Hadoop analytics cluster: Restart of jvm daemons. - btullis@cumin1001
* 23:08 logmsgbot: mattflaschen@tin Synchronized php-1.26wmf18/extensions/Flow/: Sync Flow 1.26wmf18 for watchlist fix. (duration: 00m 14s)
* 17:08 elukey: force kafka preferred-replica-election on kafka-main2001 after another batch of topic partitions moves - [[phab:T288825|T288825]]
* 23:02 logmsgbot: hoo@tin Synchronized wmf-config/: Set $wgPropertySuggesterClassifyingPropertyIds for testwikidata (duration: 00m 14s)
* 15:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2026.codfw.wmnet
* 22:28 logmsgbot: krenair@tin Synchronized php-1.26wmf19/extensions/VisualEditor/modules/ve-mw/ui/dialogs/ve.ui.MWSaveDialog.js: https://gerrit.wikimedia.org/r/#/c/232385/ (duration: 00m 12s)
* 15:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2026.codfw.wmnet
* 22:17 logmsgbot: ori@tin Synchronized php-1.26wmf19/includes/OutputPage.php: 1a4f1df2fe (duration: 00m 12s)
* 15:31 jgleeson: smashpig updated from {{Gerrit|3607b16f83}} to {{Gerrit|dd3a81c7c2}}
* 21:22 awight: updated paymentswiki from 823393264d6795bbaec490ff86f17580f722e598 to fca36026b1e90298abd93562803d3ea7d6893d96
* 14:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on testvm[2001-2002,2005].codfw.wmnet with reason: Ganeti tests
* 19:26 logmsgbot: twentyafterfour@tin rebuilt wikiversions.cdb and synchronized wikiversions files: group0 wikis to 1.26wmf19
* 14:59 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on testvm[2001-2002,2005].codfw.wmnet with reason: Ganeti tests
* 19:21 logmsgbot: twentyafterfour@tin Finished scap: testwiki to 1.26wmf19 (duration: 51m 01s)
* 14:36 Emperor: start restoring weight to ms-be2045 [[phab:T290881|T290881]]
* 18:30 logmsgbot: twentyafterfour@tin Started scap: testwiki to 1.26wmf19
* 13:42 elukey: force kafka preferred-replica-election on kafka-main2001 after another batch of topic partitions moves - [[phab:T288825|T288825]]
* 18:11 logmsgbot: ori@tin Synchronized php-1.26wmf18/includes/OutputPage.php: 6ee94ca47c: Load all CSS in the top queue (duration: 00m 13s)
* 12:53 moritzm: install apache security updates on buster
* 18:07 robh: sodium returned to normal, mailman window over.
* 12:49 topranks: Setting up BGP peering to AS12552 (GlobalConnect Group) at AMS-IX on cr2-esams
* 17:38 logmsgbot: ori@tin Synchronized php-1.26wmf18/includes: 91ae6a39df, 4cc9622214: Added wfTransactionalTimeLimit() method and applied it; Try to make POSTs as transactional as possible (duration: 00m 16s)
* 12:45 ema: cp4027: upgrade varnish to 6.0.8 [[phab:T292290|T292290]]
* 17:21 robh: T108099 complete, mailman restarted for a few minutes while i prepare next task.
* 12:04 moritzm: install apache security updates on bullseye
* 17:17 robh: puppet disabled on sodium, no touch.
* 10:23 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host graphite2003.codfw.wmnet
* 17:03 robh: mailman maint window starts now, list delivery will remain sporadic until I finish. (It'll work off and on, no messages should be lost)
* 09:50 filippo@cumin1001: START - Cookbook sre.hosts.reimage for host graphite2003.codfw.wmnet
* 15:42 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/232241/ (duration: 00m 13s)
* 09:45 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host graphite2003.codfw.wmnet
* 15:06 logmsgbot: thcipriani@tin Synchronized wmf-config/CommonSettings.php: SWAT: Remove extra transcode enablings [[gerrit:232228]] (duration: 00m 13s)
* 09:37 elukey: force kafka preferred-replica-election on kafka-main2001 after another batch of topic partitions moves - [[phab:T288825|T288825]]
* 15:04 andrewbogott: rebooting labvirt1006
* 09:13 filippo@cumin1001: START - Cookbook sre.hosts.reimage for host graphite2003.codfw.wmnet
* 08:18 _joe_: reimaging mw1152
* 09:09 elukey: force kafka preferred-replica-election on kafka-main2001 after the first 50 topic partitions moves - [[phab:T288825|T288825]]
* 08:14 godog: restart cassandra on restbase100[569] to pick up latest openjdk
* 09:05 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest1002.eqiad.wmnet
* 08:04 _joe_: depooling mw1152 from the imagescalers pool
* 09:01 godog: bounce swift-object-replicator on ms-be2036
* 08:03 godog: restart cassandra on restbase100[348] to pick up latest openjdk
* 08:52 godog: bounce statsite on graphite1004 to apply unit config changes
* 07:23 legoktm: live hacking on mw1017 for T109236
* 08:48 volans@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest1001.eqiad.wmnet
* 05:45 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Tue Aug 18 05:45:47 UTC 2015 (duration 45m 46s)
* 08:41 volans@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1002.eqiad.wmnet
* 02:25 logmsgbot: l10nupdate@tin LocalisationUpdate completed (1.26wmf18) at 2015-08-18 02:25:28+00:00
* 08:38 moritzm: updated buster d-i image for Bullseye 11.1 point release [[phab:T292844|T292844]]
* 02:21 logmsgbot: l10nupdate@tin Synchronized php-1.26wmf18/cache/l10n: l10nupdate for 1.26wmf18 (duration: 06m 50s)
* 08:38 moritzm: updated buster d-i image for Buster 10.11 point release [[phab:T292838|T292838]]
* 00:02 ottomata: analytics1041 down, attempting power cycle
* 08:26 godog: swift eqiad-prod: final weight to ms-be10[64-67] - [[phab:T290546|T290546]]
* 08:25 moritzm: updated buster d-i image for Buster 10.11 point release [[phab:T292838|T292838]]
* 08:24 volans@cumin1001: START - Cookbook sre.hosts.reimage for host sretest1001.eqiad.wmnet
* 08:06 godog: bounce uwsgi on graphite hosts to bump request size limit - [[phab:T292877|T292877]]
* 07:58 volans: migrating physical hosts DHCP to the new reimage process - [[phab:T269855|T269855]]
* 07:57 elukey: start kafka topics rebalancing for main-codfw (long running maintenance) - [[phab:T288825|T288825]]


== 2015-08-17 ==
== 2021-10-09 ==
* 22:19 matt_flaschen: LQT->Flow done on MediaWiki.org.
* 05:01 jiji@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:57 logmsgbot: mattflaschen@tin Synchronized wmf-config: LQT->Flow: Make frozen wikis no longer able to create LQT pages (duration: 00m 13s)
* 04:28 jiji@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:31 chasemp: remove php5-xdebug from terbium per mattflaschen
* 01:32 ryankemper@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 21:10 MaxSem: renamed Gadget:Invention, Travel, & Adventure --> Gadget Invention, Travel, & Adventure on enwiki using moveBatch.php to work around a permissions screwup
* 00:46 mutante: ms-be2045 - started systemd-timedated which had been killed by something
* 20:54 bd808: T109369: Restarted logstash on logstash1003; parsoid gelf events not being recorded since 2015-08-15
* 00:28 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 20:16 subbu: deployed parsoid version 4b656b72
* 00:24 ryankemper@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.force-unfreeze (exit_code=99)
* 19:19 ottomata: stopping kafka on analytics1012, preparing to reinstall with Jessie and rename to kafka1012
* 00:23 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.force-unfreeze
* 15:44 logmsgbot: krenair@tin Synchronized wmf-config/extension-list: https://gerrit.wikimedia.org/r/#/c/232051/ - remove WikiGrok from extension-list, extension is no longer deployed (duration: 00m 11s)
* 00:13 ryankemper: [[phab:T292814|T292814]] Write queue stuck at 133 events in partition 1 of topic `codfw.cpjobqueue.partitioned.mediawiki.job.cirrusSearchElasticaWrite`, will try again at another time
* 15:40 logmsgbot: krenair@tin Synchronized multiversion/MWMultiVersion.php: https://gerrit.wikimedia.org/r/#/c/231982/2 - clear up --wiki usage to mwscript (duration: 00m 12s)
* 00:12 ryankemper@cumin1001: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 15:34 logmsgbot: krenair@tin Synchronized php-1.26wmf18/extensions/VisualEditor/modules/ve-mw/init/ve.init.mw.Target.js: https://gerrit.wikimedia.org/r/#/c/232048/ (duration: 00m 11s)
* 15:07 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/231984/ (duration: 00m 13s)
* 15:05 andrewbogott: rebooting labvirt1004
* 15:03 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/232021/ (duration: 00m 12s)
* 14:30 mobrovac: restbase updated production cluster to ed17952
* 14:12 mobrovac: restbase deployed ed17952 on restbase1001
* 13:58 mobrovac: restbase deploying ed17952 on staging
* 11:42 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Reverting change to settings before schema change (no more lag) (duration: 00m 12s)
* 11:29 jynus: reloading dbproxy1003 haproxy config- it was a temporal max_connections issue; db1043 should be the canonical server again
* 10:09 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Solving lag issues while schema change is ongoing (duration: 00m 12s)
* 09:43 jynus: about to perform schema change on centralauth
* 09:36 godog: upgrade openjdk on restbase100[127] and restart cassandra
* 05:54 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Mon Aug 17 05:54:24 UTC 2015 (duration 54m 23s)
* 05:11 twentyafterfour: restarted phd to pick up new configuration, (and to silence the phabricator 'setup issue' warning
* 04:47 twentyafterfour: changed phabricator policy for the multimeter application from 'public' to 'all users'
* 04:44 twentyafterfour: deployed https://gerrit.wikimedia.org/r/#/c/231983/ to iridium and restarted apache
* 04:17 mutante: free some disk space on iridium. apt-get clean; gzip /var/log/account/pacct.0; some apache logs .;.
* 02:29 logmsgbot: l10nupdate@tin LocalisationUpdate completed (1.26wmf18) at 2015-08-17 02:29:19+00:00
* 02:25 logmsgbot: l10nupdate@tin Synchronized php-1.26wmf18/cache/l10n: l10nupdate for 1.26wmf18 (duration: 11m 08s)


== 2015-08-16 ==
== 2021-10-08 ==
* 18:15 logmsgbot: krenair@tin Synchronized wmf-config/extension-list: https://gerrit.wikimedia.org/r/#/c/169612/ - remove Extension:Oversight (duration: 00m 21s)
* 23:16 legoktm: sudo cumin -b 10 C:mediawiki::packages 'apt-get purge lilypond-data -y'
* 18:14 logmsgbot: krenair@tin Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/169612/ - remove Extension:Oversight (duration: 00m 25s)
* 23:10 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 05:39 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sun Aug 16 05:39:28 UTC 2015 (duration 39m 27s)
* 21:38 mutante: mwmaint2002 - disable-puppet, stop bacula-fd, recovery in progress
* 02:29 logmsgbot: l10nupdate@tin LocalisationUpdate completed (1.26wmf18) at 2015-08-16 02:29:19+00:00
* 21:34 mutante: disabling puppet on bacula - going through a restore https://wikitech.wikimedia.org/wiki/Bacula#Restore_from_a_non-existent_host_(missing_private_key)
* 02:25 logmsgbot: l10nupdate@tin Synchronized php-1.26wmf18/cache/l10n: l10nupdate for 1.26wmf18 (duration: 11m 52s)
* 21:30 legoktm: running puppet across C:mediawiki::packages to uninstall lilypond and ploticus: legoktm@cumin1001:~$ sudo cumin -b 4 C:mediawiki::packages 'run-puppet-agent'
* 20:12 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes1018.eqiad.wmnet with reason: REIMAGE
* 20:10 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubestage1004.eqiad.wmnet with reason: REIMAGE
* 20:08 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes1018.eqiad.wmnet with reason: REIMAGE
* 20:08 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubestage1003.eqiad.wmnet with reason: REIMAGE
* 20:06 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubestage1004.eqiad.wmnet with reason: REIMAGE
* 20:05 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubestage1003.eqiad.wmnet with reason: REIMAGE
* 19:46 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes1020.eqiad.wmnet with reason: REIMAGE
* 19:45 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes1019.eqiad.wmnet with reason: REIMAGE
* 19:43 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes1020.eqiad.wmnet with reason: REIMAGE
* 19:42 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes1019.eqiad.wmnet with reason: REIMAGE
* 19:42 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes1018.eqiad.wmnet with reason: REIMAGE
* 19:39 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes1018.eqiad.wmnet with reason: REIMAGE
* 18:15 cstone: civicrm revision changed from {{Gerrit|5cb7d487cb}} to {{Gerrit|598b59b0ee}}
* 16:19 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/GrowthExperiments/maintenance/updateMenteeData.php --wiki=enwiki --force # to measure performance on a large wiki
* 15:48 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 15:48 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 15:29 jelto: enable puppet on gitlab1001 again for [[phab:T283076|T283076]]
* 14:05 jiji@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:01 jiji@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 09:49 Amir1: wikiadmin@10.64.16.85(wikidatawiki)> delete from wb_changes_subscription where cs_subscriber_id in ('testcommonswiki', 'mowiki');
* 09:39 Emperor: installing stress on ms-be2045 given recent h/w issues [[phab:T290881|T290881]]
* 08:20 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:12 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:04 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/GrowthExperiments/maintenance/updateMenteeData.php --wiki=frwiki --force
* 07:43 Emperor: reboot ms-be2045 [[phab:T290881|T290881]]
* 07:41 gehel: manually resuming the data reloads on wdqs1009 and wdqs2008
* 06:42 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 06:42 ayounsi@cumin1001: START - Cookbook sre.network.cf
* 06:28 ayounsi@cumin2002: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 06:28 ayounsi@cumin2002: START - Cookbook sre.network.cf
* 05:35 ryankemper@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 04:56 ryankemper: [WDQS Deploy] Deploy complete. Successful test query placed on query.wikidata.org, there's no relevant criticals in Icinga, and Grafana looks good
* 04:32 ryankemper: [[phab:T292814|T292814]] Beginning rolling restart of `cloudelastic`: `sudo -i cookbook sre.elasticsearch.rolling-operation cloudelastic "cloudelastic restart" --nodes-per-run 1 --start-datetime 2021-10-08T03:53:49 --task-id [[phab:T292814|T292814]]` on `ryankemper@cumin1001` tmux `elastic`
* 04:31 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 04:29 ryankemper: [WDQS Deploy] Restarting `wdqs-categories` across lvs-managed hosts, one node at a time: `sudo -E cumin -b 1 'A:wdqs-all and not A:wdqs-test' 'depool && sleep 45 && systemctl restart wdqs-categories && sleep 45 && pool'`
* 04:28 ryankemper: [WDQS Deploy] Restarted `wdqs-categories` across both test hosts simultaneously: `sudo -E cumin 'A:wdqs-test' 'systemctl restart wdqs-categories'`
* 04:28 ryankemper: [WDQS Deploy] Restarted `wdqs-updater` across all hosts, 4 hosts at a time: `sudo -E cumin -b 4 'A:wdqs-all' 'systemctl restart wdqs-updater'`
* 04:23 ryankemper@deploy1002: Finished deploy [wdqs/wdqs@8f57a56]: 0.3.89 (duration: 08m 22s)
* 04:20 ryankemper@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 04:20 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 04:18 gehel@cumin1001: END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99)
* 04:17 gehel@cumin1001: END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99)
* 04:15 ryankemper: [WDQS Deploy] Tests passing following deploy of `0.3.89` on canary `wdqs1003`; proceeding to rest of fleet
* 04:14 ryankemper@deploy1002: Started deploy [wdqs/wdqs@8f57a56]: 0.3.89
* 04:14 ryankemper: [WDQS Deploy] Gearing up for deploy of wdqs `0.3.89`. Pre-deploy tests passing on canary `wdqs1003`
* 03:58 ryankemper@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 03:58 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 02:04 Krinkle: krinkle@deploy1002$ echo 'https://en.wikipedia.org/static/images/project-logos/jvwiktionary.png' {{!}} mwscript purgeList.php , ref [[phab:T287425|T287425]], [[phab:T292810|T292810]]
* 00:07 tgr_: deploy window over
* 00:05 tgr@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/GrowthExperiments: Backport: [[gerrit:727498{{!}}Mentee overview: Make UncachedMenteeOverviewDataProvider::getBlocksForUsers faster (T290609)]] (duration: 00m 56s)


== 2015-08-15 ==
== 2021-10-07 ==
* 22:56 legoktm: removed 13 bounce_records for User:odder from bouncehandler database
* 23:43 thcipriani@deploy1002: Synchronized wmf-config/logos.php: Config: [[gerrit:708065{{!}}Change Javanese Wiktionary logo (T287425)]] part 3/3 (duration: 00m 55s)
* 16:07 _joe_: removing manually core dumps from last night's outage on all appservers in eqiad, they occpy on average 30 GB/server
* 23:41 thcipriani@deploy1002: Synchronized logos/config.yaml: Config: [[gerrit:708065{{!}}Change Javanese Wiktionary logo (T287425)]] part 2/3 (duration: 00m 55s)
* 16:05 ottomata: starting rolling restart of kafka brokers to apply auto leader rebalance enable = false
* 23:40 thcipriani@deploy1002: Synchronized static/images/project-logos: Config: [[gerrit:708065{{!}}Change Javanese Wiktionary logo (T287425)]] part 1/3 (duration: 00m 56s)
* 14:49 ottomata: stopping kafka broker on analytics1012 to again try to figure out why camus can't consume from it
* 23:30 thcipriani@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:704170{{!}}Adding and use wordmark in trwikiquote (T286133)]] Part 2/2 (duration: 00m 56s)
* 12:46 bblack: restarted gitblit on antimony, because Java is Awesome
* 23:28 thcipriani@deploy1002: Synchronized static/images/mobile/copyright/wikiquote-wordmark-tr.svg: Config: [[gerrit:704170{{!}}Adding and use wordmark in trwikiquote (T286133)]] Part 1/2 (duration: 00m 57s)
* 05:41 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sat Aug 15 05:41:57 UTC 2015 (duration 41m 56s)
* 21:35 urbanecm: Password reset for SUL User:LA2-bot ([[phab:T292793|T292793]])
* 04:38 andrewbogott: killing some rsync processes on labstore1002 because iowaits are through the roof
* 20:43 brennen@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.38.0-wmf.3
* 02:30 logmsgbot: l10nupdate@tin LocalisationUpdate completed (1.26wmf18) at 2015-08-15 02:29:05+00:00
* 20:37 brennen@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.38.0-wmf.2  refs [[phab:T281167|T281167]]
* 02:25 logmsgbot: l10nupdate@tin Synchronized php-1.26wmf18/cache/l10n: l10nupdate for 1.26wmf18 (duration: 06m 37s)
* 20:35 cmooney@cumin1001: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 01:15 ottomata: starting broker on analytics1012, camus  wasn't happy about that either. hrm.
* 20:35 cmooney@cumin1001: START - Cookbook sre.network.cf
* 00:58 ottomata: stopping kafka broker on analytics1012, it is causing consumption problems with camus, will look into why later.
* 20:23 krinkle@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/Gadgets/: {{Gerrit|I7c858b8c4bc}} (duration: 00m 56s)
* 20:01 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/Echo/: {{Gerrit|8a7ff05ba28f302adb581bf430a868bb815b4ffd}}: Revert "Use namespaced CentralAuthSessionProvider" (duration: 00m 57s)
* 19:45 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/CentralAuth/: {{Gerrit|c01c2e4983bad8582ddd62aeb35ac9be852d493b}}: Revert "Namespace session providers" (duration: 00m 57s)
* 19:44 urbanecm: Backporting https://gerrit.wikimedia.org/r/c/mediawiki/extensions/CentralAuth/+/727489, https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Echo/+/727487 in an unsafe way -- exceptions at testwikis expected, wmf.3 is not deployed elsewhere, so this should be ok
* 19:37 brennen@deploy1002: rebuilt and synchronized wikiversions files: Revert all wikis to 1.38.0-wmf.2 ([[phab:T281167|T281167]])
* 19:33 brennen: 1.38.0-wmf.3 train ([[phab:T281167|T281167]]): variously blocked, rolling back to testwikis for safe deploy of backports
* 19:14 brennen@deploy1002: rebuilt and synchronized wikiversions files: Revert group2 wikis to 1.38.0-wmf.2
* 19:07 brennen@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.38.0-wmf.3  refs [[phab:T281167|T281167]]
* 19:03 brennen: 1.38.0-wmf.3 train ([[phab:T281167|T281167]]): unblocked, rolling to all wikis
* 18:50 urbanecm: [urbanecm@mwmaint1002 /srv/mediawiki/php]$ mwscript extensions/GrowthExperiments/maintenance/initWikiConfig.php --wiki=test2wiki
* 18:46 sukhe: running authdns-update for [[phab:T292537|T292537]]
* 18:29 urbanecm: Morning B&C window done
* 18:28 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|4a946c046ae17a520f8d3463a16b1435ceb4856c}}: Deploy Growth mentor dashboard to pilot wikis ([[phab:T278920|T278920]]) (duration: 01m 04s)
* 18:23 urbanecm@deploy1002: Synchronized dblists/growthexperiments.dblist: {{Gerrit|87e300137c14451949fac12c3ec89319305a423e}}: Deploy Growth features to test2wiki (duration: 01m 03s)
* 18:21 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|87e300137c14451949fac12c3ec89319305a423e}}: Deploy Growth features to test2wiki (duration: 01m 04s)
* 18:20 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|31770f2b3660e7d7490c0a9ab66285c1f069732d}}: shwiki: Deploy Growth features to newcomers ([[phab:T278240|T278240]]) (duration: 01m 04s)
* 18:15 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|33526dfed148068585289f5ac501feda72068fd9}}: Stream config changes for android_daily_stats schema ([[phab:T286000|T286000]]) (duration: 01m 06s)
* 18:10 ejegg: updated payments-wiki from {{Gerrit|6d3560d083}} to {{Gerrit|030b11da1a}}
* 18:07 arnoldokoth: gitlab2001 re-image complete ([[phab:T283076|T283076]])
* 17:30 mutante: rebooting gitlab2001.wikimedia.org
* 16:56 arnoldokoth: down timing gitlab2001 for re-imaging ([[phab:T283076|T283076]])
* 16:47 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on gitlab2001.wikimedia.org with reason: reimage
* 16:47 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on gitlab2001.wikimedia.org with reason: reimage
* 16:32 hnowlan: roll restarting maps cassandra instances for java updates
* 16:19 ayounsi@cumin2002: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 16:19 ayounsi@cumin2002: START - Cookbook sre.network.cf
* 16:18 ayounsi@cumin1001: END (FAIL) - Cookbook sre.network.cf (exit_code=99)
* 16:18 ayounsi@cumin1001: START - Cookbook sre.network.cf
* 16:18 ayounsi@cumin1001: END (FAIL) - Cookbook sre.network.cf (exit_code=99)
* 16:18 ayounsi@cumin1001: START - Cookbook sre.network.cf
* 15:07 hashar@deploy1002: Finished deploy [gerrit/gerrit@13cef9f]: Gerrit to 3.3.6 on gerrit1001 (duration: 00m 08s)
* 15:07 hashar@deploy1002: Started deploy [gerrit/gerrit@13cef9f]: Gerrit to 3.3.6 on gerrit1001
* 14:50 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:49 hashar@deploy1002: Finished deploy [gerrit/gerrit@13cef9f]: Gerrit to 3.3.6 on gerrit2001 (duration: 00m 10s)
* 14:49 hashar@deploy1002: Started deploy [gerrit/gerrit@13cef9f]: Gerrit to 3.3.6 on gerrit2001
* 14:48 hashar: Upgrading Gerrit replica to 3.3.6 # [[phab:T290236|T290236]]
* 14:48 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:34 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:31 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:56 jiji@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 13:46 jiji@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 13:39 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:30 jiji@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 13:29 jiji@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
* 13:29 hashar: restarting CI Jenkins for git plugin update
* 13:19 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 13:15 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:14 hashar: Upgraded CI Jenkins on contint2001
* 13:14 jiji@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 13:13 jiji@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 13:10 jiji@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 13:09 jiji@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 13:06 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 13:06 volans@cumin2002: END (PASS) - Cookbook sre.experimental.reimage (exit_code=0) for host sretest1001.eqiad.wmnet
* 13:05 jiji@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 13:05 jiji@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 12:56 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 12:56 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:51 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 12:40 volans@cumin2002: START - Cookbook sre.experimental.reimage for host sretest1001.eqiad.wmnet
* 12:16 moritzm: installing testvm2005
* 11:59 moritzm: installing openssl security updates for stretch (buster/bullseye already fixed)
* 11:52 Lucas_WMDE: EU backport+config window (aka UTC morning) done
* 11:52 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:50 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:725858{{!}}Enable Content and Section Translation to Kurdish WP (T290238)]] (duration: 01m 04s)
* 11:49 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:44 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/WikidataPageBanner/includes/WikidataPageBannerFunctions.php: Backport: [[gerrit:727188{{!}}Change PropertyId to NumericPropertyId (T289125, T292667)]] (duration: 01m 05s)
* 11:41 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:38 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:10 jbond: update puppet stdlib gerrit:726872
* 09:36 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:31 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:27 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts testvm2004.codfw.wmnet
* 09:26 mvernon@cumin2002: END (PASS) - Cookbook sre.experimental.reimage (exit_code=0) for host ms-be2045.codfw.wmnet
* 09:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host testvm2005.codfw.wmnet
* 09:19 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2004.codfw.wmnet
* 09:08 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2005.codfw.wmnet
* 08:49 mvernon@cumin2002: START - Cookbook sre.experimental.reimage for host ms-be2045.codfw.wmnet
* 08:36 moritzm: imported jenkins 2.303.2 to thirdparty/ci component for buster-wikimedia
* 07:57 Emperor: re-enabling puppet on ms-be2045 after hw work [[phab:T290881|T290881]]
* 07:39 jayme@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'sync'.
* 07:39 jayme@deploy1002: helmfile [eqiad] START helmfile.d/admin 'sync'.
* 07:38 jayme@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 07:37 jayme@deploy1002: helmfile [codfw] START helmfile.d/admin 'sync'.
* 07:34 jayme@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
* 07:33 jayme@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
* 07:32 jayme@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
* 07:31 jayme@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
* 06:21 ryankemper: [Elastic] Restart of `relforge` complete
* 06:05 ryankemper: [Elastic] Cluster in green status, proceeding to next and final node => `ryankemper@relforge1003:~$ sudo systemctl restart elasticsearch_6@relforge-eqiad-small-alpha.service && sudo systemctl restart elasticsearch_6@relforge-eqiad.service`
* 05:53 ryankemper: [Elastic] `ryankemper@relforge1004:~$ sudo systemctl restart elasticsearch_6@relforge-eqiad-small-alpha.service && sudo systemctl restart elasticsearch_6@relforge-eqiad.service`
* 05:48 ryankemper: [Elastic] Performing rolling restarts of `relforge`. `relforge1003` is the master so I'll restart `relforge1004` first to minimize disruption
* 03:00 ejegg: updated payments-wiki from {{Gerrit|23d0ffac66}} to {{Gerrit|6d3560d083}}
* 02:30 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:28 tstarling@deploy1002: Synchronized wmf-config/CommonSettings.php: enable Parsoid API everywhere (duration: 01m 04s)
* 02:27 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:11 mutante: [grafana2001:~] $ sudo systemctl start rsync-var-lib-grafana  because of "PROBLEM - Check systemd state on grafana2001 is CRITICAL: CRITICAL - degraded" because of some race condition where a file vanished during sync


== 2015-08-14 ==
== 2021-10-06 ==
* 23:47 logmsgbot: ori@tin Synchronized php-1.26wmf18/includes/libs/ReplacementArray.php: (no message) (duration: 00m 27s)
* 23:57 mutante: releases2002 - rm /srv/org/wikimedia/reprepro/conf/distributions - contains only jessie-mediawiki - see 725670 and EOL of MediaWiki 1.31
* 21:54 mutante: restarted nutcracker on mw1010
* 23:32 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:26 ori: deployed job runner 808d1ae08d40
* 23:29 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:15 ejegg: updated crm from 4f40ac6de0385982d8e672b1ed30ff1a2a2a2aa1 to fc0fcc8f5af262b56392d3f4f5998f8ea08c99a8
* 23:21 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:29 logmsgbot: ori@tin Synchronized php-1.26wmf18/includes/resourceloader/ResourceLoader.php: f72009a543: ResourceLoader: apply minify-js filter to config scripts (duration: 00m 13s)
* 23:21 jforrester@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:726955{{!}}Adding and use wordmark in ckbwiki (T288368)]] (duration: 01m 04s)
* 18:27 logmsgbot: ori@tin Synchronized php-1.26wmf18/extensions/MultimediaViewer: 9ee0437bc6: Updated mediawiki/core Project: mediawiki/extensions/MultimediaViewer  645b6c9e93fae13e09e5b493547aecc5a2e933ae (duration: 00m 12s)
* 23:20 jforrester@deploy1002: Synchronized static/images/mobile/copyright/wikipedia-wordmark-ckb.svg: Config: [[gerrit:726955{{!}}Adding and use wordmark in ckbwiki (T288368)]] (duration: 01m 04s)
* 18:24 ori: Repooling mw1041 now that T108601 is resolved.
* 23:18 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:59 yurik: deployed latest kartotherian
* 23:16 jforrester@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:726603{{!}}Enable NewUserMessage for ptwikivoyage (T290820)]] (duration: 01m 05s)
* 14:59 andrewbogott: rebooting labvirt1003
* 22:30 mutante: re-enabling puppet on mw*, an-worker* after deploying gerrit:726954. no issue this time
* 13:53 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Repool db1042 (vslow, dump) (duration: 00m 12s)
* 22:23 mutante: temp. disabling puppet on an-worker*, mw*
* 13:14 logmsgbot: reedy@tin Synchronized php-1.26wmf18/extensions/CirrusSearch/: Fix ElasticaQuery logspam (duration: 00m 13s)
* 20:50 mutante: global puppet failure - revert is merged, puppet run will recover on next run everywhere. partially forcing with cumin, partially letting it recover naturally
* 13:06 logmsgbot: reedy@tin Synchronized php-1.26wmf18/extensions/GeoData: Fix ElasticaQuery logspam (duration: 00m 13s)
* 20:43 mutante: [cumin1001:~] $ sudo cumin -b 5 -p 95 'mw2*' 'run-puppet-agent -q --failed-only'
* 13:06 logmsgbot: reedy@tin Synchronized php-1.26wmf18/extensions/Flow: Fix ElasticaQuery logspam (duration: 00m 13s)
* 19:08 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:30 jynus: Restarting db1042 after data import
* 19:06 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:11 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Load-balance db1036 roles (duration: 00m 11s)
* 19:05 brennen@deploy1002: Synchronized php: group1 wikis to 1.38.0-wmf.3  refs [[phab:T281167|T281167]] (duration: 01m 03s)
* 11:57 logmsgbot: reedy@tin Synchronized php-1.26wmf18/extensions/Translate: Stop calling deprecated Elastica function (duration: 00m 13s)
* 19:04 brennen@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.38.0-wmf.3  refs [[phab:T281167|T281167]]
* 08:26 akosiaris: upgraded and restarted apertium on sca100{1,2}
* 19:01 brennen: 1.38.0-wmf.3 train ([[phab:T281167|T281167]]): still unblocked after triage meeting, rolling to group1
* 08:11 akosiaris: uploaded to apt.wikimedia.org jessie-wikimedia: apertium-apy_0.1+svn~61425-1
* 18:52 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 07:10 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Fri Aug 14 07:10:37 UTC 2015 (duration 10m 36s)
* 18:50 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 06:34 Jamesofur: reset email/password for User:Auréola after multi factor user confirmation.
* 18:44 legoktm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Revert disabling static mapframes on eswiki (duration: 01m 14s)
* 02:35 logmsgbot: l10nupdate@tin LocalisationUpdate completed (1.26wmf18) at 2015-08-14 02:35:19+00:00
* 18:36 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:31 logmsgbot: l10nupdate@tin Synchronized php-1.26wmf18/cache/l10n: l10nupdate for 1.26wmf18 (duration: 06m 35s)
* 18:34 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 01:49 matt_flaschen: Resumed LQT->Flow conversion of mw:Project:Support_desk on mw1041
* 18:31 legoktm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: eswiki: Disable static mapframes ([[phab:T291736|T291736]]) (duration: 01m 17s)
* 01:46 logmsgbot: ori@tin Synchronized php-1.26wmf18/includes/OutputPage.php: I5e6c79c70: Optimize the order of styles and scripts in <head> (duration: 00m 12s)
* 18:25 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:22 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:22 legoktm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: viwikibooks: Set $wgRestrictDisplayTitle to false ([[phab:T289837|T289837]]) (duration: 01m 21s)
* 17:16 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:10 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:53 bd808@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 16:47 brennen@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.38.0-wmf.3  refs [[phab:T281167|T281167]]
* 16:47 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 16:43 brennen: 1.38.0-wmf.3 train ([[phab:T281167|T281167]]): unblocked, rolling to group0
* 16:41 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 16:35 brennen@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/Scribunto/includes/engines/LuaCommon/LanguageLibrary.php: Backport: [[gerrit:726596{{!}}Replace deprecated ParserOptions::getUser with ::getUserIdentity (T292589)]] (duration: 01m 04s)
* 16:35 jynus: stopping db1127 for hw maintenance [[phab:T292366|T292366]]
* 16:31 jynus@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1127.eqiad.wmnet with reason: hw maintenance
* 16:31 jynus@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1127.eqiad.wmnet with reason: hw maintenance
* 16:28 brennen@deploy1002: Synchronized php-1.38.0-wmf.2/extensions/Scribunto/includes/engines/LuaCommon/LanguageLibrary.php: Backport: [[gerrit:726597{{!}}Replace deprecated ParserOptions::getUser with ::getUserIdentity (T292589)]] (duration: 01m 10s)
* 16:22 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:19 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:11 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:08 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:01 volans@cumin2002: END (PASS) - Cookbook sre.experimental.reimage (exit_code=0) for host sretest1002.eqiad.wmnet
* 15:45 brennen: 1.38.0-wmf.3 train ([[phab:T281167|T281167]]): proceeding to deploy backports for [[phab:T292589|T292589]]
* 15:37 volans@cumin2002: START - Cookbook sre.experimental.reimage for host sretest1002.eqiad.wmnet
* 15:35 volans: installer spicerack 1.0.4 on cumin2002
* 12:50 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:48 volans: uploaded spicerack_1.0.4 to apt.wikimedia.org buster-wikimedia,bullseye-wikimedia
* 12:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host testvm2004.codfw.wmnet
* 12:23 jgiannelos@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 12:18 effie: pool mw1455 mw1422
* 12:17 urbanecm: wikiadmin@10.64.0.164(viwiki)> delete from growthexperiments_mentee_data; # cleanup after disabling mentor dashboard backend
* 12:16 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2004.codfw.wmnet
* 12:13 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|1aa67d4846f39f59127a835cb7a8ed2974506025}}: viwiki: Disable mentor dashboard backend ([[phab:T278920|T278920]]) (duration: 01m 06s)
* 12:02 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:59 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 11:56 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts testvm2003.codfw.wmnet
* 11:55 XioNoX: esams - Advertise 185.15.59.0/24 instead of 185.15.58.0/23 - [[phab:T288505|T288505]] - [[phab:T283050|T283050]]
* 11:46 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2003.codfw.wmnet
* 10:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2025.codfw.wmnet
* 10:52 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2025.codfw.wmnet
* 10:50 jelto: disable puppet on gitlab1001 to test puppetized code on GitLab replica - [[phab:T283076|T283076]]
* 10:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-replica1003.wikimedia.org
* 10:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-replica1003.wikimedia.org
* 10:06 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:04 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:04 urbanecm@deploy1002: Synchronized wmf-config/: {{Gerrit|01633739462f3bf09ae4e50b955454921ea4fbf9}}: Delete gettingstarted-with-category-suggestions dblist ([[phab:T235752|T235752]]; 2/2) (duration: 01m 05s)
* 10:01 urbanecm@deploy1002: Synchronized dblists/: {{Gerrit|01633739462f3bf09ae4e50b955454921ea4fbf9}}: Delete gettingstarted-with-category-suggestions dblist ([[phab:T235752|T235752]]; 1/2) (duration: 01m 04s)
* 09:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host copernicium.wikimedia.org
* 09:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host copernicium.wikimedia.org
* 09:19 jbond: update ipaddress6 fact - https://gerrit.wikimedia.org/r/c/operations/puppet/+/726625
* 09:15 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 09:13 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.2/extensions/Wikibase/repo/includes/ChangeModification/DispatchChangesJob.php: Backport: [[gerrit:725923{{!}}Don't fail job if subscribed wiki is unknown (T292446 T292440)]] (duration: 01m 15s)
* 09:13 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:29 volans@cumin2002: END (PASS) - Cookbook sre.experimental.reimage (exit_code=0) for host sretest1001.eqiad.wmnet
* 08:21 XioNoX: add ROAs for 185.15.58.0/24 and 185.15.59.0/24 - [[phab:T288505|T288505]] - [[phab:T283050|T283050]]
* 08:04 volans@cumin2002: START - Cookbook sre.experimental.reimage for host sretest1001.eqiad.wmnet
* 07:56 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript namespaceDupes.php plwikinews --fix # [[phab:T291344|T291344]]
* 07:56 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript namespaceDupes.php plwikinews # [[phab:T291344|T291344]]
* 07:55 urbanecm: mwdebug1001: scap pull ([[phab:T291344|T291344]] fix done)
* 07:51 urbanecm: Staging at mwdebug1001 for [[phab:T291344|T291344]]
* 05:53 kart_: Updated cxserver to use nodejs12 ([[phab:T290754|T290754]])
* 05:47 kartik@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'cxserver' for release 'production' .
* 05:39 kartik@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'cxserver' for release 'production' .
* 05:36 Amir1: start of mwscript extensions/Wikibase/repo/maintenance/pruneChanges.php --wiki wikidatawiki --number-of-days=2
* 05:31 kartik@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'cxserver' for release 'staging' .
* 04:54 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 04:49 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 04:29 ryankemper: [WDQS] `wdqs1012` is back up after restarting blazegraph (blazegraph was locked up)
* 04:27 ryankemper: [WDQS] `ryankemper@wdqs1012:~$ sudo systemctl restart wdqs-blazegraph` (attempting to bring downed `wdqs1012` back into health)
* 04:25 ryankemper: [WDQS] Repooling eqiad hosts following the brief outage from earlier: `wdqs1004`, `wdqs1006`, `wdqs1007`
* 03:19 eileen: civicrm revision changed from {{Gerrit|b6f5f71c18}} to {{Gerrit|82efd2e195}}, config revision is {{Gerrit|f4c57d4733}}
* 03:11 tstarling@deploy1002: Synchronized php-1.38.0-wmf.3/includes/CommentFormatter/RowCommentIterator.php: fix UBN [[phab:T292590|T292590]] (duration: 01m 04s)
* 01:39 legoktm: legoktm@mwmaint1002:~$ echo "https://en.wikiversity.org/static/images/mobile/copyright/wikiversity.svg" {{!}}mwscript purgeList.php
* 01:17 arlolra@deploy1002: Synchronized php-1.38.0-wmf.2/extensions/GlobalUserPage/includes/GlobalUserPage.php: Bump GlobalUserPage::PARSED_CACHE_VERSION for media DOM changes (duration: 01m 03s)
* 01:12 arlolra@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/GlobalUserPage/includes/GlobalUserPage.php: Bump GlobalUserPage::PARSED_CACHE_VERSION for media DOM changes (duration: 01m 17s)
* 00:59 arlolra@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Enable legacy media dom on metawiki (duration: 01m 05s)
* 00:37 arlolra@deploy1002: Synchronized php-1.38.0-wmf.2/includes/resourceloader/ResourceLoaderSkinModule.php: Add a separate config for content.media.less 2/2 (duration: 01m 03s)
* 00:35 arlolra@deploy1002: Synchronized php-1.38.0-wmf.2/includes/DefaultSettings.php: Add a separate config for content.media.less 1/2 (duration: 01m 03s)
* 00:32 arlolra@deploy1002: Synchronized php-1.38.0-wmf.3/includes/resourceloader/ResourceLoaderSkinModule.php: Add a separate config for content.media.less 2/2 (duration: 01m 03s)
* 00:29 arlolra@deploy1002: Synchronized php-1.38.0-wmf.3/includes/DefaultSettings.php: Add a separate config for content.media.less 1/2 (duration: 01m 04s)
* 00:16 mutante: puppetmasters: rm /etc/logrotate.d/geoipupdate && systemctl start logrotate && puppet agent -tv
* 00:14 mutante: puppetmaster2002 - rm /etc/logrotate.d/geoipupdate (not managed by puppet anymore but not removed, caused duplicate logrotate config, made logrotate service fail), start logrotate
* 00:08 cstone: civicrm revision changed from {{Gerrit|34d3c3aae8}} to {{Gerrit|b6f5f71c18}}
* 00:01 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:725132{{!}}Add WN as an alias to project namespace in Polish Wikinews (T291344)]] (duration: 01m 04s)


== 2015-08-13 ==
== 2021-10-05 ==
* 23:45 awight: rollback paymentswiki from 2e7b449224317779d53ff84527166c0d378a0a40 to 823393264d6795bbaec490ff86f17580f722e598
* 23:54 tgr@deploy1002: Synchronized static/images/mobile/copyright/wikiversity.svg: Config: [[gerrit:725413{{!}}Wikiversity Logo Update for 2017 Logo Version (T292109)]] (duration: 01m 03s)
* 23:16 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/228477/ and https://gerrit.wikimedia.org/r/#/c/231074/ (duration: 00m 12s)
* 23:47 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:704376{{!}}Adding and use wordmark in azwiki (T284877)]] (duration: 01m 04s)
* 23:15 awight: update paymentswiki-staging 99e3ce08117d18b15bc8138b447c4c21bd452d28 to 65b05fc11896325ae9749318b296c4396a64f649
* 23:44 tgr@deploy1002: Synchronized static/images/mobile/copyright/wikipedia-wordmark-az.svg: Config: [[gerrit:704376{{!}}Adding and use wordmark in azwiki (T284877)]] (duration: 01m 23s)
* 23:15 logmsgbot: krenair@tin Synchronized w/static/images/project-logos/mrwikibooks.png: https://gerrit.wikimedia.org/r/#/c/228477/ (duration: 00m 12s)
* 23:16 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:725386{{!}}Add image_suggestion_interaction event stream]] (duration: 01m 12s)
* 23:11 logmsgbot: krenair@tin Synchronized php-1.26wmf18/extensions/Flow: https://gerrit.wikimedia.org/r/#/c/231450/1 (duration: 00m 14s)
* 23:02 legoktm: deleting old stretch docker images from the registry for [[phab:T292485|T292485]]
* 23:03 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/227330/ (duration: 00m 12s)
* 22:24 brennen@deploy1002: rebuilt and synchronized wikiversions files: Revert group0 wikis to 1.38.0-wmf.2
* 22:35 awight: rollback payments-wiki-staging to 99e3ce08117d18b15bc8138b447c4c21bd452d28
* 22:20 brennen: 1.38.0-wmf.3 ([[phab:T281167|T281167]]) rolling back to testwikis for the day; will revisit in US-morning
* 22:29 awight: update paymentswiki from 823393264d6795bbaec490ff86f17580f722e598 to 2e7b449224317779d53ff84527166c0d378a0a40
* 20:47 brennen@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.38.0-wmf.3  refs [[phab:T281167|T281167]]
* 22:23 awight: update payments-wiki-staging from 99e3ce08117d18b15bc8138b447c4c21bd452d28 to 2e7b449224317779d53ff84527166c0d378a0a40
* 20:44 brennen@deploy1002: Synchronized php-1.38.0-wmf.3/includes/page: Backport: [[gerrit:726594{{!}}Pre-format comments for non-local files too]] ([[phab:T292570|T292570]]) (duration: 01m 04s)
* 22:14 Krenair: Running refreshLinks --dfn-only in a screen on terbium for T44180
* 20:18 mutante: puppetmaster1003 et al - converting maxmind geoip database fetching from cron to timers
* 21:45 awight: rollback payments-wiki-staging to 99e3ce08117d18b15bc8138b447c4c21bd452d28
* 20:06 mutante: cumin 'puppetmaster*' "disable-puppet '[[phab:T288844|T288844]] - [[phab:T273673|T273673]] - gerrit:721595 - $<nowiki>{</nowiki>USER<nowiki>}</nowiki>'"
* 21:45 awight: update payments-wiki-staging from 99e3ce08117d18b15bc8138b447c4c21bd452d28 to 96a369651c1130b0a8e53a6395f83c0b9329b9f8
* 19:30 mutante: restoring /home/amire80 from and to mwmaint2002 via Bacula bconsole ([[phab:T292573|T292573]])
* 21:43 logmsgbot: ori@tin Synchronized php-1.26wmf18/includes/OutputPage.php: Roll back: Test impact of I5e6c79c (duration: 00m 12s)
* 19:09 brennen@deploy1002: rebuilt and synchronized wikiversions files: Revert group0 wikis to 1.38.0-wmf.2
* 21:37 logmsgbot: ori@tin Synchronized php-1.26wmf18/includes/OutputPage.php: Test impact of I5e6c79c (duration: 00m 12s)
* 19:03 brennen@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.38.0-wmf.3  refs [[phab:T281167|T281167]]
* 21:16 mutante: torrus broken - doing https://wikitech.wikimedia.org/wiki/Torrus#Deadlock_problem
* 18:26 brennen@deploy1002: Pruned MediaWiki: 1.37.0-wmf.23 (duration: 01m 57s)
* 21:14 mutante: service gitblit restart on antimony (maybe that should be paging :)
* 18:23 brennen@deploy1002: Pruned MediaWiki: 1.37.0-wmf.21 (duration: 04m 20s)
* 21:06 bd808: `sudo /etc/init.d/ganglia-monitor restart` on logstash100[1-6] fixed ganglia data loss
* 18:21 brennen: 1.38.0-wmf.3 ([[phab:T281167|T281167]]): pruning old branches, starting with 1.37.0-wmf.21, proceeeding to 1.37.0-wmf.23 if time allows
* 20:46 mutante: killed ganglia aggregator for logstash on carbon
* 18:11 ppchelko@deploy1002: Synchronized wmf-config: Remove mb_strtoupper overrides for HHVM [[phab:T219279|T219279]] Php72ToUpper.php removal (duration: 01m 06s)
* 20:40 bd808: ganglia not getting elasticsearch jvm data for logstash cluster since 2015-08-13T12:00 -- https://ganglia.wikimedia.org/latest/?c=Logstash+cluster+eqiad&&m=es_heap_used
* 18:04 ppchelko@deploy1002: Synchronized wmf-config/CommonSettings.php: Remove mb_strtoupper overrides for HHVM [[phab:T219279|T219279]] CS.php (duration: 01m 06s)
* 19:56 logmsgbot: twentyafterfour@tin rebuilt wikiversions.cdb and synchronized wikiversions files: all wikis to 1.26wmf18
* 17:55 brennen@deploy1002: Finished scap: testwikis wikis to 1.38.0-wmf.3  refs [[phab:T281167|T281167]] (duration: 45m 59s)
* 19:38 logmsgbot: demon@tin Synchronized php-1.26wmf18/extensions/CirrusSearch/: (no message) (duration: 00m 11s)
* 17:12 btullis@cumin1001: END (PASS) - Cookbook sre.aqs.roll-restart (exit_code=0) for AQS aqs cluster: Roll restart of all AQS's nodejs daemons. - btullis@cumin1001
* 19:35 logmsgbot: demon@tin Synchronized php-1.26wmf18/extensions/Graph: (no message) (duration: 00m 10s)
* 17:09 brennen@deploy1002: Started scap: testwikis wikis to 1.38.0-wmf.3  refs [[phab:T281167|T281167]]
* 19:31 logmsgbot: demon@tin Synchronized php-1.26wmf18/extensions/Graph: (no message) (duration: 00m 11s)
* 17:03 btullis@cumin1001: START - Cookbook sre.aqs.roll-restart for AQS aqs cluster: Roll restart of all AQS's nodejs daemons. - btullis@cumin1001
* 18:32 logmsgbot: ori@tin Synchronized php-1.26wmf18/extensions/SyntaxHighlight_GeSHi/extension.json: If0851400: Fix-up for I2de8a400d: explicitly declare module position (duration: 00m 12s)
* 17:02 btullis@cumin1001: END (FAIL) - Cookbook sre.aqs.roll-restart (exit_code=99) for AQS aqs cluster: Roll restart of all AQS's nodejs daemons. - btullis@cumin1001
* 17:47 logmsgbot: kaldari@tin Synchronized wmf-config/InitialiseSettings.php: syncing InitialiseSettings for WikidataPageBanner (duration: 00m 12s)
* 17:02 btullis@cumin1001: START - Cookbook sre.aqs.roll-restart for AQS aqs cluster: Roll restart of all AQS's nodejs daemons. - btullis@cumin1001
* 17:46 logmsgbot: kaldari@tin Finished scap: (no message) (duration: 50m 05s)
* 16:56 brennen: successfully applied security patches for 1.38.0-wmf.3 train ([[phab:T281167|T281167]])
* 16:55 logmsgbot: kaldari@tin Started scap: (no message)
* 16:47 brennen: coordinated with deployment backup and starting train prep for 1.38.0-wmf.3 ([[phab:T281167|T281167]]), branched at {{Gerrit|65279490f82c785181b8b6961e40901a4aaafca4}}
* 16:54 logmsgbot: kaldari@tin Synchronized wmf-config/InitialiseSettings.php: syncing InitialiseSettings for WikidataPageBanner off (duration: 00m 11s)
* 15:57 jbond@cumin2002: END (PASS) - Cookbook sre.puppet.renew-cert (exit_code=0) for puppetboard2002.codfw.wmnet: Renew puppet certificate - jbond@cumin2002
* 16:51 logmsgbot: kaldari@tin Synchronized wmf-config/CommonSettings.php: syncing CommonSettings for WikidataPageBanner (duration: 00m 12s)
* 15:57 jbond@cumin2002: START - Cookbook sre.puppet.renew-cert for puppetboard2002.codfw.wmnet: Renew puppet certificate - jbond@cumin2002
* 16:50 logmsgbot: kaldari@tin Synchronized wmf-config/InitialiseSettings.php: syncing InitialiseSettings for WikidataPageBanner (duration: 00m 12s)
* 15:38 jbond: reimage puppetboard2002
* 16:31 logmsgbot: kaldari@tin Synchronized php-1.26wmf18/.gitmodules: (no message) (duration: 00m 11s)
* 15:15 jbond@cumin1001: END (PASS) - Cookbook sre.puppet.renew-cert (exit_code=0) for puppetboard1002.eqiad.wmnet: Renew puppet certificate - jbond@cumin1001
* 16:30 logmsgbot: kaldari@tin Synchronized php-1.26wmf17/.gitmodules: (no message) (duration: 00m 13s)
* 15:15 jbond@cumin1001: START - Cookbook sre.puppet.renew-cert for puppetboard1002.eqiad.wmnet: Renew puppet certificate - jbond@cumin1001
* 16:29 logmsgbot: kaldari@tin Synchronized php-1.26wmf17/extensions/WikidataPageBanner: (no message) (duration: 00m 12s)
* 15:10 moritzm: imported routinator 0.10.1-1bullseye to thirdparty/routinator for bullseye-wikimedia [[phab:T292503|T292503]]
* 16:29 logmsgbot: kaldari@tin Synchronized php-1.26wmf18/extensions/WikidataPageBanner: (no message) (duration: 00m 12s)
* 14:58 jbond: reimage puppetboard1002
* 16:24 logmsgbot: demon@tin Synchronized wmf-config/: undeploy wikigrok (duration: 00m 12s)
* 14:40 effie: depool  mw1455 and mw1422
* 16:12 yurik: sync deployed tilerator
* 14:30 Pchelolo: run foreachwiki uppercaseTitlesForUnicodeTransition.php --charmap current_to_php7_overrides.php [[phab:T219279|T219279]]
* 15:20 logmsgbot: thcipriani@tin Synchronized php-1.26wmf17/extensions/ContentTranslation/modules/tools/ext.cx.tools.images.js: SWAT: Images: validate image id before adapting to prevent js error [[gerrit:231229]] (duration: 00m 11s)
* 13:51 reedy@deploy1002: Synchronized wmf-config/CommonSettings.php: ExtensionDistributor - Drop REL1_31, start REL1_37 (duration: 00m 57s)
* 15:10 logmsgbot: thcipriani@tin Synchronized php-1.26wmf18/extensions/ContentTranslation/modules/tools/ext.cx.tools.images.js: SWAT: Images: validate image id before adapting to prevent js error [[gerrit:231230]] (duration: 00m 12s)
* 13:46 Pchelolo: run renameInvalidUsernames.php --wiki loginwiki --list /tmp/rename_users_for_uppercase_all.txt [[phab:T219279|T219279]]
* 15:04 andrewbogott: rebooting labvirt1002
* 13:39 elukey@cumin1001: END (FAIL) - Cookbook sre.aqs.roll-restart (exit_code=99) for AQS aqs cluster: Roll restart of all AQS's nodejs daemons. - elukey@cumin1001
* 13:36 jynus: kill custom query hiting s6 master from terbium. Use of a slave is required.
* 13:39 elukey@cumin1001: START - Cookbook sre.aqs.roll-restart for AQS aqs cluster: Roll restart of all AQS's nodejs daemons. - elukey@cumin1001
* 13:11 andrewbogott: graceful’d apache2 on labcontrol1001
* 13:23 ppchelko@deploy1002: Synchronized php-1.38.0-wmf.2/maintenance/uppercaseTitlesForUnicodeTransition.php: Backport uppercaseTitlesForUnicodeTransition.php maintenance script improvements [[phab:T219279|T219279]] (duration: 00m 58s)
* 12:42 andrewbogott: restarted keystone on labcontrol1001
* 12:53 ema: upload varnish 6.0.8-1wm1 to apt.wikimedia.org [[phab:T292290|T292290]]
* 08:16 _joe_: removing all stale aggregator configs from netmon1001
* 12:43 elukey: import AMD ROCm 4.2 to buster-wikimedia's thirdparty/amd-rocm42 - [[phab:T287267|T287267]]
* 08:15 godog: upgrade cassandra on restbase1009
* 12:24 ema: deployment-cache-text06: upgrade varnish to 6.0.8-1wm1 [[phab:T292290|T292290]]
* 08:11 godog: upgrade cassandra on restbase1006
* 11:58 hnowlan: reverted restbase2023 to use CN=hostname certificate due to loading errors
* 08:07 godog: upgrade cassandra on restbase1005
* 11:57 hnowlan@cumin1001: END (FAIL) - Cookbook sre.cassandra.roll-restart (exit_code=99) for nodes matching restbase2023.codfw.wmnet: Switching over to using FQDN certificate - hnowlan@cumin1001
* 08:00 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Thu Aug 13 08:00:40 UTC 2015 (duration 0m 39s)
* 11:57 hnowlan@cumin1001: START - Cookbook sre.cassandra.roll-restart for nodes matching restbase2023.codfw.wmnet: Switching over to using FQDN certificate - hnowlan@cumin1001
* 07:36 _joe_: killing all gmond instances on netmon1001, trying to fix ganglia-monitor-aggregator
* 11:37 hnowlan@cumin1001: END (FAIL) - Cookbook sre.cassandra.roll-restart (exit_code=99) for nodes matching restbase2023.codfw.wmnet: Switching over to using FQDN certificate - hnowlan@cumin1001
* 06:48 matt_flaschen: Stopped Support desk LQT->Flow conversion for tonight
* 11:28 hnowlan@cumin1001: START - Cookbook sre.cassandra.roll-restart for nodes matching restbase2023.codfw.wmnet: Switching over to using FQDN certificate - hnowlan@cumin1001
* 05:03 logmsgbot: ori@tin Synchronized php-1.26wmf17/includes/cache/MessageCache.php: 5f1ab59d31: MessageCache: derive the hash from the cache contents (duration: 00m 12s)
* 11:17 hnowlan_: disabling puppet on cassandra nodes for rollout of 724061 - defaulting to cn=fqdn certificates
* 05:02 logmsgbot: ori@tin Synchronized php-1.26wmf18/includes/cache/MessageCache.php: 5f1ab59d31: MessageCache: derive the hash from the cache contents (duration: 00m 12s)
* 11:15 effie: upgrade scap to 4.0.2 - [[phab:T291095|T291095]]
* 03:04 logmsgbot: l10nupdate@tin LocalisationUpdate completed (1.26wmf18) at 2015-08-13 03:04:52+00:00
* 11:12 urbanecm@deploy1002: Synchronized dblists/commonsuploads.dblist: {{Gerrit|04524992865b0ae5750eb6fb0a374aa74a65b383}}: Enable local uploads for tcywiki ([[phab:T166763|T166763]]) (duration: 00m 59s)
* 03:01 logmsgbot: l10nupdate@tin Synchronized php-1.26wmf18/cache/l10n: l10nupdate for 1.26wmf18 (duration: 06m 14s)
* 10:11 vgutierrez: update acme-chief to version 0.32 on acmechief hosts - [[phab:T290249|T290249]]
* 02:44 logmsgbot: l10nupdate@tin LocalisationUpdate completed (1.26wmf17) at 2015-08-13 02:44:49+00:00
* 10:09 vgutierrez: update acme-chief to version 0.32 on acmechief-test hosts - [[phab:T290249|T290249]]
* 02:38 logmsgbot: l10nupdate@tin Synchronized php-1.26wmf17/cache/l10n: l10nupdate for 1.26wmf17 (duration: 10m 47s)
* 10:06 vgutierrez: upload acme-chief 0.32 to apt.wm.o (buster) - [[phab:T290249|T290249]]
* 01:26 matt_flaschen: Restarted conversion of support desk from LQT->Flow using convertLqtPageOnLocalWiki.php, using hhvm on mw1041
* 09:46 hnowlan_: generated cassandra certificate using FQDN for restbase2023
* 01:16 logmsgbot: ori@tin Synchronized wmf-config/StartProfiler.php: I482b120289: Ensure all Xenon records begin with the script base name (duration: 00m 12s)
* 09:09 topranks: updating routinator on rpki2001 ([[phab:T291543|T291543]])
* 01:07 ori: Depooled mw1041 so it can be set aside for LQT->Flow conversion script (T108601)
* 08:59 dcausse: depool and restart blazegraph on wdqs1007
* 08:51 moritzm: installing openssl security updates for stretch (buster/bullseye already fixed)
* 07:58 moritzm: installing apache security updates
* 07:57 elukey: upgrade GPU drivers (AMD ROCm 4.3.1) on an-worker1[096-101]
* 07:27 ladsgroup@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 07:26 ladsgroup@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 07:26 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=wdqs1004.wmnet
* 06:38 elukey: reboot an-worker1096 after installing new GPU drivers
* 04:20 eileen: civicrm revision changed from {{Gerrit|d74e9aa0a1}} to {{Gerrit|34d3c3aae8}}, config revision is {{Gerrit|cae09f7691}}


== 2015-08-12 ==
== 2021-10-04 ==
* 23:55 ori: fluorine is struggling due to I941660b5; I'm fixing.
* 23:30 foks: resetting some emails used for abuse by a globally-banned user
* 23:54 logmsgbot: krenair@tin Synchronized php-1.26wmf17/extensions/WikimediaMaintenance: https://gerrit.wikimedia.org/r/#/c/231194/ (duration: 00m 12s)
* 23:19 jdrewniak@deploy1002: Synchronized portals: Wikimedia Portals Update: [[gerrit:726084{{!}} Bumping portals to master (T128546)]] (duration: 00m 59s)
* 23:47 logmsgbot: krenair@tin Synchronized php-1.26wmf18/extensions/WikimediaMaintenance: https://gerrit.wikimedia.org/r/#/c/231193/ (duration: 00m 12s)
* 23:18 jdrewniak@deploy1002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:726084{{!}} Bumping portals to master (T128546)]] (duration: 00m 59s)
* 23:44 logmsgbot: krenair@tin Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/231169/ (duration: 00m 12s)
* 23:13 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|75645c9cc59b37dbf59942eabbc014b7dc147626}}: Add explicit config for licensing/copyright message overrides ([[phab:T284097|T284097]]) (duration: 00m 59s)
* 23:16 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/231158/ (duration: 00m 11s)
* 23:05 mutante: [deneb:~] $ sudo systemctl start docker-reporter-releng-images
* 23:15 logmsgbot: krenair@tin Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/231158/ (duration: 00m 13s)
* 22:54 mutante: puppetmaster2001 - rm /etc/logrotate.d/geoipupdate_ipinfo  and geoipupdate_ipinfo ; running puppet, starting logrotate service
* 23:13 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/230729 (duration: 00m 13s)
* 18:13 jiji@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:02 logmsgbot: legoktm@tin Synchronized wmf-config: Isolate wikidata.org cookies and CORS policies (duration: 00m 12s)
* 16:51 bblack: rolling restart of haproxy for DoTLS on dns300[12],authdns1001,authdns2001 to recycle connections
* 22:21 matt_flaschen: Killed support desk conversion again to review XDebug information.
* 15:24 vgutierrez: pool cp5006
* 22:17 matt_flaschen: Resumed the support desk conversion.
* 15:17 ladsgroup@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 21:53 awight: updated paymentswiki from 325640bd70680a08ae77fd117433565634a98d88 to 99e3ce08117d18b15bc8138b447c4c21bd452d28
* 15:16 ladsgroup@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 20:42 subbu: deployed parsoid version a271c205
* 14:50 phuedx: phuedx@mwmaint1002:~$ mwscript extensions/SecurePoll/cli/purgeDecryptionKeys.php --wiki=votewiki --before="20210101000000"
* 20:23 milimetric: deployed the latest EventLogging master to eventlog1001
* 14:46 ladsgroup@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 14:34 godog: upgrade cassandra on restbase1008
* 14:46 effie: uploading scap 4.0.2 - [[phab:T291095|T291095]]
* 14:30 godog: upgrade cassandra on restbase1004
* 14:45 ladsgroup@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 14:22 godog: upgrade cassandra on restbase1003
* 14:39 brennen: gitlab: upgrade to 14.3.2 (note there was an additional patch release on 2021-10-01) complete ([[phab:T292256|T292256]])
* 13:00 akosiaris: disabled puppet on maps-test200X
* 14:25 Amir1: cleaning up wb_changes_subscription rows from closed wikis ([[phab:T292440|T292440]])
* 12:36 logmsgbot: aude@tin Synchronized arbitraryaccess.dblist: Enable arbitrary access on dewiki, frwiki, jawiki and s3 wikis (duration: 00m 12s)
* 14:24 brennen: gitlab: downtime for upgrade to 14.3.1
* 12:21 logmsgbot: aude@tin Synchronized wmf-config/Wikibase-labs.php: Add Wikisource badge config (duration: 00m 13s)
* 14:19 elukey: import AMD ROCm 4.3.1 packages in buster-wikimedia's thirdparty/amd-rocm431 - [[phab:T287267|T287267]]
* 12:20 logmsgbot: aude@tin Synchronized wmf-config/Wikibase-production.php: Add Wikisource badge config (duration: 00m 11s)
* 14:19 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:27 logmsgbot: jynus@tin Synchronized wmf-config/db-codfw.php: depool db2040 (duration: 00m 11s)
* 14:17 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 07:47 _joe_: restarted apertium-apy on sca1001 and sca1002, too many open files, probably leaking
* 14:13 ladsgroup@deploy1002: Synchronized wmf-config/Wikibase.php: Config: [[gerrit:725905{{!}}Explicitly enable dispatching and pruning for wikidata (T48643)]] (duration: 00m 58s)
* 06:12 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Wed Aug 12 06:12:31 UTC 2015 (duration 12m 30s)
* 14:08 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 05:54 matt_flaschen: Killed support desk conversion.  Will resume with profiling tomorrow.
* 14:06 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 04:45 bblack: starting slow "apt-get -y upgrade" on cp* (mostly, nginx -> +wmf2), will execute over ~18-24h
* 14:03 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on gitlab2001.wikimedia.org with reason: upgrade gitlab2001 to new version https://phabricator.wikmiedia.org/T292256
* 03:34 logmsgbot: krenair@tin Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/229608 (duration: 00m 12s)
* 14:03 jelto@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on gitlab2001.wikimedia.org with reason: upgrade gitlab2001 to new version https://phabricator.wikmiedia.org/T292256
* 03:06 ori: Installing xdebug on terbium so matt_flaschen can debug memory leak
* 14:01 ladsgroup@deploy1002: Synchronized wmf-config: Config: [[gerrit:725502{{!}}Enable dispatching via jobs everywhere (T48643)]] (duration: 01m 00s)
* 03:04 logmsgbot: l10nupdate@tin LocalisationUpdate completed (1.26wmf18) at 2015-08-12 03:04:47+00:00
* 12:57 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:58 logmsgbot: l10nupdate@tin Synchronized php-1.26wmf18/cache/l10n: l10nupdate for 1.26wmf18 (duration: 11m 13s)
* 12:56 ladsgroup@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:725785{{!}}Enable dispatching for wikidatawiki and commonswiki (T292088)]] (duration: 01m 00s)
* 02:30 logmsgbot: l10nupdate@tin LocalisationUpdate completed (1.26wmf17) at 2015-08-12 02:30:14+00:00
* 12:54 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:26 logmsgbot: l10nupdate@tin Synchronized php-1.26wmf17/cache/l10n: l10nupdate for 1.26wmf17 (duration: 06m 52s)
* 12:08 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 01:22 mutante: zirconium - shut down, i'm sure, mollyguard
* 12:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on ganeti2026.codfw.wmnet with reason: Ganeti tests
* 00:22 yurik: updated kartotherian
* 12:02 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on ganeti2026.codfw.wmnet with reason: Ganeti tests
* 00:09 mutante: restarted gitblit
* 12:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on ganeti2025.codfw.wmnet with reason: Ganeti tests
* 00:06 logmsgbot: ori@tin Synchronized php-1.26wmf17/extensions/Echo: Updated mediawiki/core Project: mediawiki/extensions/Echo  32e5bcf90c702 (duration: 00m 13s)
* 12:02 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on ganeti2025.codfw.wmnet with reason: Ganeti tests
* 00:06 logmsgbot: ori@tin Synchronized php-1.26wmf18/extensions/Echo: Updated mediawiki/core Project: mediawiki/extensions/Echo  3ab0b7e0f4948 (duration: 00m 12s)
* 12:01 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:02 matt_flaschen: Resumed convertLqtPageOnLocalWiki.php run on MediaWiki.org's Project:Support_desk.
* 11:55 urbanecm: EU B&C window done
* 11:55 urbanecm@deploy1002: Synchronized multiversion/MWWikiversions.php: {{Gerrit|508cf5cc6d213373f7c9ba1cdef142ebc8398022}}: Let DB expressions intersect DB lists ([[phab:T290609|T290609]]) (duration: 00m 58s)
* 11:50 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|a855078cf52d88cc2cd27a0adc7c6a680c80dd39}}: dewiki, nlwiki: Bump Growth features to 80% ([[phab:T288420|T288420]], [[phab:T285254|T285254]]) (duration: 00m 58s)
* 11:46 urbanecm@deploy1002: Synchronized private/PrivateSettings.php: {{Gerrit|5728376}}: Update [[phab:T250887|T250887]] mitigations (duration: 00m 58s)
* 11:44 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|b0a96bed4562bcc975187b1d34626201d407404b}}: Undeploy GettingStarted V: Remove now-obsolete logging channels ([[phab:T235752|T235752]]) (duration: 00m 59s)
* 11:42 urbanecm@deploy1002: Synchronized wmf-config/extension-list: {{Gerrit|9709bcfc8dacbcd1704471df08c31cec0711bea6}}: Undeploy GettingStarted IV: Dont build i18n ([[phab:T235752|T235752]]) (duration: 00m 58s)
* 11:39 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|d60f332785868797e7ecc9b5e410616d5604b392}}: Undeploy getting started III: Dont set wmgUseGettingStarted, now ignored ([[phab:T235752|T235752]]) (duration: 00m 58s)
* 11:37 urbanecm@deploy1002: Synchronized wmf-config/CommonSettings.php: {{Gerrit|9eaf960c4b7c304be57dfc8d248aca0c6501d04c}}: Undeploy GettingStarted II: Dont load regardless of config ([[phab:T235752|T235752]]) (duration: 00m 58s)
* 11:35 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|1c7405ad1eb323a8da524819f17d6f1a66afaa57}}: Undeploy GettingStarted I: Disable on all wikis ([[phab:T235752|T235752]]) (duration: 00m 58s)
* 11:31 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:724992{{!}}Remove deprecated SectionTranslationTargetLanguage config (T290302)]] (duration: 00m 58s)
* 11:22 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:725042{{!}}Add wikisource-bot.toolforge.org to Commons copy upload list (T292213)]] (duration: 00m 59s)
* 11:16 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:720058{{!}}Add IA-Upload tool domains to Commons wgCopyUploadsDomains (T287241)]] (duration: 00m 59s)
* 11:12 akosiaris@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mathoid' for release 'production' .
* 11:10 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mathoid' for release 'production' .
* 11:07 jiji@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:06 akosiaris@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'mathoid' for release 'staging' .
* 11:04 effie: depool  wtp1026 for tests
* 11:04 effie: pool  wtp1025
* 10:59 jiji@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 09:13 akosiaris: hbal -L -G row_C -X on ganeti01.svc.eqiad.wmnet
* 08:59 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@071f7c3] (eqiad): Increase mirrored traffic to 100% for eqiad (duration: 00m 54s)
* 08:58 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@071f7c3] (eqiad): Increase mirrored traffic to 100% for eqiad
* 07:37 joal@deploy1002: Finished deploy [analytics/refinery@38f3adc] (hadoop-test): Hotfix analytics deploy TEST [analytics/refinery@38f3adc] (duration: 06m 14s)
* 07:31 joal@deploy1002: Started deploy [analytics/refinery@38f3adc] (hadoop-test): Hotfix analytics deploy TEST [analytics/refinery@38f3adc]
* 07:30 joal@deploy1002: Finished deploy [analytics/refinery@38f3adc] (thin): Hotfix analytics deploy THIN [analytics/refinery@38f3adc] (duration: 00m 06s)
* 07:30 joal@deploy1002: Started deploy [analytics/refinery@38f3adc] (thin): Hotfix analytics deploy THIN [analytics/refinery@38f3adc]
* 07:29 joal@deploy1002: Finished deploy [analytics/refinery@38f3adc]: Hotfix analytics deploy [analytics/refinery@38f3adc] (duration: 19m 18s)
* 07:19 dcausse: restarting blazegraph on wdqs2001 & wdqs2004 (allocators burning too quickly)
* 07:18 elukey: depool + restart blazegraph + restart updater for wdqs1006
* 07:18 elukey@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=wdqs1006.wmnet
* 07:18 elukey@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=wdqs1004.wmnet
* 07:10 joal@deploy1002: Started deploy [analytics/refinery@38f3adc]: Hotfix analytics deploy [analytics/refinery@38f3adc]
* 07:02 godog: swift eqiad-prod: add weight to ms-be10[64-67] - [[phab:T290546|T290546]]
* 06:44 elukey: depool + restart blazegraph + restart updater on wdqs1004
* 05:50 ladsgroup@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 05:49 ladsgroup@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 05:47 ladsgroup@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'staging' .


== 2015-08-11 ==
== 2021-10-03 ==
* 23:19 logmsgbot: mattflaschen@tin Synchronized php-1.26wmf18/extensions/Flow/: Sync Flow 1.26wmf18 for memory leaks (duration: 00m 14s)
* 14:45 _joe_: restarting acmechief on acmechief1001
* 23:08 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/230805/ (duration: 00m 12s)
* 12:55 kormat@cumin1001: dbctl commit (dc=all): 'Depool db1127, bad ram', diff saved to https://phabricator.wikimedia.org/P17414 and previous config saved to /var/cache/conftool/dbconfig/20211003-125530-kormat.json
* 23:05 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/230804/ (duration: 00m 12s)
* 08:24 elukey: powercycle cp5006 (unresponsive to ssh, remote tty available but not able to login as root, no prometheus metrics in hours)
* 21:03 logmsgbot: twentyafterfour@tin rebuilt wikiversions.cdb and synchronized wikiversions files: group0 to 1.26wmf18
* 08:23 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp5006.eqsin.wmnet
* 21:02 twentyafterfour: deployed scap fixes for my dumb mistakes
* 20:10 logmsgbot: ori@tin Synchronized php-1.26wmf18/includes/objectcache/MultiWriteBagOStuff.php: 0acfe6a5bb: Fix argument handling in MultiWriteBagOStuff::get() (duration: 00m 12s)
* 19:28 logmsgbot: ori@tin Synchronized php-1.26wmf18/includes/resourceloader/ResourceLoader.php: I2089b21fc: ResourceLoader: make "cacheReport" option false by default (duration: 00m 13s)
* 19:28 logmsgbot: ori@tin Synchronized php-1.26wmf17/includes/resourceloader/ResourceLoader.php: I2089b21fc: ResourceLoader: make "cacheReport" option false by default (duration: 00m 11s)
* 19:26 logmsgbot: catrope@tin Synchronized php-1.26wmf18/extensions/Flow/modules/editor/editors/visualeditor/mw.flow.ve.Target.js: Fix missing editor switcher (duration: 00m 12s)
* 18:58 logmsgbot: twentyafterfour@tin Finished scap: again: sync new branch 1.26wmf18 and update testwiki (duration: 04m 58s)
* 18:53 logmsgbot: twentyafterfour@tin Started scap: again: sync new branch 1.26wmf18 and update testwiki
* 18:44 logmsgbot: twentyafterfour@tin scap failed: OSError [Errno 1] Operation not permitted: '/srv/mediawiki-staging/wikiversions.php' (duration: 29m 27s)
* 18:37 mutante: grafana switched to node krypton (jessie/VM)
* 18:21 bd808: logstash log event volume back to normal levels following elasticsearch upgrade
* 18:15 logmsgbot: twentyafterfour@tin Started scap: sync new branch 1.26wmf18 and update testwiki
* 18:06 bd808: logstash cluster recovered after upgrade of elasticsearch on logstash1006
* 18:03 bd808: upgraded elasticsearch to 1.7.1 on logstash1006; logstash-2015.08.11 shard recovering
* 18:02 bd808: upgrading elasticsearch on logstash1006
* 18:01 bd808: logstash cluster recovered after upgrade of elasticsearch on logstash1005
* 17:43 bd808: log event volume in logstash dropped dramatically again; seems to correlate with final recovery of logstash-2015.08.11 shard
* 17:29 bd808: upgraded elasticsearch to 1.7.1 on logstash1005; logstash-2015.08.11 shard recovering
* 17:28 bd808: upgrading elasticsearch on logstash1005
* 17:27 bd808: logstash event volume recovered after restarting all 3 logstash services
* 17:14 bd808: log event volume in logstash dropped dramatically at 16:49; investigating
* 17:13 bd808: logstash cluster recovered after upgrade of elasticsearch on logstash1004
* 16:42 bd808: upgraded elasticsearch to 1.7.1 on logstash1004; logstash-2015.08.11 shard recovering
* 16:42 mutante: restarted Apache on Etherpad
* 16:38 bd808: upgraded elasticsearch to 1.7.1 on logstash1003
* 16:37 bd808: upgraded elaasticsearch to 1.7.1 on logstash1002
* 16:36 bd808: upgraded elaasticsearch to 1.7.1 on logstash1001
* 16:23 bd808: logstash upgrade on logstash1003 complete
* 16:20 bd808: logstash upgrade on logstash1002 complete
* 16:16 bd808: logstash upgrade on logstash1001 complete
* 15:50 jynus: nuking db1002-db1007 on icinga
* 15:49 bd808: upgrading logstash on logstash1001
* 15:47 bd808: Trebuchet deploy of logstash/plugins: Add logstash-filter-prune 0.1.5 (36144b2)
* 15:36 bd808: Disabled puppet on logstash100[1-3] in preparation for upgrade to 1.5.3
* 15:31 logmsgbot: bd808@tin Synchronized wmf-config/InitialiseSettings.php: logging: Only send info and higher to logstash by default (4388a84) 2/2 (actually rebased this time) (duration: 00m 11s)
* 15:30 logmsgbot: bd808@tin Synchronized wmf-config/logging.php: logging: Only send info and higher to logstash by default (4388a84) 1/2 (actually rebased this time) (duration: 00m 11s)
* 15:17 logmsgbot: bd808@tin Synchronized wmf-config/InitialiseSettings.php: Touched wmf-config/InitialiseSettings.php (duration: 00m 13s)
* 15:12 logmsgbot: bd808@tin Synchronized wmf-config/InitialiseSettings.php: logging: Only send info and higher to logstash by default (4388a84) 2/2 (duration: 00m 12s)
* 15:11 logmsgbot: bd808@tin Synchronized wmf-config/logging.php: logging: Only send info and higher to logstash by default (4388a84) 1/2 (duration: 00m 12s)
* 10:45 jynus: general maintenance on db1042 (restart, upgrade, db reconstruction)
* 10:38 godog: upgrade cassandra on restbase1007
* 10:31 godog: upgrade cassandra on restbase1002
* 10:25 godog: upgrade cassandra on restbase1001
* 09:56 paravoid: switched routing-system autonomous-system to eqiad's subAS on cr1-eqiad/cr2--eqiad
* 09:09 godog: reboot ms-be2009, cpu soft lockup
* 05:27 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Tue Aug 11 05:27:33 UTC 2015 (duration 27m 32s)
* 02:27 logmsgbot: l10nupdate@tin LocalisationUpdate completed (1.26wmf17) at 2015-08-11 02:26:58+00:00
* 02:23 logmsgbot: l10nupdate@tin Synchronized php-1.26wmf17/cache/l10n: l10nupdate for 1.26wmf17 (duration: 06m 48s)
* 01:20 logmsgbot: ori@tin Synchronized php-1.26wmf17/includes/resourceloader/ResourceLoader.php: I2089b21fc: Revert resourceloader: Add must-revalidate to Cache-Control (duration: 00m 12s)
* 00:10 mutante: apache restart on krypton


== 2015-08-10 ==
== 2021-10-02 ==
* 23:53 logmsgbot: ebernhardson@tin Synchronized wmf-config/CirrusSearch-common.php: Limit the number of states in a cirrussearch query (duration: 00m 11s)
* 17:28 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 23:44 logmsgbot: krenair@tin Synchronized php-1.26wmf17/extensions/TimedMediaHandler/TimedMediaIframeOutput.php: https://gerrit.wikimedia.org/r/#/c/230656/ (duration: 00m 12s)
* 16:10 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 23:40 logmsgbot: ori@tin Synchronized multiversion/MWMultiVersion.php: I511999: Convert multiversion scripts to use wikiversions.php (duration: 00m 12s)
* 23:11 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 12s)
* 23:07 logmsgbot: krenair@tin Synchronized wmf-config: https://gerrit.wikimedia.org/r/#/c/228040/ (duration: 00m 14s)
* 23:04 ori: deployed scap a404a39b32... Build wikiversions.php in addition to wikiversions.cdb
* 22:22 mutante: restart gitblit
* 21:53 mutante: krypton unloaded mod proxy_balancer
* 21:30 awight: updated paymentswiki from af16d371f9c46d4f0b78986080f2a2be3226ace8 to 325640bd70680a08ae77fd117433565634a98d88
* 20:57 logmsgbot: ori Synchronized php-1.26wmf17/includes/cache/MessageCache.php: I2089b21fc: MessageCache: use APC for local caching, rather than files (duration: 00m 12s)
* 20:12 subbu: deployed parsoid version 7b554ce2f
* 19:57 yurik: synced new kartotherian
* 19:22 logmsgbot: ori Synchronized php-1.26wmf17/includes: I9a1aa76de: Moved ObjectCacheSessionHandler renewal logic to wfSetupSession() (duration: 00m 16s)
* 17:12 akosiaris: stopped postgres on maps-test200{2,3,4}
* 16:59 logmsgbot: ori Synchronized php-1.26wmf17/includes/OutputPage.php: I2089b21fc: Load mediawiki.legacy.commonPrint styles with a media type property (2/2) (duration: 00m 11s)
* 16:58 logmsgbot: ori Synchronized php-1.26wmf17/resources/Resources.php: I2089b21fc: Load mediawiki.legacy.commonPrint styles with a media type property (1/2) (duration: 00m 11s)
* 16:50 logmsgbot: ori Synchronized php-1.26wmf17/extensions/wikihiero: I2089b21fc: Updated mediawiki/core Project: mediawiki/extensions/wikihiero (duration: 00m 12s)
* 15:11 logmsgbot: thcipriani Synchronized php-1.26wmf17/extensions/ContentTranslation/api/ApiContentTranslationPublish.php: SWAT: Enable scrubWikitext=1 in HTML to wikitext conversion using parsoid [[gerrit:230381]] (duration: 00m 13s)
* 14:47 akosiaris: running an rsync from nas1001-a to local disks on helium
* 14:42 ottomata: restarted all varnishkafka instances to pick up proper confs (puppet should have done this!)
* 14:29 ottomata: starting upgrade of existing kafka cluster to 0.8.2.1 jessie - https://etherpad.wikimedia.org/p/kafka_0.8.2.1_migration2
* 12:38 bblack: deployed nginx-1.9.3-1+wmf2 to cp1065, cp1070, cp1071 (1x each text, upload, misc) for validation
* 11:16 logmsgbot: hoo Synchronized wmf-config/: Revert "Set dispatchBatchChunkFactor to 10 for now" (duration: 00m 12s)
* 11:16 logmsgbot: hoo Synchronized php-1.26wmf17/extensions/Wikidata/: Revert "Set dispatchBatchChunkFactor to 10 for now" (duration: 00m 20s)
* 09:35 paravoid: manually firewalled backup4001 TCP on neon to temporarily stop the nsca alert storm
* 09:12 logmsgbot: jynus Synchronized wmf-config/db-eqiad.php: depool db1042 for maintenance (duration: 00m 12s)
* 08:43 _joe_: manually running logrotate on iridium
* 07:44 godog: reboot ms-be2006, xfs hosed
* 07:27 akosiaris: rebooting backup4001
* 07:20 jynus: schema change on testwikidatawiki
* 06:52 logmsgbot: jynus Synchronized wmf-config/db-eqiad.php: Increase db1035 weight (duration: 00m 13s)
* 05:31 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Mon Aug 10 05:31:29 UTC 2015 (duration 31m 28s)
* 03:13 bblack: restarted apache2 on iridium JIC
* 03:13 bblack: rm /var/log/apache2/phabricator_access.log.1 on iridium (disk full, fixed for now)
* 02:23 logmsgbot: l10nupdate@tin LocalisationUpdate completed (1.26wmf17) at 2015-08-10 02:23:47+00:00
* 02:20 logmsgbot: l10nupdate Synchronized php-1.26wmf17/cache/l10n: l10nupdate for 1.26wmf17 (duration: 06m 43s)


== 2015-08-09 ==
== 2021-10-01 ==
* 17:57 urandom: issuing nodetool cleanup on restbase1006
* 23:19 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 12:56 logmsgbot: hoo Synchronized php-1.26wmf17/extensions/Wikidata/: Set dispatchBatchChunkFactor to 10 for now (duration: 00m 20s)
* 22:27 mutante: puppetmaster2001 - systemctl reset-failed
* 06:20 twentyafterfour: restarted phabricator phd (just in case - the full partition may have caused the daemons to be in a broken state)
* 22:16 mutante: puppetmaster2001 systemctl disable geoip_update_ipinfo.timer
* 06:17 twentyafterfour: moved some log files on iridium into /srv/logs to free space on /
* 22:15 mutante: puppetmaster2001 - sudo /usr/local/bin/geoipupdate_job after adding new shell command and timer - succesfully downloaded enterprise database for [[phab:T288844|T288844]]
* 05:12 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sun Aug  9 05:12:23 UTC 2015 (duration 12m 22s)
* 21:56 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 02:23 logmsgbot: l10nupdate@tin LocalisationUpdate completed (1.26wmf17) at 2015-08-09 02:23:22+00:00
* 21:44 mutante: puppetmasters - temp. disabling puppet one more time, now for a different deploy, to fetch an additional MaxMind database - [[phab:T288844|T288844]]
* 02:20 logmsgbot: l10nupdate Synchronized php-1.26wmf17/cache/l10n: l10nupdate for 1.26wmf17 (duration: 06m 32s)
* 21:19 mutante: puppetmaster2001 - puppet removed cron sync_volatile and cron sync_ca - starting and verifying new timers: 'systemctl status sync-puppet-volatile', 'systemctl status sync-puppet-ca' [[phab:T273673|T273673]]
* 21:12 mutante: puppetmaster1002, puppetmaster1003, puppetmaster2002, puppetmaster2003: re-enabled puppet, they are backends. backends don't have the sync cron/job/timer, so noop as well, just like 1004/1005/2004/2005. this just leaves the actual change on 2001  - [[phab:T273673|T273673]]
* 21:07 mutante: puppetmaster1004, puppetmaster1005, puppetmaster2004, puppetmaster2005: re-enabled puppet, they are "insetup" role
* 21:06 mbsantos@deploy1002: Finished deploy [kartotherian/deploy@d309a6e] (eqiad): tegola: reduce load to 50% during the weekend (duration: 00m 54s)
* 21:05 mbsantos@deploy1002: Started deploy [kartotherian/deploy@d309a6e] (eqiad): tegola: reduce load to 50% during the weekend
* 21:05 mutante: puppetmaster1001 - re-enabled puppet, noop as expected, the passive host pulls from the active one, so only 2001 has the cron/job/timer
* 21:05 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:02 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:01 legoktm@deploy1002: Synchronized wmf-config/CommonSettings.php: Revert "Have PdfHandler use Shellbox on Commons for 10% of requests" (duration: 00m 59s)
* 20:58 mutante: temp disabling puppet on puppetmasters - deploying gerrit:724115 (gerrit:723310) [[phab:T273673|T273673]]
* 18:58 robh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-db1002.eqiad.wmnet with reason: REIMAGE
* 18:56 robh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-db1001.eqiad.wmnet with reason: REIMAGE
* 18:55 robh@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on an-db1002.eqiad.wmnet with reason: REIMAGE
* 18:53 robh@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on an-db1001.eqiad.wmnet with reason: REIMAGE
* 18:07 robh@cumin1001: END (ERROR) - Cookbook sre.experimental.reimage (exit_code=97) for host an-db1001.eqiad.wmnet
* 18:05 robh@cumin1001: START - Cookbook sre.experimental.reimage for host an-db1001.eqiad.wmnet
* 17:58 effie: depool mw1025, mw1319, mw1312 for test
* 16:20 dancy: testing upcoming Scap 4.0.2 release on beta
* 14:04 bblack: C:envoyproxy (appservers and others): restarting envoyproxy
* 14:04 bblack: C:envoyproxy (appservers and others): ca-certificates updated via cumin to workaround [[phab:T292291|T292291]] issues
* 13:45 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 13:45 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 13:23 bblack: manually trying LE expired root workaround on mwdebug1001 with puppet disabled ...
* 13:12 gehel@cumin1001: START - Cookbook sre.wdqs.data-reload
* 13:11 gehel@cumin1001: END (ERROR) - Cookbook sre.wdqs.data-reload (exit_code=97)
* 13:11 gehel@cumin1001: START - Cookbook sre.wdqs.data-reload
* 13:10 gehel@cumin1001: START - Cookbook sre.wdqs.data-reload
* 11:42 jgiannelos@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 11:11 jynus: manually migrating some vms out of ganeti1009 to avoid excessive memory pressure
* 10:58 marostegui@cumin1001: dbctl commit (dc=all): 'db1164 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17413 and previous config saved to /var/cache/conftool/dbconfig/20211001-105849-root.json
* 10:57 marostegui@cumin1001: dbctl commit (dc=all): 'db1177 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17412 and previous config saved to /var/cache/conftool/dbconfig/20211001-105735-root.json
* 10:43 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@d4caf6d] (eqiad): Increase mirrored traffic to 100% for eqiad (duration: 00m 49s)
* 10:43 marostegui@cumin1001: dbctl commit (dc=all): 'db1164 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17411 and previous config saved to /var/cache/conftool/dbconfig/20211001-104345-root.json
* 10:43 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@d4caf6d] (eqiad): Increase mirrored traffic to 100% for eqiad
* 10:42 marostegui@cumin1001: dbctl commit (dc=all): 'db1177 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17410 and previous config saved to /var/cache/conftool/dbconfig/20211001-104232-root.json
* 10:28 marostegui@cumin1001: dbctl commit (dc=all): 'db1164 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17409 and previous config saved to /var/cache/conftool/dbconfig/20211001-102841-root.json
* 10:27 marostegui@cumin1001: dbctl commit (dc=all): 'db1177 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17408 and previous config saved to /var/cache/conftool/dbconfig/20211001-102728-root.json
* 10:13 marostegui@cumin1001: dbctl commit (dc=all): 'db1164 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17407 and previous config saved to /var/cache/conftool/dbconfig/20211001-101338-root.json
* 10:12 marostegui@cumin1001: dbctl commit (dc=all): 'db1177 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17406 and previous config saved to /var/cache/conftool/dbconfig/20211001-101224-root.json
* 10:00 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@c123ab9] (eqiad): Increase mirrored traffic to 80% for eqiad (duration: 00m 51s)
* 09:59 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@c123ab9] (eqiad): Increase mirrored traffic to 80% for eqiad
* 09:58 marostegui@cumin1001: dbctl commit (dc=all): 'db1164 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17405 and previous config saved to /var/cache/conftool/dbconfig/20211001-095834-root.json
* 09:57 marostegui@cumin1001: dbctl commit (dc=all): 'db1177 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17404 and previous config saved to /var/cache/conftool/dbconfig/20211001-095720-root.json
* 09:55 marostegui: Upgrade db1164 and db1177
* 09:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1177 and db1164 for upgrade', diff saved to https://phabricator.wikimedia.org/P17403 and previous config saved to /var/cache/conftool/dbconfig/20211001-095433-marostegui.json
* 09:49 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17402 and previous config saved to /var/cache/conftool/dbconfig/20211001-094913-root.json
* 09:49 marostegui@cumin1001: dbctl commit (dc=all): 'db1135 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17401 and previous config saved to /var/cache/conftool/dbconfig/20211001-094902-root.json
* 09:38 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/GrowthExperiments/maintenance/updateMenteeData.php --wiki=frwiki --force # to get an idea about timing for [[phab:T290609|T290609]], runs in a tmux session under my account
* 09:34 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17400 and previous config saved to /var/cache/conftool/dbconfig/20211001-093410-root.json
* 09:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1135 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17399 and previous config saved to /var/cache/conftool/dbconfig/20211001-093358-root.json
* 09:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host testvm2002.codfw.wmnet
* 09:19 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17398 and previous config saved to /var/cache/conftool/dbconfig/20211001-091906-root.json
* 09:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1135 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17397 and previous config saved to /var/cache/conftool/dbconfig/20211001-091854-root.json
* 09:04 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17396 and previous config saved to /var/cache/conftool/dbconfig/20211001-090402-root.json
* 09:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1135 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17395 and previous config saved to /var/cache/conftool/dbconfig/20211001-090351-root.json
* 09:02 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2002.codfw.wmnet
* 09:00 _joe_: restarting pybal low-traffic in eqiad to pick up the drop of proxyfetch to kubernetes services
* 08:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17394 and previous config saved to /var/cache/conftool/dbconfig/20211001-084859-root.json
* 08:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1135 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17393 and previous config saved to /var/cache/conftool/dbconfig/20211001-084847-root.json
* 08:44 marostegui: Upgrade db1135 and db1172
* 08:44 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1172 for upgrade', diff saved to https://phabricator.wikimedia.org/P17392 and previous config saved to /var/cache/conftool/dbconfig/20211001-084435-marostegui.json
* 08:44 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1135 for upgrade', diff saved to https://phabricator.wikimedia.org/P17391 and previous config saved to /var/cache/conftool/dbconfig/20211001-084411-marostegui.json
* 08:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db2080 [[phab:T290868|T290868]]', diff saved to https://phabricator.wikimedia.org/P17390 and previous config saved to /var/cache/conftool/dbconfig/20211001-084345-marostegui.json
* 08:17 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2026.codfw.wmnet
* 08:15 _joe_: restarting pybal in codfw to pick up config changes
* 08:09 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2026.codfw.wmnet
* 08:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on testvm[2001,2003].codfw.wmnet with reason: Ganeti tests
* 08:03 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 8:00:00 on testvm[2001,2003].codfw.wmnet with reason: Ganeti tests
* 06:28 marostegui@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17388 and previous config saved to /var/cache/conftool/dbconfig/20211001-062846-root.json
* 06:27 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 06:24 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17387 and previous config saved to /var/cache/conftool/dbconfig/20211001-062453-root.json
* 06:13 marostegui@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17386 and previous config saved to /var/cache/conftool/dbconfig/20211001-061342-root.json
* 06:13 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 06:09 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17385 and previous config saved to /var/cache/conftool/dbconfig/20211001-060949-root.json
* 05:58 marostegui@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17384 and previous config saved to /var/cache/conftool/dbconfig/20211001-055838-root.json
* 05:54 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17383 and previous config saved to /var/cache/conftool/dbconfig/20211001-055445-root.json
* 05:43 marostegui@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17382 and previous config saved to /var/cache/conftool/dbconfig/20211001-054335-root.json
* 05:39 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17381 and previous config saved to /var/cache/conftool/dbconfig/20211001-053942-root.json
* 05:28 marostegui@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17380 and previous config saved to /var/cache/conftool/dbconfig/20211001-052831-root.json
* 05:26 marostegui: Upgrade db1114
* 05:25 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1114 for upgrade', diff saved to https://phabricator.wikimedia.org/P17379 and previous config saved to /var/cache/conftool/dbconfig/20211001-052509-marostegui.json
* 05:24 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17378 and previous config saved to /var/cache/conftool/dbconfig/20211001-052438-root.json
* 05:22 marostegui: Upgrade db1119
* 05:21 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1119 for upgrade', diff saved to https://phabricator.wikimedia.org/P17377 and previous config saved to /var/cache/conftool/dbconfig/20211001-052133-marostegui.json
* 04:00 legoktm@deploy1002: Synchronized wmf-config/CommonSettings.php: Have PdfHandler use Shellbox on Commons for 10% of requests ([[phab:T289228|T289228]]) (duration: 00m 59s)
* 04:00 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 03:57 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 03:24 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 03:15 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .


== 2015-08-08 ==
== 2021-09-30 ==
* 14:58 urandom: issuing nodetool cleanup on restbase1005
* 23:57 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:57 urandom: issuing nodetool cleanup on restbase1007
* 23:54 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 05:30 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sat Aug  8 05:30:22 UTC 2015 (duration 30m 21s)
* 23:51 reedy@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Put a https protocol into values (duration: 01m 00s)
* 04:20 urandom: issuing nodetool cleanup on restbase1008
* 23:48 dpifke@deploy1002: Finished deploy [statsv/statsv@afeff42]: Deploy statsv with Kafka TLS support (not yet enabled) [[phab:T290131|T290131]] (duration: 00m 05s)
* 03:00 logmsgbot: hoo Synchronized php-1.26wmf17/extensions/Wikidata/: Hack: Don't write change rows where LENGTH(change_info) > 65500 (duration: 00m 21s)
* 23:48 dpifke@deploy1002: Started deploy [statsv/statsv@afeff42]: Deploy statsv with Kafka TLS support (not yet enabled) [[phab:T290131|T290131]]
* 02:26 logmsgbot: l10nupdate@tin LocalisationUpdate completed (1.26wmf17) at 2015-08-08 02:26:20+00:00
* 23:41 dpifke@deploy1002: Finished deploy [performance/coal@1be49f8]: Deploy Coal with Kafka TLS support (not yet enabled) [[phab:T290131|T290131]] (duration: 01m 07s)
* 02:23 logmsgbot: l10nupdate Synchronized php-1.26wmf17/cache/l10n: l10nupdate for 1.26wmf17 (duration: 06m 20s)
* 23:40 dpifke@deploy1002: Started deploy [performance/coal@1be49f8]: Deploy Coal with Kafka TLS support (not yet enabled) [[phab:T290131|T290131]]
* 01:53 hoo: Deleted change 237365841 as well
* 23:39 dpifke@deploy1002: Finished deploy [performance/navtiming@29264fb]: Deploy Navtiming with Kafka TLS support (not yet enabled) [[phab:T290131|T290131]] (duration: 00m 05s)
* 01:37 hoo: Deleted changes 237357747 and 237363245 from wikidata's wb_changes
* 23:39 dpifke@deploy1002: Started deploy [performance/navtiming@29264fb]: Deploy Navtiming with Kafka TLS support (not yet enabled) [[phab:T290131|T290131]]
* 01:32 logmsgbot: ori Synchronized php-1.26wmf17/resources/src/mediawiki.legacy/wikibits.js: I664ba9b0af: Override document.writeln to prevent it from blanking pages (duration: 00m 13s)
* 23:34 ejegg: updated Fundraising CiviCRM from {{Gerrit|d4da344274}} to {{Gerrit|d74e9aa0a1}}
* 22:09 legoktm@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'shellbox-media' for release 'main' .
* 22:07 legoktm@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'shellbox-media' for release 'main' .
* 22:06 legoktm@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'shellbox-media' for release 'main' .
* 21:53 legoktm@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'shellbox-timeline' for release 'main' .
* 21:06 eileen: civicrm revision changed from {{Gerrit|2ecb8f0bcd}} to {{Gerrit|d4da344274}}, config revision is {{Gerrit|77cb7ec866}}
* 20:54 ryankemper: [WCQS] `ryankemper@wcqs1003:~$ sudo pool` (merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/725110 to unbreak readiness probe)
* 20:54 topranks: Routinator on rpki1001 upgraded to  0.10.0 and working again after force refresh.
* 20:49 brennen: gitlab1001: upgrade to 14.2.5 complete
* 20:32 brennen: gitlab2001, gitlab1001: downtime for upgrades to 14.2.5
* 20:18 ryankemper: [WCQS] `ryankemper@wcqs1003:~$ sudo depool` (not sure why pybal can't depool it, the other 2 servers are pooled)
* 19:51 topranks: Updating routinator on rpki1001 [[phab:T291543|T291543]]
* 19:39 legoktm@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'shellbox-timeline' for release 'main' .
* 19:38 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:37 legoktm@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'shellbox-timeline' for release 'main' .
* 19:36 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:26 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:21 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:14 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.38.0-wmf.2  refs [[phab:T281166|T281166]]
* 19:11 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:08 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:07 thcipriani@deploy1002: Synchronized php-1.38.0-wmf.2/extensions/MobileFrontend: Backport: [[gerrit:724979{{!}}Fix search within pages alignment (T292107)]] (duration: 01m 09s)
* 19:05 thcipriani@deploy1002: Synchronized php-1.38.0-wmf.1/extensions/EventBus/includes/EventBus.php: Backport: [[gerrit:724481{{!}}Guard against undefined index notice when setting x-client-ip (T288853)]] (duration: 01m 09s)
* 19:04 thcipriani@deploy1002: Synchronized php-1.38.0-wmf.2/extensions/EventBus/includes/EventBus.php: Backport: [[gerrit:724480{{!}}Guard against undefined index notice when setting x-client-ip (T288853)]] (duration: 01m 09s)
* 18:58 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:58 thcipriani@deploy1002: Synchronized php-1.38.0-wmf.2/skins/Vector/resources/skins.vector.styles.legacy/components/MenuDropdown.less: Backport: [[gerrit:724798{{!}}Restore original more menu padding in legacy Vector (T289163)]] (duration: 01m 08s)
* 18:54 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:44 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:43 thcipriani@deploy1002: Scap failed!: Call to mwscript eval.php stderr: not empty
* 18:42 moritzm: imported gitlab 14.2.5 to thirdparty/gitlab [[phab:T292219|T292219]]
* 18:41 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:38 thcipriani@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:704167{{!}}Use Wikimania's logo in a new vector (T286405)]] Part III (duration: 01m 07s)
* 18:37 thcipriani@deploy1002: Synchronized static/images/mobile/copyright/wikimania-wordmark.svg: Config: [[gerrit:704167{{!}}Use Wikimania's logo in a new vector (T286405)]] Part II (duration: 01m 07s)
* 18:35 thcipriani@deploy1002: Synchronized static/images/mobile/copyright/wikimania.svg: Config: [[gerrit:704167{{!}}Use Wikimania's logo in a new vector (T286405)]] part I (duration: 01m 07s)
* 18:32 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:31 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:31 thcipriani@deploy1002: Synchronized wmf-config: Config: [[gerrit:724514{{!}}Enable sticky header on beta cluster (T289721)]] (duration: 01m 08s)
* 18:29 cmooney@cumin1001: START - Cookbook sre.dns.netbox
* 18:28 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:27 otto@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts thorium.eqiad.wmnet
* 18:22 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 18:20 thcipriani@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:724861{{!}}Disable legacy media dom on a few more wikis (T51097)]] (duration: 01m 08s)
* 18:07 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 17:49 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 17:49 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 17:49 otto@cumin1001: START - Cookbook sre.hosts.decommission for hosts thorium.eqiad.wmnet
* 17:42 bstorm: updating packages for thirdparty/kubeadm-k8s-1-20 and thirdparty/kubeadm-k8s-1-19 in stretch-wikimedia on apt1001 [[phab:T292131|T292131]]
* 17:09 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@8fbf87c] (eqiad): Increase mirrored traffic to 50% for eqiad (duration: 00m 55s)
* 17:08 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@8fbf87c] (eqiad): Increase mirrored traffic to 50% for eqiad
* 17:03 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@8fbf87c] (eqiad): Increase mirrored traffic to 50% for eqiad (duration: 00m 08s)
* 17:02 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@8fbf87c] (eqiad): Increase mirrored traffic to 50% for eqiad
* 17:02 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 17:00 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@8fbf87c] (eqiad): Increase mirrored traffic to 50% for eqiad (duration: 00m 11s)
* 17:00 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@8fbf87c] (eqiad): Increase mirrored traffic to 50% for eqiad
* 16:49 sukhe: restart dnsdist.service on doh[1001-1002,2001-2002,3001-3002,4001-4002,5001-5002].wikimedia.org
* 16:43 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@67a4d22]: Increase mirrored traffic to 10% (duration: 02m 33s)
* 16:40 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@67a4d22]: Increase mirrored traffic to 10%
* 16:38 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@67a4d22] (eqiad): Increase mirrored traffic to 10% (duration: 00m 40s)
* 16:37 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@67a4d22] (eqiad): Increase mirrored traffic to 10%
* 16:37 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:33 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:32 hnowlan: Ran `GRANT pg_monitor TO prometheus` for maps in eqiad and codfw to fix empty prometheus connection metrics
* 16:30 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@67a4d22] (eqiad): Increase mirrored traffic to 10% (duration: 00m 16s)
* 16:30 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@67a4d22] (eqiad): Increase mirrored traffic to 10%
* 16:24 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:20 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:11 ladsgroup@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:725032{{!}}Disable jQuery migrate in metawiki (T280944)]] (duration: 01m 09s)
* 16:08 ladsgroup@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:725019{{!}}Enable dispatching via job to 10 prod wikis]] (duration: 01m 09s)
* 15:55 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 14:36 elukey: drop /etc/helmfile-defaults/private/backup_old_paths from deploy1002 (old data not needed anymore)
* 14:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1111 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17374 and previous config saved to /var/cache/conftool/dbconfig/20210930-143325-root.json
* 14:30 marostegui@cumin1001: dbctl commit (dc=all): 'db1134 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17373 and previous config saved to /var/cache/conftool/dbconfig/20210930-143044-root.json
* 14:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1111 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17372 and previous config saved to /var/cache/conftool/dbconfig/20210930-141822-root.json
* 14:15 marostegui@cumin1001: dbctl commit (dc=all): 'db1134 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17370 and previous config saved to /var/cache/conftool/dbconfig/20210930-141540-root.json
* 14:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1111 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17369 and previous config saved to /var/cache/conftool/dbconfig/20210930-140318-root.json
* 14:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1134 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17368 and previous config saved to /var/cache/conftool/dbconfig/20210930-140037-root.json
* 13:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1111 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17367 and previous config saved to /var/cache/conftool/dbconfig/20210930-134815-root.json
* 13:45 marostegui@cumin1001: dbctl commit (dc=all): 'db1134 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17366 and previous config saved to /var/cache/conftool/dbconfig/20210930-134533-root.json
* 13:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host testvm2001.codfw.wmnet
* 13:40 jgiannelos@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 13:38 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 13:38 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 13:37 akosiaris@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 13:36 akosiaris@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
* 13:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1111 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17365 and previous config saved to /var/cache/conftool/dbconfig/20210930-133311-root.json
* 13:30 marostegui@cumin1001: dbctl commit (dc=all): 'db1134 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17364 and previous config saved to /var/cache/conftool/dbconfig/20210930-133029-root.json
* 13:29 marostegui: Upgrade db1111
* 13:28 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1111 for upgrade', diff saved to https://phabricator.wikimedia.org/P17363 and previous config saved to /var/cache/conftool/dbconfig/20210930-132831-marostegui.json
* 13:27 marostegui: Upgrade db1134
* 13:27 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1134 (s1) for upgrade', diff saved to https://phabricator.wikimedia.org/P17362 and previous config saved to /var/cache/conftool/dbconfig/20210930-132700-marostegui.json
* 13:26 marostegui: Upgrade db1133
* 13:26 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2001.codfw.wmnet
* 13:02 urbanecm: Start server-side upload for 2 video files ([[phab:T292096|T292096]], [[phab:T291492|T291492]])
* 13:01 marostegui@cumin1001: dbctl commit (dc=all): 'db1099:3311 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17361 and previous config saved to /var/cache/conftool/dbconfig/20210930-130116-root.json
* 13:01 marostegui@cumin1001: dbctl commit (dc=all): 'db1099:3318 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17360 and previous config saved to /var/cache/conftool/dbconfig/20210930-130109-root.json
* 12:46 marostegui@cumin1001: dbctl commit (dc=all): 'db1099:3311 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17359 and previous config saved to /var/cache/conftool/dbconfig/20210930-124612-root.json
* 12:46 marostegui@cumin1001: dbctl commit (dc=all): 'db1099:3318 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17358 and previous config saved to /var/cache/conftool/dbconfig/20210930-124606-root.json
* 12:31 Reedy: downloading files for [[phab:T290900|T290900]] in screen on mwmaint1002
* 12:31 marostegui@cumin1001: dbctl commit (dc=all): 'db1099:3311 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17357 and previous config saved to /var/cache/conftool/dbconfig/20210930-123109-root.json
* 12:31 marostegui@cumin1001: dbctl commit (dc=all): 'db1099:3318 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17356 and previous config saved to /var/cache/conftool/dbconfig/20210930-123101-root.json
* 12:18 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@35b9174]: tegola: remove mirror_threshold variable because of parsing errors (duration: 00m 17s)
* 12:18 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@35b9174]: tegola: remove mirror_threshold variable because of parsing errors
* 12:17 moritzm: adapted MX records to point to both mx1001.wikimedia.org and mx2001.wikimedia.org with equal weights [[phab:T286911|T286911]]
* 12:17 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@35b9174]: tegola: remove mirror_threshold variable because of parsing errors (duration: 00m 16s)
* 12:16 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@35b9174]: tegola: remove mirror_threshold variable because of parsing errors
* 12:16 marostegui@cumin1001: dbctl commit (dc=all): 'db1099:3311 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17355 and previous config saved to /var/cache/conftool/dbconfig/20210930-121605-root.json
* 12:15 marostegui@cumin1001: dbctl commit (dc=all): 'db1099:3318 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17354 and previous config saved to /var/cache/conftool/dbconfig/20210930-121558-root.json
* 12:14 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@35b9174]: tegola: remove mirror_threshold variable because of parsing errors (duration: 00m 15s)
* 12:13 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@35b9174]: tegola: remove mirror_threshold variable because of parsing errors
* 12:13 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@35b9174]: tegola: remove mirror_threshold variable because of parsing errors (duration: 00m 15s)
* 12:13 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@35b9174]: tegola: remove mirror_threshold variable because of parsing errors
* 12:11 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@35b9174]: tegola: remove mirror_threshold variable because of parsing errors (duration: 00m 10s)
* 12:10 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@35b9174]: tegola: remove mirror_threshold variable because of parsing errors
* 12:10 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@35b9174]: tegola: remove mirror_threshold variable because of parsing errors (duration: 00m 01s)
* 12:10 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@35b9174]: tegola: remove mirror_threshold variable because of parsing errors
* 12:03 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:01 marostegui@cumin1001: dbctl commit (dc=all): 'db1099:3311 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17353 and previous config saved to /var/cache/conftool/dbconfig/20210930-120102-root.json
* 12:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1099:3318 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17352 and previous config saved to /var/cache/conftool/dbconfig/20210930-120054-root.json
* 12:00 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:58 hnowlan: imported wikidiff2_1.13.0-1/php-wikidiff2_1.13.0-1_amd64.deb to buster-wikimedia component/php72
* 11:56 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1099 (s1 and s8) for upgrade', diff saved to https://phabricator.wikimedia.org/P17351 and previous config saved to /var/cache/conftool/dbconfig/20210930-115631-marostegui.json
* 11:51 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:47 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@0a38bc5]: tegola: use eqiad discovery endpoin (duration: 00m 03s)
* 11:47 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@0a38bc5]: tegola: use eqiad discovery endpoin
* 11:47 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@0a38bc5]: tegola: use eqiad discovery endpoin (duration: 00m 01s)
* 11:47 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@0a38bc5]: tegola: use eqiad discovery endpoin
* 11:47 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:46 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@0a38bc5]: tegola: use eqiad discovery endpoin (duration: 00m 01s)
* 11:46 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@0a38bc5]: tegola: use eqiad discovery endpoin
* 11:44 effie: downgrading scap to 3.17.1-1 on maps* hosts - [[phab:T291990|T291990]]
* 11:43 kartik@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:724732{{!}}Make reply tool available as opt-out almost everywhere (phase 3) (T288485)]] (duration: 01m 07s)
* 11:37 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:35 kartik@deploy1002: Synchronized php-1.38.0-wmf.2/extensions/DiscussionTools: Backport: [[gerrit:724789{{!}}Add a link to preferences within the Reply and New Discussion Tools (T291002)]] (duration: 01m 08s)
* 11:34 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:30 kartik@deploy1002: Synchronized php-1.38.0-wmf.1/extensions/DiscussionTools: Backport: [[gerrit:724788{{!}}Add a link to preferences within the Reply and New Discussion Tools (T291002)]] (duration: 01m 09s)
* 11:25 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:21 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:14 kartik@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:724458{{!}}Enable SectionTranslation in Igbo, Hausa, Yoruba Wikipedias (T290175)]] (duration: 01m 08s)
* 11:05 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 11:05 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 10:13 akosiaris: upgrade znuny to 6.0.37
* 10:06 godog: test bounce logstash on logstash1023
* 08:21 moritzm: installing nettle security updates on stretch
* 08:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host testvm2003.codfw.wmnet
* 07:49 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2003.codfw.wmnet
* 07:31 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@0a38bc5]: tegola: use eqiad discovery endpoin (duration: 00m 06s)
* 07:31 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@0a38bc5]: tegola: use eqiad discovery endpoin
* 07:03 legoktm@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'shellbox-syntaxhighlight' for release 'main' .
* 06:58 legoktm@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'shellbox-syntaxhighlight' for release 'main' .
* 06:56 legoktm@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'shellbox-syntaxhighlight' for release 'main' .
* 06:48 marostegui: Deploy schema change on s8 codfw (lag will show up) [[phab:T270620|T270620]]
* 06:01 marostegui: Deploy schema change on s1 codfw (lag will show up) [[phab:T270620|T270620]]
* 05:53 marostegui: Deploy schema change on s3 codfw (lag will show up) [[phab:T270620|T270620]]
* 05:52 marostegui: Deploy schema change on s7 codfw (lag will show up) [[phab:T270620|T270620]]
* 05:47 marostegui: Deploy schema change on s5 codfw (lag will show up) [[phab:T270620|T270620]]
* 05:45 marostegui: Deploy schema change on s4 codfw (lag will show up) [[phab:T270620|T270620]]
* 05:45 marostegui: Deploy schema change on s2 codfw (lag will show up) [[phab:T270620|T270620]]


== 2015-08-07 ==
== 2021-09-29 ==
* 21:00 logmsgbot: krenair Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/230227/ - should be a noop for prod (duration: 00m 12s)
* 23:20 bd808@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 19:40 logmsgbot: hoo Synchronized wmf-config/: Bump wgCacheEpoch for Wikidata (duration: 00m 13s)
* 23:05 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 19:29 logmsgbot: hoo Synchronized php-1.26wmf17/extensions/Wikidata/: Update Wikibase: Fix WB spinner, UnresolvedRedirectException handling on client (duration: 00m 21s)
* 23:02 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 18:08 gwicke: switched restbase1001 to CMS temporarily, to gather metrics; will switch back to G1GC tonight
* 22:10 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:09 yurik: synced kartotherian to maps-test* servers again, restarted the service
* 22:07 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:00 jynus: repool db1035 database with low traffic
* 21:57 legoktm@deploy1002: Synchronized php-1.38.0-wmf.2/extensions/timeline/includes/Timeline.php: Catch TimelineException from fixMap() ([[phab:T292126|T292126]]