You are browsing a read-only backup copy of Wikitech. The live site can be found at wikitech.wikimedia.org

Difference between revisions of "Server Admin Log"

From Wikitech-static
Jump to navigation Jump to search
imported>Labslogbot
(holmium: apt-get clean for a little disk space - /var/log/designate/designate-mdns.log is more than half the size of / - needs logrotate (mutante))
imported>Stashbot
(catrope@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Create alias for Appendix and Appendix_talk namespaces on mywiktionary (T291146) (duration: 00m 55s))
Line 1: Line 1:
== 2015-10-02 ==
== 2021-10-25 ==
* 22:46 mutante: holmium: apt-get clean for a little disk space - /var/log/designate/designate-mdns.log is more than half the size of / - needs logrotate
* 23:12 catrope@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Create alias for Appendix and Appendix_talk namespaces on mywiktionary ([[phab:T291146|T291146]]) (duration: 00m 55s)
* 21:33 Krinkle: mwscript cleanupRemovedModules.php --wiki zhwiktionary
* 23:10 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:21 Krinkle: mwscript cleanupRemovedModules.php --wiki testwikidatawiki
* 23:07 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:15 Krinkle: mwscript cleanupRemovedModules.php --wiki dewiki
* 22:57 ryankemper: [wcqs] Downtimed `wcqs*` until roughly a week from now (while we setup oauth)
* 21:15 Krinkle: mwscript cleanupRemovedModules.php --wiki nlwiki
* 22:53 legoktm: uploaded PHP 7.4.25 to apt.wm.o (DSA-4992-1)
* 21:15 Krinkle: mwscript cleanupRemovedModules.php --wiki test2wiki
* 22:44 ryankemper@deploy1002: Started deploy [wdqs/wdqs@e908052] (wcqs): Deploy 0.3.90 to WCQS
* 21:06 logmsgbot: ori@tin Synchronized wmf-config/CommonSettings.php: (no message) (duration: 00m 17s)
* 22:30 ryankemper@deploy1002: Finished deploy [wdqs/wdqs@13448f1] (wcqs): Deploy 0.3.90 to WCQS (duration: 03m 04s)
* 18:39 ottomata: kafka preferred-replica-election
* 22:27 ryankemper@deploy1002: Started deploy [wdqs/wdqs@13448f1] (wcqs): Deploy 0.3.90 to WCQS
* 17:53 logmsgbot: ori@tin Synchronized php-1.27.0-wmf.1/includes/resourceloader: I680f3fda66c5: Configure ResourceLoader-specific ObjectCache instance (duration: 00m 17s)
* 21:53 mutante: new project language "pwn" added - Paiwan is a native language of Taiwan, spoken by the Paiwan, a Taiwanese indigenous people. [[phab:T292415|T292415]]
* 17:53 ottomata: rolling restart of all hadoop-yarn-nodemanagers to pick up python3 + spark fix
* 21:52 mutante: new project language "ami" added - Sowal no 'Amis is the Formosan language of the 'Amis (or Ami), an indigenous people living along the east coast of Taiwan. - [[phab:T292414|T292414]]
* 17:12 logmsgbot: ori@tin Synchronized wmf-config/CommonSettings.php: I8318fe892: Configure MultiWriteBagOStuff for ResourceLoader (duration: 00m 17s)
* 21:50 mutante: log authdns1001 (DNS) - sudo authdns-update, add new project language "ami" (Amis) for [[phab:T292414|T292414]] - edited langlist.tmpl which regenerates all project zones
* 16:29 ori: deployed grafana 8e92884bae (with backport of upstream fb9f9548829f2d4cecf35cda933700e5c2fa1bd6)
* 21:40 mutante: authdns1001 (DNS) - sudo authdns-update, add new project language "pwn" (Paiwan) for [[phab:T292415|T292415]]
* 15:13 logmsgbot: ori@tin Synchronized php-1.27.0-wmf.1/resources/src/mediawiki/mediawiki.js: I8029208: Don't clobber existing styles when adding more in IE9 (duration: 00m 17s)
* 19:47 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on mw2255.codfw.wmnet with reason: DRAC upgrade
* 14:37 jynus: providing extra grants to wiki db users for heartbeat monitoring
* 19:47 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on mw2255.codfw.wmnet with reason: DRAC upgrade
* 14:32 Coren: Change of NFS mount options in labs pushed - puppet may report failures to refresh the mounts (once) on instances; expected and harmless.
* 19:47 mutante: mw2255 - depooled=inactive (incl "dsh groups"), shut down physically for [[phab:T283582|T283582]] - can be worked on anytime
* 11:03 moritzm: installed rpcbind security updates on all Ubuntu servers which runs it (jessie was already updated, since the DSA was released earlier)
* 19:45 dzahn@cumin1001: conftool action : set/pooled=inactive; selector: name=mw2255.codfw.wmnet
* 10:53 jynus: restarting HHVM on mw1130
* 19:45 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw2255.codfw.wmnet
* 08:45 hashar: restarting Nodepool to take in account changes made to the logging configuration https://gerrit.wikimedia.org/r/#/c/240986/
* 19:42 mutante: icinga - ACKing all unhandled CRIT alerts on hosts with "dev" or "test" in their name, regardless of notifications being disabled or not. just so that we get more signal than noise in actual unhandled CRITs in web UI
* 06:07 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Fri Oct  2 06:07:05 UTC 2015 (duration 7m 4s)
* 19:40 mutante: cumin2002 - sudo systemctl reset-failed to clear Icinga alert about failed but (now) non-existing service database-backups-snapshots.service, assuming it's a case of "only in active DC"
* 02:42 logmsgbot: tstarling@tin Synchronized php-1.27.0-wmf.1/extensions/ParsoidBatchAPI: stats (duration: 00m 17s)
* 19:12 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1112.eqiad.wmnet with reason: hardware fail
* 02:41 logmsgbot: l10nupdate@tin LocalisationUpdate completed (1.27.0-wmf.1) at 2015-10-02 02:41:33+00:00
* 19:12 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 4:00:00 on db1112.eqiad.wmnet with reason: hardware fail
* 02:36 logmsgbot: l10nupdate@tin Synchronized php-1.27.0-wmf.1/cache/l10n: l10nupdate for 1.27.0-wmf.1 (duration: 08m 04s)
* 19:07 kormat@cumin1001: dbctl commit (dc=all): 'Temporarily move mw groups to db1123 [[phab:T294295|T294295]]', diff saved to https://phabricator.wikimedia.org/P17597 and previous config saved to /var/cache/conftool/dbconfig/20211025-190717-kormat.json
* 00:44 logmsgbot: ori@tin Synchronized php-1.27.0-wmf.1/includes/resourceloader: I21bb3f08e7f and follow-ups (duration: 00m 18s)
* 19:06 mutante: db1112 - powercycling
* 00:43 logmsgbot: ori@tin Synchronized php-1.27.0-wmf.1/vendor: Add wikimedia/relpath 1.0.3 (duration: 00m 21s)
* 19:04 legoktm@cumin1001: dbctl commit (dc=all): 'Depool db1112 ([[phab:T294295|T294295]])', diff saved to https://phabricator.wikimedia.org/P17596 and previous config saved to /var/cache/conftool/dbconfig/20211025-190436-legoktm.json
* 18:41 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:40 jforrester@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/timeline/includes/Timeline.php: Backport: [[gerrit:734312{{!}}Input may be null when rendering a self-closing tag `<timeline />` (T294020)]] (duration: 00m 55s)
* 18:38 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:28 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:25 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:24 jforrester@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:732971{{!}}Fix some easy codestyle issues]] (duration: 00m 55s)
* 18:22 jforrester@deploy1002: Synchronized w/static.php: Config: [[gerrit:732971{{!}}Fix some easy codestyle issues]] (duration: 00m 54s)
* 18:19 jforrester@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:732840{{!}}Fix array declaration of NS_USER_TALK abbreviation on ruwikiquote (T197058)]] (duration: 00m 55s)
* 18:16 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:15 jforrester@deploy1002: Synchronized wmf-config/flaggedrevs.php: Config: [[gerrit:732836{{!}}flaggedrevs: Drop legacy wgFlaggedRevsStatsAge config, no longer read]] (duration: 00m 55s)
* 18:13 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:12 jforrester@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:732254{{!}}Make reply tool available as opt-out on frwiki (T293687)]] (duration: 00m 56s)
* 17:41 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw2253.codfw.wmnet
* 17:40 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw2253.codfw.wmnet
* 17:39 mutante: mw2253 - scap pull after hw maintenance is over
* 17:32 bd808@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 17:26 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 17:24 mmandere@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:23 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 17:22 XioNoX: update core routers ACLs
* 17:20 mmandere@cumin2002: START - Cookbook sre.dns.netbox
* 16:49 XioNoX: update management routers ACLs
* 16:36 XioNoX: DNS - Add eqsin-ulsfo transport v6 prefix - [[phab:T273308|T273308]]
* 16:31 mmandere@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:28 mmandere@cumin2002: START - Cookbook sre.dns.netbox
* 16:25 accraze@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 16:25 mmandere@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 16:21 mmandere@cumin2002: START - Cookbook sre.dns.netbox
* 16:12 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:10 dzahn@cumin1001: conftool action : set/pooled=inactive; selector: name=mw2253.codfw.wmnet
* 16:09 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:08 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings-labs.php: Config: [[gerrit:734298{{!}}Empty wikibase disabled access entity types on Beta (T294159)]] (beta-only) (duration: 01m 47s)
* 16:04 mmandere@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:01 mmandere@cumin2002: START - Cookbook sre.dns.netbox
* 15:57 jdrewniak@deploy1002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:734328{{!}} Bumping portals to master (T128546)]] (duration: 01m 52s)
* 15:55 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:52 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:49 jdrewniak@deploy1002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:734328{{!}} Bumping portals to master (T128546)]] (duration: 01m 54s)
* 15:46 jbond: upgrade cas/idp to 6.4.2
* 14:56 mutante: mw2253 - shut down and downtimed for 2 days
* 14:50 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on mw2253.codfw.wmnet with reason: DRAC upgrade
* 14:50 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on mw2253.codfw.wmnet with reason: DRAC upgrade
* 14:49 mutante: depooling mw2253 for DRAC upgrade ([[phab:T283582|T283582]])
* 14:48 dzahn@cumin1001: conftool action : set/pooled=no; selector: name=mw2253.codfw.wmnet
* 14:45 jbond: update cas package
* 14:31 marostegui: Deploy schema change on s3 codfw - [[phab:T291719|T291719]]
* 12:04 ema: cp3062: upgrade varnish to 6.0.8-1wm2 [[phab:T293879|T293879]]
* 11:57 ema: deployment-cache-text06: upgrade varnish to 6.0.8-1wm2 [[phab:T293879|T293879]]
* 11:40 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:36 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:24 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:24 Lucas_WMDE: UTC morning backport+config window done
* 11:22 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/Wikibase.php: Config: [[gerrit:732969{{!}}Remove dispatchLagToMaxLagFactor Wikibase setting (T292604)]] (duration: 00m 54s)
* 11:20 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:18 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/Wikibase.php: Config: [[gerrit:732951{{!}}Remove wikibaseDispatchRedisLockManager config (T292604)]] (duration: 00m 54s)
* 11:14 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:732950{{!}}Remove wmg variables for dispatchChanges.php Wikibase settings (T292604)]] (duration: 00m 55s)
* 11:10 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:09 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/Wikibase.php: Config: [[gerrit:732949{{!}}Remove dispatchChanges.php-related Wikibase settings (T292604)]] (duration: 00m 55s)
* 11:07 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:05 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/Wikibase.php: Config: [[gerrit:732372{{!}}Remove dispatchViaJobs-related Wikibase settings (T291828)]] (duration: 00m 56s)
* 09:52 godog: bounce uwsgi graphite web on graphite2003 - [[phab:T294220|T294220]]
* 09:52 volans@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:48 volans@cumin1001: START - Cookbook sre.dns.netbox
* 09:43 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings-labs.php: Config: [[gerrit:733089{{!}}[BETA CLUSTER] Enable WikibaseLexeme Scribunto access (T294159)]] (merged on Friday, syncing now to avoid outdated files even if it’s just -labs.php) (duration: 00m 55s)
* 09:18 godog: bounce graphite-web on graphite2003 to test timeout bump - [[phab:T294220|T294220]]
* 08:08 XioNoX: merge DNS changes to add drmrs
* 07:50 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 07:50 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 05:47 oblivian@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=eqiad,cluster=parsoid,name=wtp1026.*
* 05:43 _joe_: pooling wtp1042 [[phab:T294212|T294212]]
* 05:26 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1109.eqiad.wmnet with OS buster
* 05:01 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1109.eqiad.wmnet with OS buster
* 04:30 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1109 (s8) for reimage [[phab:T290868|T290868]]', diff saved to https://phabricator.wikimedia.org/P17590 and previous config saved to /var/cache/conftool/dbconfig/20211025-043028-marostegui.json


== 2015-10-01 ==
== 2021-10-23 ==
* 22:04 RoanKattouw: Running FlowFixLinks.php on all wikis
* 16:40 dcausse: restarting blazegraph on wdqs1004 and wdqs1006 (free allocators alert)
* 20:33 subbu: deployed parsoid version 62971510b
* 15:45 urbanecm: Start server-side upload for 1 video file ([[phab:T289781|T289781]]), testing whether [[phab:T291137|T291137]] is still an issue
* 19:33 twentyafterfour: re-deployed robots.txt patch and restarted apache on iridium (to expand the phabricator robots.txt)
* 18:22 logmsgbot: ori@tin Synchronized php-1.27.0-wmf.1/includes/resourceloader: Ic1d802ee2: ResourceLoader: cache minified user and site modules (duration: 00m 17s)
* 18:04 logmsgbot: twentyafterfour@tin rebuilt wikiversions.cdb and synchronized wikiversions files: wikipedia wikis to 1.27.0-wmf.1
* 16:40 ottomata: brought analytics1049 back into hadoop after missing a disk since sept. 25th
* 16:34 andrewbogott: test log
* 16:26 moritzm: installed PHP security updates on all precise/trusty systems (the respective DSA for jessie is already deployed, it was released three weeks ago)
* 16:26 valhallasw`cloud: testing, testing
* 15:56 logmsgbot: krenair@tin Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/240049 (duration: 00m 17s)
* 15:56 cmjohnson1: db1051 replacing failed disk slot 6
* 15:51 logmsgbot: mattflaschen@tin Synchronized php-1.27.0-wmf.1/extensions/ContentTranslation/modules/dashboard/styles/ext.cx.dashboard.less: Fix: Clicking on down arrow in language selector should trigger ULS (duration: 00m 17s)
* 15:48 logmsgbot: mattflaschen@tin Synchronized wmf-config/InitialiseSettings.php: Enable CX suggestions in ar, eo, hi, nl, vi and dawiki (duration: 00m 17s)
* 15:47 cmjohnson1: analytics1049 replacing failed disk /dev/sdi at slot 7
* 15:46 logmsgbot: mattflaschen@tin Synchronized php-1.27.0-wmf.1/tests/phpunit/includes/objectcache/BagOStuffTest.php: Memcached key decode fix (duration: 00m 18s)
* 15:45 logmsgbot: mattflaschen@tin Synchronized php-1.27.0-wmf.1/includes/objectcache/MemcachedBagOStuff.php: Memcached key decode fix (duration: 00m 17s)
* 15:44 logmsgbot: mattflaschen@tin Synchronized php-1.26wmf24/tests/phpunit/includes/objectcache/BagOStuffTest.php: Memcached key decode fix (duration: 00m 17s)
* 15:43 logmsgbot: mattflaschen@tin Synchronized php-1.26wmf24/includes/objectcache/MemcachedBagOStuff.php: Memcached key decode fix (duration: 00m 18s)
* 15:33 cmjohnson1: replacing failed disk ms-be1012 /dev/sdf slot 5
* 15:09 matt_flaschen: Did final run of convertAllLqtPages.php on sewikimedia immediately before freezing LQT
* 15:08 logmsgbot: mattflaschen@tin Synchronized wmf-config/InitialiseSettings.php: Freeze LQT on se.wikimedia (duration: 00m 18s)
* 14:58 moritzm: installed PHP security updates on all precise/trusty systems (the respective DSA for jessie is already deployed, it was released three weeks ago)
* 14:19 andrewbogott: log test
* 14:17 andrewbogott: testing the log
* 14:07 _joe_: installing the new hhvm package to all the canaries
* 13:41 hashar: Am I logging?
* 13:32 _joe_: uploaded hhvm_3.6.5+dfsg1-1+wm7
* 13:03 bblack: repooling cp1046 (eqiad mobile) with caches wiped clean just before
* 12:24 akosiaris: disable SessionRemoteIPcheck on mendelevium's OTRS installation for checking
* 12:18 logmsgbot: aude@tin Finished scap: Put Wikidata extension back on wmf/1.27.0-wmf.1 (duration: 30m 46s)
* 12:09 moritzm: installed PHP security updates on all precise/trusty systems (the respective DSA for jessie is already deployed, it was released three weeks ago)
* 11:47 logmsgbot: aude@tin Started scap: Put Wikidata extension back on wmf/1.27.0-wmf.1
* 10:35 logmsgbot: hoo@tin Synchronized php-1.27.0-wmf.1/extensions/: Use 1.27.0-wmf.1 for Wikidata again after fixing T114290 (duration: 01m 07s)
* 10:20 akosiaris: uploaded php5_5.3.10-1ubuntu3.20+wmf1 on apt.wikimedia.org precise-wikimedia
* 09:51 godog: reboot praseodymium to test cassandra systemd unit
* 09:47 godog: stop puppet on restbase* in preparation for    https://gerrit.wikimedia.org/r/242548
* 08:59 akosiaris: disabling puppet on maps-test200* in expectance of https://gerrit.wikimedia.org/r/242548
* 06:56 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Thu Oct  1 06:56:51 UTC 2015 (duration 56m 50s)
* 03:21 logmsgbot: l10nupdate@tin LocalisationUpdate completed (1.27.0-wmf.1) at 2015-10-01 03:21:39+00:00
* 03:14 logmsgbot: l10nupdate@tin Synchronized php-1.27.0-wmf.1/cache/l10n: l10nupdate for 1.27.0-wmf.1 (duration: 10m 38s)
* 02:47 logmsgbot: l10nupdate@tin LocalisationUpdate completed (1.26wmf24) at 2015-10-01 02:47:31+00:00
* 02:40 logmsgbot: l10nupdate@tin Synchronized php-1.26wmf24/cache/l10n: l10nupdate for 1.26wmf24 (duration: 10m 54s)
* 01:30 logmsgbot: ori@tin Synchronized php-1.27.0-wmf.1/includes/resourceloader/ResourceLoader.php: 1cfe27030e: Change load.php to minify per-module instead of per-request (duration: 00m 17s)
* 01:29 logmsgbot: ori@tin Synchronized php-1.26wmf24/includes/resourceloader/ResourceLoader.php: 1cfe27030e: Change load.php to minify per-module instead of per-request (duration: 00m 17s)
* 00:44 mutante: applying puppet fix on fermium (illegal byte sequence in utf-8) as in T114289#1691038 but for other languages


== 2015-09-30 ==
== 2021-10-22 ==
* 23:00 logmsgbot: twentyafterfour@tin rebuilt wikiversions.cdb and synchronized wikiversions files: group1 wikis to 1.27.0-wmf.1
* 23:17 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:55 logmsgbot: twentyafterfour@tin Synchronized php-1.27.0-wmf.1/: deploying several fixes to the branch (duration: 01m 52s)
* 23:13 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:44 ottomata: perf testing eventlogging in production by hammering https://bits.wikimedia.org/beacon/event.gif
* 20:57 bblack: re-pooling eqiad in DNS
* 22:13 logmsgbot: ori@tin Synchronized php-1.27.0-wmf.1/includes/User.php: 0ed7cc8526: Made User::loadFromId() skip cache with READ_LATEST (duration: 00m 17s)
* 20:54 legoktm: <XioNoX> I disabled the interface on cr1, going to re-enabled the active on on cr2
* 22:12 logmsgbot: ori@tin Synchronized php-1.26wmf24/includes/User.php: 0ed7cc8526: Made User::loadFromId() skip cache with READ_LATEST (duration: 00m 17s)
* 20:48 legoktm: bblack has temporarily depooled eqiad https://gerrit.wikimedia.org/r/733043
* 21:36 cwdent: updated payments from bc4bcc44d2337d7a69c5a39f11ff45efdf0c8e11 to 24d5be6886d34b3600031290c7f55ee84f3dcee2
* 20:41 XioNoX: disable sessions to equinix eqiad IXP
* 21:27 logmsgbot: ori@tin Synchronized php-1.26wmf24/includes/resourceloader/ResourceLoaderFileModule.php: a1e1619461: Fix LESS file dependency tracking in ResourceLoader (duration: 00m 17s)
* 19:17 urbanecm: Start server-side upload of 1 video file ([[phab:T294134|T294134]])
* 20:43 cscott: updated Parsoid to version 39c60c67
* 15:06 jbond: upload puppetboard_3.1.0-1_all.deb to ullseye-wikimedia
* 20:40 mutante: fixing puppet run on fermium, needs manual fix because puppet cant replace existing illegal character in some templates
* 13:42 ema: deployment-cache-upload06: restart varnish-frontend, package got upgraded to 6.0.8 [[phab:T294116|T294116]]
* 19:18 logmsgbot: krinkle@tin Synchronized php-1.27.0-wmf.1/resources/Resources.php: T114288 (duration: 00m 17s)
* 13:30 jbond: upload python3-pypuppetdb_2.4.0-1_all.deb to bullseye
* 18:51 jynus: stopping replication on s2, s3 and s7 for dbstore1001
* 10:46 jbond: upload cas_6.4.2-1+wmf10u1
* 18:50 ottomata: restarted eventlogging with blacklist=^Analytics$
* 10:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2026.codfw.wmnet with OS buster
* 18:40 logmsgbot: twentyafterfour@tin rebuilt wikiversions.cdb and synchronized wikiversions files: group1 wikis to 1.26wmf24
* 10:05 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2026.codfw.wmnet with OS buster
* 18:14 logmsgbot: twentyafterfour@tin rebuilt wikiversions.cdb and synchronized wikiversions files: group1 wikis to 1.27.0-wmf.1
* 09:11 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ mwscript extensions/Wikibase/repo/maintenance/ResubmitChanges.php wikidatawiki --minimum-age $((60*60*12)) # [[phab:T294029|T294029]]
* 17:39 ottomata: restarting eventlogging with 12 client side processors
* 09:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2025.codfw.wmnet with OS buster
* 16:28 logmsgbot: ori@tin Synchronized php-1.26wmf24/includes/resourceloader: 89595ba49a: Cherry-pick I173a9820b and I7c7546ec (duration: 00m 18s)
* 08:36 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2025.codfw.wmnet with OS buster
* 16:20 logmsgbot: krenair@tin Finished scap: swat (duration: 53m 33s)
* 08:27 ema@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp3062.esams.wmnet,service=(varnish-fe{{!}}ats-tls)
* 15:27 logmsgbot: krenair@tin Started scap: swat
* 08:24 ema@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp3062.esams.wmnet,service=(varnish-fe{{!}}ats-tls)
* 13:53 moritzm: rebooted stat1001/stat1002/stat1003 for kernel updates (already happened between 13:00 UTC and 13:10 UTC, but forgot to log earlier)
* 08:23 ema: cp3062: test 0008-vsl_check_e_inval_assertion.patch https://gerrit.wikimedia.org/r/c/operations/debs/varnish4/+/732913/ [[phab:T293879|T293879]]
* 12:34 moritzm: added debdeploy 0.0.8 to carbon
* 08:00 ema: deployment-cache-text06: test 0008-vsl_check_e_inval_assertion.patch https://gerrit.wikimedia.org/r/c/operations/debs/varnish4/+/732913/ [[phab:T293879|T293879]]
* 05:43 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Wed Sep 30 05:43:19 UTC 2015 (duration 43m 18s)
* 05:54 marostegui@cumin1001: dbctl commit (dc=all): 'db1126 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17580 and previous config saved to /var/cache/conftool/dbconfig/20211022-055403-root.json
* 03:10 logmsgbot: l10nupdate@tin LocalisationUpdate completed (1.27.0-wmf.1) at 2015-09-30 03:10:39+00:00
* 05:39 marostegui@cumin1001: dbctl commit (dc=all): 'db1126 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17579 and previous config saved to /var/cache/conftool/dbconfig/20211022-053900-root.json
* 03:03 logmsgbot: l10nupdate@tin Synchronized php-1.27.0-wmf.1/cache/l10n: l10nupdate for 1.27.0-wmf.1 (duration: 10m 45s)
* 05:23 marostegui@cumin1001: dbctl commit (dc=all): 'db1126 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17578 and previous config saved to /var/cache/conftool/dbconfig/20211022-052356-root.json
* 02:36 logmsgbot: l10nupdate@tin LocalisationUpdate completed (1.26wmf24) at 2015-09-30 02:36:37+00:00
* 05:08 marostegui@cumin1001: dbctl commit (dc=all): 'db1126 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17577 and previous config saved to /var/cache/conftool/dbconfig/20211022-050852-root.json
* 02:32 logmsgbot: l10nupdate@tin Synchronized php-1.26wmf24/cache/l10n: l10nupdate for 1.26wmf24 (duration: 07m 21s)
* 04:53 marostegui@cumin1001: dbctl commit (dc=all): 'db1126 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17576 and previous config saved to /var/cache/conftool/dbconfig/20211022-045349-root.json
* 01:02 logmsgbot: ori@tin Synchronized php-1.26wmf24/vendor: 940124a7db: Updated mediawiki/core Project: mediawiki/vendor  ff5e254f7eddf811f6f66b4a4063b1a8cc70f265 (duration: 00m 21s)
* 04:46 marostegui_: Deploy schema change on s8 codfw - [[phab:T291719|T291719]]
* 04:38 marostegui@cumin1001: dbctl commit (dc=all): 'db1126 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17575 and previous config saved to /var/cache/conftool/dbconfig/20211022-043845-root.json
* 02:59 ejegg: updated payments-wiki from {{Gerrit|088a8cda1e}} to {{Gerrit|6e810fb401}}


== 2015-09-29 ==
== 2021-10-21 ==
* 23:59 ottomata: restarting eventlogging so that processors use etcd to pick up shared token with which to consistently hash IPs
* 23:40 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:46 logmsgbot: catrope@tin Synchronized php-1.26wmf24/extensions/Echo/: SWAT (duration: 00m 17s)
* 23:38 jforrester@deploy1002: Synchronized w/fatal-error.php: Config: [[gerrit:730038{{!}}build: Upgrade composer testing stack to latest as used Wikimedia-wide]] (duration: 00m 54s)
* 23:37 logmsgbot: catrope@tin Synchronized wmf-config/InitialiseSettings.php: Enable Flow opt-in on mediawikiwiki (duration: 00m 16s)
* 23:37 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:33 logmsgbot: catrope@tin Synchronized php-1.26wmf24/extensions/GuidedTour/GuidedTourHooks.php: T114144 Fix back button logging (duration: 00m 18s)
* 23:37 jforrester@deploy1002: Synchronized w/static.php: Config: [[gerrit:730038{{!}}build: Upgrade composer testing stack to latest as used Wikimedia-wide]] (duration: 00m 54s)
* 23:32 logmsgbot: catrope@tin Synchronized php-1.27.0-wmf.1/extensions/Echo/: SWAT (duration: 00m 17s)
* 23:36 jforrester@deploy1002: Synchronized multiversion/: Config: [[gerrit:730038{{!}}build: Upgrade composer testing stack to latest as used Wikimedia-wide]] (duration: 00m 55s)
* 23:30 logmsgbot: catrope@tin Synchronized php-1.27.0-wmf.1/extensions/GuidedTour/GuidedTourHooks.php: T114144 Fix back button logging (duration: 00m 17s)
* 23:34 jforrester@deploy1002: Synchronized docroot/noc/conf/index.php: Config: [[gerrit:730038{{!}}build: Upgrade composer testing stack to latest as used Wikimedia-wide]] (duration: 00m 54s)
* 22:50 ejegg: updated payments-wiki from 3b0915a51a0fd567bdf22f3d4e17548a83e735d8 to bc4bcc44d2337d7a69c5a39f11ff45efdf0c8e11
* 23:33 jforrester@deploy1002: Synchronized wmf-config: Config: [[gerrit:730038{{!}}build: Upgrade composer testing stack to latest as used Wikimedia-wide]] (duration: 00m 55s)
* 22:47 ejegg: updated SmashPig from bf302444eae8236734fd43883b06c7b2512b1532 to 513ec01123e6dbb97b00888a3610a7c5ec24a63b
* 23:32 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 22:38 twentyafterfour: finished deployment of 1.27.0-wmf.1 to group0
* 23:28 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:36 logmsgbot: twentyafterfour@tin rebuilt wikiversions.cdb and synchronized wikiversions files: group0 wikis to 1.27.0-wmf.1
* 23:25 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:34 logmsgbot: twentyafterfour@tin Finished scap: php-1.27.0-wmf.1/extensions/ (duration: 04m 29s)
* 23:25 thcipriani@deploy1002: Synchronized wmf-config: Config: [[gerrit:730946{{!}}CommonSettings: Drop legacy CentralAuth config flag, never read (T277932)]] (duration: 00m 55s)
* 22:30 twentyafterfour: 22:27:17 sync-dir failed: <ValueError> /srv/mediawiki-staging/php-1.27.0-wmf.1/vendor/oyejorge/less.php/lib/Less/Version.php has content before opening <?php tag
* 23:18 thcipriani@deploy1002: Synchronized tests/multiversion/StaticSettingsTest.php: Config: [[gerrit:720362{{!}}Add new config names for CentralAuth denylist controls (T277932)]] (duration: 00m 55s)
* 22:30 chasemp: es-tool unban-node elastic1031
* 23:15 thcipriani@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:720362{{!}}Add new config names for CentralAuth denylist controls (T277932)]] (duration: 00m 55s)
* 22:29 logmsgbot: twentyafterfour@tin Started scap: php-1.27.0-wmf.1/extensions/
* 23:10 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:23 chasemp: unbanning elastic1006 for shard population
* 23:07 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:05 ori: 22:03:51 Synchronized php-1.26wmf24/extensions/CentralNotice: 30bdfcb386: Updated mediawiki/core Project: mediawiki/extensions/CentralNotice  6bd658e155a02edb4cc506bc3494a3f4699d3e94 (duration: 00m 17s)
* 22:42 mutante: [[phab:T294038|T294038]] [krb1001:~] $ sudo manage_principals.py create effeietsanders ... Principal successfully created. . .Successfully sent email
* 19:34 logmsgbot: twentyafterfour@tin Finished scap: sync php-1.27.0-wmf.1 for validation on testwiki (duration: 31m 49s)
* 21:44 ebernhardson@deploy1002: Finished deploy [wdqs/wdqs@13448f1] (wcqs): Deploy 0.3.90 to WCQS (duration: 02m 47s)
* 19:30 bd808: Updated iegreview.wikimedia.org to c3ac5e6 (Update to Twig 1.20.0) and applied latest schema changes
* 21:41 ebernhardson@deploy1002: Started deploy [wdqs/wdqs@13448f1] (wcqs): Deploy 0.3.90 to WCQS
* 19:02 logmsgbot: twentyafterfour@tin Started scap: sync php-1.27.0-wmf.1 for validation on testwiki
* 20:54 ebernhardson@deploy1002: Finished deploy [wdqs/wdqs@1309a97] (wcqs): dry run wcqs deploy (duration: 00m 13s)
* 18:11 cmjohnson1: shutting down elastic1031 to relocate rack/row
* 20:53 ebernhardson@deploy1002: Started deploy [wdqs/wdqs@1309a97] (wcqs): dry run wcqs deploy
* 18:09 cmjohnson1: shutting down elastic1006 to relocate row/rack
* 20:53 ebernhardson@deploy1002: Finished deploy [wdqs/wdqs@1309a97] (wcqs): dry run wcqs deploy (duration: 00m 35s)
* 17:46 mutante: size of conntrack table on iron might be increased due to test scans
* 20:52 ebernhardson@deploy1002: Started deploy [wdqs/wdqs@1309a97] (wcqs): dry run wcqs deploy
* 17:32 ejegg: updated appeal template setting on payments-wiki
* 20:04 otto@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 16:30 cmjohnson1: swapping failed disk db1050
* 20:04 otto@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 16:14 subbu: reverted configuration hotfix from yesterday's Parsoid deploy (re-enabled use of Parsoid batching API)
* 20:02 otto@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 15:58 bblack: ending VCL tests on cp1053
* 20:02 otto@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 15:52 bblack: ending VCL tests on cp1065, starting on cp1053 instead
* 19:46 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:39 bblack: cp1065: live-testing some VCL patches, puppet disabled, etc...
* 19:43 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:19 chasemp: ban elastic1031 for T112559
* 19:42 legoktm@deploy1002: Synchronized wmf-config/CommonSettings.php: Update $wgTimelineFonts for new path to unifont in Shellbox container ([[phab:T293050|T293050]]) (duration: 00m 55s)
* 15:05 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: VisualEditor: Set TransitionDefault true for the English Wikipedia [[gerrit:242040]] (duration: 00m 17s)
* 19:38 legoktm@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'shellbox-timeline' for release 'main' .
* 14:08 paravoid: repooling eqiad; 24h codfw test window is over
* 19:35 legoktm@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'shellbox-timeline' for release 'main' .
* 10:49 godog: bounce pybal on lvs2003 / lvs2006
* 19:31 otto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 09:13 paravoid: powercycling pybal-test2003
* 19:23 legoktm@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'shellbox-timeline' for release 'main' .
* 08:51 jynus: starting cloning of labsdb1005 (Tools DB), minimal disruption is expected
* 19:10 ebernhardson@deploy1002: Finished deploy [wdqs/wdqs@b2912b7]: deploy 0.3.90, incl oauth, to wcqs (duration: 00m 23s)
* 07:03 twentyafterfour: restarted apache2 and phd or iridium to get phabricator back into the correct state
* 19:09 ebernhardson@deploy1002: Started deploy [wdqs/wdqs@b2912b7]: deploy 0.3.90, incl oauth, to wcqs
* 06:58 twentyafterfour: checked out correct phabricator release tag on iridium. Something, somewhere, had reverted everything to an old deployment.
* 19:07 ebernhardson@deploy1002: Finished deploy [wdqs/wdqs@b2912b7]: (no justification provided) (duration: 00m 08s)
* 05:01 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Tue Sep 29 05:01:30 UTC 2015 (duration 1m 29s)
* 19:07 ebernhardson@deploy1002: Started deploy [wdqs/wdqs@b2912b7]: (no justification provided)
* 02:34 logmsgbot: l10nupdate@tin LocalisationUpdate completed (1.26wmf24) at 2015-09-29 02:34:55+00:00
* 18:53 urbanecm: Deploy security patch for [[phab:T285116|T285116]] (wmf.4, wmf.5)
* 02:30 logmsgbot: l10nupdate@tin Synchronized php-1.26wmf24/cache/l10n: l10nupdate for 1.26wmf24 (duration: 06m 57s)
* 18:53 mutante: dumpsdata1003 - sudo systemctl reset-failed to clear Icinga alert about failed cleanup_tmpdumps.service
* 02:25 bblack: ipv6 flap experiment: raise ipv6/route/max_size from 4096 to 131072 manually on cp*, actually
* 17:55 mutante: that's a key for https://www.worldcat.org/whatis/default.jsp btw for those wondering
* 02:24 bblack: ipv6 flap experiment: raise ipv6/route/max_size from 4096 to 131072 manually on cp20*
* 17:53 mutante: citoid - replaced "wskey" for worldcat in private repo as requested on [[phab:T294010|T294010]] (is in 4 places, 3 for deployment_server/k8s and one remnant for scb)
* 01:55 robh: upload-lb.codfw.wikimedia.org_ipv6 page flap
* 17:53 mvolz@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'citoid' for release 'production' .
* 01:43 logmsgbot: catrope@tin Synchronized wmf-config/CommonSettings.php: Plumbing for wmgVisualEditorTransitionDefault (duration: 00m 17s)
* 17:52 mvolz@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'citoid' for release 'production' .
* 01:43 logmsgbot: catrope@tin Synchronized wmf-config/InitialiseSettings.php: Add wmgVisualEditorTransitionDefault (false everywhere) (duration: 00m 17s)
* 17:50 mvolz@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'citoid' for release 'staging' .
* 01:13 robh: didnt powercycle analytics1035 yet, it recovered on its own.
* 16:17 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 01:13 robh: powercycling analytics1035, seems oom, cannot login via ssh or serial console
* 16:14 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:56 logmsgbot: tstarling@tin Synchronized php-1.26wmf24/extensions/ParsoidBatchAPI: Fix fatal error I77fd7e8 (duration: 00m 17s)
* 16:13 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 16:12 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 16:07 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/repo/tests/: Backport: [[gerrit:732669{{!}}Remove dispatchViaJobs repo setting (T292604)]] (3/3) (duration: 00m 56s)
* 16:06 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/repo/config/: Backport: [[gerrit:732669{{!}}Remove dispatchViaJobs repo setting (T292604)]] (2/3) (duration: 00m 54s)
* 16:05 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:04 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/repo/includes/: Backport: [[gerrit:732669{{!}}Remove dispatchViaJobs repo setting (T292604)]] (1/3) (duration: 00m 56s)
* 16:03 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 16:02 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:01 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/repo/tests/: Backport: [[gerrit:732668{{!}}Remove dispatchViaJobsPruneChangesTableInJobEnabled repo setting (T292604)]] (3/3) (duration: 00m 56s)
* 15:59 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/repo/config/: Backport: [[gerrit:732668{{!}}Remove dispatchViaJobsPruneChangesTableInJobEnabled repo setting (T292604)]] (2/3) (duration: 00m 55s)
* 15:58 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/repo/includes/: Backport: [[gerrit:732668{{!}}Remove dispatchViaJobsPruneChangesTableInJobEnabled repo setting (T292604)]] (1/3) (duration: 00m 57s)
* 15:43 robh@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:21 robh@cumin1001: START - Cookbook sre.dns.netbox
* 15:14 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/repo/tests/: Backport: [[gerrit:732667{{!}}Remove dispatchViaJobsAllowedClients repo setting (T292604)]] (3/3) (duration: 00m 56s)
* 15:13 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:13 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/repo/config/: Backport: [[gerrit:732667{{!}}Remove dispatchViaJobsAllowedClients repo setting (T292604)]] (1/3) (duration: 00m 54s)
* 15:12 Lucas_WMDE: my next message accidentally says 1/3 again but it’s 2/3, sorry
* 15:11 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/repo/includes/: Backport: [[gerrit:732667{{!}}Remove dispatchViaJobsAllowedClients repo setting (T292604)]] (1/3) (duration: 00m 56s)
* 15:10 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:56 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest1001.eqiad.wmnet with OS buster
* 14:42 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/repo/config/Wikibase.default.php: Backport: [[gerrit:732666{{!}}Enable dispatching via jobs by default (T291828)]] (duration: 00m 55s)
* 14:41 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:39 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/client/: Backport: [[gerrit:732674{{!}}Fix ExternalUserNames service wiring for local database]] (duration: 00m 57s)
* 14:38 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:33 volans@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1001.eqiad.wmnet with OS buster
* 14:26 otto@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 14:26 otto@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 14:19 otto@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 14:19 otto@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 13:56 otto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 13:55 otto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 13:49 otto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 13:49 otto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 13:34 volans: uploaded spicerack_1.0.6 to apt.wikimedia.org buster-wikimedia,bullseye-wikimedia
* 13:08 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:05 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:04 hashar@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.38.0-wmf.5  refs [[phab:T281169|T281169]]
* 12:56 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 7 hosts with reason: Schema change s3 [[phab:T278619|T278619]]
* 12:56 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on 7 hosts with reason: Schema change s3 [[phab:T278619|T278619]]
* 12:52 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 14 hosts with reason: Schema change s1 [[phab:T278619|T278619]]
* 12:52 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on 14 hosts with reason: Schema change s1 [[phab:T278619|T278619]]
* 12:48 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 13 hosts with reason: Schema change s4 [[phab:T278619|T278619]]
* 12:48 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on 13 hosts with reason: Schema change s4 [[phab:T278619|T278619]]
* 12:43 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 9 hosts with reason: Schema change s2 [[phab:T278619|T278619]]
* 12:43 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 9 hosts with reason: Schema change s2 [[phab:T278619|T278619]]
* 12:34 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 11 hosts with reason: Schema change s7 [[phab:T278619|T278619]]
* 12:34 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 11 hosts with reason: Schema change s7 [[phab:T278619|T278619]]
* 11:55 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 9 hosts with reason: Schema change s5 [[phab:T278619|T278619]]
* 11:54 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 9 hosts with reason: Schema change s5 [[phab:T278619|T278619]]
* 11:47 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 9 hosts with reason: Schema change s6 [[phab:T278619|T278619]]
* 11:47 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 9 hosts with reason: Schema change s6 [[phab:T278619|T278619]]
* 11:13 Lucas_WMDE: UTC morning backport+config window done
* 11:10 Lucas_WMDE: lucaswerkmeister-wmde@mwmaint1002:~$ mwscript extensions/Wikibase/repo/maintenance/ResubmitChanges.php wikidatawiki --minimum-age $((60*60*12)) # [[phab:T294008|T294008]]
* 11:10 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:07 jgiannelos@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:730848{{!}}Configure event stream for map tiles state change (T289771)]] (duration: 01m 04s)
* 11:07 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:48 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 10:48 ayounsi@cumin1001: START - Cookbook sre.network.cf
* 10:48 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 10:47 ayounsi@cumin1001: START - Cookbook sre.network.cf
* 10:14 jbond: mergeing refactor of P:base Gerrit:714975
* 09:54 ayounsi@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:49 ayounsi@cumin1001: START - Cookbook sre.dns.netbox
* 08:56 urbanecm@deploy1002: Synchronized private/PrivateSettings.php: Update [[phab:T250887|T250887]] mitigations (duration: 01m 03s)
* 08:33 ema@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp3062.esams.wmnet,service=(varnish-fe{{!}}ats-tls)
* 08:26 ema@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp3062.esams.wmnet,service=(varnish-fe{{!}}ats-tls)
* 08:25 ema: cp3062: revert vsl_space experiment [[phab:T293879|T293879]]
* 08:24 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host graphite1004.eqiad.wmnet with OS bullseye
* 08:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17563 and previous config saved to /var/cache/conftool/dbconfig/20211021-080330-root.json
* 07:56 filippo@cumin1001: START - Cookbook sre.hosts.reimage for host graphite1004.eqiad.wmnet with OS bullseye
* 07:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17562 and previous config saved to /var/cache/conftool/dbconfig/20211021-074826-root.json
* 07:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17561 and previous config saved to /var/cache/conftool/dbconfig/20211021-073323-root.json
* 07:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17560 and previous config saved to /var/cache/conftool/dbconfig/20211021-071819-root.json
* 07:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17559 and previous config saved to /var/cache/conftool/dbconfig/20211021-070315-root.json
* 06:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1118 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17558 and previous config saved to /var/cache/conftool/dbconfig/20211021-064812-root.json
* 06:35 elukey: `systemctl reload nginx` on cloudelastic100[5,6] to pick up the new TLS certificate and clear alerts - [[phab:T293826|T293826]]
* 04:47 marostegui: Deploy schema change on s5 codfw - [[phab:T291719|T291719]]
* 04:37 marostegui: Deploy schema change on s6 codfw - [[phab:T291719|T291719]]
* 04:04 legoktm: restarted apache on lists1001 so it only uses new TLS cert ([[phab:T293826|T293826]])
* 03:29 eileen: civicrm revision changed from {{Gerrit|e889831012}} to {{Gerrit|733a8fceda}}, config revision is {{Gerrit|eed79486d5}}
* 00:06 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:01 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .


== 2015-09-28 ==
== 2021-10-20 ==
* 23:47 logmsgbot: ebernhardson@tin Synchronized wmf-config/CommonSettings.php: Update ttmserver configuration to match elasticsearch security profile (duration: 00m 17s)
* 23:56 thcipriani@deploy1002: Finished scap: Backport: [[gerrit:732336{{!}}Restore title to mobile skin without logo (T290525)]] (duration: 11m 41s)
* 23:43 logmsgbot: ori@tin Synchronized wmf-config/CommonSettings.php: I56db35b: Removed ignore_user_abort( true ) line (duration: 00m 18s)
* 23:44 thcipriani@deploy1002: Started scap: Backport: [[gerrit:732336{{!}}Restore title to mobile skin without logo (T290525)]]
* 23:13 logmsgbot: catrope@tin Synchronized php-1.26wmf24/extensions/Echo: SWAT (duration: 00m 18s)
* 23:42 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:13 logmsgbot: catrope@tin Synchronized php-1.26wmf24/extensions/Flow: SWAT (duration: 00m 19s)
* 23:39 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:12 logmsgbot: catrope@tin Synchronized php-1.26wmf24/extensions/CentralNotice: SWAT (duration: 00m 19s)
* 23:30 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:51 bearND: MobileApps deployed sha1 9df72ec
* 23:29 tstarling@deploy1002: Synchronized wmf-config/CommonSettings.php: fawiki require login for creation of pages in the draft namespace [[phab:T291018|T291018]] (duration: 01m 02s)
* 20:42 subbu: deployed parsoid version b9e5244e + hotfix on tin to turn off batching api use since canary restart of wtp1002 showed some batching api errors
* 23:27 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:26 yuvipanda: depooled cp2017 from pybal config too
* 23:27 tstarling@deploy1002: Synchronized wmf-config/InitialiseSettings.php: fawiki require login to edit main namespace [[phab:T291018|T291018]] (duration: 01m 04s)
* 20:22 yuvipanda: depooled cp2017 since it's down
* 22:13 dancy@deploy1002: Synchronized README: testing (4/4) (duration: 02m 52s)
* 19:02 chasemp: powercycle iridium via console as it's unresponsive
* 22:00 dancy@deploy1002: Synchronized README: testing (3/4) (duration: 02m 57s)
* 18:01 awight: update crm from 190f689ff7aec7fecefdf5af501293685c55e041 to 7003cc38797848631d0c4d5f6ff68ab1d6118ad8
* 21:54 dancy@deploy1002: Synchronized README: testing (2) (duration: 01m 02s)
* 15:50 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/241579/ (duration: 00m 17s)
* 21:52 dancy@deploy1002: Synchronized README: (no justification provided) (duration: 01m 03s)
* 15:28 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/241598/ (duration: 00m 17s)
* 21:50 dancy: Testing a series of one-file scap sync-file runs
* 15:11 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/241649/ (duration: 00m 17s)
* 21:22 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:00 paravoid: running failover eqiad->codfw test for all frontend traffic
* 21:19 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 06:11 logmsgbot: ori@tin Synchronized wmf-config/CommonSettings.php: I179be4bd3: Rely on timeouts specified in php.ini rather than calling ini_set() (duration: 00m 17s)
* 21:10 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 05:52 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Mon Sep 28 05:52:34 UTC 2015 (duration 52m 33s)
* 21:08 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|b9cf996a38d82fdd67e600a5a951e88423957e8d}}: Promote Growth features out of darkmode on several wikis  ([[phab:T291826|T291826]], [[phab:T255037|T255037]], [[phab:T287878|T287878]]) (duration: 01m 04s)
* 02:27 logmsgbot: l10nupdate@tin LocalisationUpdate completed (1.26wmf24) at 2015-09-28 02:27:16+00:00
* 21:07 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:20 logmsgbot: l10nupdate@tin Synchronized php-1.26wmf24/cache/l10n: l10nupdate for 1.26wmf24 (duration: 06m 54s)
* 20:38 eileen: civicrm revision changed from {{Gerrit|9b5e0d015b}} to {{Gerrit|e889831012}}, config revision is {{Gerrit|eed79486d5}}
* 20:25 legoktm: uploaded php7.4 on buster to apt.wm.o ([[phab:T293449|T293449]])
* 19:24 ebernhardson@deploy1002: Finished deploy [search/mjolnir/deploy@985a139]: bulk_daemon: detect cross-cluste config from old and new locations (duration: 00m 46s)
* 19:24 ebernhardson@deploy1002: Started deploy [search/mjolnir/deploy@985a139]: bulk_daemon: detect cross-cluste config from old and new locations
* 19:09 mutante: disabling puppet on mw* for a minute to deploy a change
* 18:41 otto@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 18:41 otto@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 18:31 otto@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'canary' .
* 18:30 otto@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 18:24 otto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 17:28 mutante: [krb1001:~] $ sudo manage_principals.py create statwithlatte --email_address=naray-ctr@wikimedia.org -  [[phab:T293810|T293810]]
* 17:27 mutante: [krb1001:~] $ sudo manage_principals.py create statwithlatte --email_address=naray-ctr@wikimedia.org
* 17:11 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:05 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:01 razzi@deploy1002: Finished deploy [analytics/refinery@9e3295f]: Regular analytics weekly train [analytics/refinery@9e3295f] (duration: 23m 42s)
* 17:00 hashar@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/Wikibase/client: Update deprecated calls to ParserOutput in ShortDescHandler - [[phab:T293860|T293860]] (duration: 01m 03s)
* 16:56 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:53 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:53 hashar@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/LiquidThreads/pages/LqtDiscussionPager.php: Remove deprecated usage of setProperty - [[phab:T293895|T293895]] (duration: 01m 03s)
* 16:49 hashar@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/GeoCrumbs: Replace use of deprecated ParserOutput:getProperty() - [[phab:T293894|T293894]] (duration: 01m 09s)
* 16:44 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:41 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:37 razzi@deploy1002: Started deploy [analytics/refinery@9e3295f]: Regular analytics weekly train [analytics/refinery@9e3295f]
* 16:36 razzi: deploy refinery change for https://phabricator.wikimedia.org/T287084
* 16:13 jbond: upload cas_6.4.2-1_amd64.deb
* 15:42 volans@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:39 volans@cumin2002: START - Cookbook sre.dns.netbox
* 14:57 moritzm: installing modsecurity-crs security updates on Buster
* 14:48 moritzm: installing xmlgraphics-commons security updates on Buster
* 14:46 moritzm: installing irssi security updates on Buster
* 14:44 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 14:44 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 14:35 moritzm: installing commons-io security updates on Buster
* 14:27 ema: cp3062: test higher vsl_space values [[phab:T293879|T293879]]
* 14:27 kevinbazira@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 14:12 moritzm: installing ruby2.3 security updates
* 13:40 moritzm: installing apache2 security updates on buster
* 13:27 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:24 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:21 hashar@deploy1002: Synchronized php: group1 wikis to 1.38.0-wmf.5  refs [[phab:T281169|T281169]] (duration: 01m 02s)
* 13:20 hashar@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.38.0-wmf.5 refs [[phab:T281169|T281169]]
* 13:11 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on 7 hosts with reason: Schema change s3 [[phab:T277116|T277116]]
* 13:11 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 3:00:00 on 7 hosts with reason: Schema change s3 [[phab:T277116|T277116]]
* 13:04 ema@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp3062.esams.wmnet,service=ats-tls
* 13:04 ema@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp3062.esams.wmnet,service=varnish-fe
* 12:51 ema: cp3062: bump vsl_space from 80M (default) to 512M [[phab:T293879|T293879]] - varnish restart needed
* 12:37 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on 14 hosts with reason: Schema change s1 [[phab:T277116|T277116]]
* 12:36 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 3:00:00 on 14 hosts with reason: Schema change s1 [[phab:T277116|T277116]]
* 12:17 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:09 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:02 urbanecm@deploy1002: Finished scap: {{Gerrit|802d3b7}}: {{Gerrit|e4f7f85}}: CreateAccountCampaign: Support for recurring donors ([[phab:T293699|T293699]]) (duration: 25m 19s)
* 11:57 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:49 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:46 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2007.codfw.wmnet
* 11:40 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2007.codfw.wmnet
* 11:37 btullis@cumin1001: END (PASS) - Cookbook sre.hadoop.roll-restart-masters (exit_code=0) restart masters for Hadoop test cluster: Restart of jvm daemons. - btullis@cumin1001
* 11:37 urbanecm@deploy1002: Started scap: {{Gerrit|802d3b7}}: {{Gerrit|e4f7f85}}: CreateAccountCampaign: Support for recurring donors ([[phab:T293699|T293699]])
* 11:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2005.codfw.wmnet
* 11:21 moritzm: installing ffmpeg security updates
* 11:15 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|e520fc57411bb19123766192cd636396ea6fc59d}}: GrowthExperiments: Add campaign pattern for enwiki ([[phab:T293699|T293699]]) (duration: 01m 22s)
* 11:11 btullis@cumin1001: START - Cookbook sre.hadoop.roll-restart-masters restart masters for Hadoop test cluster: Restart of jvm daemons. - btullis@cumin1001
* 11:10 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:07 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:57 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2005.codfw.wmnet
* 10:13 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 13 hosts with reason: Schema change s4 [[phab:T277116|T277116]]
* 10:13 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 13 hosts with reason: Schema change s4 [[phab:T277116|T277116]]
* 09:59 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 9 hosts with reason: Schema change s2 [[phab:T277116|T277116]]
* 09:59 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 9 hosts with reason: Schema change s2 [[phab:T277116|T277116]]
* 09:52 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 11 hosts with reason: Schema change s7 [[phab:T277116|T277116]]
* 09:52 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 11 hosts with reason: Schema change s7 [[phab:T277116|T277116]]
* 09:05 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 9 hosts with reason: Schema change s5 [[phab:T277116|T277116]]
* 09:04 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 9 hosts with reason: Schema change s5 [[phab:T277116|T277116]]
* 08:50 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 9 hosts with reason: Schema change s6 [[phab:T277116|T277116]]
* 08:50 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 9 hosts with reason: Schema change s6 [[phab:T277116|T277116]]
* 08:01 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 08:01 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 07:16 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1118.eqiad.wmnet with OS buster
* 07:09 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 06:49 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1118.eqiad.wmnet with OS buster
* 06:45 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1118 (s1) for reimage [[phab:T290865|T290865]]', diff saved to https://phabricator.wikimedia.org/P17552 and previous config saved to /var/cache/conftool/dbconfig/20211020-064529-marostegui.json
* 06:41 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1126.eqiad.wmnet with OS buster
* 06:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1106 (s1) after upgrade', diff saved to https://phabricator.wikimedia.org/P17551 and previous config saved to /var/cache/conftool/dbconfig/20211020-063926-marostegui.json
* 06:35 marostegui: Upgrade db1106
* 06:34 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1106 (s1) for upgrade', diff saved to https://phabricator.wikimedia.org/P17550 and previous config saved to /var/cache/conftool/dbconfig/20211020-063431-marostegui.json
* 06:31 dcausse: restarting blazegraph on wdqs1012
* 06:28 elukey: reboot analytics1066 - OS showing CPU soft lockups, tons of defunct processes (including node manager) and high CPU usage
* 06:21 marostegui: Depool clouddb1013 for upgrade
* 06:14 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db1126.eqiad.wmnet with OS buster
* 06:12 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 06:12 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1126 (s8) for upgrade', diff saved to https://phabricator.wikimedia.org/P17549 and previous config saved to /var/cache/conftool/dbconfig/20211020-061202-marostegui.json
* 06:06 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 06:05 XioNoX: put transport link between ulsfo and eqsin in service - [[phab:T273308|T273308]]
* 05:59 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2112.codfw.wmnet with OS buster
* 05:26 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2112.codfw.wmnet with OS buster
* 04:44 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 04:42 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 04:40 legoktm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Enable $wgLocalHTTPProxy on group0 wikis ([[phab:T288848|T288848]]) (duration: 01m 05s)
* 01:31 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 01:28 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:03 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:00 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:00 tgr: west coast evening deploys done


== 2015-09-27 ==
== 2021-10-19 ==
* 23:17 logmsgbot: ori@tin Synchronized php-1.26wmf24/./resources/src/mediawiki.toolbar/toolbar.less: I94ced06178: mediawiki.toolbar: temporary workaround for T113868 (duration: 00m 17s)
* 23:59 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:732103{{!}}Reorder some wikis at wgExtraNamespaces and wmgVisualEditorAvailableNamespaces (T293846)]] (duration: 01m 02s)
* 16:51 Krenair: ran sync-common on snapshot1001 to bring it up to date
* 23:51 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:43 logmsgbot: krenair@tin Purged l10n cache for 1.26wmf21
* 23:48 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:42 logmsgbot: krenair@tin Purged l10n cache for 1.26wmf20
* 23:47 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:732053{{!}}ruwikiversity: Add 'portal' and 'faculty' namespaces (T293545)]] (duration: 01m 03s)
* 16:42 logmsgbot: krenair@tin Purged l10n cache for 1.26wmf19
* 23:40 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:50 logmsgbot: krenair@tin Synchronized wmf-config/abusefilter.php: https://gerrit.wikimedia.org/r/#/c/241354/ - fix AbuseFilter block durations (duration: 00m 18s)
* 23:37 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:05 jynus: leaving innodb compression tests on es2005 running (could affect lag on that host)
* 23:36 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:710565{{!}}Set the project namespace and sitename for Javanese Wikipedia and Wiktionary (T287437)]] (duration: 01m 02s)
* 02:23 logmsgbot: l10nupdate@tin Synchronized php-1.26wmf24/cache/l10n: l10nupdate for 1.26wmf24 (duration: 07m 05s)
* 23:23 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:731953{{!}}Create Portal and Portal talk namespace for shiwiki (T288909)]] (duration: 01m 03s)
* 23:23 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:15 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:13 tgr@deploy1002: Synchronized static: Config: [[gerrit:731231{{!}}Repair the size of the logo of Kashmiri Wikipedia (T293342)]] (duration: 02m 14s)
* 21:34 mutante: mwmaint1002 - delete large files over 100MB from puppet clientbucket. sudo /usr/bin/find /var/lib/puppet/clientbucket/ -type f -size +100M -delete {{!}} fixed Icinga alert:  RECOVERY - Check for large files in client bucket on mwmaint1002 is OK: OK: [[phab:T165885|T165885]]
* 21:32 mutante: mwmaint1002 - delete large files over 100MB from puppet clientbucket. sudo /usr/bin/find /var/lib/puppet/clientbucket/ -type f -size +100M -delete
* 20:56 ejegg: updated payments-wiki from {{Gerrit|0f48acea49}} to {{Gerrit|30e596903d}}
* 19:03 hashar@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.38.0-wmf.5  refs [[phab:T281169|T281169]]
* 18:46 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/MediaSearch/: {{Gerrit|a84a675}}: {{Gerrit|3231578}}: MediaSearch backports ([[phab:T291392|T291392]], [[phab:T293335|T293335]], [[phab:T291392|T291392]], [[phab:T291622|T291622]], [[phab:T293554|T293554]]) (duration: 01m 03s)
* 18:45 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/MediaSearch/: {{Gerrit|694580a}}: {{Gerrit|c02e301}}: MediaSearch backports([[phab:T291392|T291392]], [[phab:T293335|T293335]], [[phab:T291392|T291392]], [[phab:T291622|T291622]], [[phab:T293554|T293554]]) (duration: 01m 03s)
* 18:37 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudmetrics1003.eqiad.wmnet with OS bullseye
* 18:30 foks: deleting 1 more email with deleteUserEmail.php
* 18:17 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|1476a2d93}}: {{Gerrit|dd8393c1a0}}: foundationwiki: Restrict sensitive namespaces to editor group ([[phab:T205350|T205350]]) (duration: 01m 03s)
* 18:12 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host cloudmetrics1003.eqiad.wmnet with OS bullseye
* 18:12 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|9a2893c7190e615a247674dbf7f87348bf43b91c}}: Enable topic subscriptions as a beta feature on all remaining projects ([[phab:T287802|T287802]]) (duration: 01m 04s)
* 18:00 legoktm@deploy1002: Synchronized wmf-config/: Add framework for setting $wgLocalHTTPProxy ([[phab:T288848|T288848]]) (2/2) (duration: 01m 06s)
* 17:59 legoktm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Add framework for setting $wgLocalHTTPProxy ([[phab:T288848|T288848]]) (1/2) (duration: 01m 05s)
* 17:57 foks: removing six email addresses on request (with deleteUserEmail.php)
* 17:37 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudmetrics1004.eqiad.wmnet with OS bullseye
* 17:25 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudmetrics1003.eqiad.wmnet with OS bullseye
* 17:11 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host cloudmetrics1004.eqiad.wmnet with OS bullseye
* 17:09 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host cloudmetrics1003.eqiad.wmnet with OS bullseye
* 16:48 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 16:46 bd808@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 16:41 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 16:12 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on 7 hosts with reason: Schema change s3 [[phab:T277118|T277118]]
* 16:11 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 3:00:00 on 7 hosts with reason: Schema change s3 [[phab:T277118|T277118]]
* 16:09 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 14 hosts with reason: Schema change s1 [[phab:T277118|T277118]]
* 16:09 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 14 hosts with reason: Schema change s1 [[phab:T277118|T277118]]
* 16:06 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 13 hosts with reason: Schema change s4 [[phab:T277118|T277118]]
* 16:06 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 13 hosts with reason: Schema change s4 [[phab:T277118|T277118]]
* 16:00 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 11 hosts with reason: Schema change s7 [[phab:T277118|T277118]]
* 16:00 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 11 hosts with reason: Schema change s7 [[phab:T277118|T277118]]
* 15:46 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 9 hosts with reason: Schema change s2 [[phab:T277118|T277118]]
* 15:46 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 9 hosts with reason: Schema change s2 [[phab:T277118|T277118]]
* 15:40 otto@deploy1002: Synchronized wmf-config/InitialiseSettings.php: wgEventStreams - remove now redundant stream setting - [[phab:T277193|T277193]] (duration: 01m 04s)
* 15:35 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 9 hosts with reason: Schema change s5 [[phab:T277118|T277118]]
* 15:35 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 9 hosts with reason: Schema change s5 [[phab:T277118|T277118]]
* 15:34 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on 9 hosts with reason: Schema change s6 [[phab:T277118|T277118]]
* 15:34 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on 9 hosts with reason: Schema change s6 [[phab:T277118|T277118]]
* 15:32 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 9 hosts with reason: Schema change s6 [[phab:T277118|T277118]]
* 15:32 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 9 hosts with reason: Schema change s6 [[phab:T277118|T277118]]
* 15:30 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 15:28 bd808@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 15:26 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 15:17 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2025.codfw.wmnet
* 15:12 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2025.codfw.wmnet
* 14:34 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:29 jbond: disable puppet on lvs, cp, authdns, mc, mw-be and wcqs to while i merge G:662699
* 14:15 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:11 hashar@deploy1002: Finished scap: testwikis wikis to 1.38.0-wmf.5  refs [[phab:T281169|T281169]] (duration: 45m 13s)
* 13:52 kevinbazira@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 13:45 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:31 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:28 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:26 hashar@deploy1002: Started scap: testwikis wikis to 1.38.0-wmf.5  refs [[phab:T281169|T281169]]
* 13:19 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3311 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17547 and previous config saved to /var/cache/conftool/dbconfig/20211019-131927-root.json
* 13:16 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17546 and previous config saved to /var/cache/conftool/dbconfig/20211019-131651-root.json
* 13:04 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3311 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17545 and previous config saved to /var/cache/conftool/dbconfig/20211019-130424-root.json
* 13:01 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17544 and previous config saved to /var/cache/conftool/dbconfig/20211019-130147-root.json
* 12:49 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3311 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17543 and previous config saved to /var/cache/conftool/dbconfig/20211019-124920-root.json
* 12:46 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17542 and previous config saved to /var/cache/conftool/dbconfig/20211019-124644-root.json
* 12:40 moritzm: installing aftpd security updates
* 12:34 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3311 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17541 and previous config saved to /var/cache/conftool/dbconfig/20211019-123416-root.json
* 12:34 marostegui: Upgrade dbstore1003
* 12:31 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17540 and previous config saved to /var/cache/conftool/dbconfig/20211019-123140-root.json
* 12:19 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3311 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17539 and previous config saved to /var/cache/conftool/dbconfig/20211019-121913-root.json
* 12:16 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17538 and previous config saved to /var/cache/conftool/dbconfig/20211019-121636-root.json
* 12:12 XioNoX: push anycast tuning to all Lumen and NTT transit links - [[phab:T288843|T288843]]
* 12:09 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1167 (s8) after upgrade', diff saved to https://phabricator.wikimedia.org/P17537 and previous config saved to /var/cache/conftool/dbconfig/20211019-120918-marostegui.json
* 12:04 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1167 (s8) for upgrade', diff saved to https://phabricator.wikimedia.org/P17536 and previous config saved to /var/cache/conftool/dbconfig/20211019-120458-marostegui.json
* 12:04 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3311 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17535 and previous config saved to /var/cache/conftool/dbconfig/20211019-120409-root.json
* 12:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3318 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17534 and previous config saved to /var/cache/conftool/dbconfig/20211019-120348-root.json
* 12:01 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.5/extensions/WikibaseMediaInfo/: {{Gerrit|ec0125770775c1a1a54c3b592d86d287fd9e3ad6}}: Escape captions when writing stored data into js state ([[phab:T293556|T293556]]) (duration: 00m 55s)
* 12:01 marostegui@cumin1001: dbctl commit (dc=all): 'db1105:3312 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17533 and previous config saved to /var/cache/conftool/dbconfig/20211019-120132-root.json
* 12:00 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/WikibaseMediaInfo/: {{Gerrit|79808a90a95dd5dac2b532b87fb7ec1a490ea0f0}}: Escape captions when writing stored data into js state ([[phab:T293556|T293556]]) (duration: 00m 56s)
* 12:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3317 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17532 and previous config saved to /var/cache/conftool/dbconfig/20211019-120024-root.json
* 11:58 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:56 XioNoX: push anycast tuning to Tele2, Init7, DT transit links - [[phab:T288843|T288843]]
* 11:55 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3318 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17531 and previous config saved to /var/cache/conftool/dbconfig/20211019-114844-root.json
* 11:46 marostegui: Upgrade db1105 (s1,s2)
* 11:46 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1105 (s1,s2) for upgrade', diff saved to https://phabricator.wikimedia.org/P17530 and previous config saved to /var/cache/conftool/dbconfig/20211019-114649-marostegui.json
* 11:45 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3317 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17529 and previous config saved to /var/cache/conftool/dbconfig/20211019-114520-root.json
* 11:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3318 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17527 and previous config saved to /var/cache/conftool/dbconfig/20211019-113340-root.json
* 11:30 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3317 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17526 and previous config saved to /var/cache/conftool/dbconfig/20211019-113017-root.json
* 11:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3318 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17525 and previous config saved to /var/cache/conftool/dbconfig/20211019-111837-root.json
* 11:15 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3317 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17524 and previous config saved to /var/cache/conftool/dbconfig/20211019-111513-root.json
* 11:11 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:09 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:08 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|7c31b04e50101a60db7ae8acae64bc031f5e1007}}: DPL: Explicitly note it is not possible to enable DPL on any more wikis (duration: 00m 55s)
* 11:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3318 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17523 and previous config saved to /var/cache/conftool/dbconfig/20211019-110333-root.json
* 11:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2025.codfw.wmnet
* 11:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3317 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17522 and previous config saved to /var/cache/conftool/dbconfig/20211019-110009-root.json
* 10:56 marostegui: Upgrade clouddb1021
* 10:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2025.codfw.wmnet
* 10:51 moritzm: failover master in ganeti-test to ganeti2026
* 10:50 godog: bounce superset on an-tool1005 to pick up statsd changes - [[phab:T247963|T247963]]
* 10:49 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2112.codfw.wmnet with OS stretch
* 10:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3318 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17521 and previous config saved to /var/cache/conftool/dbconfig/20211019-104829-root.json
* 10:45 godog: bounce navtiming on webperf1001 to pick up statsd changes - [[phab:T247963|T247963]]
* 10:45 godog: bounce superset on an-tool1010 to pick up statsd changes - [[phab:T247963|T247963]]
* 10:45 marostegui@cumin1001: dbctl commit (dc=all): 'db1101:3317 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17520 and previous config saved to /var/cache/conftool/dbconfig/20211019-104506-root.json
* 10:38 oblivian@deploy1002: Synchronized w/static.php: Config: [[gerrit:730182{{!}}static.php: Add support for /static/current rewrites (take 2) (T285232)]] (duration: 00m 55s)
* 10:38 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2026.codfw.wmnet
* 10:37 marostegui: Upgrade db1101 (s7,s8)
* 10:36 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1101 (s7,s8) for upgrade', diff saved to https://phabricator.wikimedia.org/P17519 and previous config saved to /var/cache/conftool/dbconfig/20211019-103634-marostegui.json
* 10:34 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:31 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:29 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 10:28 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 10:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2026.codfw.wmnet
* 10:22 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:22 oblivian@deploy1002: Synchronized tests/WmfConfigServicesTest.php: Config: [[gerrit:731918{{!}}ProductionServices: use graphite2003 for statsd (T247963)]] (duration: 00m 54s)
* 10:22 godog: flip mw statsd traffic with https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/731918 - [[phab:T247963|T247963]]
* 10:21 oblivian@deploy1002: Synchronized wmf-config/ProductionServices.php: Config: [[gerrit:731918{{!}}ProductionServices: use graphite2003 for statsd (T247963)]] (duration: 00m 54s)
* 10:20 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:18 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2112.codfw.wmnet with OS stretch
* 10:16 marostegui@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db2112.codfw.wmnet with OS buster
* 09:52 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2112.codfw.wmnet with OS buster
* 09:50 marostegui@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db2112.codfw.wmnet with OS buster
* 09:44 hashar@deploy1002: Pruned MediaWiki: 1.38.0-wmf.3 (duration: 01m 39s)
* 09:42 hashar@deploy1002: Pruned MediaWiki: 1.38.0-wmf.2 (duration: 16m 06s)
* 09:37 godog: move graphite/statsd writes to graphite2003 - [[phab:T247963|T247963]]
* 09:34 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2112.codfw.wmnet with OS buster
* 09:27 hashar: sap clean --delete 1.38.0-wmf.2 && scap clean --delete 1.38.0-wmf.3  # [[phab:T281169|T281169]]
* 09:27 hashar: Cloned and applied security patches for 1.38.0-wmf.5 # [[phab:T281169|T281169]]
* 09:19 marostegui: Stop slave on db2112 [[phab:T290865|T290865]]
* 09:18 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 14 hosts with reason: Schema change s1 [[phab:T281058|T281058]]
* 09:18 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on 14 hosts with reason: Schema change s1 [[phab:T281058|T281058]]
* 09:03 XioNoX: push anycast tuning to all Telia transit links - [[phab:T288843|T288843]]
* 08:50 godog: point graphite.discovery.wmnet to graphite2003 - [[phab:T247963|T247963]]
* 08:40 XioNoX: push prep-work for anycast tuning to all sites - [[phab:T288843|T288843]]
* 08:33 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 13 hosts with reason: Schema change s8 [[phab:T281058|T281058]]
* 08:33 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on 13 hosts with reason: Schema change s8 [[phab:T281058|T281058]]
* 08:32 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript namespaceDupes.php hrwiki --fix
* 08:17 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:07 mvernon@cumin2002: conftool action : set/pooled=false; selector: name=codfw,dnsdisc=swift
* 08:07 mvernon@cumin2002: conftool action : set/pooled=false; selector: name=codfw,dnsdisc=swift-ro
* 08:03 XioNoX: push prep-work for anycast tuning in ulsfo (try 2) - [[phab:T288843|T288843]]
* 08:01 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 07:32 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 07:24 ema: A:cp start rolling varnish upgrades to 6.0.8-1wm1 [[phab:T292290|T292290]]
* 07:21 marostegui@cumin1001: dbctl commit (dc=all): 'db1178 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17517 and previous config saved to /var/cache/conftool/dbconfig/20211019-072111-root.json
* 07:15 marostegui@cumin1001: dbctl commit (dc=all): 'db1184 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17516 and previous config saved to /var/cache/conftool/dbconfig/20211019-071519-root.json
* 07:06 marostegui@cumin1001: dbctl commit (dc=all): 'db1178 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17515 and previous config saved to /var/cache/conftool/dbconfig/20211019-070607-root.json
* 07:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1184 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17514 and previous config saved to /var/cache/conftool/dbconfig/20211019-070016-root.json
* 06:51 marostegui@cumin1001: dbctl commit (dc=all): 'db1178 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17513 and previous config saved to /var/cache/conftool/dbconfig/20211019-065104-root.json
* 06:45 marostegui@cumin1001: dbctl commit (dc=all): 'db1184 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17512 and previous config saved to /var/cache/conftool/dbconfig/20211019-064512-root.json
* 06:38 marostegui@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db2112.codfw.wmnet with OS buster
* 06:36 marostegui@cumin1001: dbctl commit (dc=all): 'db1178 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17511 and previous config saved to /var/cache/conftool/dbconfig/20211019-063559-root.json
* 06:30 marostegui@cumin1001: dbctl commit (dc=all): 'db1184 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17510 and previous config saved to /var/cache/conftool/dbconfig/20211019-063008-root.json
* 06:20 marostegui@cumin1001: dbctl commit (dc=all): 'db1178 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17509 and previous config saved to /var/cache/conftool/dbconfig/20211019-062054-root.json
* 06:15 marostegui@cumin1001: dbctl commit (dc=all): 'db1184 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17508 and previous config saved to /var/cache/conftool/dbconfig/20211019-061505-root.json
* 06:06 marostegui: Upgrade dbstore1005
* 06:05 marostegui@cumin1001: dbctl commit (dc=all): 'db1178 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17507 and previous config saved to /var/cache/conftool/dbconfig/20211019-060551-root.json
* 06:04 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 06:03 marostegui: Upgrade db1184, db1178
* 06:01 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1178 for upgrade', diff saved to https://phabricator.wikimedia.org/P17506 and previous config saved to /var/cache/conftool/dbconfig/20211019-060123-marostegui.json
* 06:00 marostegui@cumin1001: dbctl commit (dc=all): 'db1184 (re)pooling @ 5%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17505 and previous config saved to /var/cache/conftool/dbconfig/20211019-060001-root.json
* 05:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1184 for upgrade', diff saved to https://phabricator.wikimedia.org/P17504 and previous config saved to /var/cache/conftool/dbconfig/20211019-055429-marostegui.json
* 05:51 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2112.codfw.wmnet with OS buster
* 05:46 marostegui: Reimage db2112 (s1 codfw master) [[phab:T290865|T290865]]
* 04:36 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 03:49 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 02:36 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:34 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:21 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 02:18 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 02:09 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:06 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:38 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer


== 2015-09-26 ==
== 2021-10-18 ==
* 19:04 hashar: restarting Jenkins. Just in case :-D
* 23:40 hoo: Updated the Wikidata property suggester with data from the 2021-10-04 JSON dump (with pre-applied [[phab:T132839|T132839]] workarounds)
* 02:23 logmsgbot: l10nupdate@tin Synchronized php-1.26wmf24/cache/l10n: l10nupdate for 1.26wmf24 (duration: 06m 48s)
* 23:16 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|b654980240d51fff3c6e9c48f7076d4609c2560f}}: Create an alias for the Draft namespace on hrwiki ([[phab:T291755|T291755]]) (duration: 00m 56s)
* 23:16 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:13 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:12 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript namespaceDupes.php --wiki=thwiktionary --fix # [[phab:T291761|T291761]]
* 23:10 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|abe777d28594da852e49ccb1c1597b2598f3e483}}: Create Rhymes namespace for thwiktionary ([[phab:T291761|T291761]]) (duration: 00m 57s)
* 23:04 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:01 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:56 legoktm@deploy1002: Synchronized php-1.38.0-wmf.4/includes/http/MWHttpRequest.php: Allow using a reverse proxy for local HTTP requests ([[phab:T288848|T288848]]) (duration: 00m 56s)
* 22:06 maryum: deployed security patch for [[phab:T293589|T293589]]
* 21:23 maryum: deployed security patch for [[phab:T293556|T293556]]
* 21:05 mutante: mwmaint1002 - sudo -u www-data /usr/local/bin/mw-cli-wrapper /usr/local/bin/mwscript extensions/TranslationNotifications/scripts/DigestEmailer.php --wiki mediawikiwiki {{!}} Fatal error: Uncaught Error: Class 'MediaWiki\MediaWikiServices' not found
* 20:58 mutante: mwmaint1002 - attempt to start mediawiki_job_translationnotifications-mediawikiwiki which was alerting as failed
* 20:41 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:38 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:46 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:42 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 19:29 mutante: LDAP: removed non-existent user gerrit2 from group labsadminbots ([[phab:T160122|T160122]])
* 19:29 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/MediaSearch/resources/store/state.js: {{Gerrit|ac7b4fc2ccc69589e00a42f49d18a8f6d71777f2}}: Revert 727328 ([[phab:T293554|T293554]]) (duration: 00m 56s)
* 19:29 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:26 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:12 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:09 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:45 otto@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Convert $wgEventStreams to be an associative array - [[phab:T277193|T277193]] (duration: 00m 57s)
* 18:45 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:42 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:07 mutante: gerrit - removed tonina from wmde-mediawiki gerrit group ([[phab:T293621|T293621]])
* 17:51 mutante: puppet run on all bastion hosts via cumin
* 15:32 mvernon@cumin2002: END (FAIL) - Cookbook sre.discovery.service-route (exit_code=99)
* 15:32 mvernon@cumin2002: START - Cookbook sre.discovery.service-route
* 15:23 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 18:00:00 on 7 hosts with reason: Schema change s3 [[phab:T281058|T281058]]
* 15:23 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 18:00:00 on 7 hosts with reason: Schema change s3 [[phab:T281058|T281058]]
* 15:16 herron: reprepro copied anycast-healthchecker, python3-json-logger and python3-anycast-healthchecker from buster-wikimedia to bullseye-wikimedia [[phab:T292196|T292196]]
* 15:16 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on 13 hosts with reason: Schema change s4 [[phab:T281058|T281058]]
* 15:16 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 3:00:00 on 13 hosts with reason: Schema change s4 [[phab:T281058|T281058]]
* 14:59 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 11 hosts with reason: Schema change s7 [[phab:T281058|T281058]]
* 14:59 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 11 hosts with reason: Schema change s7 [[phab:T281058|T281058]]
* 14:54 herron: rebuilt and uploaded kafkatee for bullseye [[phab:T292196|T292196]]
* 14:50 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:45 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 14:36 phuedx@deploy1002: Synchronized wmf-config/InitialiseSettings-labs.php: Config: [[gerrit:731346{{!}}[beta] Rename $wgIPInfoGeoIP2Path to $wgIPInfoGeoIP2Prefix (T289361)]] (duration: 00m 56s)
* 14:36 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:33 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:15 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:09 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:54 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:51 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:48 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings-labs.php: Config: [[gerrit:731015{{!}}Remove wmg variables for dispatch via jobs (T291828)]] (2/2) (duration: 00m 56s)
* 13:47 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:731015{{!}}Remove wmg variables for dispatch via jobs (T291828)]] (1/2) (duration: 00m 56s)
* 13:37 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:35 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/Wikibase.php: Config: [[gerrit:731014{{!}}Unconditionally enable Wikibase dispatching via jobs (T291828)]] (duration: 00m 56s)
* 13:34 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:22 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2079.codfw.wmnet with OS buster
* 12:04 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:02 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:55 Lucas_WMDE: UTC morning backport window done
* 11:55 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings-labs.php: Config: [[gerrit:730748{{!}}Remove $wmgWikibaseDispatchViaJobsAllowedClients (T291828)]] (2/2) (duration: 00m 56s)
* 11:54 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:730748{{!}}Remove $wmgWikibaseDispatchViaJobsAllowedClients (T291828)]] (1/2) (duration: 00m 56s)
* 11:53 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:51 marostegui@cumin1001: START - Cookbook sre.hosts.reimage for host db2079.codfw.wmnet with OS buster
* 11:50 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:49 marostegui: Reimage db2079 (codfw s8 master) [[phab:T290868|T290868]]
* 11:48 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/Wikibase.php: Config: [[gerrit:730747{{!}}Set dispatchViaJobsAllowedClients to null everywhere (T291828)]] (duration: 00m 56s)
* 11:41 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:38 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:37 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/Wikibase/repo/includes/ChangeModification/DispatchChangesJob.php: Backport: [[gerrit:731239{{!}}Make deduplication actually work for DispatchChangesJob (T291118)]] (duration: 00m 55s)
* 11:10 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/Wikibase/repo/includes/Hooks/RecentChangeSaveHookHandler.php: Backport: [[gerrit:731238{{!}}Create DispatchChangesJob without change id (T291118)]] (2/2) (duration: 00m 56s)
* 11:09 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:09 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/Wikibase/repo/includes/ChangeModification/DispatchChangesJob.php: Backport: [[gerrit:731238{{!}}Create DispatchChangesJob without change id (T291118)]] (duration: 00m 56s)
* 11:07 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:55 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:51 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:47 moritzm: copied wmf-certificates from buster-wikimedia to stretch-wikimedia in reprepro
* 10:38 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/Wikibase/repo/: Backport: [[gerrit:731237{{!}}Don't filter by change Id when dispatching to client wikis ()]] (duration: 00m 59s)
* 09:48 moritzm: installing node-tar security updates on buster
* 09:39 vgutierrez: updating acme-chief to version 0.34 on acmechief instances - [[phab:T292619|T292619]]
* 09:38 godog: sync metrics from graphite1004 to graphite2003 - [[phab:T247963|T247963]]
* 09:13 moritzm: installing apr security updates on bullseye
* 08:57 godog: cleanup graphite metrics not modified for >= ~3yr (1024 days)
* 07:34 ema: cp3060 (text), cp3061 (upload): upgrade varnish to 6.0.8 [[phab:T292290|T292290]]
* 07:34 elukey: depool + restart blazegraph on wdqs1013
* 07:01 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 06:31 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 06:09 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .


== 2015-09-25 ==
== 2021-10-16 ==
* 23:54 logmsgbot: ori@tin Synchronized php-1.26wmf24/includes/GlobalFunctions.php: 37c6972f94: Made wfIsBadImage() use APC (duration: 00m 17s)
* 03:56 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 22:16 mutante: sodium - shutting down
* 02:19 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 21:53 ori: Armed new servicedeploy_rsa on mira
* 01:30 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 21:51 ori: Armed new servicedeploy_rsa on tin
* 20:45 urandom: starting a Cassandra repair on xeon (nodetool repair -pr)
* 19:30 mutante: restarted apache on gallium
* 19:07 bblack: configuring and enabling lvs-cross-row ports on asw-b-eqiad for lvs1007,8,10,11
* 18:04 jynus: performing schema change on m5-master "nova"
* 17:26 jynus: restarting and upgrading db1051 mysql (depooled)
* 17:20 godog: reboot ms-be1005, xfs
* 16:22 cwdent: updated payments from dc78ff5157b59a8f475dc86194a1059c2d6b2fad to 3b0915a51a0fd567bdf22f3d4e17548a83e735d8
* 16:17 akosiaris: duplicating database otrs to otrsupgradetest for testing the upgrade procedure
* 16:16 cmjohnson1: cp1046 stopped icinga checks for hardware troubleshooting
* 14:43 moritzm: installed rpcbind and apport security updates on various servers
* 14:36 logmsgbot: krenair@tin Synchronized php-1.26wmf24/includes/specials/SpecialMovepage.php: https://gerrit.wikimedia.org/r/#/c/241045/ (duration: 00m 17s)
* 14:22 chasemp: 'sudo -u hdfs hdfs haadmin -transitionToActive analytics1001-eqiad-wmnet' per otto on analytics1001
* 14:03 urandom: starting a Cassandra repair on restbase1006 (nodetool repair -pr -dc eqiad)
* 13:58 hashar: nodepool back in operations
* 13:44 moritzm: restarted saltmaster on palladium
* 13:12 mobrovac: restbase deploying e42bf0fc
* 13:12 moritzm: added debdeploy 0.0.7 to carbon
* 13:10 godog: enable puppet on restbase in production
* 12:54 andrewbogott: restarted rabbitmq-server on labcontrol1001
* 12:52 godog: stop puppet on restbase in production -- config deployment
* 12:46 hashar: stopping nodepool to clear out left over mysql connections
* 12:21 hashar: Nodepool is dead, or at least not adding new slaves anymore
* 12:16 hashar: Seems Zuul/Jenkins is in trouble somehow :-/
* 11:42 andrewbogott: upgrading linux-image-generic on labnet1002 to get us away from the recently-crashed 3.13.0-59
* 04:58 logmsgbot: ori@tin Synchronized wmf-config/CommonSettings.php: I252970886: Set "async" for SQL parser cache everywhere else (duration: 00m 18s)
* 02:57 Krinkle: sync-common failed on mw1010.eqiad.wmnet
* 02:53 logmsgbot: krinkle@tin Synchronized php-1.26wmf24/extensions/DonationInterface: 381faf5 (duration: 00m 21s)
* 02:33 logmsgbot: l10nupdate@tin Synchronized php-1.26wmf24/cache/l10n: l10nupdate for 1.26wmf24 (duration: 11m 45s)
* 01:44 Krinkle: mwscript deleteEqualMessages.php --wiki roa_tarawiki
* 01:42 Krinkle: mwscript deleteEqualMessages.php --wiki itwikiquote
* 01:42 Krinkle: mwscript deleteEqualMessages.php --wiki alswiki
* 01:28 logmsgbot: krinkle@tin Synchronized php-1.26wmf24/extensions/WikimediaEvents: I5608f8ffd1c - Fix trailing comma (duration: 00m 17s)
* 00:10 logmsgbot: krinkle@tin Synchronized php-1.26wmf24/extensions/EventLogging/modules/ext.eventLogging.core.js: Increase maxUrlSize to 2000 (duration: 00m 17s)


== 2015-09-24 ==
== 2021-10-15 ==
* 23:02 logmsgbot: krenair@tin Synchronized wmf-config/throttle.php: https://gerrit.wikimedia.org/r/#/c/240880/ (duration: 00m 17s)
* 23:48 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 21:18 ejegg: updated SmashPig from d1baa32267eaad7d69b47c657f4853eb306fad6b to bf302444eae8236734fd43883b06c7b2512b1532
* 23:27 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 21:15 ejegg: updated payments-wiki from 8428499feb8760d63faf681d53995697a2ba0fa7 to dc78ff5157b59a8f475dc86194a1059c2d6b2fad
* 23:23 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 19:57 logmsgbot: ori@tin Synchronized php-1.26wmf24/extensions/ContentTranslation: d079d5dd71: Updated mediawiki/core Project: mediawiki/extensions/ContentTranslation  8559ee614975f25b71a732ca0fb1bb6d489c9d33 (duration: 00m 18s)
* 22:38 mutante: apt1001 - removing nginx package, accidentally installed, should just be nginx-light of course, running puppet
* 19:35 bblack: depooled cp1046 from confd, committed pybal depool for LVS as well
* 22:36 mutante: apt2001 - removing nginx package, accidentally installed, should just be nginx-light of course, running puppet
* 19:34 chasemp: changing labs route on cr1 and cr2 from 10.68.16.0/22 to 10.68.16.0/21 which matches references, fw setting and manifests/network.pp
* 22:34 mutante: apt2001 - upgraded nginx
* 18:54 logmsgbot: catrope@tin Synchronized php-1.26wmf24/extensions/Flow/: Debugging for FlowFixLinks.php (duration: 00m 20s)
* 22:18 dzahn@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 18:21 logmsgbot: twentyafterfour@tin rebuilt wikiversions.cdb and synchronized wikiversions files: wikipedia wikis to 1.26wmf24
* 22:14 dzahn@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 18:20 legoktm: moved oauthadmin group from User:Yuvipanda@metawiki to User:YuviPanda@metawiki
* 22:05 dpifke@deploy1002: Finished deploy [performance/arc-lamp@40cb764]: Revert problematic arclamp patch to fix daemon crashes (duration: 00m 05s)
* 18:19 godog: restart restbase on restbase1005
* 22:05 dpifke@deploy1002: Started deploy [performance/arc-lamp@40cb764]: Revert problematic arclamp patch to fix daemon crashes
* 18:18 godog: restart restbase on restbase1004
* 21:51 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 18:17 ejegg: reenabling CRM jenkins jobs
* 21:44 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 18:16 godog: restart restbase on restbase1003
* 21:44 dzahn@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 18:08 ejegg: updated civicrm from 9fa38d06a75363a8009bce7ced190e39c75b68bc to 190f689ff7aec7fecefdf5af501293685c55e041
* 21:36 dzahn@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 18:06 paravoid: depooling cp1046, stability issues
* 20:09 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 18:05 ejegg|afk: disabled CRM jenkins jobs
* 18:44 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 18:00 logmsgbot: demon@tin Synchronized multiversion/MWRealm.php: (no message) (duration: 00m 17s)
* 17:20 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 17:59 ori: Merged Apache config change Ia095457fb. It will refresh the Apache service as it rolls out, causing elevated 503s for the next 20 minutes.
* 17:17 mutante: gitlab1001 - disabling puppet for debugging
* 17:53 godog: rolling restart restbase in eqiad
* 17:05 mutante: gitlab2001 - temp stopped puppet - debugging gitlab restore script with Arnold - [[phab:T283076|T283076]]
* 17:35 chasemp: powercycling cp1046 at mgmt as I can't ssh in and it seems like it should be up
* 17:01 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 17:26 godog: bounce restbase on restbase1002, apply new datacenter config
* 16:50 mutante: gitlab2001 - temp stopped puppet - debugging gitlab restore script with Arnold
* 17:10 _joe_: cleaning up /tmp on mw1152
* 16:46 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 17:09 cmjohnson1: powering down for the last time es1001 - es1010
* 16:44 jayme@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'shellbox-constraints' for release 'main' .
* 16:17 logmsgbot: thcipriani@tin Synchronized php-1.26wmf23/extensions/Wikidata: SWAT: Do not filter affected pages by namespace [[gerrit:240727]] (duration: 00m 26s)
* 15:23 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 16:01 robh: nothing on puppet swat window, easiest swat ever.
* 15:23 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 15:46 logmsgbot: thcipriani@tin Synchronized php-1.26wmf24/extensions/Wikidata: SWAT: Do not filter affected pages by namespace [[gerrit:240711]] (duration: 00m 26s)
* 15:08 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 15:23 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable suggestions in ca, en, es, fr, it, ja, tr, ru, zh [[gerrit:240638]] (duration: 00m 17s)
* 15:08 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 14:37 paravoid: repooling codfw
* 14:48 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:54 bblack: restarting varnish daemons on second half of maps, parsoid, misc clusters (package upgrade, shm_reclen change)
* 14:31 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:50 bblack: restarting varnishd instances on text, mobile, upload clusters for package upgrade (slow salt, no parallelism, ~5m spacing - FE cache loss, BE cache stays, should take ~9h)
* 14:15 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:05 moritzm: installed rpcbind security updates on eeden, baham, radon, maerlant, rhenium
* 13:32 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 11:56 bblack: restarting varnish daemons on half of maps, parsoid, misc clusters (package upgrade, shm_reclen change)
* 13:32 ryankemper@cumin1001: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99)
* 11:36 bblack: reinstall lvs300[12] to jessie - T96375
* 13:32 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 11:21 akosiaris: killed tail -f varnishncsa.log on cp1065 and ran apt-get clean to reclaim some disk space
* 13:30 elukey: start topic rebalancing for kafka main-eqiad (long maintenance, it will last a couple of days)
* 11:14 bblack: stopping pybal on lvs300[12]; lvs300[34] taking over
* 13:24 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 11:07 bblack: upgrading varnishes to 3.0.6plus-wm8 (non-restarting, just pkg update on-disk)
* 13:21 vgutierrez: updating acme-chief to version 0.34 on acmechief-test instances - [[phab:T292619|T292619]]
* 09:40 jynus: performing latest (software) steps to decom es1001-es1010 (puppet disabling, etc.)
* 13:19 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 08:39 jynus: restarted HHVM @ mw1056, 1104, 1122
* 13:14 vgutierrez: upload acme-chief 0.34 to apt.wikimedia.org (buster) - [[phab:T292619|T292619]]
* 05:33 yuvipanda: deleted logstash indexes for 08/27 and 28 too
* 11:55 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 05:31 yuvipanda: deleted indexes for 08/14, 15, 25, 26 on logstash
* 11:49 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 03:59 yuvipanda: restarting elasticsearch in logstash1001-3
* 11:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host testvm2007.codfw.wmnet
* 03:53 yuvipanda: restarted es on logstash1004-6
* 11:45 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 03:02 yuvipanda: jstack dumped logstash output onto /home/yuvipanda/stack on logstash1001 since strace seems useles
* 11:33 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:51 yuvipanda: restarted logstash on logstash1002
* 11:24 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2007.codfw.wmnet
* 02:41 yuvipanda: gmond at 100% again, killing it and stopping puppet again
* 11:14 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:40 yuvipanda: re-enabling and running puppet on hafnium to see what it's bringing up
* 10:46 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 02:38 logmsgbot: l10nupdate@tin Synchronized php-1.26wmf23/cache/l10n: l10nupdate for 1.26wmf23 (duration: 06m 30s)
* 09:15 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 02:23 yuvipanda: kill gmond on hafnium and disable puppet to prevent it from taking it back up. Was taking 100% CPU
* 09:06 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 02:16 Krinkle: Kibana/Logstash outage. Zero events received after 2015-09-23T23:59:59.999Z.
* 08:58 jelto: jelto@gitlab1001:~$ sudo disable-puppet "disable puppet on gitlab1001 to test 728380 on GitLab replica - [[phab:T283076|T283076]]"
* 02:14 Krinkle: Partial EventLogging outage (client-side events via hafnium abruptly stopped 2015-09-23 11:36 UTC - 15 hours ago)
* 07:41 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 01:53 mutante: started logstash on logstash1002 again
* 06:20 urbanecm: Start server-side upload for 1 video file
* 01:35 mutante: bast1001: unmounting /srv/home_pmtpa (backup on bacula)
* 02:14 ryankemper: [[phab:T288231|T288231]] `wdqs2006` data transfer complete and all tests passing on the host. All of `codfw wdqs-internal` is on the new streaming updater
* 01:34 mutante: removing subversion packages from bast1001
* 00:09 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 01:15 logmsgbot: ori@tin Synchronized php-1.26wmf24/includes: Ifa0d4cfe8e3: Backport I1ff61153d and I8e4c3d5a5 (duration: 00m 23s)
* 00:07 brennen: end of UTC late backport & config training window
* 00:19 jynus: restarted replication on db1051
* 00:17 ori: restarting tcpircbot on neon
* 00:16 mutante: started logstash on logstash1002
* 00:08 bblack: varnish package on carbon for jessie updated to 3.0.6plus-wm8


== 2015-09-23 ==
== 2021-10-14 ==
* 23:17 logmsgbot: krenair@tin Synchronized php-1.26wmf24/includes/specials/SpecialSearch.php: https://gerrit.wikimedia.org/r/#/c/240596/ (duration: 00m 18s)
* 23:59 cjming@deploy1002: Synchronized wmf-config/logos.php: Config: [[gerrit:730737{{!}}Change Kashmiri Wikipedia logo (T293342)]] (duration: 00m 55s)
* 23:09 logmsgbot: krenair@tin Synchronized wmf-config/throttle.php: https://gerrit.wikimedia.org/r/#/c/240575/ (duration: 00m 17s)
* 23:58 cjming@deploy1002: Synchronized logos/config.yaml: Config: [[gerrit:730737{{!}}Change Kashmiri Wikipedia logo (T293342)]] (duration: 00m 55s)
* 22:53 logmsgbot: ori@tin Synchronized php-1.26wmf24/includes/resourceloader/ResourceLoaderFileModule.php: 14f46330d9: Backport fix from PS12 of I1ff6115 (duration: 00m 17s)
* 23:56 cjming@deploy1002: Synchronized static/images/project-logos: Config: [[gerrit:730737{{!}}Change Kashmiri Wikipedia logo (T293342)]] (duration: 00m 56s)
* 22:39 paravoid: cr1-codfw RE switchover(s)
* 23:49 cjming@deploy1002: Synchronized wmf-config/logos.php: Config: [[gerrit:730736{{!}}Change Kashmiri Wiktionary logo (T293373)]] (duration: 00m 55s)
* 22:35 logmsgbot: yurik@tin Synchronized php-1.26wmf24/extensions/ZeroBanner: Deploying ZeroBanner interstitial handling (duration: 00m 18s)
* 23:48 cjming@deploy1002: Synchronized logos/config.yaml: Config: [[gerrit:730736{{!}}Change Kashmiri Wiktionary logo (T293373)]] (duration: 00m 55s)
* 21:13 logmsgbot: ori@tin Synchronized php-1.26wmf24/includes/resourceloader/ResourceLoaderFileModule.php: 58bfb6f85b: Backport fix from PS9 of I1ff6115 (duration: 00m 17s)
* 23:46 cjming@deploy1002: Synchronized static/images/project-logos: Config: [[gerrit:730736{{!}}Change Kashmiri Wiktionary logo (T293373)]] (duration: 00m 56s)
* 20:46 subbu: deployed parsoid 6619409e
* 23:43 ejegg: updated payments-wiki from {{Gerrit|19d18c1852}} to {{Gerrit|0f48acea49}}
* 19:58 urandom: starting Cassandra on restbase2006
* 23:34 cjming@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/WikimediaEvents/includes/VectorPrefDiffInstrumentation.php: Backport: [[gerrit:730733{{!}}Change VectorPrefDiffInstrumentation stream name to `mediawiki.skin_diff` (T289622)]] (duration: 00m 56s)
* 19:56 urandom: enabling puppet, and forcing a run on restbase2006
* 23:24 cjming@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:730936{{!}}allow sysops to add and remove users to other groups on ptwikivoyage (T292806)]] (duration: 00m 56s)
* 19:52 urandom: starting Cassandra on restbase2005
* 23:21 ryankemper@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 19:50 urandom: enabling puppet, and forcing a run on restbase2005
* 23:11 brennen@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:730933{{!}}Add americanantiquarian.org to the wgCopyUploadsDomains allowlist of Wikimedia Commons (T292918)]] (duration: 00m 57s)
* 19:48 urandom: starting Cassandra on restbase2004
* 23:11 mutante: mw1452 - re-pooled, scap pull
* 19:45 urandom: enabling puppet, and forcing a run on restbase2004
* 23:09 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 18:55 twentyafterfour: snapshot1001.eqiad.wmnet returned [12]: rsync: write failed on "/srv/mediawiki/wikiversions.cdb": No space left on device
* 22:35 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 18:54 logmsgbot: twentyafterfour@tin rebuilt wikiversions.cdb and synchronized wikiversions files: group1 wikis to 1.26wmf24
* 22:35 ryankemper: [[phab:T288231|T288231]] Ran puppet on `wdqs2006`, now back to the cookbook run
* 18:01 mobrovac: restbase deploying f65313ed
* 22:33 ryankemper: [[phab:T288231|T288231]] Forgot about running puppet-agent on `wdqs2006`; aborted cookbook run
* 17:49 urandom: starting Cassandra on restbase2003
* 22:33 ryankemper@cumin1001: END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97)
* 17:45 logmsgbot: demon@tin Synchronized php-1.26wmf23/extensions/Wikidata: (no message) (duration: 00m 25s)
* 22:33 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 17:35 urandom: enabling, and forcing puppet run on restbase2003
* 22:32 ryankemper: [[phab:T288231|T288231]] Merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/730795; proceeding to data-transfer on `wdqs2006`: `sudo rm -fv /srv/wdqs/data_loaded` on `wdqs2006` followed by `ryankemper@cumin1001:~$ sudo cookbook sre.wdqs.data-transfer --source wdqs2008.codfw.wmnet --dest wdqs2006.codfw.wmnet --reason "streaming updater cutover for wdqs2005" --blazegraph_instance blazegraph --task-id [[phab:T288231|T288231]]`
* 17:27 logmsgbot: demon@tin Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 13s)
* 22:31 mutante: depooling mw1452 for testig
* 17:20 logmsgbot: ori@tin scap failed: OSError [Errno 2] No such file or directory: '/var/lock/scap' (duration: 44m 24s)
* 22:28 ryankemper: [[phab:T288231|T288231]] `ryankemper@wdqs2005:~$ sudo pool`: transfer completed successfully; tests passing on host (used `ssh -L 9999:localhost:80 wdqs2005.codfw.wmnet` to establish tunnel)
* 17:11 logmsgbot: ori@tin Synchronized php-1.26wmf24/includes/page/Article.php: (no message) (duration: 00m 13s)
* 22:23 dpifke@deploy1002: Finished deploy [performance/arc-lamp@84fe496]: New flamegraph.pl from upstream [[phab:T291898|T291898]] (duration: 00m 05s)
* 17:11 logmsgbot: ori@tin Synchronized php-1.26wmf23/includes/page/Article.php: (no message) (duration: 00m 17s)
* 22:23 dpifke@deploy1002: Started deploy [performance/arc-lamp@84fe496]: New flamegraph.pl from upstream [[phab:T291898|T291898]]
* 16:35 logmsgbot: ori@tin Started scap: (no message)
* 22:17 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 16:35 logmsgbot: ori@tin Synchronized php-1.26wmf23/includes/page: I952068d2d: Reduced the DOS potential of 404 page floods (duration: 00m 12s)
* 22:07 eileen: civicrm revision changed from {{Gerrit|018d3b19fe}} to {{Gerrit|9b5e0d015b}}, config revision is {{Gerrit|781d6a1b1f}}
* 16:22 godog: start cassandra on restbase2002
* 21:34 robh@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:05 godog: deploy restbase daacf4d on restbase2*
* 21:25 robh@cumin1001: START - Cookbook sre.dns.netbox
* 15:34 logmsgbot: demon@tin Synchronized php-1.26wmf24/extensions/Wikidata: (no message) (duration: 00m 20s)
* 21:10 robh@cumin1001: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 15:34 logmsgbot: demon@tin Synchronized php-1.26wmf23/extensions/Wikidata: (no message) (duration: 00m 21s)
* 21:06 robh@cumin1001: START - Cookbook sre.dns.netbox
* 15:34 chasemp: unbanning elastic1005 for T112559
* 19:45 dancy@deploy1002: rebuilt and synchronized wikiversions files: group2 wikis to 1.38.0-wmf.4  refs [[phab:T281168|T281168]]
* 15:32 logmsgbot: demon@tin scap aborted: (no message) (duration: 02m 07s)
* 19:23 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 15:30 logmsgbot: demon@tin Started scap: (no message)
* 19:05 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 15:15 logmsgbot: demon@tin Synchronized php-1.26wmf22/extensions/Wikidata: (no message) (duration: 00m 20s)
* 18:53 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 14:13 logmsgbot: demon@tin Synchronized wmf-config/abusefilter.php: (no message) (duration: 00m 12s)
* 18:53 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript namespaceDupes.php --wiki=dagwiki --fix
* 14:08 logmsgbot: demon@tin Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 12s)
* 18:47 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=frwiktionary --logwiki=metawiki 'TURK FASTER' 'ARTHUR MORGAN'
* 14:05 logmsgbot: demon@tin Synchronized wmf-config/CommonSettings.php: (no message) (duration: 00m 12s)
* 18:42 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=metawiki --logwiki=metawiki 'George Dum Fulton' 'George Fulton' # [[phab:T293403|T293403]]
* 14:04 logmsgbot: demon@tin Synchronized langlist-labs: (no message) (duration: 00m 11s)
* 18:41 urbanecm: UTC evening B&C done
* 14:04 logmsgbot: demon@tin Synchronized docroot/noc/conf/: (no message) (duration: 00m 13s)
* 18:40 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/MediaSearch/extension.json: {{Gerrit|6da3523daaba85a4199721980c0a9c96b20697e7}}: Fix assessment quickview labels ([[phab:T292596|T292596]]) (duration: 01m 03s)
* 14:01 logmsgbot: demon@tin Synchronized multiversion/MWRealm.php: (no message) (duration: 00m 11s)
* 18:37 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|c8dffefd0d095abe3709dcc962d5d24f27b55869}}: Create Salima namespace for dagwiki ([[phab:T289911|T289911]]) (duration: 01m 04s)
* 13:56 godog: force puppet run on restbase2001 to deploy new cassandra config
* 18:30 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 13:28 godog: reboot ms-be1012, xfs hosed
* 18:25 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|0bccd4bc45498db8628567574d0bb3a23f8fb378}}: Add $wgSitename and $wgMetaNamespace for kswiki and kswiktionary ([[phab:T289752|T289752]], [[phab:T289767|T289767]]) (duration: 01m 04s)
* 13:22 paravoid: upgrading cr1-codfw with newer junos
* 18:17 ryankemper@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)
* 07:43 logmsgbot: demon@tin Synchronized wmf-config/CommonSettings.php: semver the special:version hook (duration: 00m 12s)
* 18:14 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|262e588b44f126fb9e1aa933a3ca59b191b42bd7}}: Enable Growth mentor dashboard backend on all wikis ([[phab:T278920|T278920]]) (duration: 01m 05s)
* 03:05 logmsgbot: l10nupdate@tin Synchronized php-1.26wmf24/cache/l10n: l10nupdate for 1.26wmf24 (duration: 10m 12s)
* 18:07 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|41baa8c41d64510986f009b9be2d70dad0915f8c}}: Add new mediawiki.skin_diff event logging stream ([[phab:T289622|T289622]]) (duration: 01m 05s)
* 02:38 logmsgbot: l10nupdate@tin LocalisationUpdate completed (1.26wmf23) at 2015-09-23 02:38:44+00:00
* 18:03 addshore@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 02:35 logmsgbot: l10nupdate@tin Synchronized php-1.26wmf23/cache/l10n: l10nupdate for 1.26wmf23 (duration: 06m 21s)
* 18:02 addshore@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 00:56 logmsgbot: legoktm@tin Synchronized php-1.26wmf24/extensions/Echo/Hooks.php: Remove duplicate 'MediaWiki' prefix from echo.unseen stats (duration: 00m 12s)
* 18:01 addshore@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'staging' .
* 00:03 RoanKattouw: Running FlowFixLinks.php on testwiki
* 17:54 bd808@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 17:52 rzl: repooled mw1452 (with `sudo pool` so no auto log from conftool)
* 17:47 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 17:45 rzl@cumin1001: conftool action : set/pooled=no; selector: name=mw1452.eqiad.wmnet
* 17:42 rzl: depool mw1452 for training
* 17:32 addshore@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 17:31 addshore@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 17:29 addshore@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'staging' .
* 16:44 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 16:44 ryankemper: [[phab:T288231|T288231]] Manually killed dangling `pigz` / `nc` processes on `wdqs2008` (and `wdqs2005` implicitly). Should be in the right state to re-start the `data-transfer` cookbook from again
* 16:41 ryankemper@cumin1001: END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97)
* 16:37 elukey: drop kubeflow-kfserving* docker images from deneb
* 16:36 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 16:34 ryankemper@cumin1001: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99)
* 16:33 moritzm: installing node-ansi-regex security updates
* 16:28 mbsantos@deploy1002: Finished deploy [kartotherian/deploy@4bff2d1]: Force mirrored traffic to 0% for everywhere (duration: 02m 24s)
* 16:25 mbsantos@deploy1002: Started deploy [kartotherian/deploy@4bff2d1]: Force mirrored traffic to 0% for everywhere
* 16:24 dancy@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/Collection/includes/CollectionHooks.php: Backport: [[gerrit:730580{{!}}Check that the timestamp  key/value is set to avoid undefined offset (T293300)]] (duration: 01m 04s)
* 16:16 mbsantos@deploy1002: Finished deploy [kartotherian/deploy@071f7c3]: Increase mirrored traffic to 100% for eqiad (duration: 02m 41s)
* 16:14 mbsantos@deploy1002: Started deploy [kartotherian/deploy@071f7c3]: Increase mirrored traffic to 100% for eqiad
* 16:08 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 16:07 ryankemper@cumin1001: END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97)
* 16:07 ryankemper: [[phab:T288231|T288231]] About to ctrl+c out of ongoing data transfer because puppet run following merge of https://gerrit.wikimedia.org/r/c/operations/puppet/+/730794 restarted blazegraph; we'll manually disable updater and kick off the transfer again
* 16:04 ryankemper: [[phab:T288231|T288231]] `ryankemper@wdqs2005:~$ sudo run-puppet-agent --force`
* 15:56 ryankemper@cumin1001: START - Cookbook sre.wdqs.data-transfer
* 15:54 ryankemper: [[phab:T288231|T288231]] `ryankemper@wdqs2008:~$ sudo depool`
* 15:52 ryankemper: [[phab:T288231|T288231]] `ryankemper@wdqs2005:~$ sudo depool`
* 15:22 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti2026.codfw.wmnet to ganeti-test01.svc.codfw.wmnet
* 15:20 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2026.codfw.wmnet to ganeti-test01.svc.codfw.wmnet
* 15:13 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 15:06 dancy@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/VisualEditor/includes/VisualEditorHooks.php: Backport: [[gerrit:730729{{!}}Fix value of 'namespacesWithSubpages' in wgVisualEditorConfig (T293310)]] (duration: 01m 04s)
* 15:02 dancy@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/Collection/includes/CollectionHooks.php: Backport: [[gerrit:730580{{!}}Check that the timestamp  key/value is set to avoid undefined offset (T293300)]] (duration: 01m 03s)
* 15:00 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti2026.codfw.wmnet to ganeti-test01.svc.codfw.wmnet
* 14:59 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti2026.codfw.wmnet to ganeti-test01.svc.codfw.wmnet
* 14:53 kormat: upgrading orchestrator.wm.o to 3.2.6-1 [[phab:T275784|T275784]]
* 14:49 jbond@cumin1001: conftool action : set/pooled=true; selector: name=eqiad,dnsdisc=apt
* 14:43 jbond: migrate apt.w.o to a dns active/passiev discovery address (cc moritzm)
* 14:23 moritzm: installing krb5 security updates on KDCs
* 14:19 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 14:10 urbanecm@deploy1002: Synchronized dblists/growthexperiments.dblist: {{Gerrit|b35adfc59eec9c19b509bb9439cdfe33978a4f8b}}: Deploy Growth wikis to 4 wikis in dark mode ([[phab:T291826|T291826]]; 2/2) (duration: 01m 03s)
* 14:07 urbanecm: Run extensions/GrowthExperiments/initWikiConfig.php for ganwiki, iuwiki, tgwiki ([[phab:T291826|T291826]])
* 14:07 urbanecm: Create growthexperiments DB tables for ganwiki, iuwiki, tgwiki ([[phab:T291826|T291826]])
* 14:06 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 14:05 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 14:05 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 14:04 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|b35adfc59eec9c19b509bb9439cdfe33978a4f8b}}: Deploy Growth wikis to 4 wikis in dark mode ([[phab:T291826|T291826]]; 1/2) (duration: 01m 04s)
* 14:03 urbanecm@deploy1002: Synchronized dblists/visualeditor-nondefault.dblist: {{Gerrit|82d0a4bf45126ecba2cfcd1a0c2081a00f58dca3}}: Enable VE by default on 4 more wikis ([[phab:T290614|T290614]]) (duration: 01m 05s)
* 13:56 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 13:55 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 13:54 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 13:54 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 13:54 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 13:52 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 13:52 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 13:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2026.codfw.wmnet
* 13:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2026.codfw.wmnet
* 13:33 ayounsi@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:14 kormat: uploaded orchestrator 3.2.6-1 packages to apt.wm.o (buster) [[phab:T275784|T275784]]
* 12:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti2026.codfw.wmnet with OS buster
* 12:44 ayounsi@cumin1001: START - Cookbook sre.dns.netbox
* 12:42 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10 days, 0:00:00 on cloudbackup2002.codfw.wmnet with reason: working on cinder backupse
* 12:42 aborrero@cumin1001: START - Cookbook sre.hosts.downtime for 10 days, 0:00:00 on cloudbackup2002.codfw.wmnet with reason: working on cinder backupse
* 12:19 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/Wikibase.php: Config: [[gerrit:730746{{!}}Untangle “dispatch via jobs” settings in Wikibase.php (T291828)]] (no-op) (duration: 01m 04s)
* 12:12 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:730725{{!}}Set wmgWikibaseDispatchViaJobsPruneChangesTableInJobEnabled for wikidatawiki (T291828)]] (no-op) (duration: 01m 05s)
* 11:47 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2026.codfw.wmnet with OS buster
* 11:17 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2006.codfw.wmnet
* 11:10 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2006.codfw.wmnet
* 11:10 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:01 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:52 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2002.codfw.wmnet
* 10:38 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2002.codfw.wmnet
* 10:38 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2001.codfw.wmnet
* 10:35 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/GrowthExperiments/: {{Gerrit|1f33fc3}}, {{Gerrit|e0ea1b8}}, {{Gerrit|cba2ac9}}: GrowthExperiments backports ([[phab:T290609|T290609]]) (duration: 01m 05s)
* 10:33 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/GrowthExperiments/: {{Gerrit|465b564}}, {{Gerrit|a8cc98b}}, {{Gerrit|6e95c48}}: GrowthExperiments backports ([[phab:T290609|T290609]]) (duration: 01m 06s)
* 10:32 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2001.codfw.wmnet
* 09:20 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 09:20 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 09:19 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 09:19 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 09:19 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 09:19 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 09:18 volans@deploy1002: Finished deploy [debmonitor/deploy@ab62ac5]: Release v0.3.1 (duration: 00m 50s)
* 09:17 volans@deploy1002: Started deploy [debmonitor/deploy@ab62ac5]: Release v0.3.1
* 09:04 volans@deploy1002: Finished deploy [debmonitor/deploy@444b931]: Release v0.3.1 (duration: 00m 45s)
* 09:03 volans@deploy1002: Started deploy [debmonitor/deploy@444b931]: Release v0.3.1
* 09:02 volans@deploy1002: Finished deploy [debmonitor/deploy@444b931]: Release v0.3.1 (duration: 00m 23s)
* 09:02 volans@deploy1002: Started deploy [debmonitor/deploy@444b931]: Release v0.3.1
* 08:52 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 08:52 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 08:51 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 08:51 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 08:22 volans: rolling out debmonitor-client upgrade to 0.3.1 across the fleet
* 07:25 oblivian@cumin1001: END (FAIL) - Cookbook sre.discovery.service-route (exit_code=99)
* 07:25 oblivian@cumin1001: START - Cookbook sre.discovery.service-route
* 07:25 oblivian@cumin1001: END (FAIL) - Cookbook sre.discovery.service-route (exit_code=99)
* 07:25 oblivian@cumin1001: START - Cookbook sre.discovery.service-route
* 07:24 oblivian@cumin1001: END (FAIL) - Cookbook sre.discovery.service-route (exit_code=99)
* 07:24 oblivian@cumin1001: START - Cookbook sre.discovery.service-route
* 07:18 filippo@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=swift-ro,name=eqiad
* 07:18 filippo@cumin1001: conftool action : set/pooled=true; selector: dnsdisc=swift,name=eqiad
* 07:17 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 06:37 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 01:52 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 01:50 foks: changing user email for "Region of Peel Archives"
* 01:41 ejegg: updated payments-wiki from {{Gerrit|b329d2dea2}} to {{Gerrit|19d18c1852}}
* 01:35 bd808@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 01:31 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .


== 2015-09-22 ==
== 2021-10-13 ==
* 23:40 mutante: renaming search mailing lists to discovery mailing lists
* 23:37 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 23:35 logmsgbot: krenair@tin Synchronized php-1.26wmf24/extensions/Echo: https://gerrit.wikimedia.org/r/#/c/240283/ and https://gerrit.wikimedia.org/r/#/c/240281/ (duration: 00m 13s)
* 23:36 eileen: civicrm revision changed from {{Gerrit|946dfb6c5a}} to {{Gerrit|018d3b19fe}}, config revision is {{Gerrit|85277466ed}}
* 23:18 logmsgbot: krenair@tin Synchronized wmf-config/abusefilter.php: https://gerrit.wikimedia.org/r/#/c/240278/ (duration: 00m 12s)
* 23:36 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:730575{{!}}Create an alias for the project namespace on kswiki (T291740)]] (duration: 01m 05s)
* 23:16 logmsgbot: krenair@tin Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/240259/ (duration: 00m 12s)
* 22:30 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 23:15 logmsgbot: krenair@tin Synchronized w/static/images/sul/wikimania.png: https://gerrit.wikimedia.org/r/#/c/239308/ (duration: 00m 11s)
* 22:01 dancy@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/Collection/includes/Specials/SpecialCollection.php: Backport: [[gerrit:730578{{!}}Api: Avoid trying to access undefined offset in a user's collection (T293261)]] (duration: 01m 04s)
* 23:14 logmsgbot: krenair@tin Synchronized w/static/images/sul/commons.png: https://gerrit.wikimedia.org/r/#/c/239308/ (duration: 00m 12s)
* 21:50 dancy@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/Collection: Backport: [[gerrit:730577{{!}}Api: Avoid trying to access undefined offset in a user's collection (T293261)]] (duration: 01m 04s)
* 22:44 logmsgbot: catrope@tin Synchronized wmf-config/CommonSettings.php: Set $wgFlowMigrateReferenceWiki to false in production (duration: 00m 12s)
* 21:47 foks: removing 8 files for legal compliance
* 22:38 logmsgbot: catrope@tin Synchronized wmf-config/InitialiseSettings.php: Enable Flow opt-in on testwiki for testing (duration: 00m 12s)
* 21:03 foks: removing 2 files for legal compliance
* 21:56 cwdent: updated payments from 7b08867d9c5e87f5babb4b5b9cf1f5bec5e243b3 to 8428499feb8760d63faf681d53995697a2ba0fa7
* 21:00 mbsantos@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 21:49 chasemp: unban elastic1030 from T112559
* 20:50 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 21:33 logmsgbot: twentyafterfour@tin rebuilt wikiversions.cdb and synchronized wikiversions files: group0 wikis to 1.26wmf24
* 20:49 brennen@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/Collection/includes/Api/ApiGetBookCreatorBoxContent.php: Backport: [[gerrit:730574{{!}}Fall back to main page if given title is invalid (T293299)]] (duration: 01m 04s)
* 21:13 logmsgbot: twentyafterfour@tin Finished scap: Test 1.26wmf24 (duration: 50m 34s)
* 20:46 bd808@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 20:44 mutante: cancel backup job of bast1001 on helium because running low on disk
* 20:40 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 20:22 logmsgbot: twentyafterfour@tin Started scap: Test 1.26wmf24
* 20:31 mbsantos@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 16:50 robh: all mw servers returned to puppet enabled, puppet swat window over
* 20:27 robh@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubernetes1021.eqiad.wmnet with OS stretch
* 16:40 awight: updated paymentswiki 153418195a45cab820bc2aacf9a4f7dbc9dde768 to 7b08867d9c5e87f5babb4b5b9cf1f5bec5e243b3
* 20:04 robh@cumin1001: START - Cookbook sre.hosts.reimage for host kubernetes1021.eqiad.wmnet with OS stretch
* 16:23 robh: re-enabled puppet on mw hosts, as both redirection changes are good
* 20:03 robh@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kubernetes1021.eqiad.wmnet with OS stretch
* 16:14 robh: re-enabling puppet on mw hosts, as the new patchset 239278 deployed and tested fine on a single host, deploying to rest
* 20:01 robh@cumin1001: START - Cookbook sre.hosts.reimage for host kubernetes1021.eqiad.wmnet with OS stretch
* 16:04 robh: disabling puppet across mw hosts for new configuration deployment
* 19:18 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 15:46 godog: running puppet on restbase2001
* 19:16 mutante: gitlab2001 - status before was that "gitlab-ctl status" showed components "gitlab-workhorse" and "postgres-exporter" as "down". this was either pre-broken or caused by the restore process. after manually 'gitlab-ctl start gitlab-workhorse' all of the components are in "run" and https://gitlab-replica.wikimedia.org is up ( [[phab:T285867|T285867]])
* 15:41 godog: stop puppet on restbase2* pending codfw expansion
* 19:08 mutante: gitl1b2001 - started workhorse which was for some reason marked as down after restore command ran
* 15:31 cwdent: updated payments from 153418195a45cab820bc2aacf9a4f7dbc9dde768 to 7b08867d9c5e87f5babb4b5b9cf1f5bec5e243b3
* 19:08 mutante: [gitlab2001:~] $ sudo /usr/bin/gitlab-ctl start gitlab-workhorse
* 14:42 cmjohnson1: shutting down elastic1005 and elastic1030 to move around within the data center
* 19:06 dancy@deploy1002: Synchronized php: group1 wikis to 1.38.0-wmf.4  refs [[phab:T281168|T281168]] (duration: 01m 03s)
* 14:19 bblack: starting slow restart of varnish + varnish-frontend daemon processes on global text, upload, and mobile clusters for shm_reclen (all randomly blended, no parallelism, ~5 minute spacing, will take ~9 hours - FEs will lose cache data, BEs will not)
* 19:05 dancy@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.38.0-wmf.4  refs [[phab:T281168|T281168]]
* 14:14 chasemp: depool elastic nodes for T112559
* 19:02 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|87879865c35edab3ead523027681146e00d6fc02}}: Create Translation namespace for viwikisource ([[phab:T290691|T290691]]) (duration: 01m 04s)
* 11:18 logmsgbot: aude@tin Synchronized php-1.26wmf23/extensions/Wikidata: Fix autocomment and change handling bugs (duration: 00m 21s)
* 18:39 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|06fd0f225575448771cdba0d4e6bf36bb6715bc1}}: add extendedconfimed for autoreview group on ptwiki ([[phab:T292912|T292912]]) (duration: 01m 04s)
* 10:42 logmsgbot: aude@tin Synchronized arbitraryaccess.dblist: Enable arbitrary access for Wikibooks (duration: 00m 12s)
* 18:37 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript initSiteStats.php --wiki=ptwiki --update
* 10:42 logmsgbot: aude@tin Synchronized wmf-config/InitialiseSettings.php: Enable data access for Wikibooks - try again for snapshot hosts (duration: 00m 12s)
* 18:33 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript emptyUserGroup.php --wiki=ptwiki extendedconfirmed
* 10:35 logmsgbot: aude@tin Synchronized wmf-config/InitialiseSettings.php: Enable data access for Wikibooks (duration: 01m 12s)
* 18:31 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|0bb2b388217aa91a39ed3684f87fdf7edb06fd81}}:  Set autoconfirmedextended and confirmedextended for ptwiki ([[phab:T292915|T292915]]) (duration: 01m 04s)
* 10:17 moritzm: enabled ferm on mw1152 (videoscaler)
* 18:16 urbanecm@deploy1002: Synchronized static/images/project-logos: {{Gerrit|694bc234ab5dbb9a2387a6129998d45a53ac0ab3}}: Remove an old dawiki temporary logo (duration: 01m 04s)
* 10:03 godog: finished stressdisk on restbase200[123] no errors reported
* 18:15 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|224e2a374b1cc6327e9d8c2bca576091ce4efc74}}: Add NS_MAIN back to wgExtraSignatureNamespaces for mediawikiwiki ([[phab:T291630|T291630]]) (duration: 01m 05s)
* 10:03 moritzm: enabled ferm on mw1259 (videoscaler)
* 18:12 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 04:35 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Tue Sep 22 04:35:46 UTC 2015 (duration 35m 45s)
* 18:12 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 02:22 logmsgbot: l10nupdate@tin LocalisationUpdate completed (1.26wmf23) at 2015-09-22 02:22:56+00:00
* 18:11 urbanecm@deploy1002: Synchronized static/images/project-logos/: {{Gerrit|1b96f54a518620b0dc6a0ab63b402d0ea2c6bf70}}: Update logo for liwiktionary ([[phab:T291479|T291479]]) (duration: 01m 14s)
* 02:19 logmsgbot: l10nupdate@tin Synchronized php-1.26wmf23/cache/l10n: l10nupdate for 1.26wmf23 (duration: 06m 00s)
* 18:10 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 01:53 mutante: sodium - deleted salt key, revoked puppet cert, rm from icinga ..
* 18:10 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 00:32 ori: Disabled Puppet for 24h on hafnium and stopped ganglia-monitor. gmond was saturating CPU.
* 18:09 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 18:09 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 18:08 volans: uploaded debmonitor-client_0.3.1 to apt.wikimedia.org stretch-wikimedia,buster-wikimedia,bullseye-wikimedia
* 17:14 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/GrowthExperiments/maintenance/initWikiConfig.php: {{Gerrit|dd7a3314602ffddc5b917cccc71c917301639388}}: initWikiConfig: Fix loading difficulty/group from SUGGESTED_EDITS_TASK_TYPES ([[phab:T293219|T293219]]) (duration: 01m 04s)
* 17:13 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/GrowthExperiments/maintenance/initWikiConfig.php: {{Gerrit|5c27154cf434bebc37f5e98e2ad1b5cea7cde1d4}}: initWikiConfig: Fix loading difficulty/group from SUGGESTED_EDITS_TASK_TYPES ([[phab:T293219|T293219]]) (duration: 01m 15s)
* 16:57 mutante: stat1008 - short on disk space, mostly used in /tmp, high CPU usage by R proccess, sent a message about it to all shell users via wall
* 16:50 mutante: stat1008 - apt-get clean - freed 1.3 GB disk space - was alerting in Icinga because / was 97% full
* 16:37 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 16:37 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 16:23 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 16:23 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 15:29 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 15:28 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 15:26 volans@cumin2002: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 15:26 volans@cumin2002: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 15:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2026.codfw.wmnet
* 15:13 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 15:13 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 15:12 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 15:12 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 15:09 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2026.codfw.wmnet
* 15:04 jgiannelos@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 15:03 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 15:03 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 15:01 jgiannelos@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 15:01 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 15:01 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 14:59 jgiannelos@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 14:59 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 14:59 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 14:57 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 14:56 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 14:56 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 14:56 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 14:54 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 14:54 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 14:52 ema: repool cp4021, further testing can be performed on sretest1001 [[phab:T201317|T201317]]
* 14:51 volans: restarting ircecho.service on alert1001 to get back icinga-wm without the underscore
* 14:50 elukey: restart pybal on lvs1015 (low-traffic primary) to pick up new config for inference.discovery.wmnet - [[phab:T289835|T289835]]
* 14:48 moritzm: reverted to clean package state on deneb
* 14:44 elukey@puppetmaster1001: conftool action : ge; selector: cluster=ml_serve,service=inference
* 14:36 elukey: restart pybal on lvs1016 (low-traffic secondary) to pick up new config for inference.discovery.wmnet - [[phab:T289835|T289835]]
* 14:27 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 14:27 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 14:25 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 14:25 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 14:21 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 14:21 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 14:20 moritzm: temporarily downgrade sphinx packages on deneb to 1.7.9-1~bpo9+1 to build a Ganeti 2.16 stretch backport with delicate toolchain needs
* 14:13 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 14:13 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 14:10 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 14:10 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 14:10 jbond@cumin1001: END (PASS) - Cookbook sre.debmonitor.remove-hosts (exit_code=0) for 1 hosts: sretest1001.eqiad.wmnet
* 14:10 jbond@cumin1001: START - Cookbook sre.debmonitor.remove-hosts for 1 hosts: sretest1001.eqiad.wmnet
* 13:59 XioNoX: push prep-work for anycast tuning in ulsfo - [[phab:T288843|T288843]]
* 13:38 jayme: imported helm-diff_3.1.3-2 to buster-wikimedia (https://gerrit.wikimedia.org/r/c/operations/debs/helm-diff/+/730509)
* 13:37 jayme@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'shellbox' for release 'main' .
* 13:34 ema@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4021.ulsfo.wmnet with OS buster
* 12:13 Lucas_WMDE: UTC morning backport+config window done
* 12:12 kharlan@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/GrowthExperiments/includes: Backport: [[gerrit:730370{{!}}Add Link: Do not log "no suggestion found" errors in production log (T291251)]] (duration: 01m 04s)
* 12:11 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/GrowthExperiments/maintenance/initWikiConfig.php --wiki=itwiki --phab='[[phab:T255037|T255037]]'  # after applying 730512 at mwmaint1002 to workaround [[phab:T293219|T293219]] # [[phab:T255037|T255037]]
* 12:11 kharlan@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/GrowthExperiments/modules: Backport: [[gerrit:730371{{!}}Suggested Edits: Update local config.presets when topics/difficulty presets change (T292536)]] (duration: 01m 07s)
* 11:56 urbanecm@deploy1002: Synchronized wmf-config/config/itwiki.yaml: {{Gerrit|38a019d4fd6ff8e7cf92f5e7c6a899c336f20235}}: itwiki: Deploy Growth features in dark mode ([[phab:T255037|T255037]]) (duration: 01m 04s)
* 11:55 urbanecm: mwscript extensions/Translate/scripts/moveTranslatablePage.php --wiki=mediawikiwiki "Growth/Communities/How to introduce yourself as a mentor" "Growth/Communities/How to configure the mentors' list" "Martin Urbanec (WMF)" --reason '[[:phab:T293184]]' # [[phab:T293184|T293184]]
* 11:55 urbanecm@deploy1002: Synchronized dblists/growthexperiments.dblist: {{Gerrit|38a019d4fd6ff8e7cf92f5e7c6a899c336f20235}}: Deploy Growth features in dark mode ([[phab:T255037|T255037]]; 2/3) (duration: 01m 04s)
* 11:54 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|38a019d4fd6ff8e7cf92f5e7c6a899c336f20235}}: itwiki: Deploy Growth features in dark mode ([[phab:T255037|T255037]]; 1/3) (duration: 01m 05s)
* 11:50 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/GrowthExperiments/maintenance/initWikiConfig.php --wiki=itwiki --phab='[[phab:T255037|T255037]]' # [[phab:T255037|T255037]]
* 11:49 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/WikimediaMaintenance/createExtensionTables.php --wiki=itwiki growthexperiments # [[phab:T255037|T255037]]
* 11:48 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/Wikibase/repo/: Backport: [[gerrit:730380{{!}}Instantiate ItemId for SiteLinkConflictLookup results (T293104)]] (duration: 01m 07s)
* 11:43 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/Wikibase/repo/: Backport: [[gerrit:730385{{!}}Instantiate ItemId for SiteLinkConflictLookup results (T293104)]] (duration: 01m 18s)
* 11:33 ema@cumin2002: START - Cookbook sre.hosts.reimage for host cp4021.ulsfo.wmnet with OS buster
* 11:19 ema: pool cp4021 after reimage [[phab:T201317|T201317]]
* 11:05 ema@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4021.ulsfo.wmnet with OS buster
* 10:15 jgiannelos@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 10:10 ayounsi@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:09 phuedx@deploy1002: Synchronized wmf-config/InitialiseSettings-labs.php: Config: [[gerrit:728490{{!}}Add more types of QuickSurveys on beta cluster (T292459)]] (duration: 01m 53s)
* 10:06 ayounsi@cumin1001: START - Cookbook sre.dns.netbox
* 09:22 ema@cumin2002: START - Cookbook sre.hosts.reimage for host cp4021.ulsfo.wmnet with OS buster
* 08:35 oblivian@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:28 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:21 elukey: run kafka preferred-replica-election on kafka-main1001 to rebalance partition leaders - [[phab:T288825|T288825]]
* 08:15 godog: bounce graphite on graphite1004 to apply new config
* 07:33 elukey: increase kafka topic partition size of the top 4 high traffic topics of main-eqiad as described in https://phabricator.wikimedia.org/T288825#7422726
* 07:13 XioNoX: provision new eqsin-ulsfo link - [[phab:T273308|T273308]]
* 06:26 elukey: `kafka topics --alter --topic <nowiki>{</nowiki>eqiad,codfw<nowiki>}</nowiki>.change-prop.transcludes.resource-change --partitions 3` on kafka-main2001 - [[phab:T288825|T288825]]
* 00:38 ejegg: updated payments-wiki from {{Gerrit|030b11da1a}} to {{Gerrit|b329d2dea2}}


== 2015-09-21 ==
== 2021-10-12 ==
* 23:29 logmsgbot: krinkle@tin Synchronized php-1.26wmf23/extensions/NavigationTiming/modules/ext.navigationTiming.js: T112593 (duration: 00m 14s)
* 23:48 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 23:21 logmsgbot: krenair@tin Synchronized php-1.26wmf22/extensions/Wikidata: https://gerrit.wikimedia.org/r/#/c/239828/ (duration: 00m 21s)
* 23:16 urbanecm: UTC late B&C window done
* 23:07 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/238357/ (duration: 00m 13s)
* 23:15 urbanecm@deploy1002: Synchronized wmf-config/logos.php: {{Gerrit|59c31d9046a68e73b07d8179ac569425d18dcf73}}: Change logo in astwiki ([[phab:T292742|T292742]]) (duration: 01m 04s)
* 22:26 mutante: restarting gerrit for ssh config change
* 23:12 urbanecm@deploy1002: Synchronized static/images/project-logos/: {{Gerrit|59c31d9046a68e73b07d8179ac569425d18dcf73}}: Change logo in astwiki ([[phab:T292742|T292742]]) (duration: 02m 09s)
* 21:15 awight: legacy PayPal listener updated from 1c9ac2e66d11bbf768ea873d6e1a2522ca9841c1 to 55aeef63f6508381e3a8b7fcabddf9a3c3b73b8e
* 23:05 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 20:43 cwdent: updated worldpay config on payments
* 22:53 urbanecm: [urbanecm@labweb1001 ~]$ mwscript extensions/OATHAuth/maintenance/disableOATHAuthForUser.php --wiki=labswiki Jamesmontalvo3 #
* 20:41 mdholloway: MobileApps deployed sha1 013044e
* 22:51 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 20:39 chasemp: banning elastic1005 for T112559
* 20:21 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 20:29 subbu: deployed parsoid version 9984d221
* 19:31 dancy@deploy1002: Pruned MediaWiki: 1.38.0-wmf.1 (duration: 04m 02s)
* 19:41 urandom: temporarily stopping codfw restbase cassandra nodes to test quorum auth
* 19:13 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:15 logmsgbot: ori@tin Synchronized wmf-config/CommonSettings.php: Ieccb23f: Enable async secondary writes for mysql-multiwrite cache (on testwiki) (duration: 00m 13s)
* 19:08 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 18:36 ejegg: re-enabled paypal audit parser
* 19:02 dancy@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.38.0-wmf.4  refs [[phab:T281168|T281168]]
* 18:16 cmjohnson1: disabling puppet on mw1031
* 18:47 dancy@deploy1002: Finished scap: testwikis wikis to 1.38.0-wmf.4  refs [[phab:T281168|T281168]] (duration: 45m 36s)
* 17:58 chasemp: banning 1030 from eqiad elastic cluster for T112559#1660068
* 18:12 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest1001.eqiad.wmnet with OS buster
* 17:57 ejegg: disabled paypal audit parser
* 18:01 dancy@deploy1002: Started scap: testwikis wikis to 1.38.0-wmf.4  refs [[phab:T281168|T281168]]
* 16:08 ejegg: updated payments-wiki from 4d9d165c40070e036176dba8987243f6dbc7415e to 153418195a45cab820bc2aacf9a4f7dbc9dde768
* 17:58 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:22 logmsgbot: thcipriani@tin Synchronized php-1.26wmf23/extensions/ContentTranslation/modules/entrypoint/ext.cx.interlanguagelink.js: SWAT: Revert "Do not call cxserver to display gray interwiki link" [[gerrit:239819]] (duration: 00m 11s)
* 17:56 dancy@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/CentralNotice: Backport: [[gerrit:730141]] (duration: 00m 59s)
* 15:10 logmsgbot: thcipriani@tin Synchronized robots.txt: SWAT: Remove redundant entries from robots.txt [[gerrit:239403]] (duration: 00m 12s)
* 17:55 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:33 ottomata: restart eventlogging with mysql consumer replace=True (AKA INSERT IGNORE)
* 17:46 volans@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1001.eqiad.wmnet with OS buster
* 14:09 godog: rolling restart restbase in production after cassandra credentials change
* 17:43 jgiannelos@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 12:53 godog: rolling restart cassandra after enabling dc encryption, no nodes in codfw yet
* 17:41 jgiannelos@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 12:01 moritzm: repooled mw1160 (for T104969)
* 17:41 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:54 moritzm: depooled mw1160 (for T104969)
* 17:38 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:51 moritzm: repooled mw1158, mw1159 (for T104969)
* 17:32 dancy@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/SyntaxHighlight_GeSHi/includes/ResourceLoaderPygmentsModule.php: Backport: [[gerrit:730233{{!}}Include generated styles before Mediawiki overrides (T292736)]] (duration: 00m 57s)
* 11:39 moritzm: depooled mw1158, mw1159 (for T104969)
* 17:30 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:37 moritzm: depooled and repooled mw1156, mw1157 (for T104969)
* 17:27 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:26 moritzm: repooled mw1154, mw1155 (for T104969)
* 17:23 dancy@deploy1002: Synchronized php-1.38.0-wmf.4/includes/actions/pagers/HistoryPager.php: Backport: [[gerrit:730236{{!}}Fix history page iteration in backwards mode (T292791)]] (duration: 00m 57s)
* 11:21 moritzm: depooled mw1154, mw1155 (for T104969)
* 17:19 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:39 moritzm: repooled mw1026-mw1029 and mw1110-mw1113 (for T104968)
* 17:16 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:24 moritzm: depooled mw1026-mw1029 and mw1110-mw1113 (for T104968)
* 17:16 dancy@deploy1002: Synchronized php-1.38.0-wmf.3/includes/actions/pagers/HistoryPager.php: Backport: [[gerrit:730235{{!}}Fix history page iteration in backwards mode (T292791)]] (duration: 00m 57s)
* 10:17 moritzm: repooled mw1100-mw1109 (for T104968)
* 17:12 moritzm: installing rsync bugfix updates
* 10:17 godog: create restbase user on cassandra cluster
* 17:09 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 10:06 moritzm: depooled mw1100-mw1109 (for T104968)
* 16:56 bd808@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 09:56 moritzm: repooled mw1140 and mw1142-mw1148 (for T104968)
* 16:55 volans@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2009.codfw.wmnet
* 09:41 moritzm: depooled mw1140 and mw1142-mw1148 (for T104968)
* 16:53 moritzm: failed over ganeti master for test cluster to ganeti2025
* 09:36 moritzm: repooled mw1130-mw1139 (for T104968)
* 16:50 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 09:22 moritzm: depooled mw1130-mw1139 (for T104968)
* 16:48 volans@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2009.codfw.wmnet
* 09:14 moritzm: repooled mw1120-mw1129 (for T104968)
* 16:32 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 09:02 moritzm: depooled mw1120-mw1129 (for T104968)
* 16:30 volans@cumin2002: END (ERROR) - Cookbook sre.hosts.decommission (exit_code=97) for hosts testvm2009.codfw.wmnet
* 08:48 moritzm: repooled mw1189 and mw1200-mw1208 (for T104968)
* 16:30 volans@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2009.codfw.wmnet
* 08:33 moritzm: depooled mw1189 and mw1200-mw1208 (for T104968)
* 16:29 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:29 godog: switch to 'restbase' cassandra user on restbase test cluster
* 16:26 volans@cumin2002: END (ERROR) - Cookbook sre.ganeti.makevm (exit_code=97) for new host testvm2009.codfw.wmnet
* 08:29 moritzm: repooled mw1190-mw1195 and mw1197-mw1199 (for T104968)
* 16:26 dancy@deploy1002: Synchronized php-1.38.0-wmf.4/includes: Backport: [[gerrit:730226{{!}}Pre-format comments for non-local files too (T292570)]] (duration: 01m 15s)
* 08:21 _joe_: restarted the logstash agent on logstash1003, OOM'd
* 16:17 volans@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2009.codfw.wmnet
* 08:18 moritzm: depooled mw1190-mw1195 and mw1197-mw1199 (for T104968)
* 16:16 volans@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts testvm2009.codfw.wmnet
* 08:07 _joe_: installing the new HHVM package on the api canaries
* 16:11 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:04 moritzm: repooled mw1221-mw1229 (for T104968)
* 16:10 volans@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2009.codfw.wmnet
* 07:53 moritzm: depooled mw1221-mw1229 (for T104968)
* 16:09 volans@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2009.codfw.wmnet
* 07:49 moritzm: repooled mw1230-mw1235 (for T104968)
* 16:08 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 07:43 _joe_: installing the new hhvm package on the canary appservers
* 16:06 dancy@deploy1002: Synchronized php-1.38.0-wmf.4/extensions/SecurePoll/includes/Hooks/HookRunner.php: Backport: [[gerrit:730231{{!}}Fix wrong var being passed (T289950 T293102)]] (duration: 00m 57s)
* 07:08 moritzm: depooled mw1230-mw1235 (for T104968)
* 16:00 volans@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2009.codfw.wmnet
* 04:31 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Mon Sep 21 04:31:03 UTC 2015 (duration 31m 2s)
* 15:59 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:23 logmsgbot: l10nupdate@tin LocalisationUpdate completed (1.26wmf23) at 2015-09-21 02:23:12+00:00
* 15:58 dancy@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/SecurePoll/includes/Hooks/HookRunner.php: Backport: [[gerrit:730230{{!}}Fix wrong var being passed (T289950 T293102)]] (duration: 02m 13s)
* 02:20 logmsgbot: l10nupdate@tin Synchronized php-1.26wmf23/cache/l10n: l10nupdate for 1.26wmf23 (duration: 06m 25s)
* 15:57 volans@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host testvm2009.codfw.wmnet
* 02:06 MaxSem: Maps: created indexes on admin. <3 Postgres :(
* 15:57 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 01:56 bblack: downtimed eqiad ipv6 text/upload alerts as well, as with mobile above ( 1 301 TLS Redirect - 505 bytes in 1.008 second response time
* 15:51 jgiannelos@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 01:46 bblack: downtimed the "LVS HTTP IPv6 on mobile-lb.eqiad.wikimedia.org_ipv6" alert for now ( https://phabricator.wikimedia.org/T113154 )
* 15:49 volans@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2009.codfw.wmnet
* 15:48 volans@cumin2002: END (ERROR) - Cookbook sre.ganeti.makevm (exit_code=97) for new host testvm2009.codfw.wmnet
* 15:48 volans@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2009.codfw.wmnet
* 15:41 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for analytics1069.eqiad.wmnet
* 15:41 btullis@cumin1001: START - Cookbook sre.hosts.remove-downtime for analytics1069.eqiad.wmnet
* 15:02 volans@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:50 volans@cumin2002: START - Cookbook sre.dns.netbox
* 13:49 jgiannelos@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 13:40 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host testvm2006.codfw.wmnet
* 13:25 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2006.codfw.wmnet
* 13:21 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:19 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:14 godog: add 50G to prometheus/k8s in eqiad
* 13:13 otto@deploy1002: Synchronized wmf-config/CommonSettings.php: Enable x_client_ip_forwarding_enabled for eventgate-analytics and eventgate-analytics-external - [[phab:T288853|T288853]] (duration: 00m 56s)
* 13:11 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on analytics1069.eqiad.wmnet with reason: draining flea power [[phab:T291732|T291732]]
* 13:11 btullis@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on analytics1069.eqiad.wmnet with reason: draining flea power [[phab:T291732|T291732]]
* 13:05 volans: upgraed spicerack to 1.0.5 on cumin hosts
* 12:25 volans: uploaded spicerack_1.0.5 to apt.wikimedia.org buster-wikimedia,bullseye-wikimedia
* 12:15 elukey: `kafka topics --alter --topic codfw.mediawiki.job.cirrusSearchElasticaWrite --partitions 5` - [[phab:T288825|T288825]]
* 12:15 elukey: `kafka topics --alter --topic eqiad.mediawiki.job.cirrusSearchElasticaWrite --partitions 5` - [[phab:T288825|T288825]]
* 12:10 elukey: `kafka topics --alter --topic codfw.cpjobqueue.partitioned.mediawiki.job.cirrusSearchElasticaWrite --partitions 5` - [[phab:T288825|T288825]]
* 12:09 elukey: `kafka topics --alter --topic eqiad.cpjobqueue.partitioned.mediawiki.job.cirrusSearchElasticaWrite --partitions 5` - [[phab:T288825|T288825]]
* 11:58 elukey: `kafka topics --alter --topic codfw.resource-purge --partitions 5` on kafka-main2001 - [[phab:T288825|T288825]]
* 11:49 elukey: `kafka topics --alter --topic eqiad.resource-purge --partitions 5` on kafka-main2001 - [[phab:T288825|T288825]]
* 11:46 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2025.codfw.wmnet
* 11:44 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:42 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2025.codfw.wmnet
* 11:34 urbanecm: UTC morning B&C window done
* 11:33 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:32 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|860ea0944d6dc1e6b5061eb84eec378eb5ac8441}}: Remove NS_MAIN from wgExtraSignatureNamespaces on most special wikis ([[phab:T291630|T291630]]) (duration: 00m 57s)
* 11:30 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:22 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:19 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:14 jgiannelos@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 11:11 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:08 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:06 urbanecm@deploy1002: Synchronized w/static.php: {{Gerrit|e77ae17efb34723598fc69e87109944384df442a}}: static.php: correctly report a bad request (duration: 00m 57s)
* 11:02 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host testvm2003.codfw.wmnet
* 10:58 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2003.codfw.wmnet
* 10:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on testvm[2001-2002,2005].codfw.wmnet with reason: Ganeti tests
* 10:53 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on testvm[2001-2002,2005].codfw.wmnet with reason: Ganeti tests
* 10:30 ema: apply https://gerrit.wikimedia.org/r/726912 to all A:cp nodes [[phab:T288106|T288106]]
* 10:24 ema@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp4028.ulsfo.wmnet,service=ats-be
* 10:23 ema: depool/repool ats-be on cp4028 to verify updates to /etc/varnish/directors.frontend.vcl on cp4027 keep on working fine [[phab:T288106|T288106]]
* 10:23 jgiannelos@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 10:22 ema@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp4028.ulsfo.wmnet,service=ats-be
* 10:16 ema: cp4027: enable and run puppet to test https://gerrit.wikimedia.org/r/726912 [[phab:T288106|T288106]]
* 10:12 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ganeti2025.codfw.wmnet with OS buster
* 09:16 kormat@cumin1001: dbctl commit (dc=all): 'db1127 (re)pooling @ 100%: repool db1127 [[phab:T292956|T292956]]', diff saved to https://phabricator.wikimedia.org/P17456 and previous config saved to /var/cache/conftool/dbconfig/20211012-091614-kormat.json
* 09:01 kormat@cumin1001: dbctl commit (dc=all): 'db1127 (re)pooling @ 75%: repool db1127 [[phab:T292956|T292956]]', diff saved to https://phabricator.wikimedia.org/P17455 and previous config saved to /var/cache/conftool/dbconfig/20211012-090111-kormat.json
* 08:46 kormat@cumin1001: dbctl commit (dc=all): 'db1127 (re)pooling @ 50%: repool db1127 [[phab:T292956|T292956]]', diff saved to https://phabricator.wikimedia.org/P17454 and previous config saved to /var/cache/conftool/dbconfig/20211012-084607-kormat.json
* 08:31 kormat@cumin1001: dbctl commit (dc=all): 'db1127 (re)pooling @ 25%: repool db1127 [[phab:T292956|T292956]]', diff saved to https://phabricator.wikimedia.org/P17453 and previous config saved to /var/cache/conftool/dbconfig/20211012-083103-kormat.json
* 08:03 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:00 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 07:58 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/GrowthExperiments/: {{Gerrit|17dc3aa}}, {{Gerrit|e0ca905}}, {{Gerrit|c0f4f4e}}: GrowthExperiments backports ([[phab:T292224|T292224]], [[phab:T290609|T290609]], [[phab:T290609|T290609]]) (duration: 00m 59s)
* 07:40 elukey: run kafka preferred-replica-election on kafka-main2001 to rebalance partition leaders after the last topic moves - [[phab:T288825|T288825]]
* 07:33 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti2025.codfw.wmnet with OS buster
* 07:22 moritzm: installing RT security updates
* 04:43 eileen: civicrm revision changed from {{Gerrit|96090e4bd2}} to {{Gerrit|946dfb6c5a}}, config revision is {{Gerrit|85277466ed}}
* 03:56 kart_: cxserver: Remove Matxin Key from Production ([[phab:T292635|T292635]])
* 03:54 kartik@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'cxserver' for release 'production' .
* 03:48 kartik@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'cxserver' for release 'production' .
* 03:45 kartik@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'cxserver' for release 'staging' .
* 02:44 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:41 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:28 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:22 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:11 eileen: civicrm revision changed from {{Gerrit|598b59b0ee}} to {{Gerrit|96090e4bd2}}, config revision is {{Gerrit|85277466ed}}


== 2015-09-20 ==
== 2021-10-11 ==
* 22:34 yuvipanda: reloda pybal on lvs1012
* 21:25 btullis@cumin1001: END (PASS) - Cookbook sre.hadoop.roll-restart-masters (exit_code=0) restart masters for Hadoop analytics cluster: Restart of jvm daemons. - btullis@cumin1001
* 17:01 bblack: repooling cp1046 varnish-be + varnish-be-rand in confctl, fresh storage, purge queue caught up - T113184
* 20:58 btullis@cumin1001: START - Cookbook sre.hadoop.roll-restart-masters restart masters for Hadoop analytics cluster: Restart of jvm daemons. - btullis@cumin1001
* 16:44 bblack: depooling cp1046 varnish-be + varnish-be-rand in confctl, wiping storage, re-pooling - T113184
* 17:08 elukey: force kafka preferred-replica-election on kafka-main2001 after another batch of topic partitions moves - [[phab:T288825|T288825]]
* 07:24 paravoid: temporarily disabling puppet on fermium and applying antispam countermeasures
* 15:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2026.codfw.wmnet
* 04:29 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sun Sep 20 04:29:16 UTC 2015 (duration 29m 15s)
* 15:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2026.codfw.wmnet
* 02:22 logmsgbot: l10nupdate@tin LocalisationUpdate completed (1.26wmf23) at 2015-09-20 02:22:53+00:00
* 15:31 jgleeson: smashpig updated from {{Gerrit|3607b16f83}} to {{Gerrit|dd3a81c7c2}}
* 02:19 logmsgbot: l10nupdate@tin Synchronized php-1.26wmf23/cache/l10n: l10nupdate for 1.26wmf23 (duration: 06m 12s)
* 14:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on testvm[2001-2002,2005].codfw.wmnet with reason: Ganeti tests
* 14:59 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on testvm[2001-2002,2005].codfw.wmnet with reason: Ganeti tests
* 14:36 Emperor: start restoring weight to ms-be2045 [[phab:T290881|T290881]]
* 13:42 elukey: force kafka preferred-replica-election on kafka-main2001 after another batch of topic partitions moves - [[phab:T288825|T288825]]
* 12:53 moritzm: install apache security updates on buster
* 12:49 topranks: Setting up BGP peering to AS12552 (GlobalConnect Group) at AMS-IX on cr2-esams
* 12:45 ema: cp4027: upgrade varnish to 6.0.8 [[phab:T292290|T292290]]
* 12:04 moritzm: install apache security updates on bullseye
* 10:23 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host graphite2003.codfw.wmnet
* 09:50 filippo@cumin1001: START - Cookbook sre.hosts.reimage for host graphite2003.codfw.wmnet
* 09:45 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host graphite2003.codfw.wmnet
* 09:37 elukey: force kafka preferred-replica-election on kafka-main2001 after another batch of topic partitions moves - [[phab:T288825|T288825]]
* 09:13 filippo@cumin1001: START - Cookbook sre.hosts.reimage for host graphite2003.codfw.wmnet
* 09:09 elukey: force kafka preferred-replica-election on kafka-main2001 after the first 50 topic partitions moves - [[phab:T288825|T288825]]
* 09:05 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest1002.eqiad.wmnet
* 09:01 godog: bounce swift-object-replicator on ms-be2036
* 08:52 godog: bounce statsite on graphite1004 to apply unit config changes
* 08:48 volans@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host sretest1001.eqiad.wmnet
* 08:41 volans@cumin2002: START - Cookbook sre.hosts.reimage for host sretest1002.eqiad.wmnet
* 08:38 moritzm: updated buster d-i image for Bullseye 11.1 point release [[phab:T292844|T292844]]
* 08:38 moritzm: updated buster d-i image for Buster 10.11 point release [[phab:T292838|T292838]]
* 08:26 godog: swift eqiad-prod: final weight to ms-be10[64-67] - [[phab:T290546|T290546]]
* 08:25 moritzm: updated buster d-i image for Buster 10.11 point release [[phab:T292838|T292838]]
* 08:24 volans@cumin1001: START - Cookbook sre.hosts.reimage for host sretest1001.eqiad.wmnet
* 08:06 godog: bounce uwsgi on graphite hosts to bump request size limit - [[phab:T292877|T292877]]
* 07:58 volans: migrating physical hosts DHCP to the new reimage process - [[phab:T269855|T269855]]
* 07:57 elukey: start kafka topics rebalancing for main-codfw (long running maintenance) - [[phab:T288825|T288825]]


== 2015-09-19 ==
== 2021-10-09 ==
* 23:12 urandom: begining Cassandra repair on restbase1005 (nodetool repair -pr)
* 05:01 jiji@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:08 urandom: begining Cassandra repair on restbase1004 (nodetool repair -pr)
* 04:28 jiji@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:56 jynus: restarting once more giblit, last chance
* 01:32 ryankemper@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 19:04 paravoid: salt rm /etc/systemd/system/txstatsd.service from all cp*, leftover because of ::txstatsd::decommission (removed with 4a1d4e) missing it
* 00:46 mutante: ms-be2045 - started systemd-timedated which had been killed by something
* 19:00 ejegg|away: updated SmashPig from d5895428d1d8ebc5a6e172e8cdec6dbec0b10d85 to d1baa32267eaad7d69b47c657f4853eb306fad6b
* 00:28 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 18:45 _joe_: restarted gitblit. I will now substitute myself with a clever perl one-liner.
* 00:24 ryankemper@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.force-unfreeze (exit_code=99)
* 18:38 paravoid: pooling back cp1046 to pybal eqiad/mobile, has stayed stable
* 00:23 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.force-unfreeze
* 18:34 paravoid: reactivating ΒGP with GTT @ eqiad
* 00:13 ryankemper: [[phab:T292814|T292814]] Write queue stuck at 133 events in partition 1 of topic `codfw.cpjobqueue.partitioned.mediawiki.job.cirrusSearchElasticaWrite`, will try again at another time
* 08:42 _joe_: cp1046 dead on console again, powercycling to inspect it
* 00:12 ryankemper@cumin1001: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 05:49 logmsgbot: aaron@tin Synchronized php-1.26wmf23/extensions/TitleBlacklist: 80d3a21a51f9c54ed2d94 (duration: 00m 12s)
* 05:22 paravoid: pybal-depooling cp1046 from eqiad/mobile until further investigation
* 05:21 paravoid: powercycling cp1046, dead on console
* 05:01 awight: deploy SmashPig config to limit weekend spam
* 04:40 awight: update crm from 15ea14f61338ca9f34e9ccb9f56eae14a161380a to 9fa38d06a75363a8009bce7ced190e39c75b68bc
* 04:29 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sat Sep 19 04:28:59 UTC 2015 (duration 28m 57s)
* 02:23 logmsgbot: l10nupdate@tin LocalisationUpdate completed (1.26wmf23) at 2015-09-19 02:23:29+00:00
* 02:20 logmsgbot: l10nupdate@tin Synchronized php-1.26wmf23/cache/l10n: l10nupdate for 1.26wmf23 (duration: 06m 05s)


== 2015-09-18 ==
== 2021-10-08 ==
* 23:22 awight: update fundraising-tools from 3e0e3ae799a507b378d0ece3e71631b10b361329 to e1b60fa2c258fd4ff55905b03a4d8886132278c1
* 23:16 legoktm: sudo cumin -b 10 C:mediawiki::packages 'apt-get purge lilypond-data -y'
* 20:52 ebernhardson: restart es on elastic1025 to disable dynamic scripting
* 23:10 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 20:34 gwicke: dropped by_ns indexes on restbase title_revisions tables
* 21:38 mutante: mwmaint2002 - disable-puppet, stop bacula-fd, recovery in progress
* 19:54 gwicke: finished deploy of restbase daacf4daa
* 21:34 mutante: disabling puppet on bacula - going through a restore https://wikitech.wikimedia.org/wiki/Bacula#Restore_from_a_non-existent_host_(missing_private_key)
* 19:45 gwicke: re-enabled puppet on restbase100*
* 21:30 legoktm: running puppet across C:mediawiki::packages to uninstall lilypond and ploticus: legoktm@cumin1001:~$ sudo cumin -b 4 C:mediawiki::packages 'run-puppet-agent'
* 19:35 gwicke: canary deploy of restbase daacf4daa on restbase1001; moving forward so that we can re-enable puppet over the weekend.
* 20:12 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes1018.eqiad.wmnet with reason: REIMAGE
* 18:38 cwdent: updated payments from 1bdd287b083032ff418434ad6bb6920735af918a to 4d9d165c40070e036176dba8987243f6dbc7415e
* 20:10 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubestage1004.eqiad.wmnet with reason: REIMAGE
* 17:54 logmsgbot: ebernhardson@tin Synchronized wmf-config/CommonSettings.php: Replace insecure es usage with usage of a plugin (duration: 00m 12s)
* 20:08 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes1018.eqiad.wmnet with reason: REIMAGE
* 16:41 mutante: mailman now on 2.1.18 and jessie
* 20:08 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubestage1003.eqiad.wmnet with reason: REIMAGE
* 16:14 dcausse: elastic in eqiad plugin updates: restarting elastic1021
* 20:06 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubestage1004.eqiad.wmnet with reason: REIMAGE
* 16:07 paravoid: deactivating ΒGP with GTT @ eqiad
* 20:05 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubestage1003.eqiad.wmnet with reason: REIMAGE
* 15:20 godog: create restbase user on cassandra test cluster
* 19:46 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes1020.eqiad.wmnet with reason: REIMAGE
* 14:55 dcausse: elastic in eqiad plugin updates: restarting elastic1020
* 19:45 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes1019.eqiad.wmnet with reason: REIMAGE
* 14:22 bblack: committing lvs1007-1012 port/vlan changes for asw-d-eqiad (but leaving all 6 LVS ports in "disabled" state - T112781 )
* 19:43 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes1020.eqiad.wmnet with reason: REIMAGE
* 14:14 bblack: committing lvs1007-12 port/vlan changes for asw-b-eqiad, round 3...
* 19:42 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes1019.eqiad.wmnet with reason: REIMAGE
* 14:11 mutante: sodium - stopped exim - rsyncing lists to fermium
* 19:42 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kubernetes1018.eqiad.wmnet with reason: REIMAGE
* 14:10 dcausse: elastic in eqiad plugin updates: restarting elastic1019
* 19:39 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on kubernetes1018.eqiad.wmnet with reason: REIMAGE
* 14:07 mutante: stopped mailman on sodium
* 18:15 cstone: civicrm revision changed from {{Gerrit|5cb7d487cb}} to {{Gerrit|598b59b0ee}}
* 14:01 bblack: rollback on asw-b-eqiad changes above
* 16:19 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/GrowthExperiments/maintenance/updateMenteeData.php --wiki=enwiki --force # to measure performance on a large wiki
* 13:56 bblack: committing eqiad lvs1007-1012 port/vlan changes for asw-b-eqiad
* 15:48 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 13:20 bblack: committing eqiad lvs1007-12 port/vlan changes for asw-c-eqiad
* 15:48 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 13:16 bblack: commiting eqiad lvs1007-12 port/vlan changes for asw2-a5-eqiad
* 15:29 jelto: enable puppet on gitlab1001 again for [[phab:T283076|T283076]]
* 13:12 dcausse: elastic in eqiad plugin updates: restarting elastic1018
* 14:05 jiji@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:21 godog: restart logstash on logstash1001, OOM in logs
* 14:01 jiji@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:55 dcausse: elastic in eqiad plugin updates: restarting elastic1017
* 09:49 Amir1: wikiadmin@10.64.16.85(wikidatawiki)> delete from wb_changes_subscription where cs_subscriber_id in ('testcommonswiki', 'mowiki');
* 11:06 dcausse: elastic in eqiad plugin updates: restarting elastic1016
* 09:39 Emperor: installing stress on ms-be2045 given recent h/w issues [[phab:T290881|T290881]]
* 10:28 moritzm: restarted salt-master on palladium
* 08:20 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 09:46 moritzm: installed openldap security updates on plutonium
* 08:12 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 09:37 moritzm: installed openldap security updates on pollux
* 08:04 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/GrowthExperiments/maintenance/updateMenteeData.php --wiki=frwiki --force
* 09:33 dcausse: elastic in eqiad plugin updates: restarting elastic1015
* 07:43 Emperor: reboot ms-be2045 [[phab:T290881|T290881]]
* 08:22 dcausse: elastic in eqiad plugin updates: restarting elastic1014
* 07:41 gehel: manually resuming the data reloads on wdqs1009 and wdqs2008
* 07:21 dcausse: elastic in eqiad plugin updates: restarting elastic1013
* 06:42 ayounsi@cumin1001: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 06:15 dcausse: elastic in eqiad plugin updates: restarting elastic1012
* 06:42 ayounsi@cumin1001: START - Cookbook sre.network.cf
* 04:37 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Fri Sep 18 04:37:42 UTC 2015 (duration 37m 41s)
* 06:28 ayounsi@cumin2002: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 02:31 logmsgbot: l10nupdate@tin LocalisationUpdate completed (1.26wmf23) at 2015-09-18 02:31:49+00:00
* 06:28 ayounsi@cumin2002: START - Cookbook sre.network.cf
* 02:28 logmsgbot: l10nupdate@tin Synchronized php-1.26wmf23/cache/l10n: l10nupdate for 1.26wmf23 (duration: 06m 08s)
* 05:35 ryankemper@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 02:21 logmsgbot: krenair@tin Synchronized wmf-config/abusefilter.php: https://gerrit.wikimedia.org/r/#/c/218353/ (duration: 00m 12s)
* 04:56 ryankemper: [WDQS Deploy] Deploy complete. Successful test query placed on query.wikidata.org, there's no relevant criticals in Icinga, and Grafana looks good
* 02:21 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/218353/ (duration: 00m 11s)
* 04:32 ryankemper: [[phab:T292814|T292814]] Beginning rolling restart of `cloudelastic`: `sudo -i cookbook sre.elasticsearch.rolling-operation cloudelastic "cloudelastic restart" --nodes-per-run 1 --start-datetime 2021-10-08T03:53:49 --task-id [[phab:T292814|T292814]]` on `ryankemper@cumin1001` tmux `elastic`
* 02:13 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/237149/ (duration: 00m 12s)
* 04:31 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 02:07 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/234544/ (duration: 00m 12s)
* 04:29 ryankemper: [WDQS Deploy] Restarting `wdqs-categories` across lvs-managed hosts, one node at a time: `sudo -E cumin -b 1 'A:wdqs-all and not A:wdqs-test' 'depool && sleep 45 && systemctl restart wdqs-categories && sleep 45 && pool'`
* 01:58 logmsgbot: ori@tin Synchronized php-1.26wmf22/includes/resourceloader/ResourceLoaderModule.php: I952068d2d: ResourceLoaderModule: cache file content hash (duration: 00m 12s)
* 04:28 ryankemper: [WDQS Deploy] Restarted `wdqs-categories` across both test hosts simultaneously: `sudo -E cumin 'A:wdqs-test' 'systemctl restart wdqs-categories'`
* 01:58 logmsgbot: ori@tin Synchronized php-1.26wmf23/includes/resourceloader/ResourceLoaderModule.php: I952068d2d: ResourceLoaderModule: cache file content hash (duration: 00m 11s)
* 04:28 ryankemper: [WDQS Deploy] Restarted `wdqs-updater` across all hosts, 4 hosts at a time: `sudo -E cumin -b 4 'A:wdqs-all' 'systemctl restart wdqs-updater'`
* 01:57 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://phabricator.wikimedia.org/T106264 (duration: 00m 12s)
* 04:23 ryankemper@deploy1002: Finished deploy [wdqs/wdqs@8f57a56]: 0.3.89 (duration: 08m 22s)
* 01:36 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/237331/ (duration: 00m 12s)
* 04:20 ryankemper@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 00:14 ori: restarted logstash on logstash1001
* 04:20 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 04:18 gehel@cumin1001: END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99)
* 04:17 gehel@cumin1001: END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99)
* 04:15 ryankemper: [WDQS Deploy] Tests passing following deploy of `0.3.89` on canary `wdqs1003`; proceeding to rest of fleet
* 04:14 ryankemper@deploy1002: Started deploy [wdqs/wdqs@8f57a56]: 0.3.89
* 04:14 ryankemper: [WDQS Deploy] Gearing up for deploy of wdqs `0.3.89`. Pre-deploy tests passing on canary `wdqs1003`
* 03:58 ryankemper@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 03:58 ryankemper@cumin1001: START - Cookbook sre.elasticsearch.rolling-operation restart without plugin upgrade (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic restart - ryankemper@cumin1001 - [[phab:T292814|T292814]]
* 02:04 Krinkle: krinkle@deploy1002$ echo 'https://en.wikipedia.org/static/images/project-logos/jvwiktionary.png' {{!}} mwscript purgeList.php , ref [[phab:T287425|T287425]], [[phab:T292810|T292810]]
* 00:07 tgr_: deploy window over
* 00:05 tgr@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/GrowthExperiments: Backport: [[gerrit:727498{{!}}Mentee overview: Make UncachedMenteeOverviewDataProvider::getBlocksForUsers faster (T290609)]] (duration: 00m 56s)


== 2015-09-17 ==
== 2021-10-07 ==
* 23:39 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/238978/ (duration: 00m 12s)
* 23:43 thcipriani@deploy1002: Synchronized wmf-config/logos.php: Config: [[gerrit:708065{{!}}Change Javanese Wiktionary logo (T287425)]] part 3/3 (duration: 00m 55s)
* 23:05 logmsgbot: mattflaschen@tin Synchronized wmf-config/CommonSettings-labs.php: Beta-only change (duration: 00m 12s)
* 23:41 thcipriani@deploy1002: Synchronized logos/config.yaml: Config: [[gerrit:708065{{!}}Change Javanese Wiktionary logo (T287425)]] part 2/3 (duration: 00m 55s)
* 23:04 logmsgbot: mattflaschen@tin Synchronized wmf-config/CommonSettings-labs.php: Beta-only change (duration: 00m 12s)
* 23:40 thcipriani@deploy1002: Synchronized static/images/project-logos: Config: [[gerrit:708065{{!}}Change Javanese Wiktionary logo (T287425)]] part 1/3 (duration: 00m 56s)
* 22:53 gwicke: puppet on restbase cluster disabled since about  21:30 UTC for gradual deploy; ran into minor issue in staging, which is now being addressed, after which deploy will continue
* 23:30 thcipriani@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:704170{{!}}Adding and use wordmark in trwikiquote (T286133)]] Part 2/2 (duration: 00m 56s)
* 21:22 logmsgbot: ori@tin Synchronized php-1.26wmf22/includes/resourceloader/ResourceLoaderModule.php: I952068d2d: Use MD4 to compute file hash rather than SHA1 (duration: 00m 13s)
* 23:28 thcipriani@deploy1002: Synchronized static/images/mobile/copyright/wikiquote-wordmark-tr.svg: Config: [[gerrit:704170{{!}}Adding and use wordmark in trwikiquote (T286133)]] Part 1/2 (duration: 00m 57s)
* 21:22 logmsgbot: ori@tin Synchronized php-1.26wmf23/includes/resourceloader/ResourceLoaderModule.php: I952068d2d: Use MD4 to compute file hash rather than SHA1 (duration: 00m 12s)
* 21:35 urbanecm: Password reset for SUL User:LA2-bot ([[phab:T292793|T292793]])
* 20:44 logmsgbot: krenair@tin Synchronized wmf-config/interwiki.cdb: Updating interwiki cache (duration: 00m 12s)
* 20:43 brennen@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.38.0-wmf.3
* 20:22 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/239206/ (duration: 00m 12s)
* 20:37 brennen@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.38.0-wmf.2  refs [[phab:T281167|T281167]]
* 19:46 logmsgbot: krenair@tin Synchronized multiversion/MWMultiVersion.php: (no message) (duration: 00m 12s)
* 20:35 cmooney@cumin1001: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 19:41 logmsgbot: krenair@tin Synchronized multiversion/MWMultiVersion.php: https://gerrit.wikimedia.org/r/#/c/239181/ (duration: 00m 14s)
* 20:35 cmooney@cumin1001: START - Cookbook sre.network.cf
* 19:12 mutante: powercycling unresponse mw1005
* 20:23 krinkle@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/Gadgets/: {{Gerrit|I7c858b8c4bc}} (duration: 00m 56s)
* 18:14 logmsgbot: twentyafterfour@tin rebuilt wikiversions.cdb and synchronized wikiversions files: wikipedia wikis to 1.26wmf23
* 20:01 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/Echo/: {{Gerrit|8a7ff05ba28f302adb581bf430a868bb815b4ffd}}: Revert "Use namespaced CentralAuthSessionProvider" (duration: 00m 57s)
* 17:38 logmsgbot: legoktm@tin Synchronized php-1.26wmf22/includes/registration/ExtensionRegistry.php: registration: Fix merging of array_plus (duration: 00m 13s)
* 19:45 urbanecm@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/CentralAuth/: {{Gerrit|c01c2e4983bad8582ddd62aeb35ac9be852d493b}}: Revert "Namespace session providers" (duration: 00m 57s)
* 17:35 logmsgbot: legoktm@tin Synchronized php-1.26wmf23/includes/registration/ExtensionRegistry.php: registration: Fix merging of array_plus (duration: 00m 11s)
* 19:44 urbanecm: Backporting https://gerrit.wikimedia.org/r/c/mediawiki/extensions/CentralAuth/+/727489, https://gerrit.wikimedia.org/r/c/mediawiki/extensions/Echo/+/727487 in an unsafe way -- exceptions at testwikis expected, wmf.3 is not deployed elsewhere, so this should be ok
* 16:43 chasemp: restart elasticsearch on 1005
* 19:37 brennen@deploy1002: rebuilt and synchronized wikiversions files: Revert all wikis to 1.38.0-wmf.2 ([[phab:T281167|T281167]])
* 16:16 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/235900/ (duration: 00m 12s)
* 19:33 brennen: 1.38.0-wmf.3 train ([[phab:T281167|T281167]]): variously blocked, rolling back to testwikis for safe deploy of backports
* 15:15 dcausse: elastic in eqiad plugin updates: restarting elastic1004 (take 2)
* 19:14 brennen@deploy1002: rebuilt and synchronized wikiversions files: Revert group2 wikis to 1.38.0-wmf.2
* 15:06 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: CX: Enable Suggestions in ptwiki [[gerrit:238097]] (duration: 00m 13s)
* 19:07 brennen@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.38.0-wmf.3  refs [[phab:T281167|T281167]]
* 14:22 mutante: analytics1029 - Failed to start Hadoop datanode
* 19:03 brennen: 1.38.0-wmf.3 train ([[phab:T281167|T281167]]): unblocked, rolling to all wikis
* 14:20 mutante: starting hadoop datanode on analytics1029
* 18:50 urbanecm: [urbanecm@mwmaint1002 /srv/mediawiki/php]$ mwscript extensions/GrowthExperiments/maintenance/initWikiConfig.php --wiki=test2wiki
* 14:14 _joe_: reimaging tmh1001 to mw1259
* 18:46 sukhe: running authdns-update for [[phab:T292537|T292537]]
* 14:11 jynus: stopping replication and applying schema change to db1051
* 18:29 urbanecm: Morning B&C window done
* 14:05 dcausse: elastic in eqiad plugin updates: can't restart elastic1004 (2 timeouts when disabling replication, too much load?), waiting for more shards to rebalance...
* 18:28 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|4a946c046ae17a520f8d3463a16b1435ceb4856c}}: Deploy Growth mentor dashboard to pilot wikis ([[phab:T278920|T278920]]) (duration: 01m 04s)
* 13:58 dcausse: elastic in eqiad plugin updates: restarting elastic1004
* 18:23 urbanecm@deploy1002: Synchronized dblists/growthexperiments.dblist: {{Gerrit|87e300137c14451949fac12c3ec89319305a423e}}: Deploy Growth features to test2wiki (duration: 01m 03s)
* 13:50 moritzm: repooled mw1236-mw1239 (T104968)
* 18:21 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|87e300137c14451949fac12c3ec89319305a423e}}: Deploy Growth features to test2wiki (duration: 01m 04s)
* 13:34 moritzm: depooled mw1236-mw1239 (T104968)
* 18:20 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|31770f2b3660e7d7490c0a9ab66285c1f069732d}}: shwiki: Deploy Growth features to newcomers ([[phab:T278240|T278240]]) (duration: 01m 04s)
* 13:26 moritzm: repooled mw1090-mw1099 (T104968)
* 18:15 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|33526dfed148068585289f5ac501feda72068fd9}}: Stream config changes for android_daily_stats schema ([[phab:T286000|T286000]]) (duration: 01m 06s)
* 13:16 moritzm: depooled mw1090-mw1099 (T104968)
* 18:10 ejegg: updated payments-wiki from {{Gerrit|6d3560d083}} to {{Gerrit|030b11da1a}}
* 13:13 moritzm: repooled mw1080-mw1089 (T104968)
* 18:07 arnoldokoth: gitlab2001 re-image complete ([[phab:T283076|T283076]])
* 13:05 moritzm: depooled mw1080-mw1089 (T104968)
* 17:30 mutante: rebooting gitlab2001.wikimedia.org
* 13:01 moritzm: repooled mw1070-mw1079 (T104968)
* 16:56 arnoldokoth: down timing gitlab2001 for re-imaging ([[phab:T283076|T283076]])
* 12:49 moritzm: depooled mw1070-mw1079 (T104968)
* 16:47 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on gitlab2001.wikimedia.org with reason: reimage
* 12:35 moritzm: repooled mw1060 and mw1062-mw1069 (T104968)
* 16:47 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on gitlab2001.wikimedia.org with reason: reimage
* 12:24 moritzm: depooled mw1060 and mw1062-mw1069 (T104968) (not repooled)
* 16:32 hnowlan: roll restarting maps cassandra instances for java updates
* 12:24 moritzm: repooled mw1060 and mw1062-mw1069 (T104968)
* 16:19 ayounsi@cumin2002: END (PASS) - Cookbook sre.network.cf (exit_code=0)
* 12:16 moritzm: repooled mw1050-mw1059
* 16:19 ayounsi@cumin2002: START - Cookbook sre.network.cf
* 12:04 moritzm: depooled mw1050-mw1059
* 16:18 ayounsi@cumin1001: END (FAIL) - Cookbook sre.network.cf (exit_code=99)
* 11:39 moritzm: repooled mw1040 and mw1042-mw1049 (T104968)
* 16:18 ayounsi@cumin1001: START - Cookbook sre.network.cf
* 11:36 dcausse: elastic in eqiad plugin updates: restarting elastic1003
* 16:18 ayounsi@cumin1001: END (FAIL) - Cookbook sre.network.cf (exit_code=99)
* 11:26 moritzm: typoed earlier entry: "mw1032-mw1039" instead of "mw1032-mw1239"
* 16:18 ayounsi@cumin1001: START - Cookbook sre.network.cf
* 11:26 moritzm: depooled mw1040 and mw1042-mw1049 (T104968)
* 15:07 hashar@deploy1002: Finished deploy [gerrit/gerrit@13cef9f]: Gerrit to 3.3.6 on gerrit1001 (duration: 00m 08s)
* 11:18 moritzm: repooled mw1030 and mw1032-mw1239 (T104968)
* 15:07 hashar@deploy1002: Started deploy [gerrit/gerrit@13cef9f]: Gerrit to 3.3.6 on gerrit1001
* 11:03 moritzm: depooled mw1030 and mw1032-mw1239 (T104968)
* 14:50 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:35 moritzm: repooled mw1250-mw1258 (T104968)
* 14:49 hashar@deploy1002: Finished deploy [gerrit/gerrit@13cef9f]: Gerrit to 3.3.6 on gerrit2001 (duration: 00m 10s)
* 10:27 moritzm: depooled mw1250-mw1258 (T104968)
* 14:49 hashar@deploy1002: Started deploy [gerrit/gerrit@13cef9f]: Gerrit to 3.3.6 on gerrit2001
* 10:25 _joe_: killing temporarily subra
* 14:48 hashar: Upgrading Gerrit replica to 3.3.6 # [[phab:T290236|T290236]]
* 10:24 moritzm: repooled mw1240-mw1249 (T104968)
* 14:48 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:19 _joe_: experimenting with poolcounter issues on subra
* 14:34 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:18 logmsgbot: oblivian@tin Synchronized wmf-config/PoolCounterSettings-codfw.php: Use codfw poolcounters in codfw (duration: 00m 12s)
* 14:31 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:12 moritzm: depooled mw1240-mw1249 (T104968)
* 13:56 jiji@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 10:12 dcausse: elastic in eqiad plugin updates: restarting elastic1002
* 13:46 jiji@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 10:05 logmsgbot: hoo@tin Synchronized wmf-config/: Set 'repoConceptBaseUri' for all Wikibase clients (duration: 00m 13s)
* 13:39 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:00 dcausse: elastic in eqiad plugin updates: unfreezing indices
* 13:30 jiji@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 09:48 dcausse: elastic in eqiad plugin updates: no more groovy in warmers, waiting for few more shards to move in elastic1001 and will unfreeze indices to test warmers
* 13:29 jiji@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
* 09:39 dcausse: elastic in eqiad plugin updates: deleting warmers manually for old unused indices (eswikisource_content_1415240352, ruwiki_content_1415302164, thwiki_content_1415318677). We will have to remove these indices.
* 13:29 hashar: restarting CI Jenkins for git plugin update
* 09:39 paravoid: repooling ulsfo US-West traffic back to ulsfo for the first time since May :)
* 13:19 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 09:01 dcausse: elastic in eqiad plugin updates: updating warmers on all wikis
* 13:15 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:58 paravoid: penalizing ulsfo-eqiad direct MPLS links to higher OSPF weights
* 13:14 hashar: Upgraded CI Jenkins on contint2001
* 08:57 paravoid: adjusting OSPF weights to be latency-based across the US network
* 13:14 jiji@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 08:53 _joe_: removed iptables rules for dropping traffic to helium on mw1017
* 13:13 jiji@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 08:52 dcausse: elastic in eqiad plugin updates: index warmer queries are outdated with inline groovy script, updating warmers on warwiki first to test
* 13:10 jiji@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 08:05 paravoid: eqiad-codfw -> eqiad-eqord-codfw migration
* 13:09 jiji@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 07:49 moritzm: repooled mw1180-mw1188 (T104968)
* 13:06 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 07:42 dcausse: elastic in eqiad plugin updates: restarting elastic1001
* 13:06 volans@cumin2002: END (PASS) - Cookbook sre.experimental.reimage (exit_code=0) for host sretest1001.eqiad.wmnet
* 07:42 moritzm: depooled mw1180-mw1188 (T104968)
* 13:05 jiji@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 07:37 moritzm: repooled mw1170-mw1179 (T104968)
* 13:05 jiji@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 07:36 dcausse: elastic in eqiad plugin updates: freezing indices
* 12:56 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 07:27 moritzm: depooled mw1170-mw1179 (T104968)
* 12:56 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:14 _joe_: uploading new HHVM package
* 12:51 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 07:07 moritzm: repooled mw1161-1168 (T104968)
* 12:40 volans@cumin2002: START - Cookbook sre.experimental.reimage for host sretest1001.eqiad.wmnet
* 06:57 moritzm: depooled mw1161-1168 (T104968)
* 12:16 moritzm: installing testvm2005
* 06:45 moritzm: repooled mw1209-mw1220 with ferm enabled
* 11:59 moritzm: installing openssl security updates for stretch (buster/bullseye already fixed)
* 06:33 moritzm: depooling mw1209-mw1220 (in two steps)
* 11:52 Lucas_WMDE: EU backport+config window (aka UTC morning) done
* 05:47 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Thu Sep 17 05:47:47 UTC 2015 (duration 47m 46s)
* 11:52 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 03:06 logmsgbot: l10nupdate@tin LocalisationUpdate completed (1.26wmf23) at 2015-09-17 03:06:33+00:00
* 11:50 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:725858{{!}}Enable Content and Section Translation to Kurdish WP (T290238)]] (duration: 01m 04s)
* 03:03 logmsgbot: l10nupdate@tin Synchronized php-1.26wmf23/cache/l10n: l10nupdate for 1.26wmf23 (duration: 06m 30s)
* 11:49 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:45 logmsgbot: l10nupdate@tin LocalisationUpdate completed (1.26wmf22) at 2015-09-17 02:45:48+00:00
* 11:44 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/WikidataPageBanner/includes/WikidataPageBannerFunctions.php: Backport: [[gerrit:727188{{!}}Change PropertyId to NumericPropertyId (T289125, T292667)]] (duration: 01m 05s)
* 02:39 logmsgbot: l10nupdate@tin Synchronized php-1.26wmf22/cache/l10n: l10nupdate for 1.26wmf22 (duration: 11m 11s)
* 11:41 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:35 cwdent: updated payments from 155cdeb737c01baf62551292764fd2f5a93a9a63 to 1bdd287b083032ff418434ad6bb6920735af918a
* 11:38 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:10 jbond: update puppet stdlib gerrit:726872
* 09:36 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:31 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:27 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts testvm2004.codfw.wmnet
* 09:26 mvernon@cumin2002: END (PASS) - Cookbook sre.experimental.reimage (exit_code=0) for host ms-be2045.codfw.wmnet
* 09:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host testvm2005.codfw.wmnet
* 09:19 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2004.codfw.wmnet
* 09:08 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2005.codfw.wmnet
* 08:49 mvernon@cumin2002: START - Cookbook sre.experimental.reimage for host ms-be2045.codfw.wmnet
* 08:36 moritzm: imported jenkins 2.303.2 to thirdparty/ci component for buster-wikimedia
* 07:57 Emperor: re-enabling puppet on ms-be2045 after hw work [[phab:T290881|T290881]]
* 07:39 jayme@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'sync'.
* 07:39 jayme@deploy1002: helmfile [eqiad] START helmfile.d/admin 'sync'.
* 07:38 jayme@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 07:37 jayme@deploy1002: helmfile [codfw] START helmfile.d/admin 'sync'.
* 07:34 jayme@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
* 07:33 jayme@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
* 07:32 jayme@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
* 07:31 jayme@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
* 06:21 ryankemper: [Elastic] Restart of `relforge` complete
* 06:05 ryankemper: [Elastic] Cluster in green status, proceeding to next and final node => `ryankemper@relforge1003:~$ sudo systemctl restart elasticsearch_6@relforge-eqiad-small-alpha.service && sudo systemctl restart elasticsearch_6@relforge-eqiad.service`
* 05:53 ryankemper: [Elastic] `ryankemper@relforge1004:~$ sudo systemctl restart elasticsearch_6@relforge-eqiad-small-alpha.service && sudo systemctl restart elasticsearch_6@relforge-eqiad.service`
* 05:48 ryankemper: [Elastic] Performing rolling restarts of `relforge`. `relforge1003` is the master so I'll restart `relforge1004` first to minimize disruption
* 03:00 ejegg: updated payments-wiki from {{Gerrit|23d0ffac66}} to {{Gerrit|6d3560d083}}
* 02:30 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:28 tstarling@deploy1002: Synchronized wmf-config/CommonSettings.php: enable Parsoid API everywhere (duration: 01m 04s)
* 02:27 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:11 mutante: [grafana2001:~] $ sudo systemctl start rsync-var-lib-grafana  because of "PROBLEM - Check systemd state on grafana2001 is CRITICAL: CRITICAL - degraded" because of some race condition where a file vanished during sync


== 2015-09-16 ==
== 2021-10-06 ==
* 23:27 bblack: updating eqiad switch configs for lvs1007-1012 vlan/trunk settings
* 23:57 mutante: releases2002 - rm /srv/org/wikimedia/reprepro/conf/distributions - contains only jessie-mediawiki - see 725670 and EOL of MediaWiki 1.31
* 23:19 logmsgbot: krenair@tin Synchronized php-1.26wmf23/extensions/MobileFrontend/resources/mobile.overlays/Overlay.less: https://gerrit.wikimedia.org/r/#/c/238865/ (duration: 00m 11s)
* 23:32 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:13 gwicke: started `nodetool rebuild -- eqiad` on restbase-test200{1,2
* 23:29 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:03 cwdent: updated payments from 9fc8ab40b7f70c7b588c2b9e7b5c94b1f893faa1 to 155cdeb737c01baf62551292764fd2f5a93a9a63
* 23:21 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:26 ejegg: updated SmashPig from fdb053efa617162ac9f695e493c390987a069140 to d5895428d1d8ebc5a6e172e8cdec6dbec0b10d85
* 23:21 jforrester@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:726955{{!}}Adding and use wordmark in ckbwiki (T288368)]] (duration: 01m 04s)
* 22:08 urandom: disabling puppet in RESTBase eqiad staging cluster to test new code and config
* 23:20 jforrester@deploy1002: Synchronized static/images/mobile/copyright/wikipedia-wordmark-ckb.svg: Config: [[gerrit:726955{{!}}Adding and use wordmark in ckbwiki (T288368)]] (duration: 01m 04s)
* 22:08 ottomata: powercycling  analytcis1029, it is down?
* 23:18 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:47 cscott: updated OCG to version 4032a596ce6eb442b02cc6ee9b79263b1eb23275
* 23:16 jforrester@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:726603{{!}}Enable NewUserMessage for ptwikivoyage (T290820)]] (duration: 01m 05s)
* 19:42 ejegg: updated crm from abc34b87ee9d1dbb1176f1929a3d748e1ee5ac7b to 15ea14f61338ca9f34e9ccb9f56eae14a161380a
* 22:30 mutante: re-enabling puppet on mw*, an-worker* after deploying gerrit:726954. no issue this time
* 19:38 ori: Deployed statsv 0bfd9f06f / change I050a12d3b
* 22:23 mutante: temp. disabling puppet on an-worker*, mw*
* 18:47 logmsgbot: twentyafterfour@tin rebuilt wikiversions.cdb and synchronized wikiversions files: group1 wikis to 1.26wmf23
* 20:50 mutante: global puppet failure - revert is merged, puppet run will recover on next run everywhere. partially forcing with cumin, partially letting it recover naturally
* 18:38 logmsgbot: twentyafterfour@tin Synchronized php-1.26wmf23: syncing wmf23 ahead of deployment to group1 (duration: 01m 35s)
* 20:43 mutante: [cumin1001:~] $ sudo cumin -b 5 -p 95 'mw2*' 'run-puppet-agent -q --failed-only'
* 17:34 paravoid: asw-d-eqiad: toggling RE mastership again
* 19:08 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:26 godog: stop puppet on restbase* to apply https://gerrit.wikimedia.org/r/#/c/238738/ / merge / reenable puppet
* 19:06 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:54 _joe_: turned on the hhvm tmh, stopping the zend ones for testing
* 19:05 brennen@deploy1002: Synchronized php: group1 wikis to 1.38.0-wmf.3 refs [[phab:T281167|T281167]] (duration: 01m 03s)
* 16:44 logmsgbot: oblivian@tin Synchronized wmf-config/CommonSettings.php: use ffmpeg whereever possible (duration: 00m 12s)
* 19:04 brennen@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.38.0-wmf.3  refs [[phab:T281167|T281167]]
* 16:16 bblack: upgrading pybal on lvs400[12]
* 19:01 brennen: 1.38.0-wmf.3 train ([[phab:T281167|T281167]]): still unblocked after triage meeting, rolling to group1
* 16:12 bblack: upgrading pybal on lvs400[34], lvs300[34]
* 18:52 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:08 bblack: upgrading pybal on lvs200[123]
* 18:50 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:05 bblack: upgrading pybal on lvs200[456]
* 18:44 legoktm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Revert disabling static mapframes on eswiki (duration: 01m 14s)
* 15:44 _joe_: uploading pybal 1.10 to reprepro, installing to the test cluster
* 18:36 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:24 moritzm: uploaded debdeploy 0.0.6 to apt.wikimedia.org
* 18:34 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:10 hashar: Started using Nodepool spawned instances.  Moved integration-jjb-config-diff Jenkins job to Nodepool with https://gerrit.wikimedia.org/r/#/c/238752/ . See also: https://phabricator.wikimedia.org/T112750
* 18:31 legoktm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: eswiki: Disable static mapframes ([[phab:T291736|T291736]]) (duration: 01m 17s)
* 15:05 logmsgbot: demon@tin Synchronized wmf-config/CommonSettings.php: Add m.wikidata.org to wgCrossSiteAJAXdomains (duration: 00m 12s)
* 18:25 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:51 _joe_: experimenting on testwiki for poolcounter failure scenarios
* 18:22 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:45 moritzm: enabled ferm on mw1010 (jobrunner) in eqiad
* 18:22 legoktm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: viwikibooks: Set $wgRestrictDisplayTitle to false ([[phab:T289837|T289837]]) (duration: 01m 21s)
* 14:27 paravoid: asw-d-eqiad: toggling RE mastership
* 17:16 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:18 paravoid: disabling/ignoring asw-d-eqiad @ librenms
* 17:10 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:09 jynus: upgrading and restarting db1051
* 16:53 bd808@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 13:57 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Depool es1051 for maintenance (duration: 00m 12s)
* 16:47 brennen@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.38.0-wmf.3  refs [[phab:T281167|T281167]]
* 13:40 urandom: initiating Cassandra repair on restbase1007 (nodetool repair -pr)
* 16:47 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 13:40 logmsgbot: catrope@tin Synchronized php-1.26wmf23: (no message) (duration: 01m 37s)
* 16:43 brennen: 1.38.0-wmf.3 train ([[phab:T281167|T281167]]): unblocked, rolling to group0
* 13:35 moritzm: repooled mw1149-mw1151 (with ferm enabled)
* 16:41 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 13:24 moritzm: depooled mw1149-mw1151 (for enabling ferm)
* 16:35 brennen@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/Scribunto/includes/engines/LuaCommon/LanguageLibrary.php: Backport: [[gerrit:726596{{!}}Replace deprecated ParserOptions::getUser with ::getUserIdentity (T292589)]] (duration: 01m 04s)
* 13:19 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Reverting depool of es1055 (duration: 00m 12s)
* 16:35 jynus: stopping db1127 for hw maintenance [[phab:T292366|T292366]]
* 13:15 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Depool es1055 for maintenance (duration: 00m 12s)
* 16:31 jynus@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1127.eqiad.wmnet with reason: hw maintenance
* 13:03 paravoid: disabling asw-d-eqiad xe-8/0/23, xe-8/0/24, xe-8/0/25, xe-8/0/26, xe-8/0/27, xe-8/0/28; servers reboot-looping -> asw-d's SNMP unhappy -> librenms unhappy -> faidon's mailbox unhappy
* 16:31 jynus@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on db1127.eqiad.wmnet with reason: hw maintenance
* 12:48 moritzm: repooled mw1115-mw1117, mw1119 (with ferm enabled)
* 16:28 brennen@deploy1002: Synchronized php-1.38.0-wmf.2/extensions/Scribunto/includes/engines/LuaCommon/LanguageLibrary.php: Backport: [[gerrit:726597{{!}}Replace deprecated ParserOptions::getUser with ::getUserIdentity (T292589)]] (duration: 01m 10s)
* 12:42 moritzm: depooling mw1115-mw1117, mw1119 (mw1118 was already depooled) to enable ferm
* 16:22 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:32 moritzm: repooled mw1019-mw1025 with ferm enabled
* 16:19 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:24 jynus: making db1069 a sibling of db1055 (s1)
* 16:11 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:13 godog: create restbase user on cassandra test cluster
* 16:08 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:07 moritzm: depooled mw1019-mw1025 (to enable ferm)
* 16:01 volans@cumin2002: END (PASS) - Cookbook sre.experimental.reimage (exit_code=0) for host sretest1002.eqiad.wmnet
* 10:52 logmsgbot: catrope@tin Synchronized php-1.26wmf23: (no message) (duration: 02m 04s)
* 15:45 brennen: 1.38.0-wmf.3 train ([[phab:T281167|T281167]]): proceeding to deploy backports for [[phab:T292589|T292589]]
* 10:49 logmsgbot: catrope@tin Synchronized php-1.26wmf22: (no message) (duration: 02m 12s)
* 15:37 volans@cumin2002: START - Cookbook sre.experimental.reimage for host sretest1002.eqiad.wmnet
* 10:48 jynus: reenabling semisync on db1072 and db1073
* 15:35 volans: installer spicerack 1.0.4 on cumin2002
* 10:47 logmsgbot: catrope@tin scap aborted: (no message) (duration: 00m 21s)
* 12:50 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:47 logmsgbot: catrope@tin Started scap: (no message)
* 12:48 volans: uploaded spicerack_1.0.4 to apt.wikimedia.org buster-wikimedia,bullseye-wikimedia
* 10:24 logmsgbot: catrope@tin Synchronized php-1.26wmf23/includes/changes/EnhancedChangesList.php: T112738 (duration: 00m 12s)
* 12:26 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host testvm2004.codfw.wmnet
* 10:09 logmsgbot: aude@tin Synchronized arbitraryaccess.dblist: (no message) (duration: 00m 11s)
* 12:23 jgiannelos@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 09:37 awight: ruthlessly disabled PayPal IPN listener
* 12:18 effie: pool mw1455 mw1422
* 08:12 moritzm: repooled mw1153 with ferm enabled
* 12:17 urbanecm: wikiadmin@10.64.0.164(viwiki)> delete from growthexperiments_mentee_data; # cleanup after disabling mentor dashboard backend
* 07:57 jynus: truncated some tables from ContentTranslation extension on x1
* 12:16 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2004.codfw.wmnet
* 07:57 moritzm: depooled mw1153 (it's an image scaler, of course) to enable ferm
* 12:13 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|1aa67d4846f39f59127a835cb7a8ed2974506025}}: viwiki: Disable mentor dashboard backend ([[phab:T278920|T278920]]) (duration: 01m 06s)
* 07:56 moritzm: depooled mw1153 (videoscaler) to enable ferm
* 12:02 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:31 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Wed Sep 16 06:31:58 UTC 2015 (duration 31m 57s)
* 11:59 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 03:28 logmsgbot: ori@tin Synchronized php-1.26wmf22/vendor/monolog/monolog/src/Monolog/Logger.php: Iccfda47689: monolog: Don't waste milliseconds counting microseconds (duration: 00m 12s)
* 11:56 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts testvm2003.codfw.wmnet
* 03:27 logmsgbot: ori@tin Synchronized php-1.26wmf23/vendor/monolog/monolog/src/Monolog/Logger.php: Iccfda47689: monolog: Dont waste milliseconds counting microseconds ; sync-file php-1.26wmf22/vendor/monolog/monolog/src/Monolog/Logger.php Iccfda47689: monolog: Dont waste milliseconds counting microseconds (duration: 00m 12s)
* 11:55 XioNoX: esams - Advertise 185.15.59.0/24 instead of 185.15.58.0/23 - [[phab:T288505|T288505]] - [[phab:T283050|T283050]]
* 03:12 logmsgbot: l10nupdate@tin LocalisationUpdate completed (1.26wmf23) at 2015-09-16 03:12:08+00:00
* 11:46 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2003.codfw.wmnet
* 03:05 logmsgbot: l10nupdate@tin Synchronized php-1.26wmf23/cache/l10n: l10nupdate for 1.26wmf23 (duration: 10m 30s)
* 10:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2025.codfw.wmnet
* 02:38 logmsgbot: l10nupdate@tin LocalisationUpdate completed (1.26wmf22) at 2015-09-16 02:38:48+00:00
* 10:52 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2025.codfw.wmnet
* 02:35 logmsgbot: l10nupdate@tin Synchronized php-1.26wmf22/cache/l10n: l10nupdate for 1.26wmf22 (duration: 07m 02s)
* 10:50 jelto: disable puppet on gitlab1001 to test puppetized code on GitLab replica - [[phab:T283076|T283076]]
* 01:03 logmsgbot: krinkle@tin Synchronized php-1.26wmf23/resources/src/mediawiki/mediawiki.js: hotfix Ia2fcd13f4 (duration: 00m 12s)
* 10:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-replica1003.wikimedia.org
* 00:29 logmsgbot: krinkle@tin Synchronized php-1.26wmf22/resources/src/mediawiki/mediawiki.js: hotfix Ia2fcd13f4 (duration: 00m 11s)
* 10:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-replica1003.wikimedia.org
* 00:15 logmsgbot: legoktm@tin Synchronized php-1.26wmf23/extensions/CentralAuth/includes/: Use set() for tokens with unique keys (duration: 00m 12s)
* 10:06 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:14 logmsgbot: legoktm@tin Synchronized php-1.26wmf22/extensions/CentralAuth/includes/: Use set() for tokens with unique keys (duration: 00m 12s)
* 10:04 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:11 bblack: reinstalling lvs400[12] to jessie (traffic on 400[34], already jessie)
* 10:04 urbanecm@deploy1002: Synchronized wmf-config/: {{Gerrit|01633739462f3bf09ae4e50b955454921ea4fbf9}}: Delete gettingstarted-with-category-suggestions dblist ([[phab:T235752|T235752]]; 2/2) (duration: 01m 05s)
* 00:08 logmsgbot: krenair@tin Synchronized php-1.26wmf23/extensions/VisualEditor/modules/ve-mw/ui/styles/dialogs: https://gerrit.wikimedia.org/r/#/c/238646/ (duration: 00m 12s)
* 10:01 urbanecm@deploy1002: Synchronized dblists/: {{Gerrit|01633739462f3bf09ae4e50b955454921ea4fbf9}}: Delete gettingstarted-with-category-suggestions dblist ([[phab:T235752|T235752]]; 1/2) (duration: 01m 04s)
* 09:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host copernicium.wikimedia.org
* 09:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host copernicium.wikimedia.org
* 09:19 jbond: update ipaddress6 fact - https://gerrit.wikimedia.org/r/c/operations/puppet/+/726625
* 09:15 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 09:13 ladsgroup@deploy1002: Synchronized php-1.38.0-wmf.2/extensions/Wikibase/repo/includes/ChangeModification/DispatchChangesJob.php: Backport: [[gerrit:725923{{!}}Don't fail job if subscribed wiki is unknown (T292446 T292440)]] (duration: 01m 15s)
* 09:13 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:29 volans@cumin2002: END (PASS) - Cookbook sre.experimental.reimage (exit_code=0) for host sretest1001.eqiad.wmnet
* 08:21 XioNoX: add ROAs for 185.15.58.0/24 and 185.15.59.0/24 - [[phab:T288505|T288505]] - [[phab:T283050|T283050]]
* 08:04 volans@cumin2002: START - Cookbook sre.experimental.reimage for host sretest1001.eqiad.wmnet
* 07:56 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript namespaceDupes.php plwikinews --fix # [[phab:T291344|T291344]]
* 07:56 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript namespaceDupes.php plwikinews # [[phab:T291344|T291344]]
* 07:55 urbanecm: mwdebug1001: scap pull ([[phab:T291344|T291344]] fix done)
* 07:51 urbanecm: Staging at mwdebug1001 for [[phab:T291344|T291344]]
* 05:53 kart_: Updated cxserver to use nodejs12 ([[phab:T290754|T290754]])
* 05:47 kartik@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'cxserver' for release 'production' .
* 05:39 kartik@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'cxserver' for release 'production' .
* 05:36 Amir1: start of mwscript extensions/Wikibase/repo/maintenance/pruneChanges.php --wiki wikidatawiki --number-of-days=2
* 05:31 kartik@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'cxserver' for release 'staging' .
* 04:54 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 04:49 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 04:29 ryankemper: [WDQS] `wdqs1012` is back up after restarting blazegraph (blazegraph was locked up)
* 04:27 ryankemper: [WDQS] `ryankemper@wdqs1012:~$ sudo systemctl restart wdqs-blazegraph` (attempting to bring downed `wdqs1012` back into health)
* 04:25 ryankemper: [WDQS] Repooling eqiad hosts following the brief outage from earlier: `wdqs1004`, `wdqs1006`, `wdqs1007`
* 03:19 eileen: civicrm revision changed from {{Gerrit|b6f5f71c18}} to {{Gerrit|82efd2e195}}, config revision is {{Gerrit|f4c57d4733}}
* 03:11 tstarling@deploy1002: Synchronized php-1.38.0-wmf.3/includes/CommentFormatter/RowCommentIterator.php: fix UBN [[phab:T292590|T292590]] (duration: 01m 04s)
* 01:39 legoktm: legoktm@mwmaint1002:~$ echo "https://en.wikiversity.org/static/images/mobile/copyright/wikiversity.svg" {{!}}mwscript purgeList.php
* 01:17 arlolra@deploy1002: Synchronized php-1.38.0-wmf.2/extensions/GlobalUserPage/includes/GlobalUserPage.php: Bump GlobalUserPage::PARSED_CACHE_VERSION for media DOM changes (duration: 01m 03s)
* 01:12 arlolra@deploy1002: Synchronized php-1.38.0-wmf.3/extensions/GlobalUserPage/includes/GlobalUserPage.php: Bump GlobalUserPage::PARSED_CACHE_VERSION for media DOM changes (duration: 01m 17s)
* 00:59 arlolra@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Enable legacy media dom on metawiki (duration: 01m 05s)
* 00:37 arlolra@deploy1002: Synchronized php-1.38.0-wmf.2/includes/resourceloader/ResourceLoaderSkinModule.php: Add a separate config for content.media.less 2/2 (duration: 01m 03s)
* 00:35 arlolra@deploy1002: Synchronized php-1.38.0-wmf.2/includes/DefaultSettings.php: Add a separate config for content.media.less 1/2 (duration: 01m 03s)
* 00:32 arlolra@deploy1002: Synchronized php-1.38.0-wmf.3/includes/resourceloader/ResourceLoaderSkinModule.php: Add a separate config for content.media.less 2/2 (duration: 01m 03s)
* 00:29 arlolra@deploy1002: Synchronized php-1.38.0-wmf.3/includes/DefaultSettings.php: Add a separate config for content.media.less 1/2 (duration: 01m 04s)
* 00:16 mutante: puppetmasters: rm /etc/logrotate.d/geoipupdate && systemctl start logrotate && puppet agent -tv
* 00:14 mutante: puppetmaster2002 - rm /etc/logrotate.d/geoipupdate (not managed by puppet anymore but not removed, caused duplicate logrotate config, made logrotate service fail), start logrotate
* 00:08 cstone: civicrm revision changed from {{Gerrit|34d3c3aae8}} to {{Gerrit|b6f5f71c18}}
* 00:01 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:725132{{!}}Add WN as an alias to project namespace in Polish Wikinews (T291344)]] (duration: 01m 04s)


== 2015-09-15 ==
== 2021-10-05 ==
* 23:51 logmsgbot: krenair@tin Synchronized php-1.26wmf22/extensions/WikimediaEvents/modules/ext.wikimediaEvents.geoFeatures.js: https://gerrit.wikimedia.org/r/#/c/238617/ (duration: 00m 12s)
* 23:54 tgr@deploy1002: Synchronized static/images/mobile/copyright/wikiversity.svg: Config: [[gerrit:725413{{!}}Wikiversity Logo Update for 2017 Logo Version (T292109)]] (duration: 01m 03s)
* 23:48 logmsgbot: krenair@tin Synchronized php-1.26wmf23/extensions/WikimediaEvents/modules/ext.wikimediaEvents.geoFeatures.js: https://gerrit.wikimedia.org/r/#/c/238618/ (duration: 00m 12s)
* 23:47 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:704376{{!}}Adding and use wordmark in azwiki (T284877)]] (duration: 01m 04s)
* 23:42 logmsgbot: krenair@tin Synchronized wmf-config/mobile.php: https://gerrit.wikimedia.org/r/#/c/238543/ (duration: 00m 14s)
* 23:44 tgr@deploy1002: Synchronized static/images/mobile/copyright/wikipedia-wordmark-az.svg: Config: [[gerrit:704376{{!}}Adding and use wordmark in azwiki (T284877)]] (duration: 01m 23s)
* 23:42 logmsgbot: krenair@tin Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/238543/ (duration: 00m 12s)
* 23:16 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:725386{{!}}Add image_suggestion_interaction event stream]] (duration: 01m 12s)
* 23:40 logmsgbot: krenair@tin Synchronized php-1.26wmf22/includes/resourceloader/ResourceLoader.php: https://gerrit.wikimedia.org/r/#/c/238544/ (duration: 00m 11s)
* 23:02 legoktm: deleting old stretch docker images from the registry for [[phab:T292485|T292485]]
* 23:38 logmsgbot: krenair@tin Synchronized php-1.26wmf23/includes/resourceloader/ResourceLoader.php: https://gerrit.wikimedia.org/r/#/c/238545/ (duration: 00m 11s)
* 22:24 brennen@deploy1002: rebuilt and synchronized wikiversions files: Revert group0 wikis to 1.38.0-wmf.2
* 23:24 yurik: deployed kartotherian & tilerator
* 22:20 brennen: 1.38.0-wmf.3 ([[phab:T281167|T281167]]) rolling back to testwikis for the day; will revisit in US-morning
* 23:22 logmsgbot: krenair@tin Synchronized php-1.26wmf22/extensions/EventLogging/modules/ext.eventLogging.core.js: https://gerrit.wikimedia.org/r/#/c/238512/ (duration: 00m 12s)
* 20:47 brennen@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.38.0-wmf.3  refs [[phab:T281167|T281167]]
* 23:16 logmsgbot: krenair@tin Synchronized php-1.26wmf23/extensions/EventLogging/modules/ext.eventLogging.core.js: https://gerrit.wikimedia.org/r/#/c/238513/ (duration: 00m 12s)
* 20:44 brennen@deploy1002: Synchronized php-1.38.0-wmf.3/includes/page: Backport: [[gerrit:726594{{!}}Pre-format comments for non-local files too]] ([[phab:T292570|T292570]]) (duration: 01m 04s)
* 21:15 logmsgbot: ebernhardson@tin Synchronized php-1.26wmf22/extensions/WikimediaEvents/: touch files edited in I0cb6fe37e and re-sync to cluster (duration: 00m 13s)
* 20:18 mutante: puppetmaster1003 et al - converting maxmind geoip database fetching from cron to timers
* 21:13 logmsgbot: twentyafterfour@tin rebuilt wikiversions.cdb and synchronized wikiversions files: group0 wikis to 1.26wmf23
* 20:06 mutante: cumin 'puppetmaster*' "disable-puppet '[[phab:T288844|T288844]] - [[phab:T273673|T273673]] - gerrit:721595 - $<nowiki>{</nowiki>USER<nowiki>}</nowiki>'"
* 21:10 logmsgbot: twentyafterfour@tin Finished scap: sync 1.26wmf23 to testwiki, once more because mw1010 overloaded (duration: 03m 52s)
* 19:30 mutante: restoring /home/amire80 from and to mwmaint2002 via Bacula bconsole ([[phab:T292573|T292573]])
* 21:07 logmsgbot: twentyafterfour@tin Started scap: sync 1.26wmf23 to testwiki, once more because mw1010 overloaded
* 19:09 brennen@deploy1002: rebuilt and synchronized wikiversions files: Revert group0 wikis to 1.38.0-wmf.2
* 21:05 logmsgbot: twentyafterfour@tin Finished scap: sync 1.26wmf23 to testwiki, again (duration: 47m 49s)
* 19:03 brennen@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.38.0-wmf.3  refs [[phab:T281167|T281167]]
* 20:47 mutante: mw1010 - extremely slow,finally got on and attempted to restart hhvm. load going down
* 18:26 brennen@deploy1002: Pruned MediaWiki: 1.37.0-wmf.23 (duration: 01m 57s)
* 20:17 logmsgbot: twentyafterfour@tin Started scap: sync 1.26wmf23 to testwiki, again
* 18:23 brennen@deploy1002: Pruned MediaWiki: 1.37.0-wmf.21 (duration: 04m 20s)
* 20:17 logmsgbot: twentyafterfour@tin scap aborted: sync 1.26wmf23 to testwiki (duration: 82m 58s)
* 18:21 brennen: 1.38.0-wmf.3 ([[phab:T281167|T281167]]): pruning old branches, starting with 1.37.0-wmf.21, proceeeding to 1.37.0-wmf.23 if time allows
* 20:05 ottomata: restarted mysql (and oozie) on analytics1027 to start mysql binlogging
* 18:11 ppchelko@deploy1002: Synchronized wmf-config: Remove mb_strtoupper overrides for HHVM [[phab:T219279|T219279]] Php72ToUpper.php removal (duration: 01m 06s)
* 18:54 logmsgbot: twentyafterfour@tin Started scap: sync 1.26wmf23 to testwiki
* 18:04 ppchelko@deploy1002: Synchronized wmf-config/CommonSettings.php: Remove mb_strtoupper overrides for HHVM [[phab:T219279|T219279]] CS.php (duration: 01m 06s)
* 16:55 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Depool es1003, es1004, es1007 and es1010 for decommision (duration: 00m 12s)
* 17:55 brennen@deploy1002: Finished scap: testwikis wikis to 1.38.0-wmf.3  refs [[phab:T281167|T281167]] (duration: 45m 59s)
* 16:40 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Revert depool db1055 for maintenance (duration: 00m 11s)
* 17:12 btullis@cumin1001: END (PASS) - Cookbook sre.aqs.roll-restart (exit_code=0) for AQS aqs cluster: Roll restart of all AQS's nodejs daemons. - btullis@cumin1001
* 16:39 ottomata: reinstalling analytics1015
* 17:09 brennen@deploy1002: Started scap: testwikis wikis to 1.38.0-wmf.3  refs [[phab:T281167|T281167]]
* 16:32 RoanKattouw: Putting wmf22 versions of Echo and MobileFrontend on mw1017 for testing
* 17:03 btullis@cumin1001: START - Cookbook sre.aqs.roll-restart for AQS aqs cluster: Roll restart of all AQS's nodejs daemons. - btullis@cumin1001
* 16:30 logmsgbot: ebernhardson@tin Synchronized php-1.26wmf22/extensions/WikimediaEvents/WikimediaEvents.php: touch file that is serving old version in prod (duration: 00m 12s)
* 17:02 btullis@cumin1001: END (FAIL) - Cookbook sre.aqs.roll-restart (exit_code=99) for AQS aqs cluster: Roll restart of all AQS's nodejs daemons. - btullis@cumin1001
* 16:29 logmsgbot: ebernhardson@tin Synchronized php-1.26wmf22/extensions/WikimediaEvents/modules/ext.wikimediaEvents.searchSuggest.js: Touch file that is serving old version in prod (duration: 00m 12s)
* 17:02 btullis@cumin1001: START - Cookbook sre.aqs.roll-restart for AQS aqs cluster: Roll restart of all AQS's nodejs daemons. - btullis@cumin1001
* 16:27 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Depool db1055 for maintenance (duration: 00m 11s)
* 16:56 brennen: successfully applied security patches for 1.38.0-wmf.3 train ([[phab:T281167|T281167]])
* 16:11 bblack: traffic DNS depooled out of codfw for now T112639
* 16:47 brennen: coordinated with deployment backup and starting train prep for 1.38.0-wmf.3 ([[phab:T281167|T281167]]), branched at {{Gerrit|65279490f82c785181b8b6961e40901a4aaafca4}}
* 15:38 logmsgbot: thcipriani@tin Synchronized wmf-config: SWAT: CX: Enable suggestion for testwiki (part 2) [[gerrit:237327]] (duration: 00m 13s)
* 15:57 jbond@cumin2002: END (PASS) - Cookbook sre.puppet.renew-cert (exit_code=0) for puppetboard2002.codfw.wmnet: Renew puppet certificate - jbond@cumin2002
* 15:37 logmsgbot: thcipriani@tin Synchronized wmf-config/InitialiseSettings.php: SWAT: CX: Enable suggestion for testwiki (part 1) [[gerrit:237327]] (duration: 00m 12s)
* 15:57 jbond@cumin2002: START - Cookbook sre.puppet.renew-cert for puppetboard2002.codfw.wmnet: Renew puppet certificate - jbond@cumin2002
* 15:31 logmsgbot: thcipriani@tin Synchronized php-1.26wmf22/extensions/UploadWizard/resources/jquery/jquery.mwCoolCats.js: SWAT: Do not fail horribly when invalid categories are passed [[gerrit:238421]] (duration: 00m 12s)
* 15:38 jbond: reimage puppetboard2002
* 15:14 logmsgbot: thcipriani@tin Synchronized wmf-config/PoolCounterSettings-eqiad.php: SWAT: poolcounter: enable connect_timeout for testwiki [[gerrit:238109]] (duration: 00m 19s)
* 15:15 jbond@cumin1001: END (PASS) - Cookbook sre.puppet.renew-cert (exit_code=0) for puppetboard1002.eqiad.wmnet: Renew puppet certificate - jbond@cumin1001
* 15:09 logmsgbot: thcipriani@tin Synchronized wmf-config/PoolCounterSettings-codfw.php: SWAT: poolcounter: add connect_timeout in codfw [[gerrit:238108]] (duration: 00m 12s)
* 15:15 jbond@cumin1001: START - Cookbook sre.puppet.renew-cert for puppetboard1002.eqiad.wmnet: Renew puppet certificate - jbond@cumin1001
* 15:06 logmsgbot: thcipriani@tin Synchronized wmf-config/Wikibase.php: SWAT: Exclude Flow topic boards and Draft NS from Special:UnconnectedPages [[gerrit:229197]] (duration: 00m 11s)
* 15:10 moritzm: imported routinator 0.10.1-1bullseye to thirdparty/routinator for bullseye-wikimedia [[phab:T292503|T292503]]
* 14:51 godog: bounce cassandra on test cluster to deploy  https://gerrit.wikimedia.org/r/236391
* 14:58 jbond: reimage puppetboard1002
* 14:22 cmjohnson1: swapped disk on db1043
* 14:40 effie: depool  mw1455 and mw1422
* 13:12 moritzm: repool mw1114 (with ferm enabled)
* 14:30 Pchelolo: run foreachwiki uppercaseTitlesForUnicodeTransition.php --charmap current_to_php7_overrides.php [[phab:T219279|T219279]]
* 13:11 bblack: failing over LVS service in ulsfo to secondariess (400[12] pybal stopped, traffic on jessie-based 400[34])
* 13:51 reedy@deploy1002: Synchronized wmf-config/CommonSettings.php: ExtensionDistributor - Drop REL1_31, start REL1_37 (duration: 00m 57s)
* 12:53 moritzm: depooled mw1114 (for enabling ferm)
* 13:46 Pchelolo: run renameInvalidUsernames.php --wiki loginwiki --list /tmp/rename_users_for_uppercase_all.txt [[phab:T219279|T219279]]
* 11:42 moritzm: repool mw1018 (with ferm enabled)
* 13:39 elukey@cumin1001: END (FAIL) - Cookbook sre.aqs.roll-restart (exit_code=99) for AQS aqs cluster: Roll restart of all AQS's nodejs daemons. - elukey@cumin1001
* 11:23 moritzm: depooled mw1018 (for enabling ferm)
* 13:39 elukey@cumin1001: START - Cookbook sre.aqs.roll-restart for AQS aqs cluster: Roll restart of all AQS's nodejs daemons. - elukey@cumin1001
* 08:53 _joe_: created a 100 G partition on a LV on copper, for /tmp
* 13:23 ppchelko@deploy1002: Synchronized php-1.38.0-wmf.2/maintenance/uppercaseTitlesForUnicodeTransition.php: Backport uppercaseTitlesForUnicodeTransition.php maintenance script improvements [[phab:T219279|T219279]] (duration: 00m 58s)
* 08:24 godog: bounce ms-be2006, xfs
* 12:53 ema: upload varnish 6.0.8-1wm1 to apt.wikimedia.org [[phab:T292290|T292290]]
* 08:22 moritzm: bumped default size of iptables connection tracking table to 256k
* 12:43 elukey: import AMD ROCm 4.2 to buster-wikimedia's thirdparty/amd-rocm42 - [[phab:T287267|T287267]]
* 06:10 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Tue Sep 15 06:10:52 UTC 2015 (duration 10m 51s)
* 12:24 ema: deployment-cache-text06: upgrade varnish to 6.0.8-1wm1 [[phab:T292290|T292290]]
* 02:46 logmsgbot: l10nupdate@tin LocalisationUpdate completed (1.26wmf22) at 2015-09-15 02:46:50+00:00
* 11:58 hnowlan: reverted restbase2023 to use CN=hostname certificate due to loading errors
* 02:40 logmsgbot: l10nupdate@tin Synchronized php-1.26wmf22/cache/l10n: l10nupdate for 1.26wmf22 (duration: 10m 53s)
* 11:57 hnowlan@cumin1001: END (FAIL) - Cookbook sre.cassandra.roll-restart (exit_code=99) for nodes matching restbase2023.codfw.wmnet: Switching over to using FQDN certificate - hnowlan@cumin1001
* 02:18 logmsgbot: legoktm@tin Synchronized php-1.26wmf22/extensions/MobileFrontend: Revert Echo to 1.26wmf21 state (duration: 00m 11s)
* 11:57 hnowlan@cumin1001: START - Cookbook sre.cassandra.roll-restart for nodes matching restbase2023.codfw.wmnet: Switching over to using FQDN certificate - hnowlan@cumin1001
* 02:18 logmsgbot: legoktm@tin Synchronized php-1.26wmf22/extensions/Echo: Revert Echo to 1.26wmf21 state (duration: 00m 12s)
* 11:37 hnowlan@cumin1001: END (FAIL) - Cookbook sre.cassandra.roll-restart (exit_code=99) for nodes matching restbase2023.codfw.wmnet: Switching over to using FQDN certificate - hnowlan@cumin1001
* 01:30 logmsgbot: krinkle@tin Synchronized php-1.26wmf22/resources/src: T112287 (duration: 00m 11s)
* 11:28 hnowlan@cumin1001: START - Cookbook sre.cassandra.roll-restart for nodes matching restbase2023.codfw.wmnet: Switching over to using FQDN certificate - hnowlan@cumin1001
* 00:49 bblack: reinstalling lvs300[34] to jessie
* 11:17 hnowlan_: disabling puppet on cassandra nodes for rollout of 724061 - defaulting to cn=fqdn certificates
* 00:43 logmsgbot: ebernhardson@tin Synchronized wmf-config/CirrusSearch-labs.php: noop sync of labs config change (duration: 00m 11s)
* 11:15 effie: upgrade scap to 4.0.2 - [[phab:T291095|T291095]]
* 00:03 logmsgbot: tstarling@tin Synchronized php-1.26wmf22/extensions/ParsoidBatchAPI: for I56d28e9a for RT testing, not live yet (duration: 00m 13s)
* 11:12 urbanecm@deploy1002: Synchronized dblists/commonsuploads.dblist: {{Gerrit|04524992865b0ae5750eb6fb0a374aa74a65b383}}: Enable local uploads for tcywiki ([[phab:T166763|T166763]]) (duration: 00m 59s)
* 10:11 vgutierrez: update acme-chief to version 0.32 on acmechief hosts - [[phab:T290249|T290249]]
* 10:09 vgutierrez: update acme-chief to version 0.32 on acmechief-test hosts - [[phab:T290249|T290249]]
* 10:06 vgutierrez: upload acme-chief 0.32 to apt.wm.o (buster) - [[phab:T290249|T290249]]
* 09:46 hnowlan_: generated cassandra certificate using FQDN for restbase2023
* 09:09 topranks: updating routinator on rpki2001 ([[phab:T291543|T291543]])
* 08:59 dcausse: depool and restart blazegraph on wdqs1007
* 08:51 moritzm: installing openssl security updates for stretch (buster/bullseye already fixed)
* 07:58 moritzm: installing apache security updates
* 07:57 elukey: upgrade GPU drivers (AMD ROCm 4.3.1) on an-worker1[096-101]
* 07:27 ladsgroup@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 07:26 ladsgroup@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 07:26 elukey@puppetmaster1001: conftool action : set/pooled=yes; selector: name=wdqs1004.wmnet
* 06:38 elukey: reboot an-worker1096 after installing new GPU drivers
* 04:20 eileen: civicrm revision changed from {{Gerrit|d74e9aa0a1}} to {{Gerrit|34d3c3aae8}}, config revision is {{Gerrit|cae09f7691}}


== 2015-09-14 ==
== 2021-10-04 ==
* 23:27 logmsgbot: ebernhardson@tin Synchronized php-1.26wmf22/extensions/WikimediaEvents/: Change bucket selection methods in CompletionSuggestions AB test (duration: 00m 12s)
* 23:30 foks: resetting some emails used for abuse by a globally-banned user
* 23:23 logmsgbot: ebernhardson@tin Synchronized php-1.26wmf22/extensions/UploadWizard/: Swat out badtoken fix to UploadWizard in 1.26wmf22 (duration: 00m 12s)
* 23:19 jdrewniak@deploy1002: Synchronized portals: Wikimedia Portals Update: [[gerrit:726084{{!}} Bumping portals to master (T128546)]] (duration: 00m 59s)
* 22:37 yurik: deployed tilerator
* 23:18 jdrewniak@deploy1002: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: [[gerrit:726084{{!}} Bumping portals to master (T128546)]] (duration: 00m 59s)
* 21:15 logmsgbot: ori@tin Synchronized php-1.26wmf22/extensions/TitleBlacklist: Ie44fcb500: Avoid checking blacklists in isBlacklisted() for existing titles (duration: 00m 12s)
* 23:13 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|75645c9cc59b37dbf59942eabbc014b7dc147626}}: Add explicit config for licensing/copyright message overrides ([[phab:T284097|T284097]]) (duration: 00m 59s)
* 21:15 mutante: labnodepool1001 - re-enable puppet and nodepool
* 23:05 mutante: [deneb:~] $ sudo systemctl start docker-reporter-releng-images
* 20:59 logmsgbot: legoktm@tin Synchronized php-1.26wmf22/extensions/Echo/: Hack around OOUI's icon pack being too large by creating our own (duration: 00m 12s)
* 22:54 mutante: puppetmaster2001 - rm /etc/logrotate.d/geoipupdate_ipinfo  and geoipupdate_ipinfo ; running puppet, starting logrotate service
* 20:53 cscott: updated OCG to version 5811056e28f2bc6408b6da96095352ab381bb11f
* 18:13 jiji@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:21 andrewbogott: graceful’d apache2 on labcontrol1001
* 16:51 bblack: rolling restart of haproxy for DoTLS on dns300[12],authdns1001,authdns2001 to recycle connections
* 20:15 subbu: deployed parsoid sha 3d5f4359
* 15:24 vgutierrez: pool cp5006
* 19:25 logmsgbot: legoktm@tin Synchronized php-1.26wmf22/extensions/Echo/: Only load nojs Special:Notifications styles on the special page (duration: 00m 12s)
* 15:17 ladsgroup@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 18:05 urandom: rebuilding restbase-test2001.codfw (nodetool rebuild -- eqiad)
* 15:16 ladsgroup@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 16:12 logmsgbot: catrope@tin Synchronized php-1.26wmf22/extensions/Echo/: For real this time (duration: 00m 11s)
* 14:50 phuedx: phuedx@mwmaint1002:~$ mwscript extensions/SecurePoll/cli/purgeDecryptionKeys.php --wiki=votewiki --before="20210101000000"
* 16:06 ottomata: stopping hdfs journalnode on analytics1011 to copy journal edits to new journalnodes on analytics1035 and analytics1052
* 14:46 ladsgroup@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 15:46 godog: switch to openjdk-8 and bounce cassandra on restbase-test200*
* 14:46 effie: uploading scap 4.0.2 - [[phab:T291095|T291095]]
* 15:39 bblack: reinstalling lvs4003, lvs4004 (jessie upgrade: T96375) (typo earlier)
* 14:45 ladsgroup@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 15:39 bblack: reinstalling lvs4003, lvs4003 (jessie upgrade: T96375)
* 14:39 brennen: gitlab: upgrade to 14.3.2 (note there was an additional patch release on 2021-10-01) complete ([[phab:T292256|T292256]])
* 15:34 logmsgbot: catrope@tin Synchronized php-1.26wmf22/extensions/Echo/: SWAT (duration: 00m 13s)
* 14:25 Amir1: cleaning up wb_changes_subscription rows from closed wikis ([[phab:T292440|T292440]])
* 15:05 logmsgbot: krenair@tin Synchronized .gitignore: https://gerrit.wikimedia.org/r/#/c/237529/ (duration: 00m 13s)
* 14:24 brennen: gitlab: downtime for upgrade to 14.3.1
* 15:05 logmsgbot: krenair@tin Synchronized docroot/noc: https://gerrit.wikimedia.org/r/#/c/237529/ (duration: 00m 12s)
* 14:19 elukey: import AMD ROCm 4.3.1 packages in buster-wikimedia's thirdparty/amd-rocm431 - [[phab:T287267|T287267]]
* 15:04 logmsgbot: krenair@tin Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/237529/ (duration: 00m 11s)
* 14:19 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:02 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/234980/ (duration: 00m 12s)
* 14:17 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:38 godog: stop puppet on restbase-test2001 and turn up cassandra
* 14:13 ladsgroup@deploy1002: Synchronized wmf-config/Wikibase.php: Config: [[gerrit:725905{{!}}Explicitly enable dispatching and pruning for wikidata (T48643)]] (duration: 00m 58s)
* 12:56 bblack: rebooting lvs2006 to test eth hw params stuff...
* 14:08 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:55 logmsgbot: krenair@tin Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/238125/ (duration: 00m 13s)
* 14:06 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:50 urandom: starting Cassandra repair on restbase1003 (nodetool repair -pr)
* 14:03 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on gitlab2001.wikimedia.org with reason: upgrade gitlab2001 to new version https://phabricator.wikmiedia.org/T292256
* 12:32 godog: enable dc encryption on cassandra test cluster and rolling restart
* 14:03 jelto@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on gitlab2001.wikimedia.org with reason: upgrade gitlab2001 to new version https://phabricator.wikmiedia.org/T292256
* 11:33 mobrovac: citoid deploying d569951
* 14:01 ladsgroup@deploy1002: Synchronized wmf-config: Config: [[gerrit:725502{{!}}Enable dispatching via jobs everywhere (T48643)]] (duration: 01m 00s)
* 10:35 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Depool es1002, es1005, es1008 (duration: 00m 12s)
* 12:57 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:04 jynus: db1029 (x1-master) temporarily saturated by connections- flow was unresponsive for 10 minutes; migration partially aborted
* 12:56 ladsgroup@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:725785{{!}}Enable dispatching for wikidatawiki and commonswiki (T292088)]] (duration: 01m 00s)
* 09:08 jynus: applying schema change to flowdb
* 12:54 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:52 godog: rename cassandra test cluster and restart
* 12:08 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:44 godog: silence mendelevium for today, status unclear T111532
* 12:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on ganeti2026.codfw.wmnet with reason: Ganeti tests
* 08:30 jynus: endinf profiling and executing pt-query-digest on db1043 [ETA:4h]
* 12:02 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on ganeti2026.codfw.wmnet with reason: Ganeti tests
* 07:52 godog: reboot ms-be1010 to pick up disk ordering change
* 12:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on ganeti2025.codfw.wmnet with reason: Ganeti tests
* 04:48 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Mon Sep 14 04:47:58 UTC 2015 (duration 47m 57s)
* 12:02 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on ganeti2025.codfw.wmnet with reason: Ganeti tests
* 02:29 logmsgbot: l10nupdate@tin LocalisationUpdate completed (1.26wmf22) at 2015-09-14 02:29:48+00:00
* 12:01 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:26 logmsgbot: l10nupdate@tin Synchronized php-1.26wmf22/cache/l10n: l10nupdate for 1.26wmf22 (duration: 06m 59s)
* 11:55 urbanecm: EU B&C window done
* 01:31 Krinkle: mwscript deleteEqualMessages.php --wiki sqwiki
* 11:55 urbanecm@deploy1002: Synchronized multiversion/MWWikiversions.php: {{Gerrit|508cf5cc6d213373f7c9ba1cdef142ebc8398022}}: Let DB expressions intersect DB lists ([[phab:T290609|T290609]]) (duration: 00m 58s)
* 11:50 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|a855078cf52d88cc2cd27a0adc7c6a680c80dd39}}: dewiki, nlwiki: Bump Growth features to 80% ([[phab:T288420|T288420]], [[phab:T285254|T285254]]) (duration: 00m 58s)
* 11:46 urbanecm@deploy1002: Synchronized private/PrivateSettings.php: {{Gerrit|5728376}}: Update [[phab:T250887|T250887]] mitigations (duration: 00m 58s)
* 11:44 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|b0a96bed4562bcc975187b1d34626201d407404b}}: Undeploy GettingStarted V: Remove now-obsolete logging channels ([[phab:T235752|T235752]]) (duration: 00m 59s)
* 11:42 urbanecm@deploy1002: Synchronized wmf-config/extension-list: {{Gerrit|9709bcfc8dacbcd1704471df08c31cec0711bea6}}: Undeploy GettingStarted IV: Dont build i18n ([[phab:T235752|T235752]]) (duration: 00m 58s)
* 11:39 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|d60f332785868797e7ecc9b5e410616d5604b392}}: Undeploy getting started III: Dont set wmgUseGettingStarted, now ignored ([[phab:T235752|T235752]]) (duration: 00m 58s)
* 11:37 urbanecm@deploy1002: Synchronized wmf-config/CommonSettings.php: {{Gerrit|9eaf960c4b7c304be57dfc8d248aca0c6501d04c}}: Undeploy GettingStarted II: Dont load regardless of config ([[phab:T235752|T235752]]) (duration: 00m 58s)
* 11:35 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|1c7405ad1eb323a8da524819f17d6f1a66afaa57}}: Undeploy GettingStarted I: Disable on all wikis ([[phab:T235752|T235752]]) (duration: 00m 58s)
* 11:31 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:724992{{!}}Remove deprecated SectionTranslationTargetLanguage config (T290302)]] (duration: 00m 58s)
* 11:22 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:725042{{!}}Add wikisource-bot.toolforge.org to Commons copy upload list (T292213)]] (duration: 00m 59s)
* 11:16 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:720058{{!}}Add IA-Upload tool domains to Commons wgCopyUploadsDomains (T287241)]] (duration: 00m 59s)
* 11:12 akosiaris@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mathoid' for release 'production' .
* 11:10 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mathoid' for release 'production' .
* 11:07 jiji@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:06 akosiaris@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'mathoid' for release 'staging' .
* 11:04 effie: depool  wtp1026 for tests
* 11:04 effie: pool  wtp1025
* 10:59 jiji@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 09:13 akosiaris: hbal -L -G row_C -X on ganeti01.svc.eqiad.wmnet
* 08:59 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@071f7c3] (eqiad): Increase mirrored traffic to 100% for eqiad (duration: 00m 54s)
* 08:58 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@071f7c3] (eqiad): Increase mirrored traffic to 100% for eqiad
* 07:37 joal@deploy1002: Finished deploy [analytics/refinery@38f3adc] (hadoop-test): Hotfix analytics deploy TEST [analytics/refinery@38f3adc] (duration: 06m 14s)
* 07:31 joal@deploy1002: Started deploy [analytics/refinery@38f3adc] (hadoop-test): Hotfix analytics deploy TEST [analytics/refinery@38f3adc]
* 07:30 joal@deploy1002: Finished deploy [analytics/refinery@38f3adc] (thin): Hotfix analytics deploy THIN [analytics/refinery@38f3adc] (duration: 00m 06s)
* 07:30 joal@deploy1002: Started deploy [analytics/refinery@38f3adc] (thin): Hotfix analytics deploy THIN [analytics/refinery@38f3adc]
* 07:29 joal@deploy1002: Finished deploy [analytics/refinery@38f3adc]: Hotfix analytics deploy [analytics/refinery@38f3adc] (duration: 19m 18s)
* 07:19 dcausse: restarting blazegraph on wdqs2001 & wdqs2004 (allocators burning too quickly)
* 07:18 elukey: depool + restart blazegraph + restart updater for wdqs1006
* 07:18 elukey@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=wdqs1006.wmnet
* 07:18 elukey@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=wdqs1004.wmnet
* 07:10 joal@deploy1002: Started deploy [analytics/refinery@38f3adc]: Hotfix analytics deploy [analytics/refinery@38f3adc]
* 07:02 godog: swift eqiad-prod: add weight to ms-be10[64-67] - [[phab:T290546|T290546]]
* 06:44 elukey: depool + restart blazegraph + restart updater on wdqs1004
* 05:50 ladsgroup@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 05:49 ladsgroup@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'production' .
* 05:47 ladsgroup@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'changeprop-jobqueue' for release 'staging' .


== 2015-09-13 ==
== 2021-10-03 ==
* 06:02 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sun Sep 13 06:02:52 UTC 2015 (duration 2m 51s)
* 14:45 _joe_: restarting acmechief on acmechief1001
* 02:40 logmsgbot: l10nupdate@tin LocalisationUpdate completed (1.26wmf22) at 2015-09-13 02:40:43+00:00
* 12:55 kormat@cumin1001: dbctl commit (dc=all): 'Depool db1127, bad ram', diff saved to https://phabricator.wikimedia.org/P17414 and previous config saved to /var/cache/conftool/dbconfig/20211003-125530-kormat.json
* 02:34 logmsgbot: l10nupdate@tin Synchronized php-1.26wmf22/cache/l10n: l10nupdate for 1.26wmf22 (duration: 10m 13s)
* 08:24 elukey: powercycle cp5006 (unresponsive to ssh, remote tty available but not able to login as root, no prometheus metrics in hours)
* 08:23 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp5006.eqsin.wmnet


== 2015-09-12 ==
== 2021-10-02 ==
* 20:15 ori: Rolling back Echo to 1.26wmf21 branch on mw1017 (testwiki) to measure increase in render-blocking CSS size
* 17:28 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 19:21 urandom: performing Cassandra repair on restbase1002 (nodetool repair -pr)
* 16:10 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 14:50 jynus: phab.wmfusercontent.org has been temporarily switched to phab.wikivoyage.org due to cert issues
* 04:52 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Sat Sep 12 04:52:01 UTC 2015 (duration 52m 0s)
* 02:35 logmsgbot: l10nupdate@tin LocalisationUpdate completed (1.26wmf22) at 2015-09-12 02:35:36+00:00
* 02:32 logmsgbot: l10nupdate@tin Synchronized php-1.26wmf22/cache/l10n: l10nupdate for 1.26wmf22 (duration: 06m 54s)


== 2015-09-11 ==
== 2021-10-01 ==
* 21:21 hashar: shutdown nodepool on labnodepool1001.eqiad.wmnet until monday
* 23:19 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 18:01 logmsgbot: legoktm@tin Synchronized php-1.26wmf22/extensions/Echo/: Echo regression fixes #2 (duration: 00m 12s)
* 22:27 mutante: puppetmaster2001 - systemctl reset-failed
* 16:43 logmsgbot: krinkle@tin Synchronized php-1.26wmf22/resources/src/mediawiki/mediawiki.js: T112232 (duration: 00m 12s)
* 22:16 mutante: puppetmaster2001 systemctl disable geoip_update_ipinfo.timer
* 16:37 logmsgbot: legoktm@tin Synchronized php-1.26wmf22/extensions/Echo/: Echo regression backports (duration: 00m 12s)
* 22:15 mutante: puppetmaster2001 - sudo /usr/local/bin/geoipupdate_job after adding new shell command and timer - succesfully downloaded enterprise database for [[phab:T288844|T288844]]
* 16:35 logmsgbot: legoktm@tin Synchronized php-1.26wmf22/resources/src/mediawiki/mediawiki.js: resourceloader: Document internal mw.loader#jobs property (again) (duration: 00m 13s)
* 21:56 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 16:33 legoktm: ssh: connect to host mw1156.eqiad.wmnet port 22: Connection timed out
* 21:44 mutante: puppetmasters - temp. disabling puppet one more time, now for a different deploy, to fetch an additional MaxMind database - [[phab:T288844|T288844]]
* 16:32 paravoid: powercycling mw1156, multiple kernel backtraces in console output
* 21:19 mutante: puppetmaster2001 - puppet removed cron sync_volatile and cron sync_ca - starting and verifying new timers: 'systemctl status sync-puppet-volatile', 'systemctl status sync-puppet-ca' [[phab:T273673|T273673]]
* 16:32 logmsgbot: legoktm@tin Synchronized php-1.26wmf22/resources/src/mediawiki/mediawiki.js: resourceloader: Document internal mw.loader#jobs property (duration: 01m 07s)
* 21:12 mutante: puppetmaster1002, puppetmaster1003, puppetmaster2002, puppetmaster2003: re-enabled puppet, they are backends. backends don't have the sync cron/job/timer, so noop as well, just like 1004/1005/2004/2005. this just leaves the actual change on 2001  - [[phab:T273673|T273673]]
* 16:15 cmjohnson1: mw1031 rebooting for f/w update
* 21:07 mutante: puppetmaster1004, puppetmaster1005, puppetmaster2004, puppetmaster2005: re-enabled puppet, they are "insetup" role
* 16:07 bblack: enabled LRO+GRO on lvs200[123], starting pybal there again ([456] testing looks good so far)
* 21:06 mbsantos@deploy1002: Finished deploy [kartotherian/deploy@d309a6e] (eqiad): tegola: reduce load to 50% during the weekend (duration: 00m 54s)
* 15:45 bblack: enabled LRO+GRO on lvs200[456] (backups). Stopping pybal on lvs200[123] to test...
* 21:05 mbsantos@deploy1002: Started deploy [kartotherian/deploy@d309a6e] (eqiad): tegola: reduce load to 50% during the weekend
* 15:11 cmjohnson1: swapping pem2 cr2-eqiad
* 21:05 mutante: puppetmaster1001 - re-enabled puppet, noop as expected, the passive host pulls from the active one, so only 2001 has the cron/job/timer
* 10:03 jynus: starting nodepool in labnodepool1001
* 21:05 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 09:21 jynus: starting profiling of phabricator db (db1043). Very low overhead.
* 21:02 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 06:03 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Fri Sep 11 06:03:00 UTC 2015 (duration 2m 59s)
* 21:01 legoktm@deploy1002: Synchronized wmf-config/CommonSettings.php: Revert "Have PdfHandler use Shellbox on Commons for 10% of requests" (duration: 00m 59s)
* 02:41 logmsgbot: l10nupdate@tin LocalisationUpdate completed (1.26wmf22) at 2015-09-11 02:41:24+00:00
* 20:58 mutante: temp disabling puppet on puppetmasters - deploying gerrit:724115 (gerrit:723310) [[phab:T273673|T273673]]
* 02:34 logmsgbot: l10nupdate@tin Synchronized php-1.26wmf22/cache/l10n: l10nupdate for 1.26wmf22 (duration: 11m 18s)
* 18:58 robh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-db1002.eqiad.wmnet with reason: REIMAGE
* 01:16 logmsgbot: ori@tin Synchronized php-1.26wmf22/extensions/TitleBlacklist: 9bf13dbe0b, 3203b045f7 (duration: 00m 12s)
* 18:56 robh@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-db1001.eqiad.wmnet with reason: REIMAGE
* 18:55 robh@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on an-db1002.eqiad.wmnet with reason: REIMAGE
* 18:53 robh@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on an-db1001.eqiad.wmnet with reason: REIMAGE
* 18:07 robh@cumin1001: END (ERROR) - Cookbook sre.experimental.reimage (exit_code=97) for host an-db1001.eqiad.wmnet
* 18:05 robh@cumin1001: START - Cookbook sre.experimental.reimage for host an-db1001.eqiad.wmnet
* 17:58 effie: depool mw1025, mw1319, mw1312 for test
* 16:20 dancy: testing upcoming Scap 4.0.2 release on beta
* 14:04 bblack: C:envoyproxy (appservers and others): restarting envoyproxy
* 14:04 bblack: C:envoyproxy (appservers and others): ca-certificates updated via cumin to workaround [[phab:T292291|T292291]] issues
* 13:45 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 13:45 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 13:23 bblack: manually trying LE expired root workaround on mwdebug1001 with puppet disabled ...
* 13:12 gehel@cumin1001: START - Cookbook sre.wdqs.data-reload
* 13:11 gehel@cumin1001: END (ERROR) - Cookbook sre.wdqs.data-reload (exit_code=97)
* 13:11 gehel@cumin1001: START - Cookbook sre.wdqs.data-reload
* 13:10 gehel@cumin1001: START - Cookbook sre.wdqs.data-reload
* 11:42 jgiannelos@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'tegola-vector-tiles' for release 'main' .
* 11:11 jynus: manually migrating some vms out of ganeti1009 to avoid excessive memory pressure
* 10:58 marostegui@cumin1001: dbctl commit (dc=all): 'db1164 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17413 and previous config saved to /var/cache/conftool/dbconfig/20211001-105849-root.json
* 10:57 marostegui@cumin1001: dbctl commit (dc=all): 'db1177 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17412 and previous config saved to /var/cache/conftool/dbconfig/20211001-105735-root.json
* 10:43 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@d4caf6d] (eqiad): Increase mirrored traffic to 100% for eqiad (duration: 00m 49s)
* 10:43 marostegui@cumin1001: dbctl commit (dc=all): 'db1164 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17411 and previous config saved to /var/cache/conftool/dbconfig/20211001-104345-root.json
* 10:43 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@d4caf6d] (eqiad): Increase mirrored traffic to 100% for eqiad
* 10:42 marostegui@cumin1001: dbctl commit (dc=all): 'db1177 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17410 and previous config saved to /var/cache/conftool/dbconfig/20211001-104232-root.json
* 10:28 marostegui@cumin1001: dbctl commit (dc=all): 'db1164 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17409 and previous config saved to /var/cache/conftool/dbconfig/20211001-102841-root.json
* 10:27 marostegui@cumin1001: dbctl commit (dc=all): 'db1177 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17408 and previous config saved to /var/cache/conftool/dbconfig/20211001-102728-root.json
* 10:13 marostegui@cumin1001: dbctl commit (dc=all): 'db1164 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17407 and previous config saved to /var/cache/conftool/dbconfig/20211001-101338-root.json
* 10:12 marostegui@cumin1001: dbctl commit (dc=all): 'db1177 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17406 and previous config saved to /var/cache/conftool/dbconfig/20211001-101224-root.json
* 10:00 jgiannelos@deploy1002: Finished deploy [kartotherian/deploy@c123ab9] (eqiad): Increase mirrored traffic to 80% for eqiad (duration: 00m 51s)
* 09:59 jgiannelos@deploy1002: Started deploy [kartotherian/deploy@c123ab9] (eqiad): Increase mirrored traffic to 80% for eqiad
* 09:58 marostegui@cumin1001: dbctl commit (dc=all): 'db1164 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17405 and previous config saved to /var/cache/conftool/dbconfig/20211001-095834-root.json
* 09:57 marostegui@cumin1001: dbctl commit (dc=all): 'db1177 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17404 and previous config saved to /var/cache/conftool/dbconfig/20211001-095720-root.json
* 09:55 marostegui: Upgrade db1164 and db1177
* 09:54 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1177 and db1164 for upgrade', diff saved to https://phabricator.wikimedia.org/P17403 and previous config saved to /var/cache/conftool/dbconfig/20211001-095433-marostegui.json
* 09:49 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17402 and previous config saved to /var/cache/conftool/dbconfig/20211001-094913-root.json
* 09:49 marostegui@cumin1001: dbctl commit (dc=all): 'db1135 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17401 and previous config saved to /var/cache/conftool/dbconfig/20211001-094902-root.json
* 09:38 urbanecm: [urbanecm@mwmaint1002 ~]$ mwscript extensions/GrowthExperiments/maintenance/updateMenteeData.php --wiki=frwiki --force # to get an idea about timing for [[phab:T290609|T290609]], runs in a tmux session under my account
* 09:34 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17400 and previous config saved to /var/cache/conftool/dbconfig/20211001-093410-root.json
* 09:33 marostegui@cumin1001: dbctl commit (dc=all): 'db1135 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17399 and previous config saved to /var/cache/conftool/dbconfig/20211001-093358-root.json
* 09:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host testvm2002.codfw.wmnet
* 09:19 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17398 and previous config saved to /var/cache/conftool/dbconfig/20211001-091906-root.json
* 09:18 marostegui@cumin1001: dbctl commit (dc=all): 'db1135 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17397 and previous config saved to /var/cache/conftool/dbconfig/20211001-091854-root.json
* 09:04 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17396 and previous config saved to /var/cache/conftool/dbconfig/20211001-090402-root.json
* 09:03 marostegui@cumin1001: dbctl commit (dc=all): 'db1135 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17395 and previous config saved to /var/cache/conftool/dbconfig/20211001-090351-root.json
* 09:02 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2002.codfw.wmnet
* 09:00 _joe_: restarting pybal low-traffic in eqiad to pick up the drop of proxyfetch to kubernetes services
* 08:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1172 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17394 and previous config saved to /var/cache/conftool/dbconfig/20211001-084859-root.json
* 08:48 marostegui@cumin1001: dbctl commit (dc=all): 'db1135 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17393 and previous config saved to /var/cache/conftool/dbconfig/20211001-084847-root.json
* 08:44 marostegui: Upgrade db1135 and db1172
* 08:44 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1172 for upgrade', diff saved to https://phabricator.wikimedia.org/P17392 and previous config saved to /var/cache/conftool/dbconfig/20211001-084435-marostegui.json
* 08:44 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1135 for upgrade', diff saved to https://phabricator.wikimedia.org/P17391 and previous config saved to /var/cache/conftool/dbconfig/20211001-084411-marostegui.json
* 08:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db2080 [[phab:T290868|T290868]]', diff saved to https://phabricator.wikimedia.org/P17390 and previous config saved to /var/cache/conftool/dbconfig/20211001-084345-marostegui.json
* 08:17 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2026.codfw.wmnet
* 08:15 _joe_: restarting pybal in codfw to pick up config changes
* 08:09 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2026.codfw.wmnet
* 08:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on testvm[2001,2003].codfw.wmnet with reason: Ganeti tests
* 08:03 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 8:00:00 on testvm[2001,2003].codfw.wmnet with reason: Ganeti tests
* 06:28 marostegui@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17388 and previous config saved to /var/cache/conftool/dbconfig/20211001-062846-root.json
* 06:27 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 06:24 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 100%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17387 and previous config saved to /var/cache/conftool/dbconfig/20211001-062453-root.json
* 06:13 marostegui@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17386 and previous config saved to /var/cache/conftool/dbconfig/20211001-061342-root.json
* 06:13 elukey@deploy1002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality' for release 'main' .
* 06:09 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 75%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17385 and previous config saved to /var/cache/conftool/dbconfig/20211001-060949-root.json
* 05:58 marostegui@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17384 and previous config saved to /var/cache/conftool/dbconfig/20211001-055838-root.json
* 05:54 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 50%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17383 and previous config saved to /var/cache/conftool/dbconfig/20211001-055445-root.json
* 05:43 marostegui@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17382 and previous config saved to /var/cache/conftool/dbconfig/20211001-054335-root.json
* 05:39 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 25%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17381 and previous config saved to /var/cache/conftool/dbconfig/20211001-053942-root.json
* 05:28 marostegui@cumin1001: dbctl commit (dc=all): 'db1114 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17380 and previous config saved to /var/cache/conftool/dbconfig/20211001-052831-root.json
* 05:26 marostegui: Upgrade db1114
* 05:25 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1114 for upgrade', diff saved to https://phabricator.wikimedia.org/P17379 and previous config saved to /var/cache/conftool/dbconfig/20211001-052509-marostegui.json
* 05:24 marostegui@cumin1001: dbctl commit (dc=all): 'db1119 (re)pooling @ 10%: After upgrade', diff saved to https://phabricator.wikimedia.org/P17378 and previous config saved to /var/cache/conftool/dbconfig/20211001-052438-root.json
* 05:22 marostegui: Upgrade db1119
* 05:21 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1119 for upgrade', diff saved to https://phabricator.wikimedia.org/P17377 and previous config saved to /var/cache/conftool/dbconfig/20211001-052133-marostegui.json
* 04:00 legoktm@deploy1002: Synchronized wmf-config/CommonSettings.php: Have PdfHandler use Shellbox on Commons for 10% of requests ([[phab:T289228|T289228]]) (duration: 00m 59s)
* 04:00 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 03:57 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 03:24 bd808@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 03:15 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .


== 2015-09-10 ==
== 2021-09-30 ==
* 23:52 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/237064/ (duration: 00m 11s)
* 23:57 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:47 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/237056/ (duration: 00m 11s)
* 23:54 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:13 logmsgbot: krenair@tin Synchronized wmf-config/wikitech.php: https://gerrit.wikimedia.org/r/221825 (duration: 00m 13s)
* 23:51 reedy@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Put a https protocol into values (duration: 01m 00s)
* 23:04 logmsgbot: krenair@tin Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/224771 (duration: 00m 12s)
* 23:48 dpifke@deploy1002: Finished deploy [statsv/statsv@afeff42]: Deploy statsv with Kafka TLS support (not yet enabled) [[phab:T290131|T290131]] (duration: 00m 05s)
* 21:13 logmsgbot: legoktm@tin Synchronized php-1.26wmf22/extensions/Echo/modules: Align popup footer buttons to take 50% width each (duration: 00m 15s)
* 23:48 dpifke@deploy1002: Started deploy [statsv/statsv@afeff42]: Deploy statsv with Kafka TLS support (not yet enabled) [[phab:T290131|T290131]]
* 20:50 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: depool es1001; increase weight of es1015 and es1019 (duration: 00m 19s)
* 23:41 dpifke@deploy1002: Finished deploy [performance/coal@1be49f8]: Deploy Coal with Kafka TLS support (not yet enabled) [[phab:T290131|T290131]] (duration: 01m 07s)
* 20:47 ottomata: restarting eventlogging with 12 client side processors on eventlog1001
* 23:40 dpifke@deploy1002: Started deploy [performance/coal@1be49f8]: Deploy Coal with Kafka TLS support (not yet enabled) [[phab:T290131|T290131]]
* 20:31 ottomata: turning off varnishncsa eventlogging eventlistener instances on frontend caches, it is now superseded by varnishkafka
* 23:39 dpifke@deploy1002: Finished deploy [performance/navtiming@29264fb]: Deploy Navtiming with Kafka TLS support (not yet enabled) [[phab:T290131|T290131]] (duration: 00m 05s)
* 20:28 mutante: killed/restarted ganglia aggregator process for mobile-cache, upload cache, misc esams ...
* 23:39 dpifke@deploy1002: Started deploy [performance/navtiming@29264fb]: Deploy Navtiming with Kafka TLS support (not yet enabled) [[phab:T290131|T290131]]
* 20:22 jynus: last SCAP failed on 266/466 hosts
* 23:34 ejegg: updated Fundraising CiviCRM from {{Gerrit|d4da344274}} to {{Gerrit|d74e9aa0a1}}
* 20:21 mutante: killed/restarted ganglia aggregator process for text-caches esams on hooft
* 22:09 legoktm@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'shellbox-media' for release 'main' .
* 20:17 yurik: deployed kartotherian
* 22:07 legoktm@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'shellbox-media' for release 'main' .
* 20:08 logmsgbot: jynus@tin Synchronized wmf-config/db-eqiad.php: Depool es1001; increase weight of es1015 and es1019 (duration: 00m 11s)
* 22:06 legoktm@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'shellbox-media' for release 'main' .
* 19:11 logmsgbot: twentyafterfour@tin rebuilt wikiversions.cdb and synchronized wikiversions files: wikipedia wikis to 1.26wmf22
* 21:53 legoktm@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'shellbox-timeline' for release 'main' .
* 19:09 logmsgbot: twentyafterfour@tin Synchronized php-1.26wmf22/extensions/CentralNotice: deploy https://gerrit.wikimedia.org/r/#/c/237458/ (duration: 00m 12s)
* 21:06 eileen: civicrm revision changed from {{Gerrit|2ecb8f0bcd}} to {{Gerrit|d4da344274}}, config revision is {{Gerrit|77cb7ec866}}
* 18:57 twentyafterfour: restarted phd on iridium
* 20:54 ryankemper: [WCQS] `ryankemper@wcqs1003:~$ sudo pool` (merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/725110 to unbreak readiness probe)
* 18:51 logmsgbot: twentyafterfour@tin Synchronized php-1.26wmf22/extensions/Wikidata: Deploy wikidata patch: https://gerrit.wikimedia.org/r/#/c/237449/ (duration: 00m 19s)
* 20:54 topranks: Routinator on rpki1001 upgraded to  0.10.0 and working again after force refresh.
* 18:23 logmsgbot: twentyafterfour@tin Synchronized php-1.26wmf22: deploy https://gerrit.wikimedia.org/r/#/c/237440/ (duration: 01m 42s)
* 20:49 brennen: gitlab1001: upgrade to 14.2.5 complete
* 18:09 cmjohnson1: reseating pem2 cr2-eqiad
* 20:32 brennen: gitlab2001, gitlab1001: downtime for upgrades to 14.2.5
* 16:52 akosiaris: puppetswat done
* 20:18 ryankemper: [WCQS] `ryankemper@wcqs1003:~$ sudo depool` (not sure why pybal can't depool it, the other 2 servers are pooled)
* 16:50 mobrovac: restbase rolling restart of rb100x
* 19:51 topranks: Updating routinator on rpki1001 [[phab:T291543|T291543]]
* 16:49 mobrovac: restbase enabled puppet on rb100x
* 19:39 legoktm@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'shellbox-timeline' for release 'main' .
* 16:13 akosiaris: started puppetSWAT
* 19:38 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:10 logmsgbot: marktraceur@tin Finished scap: Make sure codfw got the last few patches sync'd to it (duration: 07m 36s)
* 19:37 legoktm@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'shellbox-timeline' for release 'main' .
* 16:03 logmsgbot: marktraceur@tin Started scap: Make sure codfw got the last few patches sync'd to it
* 19:36 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:02 logmsgbot: marktraceur@tin Synchronized php-1.26wmf22/: [SWAT] [wmf22] Revert opera redirect loop fix that caused redirect loops in Firefox (duration: 02m 30s)
* 19:26 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:55 mobrovac: restbase disabled puppet on rb100x
* 19:21 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:45 logmsgbot: marktraceur@tin Synchronized php-1.26wmf22/extensions/UploadWizard/resources/transports/mw.FormDataTransport.js: [SWAT] [wmf22] Always set 'offset' with chunked uploads, even for first chunk (offset == 0) (duration: 02m 21s)
* 19:14 jhuneidi@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.38.0-wmf.2  refs [[phab:T281166|T281166]]
* 15:26 ottomata: started hadoop decomission of analytics1016
* 19:11 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:21 logmsgbot: marktraceur@tin Synchronized wmf-config/: [SWAT] Attempting another sync to mw2187 hoping it's up now (duration: 02m 22s)
* 19:08 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:05 logmsgbot: marktraceur@tin Synchronized wmf-config/: [SWAT] [config] Beta: Enable Content Translation suggestions (duration: 02m 22s)
* 19:07 thcipriani@deploy1002: Synchronized php-1.38.0-wmf.2/extensions/MobileFrontend: Backport: [[gerrit:724979{{!}}Fix search within pages alignment (T292107)]] (duration: 01m 09s)
* 13:35 moritzm: enabled ferm on mediawiki app servers in codfw
* 19:05 thcipriani@deploy1002: Synchronized php-1.38.0-wmf.1/extensions/EventBus/includes/EventBus.php: Backport: [[gerrit:724481{{!}}Guard against undefined index notice when setting x-client-ip (T288853)]] (duration: 01m 09s)
* 13:30 jynus: performing schema change and maintenance on officewiki and public all wikis with flow enabled
* 19:04 thcipriani@deploy1002: Synchronized php-1.38.0-wmf.2/extensions/EventBus/includes/EventBus.php: Backport: [[gerrit:724480{{!}}Guard against undefined index notice when setting x-client-ip (T288853)]] (duration: 01m 09s)
* 12:51 moritzm: enabled ferm on mediawiki API servers in codfw
* 18:58 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:36 moritzm: enabled ferm on mediawiki video scalers, image scalers and job runners in codfw
* 18:58 thcipriani@deploy1002: Synchronized php-1.38.0-wmf.2/skins/Vector/resources/skins.vector.styles.legacy/components/MenuDropdown.less: Backport: [[gerrit:724798{{!}}Restore original more menu padding in legacy Vector (T289163)]] (duration: 01m 08s)
* 09:20 mobrovac: restbase deploying 0182962
* 18:54 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 06:13 logmsgbot: l10nupdate@tin ResourceLoader cache refresh completed at Thu Sep 10 06:13:14 UTC 2015 (duration 13m 13s)