You are browsing a read-only backup copy of Wikitech. The primary site can be found at wikitech.wikimedia.org

Difference between revisions of "Server Admin Log"

From Wikitech-static
imported>Labslogbot
(reset password for User:Tonval after identity verification (Jamesofur))
imported>Stashbot
(ladsgroup@deploy1002: Synchronized php-1.37.0-wmf.23/includes/libs/rdbms/database/Database.php: (no justification provided) (duration: 00m 57s))
Line 1: Line 1:
== 2015-08-06 ==
== 2021-09-18 ==
* 00:49 Jamesofur: reset password for User:Tonval after identity verification
* 01:47 ladsgroup@deploy1002: Synchronized php-1.37.0-wmf.23/includes/libs/rdbms/database/Database.php: (no justification provided) (duration: 00m 57s)
* 00:42 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: (no message) (duration: 00m 12s)
* 01:01 ladsgroup@deploy1002: Synchronized php-1.37.0-wmf.23/includes/libs/rdbms/database/Database.php: (no justification provided) (duration: 01m 03s)
* 00:34 twentyafterfour: phabricator upgrade complete
* 00:33 ebernhardson: es1.7.1 upgrade on elastic1017
* 00:31 RoanKattouw: <twentyafterfour> ok I'm gonna take phabricator down for upgrade
* 00:04 gwicke: restarted restbase old-render clean-up scripts on wikipedia html and data-parsoid


== 2015-08-05 ==
== 2021-09-17 ==
* 23:56 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: Unset $wgDiff (duration: 00m 12s)
* 21:28 legoktm@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:37 logmsgbot: ori Synchronized php-1.26wmf17/extensions/FlaggedRevs: I2089b21fc (duration: 00m 13s)
* 21:19 legoktm@cumin1001: START - Cookbook sre.dns.netbox
* 23:32 logmsgbot: bd808 Synchronized php-1.26wmf17/extensions/VisualEditor/extension.json: VisualEditor b/c anon IP module name fix (Ia92ecc0) (duration: 00m 12s)
* 19:00 hnowlan@cumin1001: END (PASS) - Cookbook sre.postgresql.postgres-init (exit_code=0)
* 23:09 logmsgbot: bd808 Synchronized wmf-config/CommonSettings.php: beta: Configure  and  (I7d20abb) (duration: 00m 13s)
* 17:02 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on cloudcephosd1022.eqiad.wmnet with reason: REIMAGE
* 23:01 logmsgbot: ori Synchronized php-1.26wmf17/extensions/EducationProgram: I2089b21fc (duration: 00m 13s)
* 17:02 hnowlan@cumin1001: START - Cookbook sre.postgresql.postgres-init
* 23:00 ebernhardson: es1.7.1 upgrade on elastic1016
* 17:00 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephosd1022.eqiad.wmnet with reason: REIMAGE
* 22:47 logmsgbot: krinkle Synchronized php-1.26wmf17/includes/resourceloader/ResourceLoaderModule.php: T104950 (duration: 00m 12s)
* 16:48 hnowlan@cumin1001: END (PASS) - Cookbook sre.postgresql.postgres-init (exit_code=0)
* 22:47 logmsgbot: krinkle Synchronized php-1.26wmf16/includes/resourceloader/ResourceLoaderModule.php: T104950 (duration: 00m 13s)
* 16:27 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:29 hoo: Started dumpwikidatajson.sh on snapshot1003 again to create a Wikidata json dump after earlier attempts this week and today failed.
* 16:25 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 22:27 logmsgbot: hoo Synchronized php-1.26wmf17/extensions/Wikidata/: Update Wikibase: Fix use class in CallbackFactory (duration: 00m 21s)
* 16:11 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:27 logmsgbot: hoo Synchronized php-1.26wmf16/extensions/Wikidata/: Update Wikibase: Fix use class in CallbackFactory (duration: 00m 20s)
* 16:04 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 22:27 ebernhardson: es1.7.1 upgrade on elastic1015
* 14:49 hnowlan@cumin1001: START - Cookbook sre.postgresql.postgres-init
* 21:44 subbu: deployed cherry-picked ba49b80bdc3a156604eb3996830af0d5bc45c503 hotfix to the parsoid cluster to deal with crashers from deploy earlier today
* 14:29 hnowlan@cumin1001: END (PASS) - Cookbook sre.postgresql.postgres-init (exit_code=0)
* 21:17 gwicke: finished deploy of restbase 9e177f3 (deploy 7006f9f) on restbase cluster
* 13:06 moritzm: installing 4.9.272 kernels on stretch hosts (no reboots yet)
* 21:12 hoo: Started dumpwikidatajson.sh on snapshot1003 to create a Wikidata json dump after earlier attempts this week failed.
* 11:28 hnowlan@cumin1001: START - Cookbook sre.postgresql.postgres-init
* 21:05 ebernhardson: es1.7.1 upgrade for es1014
* 11:14 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:59 gwicke: restbase 9e177f3 (deploy 7006f9f) canary deploy on restbase1001
* 11:09 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:56 logmsgbot: hoo Synchronized php-1.26wmf17/extensions/Wikidata/: Update Wikibase: Fix the dumpJson and the rebuildItemsPerSite maintenance scripts (duration: 00m 20s)
* 09:37 milimetric@deploy1002: Finished deploy [analytics/refinery@37e904a] (thin): Only syncing sanitize allowlist, deploying THIN for consistency (duration: 00m 07s)
* 20:55 logmsgbot: hoo Synchronized php-1.26wmf16/extensions/Wikidata/: Update Wikibase: Fix the dumpJson and the rebuildItemsPerSite maintenance scripts (duration: 00m 20s)
* 09:37 milimetric@deploy1002: Started deploy [analytics/refinery@37e904a] (thin): Only syncing sanitize allowlist, deploying THIN for consistency
* 20:25 subbu: deployed parsoid version d5a5722c
* 09:36 milimetric@deploy1002: Finished deploy [analytics/refinery@37e904a]: Only syncing sanitize allowlist (duration: 17m 43s)
* 20:22 logmsgbot: krinkle Synchronized php-1.26wmf16/includes/resourceloader/ResourceLoaderFileModule.php: T104950 (duration: 00m 12s)
* 09:19 milimetric@deploy1002: Started deploy [analytics/refinery@37e904a]: Only syncing sanitize allowlist
* 20:21 logmsgbot: krinkle Synchronized php-1.26wmf16/includes/resourceloader/ResourceLoader.php: T104950 (duration: 00m 11s)
* 08:00 jayme: restarting php-fpm on wtp1037 and wtp1030
* 20:13 logmsgbot: krinkle Synchronized php-1.26wmf17/includes/resourceloader/ResourceLoaderFileModule.php: T104950 (duration: 00m 12s)
* 02:28 ryankemper: [[phab:T290330|T290330]] [Remove WDQS codfw ~hourly restarts] Successfully rolled out to rest of fleet `sudo cumin 'C:query_service::crontasks' 'sudo run-puppet-agent --force && sudo systemctl reset-failed wdqs-restart-hourly-w-random-delay.timer'`
* 20:12 logmsgbot: krinkle Synchronized php-1.26wmf17/includes/resourceloader/ResourceLoader.php: T104950 (duration: 00m 13s)
* 02:22 ryankemper: [[phab:T290330|T290330]] [Remove WDQS codfw ~hourly restarts] `wdqs2001` and `wdqs2004` look fine after running `sudo systemctl reset-failed wdqs-restart-hourly-w-random-delay.timer` to clean up dangling timer
* 20:07 logmsgbot: ori Synchronized php-1.26wmf17/extensions/PageTriage: I2089b21fc: Updated mediawiki/core Project: mediawiki/extensions/PageTriage  22eddf4ad5bf6b3fe7c49af5812ce5fcfa5e1911 (duration: 00m 14s)
* 01:55 ryankemper: [[phab:T290330|T290330]] [Remove WDQS codfw ~hourly restarts] Testing on arbitrary codfw host: `ryankemper@wdqs2001:~$ sudo run-puppet-agent`
* 19:55 gwicke: re-enabled puppet on restbase staging cluster in preparation for deploy
* 01:48 ryankemper: [[phab:T290330|T290330]] [Remove WDQS codfw ~hourly restarts] `sudo cumin 'C:query_service::crontasks' 'sudo disable-puppet "Stop doing wdqs codfw ~hourly restarts - [[phab:T290330|T290330]]"'`
* 19:52 gwicke: disabled puppet on restbase hosts in preparation for the deploy
* 00:04 legoktm@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'shellbox-media' for release 'main' .
* 19:36 dcausse: es1.7.1: resume writes to indices
* 00:01 legoktm@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'shellbox-media' for release 'main' .
* 19:31 dcausse: es1.7.1: restart elastic1013
* 19:19 bblack: all caches depooled for thermal stuff repooled
* 18:54 bblack: depooled cp1060, cp1064 ( thermal batch 3: https://phabricator.wikimedia.org/T103226 )
* 18:37 dcausse: es1.7.1: restart elastic1012
* 18:34 logmsgbot: twentyafterfour rebuilt wikiversions.cdb and synchronized wikiversions files: group1 wikis to 1.26wmf17
* 18:07 bblack: depooled cp1059, cp1062, cp1067 ( thermal batch 2: https://phabricator.wikimedia.org/T103226 )
* 18:02 moritzm: restarted HHVM on appservers (mw1136-mw1158) for tidy/pcre security updates
* 17:56 dcausse: es1.7.1: restart elastic1011
* 17:48 dcausse: es1.7.1: freeze indices (take 2)
* 17:36 logmsgbot: bblack Synchronized wmf-config/squid-labs.php: (no message) (duration: 00m 12s)
* 17:15 moritzm: restarted HHVM on appservers (mw1149-mw1151, mw1161-1188, mw1209-1220) for tidy/pcre security updates
* 17:09 logmsgbot: hoo Finished scap: Rebuild l10n cache for wmf17, got forgotten during the train (duration: 26m 02s)
* 17:07 bblack: really depooled cp1046, cp1061, cp1066 ( thermal batch 1: https://phabricator.wikimedia.org/T103226 )
* 17:02 bblack: depooled cp1046, cp1061, cp1066 ( thermal batch 1: https://phabricator.wikimedia.org/T103226 )
* 16:43 logmsgbot: hoo Started scap: Rebuild l10n cache for wmf17, got forgotten during the train
* 16:28 bblack: cache puppets disabled for a little while, to make sure do_esi doesn't melt things
* 15:11 logmsgbot: thcipriani Synchronized php-1.26wmf17/extensions/ContentTranslation/modules/tools/ext.cx.tools.mt.js: SWAT: FIX: Not able to set cursor in previous sections [[gerrit:229328]] (duration: 00m 12s)
* 15:02 andrewbogott: rebooting labvirt1009
* 14:51 gwicke: stopped restbase on restbase1009
* 14:44 moritzm: restarted HHVM on appservers (mw1026-mw1113) for tidy/pcre security updates
* 14:42 logmsgbot: jynus Synchronized wmf-config/db-eqiad.php: depool db1056 (duration: 00m 12s)
* 14:29 logmsgbot: jynus Synchronized wmf-config/db-eqiad.php: repool db1059 (duration: 00m 13s)
* 13:16 hoo: Removed Wikidata JSON dumps from Monday and Tuesday as they were incomplete/ had the wrong serialization format
* 12:41 moritzm: restarted HHVM on canary appservers for tidy/pcre security updates, remaining app servers following soon
* 12:32 paravoid: upgrading asw-c-codfw and asw-d-codfw to newer junos
* 11:17 logmsgbot: jynus Synchronized wmf-config/db-eqiad.php: repool db1056, depool db1059 (duration: 00m 12s)
* 11:01 godog: depool restbase1009, investigating healthcheck returning 500s
* 10:52 godog: pool restbase100[789] in pybal
* 10:43 paravoid: upgrading asw-b-codfw to newer junos
* 10:36 jynus: applying schema change for s4 on codfw, some lag expected
* 09:08 dcausse: es1.7.1: upgrade elastic1010
* 07:46 dcausse: es1.7.1: upgrade elastic1009
* 07:12 logmsgbot: jynus Synchronized wmf-config/db-eqiad.php: depool db1056 for maintenance, db1064 set to 100% (duration: 00m 12s)
* 06:29 springle: finish OSC gerrit 228756 s5 wb_items_per_site.ips_site_page
* 06:27 logmsgbot: @tin ResourceLoader cache refresh completed at Wed Aug  5 06:27:08 UTC 2015 (duration 27m 7s)
* 06:26 dcausse: es1.7.1: upgrade elastic1008
* 04:56 ebernhardson: restarted elasticsearch on elastic1007 for 1.7.1 upgrade
* 03:34 logmsgbot: legoktm Synchronized wmf-config/InitialiseSettings.php: Disable two more wikis due to namespace conflicts - https://gerrit.wikimedia.org/r/229292 (duration: 00m 12s)
* 03:09 ebernhardson: restarted elasticsearch on elastic1006 for 1.7.1 upgrade
* 03:04 logmsgbot: @tin LocalisationUpdate completed (1.26wmf17) at 2015-08-05 03:04:08+00:00
* 02:57 logmsgbot: l10nupdate Synchronized php-1.26wmf17/cache/l10n: (no message) (duration: 10m 30s)
* 02:31 logmsgbot: @tin LocalisationUpdate completed (1.26wmf16) at 2015-08-05 02:31:44+00:00
* 02:28 logmsgbot: l10nupdate Synchronized php-1.26wmf16/cache/l10n: (no message) (duration: 06m 56s)
* 01:44 ebernhardson: restarting elasticsearch of es1005


== 2015-08-04 ==
== 2021-09-16 ==
* 23:59 logmsgbot: maxsem Synchronized php-1.26wmf16/extensions/WikimediaEvents/: SWAT (duration: 00m 12s)
* 23:58 legoktm@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'shellbox-media' for release 'main' .
* 23:57 logmsgbot: maxsem Synchronized php-1.26wmf17/extensions/WikimediaEvents/: SWAT (duration: 00m 12s)
* 23:51 ryankemper: [[phab:T273673|T273673]] All looks good, re-enabling puppet and running on rest of fleet: `sudo cumin 'R:Class = elasticsearch::log::hot_threads' 'sudo run-puppet-agent --force'`
* 23:08 logmsgbot: mattflaschen Synchronized wmf-config/InitialiseSettings.php: Disable Flow on betawikiversity (duration: 00m 13s)
* 23:44 ryankemper: [[phab:T273673|T273673]] The associated crons are gone and I see the new systemd timers for both gc-cleanup and the hot threads logger
* 22:07 logmsgbot: twentyafterfour Synchronized php-1.26wmf17: forgot submodule update (duration: 01m 39s)
* 23:39 ryankemper: [[phab:T273673|T273673]] Testing elasticsearch cron->systemd timer-job changes on canary instance `ryankemper@elastic1064:~$ sudo run-puppet-agent --force`
* 20:46 logmsgbot: twentyafterfour Finished scap: fixup wikidata submodule version (duration: 23m 26s)
* 23:37 ryankemper: [[phab:T273673|T273673]] Disabling puppet on elasticsearch hosts `sudo cumin 'R:Class = elasticsearch::log::hot_threads' 'sudo disable-puppet "https://gerrit.wikimedia.org/r/c/operations/puppet/+/721413 - [[phab:T273673|T273673]]"'`
* 20:22 logmsgbot: twentyafterfour Started scap: fixup wikidata submodule version
* 23:21 legoktm@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 19:46 dcausse: es1.7.1: upgrade elastic1003
* 23:21 legoktm@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 19:12 ori: Applied Icba6d7a87 on mw1017 for a couple of webpagetest runs
* 23:19 legoktm@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 19:08 logmsgbot: twentyafterfour rebuilt wikiversions.cdb and synchronized wikiversions files: group0 wikis to 1.26wmf17
* 23:18 legoktm@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
* 18:51 logmsgbot: twentyafterfour Finished scap: rebuild localization cache, sync 1.26wmf17 (duration: 28m 39s)
* 23:18 legoktm@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 18:42 dcausse: es1.7.1: upgrade elastic1002
* 23:17 legoktm@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 18:22 logmsgbot: twentyafterfour Started scap: rebuild localization cache, sync 1.26wmf17
* 23:17 legoktm@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 18:00 andrewbogott: re-imaging labnodepool1001
* 23:16 legoktm@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 17:35 logmsgbot: jynus Synchronized wmf-config/db-eqiad.php: Increase db1064 traffic (duration: 00m 13s)
* 22:45 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:18 dcausse: es1.7.1: upgrade elastic1001
* 22:40 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:17 hoo: Started dumpwikidatajson.sh on snapshot1003 to create a correct Wikidata json dump
* 22:38 legoktm@deploy1002: Finished scap: i18n for restoring deprecated token APIs (duration: 15m 30s)
* 17:14 logmsgbot: hoo Synchronized php-1.26wmf16/extensions/Wikidata/: Fix maintenance/dumpJson.php fatal (duration: 00m 21s)
* 22:30 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:11 chasemp: freezing elasticsearch indexes for 1.7.1
* 22:25 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:23 logmsgbot: jynus Synchronized wmf-config/db-eqiad.php: Repool db1064 with low traffic after maintenance (duration: 00m 12s)
* 22:23 legoktm@deploy1002: Started scap: i18n for restoring deprecated token APIs
* 15:34 logmsgbot: thcipriani Synchronized wmf-config/InitialiseSettings.php: SWAT: Disable Flow on ptwikibooks [[gerrit:229133]] (duration: 03m 40s)
* 22:21 legoktm@deploy1002: Synchronized php-1.37.0-wmf.23/includes/api/: Restore deprecated token APIs (3/3) (duration: 00m 56s)
* 15:28 jynus: restarting db1064 for regular maintenance and upgrade given that it was depooled in the first place for a schema change
* 22:19 legoktm@deploy1002: Synchronized php-1.37.0-wmf.23/autoload.php: Restore deprecated token APIs (2/3) (duration: 00m 56s)
* 15:24 logmsgbot: thcipriani Synchronized wmf-config: SWAT: Add configuration for authmetrics logging (part II) [[gerrit:227630]] (duration: 02m 41s)
* 22:16 legoktm@deploy1002: Synchronized php-1.37.0-wmf.23/includes/api/ApiTokens.php: Restore deprecated token APIs (1/3) (duration: 00m 56s)
* 15:21 logmsgbot: thcipriani Synchronized wmf-config/InitialiseSettings.php: SWAT: Add configuration for authmetrics logging (part I) [[gerrit:227630]] (duration: 03m 11s)
* 21:22 robh@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on ganeti4004.ulsfo.wmnet with reason: REIMAGE
* 15:13 logmsgbot: thcipriani Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable VisualEditor for 10% of new accounts on enwiki [[gerrit:227329]] (duration: 03m 13s)
* 21:19 robh@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti4004.ulsfo.wmnet with reason: REIMAGE
* 14:36 paravoid: cr2-codfw upgrading SCBs
* 21:04 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:23 paravoid: upgrading junos on asw-a-codfw again
* 21:00 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:45 _joe_: repooling mw1159,mw1160
* 20:49 ladsgroup@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:721610{{!}}Set jQuery migrate to false for wikibooks and Commons (T280944)]] (duration: 00m 56s)
* 13:21 paravoid: rebooting asw-a-codfw, member 2
* 19:26 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:04 Coren: labstore1001 rebooting (possibly a couple of times) during tests and reinstallation
* 19:21 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:55 hoo: Syncing to mw1160 failed (Host key verification failed.)
* 19:08 hashar@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.37.0-wmf.23
* 12:50 logmsgbot: hoo Synchronized php-1.26wmf16/extensions/Wikidata/: Update Wikibase: Fixes for JSON dump creation (duration: 00m 39s)
* 18:55 robh@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:06 moritzm: updated canary appservers mw1017/mw1018 to updated pcre3 + hhvm restart
* 18:51 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:03 moritzm: added pcre3_8.31-2ubuntu2.1+wm1 to trusty-wikimedia (reroll of security update with our JIT enablement patch)
* 18:50 robh@cumin1001: START - Cookbook sre.dns.netbox
* 11:48 _joe_: killed ircecho to prevent further icinga spam
* 18:49 dzahn@cumin1001: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 11:44 jynus: schema update on Commons failed, expect some minor instabilities until everything is fixed
* 18:46 dzahn@cumin1001: START - Cookbook sre.dns.netbox
* 11:41 _joe_: reimaging mw1159 to HAT
* 18:44 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:01 paravoid: upgrading junos on asw-a-codfw
* 18:29 urbanecm@deploy1002: Synchronized php-1.37.0-wmf.23/extensions/GrowthExperiments/modules/ext.growthExperiments.StructuredTask/addlink/AddLinkArticleTarget.js: {{Gerrit|bb8cba102fe417e8e41b7c4e9179d119c7d25a43}}: Use growthexperiments-structuredtask-no-suggestions-found-dialog-button in outdated suggestions dialog (2/2) (duration: 01m 06s)
* 10:57 logmsgbot: jynus Synchronized wmf-config/db-eqiad.php: Depool db1064 (duration: 00m 13s)
* 18:27 urbanecm@deploy1002: Synchronized php-1.37.0-wmf.23/extensions/GrowthExperiments/extension.json: {{Gerrit|bb8cba102fe417e8e41b7c4e9179d119c7d25a43}}: Use growthexperiments-structuredtask-no-suggestions-found-dialog-button in outdated suggestions dialog (1/2) (duration: 01m 07s)
* 10:27 godog: bootstrap cassandra on restbase1009
* 17:54 volans: turn off lldp agent on NIC (both ports) on ms-be105[1-9],ms-be205[2-6] - [[phab:T290984|T290984]]
* 10:21 akosiaris: enabling puppet on tin
* 17:31 volans: turn off lldp agent on NIC (both ports) on ms-be2051 - [[phab:T290984|T290984]]
* 09:30 jynus: rolling schema change on image table to all wikis
* 17:09 jynus: deployed extra grants for admin user on s6 primary
* 08:07 logmsgbot: jynus Synchronized wmf-config/db-eqiad.php: Increasing load for db1027 and db1015 (duration: 00m 12s)
* 16:39 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-test-coord1002.eqiad.wmnet
* 07:38 logmsgbot: @tin ResourceLoader cache refresh completed at Tue Aug  4 07:38:01 UTC 2015 (duration 38m 0s)
* 16:17 btullis@cumin1001: START - Cookbook sre.hosts.decommission for hosts an-test-coord1002.eqiad.wmnet
* 06:14 _joe_: depooled mw1061
* 16:04 marostegui: Disconnect s6 master from m5 master (noting the replication position) [[phab:T167973|T167973]]
* 06:14 logmsgbot: legoktm Synchronized wmf-config/InitialiseSettings.php: Disable Flow on Japanese Wikiversity (duration: 00m 13s)
* 16:04 marostegui: Disconnect s6 master from m5 master (noting the replication position)
* 06:09 logmsgbot: legoktm Synchronized wmf-config/InitialiseSettings.php: Disable Flow on English Wikiversity (duration: 00m 12s)
* 15:52 bd808: marostegui is awesome and made wikitech better today. :)
* 06:07 legoktm: sync to mw1061 failed
* 15:04 marostegui@cumin1001: dbctl commit (dc=all): 'Set wikitech on read-only for maintenance [[phab:T287454|T287454]]', diff saved to https://phabricator.wikimedia.org/P17283 and previous config saved to /var/cache/conftool/dbconfig/20210916-150444-marostegui.json
* 06:07 logmsgbot: legoktm Synchronized wmf-config/InitialiseSettings.php: Disable Flow on English Wikiversity (duration: 00m 12s)
* 15:03 marostegui: Set wikitech on read-only (from now on all SAL changes will fail) [[phab:T167973|T167973]]
* 02:32 logmsgbot: @tin LocalisationUpdate completed (1.26wmf16) at 2015-08-04 02:32:18+00:00
* 14:56 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mwmaint2002.codfw.wmnet with reason: reimage
* 02:28 logmsgbot: l10nupdate Synchronized php-1.26wmf16/cache/l10n: (no message) (duration: 09m 16s)
* 14:55 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on mwmaint2002.codfw.wmnet with reason: reimage
* 02:18 logmsgbot: twentyafterfour Finished scap: sync https://gerrit.wikimedia.org/r/#/c/229036/1 (duration: 25m 41s)
* 14:53 dzahn@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mwmaint2002.codfw.wmnet with reason: REIMAGE
* 01:52 logmsgbot: twentyafterfour Started scap: sync https://gerrit.wikimedia.org/r/#/c/229036/1
* 14:53 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:02 awight: updated paymentswiki to a8c0ecbedef6179c78ed833da9f2049cb0f2641b
* 14:51 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:51 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on mwmaint2002.codfw.wmnet with reason: REIMAGE
* 14:35 mutante: reimaging mwmaint2002 to buster ([[phab:T267607|T267607]], [[phab:T245757|T245757]])
* 14:21 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mwmaint2002.codfw.wmnet with reason: reimage
* 14:21 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on mwmaint2002.codfw.wmnet with reason: reimage
* 14:12 mutante: switching https://noc.wikimedia.org from codfw to eqiad ([[phab:T287539|T287539]], [[phab:T267607|T267607]])
* 13:44 sukhe: homer: running for Gerrit: 721018: set up BGP peering to durum hosts in <nowiki>{</nowiki>eqiad,codfw,esams,ulsfo,eqsin<nowiki>}</nowiki>
* 13:25 effie: pool mw1422 mw1455
* 13:24 effie: pool mw1422 mw1455
* 13:18 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:16 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:12 hashar@deploy1002: Synchronized php: group1 wikis to 1.37.0-wmf.23 (duration: 01m 04s)
* 13:11 hashar@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.37.0-wmf.23
* 12:08 marostegui: Deploy schema change on s2 codfw (lag will show up) [[phab:T290057|T290057]]
* 12:00 mbsantos: start OSM re-import script in maps2009 (depooled)
* 11:51 urbanecm@deploy1002: Synchronized php-1.37.0-wmf.21/extensions/GrowthExperiments/includes/MentorDashboard/MenteeOverview/UncachedMenteeOverviewDataProvider.php: {{Gerrit|529f86c5a998820c32e7d7f2d952317080383e05}}: UncachedMenteeOverviewDataProvider: Do not fatal with zero mentees ([[phab:T291088|T291088]]) (duration: 01m 04s)
* 11:49 urbanecm@deploy1002: Synchronized php-1.37.0-wmf.23/extensions/GrowthExperiments/includes/MentorDashboard/MenteeOverview/UncachedMenteeOverviewDataProvider.php: {{Gerrit|9e0f6f84240bf621e97806a94a0e786817001668}}: UncachedMenteeOverviewDataProvider: Do not fatal with zero mentees ([[phab:T291088|T291088]]) (duration: 01m 04s)
* 11:43 urbanecm@deploy1002: Synchronized php-1.37.0-wmf.23/extensions/AbuseFilter/: Fixing incorrect deployment of {{Gerrit|01e4450}} for [[phab:T291123|T291123]]. This is supposed to be a no-op. (duration: 01m 05s)
* 11:43 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:41 urbanecm: [urbanecm@deploy1002 /srv/mediawiki-staging/php-1.37.0-wmf.23 (wmf/1.37.0-wmf.23 * u+2-2)]$ git rebase &&  git submodule update extensions/AbuseFilter/ # fixing an incorrect deployment that happened in [[phab:T291123|T291123]]
* 11:41 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:41 urbanecm: [urbanecm@deploy1002 /srv/mediawiki-staging/php-1.37.0-wmf.23/extensions/AbuseFilter (wmf/1.37.0-wmf.23 u=)]$ git co {{Gerrit|0d2bc7ca17b9f767ae5753db7e4e41fd9e7d3531}} # reset repo to expected state, fixing incorrect deploy of a backport in [[phab:T291123|T291123]]
* 11:34 moritzm: installing 4.9.272 kernels on stretch hosts (no reboots yet)
* 11:34 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:32 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:21 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.renew-cert (exit_code=0) for mx2002.wikimedia.org: Renew puppet certificate - jmm@cumin2002
* 11:21 jmm@cumin2002: START - Cookbook sre.puppet.renew-cert for mx2002.wikimedia.org: Renew puppet certificate - jmm@cumin2002
* 11:15 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:13 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:11 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings-labs.php: Config: [[gerrit:721305{{!}}Add new WikimediaBadges config (T232927)]] (2/2) (duration: 01m 05s)
* 11:09 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:721305{{!}}Add new WikimediaBadges config (T232927)]] (1/2) (duration: 01m 05s)
* 11:03 jmm@cumin2002: END (FAIL) - Cookbook sre.puppet.renew-cert (exit_code=99) for mx2002.wikimedia.org: Renew puppet certificate - jmm@cumin2002
* 11:03 jmm@cumin2002: START - Cookbook sre.puppet.renew-cert for mx2002.wikimedia.org: Renew puppet certificate - jmm@cumin2002
* 10:59 hashar@deploy1002: Synchronized php-1.37.0-wmf.21/includes/language/Message.php: Message: Remove deprecated format property - [[phab:T146416|T146416]] [[phab:T291124|T291124]] (duration: 01m 06s)
* 10:55 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:54 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:21 topranks: Changing default gateway on mw1422 to use VRRP backup (cr2), to determine if tail drops from switches to cr1 is cause of TCP retransmissions.
* 10:14 effie: depool mw1455 for network testing
* 10:11 effie: depool mw1422 for network testing
* 10:01 jmm@cumin2002: END (FAIL) - Cookbook sre.puppet.renew-cert (exit_code=99) for mx2002.wikimedia.org: Renew puppet certificate - jmm@cumin2002
* 10:01 jmm@cumin2002: START - Cookbook sre.puppet.renew-cert for mx2002.wikimedia.org: Renew puppet certificate - jmm@cumin2002
* 10:00 jmm@cumin2002: END (FAIL) - Cookbook sre.puppet.renew-cert (exit_code=99) for mx2002.wikimedia.org: Renew puppet certificate - jmm@cumin2002
* 10:00 jmm@cumin2002: START - Cookbook sre.puppet.renew-cert for mx2002.wikimedia.org: Renew puppet certificate - jmm@cumin2002
* 09:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx2002.wikimedia.org with reason: reimage
* 09:36 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on mx2002.wikimedia.org with reason: reimage
* 09:10 moritzm: in-place re-installation of mx2002.wikimedia.org (test VM) to test the new installer key support in the sre.puppet.renew-cert cookbook
* 08:04 moritzm: upgrading scandium to PHP 7.2 backport of patch for enhanced DOM replaceChild/removeChild performance  [[phab:T291052|T291052]]
* 07:48 elukey@puppetmaster1001: conftool action : set/pooled=true; selector: dnsdisc=helm-charts,name=eqiad
* 05:35 marostegui: Optimize dewiki.logging in codfw [[phab:T287344|T287344]]


== 2015-08-03 ==
== 2021-09-15 ==
* 23:56 awight: updating paymentswiki to b20559f75e0fc0d863efe027d76b78462555767c
* 23:02 legoktm: upgrading lists1001 to use postorius 1.3.5
* 23:45 ottomata: rebuilding kafka cluster
* 22:51 legoktm: uploaded new mailmanclient/postorius packages to apt1001
* 23:21 logmsgbot: ebernhardson Synchronized php-1.26wmf16/extensions/VisualEditor/: Bump visualeditor for swat in 1.26wmf16 (duration: 00m 13s)
* 22:38 ryankemper: [WDQS Deploy] Deploy complete. Successful test query placed on query.wikidata.org, there's no relevant criticals in Icinga, and Grafana looks good
* 23:18 logmsgbot: ebernhardson Synchronized php-1.26wmf16/extensions/WikimediaEvents/: Bump WikimediaEvents in SWAT for 1.26wmf16 (duration: 00m 12s)
* 22:03 ryankemper: [WDQS Deploy] Restarting `wdqs-categories` across lvs-managed hosts, one node at a time: `sudo -E cumin -b 1 'A:wdqs-all and not A:wdqs-test' 'depool && sleep 45 && systemctl restart wdqs-categories && sleep 45 && pool'`
* 23:17 logmsgbot: ebernhardson Synchronized php-1.26wmf16/extensions/Flow: Bump flow submodule in swat for 1.26wmf16 (duration: 00m 14s)
* 22:03 ryankemper: [WDQS Deploy] Restarted `wdqs-categories` across both test hosts simultaneously: `sudo -E cumin 'A:wdqs-test' 'systemctl restart wdqs-categories'`
* 23:05 logmsgbot: ebernhardson Synchronized wmf-config/: (no message) (duration: 00m 13s)
* 22:03 ryankemper: [WDQS Deploy] Restarted `wdqs-updater` across all hosts, 4 hosts at a time: `sudo -E cumin -b 4 'A:wdqs-all' 'systemctl restart wdqs-updater'`
* 22:46 awight: reverting paymentswiki, to 6dbbb4c784349ace5a0ac616c61ec0c3fffa0eff
* 22:02 ryankemper@deploy1002: Finished deploy [wdqs/wdqs@902529b]: 0.3.85 (duration: 06m 59s)
* 22:33 ejegg: updated crm from db417a28a247a3fdf3e3023a700d6266e04f3e9d to 4f40ac6de0385982d8e672b1ed30ff1a2a2a2aa1
* 21:56 ryankemper: [WDQS Deploy] Tests passing following deploy of `0.3.85` on canary `wdqs1003`; proceeding to rest of fleet
* 22:27 awight: deployed debug hack to payments1004
* 21:55 ryankemper@deploy1002: Started deploy [wdqs/wdqs@902529b]: 0.3.85
* 21:43 awight: deploy paymentswiki-staging configuration: add explicit queue name for payments4 connecting to payments1-3
* 21:55 ryankemper: [WDQS Deploy] Gearing up for deploy of wdqs `0.3.85`. Pre-deploy tests passing on canary `wdqs1003`
* 21:32 awight: deploy paymentswiki-staging configuration
* 21:42 ebernhardson@deploy1002: Finished deploy [wdqs/wdqs@f3473d9]: Reference files deployed by puppet through query_service paths instead of wdqs (duration: 02m 07s)
* 21:25 awight: updating payments1004 to 1daf9d0fe773c022a2ab8de5542fc15ddc261e75
* 21:40 ebernhardson@deploy1002: Started deploy [wdqs/wdqs@f3473d9]: Reference files deployed by puppet through query_service paths instead of wdqs
* 21:04 logmsgbot: bd808 Synchronized wmf-config/logging.php: Remove code duplication from monolog config (Ia960203) (duration: 00m 11s)
* 21:29 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:51 awight: updating paymentswiki from d4bdce1cae168448b116d75e3dcd3303b0f13dd2 to d56dad49ef0da0a8b9c7da410bcac12e48724ae5
* 21:27 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:26 arlolra: updated Parsoid to version 38d0cdb13734a40bc2908e779e1a0cde158048f2
* 21:04 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:49 logmsgbot: aude Synchronized php-1.26wmf16/extensions/Wikidata: Fix T104609 and fix/debug T107711 (duration: 00m 19s)
* 21:03 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:21 logmsgbot: aude Synchronized usagetracking.dblist: Enable usage tracking on enwiki (duration: 00m 12s)
* 21:00 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|60e7e515d7034a9f839d78851f1dcc2be3df7f3b}}: Set wmgEchoEnablePush to false explicitly on arbcom_* wikis ([[phab:T291128|T291128]]) (duration: 01m 06s)
* 19:20 logmsgbot: aude Synchronized wmf-config/InitialiseSettings.php: Add debug log group for T107711 (duration: 00m 12s)
* 19:50 twentyafterfour@deploy1002: Synchronized php-1.37.0-wmf.23/extensions/AbuseFilter/: sync backport for https://gerrit.wikimedia.org/r/c/mediawiki/extensions/AbuseFilter/+/721312 (duration: 01m 06s)
* 19:07 ottomata: stopped a couple of kafka brokers. acknowledging..
* 19:44 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:02 bblack: https://gerrit.wikimedia.org/r/228882 reversion salted + nginx reloaded
* 19:43 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:28 gwicke: switched restbase1002 and restbase1003 to iojs as well
* 19:15 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:36 logmsgbot: aude Synchronized usagetracking.dblist: Enable usage tracking on zhwiki (duration: 00m 12s)
* 19:14 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:21 logmsgbot: legoktm Synchronized php-1.26wmf16/includes/Revision.php: https://gerrit.wikimedia.org/r/228853 (duration: 00m 12s)
* 19:07 hashar@deploy1002: rebuilt and synchronized wikiversions files: Rollback all wikis to 1.37.0-wmf.23
* 17:21 ottomata: starting kafka partition reassignment to balance all partitions over to 3 new kafka brokers and off of analytics1021
* 19:07 urbanecm: Re-start server-side upload for 1 video file, likely temporary swift failure ([[phab:T289781|T289781]])
* 17:21 gwicke: switching from node 0.10 to iojs 2.5 on restbase1001 after load testing on xenon went well
* 19:06 urbanecm: Start server-side upload for 1 video file ([[phab:T287686|T287686]])
* 17:02 logmsgbot: legoktm Synchronized wmf-config/logging.php: logging: Enable stacktrace printing (duration: 00m 12s)
* 19:04 hashar@deploy1002: Synchronized php: group1 wikis to 1.37.0-wmf.23 (duration: 00m 55s)
* 17:00 hoo: Started dumpwikidatajson.sh on snapshot1003 to re-create today's dump
* 19:03 hashar@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.37.0-wmf.23
* 16:55 logmsgbot: legoktm Synchronized php-1.26wmf16/autoload.php: https://gerrit.wikimedia.org/r/#/c/228850/ (duration: 00m 12s)
* 18:52 urbanecm: Start server-side upload for 1 video file ([[phab:T289949|T289949]])
* 16:54 logmsgbot: legoktm Synchronized php-1.26wmf16/includes/debug/logger/: https://gerrit.wikimedia.org/r/#/c/228850/ (duration: 00m 11s)
* 18:50 urbanecm: Start server-side upload for 1 video file ([[phab:T289781|T289781]])
* 16:49 hoo: Removed today's Wikidata json dump (wikidata-20150803-all.json.gz) because it was incomplete due to the dataset problems earlier
* 18:44 urbanecm: Start server-side upload for 3 large PDF files ([[phab:T290722|T290722]])
* 16:27 paravoid: upgrading junos on cr2-codfw
* 18:43 legoktm: migrated sitereq-l@ from Google Groups to Mailman ([[phab:T290908|T290908]])
* 15:34 bblack: wiping cp3034 disk cache (upload esams) for ipsec reload testing
* 18:27 urbanecm: Start server-side upload for 1 video file ([[phab:T290290|T290290]])
* 15:23 logmsgbot: thcipriani Synchronized php-1.26wmf16/extensions/MultimediaViewer: SWAT: Track image load time with statsv (touch and re-sync) [[gerrit:228218]] (duration: 00m 12s)
* 18:23 urbanecm: Start server-side upload for 1 video file ([[phab:T290685|T290685]])
* 15:22 ottomata: reinstalling analytics1013,1014 and 1020  with Jessie
* 18:21 urbanecm: Start server-side upload for 1 video file ([[phab:T290707|T290707]])
* 15:10 logmsgbot: thcipriani Synchronized php-1.26wmf16/extensions/MultimediaViewer: SWAT: Track image load time with statsv [[gerrit:228218]] (duration: 00m 12s)
* 18:16 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:59 logmsgbot: aude Synchronized usagetracking.dblist: Enable usage tracking on trwiki (duration: 00m 12s)
* 18:14 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:54 logmsgbot: krenair Synchronized php-1.26wmf16/extensions/SemanticResultFormats: https://gerrit.wikimedia.org/r/#/c/228793/ (duration: 00m 13s)
* 18:07 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:42 logmsgbot: aude Synchronized usagetracking.dblist: Enable usage tracking on thwiki (duration: 00m 12s)
* 18:05 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:33 mutante: temp. stop puppet on dataset1001
* 18:05 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|7620084a1ed92066aa8b29fa609cf6cbb4f799ab}}: Add portrattarkiv.se to wgCopyUploadsDomains whitelist of Wikimedia Commons ([[phab:T290581|T290581]]) (duration: 01m 05s)
* 14:27 paravoid: upgrading junos on cr1-codfw
* 17:39 mutante: thumbor - running puppet on all thumbor hosts, removed cron job systemd-thumbor-tmpfiles-clean, added thumbor_systemd_tmpfiles_clean timer job
* 14:23 moritzm: updated iojs on apt.wikimedia.org to 2.5.0 for jessie-wikimedia
* 16:56 joal@deploy1002: Finished deploy [analytics/refinery@0f7f6f3] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@0f7f6f3] (duration: 06m 15s)
* 14:21 ottomata: upgrading kernel on analytics1042-1049 from 3.13.0.24.28 to 3.13.0.61.68 because T107698
* 16:50 joal@deploy1002: Started deploy [analytics/refinery@0f7f6f3] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@0f7f6f3]
* 14:18 logmsgbot: aude Synchronized usagetracking.dblist: Enable usage tracking on svwiki (duration: 00m 12s)
* 16:47 joal@deploy1002: Finished deploy [analytics/refinery@0f7f6f3] (thin): Regular analytics weekly train THIN [analytics/refinery@0f7f6f3] (duration: 00m 07s)
* 13:50 bblack: re-enabling puppet + ircecho on neon (vast majority of recovery spam is over with)
* 16:47 joal@deploy1002: Started deploy [analytics/refinery@0f7f6f3] (thin): Regular analytics weekly train THIN [analytics/refinery@0f7f6f3]
* 13:17 bblack: re-enable agent, restarted apache2 on palladium, strontium, rhodium (fact_values truncated in mysql)
* 16:45 joal@deploy1002: Finished deploy [analytics/refinery@0f7f6f3]: Regular analytics weekly train [analytics/refinery@0f7f6f3] (duration: 19m 43s)
* 13:10 bblack: rhodium too (puppetmaster stop)
* 16:31 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host durum5002.eqsin.wmnet
* 13:05 bblack: stopped puppet-agent + apache2 on strontium + palladium (no masters alive, for mysql maintenance)
* 16:26 joal@deploy1002: Started deploy [analytics/refinery@0f7f6f3]: Regular analytics weekly train [analytics/refinery@0f7f6f3]
* 12:59 bblack: stopped ircecho + puppet-agent on neon (spam from epic puppetmaster fail)
* 16:19 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm for new host durum5002.eqsin.wmnet
* 12:52 bblack: stop->wait->restart of apache2 service on palladium (seemed dead to puppet reqs)
* 16:17 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host durum5001.eqsin.wmnet
* 12:21 _joe_: bumped ganglia-monitor-aggregator on bast4001, the upstart script needs immediate fixing
* 16:02 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm for new host durum5001.eqsin.wmnet
* 11:01 logmsgbot: jynus Synchronized wmf-config/db-eqiad.php: avoid db1044 SPOF by repooling db1027 and db1015 (duration: 00m 12s)
* 15:56 urbanecm: Remove 2FA for User:Rho at wikitech, identity verified via a videocall
* 10:56 paravoid: switching GeoDNS to GeoIP2
* 14:50 moritzm: installing lz4 security updates on stretch
* 10:45 paravoid: upgrading all AuthDNS servers to gdnsd 2.2.0
* 13:50 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 09:31 logmsgbot: jynus Synchronized wmf-config/db-eqiad.php: depool db1035 for maintenance (duration: 00m 12s)
* 13:33 ottomata: pointing <nowiki>{</nowiki>stats,analytics<nowiki>}</nowiki>.wikimedia.org at analytics-web.discovery.wmnet cname - [[phab:T285355|T285355]]
* 05:22 logmsgbot: @tin ResourceLoader cache refresh completed at Mon Aug  3 05:22:15 UTC 2015 (duration 22m 14s)
* 13:32 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host durum4002.ulsfo.wmnet
* 02:23 logmsgbot: @tin LocalisationUpdate completed (1.26wmf16) at 2015-08-03 02:23:21+00:00
* 13:18 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm for new host durum4002.ulsfo.wmnet
* 02:20 logmsgbot: l10nupdate Synchronized php-1.26wmf16/cache/l10n: (no message) (duration: 06m 21s)
* 13:15 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host durum4001.ulsfo.wmnet
* 01:47 springle: starting OSC gerrit 228756 s5 wb_items_per_site.ips_site_page
* 13:03 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm for new host durum4001.ulsfo.wmnet
* 00:03 logmsgbot: krenair Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/228198/ (duration: 00m 12s)
* 12:54 oblivian@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:41 marostegui: Install 10.4.21-2 on db1125
* 11:25 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:23 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:21 Lucas_WMDE: EU backport+config window done
* 11:20 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:720983{{!}}Enable change-tags for new edits' proofread status at mulWS (T289140)]] (duration: 01m 06s)
* 11:16 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:10 lucaswerkmeister-wmde@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:583407{{!}}Don’t check constraints on two property qualifiers (T235292)]] (duration: 01m 11s)
* 11:09 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:03 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps1010.eqiad.wmnet
* 09:55 effie: depool wtp1026
* 09:54 effie: depooling mw1312 and mw1319
* 09:46 topranks: Disabling Intel X710 NIC on-board LLDP processing on relforge1003 ([[phab:T290984|T290984]])
* 07:04 jiji@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 06:57 elukey: shutdown ms-be2045 (again) after seeing [[phab:T290881|T290881]]
* 06:02 elukey: powercycle ms-be2045 - no ssh, no remote tty available
* 05:28 marostegui@cumin1001: dbctl commit (dc=all): 'Restore db1109 original load', diff saved to https://phabricator.wikimedia.org/P17274 and previous config saved to /var/cache/conftool/dbconfig/20210915-052802-marostegui.json
* 04:30 marostegui@cumin1001: dbctl commit (dc=all): 'Increase db1109 load', diff saved to https://phabricator.wikimedia.org/P17273 and previous config saved to /var/cache/conftool/dbconfig/20210915-043053-marostegui.json
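
The entries above share one shape: `* HH:MM author: message`, where the author may carry an `@host` suffix and bot-generated sync lines end in `(duration: MMm SSs)`. As a rough sketch only, such lines could be split into fields as follows. The names `LogEntry`, `parse_entry` and `duration_seconds` are hypothetical helpers invented for this illustration; they are not part of any Wikimedia tooling.

```python
import re
from typing import NamedTuple, Optional


class LogEntry(NamedTuple):
    """One parsed log line (hypothetical structure for illustration)."""
    time: str             # "HH:MM" as written in the log
    author: str           # nick or shell user, without any "@host" suffix
    host: Optional[str]   # host after "@", or None when absent
    message: str          # everything after the first ": "


# "* HH:MM who: message"; "who" holds no colon or whitespace, so the
# first ": " after it separates the author from the message.
ENTRY_RE = re.compile(r"^\* (\d{2}:\d{2}) ([^:\s]+): (.*)$")

# Trailing "(duration: 01m 06s)" as emitted on sync and deploy lines.
DURATION_RE = re.compile(r"\(duration: (\d+)m (\d+)s\)")


def parse_entry(line: str) -> Optional[LogEntry]:
    """Return the parsed entry, or None for headings and blank lines."""
    match = ENTRY_RE.match(line)
    if match is None:
        return None
    time, who, message = match.groups()
    author, _, host = who.partition("@")
    return LogEntry(time, author, host or None, message)


def duration_seconds(message: str) -> Optional[int]:
    """Return the logged duration in seconds, or None if there is none."""
    match = DURATION_RE.search(message)
    if match is None:
        return None
    minutes, seconds = match.groups()
    return int(minutes) * 60 + int(seconds)


entry = parse_entry("* 11:41 marostegui: Install 10.4.21-2 on db1125")
print(entry)
```

Headings such as `== 2021-09-14 ==` do not match the pattern and come back as `None`, so a caller can use them to track the current date while iterating over the page.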


== 2015-08-02 ==
== 2021-09-14 ==
* 17:52 logmsgbot: ori Synchronized wmf-config/InitialiseSettings.php: If7fcb6e6: Default wikipedias to enwiki.png (duration: 00m 12s)
* 23:01 legoktm@deploy1002: Synchronized wmf-config/CommonSettings.php: Re-enable VipsScaler (2 of 2) (duration: 01m 04s)
* 13:26 jynus: powercycling analytics1044: same kernel fatal issues as 1043
* 22:59 legoktm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Re-enable VipsScaler (1 of 2) (duration: 01m 05s)
* 13:10 jynus: powercycling analytics1043: kernel issues
* 22:58 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:05 bblack: started pybal on lvs3001
* 22:53 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 04:56 logmsgbot: @tin ResourceLoader cache refresh completed at Sun Aug  2 04:56:29 UTC 2015 (duration 56m 28s)
* 22:43 legoktm: legoktm@cumin2001:~$ sudo systemctl reset-failed # clear httpbb_hourly_tests failure, moved to cumin1001
* 02:23 logmsgbot: @tin LocalisationUpdate completed (1.26wmf16) at 2015-08-02 02:23:09+00:00
* 22:34 legoktm@deploy1002: Finished scap: Rebuild i18n for redeployment of VipsScaler ([[phab:T290759|T290759]]) (duration: 23m 49s)
* 02:20 logmsgbot: l10nupdate Synchronized php-1.26wmf16/cache/l10n: (no message) (duration: 06m 11s)
* 22:34 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:29 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:12 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:11 legoktm@deploy1002: Started scap: Rebuild i18n for redeployment of VipsScaler ([[phab:T290759|T290759]])
* 22:10 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:26 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:23 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:20 dancy: testing upcoming Scap release on beta
* 20:20 ladsgroup@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:720387{{!}}Early adopt wgIncludejQueryMigrate=false on nlwiki (T280944)]] (duration: 01m 48s)
* 20:06 cdanis: [[phab:T290425|T290425]] ✔️ cdanis@alert1001.wikimedia.org ~ 🕓🍵 sudo /usr/bin/statograph -c /etc/statograph/config.yml erase_metric_data lyfcttm2lhw4
* 20:06 cdanis: [[phab:T290425|T290425]] ✔️ cdanis@alert1001.wikimedia.org ~ 🕓🍵 sudo /usr/bin/statograph -c /etc/statograph/config.yml erase_metric_data h5mvbny28713
* 19:19 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:16 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:08 hashar@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.37.0-wmf.23
* 18:48 moritzm: removed filter for tcp/25 on mx2001, reimage is complete [[phab:T286911|T286911]]
* 18:33 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:32 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:24 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:23 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:20 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|2982638039720107d0b6e3227f5dce5b34ce7533}}: Offer the DiscussionTools reply tool as opt-out setting at ptwikinews ([[phab:T285162|T285162]]) (duration: 01m 06s)
* 18:16 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|7f1de32f4b5788e92291a5448563bc61a9f561e2}}: Offer the DiscussionTools reply tool as opt-out setting at Wikimania wiki ([[phab:T284339|T284339]]) (duration: 01m 05s)
* 18:15 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:14 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:13 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|e36f4d3dcc368f0afbce3649ce72f2135ab1c76f}}: DiscussionTools: Make newtopictool available to everyone on arwiki and cswiki ([[phab:T285724|T285724]]) (duration: 01m 04s)
* 18:09 urbanecm@deploy1002: Synchronized debug.json: {{Gerrit|Idef64e72}} (duration: 01m 29s)
* 18:06 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:05 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx2001.wikimedia.org with reason: reimage
* 17:54 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 1:00:00 on mx2001.wikimedia.org with reason: reimage
* 17:45 moritzm: reimaging mx2001 to bullseye [[phab:T286911|T286911]]
* 16:57 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:55 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:32 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:28 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 16:28 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:24 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 16:20 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:16 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 15:53 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on maps1010.eqiad.wmnet with reason: Resyncing from master
* 15:53 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on maps1010.eqiad.wmnet with reason: Resyncing from master
* 15:51 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps1010.eqiad.wmnet
* 15:43 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:41 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:34 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:32 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:19 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 37 hosts
* 15:19 kormat@cumin1001: START - Cookbook sre.hosts.remove-downtime for 37 hosts
* 15:11 jelto@cumin2002: END (PASS) - Cookbook sre.switchdc.mediawiki.09-update-tendril (exit_code=0)
* 15:11 jelto@cumin2002: START - Cookbook sre.switchdc.mediawiki.09-update-tendril
* 15:10 jelto@cumin2002: END (PASS) - Cookbook sre.switchdc.mediawiki.09-run-puppet-on-db-masters (exit_code=0)
* 15:07 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:07 jelto@cumin2002: START - Cookbook sre.switchdc.mediawiki.09-run-puppet-on-db-masters
* 15:06 jelto@cumin2002: END (PASS) - Cookbook sre.switchdc.mediawiki.09-restore-ttl (exit_code=0)
* 15:05 jelto@cumin2002: START - Cookbook sre.switchdc.mediawiki.09-restore-ttl
* 15:04 marostegui@cumin1001: dbctl commit (dc=all): 'Increase db1109 load', diff saved to https://phabricator.wikimedia.org/P17271 and previous config saved to /var/cache/conftool/dbconfig/20210914-150458-marostegui.json
* 15:03 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 15:00 jelto@cumin2002: END (PASS) - Cookbook sre.switchdc.mediawiki.08-start-maintenance (exit_code=0)
* 14:58 jelto@cumin2002: START - Cookbook sre.switchdc.mediawiki.08-start-maintenance
* 14:55 marostegui@cumin1001: dbctl commit (dc=all): 'Reduce db1109 load', diff saved to https://phabricator.wikimedia.org/P17270 and previous config saved to /var/cache/conftool/dbconfig/20210914-145522-marostegui.json
* 14:54 jelto@cumin2002: END (PASS) - Cookbook sre.switchdc.mediawiki.01-stop-maintenance (exit_code=0)
* 14:54 jelto@cumin2002: START - Cookbook sre.switchdc.mediawiki.01-stop-maintenance
* 14:53 jelto@cumin2002: END (ERROR) - Cookbook sre.switchdc.mediawiki.08-start-maintenance (exit_code=97)
* 14:53 marostegui@cumin1001: dbctl commit (dc=all): 'Reduce db1109 load', diff saved to https://phabricator.wikimedia.org/P17269 and previous config saved to /var/cache/conftool/dbconfig/20210914-145324-marostegui.json
* 14:52 jelto@cumin2002: START - Cookbook sre.switchdc.mediawiki.08-start-maintenance
* 14:49 jelto@cumin2002: END (FAIL) - Cookbook sre.switchdc.mediawiki.08-restart-envoy-on-jobrunners (exit_code=99)
* 14:49 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:49 jelto@cumin2002: START - Cookbook sre.switchdc.mediawiki.08-restart-envoy-on-jobrunners
* 14:46 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 14:46 jelto@cumin2002: END (PASS) - Cookbook sre.switchdc.mediawiki.07-set-readwrite (exit_code=0)
* 14:46 jelto@cumin2002: MediaWiki read-only period ends at: 2021-09-14 14:46:30.570035
* 14:45 jelto@cumin2002: START - Cookbook sre.switchdc.mediawiki.07-set-readwrite
* 14:45 jelto@cumin2002: END (PASS) - Cookbook sre.switchdc.mediawiki.06-set-db-readwrite (exit_code=0)
* 14:45 jelto@cumin2002: START - Cookbook sre.switchdc.mediawiki.06-set-db-readwrite
* 14:45 jelto@cumin2002: END (PASS) - Cookbook sre.switchdc.mediawiki.05-invert-redis-sessions (exit_code=0)
* 14:45 jelto@cumin2002: START - Cookbook sre.switchdc.mediawiki.05-invert-redis-sessions
* 14:45 jelto@cumin2002: END (PASS) - Cookbook sre.switchdc.mediawiki.04-switch-mediawiki (exit_code=0)
* 14:44 jelto@cumin2002: START - Cookbook sre.switchdc.mediawiki.04-switch-mediawiki
* 14:44 jelto@cumin2002: END (PASS) - Cookbook sre.switchdc.mediawiki.03-set-db-readonly (exit_code=0)
* 14:44 jelto@cumin2002: START - Cookbook sre.switchdc.mediawiki.03-set-db-readonly
* 14:44 jelto@cumin2002: END (PASS) - Cookbook sre.switchdc.mediawiki.02-set-readonly (exit_code=0)
* 14:43 jelto@cumin2002: MediaWiki read-only period starts at: 2021-09-14 14:43:48.272827
* 14:43 jelto@cumin2002: START - Cookbook sre.switchdc.mediawiki.02-set-readonly
* 14:40 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 37 hosts with reason: DC switchover
* 14:40 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:00:00 on 37 hosts with reason: DC switchover
* 14:39 jelto@cumin2002: END (PASS) - Cookbook sre.switchdc.mediawiki.01-stop-maintenance (exit_code=0)
* 14:39 jelto@cumin2002: START - Cookbook sre.switchdc.mediawiki.01-stop-maintenance
* 14:34 jelto@cumin2002: END (PASS) - Cookbook sre.switchdc.mediawiki.00-warmup-caches (exit_code=0)
* 14:32 jelto@cumin2002: START - Cookbook sre.switchdc.mediawiki.00-warmup-caches
* 14:30 jelto@cumin2002: END (PASS) - Cookbook sre.switchdc.mediawiki.00-reduce-ttl (exit_code=0)
* 14:24 jelto@cumin2002: START - Cookbook sre.switchdc.mediawiki.00-reduce-ttl
* 14:22 jelto@cumin2002: END (PASS) - Cookbook sre.switchdc.mediawiki.00-disable-puppet (exit_code=0)
* 14:22 jelto@cumin2002: START - Cookbook sre.switchdc.mediawiki.00-disable-puppet
* 14:19 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:12 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:10 legoktm@deploy1002: Synchronized wmf-config/CommonSettings.php: Avoid warning about undefined $wgFileBlacklist ([[phab:T290640|T290640]]) (duration: 01m 32s)
* 13:44 mbsantos@deploy1002: Finished deploy [kartotherian/deploy@0a38bc5]: kartotherian: restore v4 maxzoom to z15 (duration: 00m 10s)
* 13:43 mbsantos@deploy1002: Started deploy [kartotherian/deploy@0a38bc5]: kartotherian: restore v4 maxzoom to z15
* 13:43 mbsantos@deploy1002: Finished deploy [kartotherian/deploy@79bc0c6]: geoshapes: update table names (duration: 00m 14s)
* 13:42 mbsantos@deploy1002: Started deploy [kartotherian/deploy@79bc0c6]: geoshapes: update table names
* 13:27 mbsantos@deploy1002: Finished deploy [kartotherian/deploy@0a38bc5]: kartotherian: restore v4 maxzoom to z15 (duration: 00m 10s)
* 13:27 mbsantos@deploy1002: Started deploy [kartotherian/deploy@0a38bc5]: kartotherian: restore v4 maxzoom to z15
* 13:26 mbsantos@deploy1002: Finished deploy [kartotherian/deploy@1ebdca4]: (no justification provided) (duration: 00m 15s)
* 13:26 mbsantos@deploy1002: Started deploy [kartotherian/deploy@1ebdca4]: (no justification provided)
* 12:32 jelto@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 12:32 jelto@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
* 12:29 jelto@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 12:29 jelto@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 12:19 jelto@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 12:19 jelto@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 12:17 jelto@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 12:17 jelto@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 11:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2026.codfw.wmnet
* 11:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2026.codfw.wmnet
* 10:47 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host testvm2001.codfw.wmnet
* 10:31 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2001.codfw.wmnet
* 10:05 hashar@deploy1002: Pruned MediaWiki: 1.37.0-wmf.20 (duration: 01m 48s)
* 09:47 hashar@deploy1002: Pruned MediaWiki: 1.37.0-wmf.19 (duration: 04m 13s)
* 09:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2001.codfw.wmnet
* 09:38 hashar@deploy1002: Finished scap: testwikis wikis to 1.37.0-wmf.23 (duration: 70m 39s)
* 09:29 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2001.codfw.wmnet
* 09:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2002.codfw.wmnet
* 09:10 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2002.codfw.wmnet
* 09:09 Emperor: swift rebalance to remove h/w faulty host ms-be2045 [[phab:T290881|T290881]]
* 09:04 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:57 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:47 moritzm: installing testvm2002
* 08:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host testvm2002.codfw.wmnet
* 08:28 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host testvm2002.codfw.wmnet
* 08:27 hashar@deploy1002: Started scap: testwikis wikis to 1.37.0-wmf.23
* 08:25 godog: poweroff ms-be2045 and set it as failed in netbox - [[phab:T290881|T290881]]
* 08:24 hashar: train: applied security patches for 1.37.0-wmf.23  # [[phab:T281164|T281164]]
* 08:05 godog: wipe non-os partitions from ms-be2045 - [[phab:T290881|T290881]]
* 07:50 vgutierrez: update acme-chief to version 0.31 on acmechief hosts - [[phab:T290249|T290249]]
* 04:47 eileen: civicrm revision changed from {{Gerrit|1f071f6c6c}} to {{Gerrit|e6bf81d99c}}, config revision is {{Gerrit|23eda8ba3a}}
* 02:41 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:39 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:07 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:07 James_F: wmf/1.37.0-wmf.23 was branched at {{Gerrit|ea72c9b690c2159a12beec2f518b61cc499ed521}} for [[phab:T281164|T281164]]
* 02:03 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:04 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:01 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
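
The 14:43 and 14:46 entries above record the start and end of the MediaWiki read-only period during the DC switchover. As a small worked example, the length of that window can be computed from the two logged timestamps. The function name `read_only_seconds` is a hypothetical helper for this illustration, and the timestamp format string is an assumption inferred from how the values appear in the log.

```python
from datetime import datetime

# Assumed layout of the timestamps as they appear in the two log lines.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def read_only_seconds(start: str, end: str) -> float:
    """Return the seconds elapsed between two logged timestamps."""
    started = datetime.strptime(start, TIMESTAMP_FORMAT)
    ended = datetime.strptime(end, TIMESTAMP_FORMAT)
    return (ended - started).total_seconds()


window = read_only_seconds(
    "2021-09-14 14:43:48.272827",  # "MediaWiki read-only period starts at"
    "2021-09-14 14:46:30.570035",  # "MediaWiki read-only period ends at"
)
print(f"{window:.6f} seconds")
```

With the values logged above, the window comes to roughly 2 minutes 42 seconds.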


== 2015-08-01 ==
== 2021-09-13 ==
* 06:04 _joe_: removing some old apache access logs from mw1114
* 23:54 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 05:06 logmsgbot: @tin ResourceLoader cache refresh completed at Sat Aug  1 05:06:46 UTC 2015 (duration 6m 45s)
* 23:52 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 03:53 andrewbogott: cleared out nova-conductor.log on labcontrol1001, restarted nova-conductor, graceful’d apache
* 23:45 jforrester@deploy1002: Synchronized wmf-config/InitialiseSettings.php: [[phab:T290759|T290759]]: Undeploy VipsScaler: III – Don't set wmgUseVips, now ignored (duration: 00m 58s)
* 02:23 logmsgbot: @tin LocalisationUpdate completed (1.26wmf16) at 2015-08-01 02:23:15+00:00
* 23:45 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:20 logmsgbot: l10nupdate Synchronized php-1.26wmf16/cache/l10n: (no message) (duration: 06m 11s)
* 23:43 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:12 logmsgbot: ori Synchronized extract2.php: Ie919881a4: Add an API listing template to the allowed templates in extract2.php
* 23:41 jforrester@deploy1002: Synchronized wmf-config/CommonSettings.php: [[phab:T290759|T290759]]: Undeploy VipsScaler: II – Don't load regardless of config (duration: 00m 58s)
* 00:01 logmsgbot: ori Synchronized php-1.26wmf16/includes: Revert I4afaecd8: "Avoiding writing sessions for no reason", and undo several uncommitted live-hacks for debugging T102199 (duration: 00m 16s)
* 19:52 jforrester@deploy1002: Synchronized wmf-config/InitialiseSettings.php: [[phab:T290759|T290759]] Undeploy VipsScaler: I – Disable on all wikis (duration: 00m 57s)
* 19:49 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:47 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:04 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:59 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:59 urbanecm: [urbanecm@mwmaint2002 ~]$ mwscript resetAuthenticationThrottle.php --wiki=<nowiki>{</nowiki>cswiki,cswikiversity<nowiki>}</nowiki> --signup --ip=185.47.223.49 # [[phab:T290809|T290809]]
* 18:58 urbanecm@deploy1002: Synchronized wmf-config/throttle.php: {{Gerrit|9db1d1ac938ca053c82fed88c8b6e75f97a52416}}: Add throttle rule for Czech wiki course ([[phab:T290809|T290809]]) (duration: 00m 58s)
* 18:29 ryankemper: [Cirrus] `eqiad` fully recovered (100% of shards), `codfw` at 99.816%. `codfw` is getting held up by recovery of `enwiki` shards which tend to be quite large
* 18:25 razzi: reenable replication on dbstore1007 for [[phab:T290841|T290841]]
* 18:16 cwhite: apply high log volume from ES mitigations to deprecated inputs
* 18:13 razzi: razzi@dbstore1007:~$ sudo systemctl restart mariadb@s3.service for [[phab:T290841|T290841]]
* 18:05 razzi: sudo systemctl restart mariadb@s2.service
* 17:48 ryankemper: [Cirrus] `eqiad` is at 99.13% shards recovered and `codfw` is at 98.83%
* 17:20 volans@cumin1001: END (PASS) - Cookbook sre.experimental.reimage (exit_code=0) for host sretest1002.eqiad.wmnet
* 17:17 ryankemper: [Cirrus] `enwiki` searches appear to be working now. `production-search-eqiad` is at 93.5% recovered shards, `production-search-codfw` is at 95.3% recovered
* 16:57 volans@cumin1001: START - Cookbook sre.experimental.reimage for host sretest1002.eqiad.wmnet
* 16:18 legoktm@cumin1001: conftool action : set/pooled=false; selector: name=codfw,dnsdisc=eventgate-main
* 16:16 volans@cumin1001: conftool action : set/pooled=yes; selector: name=mw1414.*
* 16:08 volans@cumin1001: conftool action : set/pooled=no; selector: name=mw1414.*
* 16:06 volans@cumin1001: END (PASS) - Cookbook sre.experimental.reimage (exit_code=0) for host mw1414.eqiad.wmnet
* 15:54 moritzm: filtered mx2001 on the routers for reimage [[phab:T286911|T286911]]
* 15:43 vgutierrez: update acme-chief to version 0.31 on acmechief-test hosts - [[phab:T290249|T290249]]
* 15:40 vgutierrez: upload acme-chief 0.31 to apt.wm.o (buster) - [[phab:T290249|T290249]]
* 15:32 jelto: Traffic: depool codfw from user traffic
* 15:26 jelto@cumin2002: END (PASS) - Cookbook sre.switchdc.services.02-restore-ttl (exit_code=0)
* 15:25 jelto@cumin2002: START - Cookbook sre.switchdc.services.02-restore-ttl
* 15:25 volans@cumin1001: START - Cookbook sre.experimental.reimage for host mw1414.eqiad.wmnet
* 15:20 Emperor: rebooting ms-be2045 to see if that brings the disk back properly [[phab:T290881|T290881]]
* 15:13 jelto@cumin2002: conftool action : set/pooled=false; selector: name=eqiad,dnsdisc=restbase-async
* 15:13 legoktm: (contd.) box-constraints{{!}}similar-users{{!}}termbox{{!}}thanos-query{{!}}thanos-swift{{!}}wdqs{{!}}wdqs-internal{{!}}wikifeeds{{!}}zotero)
* 15:13 rzl: (contd.) box-constraints{{!}}similar-users{{!}}termbox{{!}}thanos-query{{!}}thanos-swift{{!}}wdqs{{!}}wdqs-internal{{!}}wikifeeds{{!}}zotero)
* 15:12 jelto@cumin2002: conftool action : set/pooled=true; selector: name=codfw,dnsdisc=(apertium{{!}}api-gateway{{!}}citoid{{!}}cxserver{{!}}echostore{{!}}eventgate-analytics{{!}}eventgate-analytics-external{{!}}eventgate-logging-external{{!}}eventgate-main{{!}}eventstreams{{!}}eventstreams-internal{{!}}kartotherian{{!}}linkrecommendation{{!}}mathoid{{!}}mobileapps{{!}}ores{{!}}proton{{!}}push-notifications{{!}}recommendation-api{{!}}restbase{{!}}restbase-async{{!}}schema{{!}}search{{!}}sessionstore{{!}}shellbox{{!}}shell
* 15:02 jelto@cumin2002: END (PASS) - Cookbook sre.switchdc.services.00-reduce-ttl-and-sleep (exit_code=0)
* 15:02 topranks: Restarting unused line-card FPC 1 in cr2-codfw in attempt to clear alarm.
* 14:56 jelto@cumin2002: START - Cookbook sre.switchdc.services.00-reduce-ttl-and-sleep
* 14:44 herron: drained mx2001 mail queue to mx1001 [[phab:T286911|T286911]]
* 14:38 dcausse: restarting wdqs-updater.service on all wdqs servers
* 14:21 jelto@cumin2002: END (PASS) - Cookbook sre.switchdc.services.02-restore-ttl (exit_code=0)
* 14:20 jelto@cumin2002: START - Cookbook sre.switchdc.services.02-restore-ttl
* 14:13 jelto@cumin2002: END (PASS) - Cookbook sre.switchdc.services.01-switch-dc (exit_code=0)
* 14:13 legoktm: (contd.) ternal, eventgate-main, wikifeeds, eventstreams-internal, eventgate-analytics-external: codfw => eqiad
* 14:12 jelto@cumin2002: Switching services echostore, termbox, cxserver, eventstreams, search, ores, mathoid, schema, push-notifications, thanos-swift, wdqs, sessionstore, restbase, wdqs-internal, apertium, eventgate-analytics, citoid, api-gateway, restbase-async, proton, linkrecommendation, thanos-query, shellbox, kartotherian, mobileapps, recommendation-api, zotero, similar-users, shellbox-constraints, eventgate-logging-ex
* 14:12 jelto@cumin2002: START - Cookbook sre.switchdc.services.01-switch-dc
* 14:11 jelto@cumin2002: END (PASS) - Cookbook sre.switchdc.services.00-reduce-ttl-and-sleep (exit_code=0)
* 14:05 jelto@cumin2002: START - Cookbook sre.switchdc.services.00-reduce-ttl-and-sleep
* 14:03 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host durum3002.esams.wmnet
* 13:51 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm for new host durum3002.esams.wmnet
* 13:50 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host durum3001.esams.wmnet
* 13:39 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm for new host durum3001.esams.wmnet
* 13:36 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host durum2002.codfw.wmnet
* 13:21 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm for new host durum2002.codfw.wmnet
* 13:20 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host durum2001.codfw.wmnet
* 13:08 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm for new host durum2001.codfw.wmnet
* 12:09 volans@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:03 volans@cumin1001: START - Cookbook sre.dns.netbox
* 11:32 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:27 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:26 kostajh: European mid-day backport window deploys done
* 11:24 kharlan@deploy1002: Synchronized wmf-config: Config: [[gerrit:713553{{!}}WikimediaEvents: Remove UnderstandingFirstDay config]] (duration: 00m 59s)
* 10:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm2002.codfw.wmnet
* 10:43 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm2002.codfw.wmnet
* 10:15 volans@cumin1001: END (FAIL) - Cookbook sre.experimental.reimage (exit_code=93) for host mw1414.eqiad.wmnet
* 09:33 volans: restarting tcpircbot-logmsgbot on alert1001, not relaying messages
* 09:18 elukey: upgrade rsyslog* on ml-serve* nodes to 8.1901.0-1+wmf2
* 09:16 godog: swift eqiad-prod: add weight to ms-be10[64-67] - [[phab:T290546|T290546]]
* 09:11 moritzm: reimaging sretest1002
* 09:11 elukey: upload rsyslog* 8.1901.0-1+wmf2 to buster-wikimedia component/rsyslog-k8s - [[phab:T277739|T277739]]
* 08:16 godog: bump +100G prometheus/ops codfw


== 2015-07-31 ==
== 2021-09-12 ==
* 20:14 logmsgbot: ori Synchronized php-1.26wmf16/includes/objectcache/ObjectCacheSessionHandler.php: Uncommitted revert of I4afaecd to test impact on T102199 (duration: 00m 12s)
* 18:33 vgutierrez: restart varnish-fe on cp3061, cp3063 and cp3065
* 20:11 godog: revert to openjdk8 and restart cassandra on restbase1008
* 18:29 vgutierrez: restart varnish on cp3055
* 19:55 logmsgbot: ori Synchronized php-1.26wmf16/includes/User.php: More debug logging for T102199 (duration: 00m 13s)
* 18:26 vgutierrez: restart varnish on cp3057
* 19:54 godog: revert to openjdk8 and restart cassandra on restbase1007
* 04:53 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:51 logmsgbot: ori Synchronized php-1.26wmf16/includes/EditPage.php: More debug logging for T102199 (duration: 00m 12s)
* 04:52 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:21 godog: revert to openjdk8 and restart cassandra on restbase1006
* 19:02 godog: revert to openjdk8 and restart cassandra on restbase1005
* 18:44 twentyafterfour: oddly, the symptom was that there were logs about apc cache entries that had been on the GC queue for too long, I guess this is due to phd being stuck
* 18:43 twentyafterfour: restarted phd on iridium. I had to forcefully kill one stuck repository worker to get the daemons to restart properly.
* 18:36 godog: revert to openjdk8 and restart cassandra on restbase1004
* 18:15 mutante: multatuli - installing package upgrades
* 18:08 legoktm: made User:Flow talk page manager a 'bot' on all wikis (except loginwiki)
* 18:08 godog: revert to openjdk8 and restart cassandra on restbase1003
* 17:53 godog: revert to openjdk8 and restart cassandra on restbase1002
* 17:41 godog: revert to openjdk8 and restart cassandra on restbase1001 T104887
* 17:11 greg-g: follow on to previous to be explicit: it's not deployed, it is queued for Monday morning SWAT
* 17:10 aude: wmf/1.26wmf16 core submodule bump for Ic25edf7 (MultimediaViewer) is now on tin
* 17:06 logmsgbot: aude Synchronized php-1.26wmf16/extensions/Wikidata: Fix api xml format (duration: 00m 20s)
* 15:52 bd808: Rebuilt grafana-dashboards index to have 1 shard/2 replicas in logstash cluster
* 15:46 bd808: Rebuilt kibana-int index to have 1 shard/2 replicas in logstash cluster
* 15:45 andrewbogott: rebooting labvirt1005, again (3.16 this time)
* 15:19 logmsgbot: jynus Synchronized wmf-config/db-eqiad.php: reverting db1035 load to 10% (duration: 00m 14s)
* 15:03 urandom: bouncing restbase1005 (attempting to reproduce GC trends)
* 14:54 Coren: turned on alerting of backup status on labstore* with (by design) low limits.  Expect alarms, and ignore.
* 14:44 kart_: Update cxserver to 9669e19
* 14:38 andrewbogott: bumped the kernel version on labvirt1005, rebooting.
* 14:09 godog: restart cassandra on restbase1004 to apply java downgrade, missed from batch downgrade yesterday
* 12:10 godog: restbase1008 bootstrap finished successfully
* 10:30 logmsgbot: jynus Synchronized wmf-config/db-eqiad.php: returning db1035 to 100% load (duration: 00m 12s)
* 08:19 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: I7be6dd2f5: Set $wgAjaxEditStash to false, on suspicion of being implicated in T102199 (duration: 00m 12s)
* 07:35 _joe_: powercycling analytics1013, no ssh, console unresponsive
* 04:45 logmsgbot: @tin ResourceLoader cache refresh completed at Fri Jul 31 04:45:41 UTC 2015 (duration 45m 40s)
* 04:09 springle: upgrade/restart dbstore1001
* 03:48 logmsgbot: krenair Synchronized php-1.26wmf16/extensions/VisualEditor: https://gerrit.wikimedia.org/r/#/c/228197/ (duration: 00m 12s)
* 02:31 logmsgbot: @tin LocalisationUpdate completed (1.26wmf16) at 2015-07-31 02:31:20+00:00
* 02:28 logmsgbot: l10nupdate Synchronized php-1.26wmf16/cache/l10n: (no message) (duration: 06m 13s)
* 00:35 logmsgbot: catrope Synchronized php-1.26wmf16/extensions/Flow/includes/Model/WikiReference.php: debugging (duration: 00m 12s)
* 00:34 logmsgbot: catrope Synchronized php-1.26wmf16/extensions/Flow/includes/Model/WikiReference.php: debugging (duration: 00m 12s)
* 00:29 logmsgbot: catrope Synchronized php-1.26wmf16/extensions/Flow/includes/Model/WikiReference.php: debugging (duration: 00m 13s)


== 2015-07-30 ==
== 2021-09-11 ==
* 23:52 logmsgbot: catrope Synchronized flow.dblist: remove commons (duration: 00m 14s)
* 19:02 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|27814b8eaacb5ba2fee1b6167a36ea14356a1ecf}}: testwiki: Fully remove securepoll-related groups ([[phab:T290808|T290808]]) (duration: 00m 57s)
* 23:47 logmsgbot: krenair Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/195886/ (duration: 00m 11s)
* 18:35 urbanecm: [urbanecm@mwmaint2002 ~]$ mwscript emptyUserGroup.php --wiki=testwiki <nowiki>{</nowiki>electionadmin,electcomm<nowiki>}</nowiki> # [[phab:T290808|T290808]]
* 23:46 logmsgbot: krenair Synchronized wmf-config/throttle.php: https://gerrit.wikimedia.org/r/#/c/195886/ (duration: 00m 12s)
* 18:31 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|908bbf35235ea4129795dfbf4c0e646440152e18}}: Revert "test: Add electcomm and electionadmin groups" ([[phab:T290808|T290808]]) (duration: 00m 58s)
* 23:41 logmsgbot: catrope Synchronized flow.dblist: Enable Flow on plwiki and commonswiki (duration: 00m 11s)
* 23:30 logmsgbot: ebernhardson Synchronized php-1.26wmf16/extensions/DonationInterface/: Bump DonationInterface in 1.26wmf16 again... it uses submodules (duration: 00m 15s)
* 23:29 logmsgbot: ebernhardson Synchronized php-1.26wmf16/extensions/DonationInterface/: Bump DonationInterface in 1.26wmf16 (duration: 00m 16s)
* 23:28 robh: disregard log entry about racktables, never offlined
* 23:22 logmsgbot: ebernhardson Synchronized php-1.26wmf16/includes/specials/SpecialMIMEsearch.php: (no message) (duration: 00m 12s)
* 23:21 logmsgbot: ebernhardson Synchronized php-1.26wmf16/includes/specials/SpecialSearch.php: Fix search-suggest i18n for frwiki in SWAT (duration: 00m 14s)
* 23:21 logmsgbot: ebernhardson Synchronized php-1.26wmf16/extensions/SpamBlacklist/: Update SpamBlacklist for SWAT (duration: 00m 11s)
* 23:12 awight: updating paymentswiki from 02db5f7f77b667da06b882b2f66de9c5546230bc to d4bdce1cae168448b116d75e3dcd3303b0f13dd2
* 23:10 robh: killing apache on magnesium to manually trigger an outage of racktables and test catchpoint alert formatting
* 23:10 logmsgbot: krinkle Synchronized w/rl-test.php: T105255 (duration: 00m 12s)
* 23:06 legoktm: manually merged User:Mirwin's accounts (T107168)
* 22:59 awight: rolling back.  paymentswiki.
* 22:59 awight: redeploying sketchy paymentswiki config
* 22:57 awight: updating paymentswiki from 6854683083cabc730f37b6a79d559f23e7ff7b0f to 02db5f7f77b667da06b882b2f66de9c5546230bc
* 22:43 awight: paymentswiki config rolled back
* 22:42 awight: paymentswiki: config the IIIrd
* 22:34 awight: paymentswiki: rolled back again
* 22:31 awight: redeploying paymentswiki config: with password this time
* 22:21 awight: rolled back paymentswiki config
* 22:01 logmsgbot: ori Synchronized php-1.26wmf16/includes/page/WikiPage.php: I73fba15c26c1: Defer the InfoAction purge in onArticleEdit() (duration: 00m 11s)
* 21:58 awight: paymentswiki config: jiggle the handle
* 21:42 awight: updated paymentswiki from fd0060bf86777ee6b7acd205d134066356da69e8 to 6854683083cabc730f37b6a79d559f23e7ff7b0f
* 21:06 logmsgbot: ori Synchronized php-1.26wmf16/includes/Message.php: c72b7c435f: Debug logging for T102199 (take 2) (duration: 00m 11s)
* 21:06 logmsgbot: ori Synchronized wmf-config/InitialiseSettings.php: I1bbf3f0: Add a debug log channel for bug T102199 (duration: 00m 12s)
* 20:47 mutante: iridium - apt-get clean - 1.7G avail
* 20:02 logmsgbot: ori Synchronized wmf-config/mobile.php: (no message) (duration: 00m 12s)
* 20:00 bblack: starting rolling wipe process on mobile cache contents for T106966 fixup
* 19:48 logmsgbot: ori Synchronized wmf-config: I0990ac5b: Update URL configuration for mobile when entering mobile mode (duration: 00m 12s)
* 19:15 matt_flaschen: Deployed patch for T107170 to wmf/1.26wmf16
* 19:09 logmsgbot: legoktm Synchronized php-1.26wmf16: Revert "Use OOUI HTMLForm for Special:Watchlist" (duration: 01m 46s)
* 18:49 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: I6db1771bf4: Use absolute URLs to construct load.php requests (duration: 00m 12s)
* 18:33 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: I6665bf31: Use relative URLs to construct load.php requests (duration: 00m 12s)
* 18:02 logmsgbot: twentyafterfour rebuilt wikiversions.cdb and synchronized wikiversions files: all wikis to 1.26wmf16
* 17:56 cmjohnson1: decom virt1001-virt1009
* 17:45 jynus: killing some long running queries on db1042
* 15:30 logmsgbot: krenair Synchronized php-1.26wmf15/extensions/MobileFrontend/includes/Resources.php: https://gerrit.wikimedia.org/r/#/c/228001/ (duration: 00m 12s)
* 15:30 logmsgbot: krenair Synchronized php-1.26wmf16/extensions/MobileFrontend/includes/Resources.php: https://gerrit.wikimedia.org/r/#/c/228000/ (duration: 00m 11s)
* 15:21 logmsgbot: krenair Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/227999/ (duration: 00m 12s)
* 15:03 gwicke: disabled old restbase checkout on tin to make sure it doesn't start up
* 15:02 logmsgbot: krenair Synchronized w/static/images/project-logos/commonswiki.png: https://gerrit.wikimedia.org/r/#/c/227962/ (duration: 00m 13s)
* 15:02 godog: bootstrap cassandra on restbase1008
* 15:02 gwicke: manually cleaned up RB code on 1007 and 1008
* 14:37 moritzm: installed openjdk security updates on analytics*
* 14:05 moritzm: restarted opendj on nembus/neptunium to effect OpenJDK security updates
* 13:44 godog: downgrade openjdk-7-jre on restbase1007, nodetool flush and cassandra restart
* 13:39 godog: downgrade openjdk-7-jre on restbase1006, nodetool flush and cassandra restart
* 13:29 godog: downgrade openjdk-7-jre on restbase1005, nodetool flush and cassandra restart
* 13:25 moritzm: installed openjdk updates on gallium, restarting jenkins
* 13:17 godog: downgrade openjdk-7-jre on restbase1004, nodetool flush and cassandra restart
* 13:02 godog: downgrade openjdk-7-jre on restbase1003, nodetool flush and cassandra restart
* 12:47 godog: downgrade openjdk-7-jre on restbase1002, nodetool flush and cassandra restart
* 12:36 godog: downgrade openjdk-7-jre on restbase1001, nodetool flush and cassandra restart
* 09:18 hashar: Upgraded Zuul on all CI slaves. Should be a noop for zuul-cloner.
* 07:10 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Jul 30 07:10:39 UTC 2015 (duration 10m 38s)
* 04:06 Krenair: Ignore that last error
* 04:05 logmsgbot: LocalisationUpdate failed: git pull of core failed
* 03:33 mutante: killing processes by ellery on stat1002 - load avg was over 1500 and users reported pagecounts are broken (possibly all other crons as well)
* 03:01 logmsgbot: LocalisationUpdate completed (1.26wmf16) at 2015-07-30 03:01:49+00:00
* 02:59 logmsgbot: l10nupdate Synchronized php-1.26wmf16/cache/l10n: (no message) (duration: 04m 25s)
* 02:40 logmsgbot: LocalisationUpdate completed (1.26wmf15) at 2015-07-30 02:40:38+00:00
* 02:36 logmsgbot: l10nupdate Synchronized php-1.26wmf15/cache/l10n: (no message) (duration: 07m 45s)
* 02:26 logmsgbot: ori Synchronized wmf-config/InitialiseSettings.php: I3c6217f06: Double $wgMemoryLimit (330 => 660) (duration: 00m 12s)
* 02:07 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Jul 30 02:07:40 UTC 2015 (duration 7m 39s)
* 02:03 logmsgbot: LocalisationUpdate failed (1.26wmf16) at 2015-07-30 02:03:29+00:00
* 02:03 logmsgbot: LocalisationUpdate failed (1.26wmf15) at 2015-07-30 02:03:29+00:00
* 01:30 springle: MIMEsearchPage::reallyDoQuery queries with crazy eg, LIMIT 10405000,501, on commonswiki vslow slave, from tide***.microsoft.com bots. log noise is queries hitting 5min limit and auto-killed
* 00:48 logmsgbot: ori Synchronized php-1.26wmf15/includes/Message.php: 160f69871c: Debug logging for T102199 (duration: 00m 13s)
* 00:36 logmsgbot: ori Synchronized php-1.26wmf16/includes/Message.php: eb281630ce: Debug logging for T102199 (duration: 00m 11s)
* 00:10 awight: rolled back config
* 00:09 awight: crazy previous message was all about: I pointed the DonationInterface frontends to mirror limbo messages to a Redis server on localhost.
* 00:08 awight: deployed interesting gc-cc-limbo config


== 2015-07-29 ==
== 2021-09-10 ==
* 23:43 legoktm: finished fixing Scribunto content models
* 21:28 legoktm@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'shellbox-syntaxhighlight' for release 'main' .
* 23:30 logmsgbot: krenair Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/225840/ (duration: 00m 12s)
* 21:27 legoktm@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'shellbox-syntaxhighlight' for release 'main' .
* 23:30 logmsgbot: krenair Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/225840/ (duration: 00m 12s)
* 21:21 legoktm@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'shellbox-syntaxhighlight' for release 'main' .
* 23:23 logmsgbot: krenair Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/227892/ (duration: 00m 12s)
* 20:46 jhuneidi@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'blubberoid' for release 'production' .
* 23:20 legoktm: starting script to fix Scribunto content models due to imports on all wikis (T91170)
* 20:44 jhuneidi@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'blubberoid' for release 'production' .
* 23:14 logmsgbot: bd808 Purged l10n cache for 1.26wmf14
* 20:42 jhuneidi@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'blubberoid' for release 'staging' .
* 23:14 logmsgbot: bd808 Purged l10n cache for 1.26wmf13
* 18:34 volans@cumin1001: END (FAIL) - Cookbook sre.experimental.reimage (exit_code=99) for host sretest1001.eqiad.wmnet
* 23:13 logmsgbot: bd808 Purged l10n cache for 1.26wmf12
* 18:08 volans@cumin1001: START - Cookbook sre.experimental.reimage for host sretest1001.eqiad.wmnet
* 23:03 mutante: snapshot1001 - apt-get clean - 107M avail
* 17:16 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on puppetmaster2005.codfw.wmnet with reason: REIMAGE
* 23:02 Krenair: snapshot1001 - No space left on device
* 17:14 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on puppetmaster2005.codfw.wmnet with reason: REIMAGE
* 23:02 logmsgbot: krenair Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/227879/ (duration: 00m 12s)
* 16:42 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on puppetmaster2004.codfw.wmnet with reason: REIMAGE
* 22:27 legoktm: update page set page_content_model ="wikitext" where page_id=12134769; on wikidatawiki
* 16:40 pt1979@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on puppetmaster2004.codfw.wmnet with reason: REIMAGE
* 21:22 legoktm: fixed Module:*/doc pages on wikidatawiki
* 16:14 volans@cumin1001: END (FAIL) - Cookbook sre.experimental.reimage (exit_code=99) for host sretest1001.eqiad.wmnet
* 20:44 legoktm: update page set page_content_model="Scribunto" where page_id=12134769; on wikidatawiki
* 16:03 volans@cumin1001: START - Cookbook sre.experimental.reimage for host sretest1001.eqiad.wmnet
* 20:42 arlolra: updated Parsoid to version 6e095a92
* 15:39 volans@cumin1001: END (FAIL) - Cookbook sre.experimental.reimage (exit_code=99) for host sretest1001.eqiad.wmnet
* 20:41 legoktm: manually fixed content models for wikidata's Module namespace (T107340)
* 15:27 volans@cumin1001: START - Cookbook sre.experimental.reimage for host sretest1001.eqiad.wmnet
* 20:31 logmsgbot: ori Synchronized php-1.26wmf16/extensions/Wikidata/extensions/Wikibase/repo/includes/actions/SubmitEntityAction.php: Live-hack stats increment call for session_fail_preview (duration: 00m 12s)
* 14:48 jiji@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:30 logmsgbot: ori Synchronized php-1.26wmf16/extensions/Wikidata/extensions/Wikibase/repo/includes/EditEntity.php: Live-hack stats increment call for session_fail_preview (duration: 00m 12s)
* 14:43 jiji@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:26 urandom: bouncing cassandra on restbase1006 to apply logstash config
* 13:54 jiji@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:18 urandom: bouncing cassandra on restbase1005 to apply logstash config
* 09:31 XioNoX: push pfw policies - [[phab:T290611|T290611]]
* 20:15 urandom: bouncing cassandra on restbase1004 to apply logstash config
* 09:07 mutante: planet - deleted all state files for all languages, running fresh update via systemctl start for all languages after proxy changes ([[phab:T285251|T285251]])
* 20:11 urandom: bouncing cassandra on restbase1003 to apply logstash config
* 08:37 jynus: upgrade and restart db2139
* 20:04 urandom: bouncing cassandra on restbase1002 to apply logstash config
* 08:14 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 19:59 urandom: restarting restbase1001 to apply logstash config
* 08:14 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 19:51 twentyafterfour: scap sync failed on snapshot1001 due to full disk
* 08:14 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 19:48 logmsgbot: twentyafterfour Finished scap: group1 wikis to 1.26wmf16 (duration: 45m 12s)
* 08:13 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 19:03 logmsgbot: twentyafterfour Started scap: group1 wikis to 1.26wmf16
* 08:12 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 18:36 legoktm: fixed content models of MediaWiki and Module namespace pages on azbwiki
* 08:12 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 18:24 legoktm: manually attached User:Flow talk page manager accounts
* 07:58 jayme: updating rsyslog to 8.1901.0-1~bpo9+wmf2 on kubernetes-workers - [[phab:T289766|T289766]]
* 17:38 logmsgbot: aude Synchronized php-1.26wmf16/extensions/Wikidata: fix focus when entering site links (duration: 00m 22s)
* 07:57 moritzm: installing ntfs-3g security updates
* 17:37 logmsgbot: aude Synchronized php-1.26wmf16/thumb.php: 2c9518ed78: Add Content-Length header to thumb.php redirects (duration: 00m 13s)
* 07:46 jayme@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 16:14 andrewbogott: re-imaging labnodepool1001
* 07:45 jayme@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 16:13 ori: depooled Precise image scalers (mw1159 / mw1160) to see if 2c9518ed78 helped.
* 07:31 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 16:12 logmsgbot: ori Synchronized wmf-config: Revert "No need for wgSecureLogin on our wikis, HTTPS is forced everywhere" (duration: 00m 13s)
* 07:31 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 16:11 logmsgbot: ori Synchronized php-1.26wmf15/thumb.php: 2c9518ed78: Add Content-Length header to thumb.php redirects (duration: 00m 12s)
* 07:25 jayme: updating rsyslog to 8.1901.0-1~bpo9+wmf2 on kubernetes-staging - [[phab:T289766|T289766]]
* 16:11 logmsgbot: ori Synchronized php-1.26wmf16/thumb.php: 2c9518ed78: Add Content-Length header to thumb.php redirects (duration: 00m 12s)
* 07:19 jayme: imported rsyslog 8.1901.0-1~bpo9+wmf2 to stretch-wikimedia - [[phab:T289766|T289766]]
* 16:01 moritzm: installed qemu security updates on labvirt*
* 06:56 effie: disable puppet on deploy1002 and mw2254
* 15:36 logmsgbot: krenair Synchronized tests/dblistTest.php: (no message) (duration: 00m 10s)
* 06:29 jayme@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 15:36 logmsgbot: krenair Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 12s)
* 06:27 jayme@deploy1002: helmfile [codfw] START helmfile.d/admin 'sync'.
* 15:36 logmsgbot: krenair Synchronized database lists: (no message) (duration: 00m 12s)
* 06:26 jayme@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'sync'.
* 15:33 logmsgbot: krenair Synchronized wmf-config/InitialiseSettings.php: (no message) (duration: 00m 12s)
* 06:26 jayme@deploy1002: helmfile [eqiad] START helmfile.d/admin 'sync'.
* 15:30 logmsgbot: krenair Synchronized wikisource.dblist: https://gerrit.wikimedia.org/r/#/c/194549/ (duration: 00m 12s)
* 06:02 elukey@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=mw2280.codfw.wmnet
* 15:27 logmsgbot: krenair Synchronized tests/dblistTest.php: https://gerrit.wikimedia.org/r/#/c/194549/ (duration: 00m 13s)
* 05:59 jiji@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:26 logmsgbot: krenair Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/194549/ (duration: 00m 13s)
* 05:56 elukey: powercycle mw2280 - no tty available in mgmt, no ssh, host frozen
* 15:26 logmsgbot: krenair Synchronized database lists: https://gerrit.wikimedia.org/r/#/c/194549/ (duration: 00m 11s)
* 05:55 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=mw2280.codfw.wmnet
* 15:21 logmsgbot: krenair Synchronized wikipedia.dblist: https://gerrit.wikimedia.org/r/#/c/227718/3 (duration: 00m 12s)
* 05:54 jiji@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:21 logmsgbot: krenair Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/227718/3 (duration: 00m 12s)
* 05:45 jiji@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:20 logmsgbot: aude Synchronized php-1.26wmf15/extensions/Wikidata: rv usage tracking change (duration: 00m 20s)
* 05:42 jiji@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:18 logmsgbot: krenair Synchronized wikipedia.dblist: https://gerrit.wikimedia.org/r/#/c/227718/3 (duration: 00m 12s)
* 05:12 marostegui: Repool clouddb1017:3311
* 15:17 logmsgbot: krenair Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/227718/3 (duration: 00m 12s)
* 05:12 marostegui: Repool clouddb1013:3311
* 14:28 logmsgbot: aude Synchronized usagetracking.dblist: Enable usage tracking on ptwiki and azbwiki (duration: 00m 12s)
* 04:49 marostegui: Depool clouddb1013:3311
* 14:14 logmsgbot: aude Synchronized php-1.26wmf15/extensions/Wikidata: rv add usage tracking job (duration: 00m 20s)
* 04:49 marostegui: Depool clouddb1017:3311
* 14:13 logmsgbot: aude Synchronized php-1.26wmf15/extensions/Wikidata: add usage tracking job (duration: 00m 20s)
* 02:52 eileen: civicrm revision changed from {{Gerrit|83f514f693}} to {{Gerrit|1f071f6c6c}}, config revision is {{Gerrit|23eda8ba3a}}
* 14:11 logmsgbot: aude Synchronized php-1.26wmf16/extensions/Wikidata: add usage tracking job (duration: 00m 24s)
* 00:35 tgr: Deployed patch for [[phab:T290692|T290692]]
* 13:27 bblack: repooling cp3030 with wiped caches
* 13:19 bblack: depooling cp3030 (all layers)
* 10:51 _joe_: restarted apertium-apy on sca1001, freed 54 GB of RAM (processes were OOMing)
* 10:18 _joe_: repooling the zend imagescalers until https://gerrit.wikimedia.org/r/#/c/227676 is reviewed and deployed
* 09:14 _joe_: depooling mw1159-60 from the imagescalers pool
* 08:02 hashar_: disabled puppet on labnodepool1001.eqiad.wmnet
* 07:41 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Jul 29 07:41:54 UTC 2015 (duration 41m 53s)
* 04:43 logmsgbot: demon Synchronized wmf-config/InitialiseSettings.php: rv myself (duration: 00m 13s)
* 04:42 logmsgbot: demon Synchronized database lists: rv myself (duration: 00m 12s)
* 04:00 logmsgbot: demon Synchronized database lists: moving special wikipedias to wikipedia.dblist (duration: 00m 13s)
* 04:00 logmsgbot: demon Synchronized wmf-config/InitialiseSettings.php: moving special wikipedias to wikipedia.dblist (duration: 00m 12s)
* 03:25 springle: upgrade reboot db1011 trusty
* 03:15 logmsgbot: LocalisationUpdate completed (1.26wmf16) at 2015-07-29 03:15:56+00:00
* 03:09 logmsgbot: l10nupdate Synchronized php-1.26wmf16/cache/l10n: (no message) (duration: 10m 47s)
* 02:43 logmsgbot: LocalisationUpdate completed (1.26wmf15) at 2015-07-29 02:43:27+00:00
* 02:37 logmsgbot: l10nupdate Synchronized php-1.26wmf15/cache/l10n: (no message) (duration: 10m 08s)
* 02:07 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Jul 29 02:07:17 UTC 2015 (duration 7m 16s)
* 02:03 logmsgbot: LocalisationUpdate failed (1.26wmf16) at 2015-07-29 02:03:04+00:00
* 02:03 logmsgbot: LocalisationUpdate failed (1.26wmf15) at 2015-07-29 02:03:03+00:00
* 00:43 logmsgbot: ori Synchronized php-1.26wmf15/extensions/AbuseFilter: Revert "Revert "Conversion to using getMainStashInstance()"" (duration: 00m 12s)
* 00:02 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: Iccd317c6: Switch over the 'sessions' ObjectCache to nutcracker (T106986) (duration: 00m 13s)
* 00:01 ori: Switching over the sessions ObjectCache instance to use nutcracker. Users with an existing edit session in progress will have their session reset and will need to re-login.


== 2015-07-28 ==
== 2021-09-09 ==
* 23:50 logmsgbot: ori Synchronized php-1.26wmf15/includes/objectcache/RedisBagOStuff.php: I3812ec5a0b: RedisBagOStuff: if no alternatives, skip master link status check (duration: 00m 12s)
* 23:07 brennen: no takers on patches, ending backport & config training window.
* 23:50 logmsgbot: ori Synchronized php-1.26wmf16/includes/objectcache/RedisBagOStuff.php: I3812ec5a0b: RedisBagOStuff: if no alternatives, skip master link status check (duration: 00m 12s)
* 21:31 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:36 bblack: rebooting cp20xx.codfw.wmnet for kernel updates (downtimed)
* 21:27 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 23:20 logmsgbot: krenair Synchronized php-1.26wmf16/extensions/VisualEditor/modules/ve-mw/init/ve.init.mw.ApiResponseCache.js: https://gerrit.wikimedia.org/r/#/c/227607/ (duration: 00m 12s)
* 21:20 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:02 logmsgbot: krenair Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/227496/ (duration: 00m 12s)
* 21:17 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 22:55 ejegg: updated payments from bdc4afaa7699904ac30c1f6d3bb3fbc6bac5e87e to fd0060bf86777ee6b7acd205d134066356da69e8
* 21:02 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:51 logmsgbot: twentyafterfour rebuilt wikiversions.cdb and synchronized wikiversions files: group0 wikis to 1.26wmf16
* 20:58 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 22:40 logmsgbot: krinkle Synchronized w/rl-test.php: T105255 (duration: 00m 12s)
* 19:40 jiji@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:23 Tim: on mw1203 restarted hhvm due to StatCache lockup
* 19:37 jiji@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:08 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: Iecddb3bf24: Add nutcracker-redis object cache instance, unused for now (duration: 00m 11s)
* 19:06 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:05 logmsgbot: twentyafterfour Finished scap: new branch: testwiki to 1.26wmf16 (duration: 26m 26s)
* 19:04 jiji@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:01 gwicke: restbase ca30b69 deployed to eqiad cluster
* 18:37 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:48 gwicke: canary restbase ca30b69 deploy to restbase1001.eqiad
* 18:34 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:39 logmsgbot: twentyafterfour Started scap: new branch: testwiki to 1.26wmf16
* 18:33 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|bc4f20437868b39ae2cc4eac8735ecb8bcd93157}}: Growth: Push 44 wikis out of dark mode ([[phab:T289680|T289680]]) (duration: 00m 57s)
* 21:14 matt_flaschen: Deployed patch for T107170 to wmf/1.26wmf15 and wmf/1.26wmf16
* 18:24 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|6af38d951f0ef9af369e2172c175628dc6e9a281}}: Deploy Growth features in dark modes to ~200 wikis ([[phab:T290582|T290582]]; 3/3) (duration: 00m 57s)
* 20:39 ori: Upgraded nutcracker to 0.4.1-1+wm1 across fleet
* 18:22 urbanecm@deploy1002: Synchronized wmf-config/config/: {{Gerrit|6af38d951f0ef9af369e2172c175628dc6e9a281}}: Deploy Growth features in dark modes to ~200 wikis ([[phab:T290582|T290582]]; 2/3) (duration: 01m 01s)
* 18:57 logmsgbot: bblack Synchronized wmf-config/InitialiseSettings-labs.php: remove wgSecureLogin (duration: 00m 12s)
* 18:21 urbanecm@deploy1002: Synchronized dblists/growthexperiments.dblist: {{Gerrit|6af38d951f0ef9af369e2172c175628dc6e9a281}}: Deploy Growth features in dark modes to ~200 wikis ([[phab:T290582|T290582]]; 1/3) (duration: 00m 58s)
* 18:56 logmsgbot: bblack Synchronized wmf-config/InitialiseSettings.php: remove wgSecureLogin (duration: 00m 12s)
* 18:21 jayme@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
* 18:44 ori: Twiddling with nutcracker on mw1041
* 18:20 jayme@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
* 18:33 andrewbogott: disabling puppet and nova-network on labnet1002 to avoid possible conflict between two different dhcp servers
* 18:20 urbanecm@deploy1002: sync-file aborted: {{Gerrit|6af38d951f0ef9af369e2172c175628dc6e9a281}}: Deploy Growth features in dark modes to ~200 wikis ([[phab:T290582|T290582]]) (duration: 00m 05s)
* 17:04 godog: start cassandra on restbase1007, tentative bootstrap
* 18:18 volans@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on sretest1001.eqiad.wmnet with reason: REIMAGE
* 16:24 YuviPanda: bounced create-dbusers on labstore1002
* 18:18 jayme@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
* 16:03 bd808: logstash1002 conversion to jessie done; log event volume returning to normal in index
* 18:17 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:01 godog: bounce cassandra on xenon to test logstash logging
* 18:17 jayme@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
* 15:52 bd808: installed logstash on logstash1002; forced puppet run
* 18:16 volans@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest1001.eqiad.wmnet with reason: REIMAGE
* 15:03 logmsgbot: thcipriani Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable VisualEditor for 5% of new accounts on enwiki [[gerrit:226338]] (duration: 00m 12s)
* 18:13 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:43 cmjohnson1: powering down logstash1002 to remove disk and install jessie
* 18:12 urbanecm: [urbanecm@mwmaint2002 /srv/mediawiki/php]$ foreachwikiindblist growthexperiments extensions/GrowthExperiments/maintenance/initWikiConfig.php --phab=[[phab:T290582|T290582]] {{!}} tee ~/initwikiconfig.out # [[phab:T290582|T290582]]
* 14:28 moritzm: restarted zookeeper on conf1003 to effect OpenJDK security update
* 18:11 urbanecm: Run extensions/WikimediaMaintenance/createExtensionTables.php growthexperiments for wikis in P17258 ([[phab:T290582|T290582]])
* 14:16 _joe_: re-enabled puppet on mw1152 for testing
* 18:06 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:16 moritzm: restarted zookeeper on conf1002 to effect OpenJDK security update
* 18:05 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:58 paravoid: upgrading baham to gdnsd 2.2.0
* 18:05 urbanecm@deploy1002: Synchronized wmf-config/config: no-op: {{Gerrit|76c51f2753aed9dc8e06b63de6657c3c94371a3c}}: Standardize indentation in several .yaml files (duration: 00m 58s)
* 13:41 _joe_: disabled puppet on mw1152, thumb_handler testing
* 17:29 jelto@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.08-update-tendril (exit_code=0)
* 13:40 moritzm: restarted zookeeper on conf1001 to effect OpenJDK security update
* 17:28 jelto@cumin1001: START - Cookbook sre.switchdc.mediawiki.08-update-tendril
* 13:13 jynus: temporarily changing master of db1069(s1) to db1051 in order to fix some labsdb inconsistencies on enwiki_p
* 17:28 jelto@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.08-start-maintenance (exit_code=0)
* 12:29 godog: reenable puppet on restbase1001 after merging https://gerrit.wikimedia.org/r/#/c/227355/
* 17:26 jelto@cumin1001: START - Cookbook sre.switchdc.mediawiki.08-start-maintenance
* 10:31 paravoid: merging a series of mail-related patches; ping me personally if problems arise
* 17:25 jelto@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.08-run-puppet-on-db-masters (exit_code=0)
* 10:03 mobrovac: citoid deploying d57ec96
* 17:22 jelto@cumin1001: START - Cookbook sre.switchdc.mediawiki.08-run-puppet-on-db-masters
* 09:41 logmsgbot: jynus Synchronized wmf-config/db-eqiad.php: Increasing db1035 weight (duration: 00m 13s)
* 17:21 jelto@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.08-restore-ttl (exit_code=0)
* 08:13 moritzm: added elasticsearch-1.7.0 to carbon for jessie and trusty
* 17:21 jelto@cumin1001: START - Cookbook sre.switchdc.mediawiki.08-restore-ttl
* 07:30 YuviPanda: dropped others20150724190859 on labstore1002
* 17:21 jelto@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.08-restart-envoy-on-jobrunners (exit_code=0)
* 06:53 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Jul 28 06:53:21 UTC 2015 (duration 53m 20s)
* 17:20 jelto@cumin1001: START - Cookbook sre.switchdc.mediawiki.08-restart-envoy-on-jobrunners
* 02:30 logmsgbot: LocalisationUpdate completed (1.26wmf15) at 2015-07-28 02:30:24+00:00
* 17:14 jelto@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.07-set-readwrite (exit_code=0)
* 02:26 logmsgbot: l10nupdate Synchronized php-1.26wmf15/cache/l10n: (no message) (duration: 07m 29s)
* 17:14 jelto@cumin1001: [DRY-RUN] MediaWiki read-only period ends at: 2021-09-09 17:14:12.502162
* 02:07 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Jul 28 02:07:52 UTC 2015 (duration 7m 51s)
* 17:14 jelto@cumin1001: START - Cookbook sre.switchdc.mediawiki.07-set-readwrite
* 02:03 logmsgbot: LocalisationUpdate failed (1.26wmf15) at 2015-07-28 02:03:41+00:00
* 17:14 jelto@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.06-set-db-readwrite (exit_code=0)
* 01:11 logmsgbot: krenair Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/227371/ (duration: 00m 11s)
* 17:14 jelto@cumin1001: START - Cookbook sre.switchdc.mediawiki.06-set-db-readwrite
* 00:35 logmsgbot: krenair Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/227381/ (duration: 00m 13s)
* 17:13 jelto@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.05-invert-redis-sessions (exit_code=0)
* 00:30 logmsgbot: krenair Synchronized php-1.26wmf15/extensions/SiteMatrix/SiteMatrix_body.php: https://gerrit.wikimedia.org/r/#/c/227379/ (duration: 00m 12s)
* 17:13 jelto@cumin1001: START - Cookbook sre.switchdc.mediawiki.05-invert-redis-sessions
* 00:00 logmsgbot: catrope Finished scap: SWAT (duration: 22m 15s)
* 17:13 jelto@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.04-switch-mediawiki (exit_code=0)
* 17:13 jelto@cumin1001: START - Cookbook sre.switchdc.mediawiki.04-switch-mediawiki
* 17:13 jelto@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.03-set-db-readonly (exit_code=0)
* 17:13 jelto@cumin1001: START - Cookbook sre.switchdc.mediawiki.03-set-db-readonly
* 17:12 jelto@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.02-set-readonly (exit_code=0)
* 17:12 jelto@cumin1001: [DRY-RUN] MediaWiki read-only period starts at: 2021-09-09 17:12:27.974410
* 17:12 jelto@cumin1001: START - Cookbook sre.switchdc.mediawiki.02-set-readonly
* 17:08 jelto@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.01-stop-maintenance (exit_code=0)
* 17:07 jelto@cumin1001: START - Cookbook sre.switchdc.mediawiki.01-stop-maintenance
* 17:07 jelto@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.00-warmup-caches (exit_code=0)
* 17:04 jelto@cumin1001: START - Cookbook sre.switchdc.mediawiki.00-warmup-caches
* 17:04 jelto@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.00-reduce-ttl (exit_code=0)
* 16:58 jelto@cumin1001: START - Cookbook sre.switchdc.mediawiki.00-reduce-ttl
* 16:58 jelto@cumin1001: END (PASS) - Cookbook sre.switchdc.mediawiki.00-disable-puppet (exit_code=0)
* 16:58 jelto@cumin1001: START - Cookbook sre.switchdc.mediawiki.00-disable-puppet
* 16:57 jelto: start cookbook sre.switchdc.mediawiki eqiad codfw --live-test; this will generate some additional SAL logs here
* 16:41 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:36 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 16:33 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:23 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 16:10 volans@cumin1001: END (FAIL) - Cookbook sre.experimental.reimage (exit_code=99) for host sretest1001.eqiad.wmnet
* 16:00 volans@cumin1001: START - Cookbook sre.experimental.reimage for host sretest1001.eqiad.wmnet
* 15:34 volans@cumin1001: END (FAIL) - Cookbook sre.experimental.reimage (exit_code=99) for host sretest1001.eqiad.wmnet
* 15:32 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:29 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:28 dancy@deploy1002: Synchronized .pipeline/config.yaml: Config: [[gerrit:719610{{!}}pipeline: add comment redirecting to correct file]] (duration: 00m 59s)
* 15:24 volans@cumin1001: START - Cookbook sre.experimental.reimage for host sretest1001.eqiad.wmnet
* 14:47 mutante: planet - deleting all state and lock files for the "en" feeds ([[phab:T285251|T285251]] [[phab:T289984|T289984]])
* 14:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mx2002.wikimedia.org
* 14:31 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host mx2002.wikimedia.org
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Muehlenhoff out of all services on: 2 hosts
* 14:25 jmm@cumin2002: START - Cookbook sre.idm.logout Logging Muehlenhoff out of all services on: 2 hosts
* 14:19 jmm@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Muehlenhoff out of all services on: 2 hosts
* 14:19 jmm@cumin2002: START - Cookbook sre.idm.logout Logging Muehlenhoff out of all services on: 2 hosts
* 14:11 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps1007.eqiad.wmnet
* 13:48 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host mx2002.wikimedia.org
* 13:16 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:11 mutante: planet1002 - re-enabling disabled puppet
* 13:06 jmm@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Muehlenhoff out of all services on: 2 hosts
* 13:06 jmm@cumin2002: START - Cookbook sre.idm.logout Logging Muehlenhoff out of all services on: 2 hosts
* 13:05 jmm@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Muehlenhoff out of all services on: 2 hosts
* 13:05 jmm@cumin2002: START - Cookbook sre.idm.logout Logging Muehlenhoff out of all services on: 2 hosts
* 13:03 jmm@cumin2002: END (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Muehlenhoff out of all services on: 2 hosts
* 13:03 jmm@cumin2002: START - Cookbook sre.idm.logout Logging Muehlenhoff out of all services on: 2 hosts
* 13:01 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:56 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:49 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps1007.eqiad.wmnet
* 10:48 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on maps1007.eqiad.wmnet with reason: Resyncing from master
* 10:48 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on maps1007.eqiad.wmnet with reason: Resyncing from master
* 10:48 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps1007.eqiad.wmnet
* 10:48 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps1006.eqiad.wmnet
* 10:47 topranks: Removing peering to old IPs of AS139931 (BSCCL) at Equinix Singapore (cr3-eqsin).
* 10:45 topranks: Removing peering to AS24218 at Equinix Singapore (cr3-eqsin) - network no longer uses this ASN.
* 10:22 volans: upgrading spicerack on cumin1001
* 10:20 volans@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts mc1027.eqiad.wmnet
* 10:10 volans@cumin2002: START - Cookbook sre.hosts.decommission for hosts mc1027.eqiad.wmnet
* 09:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host mx2002.wikimedia.org
* 09:47 volans@cumin2002: END (ERROR) - Cookbook sre.hosts.decommission (exit_code=97) for hosts mc1027.eqiad.wmnet
* 09:46 volans@cumin2002: START - Cookbook sre.hosts.decommission for hosts mc1027.eqiad.wmnet
* 09:37 godog: swift eqiad add ms-be10[64-67] with initial weight - [[phab:T290546|T290546]]
* 09:19 filippo@puppetmaster1001: conftool action : set/pooled=false; selector: dnsdisc=swift-ro,name=eqiad
* 09:19 filippo@puppetmaster1001: conftool action : set/pooled=false; selector: dnsdisc=swift,name=eqiad
* 09:15 volans: rebooting sretest1001 to test ipmi reboot via spicerack
* 09:15 volans@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:20:00 on sretest1001.eqiad.wmnet with reason: testing reboot via ipmi
* 09:15 volans@cumin2002: START - Cookbook sre.hosts.downtime for 0:20:00 on sretest1001.eqiad.wmnet with reason: testing reboot via ipmi
* 09:13 btullis@cumin1001: END (PASS) - Cookbook sre.aqs.roll-restart (exit_code=0) for AQS aqs cluster: Roll restart of all AQS's nodejs daemons. - btullis@cumin1001
* 09:09 btullis@cumin1001: START - Cookbook sre.aqs.roll-restart for AQS aqs cluster: Roll restart of all AQS's nodejs daemons. - btullis@cumin1001
* 08:59 godog: move swift traffic fully to codfw to rebalance eqiad - [[phab:T287539|T287539]]
* 08:59 filippo@puppetmaster1001: conftool action : set/pooled=true; selector: dnsdisc=swift,name=codfw
* 08:58 filippo@puppetmaster1001: conftool action : set/pooled=true; selector: dnsdisc=swift-ro,name=codfw
* 08:56 volans: upgrading spicerack on cumin2002 to test the new release
* 08:50 volans: uploaded spicerack_0.0.59 to apt.wikimedia.org buster-wikimedia,bullseye-wikimedia
* 08:23 jelto: run ansible change 719041 on gitlab1001
* 08:13 jelto: run ansible change 719041 on gitlab2001
* 07:07 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host durum1002.eqiad.wmnet
* 06:47 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm for new host durum1002.eqiad.wmnet
* 04:37 ryankemper: [WDQS] Dispatched e-mail to the banned user agent (dailymotion)
* 03:57 ryankemper: [WDQS] Dispatched e-mail to WDQS public mailing list informing them the outage is over; all that's left is the e-mail to the banned UA
* 03:47 ryankemper: [WDQS] Restarting `wdqs-blazegraph` on `wdqs[2001-2008].codfw.wmnet`; if banning the dailymotion UA was sufficient then servers should come back up healthy and not drop back into deadlock
* 03:43 ryankemper: [WDQS] Running puppet agent on `wdqs[2001-2008].codfw.wmnet` to roll out https://gerrit.wikimedia.org/r/719753
* 03:29 ryankemper: [WDQS] There's no clear indication of them being a culprit, but by far the most common user agent is a dailymotion VideocatalogTopic UA (see https://logstash.wikimedia.org/goto/51f238e9010d0220e5d33c6c210be93e)
* 03:12 bstorm: attempting to start replication on clouddb1017 s1 [[phab:T290630|T290630]]
* 03:11 bstorm: stopping and restarting mariadb on clouddb1017 s1
* 03:04 ryankemper: [WDQS] Dispatched email to Wikidata public mailing list about reduced service availability
* 02:36 ryankemper: [WDQS] https://grafana.wikimedia.org/d/000000489/wikidata-query-service?viewPanel=7&orgId=1&from=1631152574841&to=1631154942992 shows the availability pattern, anywhere we see missing data (null) represents time that blazegraph was locked up and therefore unable to report metrics
* 02:34 ryankemper: [WDQS] For context I glanced at `ryankemper@cumin1001:~$ sudo -E cumin 'P<nowiki>{</nowiki>wdqs2*<nowiki>}</nowiki>' 'sudo systemctl status wdqs-blazegraph'` before doing the aforementioned restarts and they'd all last restarted between 25-28 minutes ago
* 02:33 ryankemper: [WDQS] Restarting `wdqs-blazegraph` across all of `wdqs2*`
* 00:50 legoktm@deploy1002: Synchronized wmf-config/CommonSettings.php: Don't set default  to Score (try #2) (duration: 00m 58s)
* 00:48 legoktm@deploy1002: Synchronized php-1.37.0-wmf.21/extensions/Score/includes/Score.php: Use the 'score' Shellbox if configured ([[phab:T290193|T290193]]) (duration: 00m 57s)
* 00:46 legoktm@deploy1002: Synchronized php-1.37.0-wmf.21/includes/shell/CommandFactory.php: shell: Fix $wgShellboxUrls by passing service name when creating BoxedCommand ([[phab:T290193|T290193]]) (duration: 00m 58s)
* 00:45 legoktm@deploy1002: sync-file aborted: shell: Fix $wgShellboxUrls by passing service name when creating BoxedCommand ([[phab:T290193|T290193]]) (duration: 00m 07s)
* 00:15 legoktm@deploy1002: Synchronized wmf-config/CommonSettings.php: Remove putenv() for GDFONTPATH (duration: 00m 58s)


== 2015-07-27 ==
== 2021-09-08 ==
* 23:53 ori: Re-pooling mw1159 and mw1160
* 22:34 ryankemper: [WDQS] [[phab:T280247|T280247]] Ran puppet-agent on `miscweb*` following merge of https://gerrit.wikimedia.org/r/c/wikidata/query/gui-deploy/+/717649
* 23:38 logmsgbot: catrope Started scap: SWAT
* 22:24 ryankemper: [WDQS] [[phab:T280247|T280247]] Ran puppet-agent on `miscweb*` following merge of https://gerrit.wikimedia.org/r/c/wikidata/query/gui-deploy/+/714623
* 23:24 logmsgbot: catrope Synchronized wmf-config/InitialiseSettings.php: SWAT (duration: 00m 12s)
* 21:55 ryankemper: [WDQS] [[phab:T280247|T280247]] Purged varnish to make sure change took effect: `echo 'https://query-preview.wikidata.org/' {{!}} mwscript purgeList.php` and `echo 'https://query.wikidata.org/' {{!}} mwscript purgeList.php` on `mwmaint1002`
* 23:23 logmsgbot: catrope Synchronized w/static/images/project-logos/suwikiquote.png: Localized logo for suwikiquote (duration: 00m 12s)
* 21:53 ryankemper: [WDQS] [[phab:T280247|T280247]] Merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/719502 and ran puppet-agent on `miscweb*`
* 23:17 ejegg: updated crm from 83cacfa1e0852ffaf47d2f02e7d843cf6f3bcda4 to db417a28a247a3fdf3e3023a700d6266e04f3e9d
* 20:49 eileen: civicrm revision changed from {{Gerrit|593d01f4fc}} to {{Gerrit|83f514f693}}, config revision is {{Gerrit|23eda8ba3a}}
* 22:19 andrewbogott: rebooting labvirt1005
* 20:41 legoktm: Successfully published image docker-registry.discovery.wmnet/php7.2-fpm-multiversion-base:1.0.2
* 21:50 bd808: updated scap to dc8eda5 (Don't exclude PHP files from being synced)
* 19:25 Krinkle: krinkle@mw1369 Running some benchmarks in Eqiad on load.php
* 21:34 logmsgbot: ori Synchronized php-1.26wmf15/extensions/AbuseFilter: I13d29ea6: Revert "Conversion to using getMainStashInstance()" (duration: 00m 12s)
* 18:27 urbanecm@deploy1002: Synchronized wmf-config/config/itwiki.yaml: {{Gerrit|6bcbe61f9a89086b775d84a81d55a7587cf26780}}: Italian Wikipedia is now a group 1 wiki ([[phab:T286664|T286664]]; 2/2) (duration: 00m 58s)
* 21:24 andrewbogott: rebooting labnet1002, just to see if I can
* 18:26 urbanecm@deploy1002: Synchronized dblists/: {{Gerrit|6bcbe61f9a89086b775d84a81d55a7587cf26780}}: Italian Wikipedia is now a group 1 wiki ([[phab:T286664|T286664]]; 1/2) (duration: 00m 58s)
* 20:57 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: I1ca47ebc4: $wgEventLoggingSchemaApiUri: http -> https (duration: 00m 12s)
* 18:10 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|bbefce6a3778f159ad68587c830dff4a1da0c792}}: Growth: Remove config that moved on-wiki ([[phab:T290295|T290295]]) (duration: 00m 58s)
* 20:54 bd808: installed libbcprov-java and restarted logstash on logstash1001
* 18:03 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|950a377e5ba6f5d318135e31b36334532d9ae71b}}: Stop setting $wgAbuseFilterParserClass ([[phab:T239990|T239990]]) (duration: 00m 58s)
* 20:33 subbu: deployed parsoid version 92f1cd6d
* 17:01 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts maps2004.codfw.wmnet
* 20:17 ori: (A rise in 503s/minute expected. I'll keep it brief.)
* 16:53 hnowlan@cumin1001: START - Cookbook sre.hosts.decommission for hosts maps2004.codfw.wmnet
* 20:16 ori: Depooled Precise scalers (mw1159 and mw1160) again, for testing.
* 16:52 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts maps2003.codfw.wmnet
* 20:07 godog: bounce rsyslog on mw in eqiad in batches
* 16:37 hnowlan@cumin1001: START - Cookbook sre.hosts.decommission for hosts maps2003.codfw.wmnet
* 19:58 godog: bounce rsyslog on mw in codfw in batches
* 16:37 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts maps2001.codfw.wmnet
* 19:54 logmsgbot: twentyafterfour Synchronized w/: deploy https://gerrit.wikimedia.org/r/#/c/227326/ (duration: 00m 12s)
* 16:33 urbanecm@deploy1002: Synchronized php-1.37.0-wmf.21/extensions/GrowthExperiments/maintenance/updateMenteeData.php: {{Gerrit|796e23c87ccfc48334ab932e13aab4f0ec746bbd}}: updateMenteeData.php: Make it possible to force update (duration: 00m 58s)
* 19:47 godog: bounce rsyslog on mw1235
* 16:28 ladsgroup@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:719524{{!}}Turn off jQuery migrate on wikisource wikis (T280944)]] (duration: 00m 59s)
* 19:37 bd808: godog fixed salt key for logstash1001 which fixed trebuchet install of kibana
* 16:23 hnowlan@cumin1001: START - Cookbook sre.hosts.decommission for hosts maps2001.codfw.wmnet
* 19:31 logmsgbot: krenair Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/227273/ (duration: 00m 13s)
* 16:22 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps1006.eqiad.wmnet
* 19:17 robh: etherpad was giving errors, apache restart fixed
* 16:14 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 20:00:00 on maps1006.eqiad.wmnet with reason: Resyncing from master
* 18:56 bd808: rsyslog forwarded hhvm and apache2 logs still not hitting logstash1001; rsyslog restarts may be needed
* 16:14 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 20:00:00 on maps1006.eqiad.wmnet with reason: Resyncing from master
* 18:53 legoktm: restarted populateContentModel.php --wiki=enwiki on terbium with modification to occasionally clear the link cache so it doesn't OOM.
* 16:13 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on maps1006.eqiad.wmnet with reason: Resyncing from master
* 18:49 godog: stop jobrunner/jobchron/hhvm on mw1011
* 16:13 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on maps1006.eqiad.wmnet with reason: Resyncing from master
* 18:41 bd808: manually ran sync-common on mw1011
* 16:13 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps1005.eqiad.wmnet
* 18:40 bd808: fatalmonitor full of errors from mw1011
* 15:43 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1067.eqiad.wmnet
* 18:38 logmsgbot: bd808 Synchronized wmf-config/InitialiseSettings.php: logstash: change ip address for logstash1001 and logstash1003 (duration: 00m 12s)
* 15:43 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1065.eqiad.wmnet
* 18:33 bd808: logstash1003 salt key not accepted by master
* 15:43 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1066.eqiad.wmnet
* 18:25 bd808: No mediawiki, hhvm or apache2 logs going to logstash1001:10514
* 15:41 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1064.eqiad.wmnet
* 18:20 bd808: logstash1001 back up and running
* 15:38 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host ms-be1067.eqiad.wmnet
* 17:08 moritzm: updated mc200[34] to linux 3.19.3-7 for some testing on hardware
* 15:37 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host ms-be1065.eqiad.wmnet
* 16:34 bblack: switched operations/dns to ff-only like operations/puppet in gerrit config
* 15:37 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host ms-be1066.eqiad.wmnet
* 16:29 bblack: restarted gitblit on antimony (AGAIN...)
* 15:37 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host ms-be1064.eqiad.wmnet
* 15:47 bd808: Added bgerstile and coreyfloyd to github "owners" team
* 14:57 marostegui: Retroactive: started to warm up eqiad databases
* 15:43 _joe_: upgrading the jobrunners to the latest HHVM package
* 14:57 moritzm: installing 4.19.194 kernels on stretch systems with 4.19.x (no reboots yet)
* 15:39 logmsgbot: thcipriani Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable EducationProgram extension at French Wikisource [[gerrit:225019]] (duration: 00m 12s)
* 14:54 brennen: gitlab: upgrading gitlab2001, followed by gitlab1001, to 14.2.3 ([[phab:T289802|T289802]])
* 15:26 logmsgbot: thcipriani Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable Quiz extension at French Wikibooks [[gerrit:225021]] (duration: 00m 12s)
* 14:53 robh@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on ms-be1067.eqiad.wmnet with reason: REIMAGE
* 15:09 logmsgbot: thcipriani Synchronized wmf-config/InitialiseSettings.php: SWAT: Set wgCategoryCollation to uca-default on cswiktionary [[gerrit:226483]] (duration: 00m 12s)
* 14:51 robh@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1067.eqiad.wmnet with reason: REIMAGE
* 15:07 bd808: logstash1001 and logstash1003 offline for physical move and reimaging to jessie. kibana data will be degraded until they are back
* 14:33 moritzm: installing zeromq3 security updates
* 15:04 logmsgbot: thcipriani Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable VisualEditor for auto-created accounts on enwiki [[gerrit:226337]] (duration: 00m 13s)
* 13:50 mbsantos@deploy1002: Finished deploy [kartotherian/deploy@eb211ac]: kartotherian: restore v4 maxzoom to z15 (duration: 06m 42s)
* 14:14 cmjohnson1: logstash1001 going down to relocate to row A
* 13:44 mbsantos@deploy1002: Started deploy [kartotherian/deploy@eb211ac]: kartotherian: restore v4 maxzoom to z15
* 13:55 moritzm: uploaded linux 3.19.3-7 (based on 3.19.8-ckt4 plus the recent NMI security fixes) to carbon
* 13:38 brennen: gitlab: upgrading gitlab2001, followed by gitlab1001, to 14.1.5 ([[phab:T289802|T289802]])
* 13:20 cmjohnson1: powering down logstash1003 to relocate to rack d3
* 13:13 brennen: gitlab1001: downtiming alerts for 2.5 hours; upgrading to 14.0.10 ([[phab:T289802|T289802]])
* 12:51 logmsgbot: jynus Synchronized wmf-config/db-eqiad.php: Repool db1035 after maintenance (duration: 00m 12s)
* 12:45 brennen: gitlab: pausing all runners in preparation for upgrade to 14.0.10 ([[phab:T289802|T289802]])
* 12:07 twentyafterfour: deployed https://gerrit.wikimedia.org/r/#/c/227205/ and restarted apache2 on iridium
* 11:57 moritzm: installing curl security updates on stretch
* 10:04 logmsgbot: jynus Synchronized wmf-config/db-eqiad.php: Depool db1035 (duration: 00m 12s)
* 11:09 jbond: upload statograph_0.1.2
* 09:54 godog: reimage restbase1009, new disks
* 11:02 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on maps1005.eqiad.wmnet with reason: Resyncing from master
* 09:24 godog: reimage restbase1007, new disks installed
* 11:01 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on maps1005.eqiad.wmnet with reason: Resyncing from master
* 09:09 hashar: Allowed JenkinsBot to submit changes on operations/software/conftool for CI purposes.
* 11:01 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps1005.eqiad.wmnet
* 07:54 moritzm: installed java security updates on xenon, cerium, praseodymium, maps-test*
* 10:06 jelto: upgrade gitlab2001 to gitlab-ce=14.0.10-ce.0
* 06:59 _joe_: upgrading hhvm to the latest package across the cluster
* 10:03 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on gitlab2001.wikimedia.org with reason: upgrade gitlab2001 to new version https://phabricator.wikmiedia.org/T289802
* 05:47 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Jul 27 05:47:31 UTC 2015 (duration 47m 30s)
* 10:03 jelto@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on gitlab2001.wikimedia.org with reason: upgrade gitlab2001 to new version https://phabricator.wikmiedia.org/T289802
* 05:00 gwicke: restarted cassandra on restbase1003
* 09:38 godog: start rollout of prometheus-rsyslog-exporter 0.0.0+git20201008-3 to wikimedia.org - [[phab:T210137|T210137]]
* 03:39 springle: upgrade & restart dbstore1002
* 09:29 godog: start rollout of prometheus-rsyslog-exporter 0.0.0+git20201008-3 to codfw - [[phab:T210137|T210137]]
* 02:27 logmsgbot: LocalisationUpdate completed (1.26wmf15) at 2015-07-27 02:27:00+00:00
* 09:09 godog: start rollout of prometheus-rsyslog-exporter 0.0.0+git20201008-3 to eqiad - [[phab:T210137|T210137]]
* 02:22 logmsgbot: l10nupdate Synchronized php-1.26wmf15/cache/l10n: (no message) (duration: 07m 20s)
* 07:45 godog: start rollout of prometheus-rsyslog-exporter 0.0.0+git20201008-3 to eqsin/esams/ulsfo - [[phab:T210137|T210137]]
* 02:07 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Jul 27 02:07:15 UTC 2015 (duration 7m 14s)
* 06:46 ryankemper: [WDQS] Manually running puppet-agent on `miscweb2002.codfw.wmnet,miscweb1002.eqiad.wmnet`
* 02:03 logmsgbot: LocalisationUpdate failed (1.26wmf15) at 2015-07-27 02:03:04+00:00
* 06:45 ryankemper: [WDQS] Merged https://gerrit.wikimedia.org/r/c/operations/puppet/+/719185 to rollback query.wikidata.org changes
* 01:18 ori: Re-pooling mw1159 and mw1160; ran out of time for debugging.
* 02:59 eileen: civicrm revision changed from {{Gerrit|06ef98593f}} to {{Gerrit|593d01f4fc}}, config revision is {{Gerrit|5f004d94d7}}
* 00:43 ori: Depooled Precise image scalers (mw1159 and mw1160); watching for errors.
* 00:00 legoktm: legoktm@lists1001:~$ sudo rm -rf /etc/mailman # cleanup as part of {{Gerrit|4869d91b0be}} / [[phab:T282303|T282303]]


== 2015-07-26 ==
== 2021-09-07 ==
* 22:13 legoktm: killed populateContentModel.php for enwiki on terbium due to alerts
* 23:25 robh@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:02 logmsgbot: ori Synchronized docroot/wikimedia.org/WikipediaMobileFirefoxOS: Update WikipediaMobileFirefoxOS submodule for URL changes (duration: 00m 16s)
* 23:20 robh@cumin1001: START - Cookbook sre.dns.netbox
* 20:51 logmsgbot: ori Synchronized docroot: I5f8b8b54a: Move WikipediaMobileFirefoxOS from bits to wikimedia.org docroot (Bug: T98373) (duration: 00m 17s)
* 23:13 ladsgroup@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:719381{{!}}Enable UrlShortener everywhere (T267925)]] (duration: 00m 58s)
* 05:30 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Jul 26 05:30:10 UTC 2015 (duration 30m 9s)
* 23:07 dpifke@deploy1002: Synchronized wmf-config/profiler.php: Config: [[gerrit:716041{{!}}profiler: use seperate pipeline inside k8s pods (T288165)]] (duration: 00m 58s)
* 03:38 robh: ulsfo network issues, faidon depooled via https://gerrit.wikimedia.org/r/#/c/227067/
* 22:29 cstone: SmashPig revision changed from {{Gerrit|afd362b163}} to {{Gerrit|3607b16f83}}
* 02:26 logmsgbot: LocalisationUpdate completed (1.26wmf15) at 2015-07-26 02:26:47+00:00
* 20:41 ladsgroup@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:715018{{!}}Set $wgWBRepoSettings['tmpNormalizeDataValues'] on all wikis (T251480)]] (duration: 00m 59s)
* 02:22 logmsgbot: l10nupdate Synchronized php-1.26wmf15/cache/l10n: (no message) (duration: 07m 12s)
* 20:31 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 02:07 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Jul 26 02:07:01 UTC 2015 (duration 7m 0s)
* 20:27 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 02:02 logmsgbot: LocalisationUpdate failed (1.26wmf15) at 2015-07-26 02:02:51+00:00
* 17:18 jgiannelos@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'push-notifications' for release 'main' .
* 17:09 jgiannelos@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'push-notifications' for release 'main' .
* 17:01 jgiannelos@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'push-notifications' for release 'main' .
* 16:39 moritzm: installing jetty9 security updates on buster
* 16:30 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 8:00:00 on planet1002.eqiad.wmnet with reason: known issue
* 16:30 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 5 days, 8:00:00 on planet1002.eqiad.wmnet with reason: known issue
* 16:30 dancy@deploy1002: Synchronized README: testing (duration: 00m 59s)
* 15:18 akosiaris: run_benchmarky.py against mwdebug.svc.codfw.wmnet for performance tests
* 15:07 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:04 jbond: upload python-prometheus-client_0.6.0 to stretch-wikimedia
* 14:50 mutante: snapshot1015 - manually removed prometheus-puppet-agent-stats from crontab which was sending spam and is now a timer
* 14:33 mutante: CI - migrating zuul-merger cronjob to systemd timer (contint*)
* 14:23 XioNoX: re-pool esams-eqiad - [[phab:T288503|T288503]]
* 14:23 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on cloudcephosd1024.eqiad.wmnet with reason: REIMAGE
* 14:23 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephosd1024.eqiad.wmnet with reason: REIMAGE
* 14:22 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on cloudcephosd1023.eqiad.wmnet with reason: REIMAGE
* 14:22 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephosd1023.eqiad.wmnet with reason: REIMAGE
* 14:17 marostegui: No more db maintenance on eqiad [[phab:T288594|T288594]]
* 14:08 mutante: alert1001 - temp disabled puppet, stopped icinga-wm
* 14:07 mutante: temp killed icinga-wm because of flooding
* 14:01 Emperor: removing pc2010 from orchestrator [[phab:T289117|T289117]]
* 13:59 Emperor: removing pc2010 from tendril and zarcillo [[phab:T289117|T289117]]
* 13:57 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:57 XioNoX: drain esams-eqiad for circuit maintenance - [[phab:T288503|T288503]]
* 13:54 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 13:51 jayme: uncordoned kubestage2001
* 13:50 jiji@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:49 mutante: mw2264 - scap pulled and repooled after [[phab:T290242|T290242]]
* 13:49 dzahn@cumin1001: conftool action : set/pooled=yes; selector: name=mw2264.codfw.wmnet
* 13:43 jiji@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:40 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc2010.codfw.wmnet
* 13:25 mvernon@cumin1001: START - Cookbook sre.hosts.decommission for hosts pc2010.codfw.wmnet
* 13:21 Emperor: removing pc2009 from orchestrator [[phab:T289116|T289116]]
* 13:21 Emperor: removing pc2009 from tendril and zarcillo [[phab:T289116|T289116]]
* 13:02 marostegui@cumin1001: dbctl commit (dc=all): 'fix s8 weights [[phab:T288594|T288594]]', diff saved to https://phabricator.wikimedia.org/P17248 and previous config saved to /var/cache/conftool/dbconfig/20210907-130244-marostegui.json
* 12:59 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc2009.codfw.wmnet
* 12:51 mvernon@deploy1002: Synchronized wmf-config/ProductionServices.php: Remove old decommissioned pc hosts [[phab:T284825|T284825]] (duration: 01m 02s)
* 12:45 mvernon@cumin1001: START - Cookbook sre.hosts.decommission for hosts pc2009.codfw.wmnet
* 12:27 marostegui@cumin1001: dbctl commit (dc=all): 'fix s1 weights [[phab:T288594|T288594]]', diff saved to https://phabricator.wikimedia.org/P17247 and previous config saved to /var/cache/conftool/dbconfig/20210907-122747-marostegui.json
* 12:27 marostegui@cumin1001: dbctl commit (dc=all): 'fix s1 weights [[phab:T288594|T288594]]', diff saved to https://phabricator.wikimedia.org/P17246 and previous config saved to /var/cache/conftool/dbconfig/20210907-122708-marostegui.json
* 11:46 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 6 hosts
* 11:46 btullis@cumin1001: START - Cookbook sre.hosts.remove-downtime for 6 hosts
* 11:36 awight: EU backport complete
* 11:33 awight@deploy1002: Synchronized php-1.37.0-wmf.21/extensions/CodeMirror/extension.json: Backport: [[gerrit:719170{{!}}Change line numbers default to null (T290226)]] (duration: 00m 59s)
* 11:28 awight@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:717192{{!}}Set template namespace for code mirror line numbering (T290226)]] (duration: 00m 59s)
* 10:51 Emperor: removing pc2008 from orchestrator [[phab:T289115|T289115]]
* 10:49 Emperor: removing pc2008 from tendril and zarcillo [[phab:T289115|T289115]]
* 10:46 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc2008.codfw.wmnet
* 10:35 mvernon@cumin1001: START - Cookbook sre.hosts.decommission for hosts pc2008.codfw.wmnet
* 10:29 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on 6 hosts with reason: commissioning aqs_new hosts
* 10:29 btullis@cumin1001: START - Cookbook sre.hosts.downtime for 5 days, 0:00:00 on 6 hosts with reason: commissioning aqs_new hosts
* 10:29 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on aqs1010.eqiad.wmnet with reason: commissioning aqs_new hosts
* 10:29 btullis@cumin1001: START - Cookbook sre.hosts.downtime for 5 days, 0:00:00 on aqs1010.eqiad.wmnet with reason: commissioning aqs_new hosts
* 10:27 Emperor: removing pc1010 from orchestrator [[phab:T289122|T289122]]
* 10:22 Emperor: removing pc1010 from tendril and zarcillo [[phab:T289122|T289122]]
* 10:15 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1010.eqiad.wmnet
* 10:02 mvernon@cumin1001: START - Cookbook sre.hosts.decommission for hosts pc1010.eqiad.wmnet
* 09:46 Emperor: removing pc1009 from orchestrator [[phab:T289120|T289120]]
* 09:26 Emperor: removing pc1009 from tendril and zarcillo [[phab:T289120|T289120]]
* 09:25 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1009.eqiad.wmnet
* 09:16 mvernon@cumin1001: START - Cookbook sre.hosts.decommission for hosts pc1009.eqiad.wmnet
* 08:57 elukey@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:53 elukey@cumin1001: START - Cookbook sre.dns.netbox
* 08:51 Emperor: removing pc1008 from orchestrator [[phab:T289119|T289119]]
* 08:44 Emperor: removing pc1008 from tendril and zarcillo [[phab:T289119|T289119]]
* 08:42 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1008.eqiad.wmnet
* 08:31 mvernon@cumin1001: START - Cookbook sre.hosts.decommission for hosts pc1008.eqiad.wmnet
* 08:29 marostegui@cumin1001: dbctl commit (dc=all): 'More weight for db2090 into API [[phab:T288803|T288803]]', diff saved to https://phabricator.wikimedia.org/P17241 and previous config saved to /var/cache/conftool/dbconfig/20210907-082952-marostegui.json
* 08:25 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 08:25 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 08:25 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 08:24 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 08:02 marostegui@cumin1001: dbctl commit (dc=all): 'db2090 (re)pooling @ 100%: Slowly repool [[phab:T288803|T288803]]', diff saved to https://phabricator.wikimedia.org/P17240 and previous config saved to /var/cache/conftool/dbconfig/20210907-080230-root.json
* 07:52 kormat@cumin1001: dbctl commit (dc=all): 'db2118 (re)pooling @ 100%: reimage to buster (now with fixed pool config) [[phab:T288244|T288244]]', diff saved to https://phabricator.wikimedia.org/P17239 and previous config saved to /var/cache/conftool/dbconfig/20210907-075235-kormat.json
* 07:49 marostegui@cumin1001: dbctl commit (dc=all): 'More weight for db2090 into API [[phab:T288803|T288803]]', diff saved to https://phabricator.wikimedia.org/P17238 and previous config saved to /var/cache/conftool/dbconfig/20210907-074901-marostegui.json
* 07:47 marostegui@cumin1001: dbctl commit (dc=all): 'db2090 (re)pooling @ 75%: Slowly repool [[phab:T288803|T288803]]', diff saved to https://phabricator.wikimedia.org/P17237 and previous config saved to /var/cache/conftool/dbconfig/20210907-074726-root.json
* 07:37 kormat@cumin1001: dbctl commit (dc=all): 'db2118 (re)pooling @ 75%: reimage to buster (now with fixed pool config) [[phab:T288244|T288244]]', diff saved to https://phabricator.wikimedia.org/P17236 and previous config saved to /var/cache/conftool/dbconfig/20210907-073731-kormat.json
* 07:37 godog: +100G for prometheus/k8s codfw
* 07:34 marostegui@cumin1001: dbctl commit (dc=all): 'Start to pool db2090 into API [[phab:T288803|T288803]]', diff saved to https://phabricator.wikimedia.org/P17235 and previous config saved to /var/cache/conftool/dbconfig/20210907-073436-marostegui.json
* 07:32 marostegui@cumin1001: dbctl commit (dc=all): 'db2090 (re)pooling @ 50%: Slowly repool [[phab:T288803|T288803]]', diff saved to https://phabricator.wikimedia.org/P17234 and previous config saved to /var/cache/conftool/dbconfig/20210907-073222-root.json
* 07:22 kormat@cumin1001: dbctl commit (dc=all): 'db2118 (re)pooling @ 50%: reimage to buster (now with fixed pool config) [[phab:T288244|T288244]]', diff saved to https://phabricator.wikimedia.org/P17233 and previous config saved to /var/cache/conftool/dbconfig/20210907-072227-kormat.json
* 07:17 marostegui@cumin1001: dbctl commit (dc=all): 'db2090 (re)pooling @ 25%: Slowly repool [[phab:T288803|T288803]]', diff saved to https://phabricator.wikimedia.org/P17232 and previous config saved to /var/cache/conftool/dbconfig/20210907-071719-root.json
* 07:13 jayme@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'wikifeeds' for release 'production' .
* 07:13 jayme@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'wikifeeds' for release 'staging' .
* 07:07 kormat@cumin1001: dbctl commit (dc=all): 'db2118 (re)pooling @ 25%: reimage to buster (now with fixed pool config) [[phab:T288244|T288244]]', diff saved to https://phabricator.wikimedia.org/P17231 and previous config saved to /var/cache/conftool/dbconfig/20210907-070724-kormat.json
* 07:07 kormat@cumin1001: dbctl commit (dc=all): 'Fixing db2118's pooling config [[phab:T288244|T288244]]', diff saved to https://phabricator.wikimedia.org/P17230 and previous config saved to /var/cache/conftool/dbconfig/20210907-070702-kormat.json
* 07:02 marostegui@cumin1001: dbctl commit (dc=all): 'db2090 (re)pooling @ 10%: Slowly repool [[phab:T288803|T288803]]', diff saved to https://phabricator.wikimedia.org/P17229 and previous config saved to /var/cache/conftool/dbconfig/20210907-070215-root.json
* 06:47 marostegui@cumin1001: dbctl commit (dc=all): 'db2090 (re)pooling @ 5%: Slowly repool [[phab:T288803|T288803]]', diff saved to https://phabricator.wikimedia.org/P17228 and previous config saved to /var/cache/conftool/dbconfig/20210907-064711-root.json
* 05:15 marostegui: Optimize eowiki.flaggedtemplates in eqiad [[phab:T290057|T290057]]
* 05:15 marostegui: Optimize vecwiki.flaggedtemplates in eqiad [[phab:T290057|T290057]]
* 05:14 marostegui: Optimize kawiki.flaggedtemplates in eqiad [[phab:T290057|T290057]]


== 2015-07-25 ==
== 2021-09-06 ==
* 20:51 gwicke: rolling restart of restbase instances
* 23:52 tstarling@deploy1002: Synchronized php-1.37.0-wmf.21/extensions/SecurePoll/includes/Talliers/STVTallier.php: [[phab:T290000|T290000]] (duration: 00m 58s)
* 16:53 logmsgbot: jynus Synchronized wmf-config/db-eqiad.php: Repool db1035 at 100% capacity (duration: 00m 40s)
* 16:14 Amir1: Deployed patch for [[phab:T290394|T290394]]
* 16:30 _joe_: repooling mw1159,mw1160
* 15:01 Emperor: removing pc1007 from orchestrator [[phab:T289118|T289118]]
* 14:33 logmsgbot: jynus Synchronized wmf-config/db-eqiad.php: Repool db1035 with lower weight (duration: 00m 13s)
* 15:00 jiji@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:57 logmsgbot: jynus Synchronized wmf-config/db-eqiad.php: Depool db1035 (duration: 00m 12s)
* 14:53 kormat@cumin1001: dbctl commit (dc=all): 'db2118 (re)pooling @ 25%: reimage to buster [[phab:T288244|T288244]]', diff saved to https://phabricator.wikimedia.org/P17226 and previous config saved to /var/cache/conftool/dbconfig/20210906-145341-kormat.json
* 13:56 logmsgbot: jynus Synchronized wmf-config/db-eqiad.php: Depool db1035 (duration: 00m 12s)
* 14:50 Emperor: removing pc1007 from tendril and zarcillo [[phab:T289118|T289118]]
* 13:42 jynus: db1035 restarted, temporarily increasing db error rates on s3
* 14:45 mvernon@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts pc1007.eqiad.wmnet
* 07:05 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Jul 25 07:05:08 UTC 2015 (duration 5m 7s)
* 14:45 volans@cumin1001: END (ERROR) - Cookbook sre.hosts.decommission (exit_code=97) for hosts mc1026.eqiad.wmnet
* 02:41 logmsgbot: LocalisationUpdate completed (1.26wmf15) at 2015-07-25 02:41:09+00:00
* 14:44 volans@cumin1001: START - Cookbook sre.hosts.decommission for hosts mc1026.eqiad.wmnet
* 02:35 logmsgbot: l10nupdate Synchronized php-1.26wmf15/cache/l10n: (no message) (duration: 09m 52s)
* 14:36 volans@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts mc1027.eqiad.wmnet
* 02:08 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Jul 25 02:08:04 UTC 2015 (duration 8m 3s)
* 14:35 mvernon@cumin1001: START - Cookbook sre.hosts.decommission for hosts pc1007.eqiad.wmnet
* 02:03 logmsgbot: LocalisationUpdate failed (1.26wmf15) at 2015-07-25 02:03:54+00:00
* 14:22 volans@cumin1001: START - Cookbook sre.hosts.decommission for hosts mc1027.eqiad.wmnet
* 14:19 ladsgroup@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:715492{{!}}Set permission of creating short url to everyone everywhere (T267921 T267925)]], Part II (duration: 00m 57s)
* 14:17 ladsgroup@deploy1002: Synchronized wmf-config/CommonSettings.php: Config: [[gerrit:715492{{!}}Set permission of creating short url to everyone everywhere (T267921 T267925)]], Part I (duration: 00m 59s)
* 14:12 moritzm: installing postgres 9.6 security updates
* 14:05 gehel: re-pooling wdqs1007, caught up on lag
* 13:56 jbond: update facter networking fact gerrit:715949
* 13:51 jiji@deploy1002: Synchronized wmf-config/ProductionServices.php: Config: [[gerrit:719118{{!}}ProductionServices: fix comment for rdb* servers]] (duration: 00m 58s)
* 13:42 moritzm: updated thirdparty/gitlab component to 14.0.10 [[phab:T284811|T284811]]
* 13:04 jiji@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:42 jiji@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:42 akosiaris@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 12:42 akosiaris@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 12:41 akosiaris@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'sync'.
* 12:40 akosiaris@deploy1002: helmfile [codfw] START helmfile.d/admin 'sync'.
* 12:29 jiji@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:06 godog: silence statograph until thurs on alert1001 - [[phab:T290425|T290425]]
* 11:58 urbanecm: [urbanecm@mwmaint2002 ~]$ mwscript renameRestrictions.php --wiki=plwiki 'editor' 'editeditorprotected' # [[phab:T230103|T230103]]
* 11:56 urbanecm: [urbanecm@mwmaint2002 ~]$ mwscript renameRestrictions.php --wiki=<nowiki>{</nowiki>hewiki,lvwiki,srwiki,srwikibooks<nowiki>}</nowiki> 'autopatrol' 'editautopatrolprotected' # [[phab:T230103|T230103]]
* 11:53 urbanecm: [urbanecm@mwmaint2002 ~]$ mwscript renameRestrictions.php --wiki=etwiki 'autopatrol' 'editautopatrolprotected' # [[phab:T230103|T230103]]
* 11:50 urbanecm: [urbanecm@mwmaint2002 ~]$ mwscript renameRestrictions.php --wiki=dewiktionary 'autoreviewprotected' 'editautoreviewprotected' # [[phab:T230103|T230103]]
* 11:48 urbanecm: [urbanecm@mwmaint2002 ~]$ mwscript renameRestrictions.php --wiki=arwiki 'autoreview' 'editautoreviewprotected' # [[phab:T230103|T230103]]
* 11:07 urbanecm: EU B&C window done
* 11:06 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|c8d7cf8f7c3faaf3773940e96ba0cf599e725237}}: foundationwiki: Create editor group ([[phab:T205352|T205352]]) (duration: 00m 57s)
* 11:04 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|f90862be8c7b540065da24c24f2e2ac0df5b9d07}}: Growth: Define wgGEMentorDashboardDiscoveryEnabled ([[phab:T289054|T289054]]) (duration: 00m 58s)
* 11:02 urbanecm@deploy1002: Synchronized php-1.37.0-wmf.21/maintenance/renameRestrictions.php: {{Gerrit|18e43ecca7d25d2d93de2f98f3bf5b36f5d4b780}}: renameRestrictions.php: Update protected_titles as well ([[phab:T290398|T290398]]) (duration: 00m 59s)
* 10:39 volans@cumin1001: END (ERROR) - Cookbook sre.hosts.decommission (exit_code=97) for hosts mc1027.eqiad.wmnet
* 10:38 volans@cumin1001: START - Cookbook sre.hosts.decommission for hosts mc1027.eqiad.wmnet
* 10:22 volans@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts mc1027.eqiad.wmnet
* 10:17 volans@cumin1001: START - Cookbook sre.hosts.decommission for hosts mc1027.eqiad.wmnet
* 09:22 gehel: depooling wdqs1007, catching up on lag
* 09:06 gehel: restart blazegraph and updater on wdqs1007
* 08:46 jbond: update networking fact - gerrit:715943
* 07:57 godog: fail sdw on ms-be1062, reported errors
* 07:51 moritzm: installing libssh security updates
* 07:45 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 07:45 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 07:44 moritzm: installing squashfs-tools security updates
* 06:56 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 06:56 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 06:28 marostegui: Optimize table mkwiki.flaggedtemplates in eqiad [[phab:T290057|T290057]]
* 06:26 marostegui: Optimize table bewiki.flaggedtemplates in eqiad [[phab:T290057|T290057]]
* 06:23 marostegui: Optimize table dewiki.flaggedtemplates in eqiad [[phab:T290057|T290057]]
* 05:34 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2090.codfw.wmnet with reason: REIMAGE
* 05:32 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2090.codfw.wmnet with reason: REIMAGE
* 05:07 marostegui: Stop replication on db2090 (old s4 master) [[phab:T289650|T289650]] [[phab:T288803|T288803]]
* 05:05 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2110 (current master) from API [[phab:T289650|T289650]]', diff saved to https://phabricator.wikimedia.org/P17223 and previous config saved to /var/cache/conftool/dbconfig/20210906-050502-marostegui.json
* 05:04 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2090 [[phab:T289650|T289650]]', diff saved to https://phabricator.wikimedia.org/P17222 and previous config saved to /var/cache/conftool/dbconfig/20210906-050419-marostegui.json
* 05:01 marostegui@cumin1001: dbctl commit (dc=all): 'Promote db2110 to s4 primary and set section read-write [[phab:T289650|T289650]]', diff saved to https://phabricator.wikimedia.org/P17221 and previous config saved to /var/cache/conftool/dbconfig/20210906-050140-root.json
* 05:00 marostegui@cumin1001: dbctl commit (dc=all): 'Set s4 codfw as read-only for maintenance - [[phab:T289650|T289650]]', diff saved to https://phabricator.wikimedia.org/P17220 and previous config saved to /var/cache/conftool/dbconfig/20210906-050048-root.json
* 05:00 marostegui: Starting s4 codfw failover from db2090 to db2110 - [[phab:T289650|T289650]]
* 04:07 marostegui@cumin1001: dbctl commit (dc=all): 'Set db2110 with weight 0 [[phab:T289650|T289650]]', diff saved to https://phabricator.wikimedia.org/P17219 and previous config saved to /var/cache/conftool/dbconfig/20210906-040740-root.json
* 04:07 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on 33 hosts with reason: Primary switchover s4 [[phab:T289650|T289650]]
* 04:06 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on 33 hosts with reason: Primary switchover s4 [[phab:T289650|T289650]]


== 2015-07-24 ==
== 2021-09-05 ==
* 21:57 legoktm: running mwscript populateContentModel.php --wiki=enwiki --ns=all --table=page
* 18:54 urbanecm: wikiadmin@10.192.0.119(ptwiki)> update protected_titles set pt_create_perm='editautoreviewprotected' where pt_create_perm='autoreviewer'; # [[phab:T290396|T290396]]
* 20:36 logmsgbot: krenair Synchronized php-1.26wmf15/extensions/VisualEditor/modules/ve-mw/ui: https://gerrit.wikimedia.org/r/#/c/226907/ (duration: 00m 12s)
* 19:40 awight: updated DjangoBannerStats from 3db799dc8705c728c7261ae433e8197f5498fa1b to 57a0392b3f43b65050b01a0465e120ed609a769e
* 19:08 YuviPanda: remove others20150724183453 on labstore1002
* 18:39 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: Ib7c7861e: Point to a no-op /beacon URL rather than Special:RecordImpression (duration: 00m 12s)
* 18:38 ori: Merging Ib7c7861e: Point to a no-op /beacon URL rather than Special:RecordImpression
* 18:30 ori: Depooled Precise image scalers (mw1159 and mw1160)
* 18:29 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: Idfe1fa60: testwiki: Point to a no-op /beacon URL rather than Special:RecordImpression (duration: 00m 12s)
* 18:17 YuviPanda: removed labstore/others20150724 on labstore1002
* 18:15 YuviPanda: running others20150724 on labstore1002
* 16:51 bd808: Upgraded logstash1006 to elasticsearch 1.7.0
* 16:48 bd808: Upgraded logstash1005 to elasticsearch 1.7.0
* 16:36 bd808: Upgraded logstash1004 to elasticsearch 1.7.0
* 16:27 bd808: Upgraded logstash1003 to elasticsearch 1.7.0
* 16:26 bd808: Upgraded logstash1002 to elasticsearch 1.7.0
* 16:25 bd808: Upgraded logstash1001 to elasticsearch 1.7.0
* 13:44 cmjohnson1: swapping failed disk db1058
* 13:11 cmjohnson1: swapping ssds in restbase1007
* 12:47 hashar: restarting Jenkins
* 12:47 hashar: Jenkins: switching gearman plugin from our custom compiled 0.1.1-9-g08e9c42-change_192429_2 to upstream 0.1.2. They are actually the exact same versions.
* 10:23 logmsgbot: legoktm Synchronized php-1.26wmf15/extensions/AbuseFilter/: Special:AbuseFilter on all large Wikipedias is returning errors - T106798 (duration: 00m 13s)
* 08:40 hashar: upgrading zuul to zuul_2.0.0-327-g3ebedde-wmf3precise1 to fix a regression ( https://phabricator.wikimedia.org/T106531 )
* 05:53 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Jul 24 05:53:16 UTC 2015 (duration 53m 15s)
* 05:52 Krinkle: Added rl-test.php on testwiki (mw1017) to gather stats about cache-control rollover (Catrope, Krinkle). Used by testwiki/test2wiki/mediawikiwiki Common.js (sampled). See T105255.
* 02:29 logmsgbot: LocalisationUpdate completed (1.26wmf15) at 2015-07-24 02:29:25+00:00
* 02:26 urandom: restarting restbase on restbase1006
* 02:25 logmsgbot: l10nupdate Synchronized php-1.26wmf15/cache/l10n: (no message) (duration: 07m 12s)
* 02:06 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Jul 24 02:06:41 UTC 2015 (duration 6m 40s)
* 02:02 logmsgbot: LocalisationUpdate failed (1.26wmf15) at 2015-07-24 02:02:31+00:00
* 00:21 ori: Re-enabled Puppet on mw1153


== 2015-07-23 ==
== 2021-09-04 ==
* 23:31 logmsgbot: catrope Synchronized php-1.26wmf15/extensions/WikimediaEvents: SWAT (duration: 00m 12s)
* 13:35 marostegui@cumin1001: dbctl commit (dc=all): 'db2137:3314 (re)pooling @ 100%: Slowly repool [[phab:T290374|T290374]]', diff saved to https://phabricator.wikimedia.org/P17217 and previous config saved to /var/cache/conftool/dbconfig/20210904-133532-root.json
* 23:31 logmsgbot: catrope Synchronized php-1.26wmf15/extensions/CirrusSearch: SWAT (duration: 00m 12s)
* 13:20 marostegui@cumin1001: dbctl commit (dc=all): 'db2137:3314 (re)pooling @ 75%: Slowly repool [[phab:T290374|T290374]]', diff saved to https://phabricator.wikimedia.org/P17216 and previous config saved to /var/cache/conftool/dbconfig/20210904-132029-root.json
* 23:30 logmsgbot: catrope Synchronized php-1.26wmf14/extensions/WikimediaEvents: SWAT (duration: 00m 12s)
* 13:05 marostegui@cumin1001: dbctl commit (dc=all): 'db2137:3314 (re)pooling @ 50%: Slowly repool [[phab:T290374|T290374]]', diff saved to https://phabricator.wikimedia.org/P17215 and previous config saved to /var/cache/conftool/dbconfig/20210904-130525-root.json
* 23:30 logmsgbot: catrope Synchronized php-1.26wmf14/extensions/CirrusSearch: SWAT (duration: 00m 13s)
* 12:50 marostegui@cumin1001: dbctl commit (dc=all): 'db2137:3314 (re)pooling @ 25%: Slowly repool [[phab:T290374|T290374]]', diff saved to https://phabricator.wikimedia.org/P17214 and previous config saved to /var/cache/conftool/dbconfig/20210904-125021-root.json
* 23:16 logmsgbot: catrope Synchronized flow.dblist: Enable Flow on viwiki (duration: 00m 12s)
* 12:35 marostegui@cumin1001: dbctl commit (dc=all): 'db2137:3314 (re)pooling @ 10%: Slowly repool [[phab:T290374|T290374]]', diff saved to https://phabricator.wikimedia.org/P17213 and previous config saved to /var/cache/conftool/dbconfig/20210904-123518-root.json
* 23:14 logmsgbot: catrope Synchronized wmf-config/: SWAT (duration: 00m 11s)
* 12:20 marostegui@cumin1001: dbctl commit (dc=all): 'db2137:3314 (re)pooling @ 5%: Slowly repool [[phab:T290374|T290374]]', diff saved to https://phabricator.wikimedia.org/P17212 and previous config saved to /var/cache/conftool/dbconfig/20210904-122014-root.json
* 23:14 logmsgbot: catrope Synchronized w/static/images/: SWAT (duration: 00m 12s)
* 09:04 elukey: restart wmf_auto_restart_rsyslog.service on puppetdb1002
* 23:11 ori: Restarting Apache on mw1153
* 09:00 elukey: `systemctl reset-failed ifup@ens6.service` on puppetdb2002 - [[phab:T273026|T273026]]
* 23:09 ori: T84842: Requests to thumb_handler.php/.* don't match the ProxyPass rule and get handled by Zend instead. To see how HHVM actually handles these requests, I'm disabling Puppet on mw1153 and dropping the '$' anchor from the ProxyPass rules.
* 03:02 rzl@cumin2001: dbctl commit (dc=all): 'Depool db2137:3314', diff saved to https://phabricator.wikimedia.org/P17210 and previous config saved to /var/cache/conftool/dbconfig/20210904-030231-rzl.json
* 23:02 logmsgbot: catrope Synchronized wmf-config/InitialiseSettings.php: Enable geo feature usage tracking on all wikis (duration: 00m 12s)
* 21:19 hashar: is already a nice improvement
* 20:33 twentyafterfour: deployed hotfix for T106716, restarted apache on iridium
* 18:46 logmsgbot: catrope Synchronized php-1.26wmf15/resources/src/mediawiki.less/mediawiki.ui/mixins.less: Unbreak quiet button styles (duration: 00m 13s)
* 18:10 logmsgbot: twentyafterfour rebuilt wikiversions.cdb and synchronized wikiversions files: all wikis to 1.26wmf15
* 17:56 logmsgbot: jynus Synchronized wmf-config/db-codfw.php: Repooling es2004 after hardware maintenance (duration: 00m 11s)
* 17:56 logmsgbot: jynus Synchronized wmf-config/db-eqiad.php: Repooling es2004 after hardware maintenance (duration: 00m 12s)
* 17:38 legoktm: running foreachwikiindblist /home/legoktm/largebutnotenwiki.dblist populateContentModel.php --ns=all --table=page
* 16:27 ori: restarted hhvm on mw1221
* 16:16 logmsgbot: thcipriani Finished scap: SWAT: Add azb interwiki sorting, Add Southern Luri, and Fix name of S and W Balochi (duration: 06m 13s)
* 16:14 urandom: restarting Cassandra on restbase1001 to (temporarily) enable GC logging
* 16:10 logmsgbot: thcipriani Started scap: SWAT: Add azb interwiki sorting, Add Southern Luri, and Fix name of S and W Balochi
* 15:38 moritzm: added jenkins-debian-glue 0.13.0 to apt.wikimedia.org (jessie-wikimedia)
* 15:35 logmsgbot: thcipriani Synchronized wmf-config/InitialiseSettings.php: SWAT: fix references to non-existent wikis [[gerrit:226470]] (duration: 00m 13s)
* 15:31 _joe_: rebooting ms-be1003, stuck in kernel locks
* 15:31 logmsgbot: thcipriani Synchronized wmf-config/InitialiseSettings.php: SWAT: Remove reference to nonexistent ru_sibwiki.png [[gerrit:226469]] (duration: 00m 14s)
* 15:26 logmsgbot: thcipriani Synchronized wmf-config/InitialiseSettings.php: SWAT: Add wgSitename and wgMetaNamespace for pnbwiki [[gerrit:226543]] (duration: 00m 12s)
* 15:15 logmsgbot: thcipriani Synchronized wmf-config/CommonSettings.php: SWAT: Set a different wmgContentTranslationDefaultSourceLanguage for English part II [[gerrit:224031]] (duration: 00m 12s)
* 15:14 logmsgbot: thcipriani Synchronized wmf-config/InitialiseSettings.php: SWAT: Set a different wmgContentTranslationDefaultSourceLanguage for English part I [[gerrit:224031]] (duration: 00m 13s)
* 15:04 logmsgbot: thcipriani Synchronized wmf-config/InitialiseSettings.php: SWAT: Add wgSitename and wgMetaNamespace for pnbwikipedia [[gerrit:225322]] (duration: 00m 12s)
* 13:08 mobrovac: graphoid deploying 81b9633
* 10:56 jynus: disabling puppet on maps-test hosts to debug service issue
* 07:28 _joe_: upgrading hhvm on the canary appservers
* 06:59 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Jul 23 06:59:44 UTC 2015 (duration 59m 43s)
* 06:42 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1070, warm up (duration: 00m 13s)
* 04:25 logmsgbot: ori Synchronized php-1.26wmf15/extensions/Scribunto/common/Base.php: (no message) (duration: 00m 13s)
* 04:24 logmsgbot: ori Synchronized php-1.26wmf14/extensions/Scribunto/common/Base.php: (no message) (duration: 00m 12s)
* 04:04 springle: upgrade & reboot db1070
* 03:04 logmsgbot: LocalisationUpdate completed (1.26wmf15) at 2015-07-23 03:04:48+00:00
* 03:00 logmsgbot: l10nupdate Synchronized php-1.26wmf15/cache/l10n: (no message) (duration: 07m 24s)
* 02:39 springle: temporarily silenced backup4001 check_disk space icinga noise; seems important, but not exploding-any-minute-now
* 02:37 logmsgbot: LocalisationUpdate completed (1.26wmf14) at 2015-07-23 02:37:55+00:00
* 02:34 logmsgbot: l10nupdate Synchronized php-1.26wmf14/cache/l10n: (no message) (duration: 07m 13s)
* 02:07 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Jul 23 02:07:12 UTC 2015 (duration 7m 11s)
* 02:05 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1070 (duration: 00m 12s)
* 02:03 logmsgbot: LocalisationUpdate failed (1.26wmf15) at 2015-07-23 02:03:03+00:00
* 02:03 logmsgbot: LocalisationUpdate failed (1.26wmf14) at 2015-07-23 02:03:02+00:00
* 01:45 logmsgbot: ori Synchronized php-1.26wmf15/includes/libs/objectcache/APCBagOStuff.php: I4b2cf1715538 (duration: 00m 12s)
* 01:45 logmsgbot: ori Synchronized php-1.26wmf14/includes/libs/objectcache/APCBagOStuff.php: I4b2cf1715538 (duration: 00m 12s)
* 01:05 twentyafterfour: phab is back
* 01:03 logmsgbot: ori Synchronized php-1.26wmf14/includes/libs/objectcache/APCBagOStuff.php: I4b2cf1715 (duration: 00m 12s)
* 01:01 legoktm: twentyafterfour is upgrading phabricator
* 00:50 yurik: deployed kartotherian fix, still not starting as a service, and no idea why. Have no access to logs. Frustrated.
* 00:46 logmsgbot: krenair Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/225515/ (duration: 00m 12s)
* 00:23 logmsgbot: krenair Synchronized wmf-config/InitialiseSettings.php: fix extra dollar mark in https://gerrit.wikimedia.org/r/#/c/226336/1/wmf-config/InitialiseSettings.php (duration: 00m 12s)
* 00:02 logmsgbot: krenair Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/225541/ (duration: 00m 13s)
* 00:02 logmsgbot: krenair Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/225541/ (duration: 00m 12s)


== 2015-07-22 ==
== 2021-09-03 ==
* 23:56 cwdent: updated civicrm from 292ad137f6b3ffc818a3bd617ca4f335931091f3 to 83cacfa1e0852ffaf47d2f02e7d843cf6f3bcda4
* 21:49 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 23:55 logmsgbot: krenair Synchronized wmf-config/InitialiseSettings.php: re-try reverted portion of https://gerrit.wikimedia.org/r/#/c/118654/ using NS IDs instead of not-necessarily-defined constants which were causing warning flood (duration: 00m 13s)
* 20:30 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 23:51 logmsgbot: krenair Synchronized wmf-config/InitialiseSettings.php: partially revert https://gerrit.wikimedia.org/r/#/c/118654/ (duration: 00m 12s)
* 19:33 krinkle@deploy1002: Finished deploy [integration/docroot@6492b3d]: {{Gerrit|I48480e89e5f6}} (duration: 00m 10s)
* 23:47 logmsgbot: krenair Synchronized wmf-config/InitialiseSettings.php: https://wikitech.wikimedia.org/w/index.php?title=Deployments&diff=171578&oldid=171570 (duration: 00m 12s)
* 19:33 krinkle@deploy1002: Started deploy [integration/docroot@6492b3d]: {{Gerrit|I48480e89e5f6}}
* 23:47 logmsgbot: krenair Synchronized wmf-config/CommonSettings.php: https://wikitech.wikimedia.org/w/index.php?title=Deployments&diff=171578&oldid=171570 (duration: 00m 12s)
* 19:26 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 23:40 yurik: deployed kartotherian
* 19:04 ryankemper: [[phab:T290330|T290330]] `ryankemper@cumin1001:~$ sudo -E cumin 'P<nowiki>{</nowiki>wdqs2*<nowiki>}</nowiki>' 'sudo rm -fv /etc/cron.hourly/restart-blazegraph'` (Cleaned up manually created crons now that we have [somewhat hacky] systemd timers doing the same job)
* 23:24 logmsgbot: krenair Synchronized wmf-config/InitialiseSettings-labs.php: https://gerrit.wikimedia.org/r/#/c/224393/ (duration: 00m 12s)
* 17:42 dduvall@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'blubberoid' for release 'production' .
* 23:24 logmsgbot: krenair Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/224393/ (duration: 00m 13s)
* 17:40 dduvall@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'blubberoid' for release 'production' .
* 23:19 logmsgbot: krenair Synchronized php-1.26wmf15/extensions/VisualEditor: https://gerrit.wikimedia.org/r/#/c/226447/ (duration: 00m 13s)
* 17:35 dduvall@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'blubberoid' for release 'staging' .
* 22:52 Reedy: populateSitesTable.php finished
* 17:17 ryankemper: [[phab:T290330|T290330]] Deployed https://gerrit.wikimedia.org/r/c/operations/puppet/+/717508 across `wdqs` fleet; codfw wdqs hosts will restart on average once per hour now to address ongoing availability issues for wdqs codfw
* 22:09 Reedy: running in screen as reedy on tin foreachwikiindblist wikidataclient.dblist extensions/Wikidata/extensions/Wikibase/lib/maintenance/populateSitesTable.php --force-protocol https
* 16:32 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 22:09 logmsgbot: reedy Synchronized database lists: Add azbwiki to wikidataclient.dblist (duration: 00m 11s)
* 16:10 gehel: blazegraph (public codfw cluster) will now restart every hour - [[phab:T290330|T290330]]
* 20:55 cscott: updated Parsoid to version 6befc44e
* 15:53 jbond: enable puppet fleet wide post puppetdb database maintenance - [[phab:T263578|T263578]]
* 20:26 logmsgbot: twentyafterfour Synchronized php-1.26wmf15/includes/libs/MultiHttpClient.php: Deploy https://gerrit.wikimedia.org/r/#/c/226388/ (duration: 00m 12s)
* 15:21 jbond: create lvm snapshot puppetdb2002_data_snapshot on ganeti2023 - [[phab:T263578|T263578]]
* 19:57 legoktm: re-attributed edits to User:Mirwin~enwiki (T106069)
* 15:17 jbond: create lvm snapshot puppetdb1002_data_snapshot on ganeti1012 - [[phab:T263578|T263578]]
* 19:34 logmsgbot: demon Finished scap: azbwiki namespace stuff (duration: 42m 57s)
* 15:00 jbond: disable puppet fleet wide to perform puppetdb database maintenance - [[phab:T263578|T263578]]
* 19:30 moritzm: updated remaining Ubuntu systems for openssl/export grade update
* 14:58 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 18:51 logmsgbot: demon Started scap: azbwiki namespace stuff
* 14:58 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 18:49 logmsgbot: demon Synchronized wmf-config/interwiki.cdb: Updating interwiki cache (duration: 00m 13s)
* 14:35 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:48 logmsgbot: demon Synchronized langlist: azbwiki++ (duration: 00m 12s)
* 14:29 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 18:48 logmsgbot: demon Synchronized wmf-config/InitialiseSettings.php: azbwiki++ (duration: 00m 12s)
* 14:20 mutante: mw2264 - scap pull
* 18:47 logmsgbot: demon Synchronized w/static/images/project-logos/azbwiki.png: azbwiki++ (duration: 00m 12s)
* 14:18 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 18:45 logmsgbot: demon rebuilt wikiversions.cdb and synchronized wikiversions files: azbwiki++
* 14:18 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 18:44 logmsgbot: demon Synchronized database lists: azbwiki++ (duration: 00m 13s)
* 13:11 jiji@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts mc1027.eqiad.wmnet
* 18:18 legoktm: running populateContentModel.php --ns=all --table=page on all medium wikis
* 13:10 dcausse: installing openjdk-8-dbg on wdqs2007
* 18:08 logmsgbot: twentyafterfour rebuilt wikiversions.cdb and synchronized wikiversions files: group1 wikis to 1.26wmf15
* 13:04 jiji@cumin1001: START - Cookbook sre.hosts.decommission for hosts mc1027.eqiad.wmnet
* 18:08 logmsgbot: twentyafterfour Synchronized php-1.26wmf15/extensions/MobileFrontend/includes/MobileFrontend.hooks.php: deploy https://gerrit.wikimedia.org/r/#/c/226313/ (duration: 00m 13s)
* 13:02 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1023.eqiad.wmnet
* 16:03 _joe_: installed the hhvm 3.6.5 on deployment-prep
* 12:48 jiji@cumin1001: START - Cookbook sre.hosts.decommission for hosts mc1023.eqiad.wmnet
* 15:52 _joe_: uploaded hhvm_3.6.5+dfsg1-1+wm1 to reprepro
* 12:46 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc[1035-1036].eqiad.wmnet
* 15:47 logmsgbot: thcipriani Synchronized w/static/images/project-logos/lrcwiki.png: SWAT: Update the logo of lrcwiki [[gerrit:220358]] (duration: 00m 13s)
* 12:32 jiji@cumin1001: START - Cookbook sre.hosts.decommission for hosts mc[1035-1036].eqiad.wmnet
* 15:27 logmsgbot: jynus Synchronized wmf-config: removing db-secondary.php (duration: 00m 12s)
* 12:12 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc[1028-1032].eqiad.wmnet
* 15:26 logmsgbot: jynus Synchronized docroot/noc: removing db-secondary.php from the list of symlinks to maintain (duration: 00m 12s)
* 12:03 joal@deploy1002: Finished deploy [analytics/refinery@7208d3d] (thin): Analytics hotfix deploy (bis) THIN [analytics/refinery@7208d3d] (duration: 00m 06s)
* 14:20 hashar: enabling puppet on labnodepool1001.eqiad.wmnet
* 12:03 joal@deploy1002: Started deploy [analytics/refinery@7208d3d] (thin): Analytics hotfix deploy (bis) THIN [analytics/refinery@7208d3d]
* 14:04 moritzm: added cython_0.20.1+git90-g0e6e38e-1ubuntu2~precise1 to precise-wikimedia on carbon (required for activemq backport on precise)
* 12:03 joal@deploy1002: Finished deploy [analytics/refinery@7208d3d]: Analytics hotfix deploy (bis)[analytics/refinery@7208d3d] (duration: 19m 16s)
* 11:37 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: raise db1071 to normal load (duration: 00m 12s)
* 11:56 dcausse@deploy1002: Finished deploy [wdqs/wdqs@8361ac9]: ban queries from a generic UA (duration: 19m 21s)
* 08:03 _joe_: repooling mw1158-60
* 11:44 joal@deploy1002: Started deploy [analytics/refinery@7208d3d]: Analytics hotfix deploy (bis)[analytics/refinery@7208d3d]
* 07:22 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Jul 22 07:22:36 UTC 2015 (duration 22m 35s)
* 11:42 marostegui: Remove flaggedrevs_stats2 and flaggedrevs_stats from enwiki - [[phab:T289050|T289050]]
* 05:22 logmsgbot: ori Synchronized php-1.26wmf14/extensions/Scribunto/common/Base.php: Cherry-pick I53dd1ecb (duration: 00m 13s)
* 11:37 dcausse@deploy1002: Started deploy [wdqs/wdqs@8361ac9]: ban queries from a generic UA
* 05:22 logmsgbot: ori Synchronized php-1.26wmf15/extensions/Scribunto/common/Base.php: Cherry-pick I53dd1ecb (duration: 00m 13s)
* 11:36 dcausse@deploy1002: Finished deploy [wdqs/wdqs@8361ac9]: ban queries from a generic UA (duration: 01m 07s)
* 04:43 logmsgbot: ori Synchronized php-1.26wmf14/extensions/Scribunto/common/Base.php: Revert: Live-hack I53dd1ecb to test impact (duration: 00m 12s)
* 11:35 dcausse@deploy1002: Started deploy [wdqs/wdqs@8361ac9]: ban queries from a generic UA
* 04:35 gwicke: deployed small restbase hotfix d96210f2
* 10:58 jiji@cumin1001: START - Cookbook sre.hosts.decommission for hosts mc[1028-1032].eqiad.wmnet
* 04:28 logmsgbot: ori Synchronized php-1.26wmf14/extensions/Scribunto/common/Base.php: Live-hack I53dd1ecb to test impact (duration: 00m 13s)
* 10:54 jiji@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts mc[1025-1026].eqiad.wmnet
* 04:25 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1071, warm up (duration: 00m 12s)
* 10:47 joal@deploy1002: Finished deploy [analytics/aqs/deploy@d273fde] (aqs-next): Deploy latest code on AQS new servers - test after failures (duration: 00m 32s)
* 04:14 springle: upgrade db1071 trusty
* 10:46 joal@deploy1002: Started deploy [analytics/aqs/deploy@d273fde] (aqs-next): Deploy latest code on AQS new servers - test after failures
* 03:10 logmsgbot: LocalisationUpdate completed (1.26wmf15) at 2015-07-22 03:10:23+00:00
* 10:45 joal@deploy1002: deploy aborted: Deploy latest code on AQS new servers - test after failures (duration: 00m 05s)
* 03:04 logmsgbot: l10nupdate Synchronized php-1.26wmf15/cache/l10n: (no message) (duration: 10m 33s)
* 10:45 joal@deploy1002: Started deploy [analytics/aqs/deploy@d273fde] (aqs-test): Deploy latest code on AQS new servers - test after failures
* 02:52 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1071 (duration: 00m 11s)
* 10:29 hnowlan@deploy1002: Finished deploy [analytics/aqs/deploy@d273fde] (aqs-next): deploying aqs to inactive aqs-next hosts (duration: 00m 03s)
* 02:37 logmsgbot: LocalisationUpdate completed (1.26wmf14) at 2015-07-22 02:37:45+00:00
* 10:29 hnowlan@deploy1002: Started deploy [analytics/aqs/deploy@d273fde] (aqs-next): deploying aqs to inactive aqs-next hosts
* 02:33 logmsgbot: l10nupdate Synchronized php-1.26wmf14/cache/l10n: (no message) (duration: 07m 01s)
* 10:22 hnowlan@deploy1002: Finished deploy [analytics/aqs/deploy@d273fde] (aqs-next): deploying aqs to inactive aqs-next hosts (duration: 00m 55s)
* 02:07 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Jul 22 02:07:33 UTC 2015 (duration 7m 32s)
* 10:21 hnowlan@deploy1002: Started deploy [analytics/aqs/deploy@d273fde] (aqs-next): deploying aqs to inactive aqs-next hosts
* 02:03 logmsgbot: LocalisationUpdate failed (1.26wmf15) at 2015-07-22 02:03:19+00:00
* 10:17 hnowlan@deploy1002: Finished deploy [analytics/aqs/deploy@d273fde] (aqs-next): deploying aqs to inactive aqs-next hosts (duration: 00m 36s)
* 02:03 logmsgbot: LocalisationUpdate failed (1.26wmf14) at 2015-07-22 02:03:18+00:00
* 10:16 hnowlan@deploy1002: Started deploy [analytics/aqs/deploy@d273fde] (aqs-next): deploying aqs to inactive aqs-next hosts
* 10:08 hnowlan@deploy1002: Finished deploy [analytics/aqs/deploy@d273fde] (aqs-next): deploying aqs to inactive aqs-next hosts (duration: 00m 45s)
* 10:08 hnowlan@deploy1002: Started deploy [analytics/aqs/deploy@d273fde] (aqs-next): deploying aqs to inactive aqs-next hosts
* 10:05 hnowlan@deploy1002: Finished deploy [analytics/aqs/deploy@d273fde] (aqs-next): deploying aqs to inactive aqs-next hosts (duration: 00m 36s)
* 10:04 hnowlan@deploy1002: Started deploy [analytics/aqs/deploy@d273fde] (aqs-next): deploying aqs to inactive aqs-next hosts
* 10:02 hnowlan@deploy1002: Finished deploy [analytics/aqs/deploy@d273fde] (aqs-next): deploying aqs to inactive aqs-next hosts (duration: 01m 25s)
* 10:01 hnowlan@deploy1002: Started deploy [analytics/aqs/deploy@d273fde] (aqs-next): deploying aqs to inactive aqs-next hosts
* 10:00 hnowlan@deploy1002: Finished deploy [analytics/aqs/deploy@d273fde] (aqs-next): deploying aqs to inactive aqs-next hosts (duration: 01m 53s)
* 09:58 hnowlan@deploy1002: Started deploy [analytics/aqs/deploy@d273fde] (aqs-next): deploying aqs to inactive aqs-next hosts
* 09:57 hnowlan@deploy1002: Finished deploy [analytics/aqs/deploy@d273fde] (aqs-next): deploying aqs to inactive aqs-next hosts (duration: 00m 09s)
* 09:57 hnowlan@deploy1002: Started deploy [analytics/aqs/deploy@d273fde] (aqs-next): deploying aqs to inactive aqs-next hosts
* 09:32 joal@deploy1002: Finished deploy [analytics/refinery@4ff8979] (thin): Analytics hotfix deploy THIN [analytics/refinery@4ff8979] (duration: 00m 07s)
* 09:32 joal@deploy1002: Started deploy [analytics/refinery@4ff8979] (thin): Analytics hotfix deploy THIN [analytics/refinery@4ff8979]
* 09:26 joal@deploy1002: Finished deploy [analytics/refinery@4ff8979]: Analytics hotfix deploy [analytics/refinery@4ff8979] (duration: 17m 36s)
* 09:25 jiji@cumin1001: START - Cookbook sre.hosts.decommission for hosts mc[1025-1026].eqiad.wmnet
* 09:15 jelto@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 09:14 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1022.eqiad.wmnet
* 09:13 jelto@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
* 09:09 akosiaris@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mathoid' for release 'production' .
* 09:09 jelto@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 09:09 joal@deploy1002: Started deploy [analytics/refinery@4ff8979]: Analytics hotfix deploy [analytics/refinery@4ff8979]
* 09:08 jelto@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 09:06 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mathoid' for release 'production' .
* 09:03 jelto@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 09:03 jelto@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 08:53 jelto@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 08:52 jelto@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 08:45 ema: cp-eqsin: clean apt cache to free up some space [[phab:T290305|T290305]]
* 08:45 jiji@cumin1001: START - Cookbook sre.hosts.decommission for hosts mc1022.eqiad.wmnet
* 08:23 akosiaris@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'mathoid' for release 'staging' .
* 07:43 legoktm: uploaded pygments 2.10.0+dfsg-1~wmf1 to apt.wm.o in component/pygments
* 07:42 marostegui: Remove flaggedrevs_stats2 and flaggedrevs_stats from several s3 wikis - [[phab:T289050|T289050]]
* 07:10 godog: more weight to ms-be20[62-65] - [[phab:T288458|T288458]]
* 07:01 marostegui@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:57 marostegui@cumin1001: START - Cookbook sre.dns.netbox
* 06:45 elukey: run `apt-get clean` on cp5012 to free some space (94% of the root partition used)
* 06:12 marostegui@cumin1001: dbctl commit (dc=all): 'db2138:3314 (re)pooling @ 100%: Slowly repool after reimage [[phab:T288803|T288803]]', diff saved to https://phabricator.wikimedia.org/P17203 and previous config saved to /var/cache/conftool/dbconfig/20210903-061204-root.json
* 06:11 marostegui@cumin1001: dbctl commit (dc=all): 'db2138:3312 (re)pooling @ 100%: Slowly repool after reimage [[phab:T288803|T288803]]', diff saved to https://phabricator.wikimedia.org/P17202 and previous config saved to /var/cache/conftool/dbconfig/20210903-061138-root.json
* 05:57 marostegui@cumin1001: dbctl commit (dc=all): 'db2138:3314 (re)pooling @ 75%: Slowly repool after reimage [[phab:T288803|T288803]]', diff saved to https://phabricator.wikimedia.org/P17201 and previous config saved to /var/cache/conftool/dbconfig/20210903-055700-root.json
* 05:56 marostegui@cumin1001: dbctl commit (dc=all): 'db2138:3312 (re)pooling @ 75%: Slowly repool after reimage [[phab:T288803|T288803]]', diff saved to https://phabricator.wikimedia.org/P17200 and previous config saved to /var/cache/conftool/dbconfig/20210903-055635-root.json
* 05:41 marostegui@cumin1001: dbctl commit (dc=all): 'db2138:3314 (re)pooling @ 50%: Slowly repool after reimage [[phab:T288803|T288803]]', diff saved to https://phabricator.wikimedia.org/P17199 and previous config saved to /var/cache/conftool/dbconfig/20210903-054157-root.json
* 05:41 marostegui@cumin1001: dbctl commit (dc=all): 'db2138:3312 (re)pooling @ 50%: Slowly repool after reimage [[phab:T288803|T288803]]', diff saved to https://phabricator.wikimedia.org/P17198 and previous config saved to /var/cache/conftool/dbconfig/20210903-054131-root.json
* 05:30 marostegui@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts pc2007.codfw.wmnet
* 05:26 marostegui@cumin1001: dbctl commit (dc=all): 'db2138:3314 (re)pooling @ 25%: Slowly repool after reimage [[phab:T288803|T288803]]', diff saved to https://phabricator.wikimedia.org/P17196 and previous config saved to /var/cache/conftool/dbconfig/20210903-052653-root.json
* 05:26 marostegui@cumin1001: dbctl commit (dc=all): 'db2138:3312 (re)pooling @ 25%: Slowly repool after reimage [[phab:T288803|T288803]]', diff saved to https://phabricator.wikimedia.org/P17195 and previous config saved to /var/cache/conftool/dbconfig/20210903-052628-root.json
* 05:20 marostegui@cumin1001: START - Cookbook sre.hosts.decommission for hosts pc2007.codfw.wmnet
* 05:11 marostegui@cumin1001: dbctl commit (dc=all): 'db2138:3314 (re)pooling @ 10%: Slowly repool after reimage [[phab:T288803|T288803]]', diff saved to https://phabricator.wikimedia.org/P17194 and previous config saved to /var/cache/conftool/dbconfig/20210903-051149-root.json
* 05:11 marostegui@cumin1001: dbctl commit (dc=all): 'db2138:3312 (re)pooling @ 10%: Slowly repool after reimage [[phab:T288803|T288803]]', diff saved to https://phabricator.wikimedia.org/P17193 and previous config saved to /var/cache/conftool/dbconfig/20210903-051124-root.json
* 05:04 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2138 for upgrade', diff saved to https://phabricator.wikimedia.org/P17192 and previous config saved to /var/cache/conftool/dbconfig/20210903-050423-marostegui.json
* 00:31 tgr@deploy1002: Synchronized php-1.37.0-wmf.21/extensions/GrowthExperiments/maintenance/fixLinkRecommendationData.php: Backport: [[gerrit:716491{{!}}fixLinkRecommendationData: Try harder to avoid >10K result sets (T284531)]] (duration: 00m 58s)
* 00:25 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:23 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .


== 2015-07-21 ==
== 2021-09-02 ==
* 23:45 logmsgbot: catrope Synchronized wmf-config/InitialiseSettings.php: Set $wgVectorResponsive = true on testwiki (duration: 00m 12s)
* 23:12 thcipriani@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:704171{{!}}Adding wordmark for ptwikinews mobile and desktop skins (T281591)]] Part II (duration: 00m 57s)
* 23:39 logmsgbot: catrope Synchronized php-1.26wmf14/extensions/VisualEditor: SWAT (duration: 00m 13s)
* 23:11 thcipriani@deploy1002: Synchronized static/images/mobile/copyright/wikinews-wordmark-pt.svg: Config: [[gerrit:704171{{!}}Adding wordmark for ptwikinews mobile and desktop skins (T281591)]] Part I (duration: 01m 14s)
* 23:37 logmsgbot: catrope Synchronized php-1.26wmf15/extensions/VisualEditor: SWAT (duration: 00m 13s)
* 21:47 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 23:08 logmsgbot: catrope Synchronized wmf-config/CommonSettings.php: Enable tracking of geo feature usage on enwiki (duration: 00m 12s)
* 21:37 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 23:07 logmsgbot: catrope Synchronized wmf-config/InitialiseSettings.php: Enable tracking of geo feature usage on enwiki (duration: 00m 13s)
* 21:17 bd808@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'toolhub' for release 'main' .
* 23:05 logmsgbot: twentyafterfour rebuilt wikiversions.cdb and synchronized wikiversions files: trying this again: group0 to 1.26wmf15
* 19:57 ejegg: updated fundraising CiviCRM from {{Gerrit|7ac13753c7}} to {{Gerrit|06ef98593f}}
* 22:59 logmsgbot: twentyafterfour Finished scap: test: syncing 1.26wmf15 again (duration: 20m 51s)
* 19:49 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:54 chasemp: 22:50 <  chasemp> "then git reset --hard 9588d0a6844fc9cc68372f4bf3e1eda3cffc8138 in  /etc/zuul/wikimedia"
* 19:48 jiji@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts mc1021.eqiad.wmnet
* 22:51 chasemp: gallium 'service zuul stop && service zuul-merger stop && sudo apt-get install zuul=2.0.0-304-g685ca22-wmf1precise1' DOWNGRADE due to errors
* 19:45 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 22:39 logmsgbot: twentyafterfour Started scap: test: syncing 1.26wmf15 again
* 19:40 jiji@cumin1001: START - Cookbook sre.hosts.decommission for hosts mc1021.eqiad.wmnet
* 22:27 logmsgbot: twentyafterfour rebuilt wikiversions.cdb and synchronized wikiversions files: revert group0 to 1.26wmf15
* 19:28 twentyafterfour@deploy1002: rebuilt and synchronized wikiversions files: all wikis to 1.37.0-wmf.21  refs [[phab:T281162|T281162]]
* 22:26 logmsgbot: twentyafterfour rebuilt wikiversions.cdb and synchronized wikiversions files: group0 to 1.26wmf15
* 18:31 ryankemper: [WCQS] `wcqs100[1-3],wcqs200[1-3]` downtimed until `2021-09-09 20:29:55` (UTC)
* 22:20 ori: Accepted mw1090's minion key on palladium
* 18:28 ryankemper: [WCQS] Merged & deployed https://gerrit.wikimedia.org/r/c/operations/puppet/+/713946, going to suppress icinga alerts on `wcqs*` hosts because these are still in the process of being spun up properly and aren't serving traffic or anything
* 21:21 logmsgbot: twentyafterfour Finished scap: sync 1.26wmf15 branch + localization cache, remove wmf8 (duration: 27m 32s)
* 18:24 ryankemper@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'linkrecommendation' for release 'external' .
* 20:53 logmsgbot: twentyafterfour Started scap: sync 1.26wmf15 branch + localization cache, remove wmf8
* 18:24 ryankemper@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'linkrecommendation' for release 'internal' .
* 20:53 logmsgbot: twentyafterfour Purged l10n cache for 1.26wmf11
* 18:20 ryankemper@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'linkrecommendation' for release 'internal' .
* 20:52 logmsgbot: twentyafterfour Purged l10n cache for 1.26wmf10
* 18:20 ryankemper@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'linkrecommendation' for release 'external' .
* 20:51 logmsgbot: twentyafterfour Purged l10n cache for 1.26wmf9
* 17:06 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:28 hasharConfcall: Zuul no longer reports any results back to Gerrit :(  Fix being deployed
* 17:03 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:56 ori: Dropping AccountAudit table on all wikis (T105894)
* 16:57 akosiaris@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:45 logmsgbot: ori Synchronized wmf-config: I3887fd6c: Disable AccountAudit (duration: 00m 12s)
* 16:18 jiji@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:07 logmsgbot: ori Synchronized php-1.26wmf14/extensions/Scribunto: I0e5f2d3b2: Updated mediawiki/core Project: mediawiki/extensions/Scribunto  5af0350e2d09444db279f58504967d0e9b154534 (duration: 00m 13s)
* 16:09 jiji@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:06 logmsgbot: ori Synchronized php-1.26wmf14/extensions/WikimediaEvents: I0e5f2d3b2: Updated mediawiki/core Project: mediawiki/extensions/WikimediaEvents  968890f1a256a08a02925e4bdb53a8e8d64aacea (duration: 00m 13s)
* 16:04 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1020.eqiad.wmnet
* 17:08 _joe_: restarted logmsgbot, ircecho on neon
* 15:53 jiji@cumin1001: START - Cookbook sre.hosts.decommission for hosts mc1020.eqiad.wmnet
* 16:20 logmsgbot: thcipriani Synchronized php-1.26wmf14/extensions/Wikidata: SWAT: Update Wikibase: Add api featureLog for ungroupedlist param [[gerrit:226086]] (duration: 00m 20s)
* 15:40 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1019.eqiad.wmnet
* 16:01 logmsgbot: thcipriani Synchronized php-1.26wmf13/extensions/Wikidata: SWAT: Update Wikibase: Add api featureLog for ungroupedlist param [[gerrit:226086]] (duration: 00m 20s)
* 15:31 dzahn@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 15:37 godog: cleanup ganglia temp files on uranium
* 15:28 dzahn@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 15:34 logmsgbot: thcipriani Synchronized php-1.26wmf14/includes/filerepo/file/File.php: SWAT: Thumbnail logging and stats part II [[gerrit:225936]] (duration: 00m 12s)
* 15:26 jiji@cumin1001: START - Cookbook sre.hosts.decommission for hosts mc1019.eqiad.wmnet
* 15:34 logmsgbot: thcipriani Synchronized php-1.26wmf14/thumb.php: SWAT: Thumbnail logging and stats part I [[gerrit:225936]] (duration: 00m 12s)
* 15:16 jiji@cumin1001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=99) for hosts mc1033.eqiad.wmnet
* 15:29 logmsgbot: thcipriani Synchronized php-1.26wmf14/includes/filerepo/file/File.php: SWAT: Thumbnail logging and stats part II [[gerrit:225936]] (duration: 00m 13s)
* 15:15 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mc1034.eqiad.wmnet
* 15:28 logmsgbot: thcipriani Synchronized php-1.26wmf14/thumb.php: SWAT: Thumbnail logging and stats part I [[gerrit:225936]] (duration: 00m 11s)
* 15:04 marostegui@cumin1001: dbctl commit (dc=all): 'db2136 (re)pooling @ 100%: Slowly repool after reimage [[phab:T288803|T288803]]', diff saved to https://phabricator.wikimedia.org/P17178 and previous config saved to /var/cache/conftool/dbconfig/20210902-150412-root.json
* 15:20 cmjohnson1: re-installing mw1090
* 14:50 jiji@cumin1001: START - Cookbook sre.hosts.decommission for hosts mc1034.eqiad.wmnet
* 15:12 logmsgbot: thcipriani Synchronized wmf-config/InitialiseSettings.php: SWAT: Offer 400px as a thumbnail size available in Special:Preferences [[gerrit:226051]] (duration: 00m 12s)
* 14:49 marostegui@cumin1001: dbctl commit (dc=all): 'db2136 (re)pooling @ 75%: Slowly repool after reimage [[phab:T288803|T288803]]', diff saved to https://phabricator.wikimedia.org/P17177 and previous config saved to /var/cache/conftool/dbconfig/20210902-144908-root.json
* 15:08 logmsgbot: thcipriani Synchronized wmf-config/InitialiseSettings.php: SWAT: Assign thumbnail access log to Monolog debug channel [[gerrit:225935]] (duration: 00m 13s)
* 14:49 jiji@cumin1001: START - Cookbook sre.hosts.decommission for hosts mc1033.eqiad.wmnet
* 13:57 _joe_: depooling mw1158-60 from the imagescaler pool, to test HHVM-only imagescalers
* 14:47 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 05:08 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Jul 21 05:08:32 UTC 2015 (duration 8m 31s)
* 14:44 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 02:27 logmsgbot: LocalisationUpdate completed (1.26wmf14) at 2015-07-21 02:26:59+00:00
* 14:39 jayme@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'cxserver' for release 'production' .
* 02:23 logmsgbot: l10nupdate Synchronized php-1.26wmf14/cache/l10n: (no message) (duration: 06m 55s)
* 14:38 jayme@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'cxserver' for release 'production' .
* 02:07 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Jul 21 02:07:22 UTC 2015 (duration 7m 21s)
* 14:38 jayme@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'cxserver' for release 'staging' .
* 02:03 logmsgbot: LocalisationUpdate failed (1.26wmf14) at 2015-07-21 02:03:11+00:00
* 14:35 jayme@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'wikifeeds' for release 'production' .
* 14:34 marostegui@cumin1001: dbctl commit (dc=all): 'db2136 (re)pooling @ 50%: Slowly repool after reimage [[phab:T288803|T288803]]', diff saved to https://phabricator.wikimedia.org/P17176 and previous config saved to /var/cache/conftool/dbconfig/20210902-143405-root.json
* 14:33 jayme@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'wikifeeds' for release 'production' .
* 14:32 jayme@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'wikifeeds' for release 'staging' .
* 14:22 moritzm: installing exiv2 security updates
* 14:19 marostegui@cumin1001: dbctl commit (dc=all): 'db2136 (re)pooling @ 25%: Slowly repool after reimage [[phab:T288803|T288803]]', diff saved to https://phabricator.wikimedia.org/P17175 and previous config saved to /var/cache/conftool/dbconfig/20210902-141901-root.json
* 14:13 moritzm: installing ffmpeg security updates
* 14:03 marostegui@cumin1001: dbctl commit (dc=all): 'db2136 (re)pooling @ 10%: Slowly repool after reimage [[phab:T288803|T288803]]', diff saved to https://phabricator.wikimedia.org/P17174 and previous config saved to /var/cache/conftool/dbconfig/20210902-140357-root.json
* 14:00 jayme@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'zotero' for release 'production' .
* 13:57 jayme@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'zotero' for release 'production' .
* 13:55 jayme@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'zotero' for release 'staging' .
* 13:48 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2136 for upgrade', diff saved to https://phabricator.wikimedia.org/P17173 and previous config saved to /var/cache/conftool/dbconfig/20210902-134838-marostegui.json
* 13:44 marostegui@cumin1001: dbctl commit (dc=all): 'db2119 (re)pooling @ 100%: Slowly repool after reimage [[phab:T288803|T288803]]', diff saved to https://phabricator.wikimedia.org/P17172 and previous config saved to /var/cache/conftool/dbconfig/20210902-134448-root.json
* 13:42 jayme@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'citoid' for release 'production' .
* 13:42 jayme@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'citoid' for release 'production' .
* 13:41 jayme@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'citoid' for release 'staging' .
* 13:39 jayme@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'termbox' for release 'production' .
* 13:39 jayme@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'termbox' for release 'production' .
* 13:38 jayme@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'termbox' for release 'test' .
* 13:38 jayme@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'termbox' for release 'staging' .
* 13:36 jayme@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'blubberoid' for release 'production' .
* 13:35 jayme@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'blubberoid' for release 'production' .
* 13:29 marostegui@cumin1001: dbctl commit (dc=all): 'db2119 (re)pooling @ 75%: Slowly repool after reimage [[phab:T288803|T288803]]', diff saved to https://phabricator.wikimedia.org/P17171 and previous config saved to /var/cache/conftool/dbconfig/20210902-132945-root.json
* 13:29 jayme@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'blubberoid' for release 'staging' .
* 13:24 jbond@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on sretest1002.eqiad.wmnet with reason: REIMAGE
* 13:22 jbond@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on sretest1002.eqiad.wmnet with reason: REIMAGE
* 13:14 jbond: reimage sretest1002 (not sretest1001)
* 13:14 marostegui@cumin1001: dbctl commit (dc=all): 'db2119 (re)pooling @ 50%: Slowly repool after reimage [[phab:T288803|T288803]]', diff saved to https://phabricator.wikimedia.org/P17169 and previous config saved to /var/cache/conftool/dbconfig/20210902-131441-root.json
* 13:14 jbond: reimage sretest1001
* 12:59 marostegui@cumin1001: dbctl commit (dc=all): 'db2119 (re)pooling @ 25%: Slowly repool after reimage [[phab:T288803|T288803]]', diff saved to https://phabricator.wikimedia.org/P17168 and previous config saved to /var/cache/conftool/dbconfig/20210902-125937-root.json
* 12:55 jbond: disable puppet fleet wide to roll out 715728
* 12:44 marostegui@cumin1001: dbctl commit (dc=all): 'db2119 (re)pooling @ 10%: Slowly repool after reimage [[phab:T288803|T288803]]', diff saved to https://phabricator.wikimedia.org/P17167 and previous config saved to /var/cache/conftool/dbconfig/20210902-124434-root.json
* 12:42 marostegui: Upgrade db2119
* 12:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2119 for upgrade', diff saved to https://phabricator.wikimedia.org/P17166 and previous config saved to /var/cache/conftool/dbconfig/20210902-124102-marostegui.json
* 12:28 marostegui@cumin1001: dbctl commit (dc=all): 'db2106 (re)pooling @ 100%: Slowly repool after reimage [[phab:T288803|T288803]]', diff saved to https://phabricator.wikimedia.org/P17165 and previous config saved to /var/cache/conftool/dbconfig/20210902-122826-root.json
* 12:13 marostegui@cumin1001: dbctl commit (dc=all): 'db2106 (re)pooling @ 75%: Slowly repool after reimage [[phab:T288803|T288803]]', diff saved to https://phabricator.wikimedia.org/P17164 and previous config saved to /var/cache/conftool/dbconfig/20210902-121323-root.json
* 11:58 marostegui@cumin1001: dbctl commit (dc=all): 'db2106 (re)pooling @ 50%: Slowly repool after reimage [[phab:T288803|T288803]]', diff saved to https://phabricator.wikimedia.org/P17163 and previous config saved to /var/cache/conftool/dbconfig/20210902-115819-root.json
* 11:43 marostegui@cumin1001: dbctl commit (dc=all): 'db2106 (re)pooling @ 25%: Slowly repool after reimage [[phab:T288803|T288803]]', diff saved to https://phabricator.wikimedia.org/P17162 and previous config saved to /var/cache/conftool/dbconfig/20210902-114315-root.json
* 11:28 marostegui@cumin1001: dbctl commit (dc=all): 'db2106 (re)pooling @ 10%: Slowly repool after reimage [[phab:T288803|T288803]]', diff saved to https://phabricator.wikimedia.org/P17161 and previous config saved to /var/cache/conftool/dbconfig/20210902-112812-root.json
* 11:26 urbanecm@deploy1002: Synchronized README: testing scap (duration: 01m 06s)
* 11:22 jmm@puppetmaster1001: conftool action : set/pooled=inactive; selector: name=mw2264.codfw.wmnet
* 11:18 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2106 for upgrade', diff saved to https://phabricator.wikimedia.org/P17160 and previous config saved to /var/cache/conftool/dbconfig/20210902-111843-marostegui.json
* 11:07 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:05 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|3ce5d80eb6f8ad720b5d9c0b6ad7840dd869735e}}: dewiki: Enable Growth features for 30% of newcomers ([[phab:T288420|T288420]]) (duration: 01m 58s)
* 11:05 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:04 urbanecm: metawiki: Server-side page move from VRT -> Volunteer Response Team ([[phab:T290083|T290083]])
* 11:00 marostegui@cumin1001: dbctl commit (dc=all): 'db2073 (re)pooling @ 100%: Slowly repool after reimage [[phab:T288803|T288803]]', diff saved to https://phabricator.wikimedia.org/P17158 and previous config saved to /var/cache/conftool/dbconfig/20210902-110022-root.json
* 10:45 marostegui@cumin1001: dbctl commit (dc=all): 'db2073 (re)pooling @ 75%: Slowly repool after reimage [[phab:T288803|T288803]]', diff saved to https://phabricator.wikimedia.org/P17155 and previous config saved to /var/cache/conftool/dbconfig/20210902-104518-root.json
* 10:38 mbsantos: REINDEX database gis in maps1009 while it's in depooled state
* 10:30 marostegui@cumin1001: dbctl commit (dc=all): 'db2073 (re)pooling @ 50%: Slowly repool after reimage [[phab:T288803|T288803]]', diff saved to https://phabricator.wikimedia.org/P17152 and previous config saved to /var/cache/conftool/dbconfig/20210902-103014-root.json
* 10:24 mvolz@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'zotero' for release 'production' .
* 10:23 mvolz@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'zotero' for release 'production' .
* 10:19 mvolz@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'zotero' for release 'staging' .
* 10:15 marostegui@cumin1001: dbctl commit (dc=all): 'db2073 (re)pooling @ 25%: Slowly repool after reimage [[phab:T288803|T288803]]', diff saved to https://phabricator.wikimedia.org/P17150 and previous config saved to /var/cache/conftool/dbconfig/20210902-101511-root.json
* 10:00 marostegui@cumin1001: dbctl commit (dc=all): 'db2073 (re)pooling @ 10%: Slowly repool after reimage [[phab:T288803|T288803]]', diff saved to https://phabricator.wikimedia.org/P17147 and previous config saved to /var/cache/conftool/dbconfig/20210902-100007-root.json
* 09:57 marostegui: Upgrade db2073
* 09:56 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2073 for upgrade', diff saved to https://phabricator.wikimedia.org/P17145 and previous config saved to /var/cache/conftool/dbconfig/20210902-095601-marostegui.json
* 09:56 hashar@deploy1002: Finished deploy [integration/docroot@973ac8a]: Support listing files on index pages - [[phab:T289196|T289196]] (duration: 00m 07s)
* 09:55 hashar@deploy1002: Started deploy [integration/docroot@973ac8a]: Support listing files on index pages - [[phab:T289196|T289196]]
* 09:20 marostegui@cumin1001: dbctl commit (dc=all): 'db2140 (re)pooling @ 100%: Slowly repool after reimage [[phab:T288803|T288803]]', diff saved to https://phabricator.wikimedia.org/P17142 and previous config saved to /var/cache/conftool/dbconfig/20210902-092026-root.json
* 09:05 marostegui@cumin1001: dbctl commit (dc=all): 'db2140 (re)pooling @ 75%: Slowly repool after reimage [[phab:T288803|T288803]]', diff saved to https://phabricator.wikimedia.org/P17141 and previous config saved to /var/cache/conftool/dbconfig/20210902-090523-root.json
* 08:55 marostegui: Remove flaggedrevs_stats2 and flaggedrevs_stats from eowiki,idwiki,plwiki,trwiki - [[phab:T289050|T289050]]
* 08:50 marostegui@cumin1001: dbctl commit (dc=all): 'db2140 (re)pooling @ 50%: Slowly repool after reimage [[phab:T288803|T288803]]', diff saved to https://phabricator.wikimedia.org/P17140 and previous config saved to /var/cache/conftool/dbconfig/20210902-085019-root.json
* 08:35 marostegui@cumin1001: dbctl commit (dc=all): 'db2140 (re)pooling @ 25%: Slowly repool after reimage [[phab:T288803|T288803]]', diff saved to https://phabricator.wikimedia.org/P17138 and previous config saved to /var/cache/conftool/dbconfig/20210902-083515-root.json
* 08:20 marostegui@cumin1001: dbctl commit (dc=all): 'db2140 (re)pooling @ 10%: Slowly repool after reimage [[phab:T288803|T288803]]', diff saved to https://phabricator.wikimedia.org/P17136 and previous config saved to /var/cache/conftool/dbconfig/20210902-082012-root.json
* 08:14 marostegui: Upgrade db2140
* 08:14 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2140 for upgrade', diff saved to https://phabricator.wikimedia.org/P17135 and previous config saved to /var/cache/conftool/dbconfig/20210902-081436-marostegui.json
* 07:57 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1064.eqiad.wmnet
* 07:51 filippo@cumin1001: START - Cookbook sre.hosts.reboot-single for host ms-be1064.eqiad.wmnet
* 07:44 marostegui: Remove flaggedrevs_stats2 and flaggedrevs_stats on huwiki - [[phab:T289050|T289050]]
* 07:44 marostegui: Remove flaggedrevs_stats2 and flaggedrevs_stats on arwiki - [[phab:T289050|T289050]]
* 07:06 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 07:04 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 07:00 marostegui: Stop mariadb on pc2007 before decommissioning [[phab:T289112|T289112]]
* 06:59 marostegui@deploy1002: Synchronized wmf-config/ProductionServices.php: Remove pc2007 [[phab:T289112|T289112]] (duration: 01m 06s)
* 06:13 eileen: civicrm revision changed from {{Gerrit|ad37f21a7d}} to {{Gerrit|7ac13753c7}}, config revision is {{Gerrit|5f004d94d7}}
* 04:50 marostegui: Remove flaggedrevs_stats2 and flaggedrevs_stats on ruwiki - [[phab:T289050|T289050]]
* 02:09 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:06 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:05 krinkle@deploy1002: Synchronized php-1.37.0-wmf.21/extensions/WikimediaMaintenance/blameStartupRegistry.php: {{Gerrit|I63bf1922af593b7a144ef5f6d036f9a5e23cec09}} (duration: 01m 07s)


== 2015-07-20 ==
== 2021-09-01 ==
* 23:43 gwicke: removed experimental nodes (1008, 1009) from system.peers on production C* nodes
* 23:50 Amir1: mwscript createAndPromote.php --wiki=test2wiki --sysop --force Ladsgroup
* 21:29 ejegg: updated fundraising/tools from 9a9e7881d25f101cc612cfae6375c0a1c9b0f55d to 3e0e3ae799a507b378d0ece3e71631b10b361329
* 23:50 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:55 XenoRyet: updated payments from ebb1a9e52172a4793cf5feb33220b4d7edfcad70 to 152a64a035a59e67b4469223b8f83609bae523a3
* 23:45 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:40 gwicke: (eevans, gwicke) removed *.hprof heap dumps from /var/lib/cassandra, freeing up a lot of space especially on 1004 & 1005
* 23:35 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:22 gwicke: deployed restbase 0951a6d to remaining nodes
* 23:30 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:55 gwicke: canary restbase deploy of 0951a6d on restbase1001
* 23:27 urbanecm@deploy1002: Synchronized php-1.37.0-wmf.20/extensions/GrowthExperiments/maintenance/fixLinkRecommendationData.php: {{Gerrit|0bd65426494d4df981141650211e27e17c98ee0c}}: fixLinkRecommendationData: stay under 10K search limit ([[phab:T284531|T284531]]) (duration: 01m 06s)
* 16:44 godog: powercycle mw1090, no console no anything
* 23:27 eileen: civicrm revision changed from {{Gerrit|30cd9c1d90}} to {{Gerrit|ad37f21a7d}}, config revision is {{Gerrit|5f004d94d7}}
* 15:31 ejegg: updated AstroPay curl timeout setting on payments to 12 seconds
* 23:25 legoktm@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'shellbox-timeline' for release 'main' .
* 05:32 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Jul 20 05:32:31 UTC 2015 (duration 32m 30s)
* 23:24 urbanecm@deploy1002: Synchronized php-1.37.0-wmf.20/extensions/GrowthExperiments/maintenance/fixLinkRecommendationData.php: {{Gerrit|3c7d4ecc699b7c68467a372686f5514375d2b74f}}: fixLinkRecommendationData: Allow --db-table in dry-run mode ([[phab:T283868|T283868]]) (duration: 01m 06s)
* 02:28 logmsgbot: LocalisationUpdate completed (1.26wmf14) at 2015-07-20 02:28:03+00:00
* 23:20 urbanecm@deploy1002: Synchronized wmf-config/extension-list: {{Gerrit|91ff9273fd9f80b571771a7454d34d63f43405b8}}: Enable NearbyPages on beta cluster ([[phab:T246493|T246493]]; 3/3) (duration: 01m 05s)
* 02:24 logmsgbot: l10nupdate Synchronized php-1.26wmf14/cache/l10n: (no message) (duration: 07m 07s)
* 23:19 legoktm@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'shellbox-timeline' for release 'main' .
* 02:07 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Jul 20 02:07:34 UTC 2015 (duration 7m 33s)
* 23:18 urbanecm@deploy1002: Synchronized wmf-config/CommonSettings.php: {{Gerrit|91ff9273fd9f80b571771a7454d34d63f43405b8}}: Enable NearbyPages on beta cluster ([[phab:T246493|T246493]]; 2/3) (duration: 01m 06s)
* 02:03 logmsgbot: LocalisationUpdate failed (1.26wmf14) at 2015-07-20 02:03:24+00:00
* 23:18 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:02 mutante: DNS update - adding language "azb" to langlist
* 23:17 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|91ff9273fd9f80b571771a7454d34d63f43405b8}}: Enable NearbyPages on beta cluster ([[phab:T246493|T246493]]; 1/3) (duration: 01m 06s)
* 23:16 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:15 legoktm@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'shellbox-timeline' for release 'main' .
* 23:11 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|bb7d92c48edf48b94fd628e9e0b5fd6682460373}}: Enable WVUI search on Wikimedia Commons ([[phab:T287215|T287215]]) (duration: 01m 07s)
* 23:04 dpifke@deploy1002: Finished deploy [performance/navtiming@63c9d31]: Deploy fix for CpuBenchmark-related Prometheus timeouts [[phab:T281243|T281243]] (duration: 00m 06s)
* 23:04 dpifke@deploy1002: Started deploy [performance/navtiming@63c9d31]: Deploy fix for CpuBenchmark-related Prometheus timeouts [[phab:T281243|T281243]]
* 22:44 legoktm@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 22:43 legoktm@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
* 22:43 legoktm@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 22:43 legoktm@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 22:42 legoktm@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 22:42 legoktm@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 22:40 legoktm@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 22:39 legoktm@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 22:35 legoktm@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 22:34 legoktm@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
* 22:33 legoktm@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 22:33 legoktm@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 22:32 legoktm@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 22:32 legoktm@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 22:30 legoktm@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 22:29 legoktm@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 20:02 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:01 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:57 twentyafterfour@deploy1002: Synchronized php: group1 wikis to 1.37.0-wmf.21  refs [[phab:T281161|T281161]] (duration: 01m 06s)
* 19:57 twentyafterfour: twentyafterfour@deploy1002 rebuilt and synchronized wikiversions files: group1 wikis to 1.37.0-wmf.21  refs [[phab:T281162|T281162]]
* 19:56 twentyafterfour@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.37.0-wmf.21  refs [[phab:T281161|T281161]]
* 18:38 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:37 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:30 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:28 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:21 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:19 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:17 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|fe1ae2e438841a069dc8dadc9a1850b91863c06a}}: Growth features: Deploy to 100% of newcomers on small wikis ([[phab:T289786|T289786]]) (duration: 01m 06s)
* 18:09 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:07 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|27e85b1f228dccb584b4692f5b1b1354b19625b4}}: nlwiki: Enable link recommendations for all Growth users ([[phab:T285254|T285254]]) (duration: 01m 06s)
* 18:07 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:05 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|94b1cca}}: Growth features: Enable for newcomers on two wikis ([[phab:T285254|T285254]], [[phab:T287867|T287867]]) (duration: 01m 09s)
* 17:31 ejegg: updated payments-wiki from {{Gerrit|c4d56178d0}} to {{Gerrit|f9cbf95a12}}
* 16:23 mforns@deploy1002: Finished deploy [analytics/refinery@ff15071] (thin): Fix for cassandra3 loading THIN [analytics/refinery@ff15071] (duration: 00m 06s)
* 16:23 mforns@deploy1002: Started deploy [analytics/refinery@ff15071] (thin): Fix for cassandra3 loading THIN [analytics/refinery@ff15071]
* 16:22 mforns@deploy1002: Finished deploy [analytics/refinery@ff15071]: Fix for cassandra3 loading [analytics/refinery@ff15071] (duration: 26m 58s)
* 16:06 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on ms-be1066.eqiad.wmnet with reason: REIMAGE
* 16:04 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1065.eqiad.wmnet with reason: REIMAGE
* 16:02 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on ms-be1064.eqiad.wmnet with reason: REIMAGE
* 16:01 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1066.eqiad.wmnet with reason: REIMAGE
* 16:01 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1065.eqiad.wmnet with reason: REIMAGE
* 16:00 cmjohnson@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1064.eqiad.wmnet with reason: REIMAGE
* 15:55 mforns@deploy1002: Started deploy [analytics/refinery@ff15071]: Fix for cassandra3 loading [analytics/refinery@ff15071]
* 15:35 jiji@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:08 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 14:08 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:07 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:04 godog: move simone-this-dot from wmf to nda ldap group - [[phab:T289783|T289783]]
* 13:51 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb2009.codfw.wmnet
* 13:49 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:48 krinkle@deploy1002: Synchronized php-1.37.0-wmf.20/includes/resourceloader: {{Gerrit|Id7c258841d7816}} (duration: 01m 06s)
* 13:46 krinkle@deploy1002: Synchronized php-1.37.0-wmf.21/includes/resourceloader: {{Gerrit|Id7c258841d7816}} (duration: 01m 49s)
* 13:45 jiji@cumin1001: START - Cookbook sre.hosts.reboot-single for host rdb2009.codfw.wmnet
* 13:45 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:38 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:36 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:16 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 13:05 mutante: planet1002 - temp removing feed from ad.huikeshoven - seems to cause corrupt state file ([[phab:T289984|T289984]])
* 13:01 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 12:48 godog: s/webperf/navtiming/
* 12:47 godog: bounce webperf on webperf2001 - [[phab:T290138|T290138]]
* 12:41 mutante: planet1002 - rm /etc/rawdog/en/feeds/39a7970f.state (corrupt) [[phab:T289984|T289984]]
* 12:38 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 11:19 Krinkle: effie restarted php-fpm on parse2007.codfw.wmnet, ref [[phab:T290120|T290120]].
* 10:21 jbond: start filtering more puppet facts G:715461 - [[phab:T263578|T263578]]
* 09:23 marostegui: Drop flaggedrevs_stats and flaggedrevs_stats2 from dewiki [[phab:T289050|T289050]]
* 07:45 ema: deploy Varnish SLO dashboard with grr apply slo_dashboards.jsonnet [[phab:T289036|T289036]]
* 07:05 XioNoX: pfw NAT and ACLs changes - [[phab:T290077|T290077]]
* 06:29 elukey@cumin1001: END (PASS) - Cookbook sre.puppet.renew-cert (exit_code=0) for sodium.wikimedia.org: Renew puppet certificate - elukey@cumin1001
* 06:28 elukey@cumin1001: START - Cookbook sre.puppet.renew-cert for sodium.wikimedia.org: Renew puppet certificate - elukey@cumin1001
* 05:25 effie: depool mw2251 mw2255 parse2001 for tests - [[phab:T280497|T280497]]
* 04:41 marostegui: Optimize idwiki.flaggedtemplates [[phab:T290057|T290057]]
* 04:23 marostegui: Optimize arwiki.flaggedtemplates [[phab:T290057|T290057]]
* 04:16 eileen: civicrm revision changed from {{Gerrit|7da3eba4f9}} to {{Gerrit|30cd9c1d90}}, config revision is {{Gerrit|5f004d94d7}}
* 00:53 eileen: civicrm revision changed from {{Gerrit|e567b4c289}} to {{Gerrit|7da3eba4f9}}, config revision is {{Gerrit|5f004d94d7}}


== 2015-07-19 ==
== 2021-08-31 ==
* 20:52 logmsgbot: krenair Synchronized w/static/images/project-logos/arbcom_enwiki.png: https://gerrit.wikimedia.org/r/#/c/225822/ (duration: 00m 12s)
* 23:41 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:10 logmsgbot: ori Synchronized wmf-config/InitialiseSettings.php: Ic0573f26: Follow-up for I189d748: whitelist 'archive.org' too (duration: 00m 12s)
* 23:38 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:06 logmsgbot: ori Synchronized wmf-config/InitialiseSettings.php: I189d748a: Whitelist *.archive.org for wgCopyUploadsDomains (T106293) (duration: 00m 13s)
* 23:38 eileen: civicrm revision changed from {{Gerrit|718aa9cad3}} to {{Gerrit|e567b4c289}}, config revision is {{Gerrit|7a24870bc7}}
* 18:29 logmsgbot: hoo Synchronized wmf-config/CommonSettings.php: Enable IP user page creation on fawiki's Draft ns (duration: 00m 11s)
* 23:33 dpifke@deploy1002: Synchronized wmf-config/profiler.php: Revert excimer-k8s pipelines [[phab:T288165|T288165]] (duration: 01m 14s)
* 18:18 logmsgbot: ori Synchronized php-1.26wmf14/includes/site/SiteSQLStore.php: I0e5f2d3b2: Use CACHE_ACCEL for SiteLists if on HHVM (duration: 00m 12s)
* 23:31 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:37 logmsgbot: ori Synchronized wmf-config: Ib508a440: Undeploy VectorBeta (Task: T87489) (duration: 00m 13s)
* 23:25 dpifke@deploy1002: scap failed: average error rate on 3/6 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/83629bcb5560d11e61d3085c89dd9ed6 for details)
* 17:27 logmsgbot: krenair Synchronized w/static/images/project-logos/arbcom_enwiki.png: https://gerrit.wikimedia.org/r/#/c/225718/ (duration: 00m 12s)
* 23:23 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:21 logmsgbot: krenair Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/225705/ (duration: 00m 12s)
* 23:15 mforns: failed deployment of refinery (v0.1.17) to an-test-coord1001.eqiad.wmnet (scap error)
* 17:14 logmsgbot: krenair Synchronized w/static/images/project-logos/arbcom_enwiki.png: https://gerrit.wikimedia.org/r/#/c/225705/ (duration: 00m 12s)
* 23:15 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 05:10 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Jul 19 05:10:10 UTC 2015 (duration 10m 9s)
* 23:14 mforns@deploy1002: Finished deploy [analytics/refinery@a0f039b] (hadoop-test): Regular analytics weekly train TEST v0.1.17 [analytics/refinery@a0f039b] (duration: 13m 42s)
* 02:27 logmsgbot: LocalisationUpdate completed (1.26wmf14) at 2015-07-19 02:27:35+00:00
* 23:09 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|1437d99c1884c0695f02b81b724ec82a2bd3362e}}: Enable link recommendation frontend in dewiki and nlwiki ([[phab:T288420|T288420]], [[phab:T285254|T285254]]) (duration: 01m 06s)
* 02:23 logmsgbot: l10nupdate Synchronized php-1.26wmf14/cache/l10n: (no message) (duration: 07m 04s)
* 23:08 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:07 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Jul 19 02:07:15 UTC 2015 (duration 7m 14s)
* 23:07 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|8997ae5d0b998839853aed2b246f5c88fe9d83eb}}: Fix wgDiscussionTools_sourcemodetoolbar settings (duration: 01m 22s)
* 02:03 logmsgbot: LocalisationUpdate failed (1.26wmf14) at 2015-07-19 02:03:05+00:00
* 23:01 mforns@deploy1002: Started deploy [analytics/refinery@a0f039b] (hadoop-test): Regular analytics weekly train TEST v0.1.17 [analytics/refinery@a0f039b]
* 23:00 mforns@deploy1002: Finished deploy [analytics/refinery@a0f039b] (thin): Regular analytics weekly train THIN v0.1.17 [analytics/refinery@a0f039b] (duration: 00m 07s)
* 23:00 mforns@deploy1002: Started deploy [analytics/refinery@a0f039b] (thin): Regular analytics weekly train THIN v0.1.17 [analytics/refinery@a0f039b]
* 23:00 mforns@deploy1002: Finished deploy [analytics/refinery@a0f039b]: Regular analytics weekly train v0.1.17 [analytics/refinery@a0f039b] (duration: 17m 39s)
* 22:42 mforns@deploy1002: Started deploy [analytics/refinery@a0f039b]: Regular analytics weekly train v0.1.17 [analytics/refinery@a0f039b]
* 21:58 ejegg: switched Adyen to new Checkout integration
* 21:41 dduvall@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'blubberoid' for release 'production' .
* 21:38 dduvall@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'blubberoid' for release 'production' .
* 21:34 dduvall@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'blubberoid' for release 'staging' .
* 20:19 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:11 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:00 twentyafterfour@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.37.0-wmf.21  refs [[phab:T281161|T281161]]
* 19:20 brennen: gitlab1001: brief downtime for testing reconfiguration of cas3.session_duration
* 19:05 twentyafterfour@deploy1002: Finished scap: testwikis wikis to 1.37.0-wmf.21  refs [[phab:T281161|T281161]] (duration: 35m 53s)
* 19:03 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:56 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:40 ejegg: switched Adyen back to HPP integration
* 18:38 ejegg: updated payments-wiki from {{Gerrit|564daed816}} to {{Gerrit|c4d56178d0}}, switched Adyen to Checkout integration
* 18:30 twentyafterfour@deploy1002: Started scap: testwikis wikis to 1.37.0-wmf.21  refs [[phab:T281161|T281161]]
* 18:24 twentyafterfour: ran `scap prep 1.37.0-wmf.21` and `scap apply-patches --train 1.37.0-wmf.21` refs [[phab:T281162|T281162]]
* 18:05 XioNoX: re-pool eqsin-codfw link
* 16:18 dcausse@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'rdf-streaming-updater' for release 'main' .
* 16:14 dcausse@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'rdf-streaming-updater' for release 'main' .
* 16:08 hnowlan@deploy1002: Finished deploy [restbase/deploy@09156c2]: fix core Title redirect loop (duration: 16m 02s)
* 15:52 hnowlan@deploy1002: Started deploy [restbase/deploy@09156c2]: fix core Title redirect loop
* 14:30 jbond: enable puppet fleet wide after performing puppetdb maintenance [[phab:T263578|T263578]]
* 14:29 hashar: Restarting CI Jenkins for plugins upgrade
* 14:19 ottomata: merged change to service_auto_restart.pp that changes the way service names are matched to be more explicit.  tested in deployment prep and nothing bad happened.  Logging in case something bad does happen in prod. https://gerrit.wikimedia.org/r/c/operations/puppet/+/697605
* 14:09 otto@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 14:09 otto@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 14:07 otto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventgate-main' for release 'production' .
* 14:05 otto@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-analytics' for release 'production' .
* 14:05 otto@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-analytics' for release 'production' .
* 14:03 otto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventgate-analytics' for release 'production' .
* 14:03 otto@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'production' .
* 14:02 otto@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'production' .
* 14:02 jbond@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on puppetdb2002.codfw.wmnet with reason: puppetdb maintance - [[phab:T289779|T289779]]
* 14:02 jbond@cumin1001: START - Cookbook sre.hosts.downtime for 4:00:00 on puppetdb2002.codfw.wmnet with reason: puppetdb maintance - [[phab:T289779|T289779]]
* 14:02 jbond@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on puppetdb1002.eqiad.wmnet with reason: puppetdb maintance - [[phab:T289779|T289779]]
* 14:02 jbond@cumin1001: START - Cookbook sre.hosts.downtime for 4:00:00 on puppetdb1002.eqiad.wmnet with reason: puppetdb maintance - [[phab:T289779|T289779]]
* 14:01 otto@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventgate-analytics-external' for release 'production' .
* 14:00 otto@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'production' .
* 13:47 jbond: disable puppet fleet wide to perform puppetdb maintenance [[phab:T263578|T263578]]
* 13:41 otto@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'production' .
* 13:37 urbanecm: Start `mwscript extensions/GrowthExperiments/maintenance/refreshLinkRecommendations.php --wiki=nlwiki --verbose` in a tmux session at mwmaint2002
* 13:28 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps1010.eqiad.wmnet
* 13:06 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 13:04 jayme@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'eventgate-logging-external' for release 'production' .
* 12:59 urbanecm: [urbanecm@mwmaint2002 ~]$ sudo -u www-data kill 133282 # stop updateMenteeData.php at frwiki
* 12:52 jelto: run kubectl scale deployments.apps -n ci mediawiki-bruce --replicas=0 to stop ImagePulling and reduce io on kubestage1001
* 12:42 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 8:00:00 on planet1002.eqiad.wmnet with reason: known issue
* 12:42 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 5 days, 8:00:00 on planet1002.eqiad.wmnet with reason: known issue
* 11:38 jbond: sudo  gnt-instance modify --disk add:size=100G  puppetdb2002.codfw.wmnet [[phab:T263578|T263578]]
* 11:38 jbond: sudo gnt-instance modify --disk add:size=100G puppetdb1002.eqiad.wmnet [[phab:T263578|T263578]]
* 11:37 jbond: sudo  gnt-instance modify --disk add:size=100G  puppetdb2002.codfw.wmnet
* 11:35 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:33 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:31 urbanecm@deploy1002: Synchronized php-1.37.0-wmf.20/extensions/GrowthExperiments/maintenance/updateMenteeData.php: {{Gerrit|53a1856128edb4ec3a5ea8840fb6755a1703f7ac}}: updateMenteeData: Send timing to statsd ([[phab:T278971|T278971]]) (duration: 00m 57s)
* 11:11 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:09 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:07 urbanecm: EU B&C window done
* 11:06 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|eb482e3fa88a87166b990fd9b87d0ccbbf971290}}: Offer the DiscussionTools reply tool as opt-out setting at 21 phase 2 Wikipedias ([[phab:T288483|T288483]]) (duration: 00m 57s)
* 10:38 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on maps1010.eqiad.wmnet with reason: Resyncing from master
* 10:38 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on maps1010.eqiad.wmnet with reason: Resyncing from master
* 10:23 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps1010.eqiad.wmnet
* 10:23 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps1008.eqiad.wmnet
* 10:14 marostegui: Optimize huwiki.flaggedtemplates [[phab:T290057|T290057]]
* 10:11 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps1008.eqiad.wmnet
* 08:39 marostegui: Optimize plwiki.flaggedtemplates [[phab:T290057|T290057]]
* 08:18 marostegui: Optimize cewiki.flaggedtemplates [[phab:T290057|T290057]]
* 08:05 marostegui: Optimize plwiktionary.flaggedtemplates [[phab:T290057|T290057]]
* 07:44 marostegui: Optimize ruwiki.flaggedtemplates [[phab:T290057|T290057]]
* 07:01 XioNoX: drain eqsin-codfw link
* 06:56 marostegui@cumin1001: dbctl commit (dc=all): 'db2110 (re)pooling @ 100%: Slowly repool after reimage [[phab:T288803|T288803]]', diff saved to https://phabricator.wikimedia.org/P17113 and previous config saved to /var/cache/conftool/dbconfig/20210831-065600-root.json
* 06:40 marostegui@cumin1001: dbctl commit (dc=all): 'db2110 (re)pooling @ 75%: Slowly repool after reimage [[phab:T288803|T288803]]', diff saved to https://phabricator.wikimedia.org/P17112 and previous config saved to /var/cache/conftool/dbconfig/20210831-064056-root.json
* 06:25 marostegui@cumin1001: dbctl commit (dc=all): 'db2110 (re)pooling @ 50%: Slowly repool after reimage [[phab:T288803|T288803]]', diff saved to https://phabricator.wikimedia.org/P17111 and previous config saved to /var/cache/conftool/dbconfig/20210831-062553-root.json
* 06:10 marostegui@cumin1001: dbctl commit (dc=all): 'db2110 (re)pooling @ 25%: Slowly repool after reimage [[phab:T288803|T288803]]', diff saved to https://phabricator.wikimedia.org/P17110 and previous config saved to /var/cache/conftool/dbconfig/20210831-061049-root.json
* 06:06 marostegui: Rename flaggedrevs_stats2 and flaggedrevs_stats on dewiki codfw [[phab:T289050|T289050]]
* 05:55 marostegui@cumin1001: dbctl commit (dc=all): 'db2110 (re)pooling @ 10%: Slowly repool after reimage [[phab:T288803|T288803]]', diff saved to https://phabricator.wikimedia.org/P17109 and previous config saved to /var/cache/conftool/dbconfig/20210831-055546-root.json
* 03:39 eileen: civicrm revision changed from {{Gerrit|e89504652a}} to {{Gerrit|718aa9cad3}}, config revision is {{Gerrit|cb0a008cad}}
* 02:33 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:31 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:09 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:08 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:04 eileen: tools revision changed from {{Gerrit|14e4125f73}} to {{Gerrit|1d67c52c12}}


== 2015-07-18 ==
== 2021-08-30 ==
* 20:58 logmsgbot: legoktm Synchronized wmf-config/InitialiseSettings-labs.php: labs only (duration: 00m 12s)
* 23:14 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:44 YuviPanda: restarted etherpad
* 23:13 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:56 akosiaris: reinstall labsdb1004
* 23:11 urbanecm: Evening B&C done
* 16:36 paravoid: Ganglia is up :)
* 23:11 urbanecm@deploy1002: Synchronized php-1.37.0-wmf.20/extensions/GrowthExperiments/includes/Specials/SpecialMentorDashboard.php: {{Gerrit|9e2264a0c9a48548da4795b2a5b9d7275d254ac7}}: Instrument Special:MentorDashboard ([[phab:T289369|T289369]]) (duration: 00m 55s)
* 16:09 Krenair: Ganglia seems down
* 23:08 urbanecm@deploy1002: Synchronized php-1.37.0-wmf.20/extensions/GrowthExperiments/includes/Specials/SpecialHomepage.php: {{Gerrit|9e2264a0c9a48548da4795b2a5b9d7275d254ac7}}: Instrument Special:MentorDashboard ([[phab:T289369|T289369]]) (duration: 00m 57s)
* 15:42 Krenair: Doing T44180
* 21:56 eileen: civicrm revision changed from {{Gerrit|13bf3a02df}} to {{Gerrit|e89504652a}}, config revision is {{Gerrit|cb0a008cad}}
* 05:28 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Jul 18 05:28:25 UTC 2015 (duration 28m 24s)
* 19:59 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:34 logmsgbot: LocalisationUpdate completed (1.26wmf14) at 2015-07-18 02:34:29+00:00
* 19:57 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:30 logmsgbot: l10nupdate Synchronized php-1.26wmf14/cache/l10n: (no message) (duration: 07m 19s)
* 19:52 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|9a92e2ae7526717a0a42b825a34b4595e75a544b}}: Fix mediawiki.mentor_dashboard.visits definition (duration: 00m 56s)
* 02:07 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Jul 18 02:07:38 UTC 2015 (duration 7m 37s)
* 19:08 tgr: morning deploys done for real
* 02:03 logmsgbot: LocalisationUpdate failed (1.26wmf14) at 2015-07-18 02:03:29+00:00
* 19:06 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:715579{{!}}Fix schema definition for mediawiki.mentor_dashboard.visit (T289369)]] (duration: 00m 56s)
* 00:49 ejegg: restored recurring globalcollect batch size of 250
* 19:05 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:09 ejegg: updated civicrm from 78de1b9b74934984af3099afe9192fa53011bdaa to 292ad137f6b3ffc818a3bd617ca4f335931091f3
* 19:03 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:49 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: Revert: [[gerrit:715529{{!}}Add mediawiki.mentor_dashboard.visit schema (T289369)]] (duration: 00m 26s)
* 18:48 tgr@deploy1002: Scap failed!: 5/6 canaries failed their endpoint checks(https://en.wikipedia.org)
* 18:43 tgr: morning deploys done
* 18:43 tgr@deploy1002: scap failed: average error rate on 3/6 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/83629bcb5560d11e61d3085c89dd9ed6 for details)
* 18:41 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:38 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:26 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:24 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:22 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:715568{{!}}GrowthExperiments: Enable link recommendation for dewiki and nlwiki (T288420 T285254)]] (duration: 00m 56s)
* 18:18 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:16 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:14 tgr@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:714548{{!}}GrowthExperiments: Switch image recommendations flag off (T288797)]] (duration: 00m 57s)
* 17:44 ryankemper: [WDQS Deploy] Test query passing on `query.wikidata.org` and icinga looks good. This deploy is done.
* 17:12 ryankemper: [WDQS Deploy] Restarting `wdqs-categories` across lvs-managed hosts, one node at a time: `sudo -E cumin -b 1 'A:wdqs-all and not A:wdqs-test' 'depool && sleep 45 && systemctl restart wdqs-categories && sleep 45 && pool'`
* 17:12 ryankemper: [WDQS Deploy] Restarted `wdqs-categories` across both test hosts simultaneously: `sudo -E cumin 'A:wdqs-test' 'systemctl restart wdqs-categories'`
* 17:12 ryankemper: [WDQS Deploy] Restarted `wdqs-updater` across all hosts, 4 hosts at a time: `sudo -E cumin -b 4 'A:wdqs-all' 'systemctl restart wdqs-updater'`
* 17:10 ryankemper@deploy1002: Finished deploy [wdqs/wdqs@a17833c]: 0.3.84 (duration: 08m 16s)
* 17:04 ryankemper: [WDQS Deploy] Tests passing following deploy of `0.3.84` on canary `wdqs1003`; proceeding to rest of fleet
* 17:02 ryankemper@deploy1002: Started deploy [wdqs/wdqs@a17833c]: 0.3.84
* 17:02 ryankemper: [WDQS Deploy] Gearing up for deploy of wdqs `0.3.84`. Pre-deploy tests passing on canary `wdqs1003`
* 17:00 ryankemper: [[phab:T289483|T289483]] Pooled `wdqs1013`
* 16:36 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti1024.eqiad.wmnet with reason: REIMAGE
* 16:34 dzahn@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti1024.eqiad.wmnet with reason: REIMAGE
* 16:20 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on maps1008.eqiad.wmnet with reason: Resyncing from master
* 16:20 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 1 day, 0:00:00 on maps1008.eqiad.wmnet with reason: Resyncing from master
* 16:20 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps1008.eqiad.wmnet
* 16:20 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps1007.eqiad.wmnet
* 16:16 sukhe: running authdns-update for Gerrit 715499
* 14:44 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 14:21 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on maps1007.eqiad.wmnet with reason: Resyncing from master
* 14:21 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on maps1007.eqiad.wmnet with reason: Resyncing from master
* 14:21 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps1007.eqiad.wmnet
* 14:21 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps1006.eqiad.wmnet
* 14:18 akosiaris@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'rdf-streaming-updater' for release 'main' .
* 14:02 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:00 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:55 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|b17015395cc592e021a4ca8ce6f81b699bb77381}}:  Growth mentor dashboard: Enable beta features only on beta wikis ([[phab:T280307|T280307]]) (duration: 00m 55s)
* 13:53 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|f1a178e1d4d7c98a1988da68982f97848f390c68}}: knwiki: Disable wmgNewUserMessageOnAutoCreate ([[phab:T289333|T289333]]) (duration: 00m 57s)
* 13:52 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:48 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:48 urbanecm@deploy1002: Synchronized wmf-config/CommonSettings.php: {{Gerrit|6fbcc93f429ff3fbca98aeecdee4f33f022ca7c3}}: Add missing edit*protected rights to $wgAvailableRights (duration: 00m 56s)
* 12:12 Amir1: ladsgroup@mwmaint2002:~$ mwscript extensions/WikimediaMaintenance/filebackend/setZoneAccess.php --wiki=jvwikisource --backend=local-multiwrite ([[phab:T289860|T289860]])
* 11:52 jelto@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 11:51 jelto@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
* 11:48 jelto@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 11:47 jelto@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 11:31 jelto@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 11:30 jelto@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 10:55 jelto@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 10:53 jelto@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 10:21 dcausse@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'rdf-streaming-updater' for release 'main' .
* 09:51 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 09:46 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 09:34 ladsgroup@deploy1002: Synchronized wmf-config/InitialiseSettings.php: Config: [[gerrit:703476{{!}}Set $wgIncludejQueryMigrate to false in group0 (T280944)]] (duration: 00m 57s)
* 09:01 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on maps1006.eqiad.wmnet with reason: Resyncing from master
* 09:01 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 5:00:00 on maps1006.eqiad.wmnet with reason: Resyncing from master
* 09:01 hnowlan@cumin1001: END (FAIL) - Cookbook sre.postgresql.postgres-init (exit_code=99)
* 09:00 hnowlan@cumin1001: START - Cookbook sre.postgresql.postgres-init
* 08:59 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps1006.eqiad.wmnet
* 08:57 hnowlan@puppetmaster1001: conftool action : set/pooled=yes; selector: name=maps1005.eqiad.wmnet
* 08:57 godog: +100G to prometheus/global in codfw
* 08:04 vgutierrez: pool cp2027 - [[phab:T289908|T289908]]
* 06:53 elukey: drop an-airflow1001's old airflow logs to free space on the almost-full root partition
* 06:38 godog: more weight to ms-be20[62-65] - [[phab:T288458|T288458]]
* 05:44 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2110.codfw.wmnet with reason: REIMAGE
* 05:42 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db2110.codfw.wmnet with reason: REIMAGE
* 05:23 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2110 for reimage [[phab:T288803|T288803]]', diff saved to https://phabricator.wikimedia.org/P17105 and previous config saved to /var/cache/conftool/dbconfig/20210830-052336-marostegui.json


== 2015-07-17 ==
== 2021-08-29 ==
* 21:51 ejegg: updated civicrm from 0acac037ce0c9a64e94a475463deb2d47e84193a to 78de1b9b74934984af3099afe9192fa53011bdaa
* 00:15 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:53 matt_flaschen: Manually fixed issue in mediawikiwiki LQT thread table with rename of Ecliptica to Entropy. https://phabricator.wikimedia.org/T106122#1461380
* 00:13 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:03 hashar: stopping Zuul to get rid of a faulty registered function "build:Global-Dev Dashboard Data". Job is gone already.
* 17:50 ejegg: updated civicrm from fa724dd2e2e69545d81015c943cb7f52cf6de8e1 to 0acac037ce0c9a64e94a475463deb2d47e84193a
* 16:49 gwicke: restarted restbase on restbase1001
* 15:04 gwicke: restarted RB thinner scripts, see https://phabricator.wikimedia.org/T105706
* 14:10 urandom: restart restbase service on restbase1006
* 14:07 urandom: restart restbase service on restbase1003
* 14:05 urandom: restart restbase service on restbase1002
* 13:56 godog: apache2ctl graceful on fluorine antimony argon caesium helium
* 13:43 godog: apache2ctl graceful on netmon1001
* 11:24 hashar: rebooted labnodepool1001.eqiad.wmnet. Accidentally deleted the whole /dev, which froze everything :(
* 10:21 _joe_: repooling mw1158
* 09:08 _joe_: depooling mw1158, repooling mw1156,7
* 07:51 _joe_: depooled mw1156,7 for reimaging
* 04:53 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Jul 17 04:53:56 UTC 2015 (duration 53m 55s)
* 03:31 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1030 (duration: 00m 12s)
* 02:30 logmsgbot: LocalisationUpdate completed (1.26wmf14) at 2015-07-17 02:30:03+00:00
* 02:26 logmsgbot: l10nupdate Synchronized php-1.26wmf14/cache/l10n: (no message) (duration: 05m 55s)
* 02:07 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Fri Jul 17 02:07:22 UTC 2015 (duration 7m 20s)
* 02:03 logmsgbot: LocalisationUpdate failed (1.26wmf14) at 2015-07-17 02:03:12+00:00
* 01:30 mutante: git pull origin on strontium


== 2015-07-16 ==
== 2021-08-28 ==
* 21:27 ori: bounced nutcracker on mw1139 as well. hashar noticed flood of errors from these hosts on https://logstash.wikimedia.org/#/dashboard/elasticsearch/mediawiki-errors . lack of monitoring / alerts is troubling.
* 23:26 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:26 ori: bounced nutcracker on mw1128 and mw1134
* 23:24 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:50 mutante: iegreview tool - short maintenance downtime
* 09:12 elukey: powercycle cp2027 - OEM event registered in racadm getsel, no tty, no ssh
* 19:39 YuviPanda: imported aspell-id from ubuntu to jessie-wikimedia - needed by ores, simple package that I am not sure why it is not in jessie
* 09:11 elukey@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp2027.codfw.wmnet
* 19:20 logmsgbot: twentyafterfour Synchronized php-1.26wmf14/includes/db/LoadMonitor.php: Deploying Hotfix for T105373 (duration: 00m 13s)
* 18:40 logmsgbot: twentyafterfour rebuilt wikiversions.cdb and synchronized wikiversions files: all wikis to 1.26wmf14
* 18:26 ejegg: changed batch size from 250 to 1 in RGC jenkins job
* 18:22 ejegg: updated civicrm from 24e0fc854433ea4982e94a0fd2f8bdad8f8dcad7 to fa724dd2e2e69545d81015c943cb7f52cf6de8e1
* 16:56 Jeff_Green: authdns update to rename lutetium.wm.o
* 16:08 hashar_: kept nodepool stopped on labnodepool1001.eqiad.wmnet because it spams the cron log
* 15:57 logmsgbot: demon Synchronized multiversion/MWMultiVersion.php: prod no-op, beta change (duration: 00m 13s)
* 15:54 logmsgbot: krenair Synchronized wmf-config/InitialiseSettings-labs.php: https://gerrit.wikimedia.org/r/#/c/224975/ (duration: 00m 12s)
* 15:27 logmsgbot: thcipriani Synchronized php-1.26wmf14/extensions/Math/MathMathML.php: SWAT: Fix: Undefined variable passed hook [[gerrit:225058]] (duration: 00m 12s)
* 15:03 ejegg: updated payments from 4ca95d55a9745c05ccfbb16ee6f23a6f75328824 to ebb1a9e52172a4793cf5feb33220b4d7edfcad70
* 12:21 dcausse: es1.6 upgrade: all done
* 11:32 dcausse: restarted gmond on elastic1024
* 11:06 mobrovac: citoid deploying ff90869
* 10:56 dcausse: es1.6 upgrade: upgrade elastic1031
* 10:25 mobrovac: citoid rolled back to ffbaf6d
* 10:10 mobrovac: citoid deploying 5aeb0fc
* 10:05 dcausse: es1.6 upgrade: upgrade elastic1030
* 09:38 dcausse: es1.6 upgrade: upgrade elastic1029
* 08:42 dcausse: es1.6 upgrade: upgrade elastic1028
* 07:31 dcausse: es1.6 upgrade: upgrade elastic1027
* 07:22 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Jul 16 07:22:49 UTC 2015 (duration 22m 48s)
* 05:53 dcausse: es1.6 upgrade: upgrade elastic1026
* 05:31 logmsgbot: krenair Synchronized wmf-config/interwiki.cdb: Updating interwiki cache (duration: 00m 12s)
* 05:24 logmsgbot: krenair Synchronized php-1.26wmf14/extensions/WikimediaMaintenance/dumpInterwiki.php: https://gerrit.wikimedia.org/r/#/c/225008/ (duration: 00m 13s)
* 04:38 logmsgbot: krenair Synchronized php-1.26wmf13/extensions/WikimediaMaintenance/dumpInterwiki.php: https://gerrit.wikimedia.org/r/#/c/225006/ (duration: 00m 13s)
* 03:54 manybubbles: es1.6 upgrade: upgrade elastic1025
* 03:19 logmsgbot: LocalisationUpdate completed (1.26wmf14) at 2015-07-16 03:19:37+00:00
* 03:13 logmsgbot: l10nupdate Synchronized php-1.26wmf14/cache/l10n: (no message) (duration: 10m 23s)
* 02:46 logmsgbot: LocalisationUpdate completed (1.26wmf13) at 2015-07-16 02:46:03+00:00
* 02:43 manybubbles: es1.6 upgrade: upgrade elastic1024
* 02:39 logmsgbot: l10nupdate Synchronized php-1.26wmf13/cache/l10n: (no message) (duration: 10m 50s)
* 02:07 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Thu Jul 16 02:07:55 UTC 2015 (duration 7m 54s)
* 02:03 logmsgbot: LocalisationUpdate failed (1.26wmf14) at 2015-07-16 02:03:31+00:00
* 02:03 logmsgbot: LocalisationUpdate failed (1.26wmf13) at 2015-07-16 02:03:30+00:00
* 01:41 logmsgbot: krenair Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/214981/ (duration: 00m 12s)
* 01:22 manybubbles: es1.6 upgrade: upgrade elastic1023


== 2015-07-15 ==
== 2021-08-27 ==
* 23:36 logmsgbot: krenair Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/221885/ (duration: 00m 13s)
* 16:46 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on maps1005.eqiad.wmnet with reason: Resyncing from master
* 23:22 logmsgbot: krenair Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/209840/ (duration: 00m 12s)
* 16:46 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 3 days, 0:00:00 on maps1005.eqiad.wmnet with reason: Resyncing from master
* 23:16 logmsgbot: krenair Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/194075/ (duration: 00m 12s)
* 14:50 akosiaris: stop flink on staging cluster to verify some IOPS starvation issues
* 23:10 logmsgbot: krenair Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/224799/ (duration: 00m 13s)
* 14:46 akosiaris@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
* 23:09 logmsgbot: krenair Synchronized docroot/noc: https://gerrit.wikimedia.org/r/#/c/175755/ (duration: 00m 13s)
* 14:45 akosiaris@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
* 23:06 logmsgbot: krenair Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/175755/ (duration: 00m 12s)
* 14:44 akosiaris@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
* 22:23 csteipp: deploy patch for T105305 to wmf13/14
* 14:44 akosiaris@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
* 22:06 logmsgbot: krenair Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/223843/ (duration: 00m 12s)
* 14:44 akosiaris@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
* 21:59 logmsgbot: krenair Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/222584/ (duration: 00m 13s)
* 14:44 akosiaris@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
* 21:54 manybubbles: es1.6 upgrade: upgrade elastic1022
* 14:39 hnowlan@puppetmaster1001: conftool action : set/pooled=no; selector: name=maps1005.eqiad.wmnet
* 21:37 manybubbles: es1.6 upgrade: upgrade elastic1021
* 14:38 hnowlan@cumin1001: END (FAIL) - Cookbook sre.postgresql.postgres-init (exit_code=99)
* 21:09 logmsgbot: twentyafterfour Synchronized php-1.26wmf14: Really Sync If0237cdd0d66634d75b2bab8bc4292c0f3ef75ef this time (duration: 01m 32s)
* 14:37 hnowlan@cumin1001: START - Cookbook sre.postgresql.postgres-init
* 20:41 bblack: restarted salt-master service on palladium
* 14:30 hnowlan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on maps1005.eqiad.wmnet with reason: Resyncing from master
* 20:33 bblack: globally cleaning up dangling symlinks left in /etc/certs from before Id7d2447 via salted 'find /etc/ssl/certs -type l -xtype l|xargs rm'
* 14:30 hnowlan@cumin1001: START - Cookbook sre.hosts.downtime for 4:00:00 on maps1005.eqiad.wmnet with reason: Resyncing from master
* 20:30 logmsgbot: twentyafterfour Synchronized php-1.26wmf14: Sync If0237cdd0d66634d75b2bab8bc4292c0f3ef75ef (revert Count API module instantiations and Hook runs) (duration: 01m 48s)
* 13:48 dzahn@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'miscweb' for release 'main' .
* 20:20 manybubbles: es1.6 upgrade: upgrade elastic1020
* 12:49 mutante: rsynced /srv/org/wikimedia/racktables from miscweb1002 to miscweb2002 ([[phab:T269746|T269746]])
* 20:18 RoanKattouw: Running FlowCreateMentionTemplate.php on all Flow wikis
* 12:04 topranks: removing peering to Wave Division Holdings / AS11404 at Equinix Chicago cr2-eqord, AS no longer on exchange.
* 20:06 logmsgbot: twentyafterfour rebuilt wikiversions.cdb and synchronized wikiversions files: group1 wikis to 1.26wmf14
* 10:56 akosiaris: sudo cumin 'mw*' 'ip ro ls dev docker0 && sysctl net.ipv4.ip_forward=0' to clear up the docker remnants of the dragonfly evaluation. [[phab:T286054|T286054]]
* 19:50 ejegg: updated civicrm from e29cc5f20b5069afcaff794e628596c1f70d69a3 to 24e0fc854433ea4982e94a0fd2f8bdad8f8dcad7
* 10:31 godog: bounce logstash on logstash1007
* 19:06 logmsgbot: krenair Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/224408/ (duration: 00m 12s)
* 10:22 elukey: fallback codfw ores to rdb2007 after maintenance
* 19:01 logmsgbot: krenair Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/222792/ (duration: 00m 13s)
* 10:18 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb2007.codfw.wmnet
* 19:00 logmsgbot: krenair Synchronized wmf-config/wikitech.php: https://gerrit.wikimedia.org/r/#/c/222792/ (duration: 00m 12s)
* 10:12 jiji@cumin1001: START - Cookbook sre.hosts.reboot-single for host rdb2007.codfw.wmnet
* 18:58 logmsgbot: krenair Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/222776/ (duration: 00m 13s)
* 09:49 elukey: restart ores uwsgi/celery workers to failover rdb2007 to rdb2008 (and ease the reboot of rdb2007)
* 18:57 logmsgbot: krenair Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/222776/ (duration: 00m 13s)
* 09:33 topranks: Running homer against mr1-ulsfo to force OOB interface to 100Mb/full-duplex - [[phab:T288343|T288343]]
* 18:40 ejegg: updated civicrm from f4219bc8eca5e4db633da07b6ac9e2505cfbae16 to e29cc5f20b5069afcaff794e628596c1f70d69a3
* 09:25 cmooney@cumin1001: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) homer to cumin2002.codfw.wmnet with reason: Update to expose int type from Netbox - cmooney@cumin1001
* 18:39 logmsgbot: krenair Synchronized wmf-config/throttle.php: throttle labswiki account creations from hackathon at 500 (duration: 00m 12s)
* 09:25 cmooney@cumin1001: START - Cookbook sre.deploy.python-code homer to cumin2002.codfw.wmnet with reason: Update to expose int type from Netbox - cmooney@cumin1001
* 18:39 logmsgbot: twentyafterfour Finished scap: group0 to 1.26wmf14 (duration: 32m 34s)
* 09:23 cmooney@deploy1002: Finished deploy [homer/deploy@8183056]: Homer update exposing interface type from Netbox - [[phab:T288343|T288343]] (duration: 01m 28s)
* 18:21 manybubbles: es1.6 upgrade: upgrading elastic1019
* 09:21 cmooney@deploy1002: Started deploy [homer/deploy@8183056]: Homer update exposing interface type from Netbox - [[phab:T288343|T288343]]
* 18:20 Jeff_Green: authdns-update shifting to service-oriented hostnames for fundraising cluster
* 08:05 tstarling@deploy1002: Synchronized php-1.37.0-wmf.20/extensions/SecurePoll/cli/wm-scripts/sendMail.php: (no justification provided) (duration: 00m 56s)
* 18:06 logmsgbot: twentyafterfour Started scap: group0 to 1.26wmf14
* 07:49 jayme: stopped kube-apiserver on kubestagemaster2001 for testing
* 17:55 ejegg: updated civicrm from 6560cefa8d7e68e35e30b310d6691ab57798a4c9 to f4219bc8eca5e4db633da07b6ac9e2505cfbae16
* 07:49 jayme: stopped kube-apiserver on kubestage2001 for testing
* 17:34 Jeff_Green: authdns-update to remove boron.wm.o
* 07:00 godog: bounce logstash on logstash1008
* 17:22 logmsgbot: krenair Synchronized wmf-config/CommonSettings.php: partially revert https://gerrit.wikimedia.org/r/#/c/224420/1/wmf-config/CommonSettings.php - doesn't quite work (duration: 00m 13s)
* 06:43 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:17 Jeff_Green: authdns-update to remove aluminium, also lanthanum by preexisting commit
* 06:41 tstarling@deploy1002: Synchronized php-1.37.0-wmf.20/extensions/SecurePoll/cli/wm-scripts/sendMail.php: (no justification provided) (duration: 00m 56s)
* 16:45 andrewbogott: rebooting labvirt1005
* 06:41 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:43 mutante: accepting unaccepted salt keys for ganeti VMs: planet, bromine, krypton
* 00:46 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:39 mutante: krypton - signing puppet cert, initial run
* 00:44 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:26 andrewbogott: woo, first try!
* 00:44 legoktm@deploy1002: Synchronized php-1.37.0-wmf.20/extensions/PageTriage/: Revert backbone.js and underscore.js updates ([[phab:T289825|T289825]]) (duration: 01m 06s)
* 16:23 andrewbogott: trying to kill labvirt1005 via repeated instance suspend/resume
* 16:04 logmsgbot: krenair Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/224420/ (duration: 00m 12s)
* 16:03 logmsgbot: krenair Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/224420/ (duration: 00m 12s)
* 16:01 logmsgbot: krenair Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/224808/ (duration: 00m 12s)
* 15:58 logmsgbot: krenair Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/222581/ (duration: 00m 11s)
* 15:35 logmsgbot: krenair Synchronized database lists: (no message) (duration: 00m 11s)
* 15:29 logmsgbot: krenair Synchronized docroot/noc/createTxtFileSymlinks.sh: https://gerrit.wikimedia.org/r/#/c/139326/ (duration: 00m 12s)
* 15:27 logmsgbot: krenair Synchronized wmf-config/CommonSettings.php: https://gerrit.wikimedia.org/r/#/c/139326/ (duration: 00m 12s)
* 15:20 logmsgbot: krenair Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/139326/ (duration: 00m 11s)
* 14:33 logmsgbot: legoktm Synchronized wmf-config/CommonSettings.php: Set $wgCentralAuthStrict = true; (duration: 00m 12s)
* 14:22 legoktm: sync failed on mw1090.eqiad.wmnet, read only filesystem
* 14:20 logmsgbot: legoktm Synchronized php-1.26wmf13/extensions/CentralAuth/includes/CentralAuthPlugin.php: Add log entry for $wgCentralAuthStrict failures if SULMigration is enabled (duration: 00m 13s)
* 13:55 dcausse: es1.6 upgrade: upgrade elastic1018
* 13:24 springle: entry below not mw1216 fault, but r/o filesystem error on mw1090
* 13:15 springle: sync-common on mw1216 after sync-file from tin failed non-zero exit status 12
* 13:12 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: depool db1022 T105879 (duration: 00m 12s)
* 11:43 dcausse: es1.6 upgrade: upgrade elastic1017
* 08:27 dcausse: es1.6 upgrade: upgrade elastic1016
* 06:31 dcausse: es1.6 upgrade: upgrade elastic1015
* 05:40 dcausse: es1.6 upgrade: upgrade elastic1014
* 05:10 springle: db1030 busy removing table partitioning
* 04:28 manybubbles: es1.6 upgrade: lowered the shard transfer settings back to our normal rate. going to bed.
* 04:12 manybubbles: es1.6 upgrade: upgrade elastic1013
* 03:49 springle: upgrade db1030 trusty
* 03:29 manybubbles: es1.6 upgrade: upgrade elastic1012
* 03:14 logmsgbot: LocalisationUpdate completed (1.26wmf13) at 2015-07-15 03:14:21+00:00
* 03:10 logmsgbot: reedy Synchronized php-1.26wmf13/cache/l10n: (no message) (duration: 13m 32s)
* 03:03 manybubbles: es1.6 upgrade: raised limits on shard migration rate - should speed up the restart. we should lower it before we do restarts during europe's morning
* 02:10 Reedy: Running LU manually to see what's wrong with it
* 02:07 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Wed Jul 15 02:07:48 UTC 2015 (duration 7m 47s)
* 02:02 logmsgbot: LocalisationUpdate failed (1.26wmf13) at 2015-07-15 02:02:55+00:00


== 2015-07-14 ==
== 2021-08-26 ==
* 23:46 manybubbles: es1.6 upgrade: upgraded elastic1011
* 22:06 legoktm: restarted mailman3-web on lists1001 ([[phab:T289798|T289798]])
* 23:22 bblack: updating nginx to 1.9.3-1+wmf1 on cp*
* 19:09 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:17 bblack: reprepro: nginx for jessie-wikimedia/main bumped to 1.9.3-1+wmf1
* 19:08 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:22 ejegg: updated civicrm from 04efc7d5c7bbb068f907125f2184692aee676123 to 6560cefa8d7e68e35e30b310d6691ab57798a4c9
* 19:02 dancy@deploy1002: rebuilt and synchronized wikiversions files: group2 wikis to 1.37.0-wmf.20
* 21:29 Reedy: mw1090 fs is ro
* 18:59 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:28 logmsgbot: reedy rebuilt wikiversions.cdb and synchronized wikiversions files: Fix testwiki
* 18:54 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 21:05 _joe|AFK: depooling mw1090, ext4 errors in syslog, filesystem mounted read-only
* 18:26 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 21:01 logmsgbot: twentyafterfour Synchronized wmf-config/CommonSettings.php: revert LCStoreStaticArray (duration: 00m 12s)
* 18:24 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:59 logmsgbot: twentyafterfour Finished scap: testwiki to 1.26wmf14 and rebuild localization cache (duration: 72m 45s)
* 18:19 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|66717bc039f40336144dcc0dfd97ff5331b418e9}}: Install Extension Quiz on ja.wikibooks ([[phab:T289383|T289383]]) (duration: 01m 05s)
* 20:42 bblack: undoing LCStoreStaticArray because appservers look unhealthy, using ori's command: 'salt -G deployment_target:scap/scap cmd.run "rm /etc/lcstore"'
* 18:17 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:46 logmsgbot: twentyafterfour Started scap: testwiki to 1.26wmf14 and rebuild localization cache
* 18:16 sukhe@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on durum1001.eqiad.wmnet with reason: testing out durum
* 19:23 manybubbles: es1.6 step iforget: upgrade elasticsearch on elastic1010
* 18:16 sukhe@cumin1001: START - Cookbook sre.hosts.downtime for 0:30:00 on durum1001.eqiad.wmnet with reason: testing out durum
* 17:41 mutante: terbium:  /usr/local/bin/foreachwiki extensions/Echo/maintenance/processEchoEmailBatch.php
* 18:15 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:10 dcausse: es1.6 step 10: upgrade elastic1009
* 18:15 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|cde88918b73628f2eaaff919ddb869b4dc2c93c6}}: Install Extension Quiz on fa.wikibooks ([[phab:T289381|T289381]]) (duration: 01m 07s)
* 16:23 mutante: bromine - apt-get upgrade
* 18:09 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:08 logmsgbot: manybubbles Synchronized php-1.26wmf13/extensions/UniversalLanguageSelector/: SWAT add some hooks to extension.json (duration: 00m 13s)
* 18:07 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:34 gwicke: started RESTBase revision thin-out script for html and data-parsoid on wikimedia domains
* 18:03 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|d4340e9c18468d14885c8ced87f1e014a3481f2a}}: Finalize Event Platform migration of EchoEmail and EchoInteraction ([[phab:T287210|T287210]]) (duration: 01m 07s)
* 14:01 dcausse: es1.6 step 9: upgrade elastic1008
* 17:40 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:48 _joe_: reimaging mw1155
* 17:38 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 12:17 ori: Logging a message at https://wikitech.wikimedia.org/wiki/Server_Admin_Log.
* 17:31 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:28 dcausse: es1.6 step 8: upgrade elastic1007
* 17:30 dancy@deploy1002: Synchronized php: group1 wikis to 1.37.0-wmf.20 (duration: 01m 05s)
* 11:25 _joe_: repooling mw1154 with HHVM
* 17:30 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:12 _joe_: stopped poolcounter on mw1154
* 17:29 dancy@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.37.0-wmf.20
* 10:06 _joe_: reimaging mw1154
* 17:26 dancy@deploy1002: Synchronized php-1.37.0-wmf.20/includes/page/PageStore.php: Backport: [[gerrit:714864{{!}}PageStore: Pass query flags to getPageById() too (T289717 T195069)]] (duration: 01m 05s)
* 07:49 dcausse: es1.6 step 7: upgrade elastic1006
* 16:27 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 07:09 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Jul 14 07:09:10 UTC 2015 (duration 9m 9s)
* 16:26 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 06:48 dcausse: es1.6 step 6: upgrade elastic1005
* 15:56 sukhe: ran homer for Gerrit 715007: Set up BGP peering to durum1001 in eqiad
* 06:41 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: I9c9bf0f4: Use LCStoreStaticArray unconditionally (duration: 03m 02s)
* 15:41 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 05:26 ori: Cleaned up now-unused hhbc files from /run/hhvm/cache on job runners
* 15:40 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 04:58 ori: Enabling LCStoreStaticArray in production. May be reverted by running: 'salt -G deployment_target:scap/scap cmd.run "rm /etc/lcstore"' on palladium.
* 14:24 Amir1: start of mwscript extensions/FlaggedRevs/maintenance/pruneRevData.php --wiki=plwiki --prune --batch-size=10 --sleep=2 ([[phab:T289249|T289249]])
* 04:48 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: Follow-up for Ieb62ee050e: allow LCStoreStaticArray in server mode (duration: 00m 13s)
* 13:19 klausman@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1004.eqiad.wmnet
* 02:35 logmsgbot: LocalisationUpdate completed (1.26wmf13) at 2015-07-14 02:35:21+00:00
* 13:15 klausman@cumin1001: START - Cookbook sre.hosts.reboot-single for host ml-serve1004.eqiad.wmnet
* 02:31 logmsgbot: l10nupdate Synchronized php-1.26wmf13/cache/l10n: (no message) (duration: 07m 27s)
* 13:04 klausman@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1003.eqiad.wmnet
* 02:07 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Tue Jul 14 02:07:32 UTC 2015 (duration 7m 30s)
* 12:59 klausman@cumin1001: START - Cookbook sre.hosts.reboot-single for host ml-serve1003.eqiad.wmnet
* 02:02 logmsgbot: LocalisationUpdate failed (1.26wmf13) at 2015-07-14 02:02:33+00:00
* 12:57 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 01:22 logmsgbot: springle Synchronized wmf-config/db-eqiad.php: repool db1037; depool db1030 (duration: 00m 13s)
* 12:56 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:21 sukhe: running puppet initial run on durum1001.eqiad.wmnet - [[phab:T289536|T289536]]
* 11:50 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:48 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:42 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:40 Lucas_WMDE: EU backport+config window done
* 11:40 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:39 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.37.0-wmf.19/extensions/Math/src/HookHandlers/ParserHooksHandler.php: Backport: [[gerrit:714853{{!}}Allow rendering of <nowiki><math>0</math></nowiki> (T288846)]] (duration: 01m 04s)
* 11:35 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.37.0-wmf.20/extensions/Math/src/HookHandlers/ParserHooksHandler.php: Backport: [[gerrit:714854{{!}}Allow rendering of <nowiki><math>0</math></nowiki> (T288846)]] (duration: 01m 05s)
* 11:32 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host durum1001.eqiad.wmnet
* 11:21 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm for new host durum1001.eqiad.wmnet
* 11:20 nikerabbit@deploy1002: Synchronized wmf-config/CommonSettings.php: Config: [[gerrit:714770{{!}}Rename wgTranslateBlacklist to wgTranslateDisabledTargetLanguages]] (duration: 01m 05s)
* 11:13 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:12 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:09 vgutierrez: rolling restart of varnishkafka-statsv - [[phab:T289618|T289618]]
* 10:07 vgutierrez: disable puppet on cp-text to merge {{Gerrit|I52cf2a573980e33487d1f05f19b192ae7d13d717}} - [[phab:T286038|T286038]]
* 10:06 klausman@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1001.eqiad.wmnet
* 10:01 klausman@cumin1001: START - Cookbook sre.hosts.reboot-single for host ml-serve1001.eqiad.wmnet
* 09:36 klausman@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1002.eqiad.wmnet
* 09:30 klausman@cumin1001: START - Cookbook sre.hosts.reboot-single for host ml-serve1002.eqiad.wmnet
* 09:24 klausman@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve-ctrl1001.eqiad.wmnet
* 09:21 elukey: elukey@kafka-main1001:~$ kafka acls --add --allow-principal User:CN=varnishkafka --producer --topic statsv - [[phab:T286038|T286038]]
* 09:21 klausman@cumin1001: START - Cookbook sre.hosts.reboot-single for host ml-serve-ctrl1001.eqiad.wmnet
* 09:20 klausman@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-etcd1003.eqiad.wmnet
* 09:17 elukey: restart varnishkafka-statsv on cp4032 to pick up TLS settings
* 09:15 klausman@cumin1001: START - Cookbook sre.hosts.reboot-single for host ml-etcd1003.eqiad.wmnet
* 09:15 klausman@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-etcd1002.eqiad.wmnet
* 09:13 klausman@cumin1001: START - Cookbook sre.hosts.reboot-single for host ml-etcd1002.eqiad.wmnet
* 09:12 klausman@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-etcd1001.eqiad.wmnet
* 09:10 klausman@cumin1001: START - Cookbook sre.hosts.reboot-single for host ml-etcd1001.eqiad.wmnet
* 08:52 vgutierrez: restart varnishkafka-statsv on cp4032
* 06:59 marostegui@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1138.eqiad.wmnet with reason: REIMAGE
* 06:57 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1138.eqiad.wmnet with reason: REIMAGE
* 06:48 godog: more weight to ms-be20[62-65] - [[phab:T288458|T288458]]
* 06:46 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1160 [[phab:T288273|T288273]]', diff saved to https://phabricator.wikimedia.org/P17085 and previous config saved to /var/cache/conftool/dbconfig/20210826-064655-marostegui.json
* 06:43 marostegui: Reimage s4 eqiad master (db1138),  expect lag on eqiad [[phab:T288803|T288803]]
* 06:37 elukey@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:33 elukey@cumin1001: START - Cookbook sre.dns.netbox


== 2015-07-13 ==
== 2021-08-25 ==
* 23:22 logmsgbot: catrope Synchronized php-1.26wmf13/extensions/VisualEditor: SWAT (duration: 00m 11s)
* 23:23 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 23:11 logmsgbot: catrope Synchronized php-1.26wmf13/extensions/Flow/includes/Parsoid/Utils.php: Add title to Parsoid exception logging (duration: 00m 12s)
* 23:22 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:45 logmsgbot: legoktm Synchronized wmf-config: Revert "Set $wgCentralAuthStrict = true;" (duration: 00m 13s)
* 23:20 urbanecm: Evening B&C window completed
* 22:41 logmsgbot: legoktm Synchronized wmf-config/CommonSettings.php: Set $wgCentralAuthStrict = true; (duration: 00m 13s)
* 23:19 urbanecm@deploy1002: Synchronized php-1.37.0-wmf.20/extensions/GlobalWatchlist/modules/EntryLog.js: {{Gerrit|230aec3fe7f3d0e325882a5fc926e9f3e4e86717}}: GlobalWatchlistEntryLog: fix storing log id ([[phab:T288385|T288385]]) (duration: 01m 07s)
* 22:41 logmsgbot: legoktm Synchronized wmf-config/InitialiseSettings.php: Set $wgCentralAuthStrict = true; (duration: 00m 12s)
* 22:19 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:16 logmsgbot: legoktm Synchronized php-1.26wmf13/includes/User.php: Add 'AuthPluginStrict' log to identify users who are unable to authenticate (duration: 00m 13s)
* 22:18 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:15 logmsgbot: legoktm Synchronized php-1.26wmf13/includes/api/ApiMain.php: Revert "Revert "Revert Count API module instantiations and Hook runs"" (duration: 00m 12s)
* 22:11 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:15 logmsgbot: legoktm Synchronized php-1.26wmf13/includes/Hooks.php: Revert "Revert "Revert Count API module instantiations and Hook runs"" (duration: 00m 13s)
* 22:10 legoktm@deploy1002: Synchronized debug.json: List primary DC servers first ([[phab:T289246|T289246]]) (duration: 01m 04s)
* 22:13 ejegg: updated payments from ec34ebf61e5962f66b807abdcb519ff323d41e8e to 4ca95d55a9745c05ccfbb16ee6f23a6f75328824
* 22:09 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 22:00 manybubbles: es1.6 step 4: upgrade elastic1003
* 22:07 urbanecm@deploy1002: Synchronized php-1.37.0-wmf.20/extensions/Flow/includes/Content/BoardContent.php: {{Gerrit|694b94657d251df64145e8153b269094bba75be9}}: BoardContent: Fix deprecation warning ([[phab:T289625|T289625]]) (duration: 01m 04s)
* 21:54 ori: Debugging metric issue on graphite1001, brief stats drop possible
* 22:04 urbanecm@deploy1002: Synchronized php-1.37.0-wmf.20/extensions/VisualEditor/includes/ApiVisualEditor.php: {{Gerrit|73478bc9c72286123cef69e57e0aef9e745dcff9}}: Make sure params is an array ([[phab:T289730|T289730]]) (duration: 01m 04s)
* 21:32 legoktm: renaming ~3k users who were originally missed for SULF
* 22:00 legoktm@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'shellbox-constraints' for release 'main' .
* 21:08 logmsgbot: ori Synchronized php-1.26wmf13/includes/Hooks.php: (no message) (duration: 00m 12s)
* 21:59 brennen: 1.37.0-wmf.20 train status ([[phab:T281161|T281161]]) blockers should be patched shortly; as we've reached the 15:00 Pacific deploy cutoff for the day, train will resume first thing in US morning
* 21:08 logmsgbot: ori Synchronized php-1.26wmf13/includes/api/ApiMain.php: (no message) (duration: 00m 13s)
* 21:58 legoktm@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'shellbox-constraints' for release 'main' .
* 20:42 logmsgbot: ori Synchronized php-1.26wmf13/includes/api/ApiMain.php: f9c89d2814: Revert "Revert Count API module instantiations and Hook runs" (duration: 00m 13s)
* 21:37 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:30 logmsgbot: ori Synchronized wmf-config/CommonSettings.php: Ieb62ee05: Temporary hack to facilitate migration of l10n cache implementations (duration: 00m 11s)
* 21:36 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:42 hoo: Updated Wikidata's property suggester with data from today's json dump
* 21:35 urbanecm@deploy1002: Synchronized php-1.37.0-wmf.20/extensions/DiscussionTools/includes/Notifications/EventDispatcher.php: {{Gerrit|cc04b33dec6b9aed1d7621957c4de527266600d1}}: EventDispatcher: Try really, really hard to read from master ([[phab:T289717|T289717]]) (duration: 01m 04s)
* 19:24 manybubbles_: es1.6 step 3: upgrade elastic1002
* 21:32 urbanecm@deploy1002: Synchronized php-1.37.0-wmf.20/includes/page/PageStore.php: {{Gerrit|34fb2b99104d0a2bda8aa202f4cdeb07cb983531}}: PageStore: Pass query flags to getPageByName() ([[phab:T289717|T289717]]; [[phab:T195069|T195069]]) (duration: 01m 06s)
* 19:08 legoktm: running populateContentModel.php --table=page on all small wikis
* 21:19 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:01 andrewbogott: two of two
* 21:17 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:01 mutante: morebots - are you 1.7.11 ?
* 21:14 urbanecm@deploy1002: Synchronized php-1.37.0-wmf.20/extensions/ConfirmEdit/SimpleCaptcha/SimpleCaptcha.php: {{Gerrit|190d8b7579af981cf2f5e4a6d9457ee0a7edca3f}}: Use Parser::getUserIdentity() instead of ::getUser() in SimpleCaptcha ([[phab:T289731|T289731]]) (duration: 01m 05s)
* 19:01 andrewbogott: one of two
* 21:05 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:52 legoktm: running populateContentModel.php --table=page on testwiki
* 21:04 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:29 manybubbles_: es1.6 step 2: shut down extra instance of elasticsearch on elastic1021
* 21:03 urbanecm@deploy1002: Synchronized php-1.37.0-wmf.20/extensions/ProofreadPage/: {{Gerrit|913043a5ca7982e07ab0c01f88076af866a43cc3}}: Fixes exception thrown by FilePagination::getPageNumber ([[phab:T289728|T289728]]) (duration: 01m 06s)
* 17:39 andrewbogott: this is the second test log of three
* 20:02 brennen: 1.37.0-wmf.20 ([[phab:T281161|T281161]]) status: blocked at group0; 2/3 blockers have probable patches, all seem to be getting attention, so holding off on blocker mail for now.
* 17:39 andrewbogott: this is the first test log of three
* 19:54 urbanecm: enwikisource: Start server-side upload for one video file ([[phab:T289698|T289698]])
* 17:36 mutante: included adminbot_1.7.11 in APT repo
* 19:45 urbanecm: Start server-side upload for ~2 GB tiff file ([[phab:T289711|T289711]])
* 16:31 andrewbogott: wikidata-dev updated local puppet and rebooting property-suggester
* 19:31 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:08 logmsgbot: krenair Synchronized wmf-config: https://gerrit.wikimedia.org/r/#/c/224087/ (duration: 00m 12s)
* 19:29 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 16:07 logmsgbot: krenair Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/224087/ (duration: 00m 12s)
* 19:28 dancy@deploy1002: Synchronized php: group1 wikis to 1.37.0-wmf.19 (duration: 01m 05s)
* 15:11 manybubbles_: all done SWATing.
* 19:27 dancy@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.37.0-wmf.19
* 15:09 logmsgbot: manybubbles Synchronized wmf-config/InitialiseSettings.php: SWAT enable footer contact link on ukwiki (duration: 00m 11s)
* 19:17 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:55 manybubbles_: after upgrading elasticsearch, its init script no longer shuts down the old version of elasticsearch, so you have to kill it manually. that means the upgrade instructions will be "special" this time around. hopefully this is a one-time thing.
* 19:16 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:45 manybubbles_: es1.6 step 1: upgrade elasticsearch on elastic1001 -starting
* 19:14 dancy@deploy1002: Synchronized php: group1 wikis to 1.37.0-wmf.20 (duration: 01m 04s)
* 14:45 manybubbles_: es1.6 step 0: successfully synced new versions of plugins
* 19:13 dancy@deploy1002: rebuilt and synchronized wikiversions files: group1 wikis to 1.37.0-wmf.20
* 14:30 manybubbles_: es1.6 step 0: sync new versions of plugins
* 19:10 eileen: tools revision changed from {{Gerrit|15bfaa7117}} to {{Gerrit|14e4125f73}}
* 14:30 manybubbles_: starting the elasticsearch 1.6.0 upgrade
* 18:43 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:13 bblack: updating nginx/bind on cp*
* 18:42 robh@cumin1001: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 13:07 bblack: updating openssl on cp*
* 18:34 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 13:02 logmsgbot: krenair Synchronized php-1.26wmf13/extensions/Cite/extension.json: https://gerrit.wikimedia.org/r/#/c/224407/ - unbreak VE mobile, https://phabricator.wikimedia.org/T105686 (duration: 00m 12s)
* 18:32 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:58 mobrovac: restbase deploying 6dec79d
* 18:30 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 10:22 logmsgbot: ori Synchronized php-1.26wmf13/maintenance/rebuildLocalisationCache.php: 117f60a171: rebuildLocalisationCache: don't limit memory usage (duration: 00m 12s)
* 18:25 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:52 godog: bounce graphite-web on graphite1001
* 18:25 urbanecm@deploy1002: Synchronized php-1.37.0-wmf.20/extensions/Flow/modules/editor/editors/visualeditor/ui/inspectors/mw.flow.ve.ui.MentionInspector.js: {{Gerrit|dd464b4522effbfabea371f8b95b0b25d53da43e}}: Fix reference to renamed abortAllApiRequests method ([[phab:T289648|T289648]]) (duration: 01m 04s)
* 08:51 godog: bounce carbon daemons on graphite1001
* 18:24 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 08:50 godog: upgrade graphite to 0.9.13 on graphite1001 and bounce one instance of carbon/cache
* 18:23 urbanecm@deploy1002: Synchronized php-1.37.0-wmf.20/skins/WikimediaApiPortal/src/Component/NotificationAlertComponent.php: {{Gerrit|a5bfcc8def96ad1b44fff31c4c1965311be2982a}}: Remove call to text() on string ([[phab:T289692|T289692]]) (duration: 01m 04s)
* 07:29 logmsgbot: ori Synchronized php-1.26wmf13/includes/cache/LCStoreStaticArray.php: I3f63594a4: Fix variable name (follows Ib2c5856d) (duration: 00m 11s)
* 18:23 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 06:25 logmsgbot: LocalisationUpdate failed: git pull of core failed
* 18:18 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|e7c8c041faa974585128c48631522a401fb3d41d}}: Add Wikimedia ES to $wgCopyUploadsDomains whitelist ([[phab:T289446|T289446]]) (duration: 01m 04s)
* 06:24 ori: Experimenting with altering the localisation cache implementation for testwiki, operations/mediawiki-config on tin will have a local hack for a little bit
* 18:17 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 05:07 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Jul 13 05:07:32 UTC 2015 (duration 7m 31s)
* 18:16 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|e6df0803e4eaca91bd725bcd376b260b97917de3}}: Disable legacy media dom on a few more wikis ([[phab:T51097|T51097]]) (duration: 01m 05s)
* 02:25 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Mon Jul 13 02:25:58 UTC 2015 (duration 25m 57s)
* 18:15 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:23 logmsgbot: LocalisationUpdate completed (1.26wmf13) at 2015-07-13 02:23:43+00:00
* 18:15 robh@cumin1001: START - Cookbook sre.dns.netbox
* 02:20 logmsgbot: l10nupdate Synchronized php-1.26wmf13/cache/l10n: (no message) (duration: 06m 16s)
* 18:13 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 02:10 logmsgbot: LocalisationUpdate completed (1.26wmf13) at 2015-07-13 02:10:25+00:00
* 18:12 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 02:10 logmsgbot: l10nupdate Synchronized php-1.26wmf13/cache/l10n: (no message) (duration: 00m 34s)
* 18:08 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 01:47 springle: restarted labsdb1002 mysqld while troubleshooting replication
* 18:07 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 18:07 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|5182ac88263f23c15a3b10d0f3bc2e492fe425d5}}: Disable upcoming DiscussionTools automatic topic subscriptions for now (duration: 01m 04s)
* 18:06 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 18:06 cmjohnson@cumin1001: END (ERROR) - Cookbook sre.dns.netbox (exit_code=97)
* 18:05 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|2b14eb525e99008d5103a93c5bd01f75211dca99}}: Enable topic subscriptions as a beta feature on Wikipedias except enwiki ([[phab:T287801|T287801]]) (duration: 01m 06s)
* 18:04 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 17:59 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:56 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 17:56 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:53 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 17:52 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:50 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:48 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:48 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 17:46 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.37.0-wmf.19/extensions/Wikibase/repo/includes/Content/EntityHandler.php: Backport: [[gerrit:714674{{!}}Set EntityHandler::generateHTMLOnEdit to false (T285987)]] (duration: 01m 06s)
* 17:45 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:38 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 17:31 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:30 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:29 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:27 lucaswerkmeister-wmde@deploy1002: Synchronized php-1.37.0-wmf.20/extensions/Wikibase: Backport: [[gerrit:714677{{!}}Return normalized snaks from SetClaim, SetReference (T289501)]] (duration: 01m 11s)
* 17:24 cmjohnson@cumin1001: START - Cookbook sre.dns.netbox
* 17:14 ryankemper: [[phab:T289483|T289483]] Depooled `wdqs1013`
* 17:14 ladsgroup@deploy1002: Synchronized php-1.37.0-wmf.20/extensions/Wikibase/repo/includes/Content/EntityHandler.php: Backport: [[gerrit:714675{{!}}Set EntityHandler::generateHTMLOnEdit to false (T285987)]] (duration: 01m 18s)
* 17:12 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 17:11 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:54 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 15:54 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 15:22 urbanecm: Run `User::newSystemUser( 'MediaWiki default', ['steal' => true] )` in mywiki shell.php session (same issue as [[phab:T289690|T289690]])
* 15:16 urbanecm: [urbanecm@mwmaint2002 ~]$ mwscript extensions/WikimediaMaintenance/createExtensionTables.php --wiki=zh_yuewiki growthexperiments # [[phab:T289680|T289680]]
* 15:04 klausman@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve2004.codfw.wmnet
* 15:04 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:02 urbanecm@deploy1002: Synchronized php-1.37.0-wmf.19/extensions/GrowthExperiments/includes/Config/WikiPageConfigWriter.php: {{Gerrit|0b9ca1e11c1f0397847d4cfc7bc86220b6ebe9f6}}: WikiPageConfigWriter: Fix `autopatrol` right name ([[phab:T288886|T288886]]) (duration: 01m 04s)
* 15:02 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:00 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|0ccac4b2816f01c4b035aa51cbe4651c715632e0}}: Deploy Growth features to 44 new Wikipedias in dark mode ([[phab:T289680|T289680]]; 3/3) (duration: 01m 06s)
* 14:59 klausman@cumin2002: START - Cookbook sre.hosts.reboot-single for host ml-serve2004.codfw.wmnet
* 14:58 klausman@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve2003.codfw.wmnet
* 14:56 urbanecm@deploy1002: Synchronized wmf-config/config/: {{Gerrit|0ccac4b2816f01c4b035aa51cbe4651c715632e0}}: Deploy Growth features to 44 new Wikipedias in dark mode ([[phab:T289680|T289680]]; 2/3) (duration: 01m 05s)
* 14:55 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:55 urbanecm@deploy1002: Synchronized dblists/growthexperiments.dblist: {{Gerrit|0ccac4b2816f01c4b035aa51cbe4651c715632e0}}: Deploy Growth features to 44 new Wikipedias in dark mode ([[phab:T289680|T289680]]; 1/3) (duration: 01m 06s)
* 14:54 urbanecm@deploy1002: sync-file aborted: {{Gerrit|0ccac4b2816f01c4b035aa51cbe4651c715632e0}}: Deploy Growth features to 44 new Wikipedias in dark mode ([[phab:T289680|T289680]]) (duration: 00m 01s)
* 14:54 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:52 klausman@cumin2002: START - Cookbook sre.hosts.reboot-single for host ml-serve2003.codfw.wmnet
* 14:52 klausman@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve2002.codfw.wmnet
* 14:46 klausman@cumin2002: START - Cookbook sre.hosts.reboot-single for host ml-serve2002.codfw.wmnet
* 14:42 urbanecm: [urbanecm@mwmaint2002 /srv/mediawiki/php]$ mwscript extensions/GrowthExperiments/maintenance/initWikiConfig.php --wiki=brwiki # [[phab:T289690|T289690]], [[phab:T289680|T289680]]
* 14:40 urbanecm: Run `User::newSystemUser( 'MediaWiki default', ['steal' => true] )` in brwiki shell.php session ([[phab:T289690|T289690]])
* 14:35 klausman@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve2001.codfw.wmnet
* 14:32 urbanecm: mwmaint2002: scap pull # clearing temporary config changes
* 14:30 klausman@cumin2002: START - Cookbook sre.hosts.reboot-single for host ml-serve2001.codfw.wmnet
* 14:29 klausman@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve-ctrl2002.codfw.wmnet
* 14:26 klausman@cumin2002: START - Cookbook sre.hosts.reboot-single for host ml-serve-ctrl2002.codfw.wmnet
* 14:25 klausman@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve-ctrl2001.codfw.wmnet
* 14:23 urbanecm: [urbanecm@mwmaint2002 /srv/mediawiki/php]$ foreachwikiindblist growthexperiments extensions/GrowthExperiments/maintenance/initWikiConfig.php # [[phab:T289680|T289680]] # r714765 applied at mwmaint2002
* 14:22 urbanecm: Apply https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/714765/ at mwmaint2002 temporarily ([[phab:T289680|T289680]])
* 14:21 klausman@cumin2002: START - Cookbook sre.hosts.reboot-single for host ml-serve-ctrl2001.codfw.wmnet
* 14:20 urbanecm: Create GrowthExperiments DB tables for wikis listed in P17081 ([[phab:T289680|T289680]])
* 14:20 klausman@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-etcd2003.codfw.wmnet
* 14:18 klausman@cumin2002: START - Cookbook sre.hosts.reboot-single for host ml-etcd2003.codfw.wmnet
* 14:17 klausman@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-etcd2002.codfw.wmnet
* 14:15 klausman@cumin2002: START - Cookbook sre.hosts.reboot-single for host ml-etcd2002.codfw.wmnet
* 14:12 klausman@cumin2001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-etcd2001.codfw.wmnet
* 14:10 ejegg: updated fundraising CiviCRM from {{Gerrit|d60442e119}} to {{Gerrit|13bf3a02df}}
* 14:08 klausman@cumin2001: START - Cookbook sre.hosts.reboot-single for host ml-etcd2001.codfw.wmnet
* 13:59 volans@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 0:05:00 on cumin2001.codfw.wmnet with reason: apostrophe's test failure
* 13:59 volans@cumin1001: START - Cookbook sre.hosts.downtime for 0:05:00 on cumin2001.codfw.wmnet with reason: apostrophe's test failure
* 13:57 ejegg: updated fundraising CiviCRM from {{Gerrit|42bb64c608}} to {{Gerrit|d60442e119}}
* 13:53 volans@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:05:00 on cumin1001.eqiad.wmnet with reason: apostrophe's test
* 13:53 volans@cumin2002: START - Cookbook sre.hosts.downtime for 0:05:00 on cumin1001.eqiad.wmnet with reason: apostrophe's test
* 13:51 volans: upgraded spicerack to 0.0.58 on cumin2002
* 13:37 joal@deploy1002: Finished deploy [analytics/refinery@7bed213] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@7bed213] (duration: 05m 55s)
* 13:32 joal@deploy1002: Started deploy [analytics/refinery@7bed213] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@7bed213]
* 13:31 joal@deploy1002: Finished deploy [analytics/refinery@7bed213] (thin): Regular analytics weekly train THIN [analytics/refinery@7bed213] (duration: 00m 07s)
* 13:31 joal@deploy1002: Started deploy [analytics/refinery@7bed213] (thin): Regular analytics weekly train THIN [analytics/refinery@7bed213]
* 13:31 joal@deploy1002: Finished deploy [analytics/refinery@7bed213]: Regular analytics weekly train [analytics/refinery@7bed213] (duration: 20m 25s)
* 13:10 joal@deploy1002: Started deploy [analytics/refinery@7bed213]: Regular analytics weekly train [analytics/refinery@7bed213]
* 13:03 jayme: restarted all pods in kube-system namespace in codfw k8s cluster - [[phab:T289131|T289131]]
* 12:25 kormat@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:21 kormat@cumin1001: START - Cookbook sre.dns.netbox
* 11:39 jayme: slowly restarting all pods in kube-system namespace in eqiad k8s cluster - [[phab:T289131|T289131]]
* 11:38 btullis@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host an-test-coord1002.eqiad.wmnet
* 11:32 kharlan@deploy1002: Synchronized php-1.37.0-wmf.20/extensions/VisualEditor/includes/ApiVisualEditorEdit.php: Backport: [[gerrit:714670{{!}}ApiVisualEditorEdit: data-<nowiki>{</nowiki>plugin<nowiki>}</nowiki> is not multi (T289652)]] (duration: 01m 06s)
* 11:30 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:28 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 11:18 volans: uploaded spicerack_0.0.58 to apt.wikimedia.org buster-wikimedia,bullseye-wikimedia
* 11:02 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb2010.codfw.wmnet
* 10:57 jiji@cumin1001: START - Cookbook sre.hosts.reboot-single for host rdb2010.codfw.wmnet
* 10:54 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb2008.codfw.wmnet
* 10:49 ladsgroup@deploy1002: Synchronized php-1.37.0-wmf.19/includes/Storage/DerivedPageDataUpdater.php: Backport: [[gerrit:714672{{!}}Introduce concept of generateHTMLOnEdit() for ContentHandler (T285987)]], Part II (duration: 01m 04s)
* 10:47 ladsgroup@deploy1002: Synchronized php-1.37.0-wmf.19/includes/content/ContentHandler.php: Backport: [[gerrit:714672{{!}}Introduce concept of generateHTMLOnEdit() for ContentHandler (T285987)]], Part I (duration: 01m 08s)
* 10:46 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:45 jiji@cumin1001: START - Cookbook sre.hosts.reboot-single for host rdb2008.codfw.wmnet
* 10:45 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:21 jbond: rolling out openssl updates
* 10:07 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:05 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 10:03 ladsgroup@deploy1002: Synchronized php-1.37.0-wmf.20/includes: Backport: [[gerrit:714671{{!}}Introduce concept of generateHTMLOnEdit() for ContentHandler (T285987)]] (duration: 02m 17s)
* 10:01 mutante: removed jmads from wmf group
* 09:59 btullis@cumin1001: START - Cookbook sre.ganeti.makevm for new host an-test-coord1002.eqiad.wmnet
* 09:49 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb1011.eqiad.wmnet
* 09:44 jiji@cumin1001: START - Cookbook sre.hosts.reboot-single for host rdb1011.eqiad.wmnet
* 09:35 jayme@deploy1002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 09:35 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb1012.eqiad.wmnet
* 09:35 jayme@deploy1002: helmfile [codfw] START helmfile.d/admin 'apply'.
* 09:35 jayme@deploy1002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 09:34 jayme@deploy1002: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 09:30 jiji@cumin1001: START - Cookbook sre.hosts.reboot-single for host rdb1012.eqiad.wmnet
* 08:59 jiji@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mc2033.codfw.wmnet with reason: REIMAGE
* 08:57 jiji@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on mc2033.codfw.wmnet with reason: REIMAGE
* 08:17 godog: swift codfw add ms-be20[62-65] with initial weight - [[phab:T288458|T288458]]
* 07:01 marostegui@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on db1160.eqiad.wmnet with reason: REIMAGE
* 06:59 marostegui@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on db1160.eqiad.wmnet with reason: REIMAGE
* 06:43 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1160 for reimage [[phab:T288803|T288803]]', diff saved to https://phabricator.wikimedia.org/P17078 and previous config saved to /var/cache/conftool/dbconfig/20210825-064319-marostegui.json
* 06:08 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on db2118.codfw.wmnet with reason: Reimaging [[phab:T288244|T288244]]
* 06:08 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 2 days, 0:00:00 on db2118.codfw.wmnet with reason: Reimaging [[phab:T288244|T288244]]
* 06:07 kormat@cumin1001: dbctl commit (dc=all): 'Depool db2118 until it's reimaged to buster [[phab:T289129|T289129]]', diff saved to https://phabricator.wikimedia.org/P17077 and previous config saved to /var/cache/conftool/dbconfig/20210825-060742-kormat.json
* 06:02 kormat@cumin1001: dbctl commit (dc=all): 'Promote db2121 to s7 primary and set section read-write [[phab:T289129|T289129]]', diff saved to https://phabricator.wikimedia.org/P17076 and previous config saved to /var/cache/conftool/dbconfig/20210825-060222-kormat.json
* 06:01 kormat@cumin1001: dbctl commit (dc=all): 'Set s7 codfw as read-only for maintenance - [[phab:T289129|T289129]]', diff saved to https://phabricator.wikimedia.org/P17075 and previous config saved to /var/cache/conftool/dbconfig/20210825-060112-kormat.json
* 06:00 kormat: Starting s7 codfw failover from db2118 to db2121 - [[phab:T289129|T289129]]
* 05:33 eileen: civicrm revision changed from {{Gerrit|a4ce949828}} to {{Gerrit|42bb64c608}}, config revision is {{Gerrit|1afcea7f5b}}
* 05:28 kormat: Moving s7 codfw replicas under db2121 - [[phab:T289129|T289129]]
* 05:27 kormat@cumin1001: dbctl commit (dc=all): 'Set db2121 with weight 0 [[phab:T289129|T289129]]', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20210825-052741-kormat.json
* 05:27 kormat@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:04:00 on 27 hosts with reason: Primary switchover s7 [[phab:T289129|T289129]]
* 05:27 kormat@cumin1001: START - Cookbook sre.hosts.downtime for 1:04:00 on 27 hosts with reason: Primary switchover s7 [[phab:T289129|T289129]]
* 02:06 eileen: civicrm revision changed from {{Gerrit|8ed303f2d1}} to {{Gerrit|a4ce949828}}, config revision is {{Gerrit|ac2d75d4a8}}
* 00:53 legoktm@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'shellbox' for release 'main' .
* 00:50 legoktm@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'shellbox' for release 'main' .
* 00:47 legoktm@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'shellbox' for release 'main' .


== 2015-07-12 ==
== 2021-08-24 ==
* 14:59 bblack: upgraded most packages on sodium
* 22:05 legoktm@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'shellbox' for release 'main' .
* 14:48 bblack: upgraded apache2 to 2.2.22-1ubuntu1.9 on: antimony argon caesium fluorine helium iodine logstash1001 logstash1003 magnesium neon netmon1001 rhodium stat1001 ytterbium
* 22:04 legoktm@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'shellbox-constraints' for release 'main' .
* 04:49 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Jul 12 04:49:08 UTC 2015 (duration 49m 7s)
* 21:10 tgr: running extensions/GrowthExperiments/maintenance/revalidateLinkRecommendations.php on various wikis per [[phab:T282873|T282873]]#7303828
* 02:26 logmsgbot: LocalisationUpdate completed (1.26wmf13) at 2015-07-12 02:26:52+00:00
* 20:59 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:25 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sun Jul 12 02:25:33 UTC 2015 (duration 25m 32s)
* 20:55 urbanecm@deploy1002: Synchronized wmf-config/InitialiseSettings.php: {{Gerrit|a6fd96b15e6e3c068c2faac60208b9722d32af0f}}: Growth features: Promote 9 wikis out of dark mode ([[phab:T287871|T287871]]; [[phab:T287874|T287874]]; [[phab:T287872|T287872]]; [[phab:T287880|T287880]]; [[phab:T287868|T287868]]; [[phab:T287873|T287873]]; [[phab:T287879|T287879]]; [[phab:T287875|T287875]]; [[phab:T287876|T287876]]) (duration: 01m 25s)
* 02:23 logmsgbot: l10nupdate Synchronized php-1.26wmf13/cache/l10n: (no message) (duration: 06m 12s)
* 20:54 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:10 logmsgbot: LocalisationUpdate completed (1.26wmf13) at 2015-07-12 02:10:00+00:00
* 20:43 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:09 logmsgbot: l10nupdate Synchronized php-1.26wmf13/cache/l10n: (no message) (duration: 00m 34s)
* 20:35 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:35 dancy@deploy1002: Pruned MediaWiki: 1.37.0-wmf.17 (duration: 01m 48s)
* 20:33 dancy@deploy1002: Pruned MediaWiki: 1.37.0-wmf.18 (duration: 03m 26s)
* 20:27 dancy@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.37.0-wmf.20
* 20:18 dancy@deploy1002: Finished scap: testwikis wikis to 1.37.0-wmf.20 (duration: 36m 32s)
* 20:16 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 20:09 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 19:41 dancy@deploy1002: Started scap: testwikis wikis to 1.37.0-wmf.20
* 17:23 mbsantos@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mobileapps' for release 'production' .
* 17:19 mbsantos@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mobileapps' for release 'production' .
* 17:17 mbsantos@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'mobileapps' for release 'staging' .
* 15:26 dcausse@deploy1002: Finished deploy [wikimedia/discovery/analytics@e02c602]: transfer_to_es: stop adding data to article_topics (duration: 02m 17s)
* 15:23 dcausse@deploy1002: Started deploy [wikimedia/discovery/analytics@e02c602]: transfer_to_es: stop adding data to article_topics
* 15:15 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 15:13 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 14:55 jayme@deploy1002: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 14:54 jayme@deploy1002: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 14:50 jayme@deploy1002: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 14:49 jayme@deploy1002: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 14:23 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2031.codfw.wmnet with reason: REIMAGE
* 14:19 jiji@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on mc2031.codfw.wmnet with reason: REIMAGE
* 13:12 XioNoX: push pfw policies - [[phab:T289353|T289353]]
* 12:45 vgutierrez: enable puppet on P:tlsproxy::envoy hosts - merging https://gerrit.wikimedia.org/r/c/operations/puppet/+/710507/9
* 12:37 vgutierrez: disable puppet on P:tlsproxy::envoy hosts - merging https://gerrit.wikimedia.org/r/c/operations/puppet/+/710507/9
* 12:33 godog: test patched python3-eventlet on thanos-fe1003 - [[phab:T283714|T283714]]
* 12:30 marostegui: Install 10.4.21 on clouddb1015
* 11:27 jiji@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on mc2029.codfw.wmnet with reason: REIMAGE
* 11:24 jiji@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on mc2029.codfw.wmnet with reason: REIMAGE
* 09:08 jbond: upload new statograph version
* 09:02 elukey@deploy1002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 09:02 elukey@deploy1002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 08:54 Amir1: start of mwscript extensions/FlaggedRevs/maintenance/pruneRevData.php --wiki=dewiki --prune --batch-size=5 --sleep=5 ([[phab:T289249|T289249]])
* 08:51 Amir1: start of mwscript extensions/FlaggedRevs/maintenance/pruneRevData.php --wiki=arwiki --prune --batch-size=5 --sleep=5 ([[phab:T289249|T289249]])
* 08:01 godog: temp fix thanos-swift.discovery.wmnet in /etc/hosts to get swift-dispersion-stats to work - [[phab:T283714|T283714]]
* 07:51 dcausse: repool wdqs1012 [[phab:T289551|T289551]]
* 07:29 dcausse: restarting blazegraph on wdqs1012
* 07:17 marostegui: Optimize huwiki.flaggedtemplates on db1127
* 07:15 marostegui: Optimize huwiki.flaggedtemplates on db1098:3317
* 06:17 jiji@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mc2037.codfw.wmnet with reason: REIMAGE
* 06:14 jiji@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on mc2037.codfw.wmnet with reason: REIMAGE
* 03:51 rzl: rzl@wdqs1012:~$ sudo depool
* 03:46 legoktm: wdqs1012 restarted prometheus-blazegraph-exporter-wdqs-blazegraph.service and prometheus-blazegraph-exporter-wdqs-categories.service after apparent exceptions/crashes
* 02:08 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:06 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 00:17 ryankemper: [WDQS Deploy] Restarting `wdqs-categories` across lvs-managed hosts, one node at a time: `sudo -E cumin -b 1 'A:wdqs-all and not A:wdqs-test' 'depool && sleep 45 && systemctl restart wdqs-categories && sleep 45 && pool'`
* 00:17 ryankemper: [WDQS Deploy] Restarted `wdqs-categories` across all test hosts simultaneously: `sudo -E cumin 'A:wdqs-test' 'systemctl restart wdqs-categories'`
* 00:17 ryankemper: [WDQS Deploy] Restarted `wdqs-updater` across all hosts, 4 hosts at a time: `sudo -E cumin -b 4 'A:wdqs-all' 'systemctl restart wdqs-updater'`
* 00:16 ryankemper@deploy1002: Finished deploy [wdqs/wdqs@da9efa9]: 0.3.83 (duration: 07m 05s)
* 00:10 ryankemper: [WDQS Deploy] Tests passing following deploy of `0.3.83` on canary `wdqs1003`; proceeding to rest of fleet
* 00:09 ryankemper@deploy1002: Started deploy [wdqs/wdqs@da9efa9]: 0.3.83
* 00:08 ryankemper: [WDQS Deploy] Gearing up for deploy of wdqs `0.3.83`. Pre-deploy tests passing on canary `wdqs1003`


== 2015-07-11 ==
== 2021-08-23 ==
* 19:48 jynus: stopping labsdb1002 after table corruption has been detected
* 23:41 ryankemper: [[phab:T285355|T285355]] `helmfile -e staging -i apply` on `/srv/deployment-charts/helmfile.d/services/linkrecommendation/` from `ryankemper@deploy1002`
* 19:37 urandom: from restbase1002, starting revision culling process (node thin_out_key_rev_value_data.js `hostname -i` local_group_wikimedia_T_parsoid_html 2>&1 | tee >(gzip -c > local_group_wikimedia_T_parsoid_html.log.`date +%s`.gz))
* 23:40 ryankemper@deploy1002: helmfile [staging] Ran 'sync' command on namespace 'linkrecommendation' for release 'staging' .
* 19:33 urandom: restbase: setting gc_grace_seconds to 604800 (1 week) on local_group_wikipedia_T_parsoid_html.data
* 18:56 tgr: morning deploys done
* 04:55 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Jul 11 04:55:56 UTC 2015 (duration 55m 55s)
* 18:56 tgr@deploy1002: Synchronized php-1.37.0-wmf.19/extensions/GrowthExperiments: Backport: [[gerrit:714158{{!}}Add Link: store when tasks were generated (T284551)]] (duration: 00m 57s)
* 04:21 bd808: Logstash cluster upgrade complete! Kibana working again
* 18:49 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 04:21 bd808: Upgraded Elasticsearch to 1.6.0 on logstash1006
* 18:47 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 04:12 bd808: rebooting logstash1006
* 18:35 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 04:06 bd808: logstash1005 fully recovered all shards
* 18:34 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 03:21 logmsgbot: mattflaschen Synchronized php-1.26wmf13/extensions/Flow/includes/Parsoid/Utils.php: Bump Flow to encode page name when sending to Parsoid (duration: 00m 13s)
* 18:27 dancy@deploy1002: Synchronized wmf-config/etcd.php: Config: [[gerrit:713907{{!}}wmfSetupEtcd only supports array input]] (duration: 00m 57s)
* 02:28 logmsgbot: LocalisationUpdate completed (1.26wmf13) at 2015-07-11 02:28:18+00:00
* 18:27 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:25 logmsgbot: l10nupdate Synchronized php-1.26wmf13/cache/l10n: (no message) (duration: 06m 07s)
* 18:24 mwdebug-deploy@deploy1002: helmfile [eqiad] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:25 logmsgbot: LocalisationUpdate ResourceLoader cache refresh completed at Sat Jul 11 02:25:19 UTC 2015 (duration 25m 18s)
* 18:23 dancy@deploy1002: Synchronized wmf-config: Config: [[gerrit:713906{{!}}Use array format to specify etcd server]] (duration: 00m 57s)
* 02:09 logmsgbot: LocalisationUpdate completed (1.26wmf13) at 2015-07-11 02:09:45+00:00
* 18:12 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .
* 02:09 logmsgbot: l10nupdate Synchronized php-1.26wmf13/cache/l10n: (no message) (duration: 00m 35s)
* 18:12 dancy@deploy1002: Synchronized wmf-config/etcd.php: Config: [[gerrit:713704{{!}}Allow protocol for etcd server to be specified]] (