Difference between revisions of "Server Admin Log"

imported>Stashbot
(catrope@deploy1001: Synchronized php-1.35.0-wmf.14/extensions/GrowthExperiments/: Various topic search-related cherry-picks (duration: 00m 57s))
imported>Stashbot
(bstorm_: restarted maintain-dbusers on labstore1004 after recovering the m5 DB's connection issue)
Revision as of 00:22, 15 January 2020

2020-01-15

  • 00:22 bstorm_: restarted maintain-dbusers on labstore1004 after recovering the m5 DB's connection issue
  • 00:12 bstorm_: set max_connections to 600 temporarily while troubleshooting on m5 (db1133)

2020-01-14

  • 20:11 milimetric@deploy1001: Finished deploy [analytics/aqs/deploy@1cf0530]: Increment service-runner to latest version (duration: 04m 48s)
  • 20:07 milimetric@deploy1001: Started deploy [analytics/aqs/deploy@1cf0530]: Increment service-runner to latest version
  • 19:22 urbanecm@deploy1001: Synchronized wmf-config/CommonSettings.php: SWAT: e400916: [wikitech] Restore contentadmin ability to manage abuse filters (duration: 01m 05s)
  • 18:11 vgutierrez: repooling cp5012
  • 18:06 vgutierrez: depool cp5012 for some ats parent select debugging
  • 17:43 vgutierrez: repooling cp4027
  • 17:39 vgutierrez: depooling cp4027 for some ats-tls parent balancing tests
  • 17:21 _joe_: upload docker-report 0.0.2 to {buster,stretch}-wikimedia T242604
  • 16:53 liw@deploy1001: rebuilt and synchronized wikiversions files: group0 to 1.35.0-wmf.15
  • 16:46 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
  • 16:44 liw: branch is cut for 1.35.0-wmf.15; train window is closed, but I'll continue train since the next time slot seems to not have anything
  • 16:44 marostegui@cumin1001: START - Cookbook sre.hosts.downtime
  • 16:41 marostegui: Enable puppet back on install1002 and install2002 - T242481
  • 16:31 liw@deploy1001: Finished scap: testwiki to php-1.34.0-wmf.15 and rebuild l10n cache (try 2) (duration: 43m 29s)
  • 16:26 marostegui: Disable temporarily puppet on install1002 and install2002 - T242481
  • 16:08 volans@deploy1001: Finished deploy [debmonitor/deploy@e72911c]: Release v0.2.4 (duration: 01m 09s)
  • 16:07 volans@deploy1001: Started deploy [debmonitor/deploy@e72911c]: Release v0.2.4
  • 15:47 liw@deploy1001: Started scap: testwiki to php-1.34.0-wmf.15 and rebuild l10n cache (try 2)
  • 15:02 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
  • 15:02 marostegui: Copy data from db1080 to db1107 T242702
  • 15:02 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1080 for tranfer', diff saved to https://phabricator.wikimedia.org/P10144 and previous config saved to /var/cache/conftool/dbconfig/20200114-150223-marostegui.json
  • 15:00 marostegui@cumin1001: START - Cookbook sre.hosts.downtime
  • 14:51 liw@deploy1001: scap failed: CalledProcessError Command '/usr/local/bin/mwscript rebuildLocalisationCache.php --wiki="testwiki" --outdir="/tmp/scap_l10n_44869219" --threads=30 --lang en --quiet' returned non-zero exit status 1 (duration: 03m 55s)
  • 14:47 liw@deploy1001: Started scap: testwiki to php-1.35.0-wmf.15 and rebuild l10n cache
  • 14:43 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1080', diff saved to https://phabricator.wikimedia.org/P10143 and previous config saved to /var/cache/conftool/dbconfig/20200114-144341-marostegui.json
  • 14:26 marostegui: Move db1114 under db1080
  • 14:24 marostegui: Stop db1080 and db1107 replication in sync
  • 14:21 XioNoX: push firewall policies to pfw3-eqiad - T242681
  • 14:15 XioNoX: push firewall policies to pfw3-codfw - T242681
  • 14:12 liw: branch cut for 1.35.0-wmf.15
  • 14:09 vgutierrez: upgrade ats to 8.0.5-1wm12 in cp5006 and cp5012 - T242620
  • 14:03 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
  • 14:03 aborrero@cumin1001: START - Cookbook sre.hosts.downtime
  • 13:54 marostegui: Upgrade db1080
  • 13:52 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1080 for upgrade', diff saved to https://phabricator.wikimedia.org/P10142 and previous config saved to /var/cache/conftool/dbconfig/20200114-135238-marostegui.json
  • 12:16 vgutierrez@puppetmaster1001: conftool action : set/weight=1; selector: service=nginx,name=ncredir3002.esams.wmnet
  • 12:16 vgutierrez@puppetmaster1001: conftool action : set/weight=1; selector: service=nginx,name=ncredir3001.esams.wmnet
  • 12:14 vgutierrez@puppetmaster1001: conftool action : set/weight=1; selector: service=nginx,name=ncredir4001.ulsfo.wmnet
  • 12:14 vgutierrez@puppetmaster1001: conftool action : set/weight=1; selector: service=nginx,name=ncredir4002.ulsfo.wmnet
  • 12:02 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
  • 12:02 aborrero@cumin1001: START - Cookbook sre.hosts.downtime
  • 12:02 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
  • 12:01 aborrero@cumin1001: START - Cookbook sre.hosts.downtime
  • 11:51 vgutierrez: restarting pybal on lvs4005 (high-traffic1 LVS) - T242321
  • 11:49 vgutierrez: restarting pybal on lvs4007 (secondary LVS) - T242321
  • 11:48 vgutierrez@puppetmaster1001: conftool action : set/pooled=yes; selector: service=nginx,name=ncredir4002.ulsfo.wmnet
  • 11:47 vgutierrez@puppetmaster1001: conftool action : set/pooled=yes; selector: service=nginx,name=ncredir4001.ulsfo.wmnet
  • 11:15 vgutierrez: Updating puppet-compiler facts
  • 10:40 vgutierrez: upgrade ats to 8.0.5-1wm12 in cp4026 and cp4032 - T242620
  • 10:07 moritzm: installing remaining cyrus-sasl security updates
  • 09:44 ladsgroup@deploy1001: Synchronized php-1.35.0-wmf.14/extensions/Wikibase/lib/includes/Store/Sql/Terms: wbterms: Add Statsd metrics in critical parts of the new term store (duration: 00m 57s)
  • 07:33 XioNoX: add peering to AS26744 in eqiad, eqord and eqdfw
  • 06:25 marostegui: Deploy schema change on flowdb (x1) directly on the master T242688
  • 06:23 marostegui: Deploy schema change on labswiki (wikitech) T242688
  • 06:20 marostegui: Deploy schema change on s3 master for officewiki and techconductwiki T242688
  • 06:01 marostegui: Remove partitions from revision table on db1103:3312
  • 06:01 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1103:3312 - T239453', diff saved to https://phabricator.wikimedia.org/P10141 and previous config saved to /var/cache/conftool/dbconfig/20200114-060116-marostegui.json
  • 06:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1105:3312 after removing partitions from revision table', diff saved to https://phabricator.wikimedia.org/P10140 and previous config saved to /var/cache/conftool/dbconfig/20200114-060003-marostegui.json
  • 05:29 andrewbogott: rebooting cloudservices1004 to make sure all my upgrades are sustainable
  • 01:03 catrope@deploy1001: Synchronized php-1.35.0-wmf.14/extensions/GrowthExperiments/: Various topic search-related cherry-picks (duration: 00m 57s)

2020-01-13

  • 21:35 milimetric@deploy1001: Finished deploy [analytics/refinery@690517c]: Referer Classify change (duration: 09m 08s)
  • 21:32 arlolra@deploy1001: Finished deploy [parsoid/deploy@dd92eeb]: Updating Parsoid to 5d37da1 (duration: 08m 21s)
  • 21:26 milimetric@deploy1001: Started deploy [analytics/refinery@690517c]: Referer Classify change
  • 21:24 arlolra@deploy1001: Started deploy [parsoid/deploy@dd92eeb]: Updating Parsoid to 5d37da1
  • 20:37 clarakosi@deploy1001: Finished deploy [restbase/deploy@bfdd342]: Use parsoid_uri, add ngwiki. T241756, T240771 (duration: 15m 41s)
  • 20:21 clarakosi@deploy1001: Started deploy [restbase/deploy@bfdd342]: Use parsoid_uri, add ngwiki. T241756, T240771
  • 19:39 tgr: ran disableOATHAuthForUser.php for T242543
  • 19:22 tgr@deploy1001: Synchronized wmf-config/CommonSettings.php: SWAT: Revert a temporary CommonsMetadata cache validation hook that has been unneeded for a long time (duration: 00m 56s)
  • 15:56 moritzm: installing cyrus-sasl security updates
  • 15:19 moritzm: remove hassium in Ganeti T224567
  • 15:19 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'kube-system' for release 'rbac-deploy-clusterrole' .
  • 15:18 jmm@cumin2001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1)
  • 15:18 akosiaris@deploy1001: helmfile [CODFW] Ran 'sync' command on namespace 'kube-system' for release 'rbac-deploy-clusterrole' .
  • 15:18 jmm@cumin2001: START - Cookbook sre.hosts.decommission
  • 15:00 joal@deploy1001: Finished deploy [analytics/hdfs-tools/deploy@a1b4d34]: Deploy hdfs-rsync bug correction (duration: 00m 08s)
  • 15:00 joal@deploy1001: Started deploy [analytics/hdfs-tools/deploy@a1b4d34]: Deploy hdfs-rsync bug correction
  • 14:58 jmm@cumin2001: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1)
  • 14:57 jmm@cumin2001: START - Cookbook sre.hosts.decommission
  • 14:55 moritzm: remove hassaleh in Ganeti T224567
  • 14:24 jdrewniak@deploy1001: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (563985) (duration: 00m 55s)
  • 14:24 jdrewniak@deploy1001: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (563985) (duration: 00m 56s)
  • 13:11 moritzm: upgrade mw canaries to PHP 7.2.26 T241222
  • 12:08 Urbanecm: EU SWAT done
  • 12:08 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: c7cf53c: Deploy partial blocks on enwiki (T242569) (duration: 00m 55s)
  • 11:58 jdrewniak@deploy1001: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (563985) (duration: 00m 55s)
  • 11:57 jdrewniak@deploy1001: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (563985) (duration: 00m 55s)
  • 11:42 moritzm: upgrading remaining mwdebug* servers and mw1261 to PHP 7.2.26 T241222
  • 11:04 volans@deploy1001: Finished deploy [debmonitor/deploy@265059b]: Release v0.2.3 (duration: 01m 10s)
  • 11:03 volans@deploy1001: Started deploy [debmonitor/deploy@265059b]: Release v0.2.3
  • 10:51 vgutierrez: pooling esams for ncredir - T242321
  • 09:38 moritzm: rename Ganeti group in ulsfo from "default" to "row_1"
  • 09:16 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
  • 09:16 filippo@cumin1001: START - Cookbook sre.hosts.downtime
  • 07:53 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1112', diff saved to https://phabricator.wikimedia.org/P10134 and previous config saved to /var/cache/conftool/dbconfig/20200113-075334-marostegui.json
  • 07:36 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1112', diff saved to https://phabricator.wikimedia.org/P10133 and previous config saved to /var/cache/conftool/dbconfig/20200113-073656-marostegui.json
  • 07:30 XioNoX: cr3-knams> clear bfd session fe80::5e5e:ab00:d3d:85c - T240659
  • 07:26 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1112', diff saved to https://phabricator.wikimedia.org/P10132 and previous config saved to /var/cache/conftool/dbconfig/20200113-072611-marostegui.json
  • 06:45 marostegui: Upgrade db1112
  • 06:36 marostegui: Deploy schema change on db1112 with replication (lag will appear on s3 on labs) - T234052
  • 06:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1112', diff saved to https://phabricator.wikimedia.org/P10131 and previous config saved to /var/cache/conftool/dbconfig/20200113-063513-marostegui.json
  • 06:20 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1081 for compression T232446', diff saved to https://phabricator.wikimedia.org/P10130 and previous config saved to /var/cache/conftool/dbconfig/20200113-062007-marostegui.json
  • 06:18 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1084', diff saved to https://phabricator.wikimedia.org/P10129 and previous config saved to /var/cache/conftool/dbconfig/20200113-061835-marostegui.json
  • 06:14 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1084 after compression', diff saved to https://phabricator.wikimedia.org/P10128 and previous config saved to /var/cache/conftool/dbconfig/20200113-061434-marostegui.json
  • 06:11 marostegui: Deploy schema change on s1 master (db1083) - T234052
  • 06:11 marostegui@cumin1001: dbctl commit (dc=all): 'Repool es1013', diff saved to https://phabricator.wikimedia.org/P10127 and previous config saved to /var/cache/conftool/dbconfig/20200113-061106-marostegui.json
  • 06:10 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1075 T234052', diff saved to https://phabricator.wikimedia.org/P10126 and previous config saved to /var/cache/conftool/dbconfig/20200113-061025-marostegui.json
  • 06:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depool es1013', diff saved to https://phabricator.wikimedia.org/P10125 and previous config saved to /var/cache/conftool/dbconfig/20200113-060841-marostegui.json
  • 06:01 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1084 after compression', diff saved to https://phabricator.wikimedia.org/P10124 and previous config saved to /var/cache/conftool/dbconfig/20200113-060112-marostegui.json
  • 06:00 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1075 T234052', diff saved to https://phabricator.wikimedia.org/P10123 and previous config saved to /var/cache/conftool/dbconfig/20200113-060012-marostegui.json
  • 05:58 marostegui: Remove partitions from db1105:3312 - T239453
  • 05:58 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1105:3312 - T239453', diff saved to https://phabricator.wikimedia.org/P10122 and previous config saved to /var/cache/conftool/dbconfig/20200113-055811-marostegui.json
  • 05:55 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db2091:3312', diff saved to https://phabricator.wikimedia.org/P10121 and previous config saved to /var/cache/conftool/dbconfig/20200113-055554-marostegui.json
  • 05:53 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1084 after compression', diff saved to https://phabricator.wikimedia.org/P10120 and previous config saved to /var/cache/conftool/dbconfig/20200113-055315-marostegui.json
  • 05:51 marostegui: Deploy schema change on x1 master on flowdb with replication - T241387
  • 02:02 andrewbogott: restarted mariadb on cloudservices1003, cloudservices1004, cloudservices2001-dev, clouddb2001-dev for T239791
  • 00:58 jiji@cumin1001: conftool action : set/pooled=yes; selector: name=cp3061.esams.wmnet
  • 00:53 jiji@cumin1001: conftool action : set/pooled=yes; selector: name=cp3065.esams.wmnet
  • 00:23 jiji@cumin1001: conftool action : set/pooled=no; selector: name=cp3061.esams.wmnet
  • 00:23 jiji@cumin1001: conftool action : set/pooled=no; selector: name=cp3065.esams.wmnet
  • 00:22 effie: depool and restart cp3065 cp3061 - T238305
  • 00:21 effie: depool and restart cp3065 cp3061

2020-01-12

  • 14:48 effie: restart php on mw1240
  • 14:46 effie: restart php on mw1238
  • 04:35 volker-e@deploy1001: Finished deploy [design/style-guide@8bec25e]: Deploy design/style-guide: (duration: 00m 07s)
  • 04:35 volker-e@deploy1001: Started deploy [design/style-guide@8bec25e]: Deploy design/style-guide:
  • 02:57 volker-e@deploy1001: Finished deploy [design/style-guide@cebc152]: Deploy design/style-guide: (duration: 00m 07s)
  • 02:57 volker-e@deploy1001: Started deploy [design/style-guide@cebc152]: Deploy design/style-guide:

2020-01-11

  • 05:34 volker-e@deploy1001: Finished deploy [design/style-guide@6a44c69]: Deploy design/style-guide: (duration: 00m 08s)
  • 05:34 volker-e@deploy1001: Started deploy [design/style-guide@6a44c69]: Deploy design/style-guide:

2020-01-10

  • 22:33 mutante: ms-be1026 sudo systemctl reset-failed (failed Session 372989 of user debmonitor)
  • 20:45 jeh: cloudcontrol200[13]-dev schedule downtime until Feb 28 2020 on systemd service check T242462
  • 20:29 jeh: cloudmetrics100[12] schedule downtime until Feb 28 2020 on prometheus check T242460
  • 20:03 urandom: drop legacy Parsoid/JS storage keyspaces, production env -- T242344
  • 19:56 otto@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
  • 19:54 otto@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
  • 19:52 otto@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-main' for release 'main' .
  • 19:51 otto@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-analytics' for release 'analytics' .
  • 19:48 mutante: LDAP - add Zbyszko Papierski to "wmf" group (T242341)
  • 19:47 mutante: LDAP - add Hugh Nowlan to "wmf" group (T242309)
  • 19:42 dcausse: restarting blazegraph on wdqs1005
  • 19:40 ebernhardson: restart mjolnir-kafka-bulk-daemon across eqiad and codfw search clusters
  • 19:40 ebernhardson@deploy1001: Finished deploy [search/mjolnir/deploy@e141941]: repair model upload in bulk daemon (duration: 05m 02s)
  • 19:35 ebernhardson@deploy1001: Started deploy [search/mjolnir/deploy@e141941]: repair model upload in bulk daemon
  • 19:13 otto@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
  • 18:53 mutante: welcome new (restbase) service deployer Clara Andrew-Wani (T242152)
  • 18:29 bd808: Restarted zuul on contint1001; no logs since 2020-01-10 17:55:28,452
  • 11:48 moritzm: stop/mask nginx on hassium/hassaleh T224567
  • 10:56 akosiaris: repool mathoid codfw for testing canary support in the mathoid helm chart
  • 10:56 akosiaris@cumin1001: conftool action : set/pooled=true; selector: name=codfw,dnsdisc=mathoid
  • 10:51 akosiaris@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'mathoid' for release 'canary' .
  • 10:51 akosiaris@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'mathoid' for release 'production' .
  • 10:40 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'mathoid' for release 'staging' .
  • 10:38 akosiaris: depool mathoid codfw in preparation for testing canary support in the mathoid helm chart
  • 10:37 akosiaris@cumin1001: conftool action : set/pooled=false; selector: name=codfw,dnsdisc=mathoid
  • 10:24 moritzm: rename Ganeti group for esams from "default" to "row_OE" T236216
  • 10:21 moritzm: rename Ganeti group for eqsin from "default" to "row_1" T228099
  • 09:02 marostegui: Remove revision partitions from db2091:3312
  • 09:01 marostegui@cumin1001: dbctl commit (dc=all): 'Depoool db2091:3312', diff saved to https://phabricator.wikimedia.org/P10113 and previous config saved to /var/cache/conftool/dbconfig/20200110-090143-marostegui.json
  • 08:59 marostegui@cumin1001: dbctl commit (dc=all): 'Pool db2088:3312', diff saved to https://phabricator.wikimedia.org/P10112 and previous config saved to /var/cache/conftool/dbconfig/20200110-085921-marostegui.json
  • 08:55 vgutierrez: restarting pybal on lvs3005 (high-traffic1) - T242321
  • 08:51 vgutierrez: restarting pybal on lvs3007 - T242321
  • 08:48 vgutierrez@puppetmaster1001: conftool action : set/pooled=yes; selector: service=nginx,name=ncredir3002.esams.wmnet
  • 08:48 vgutierrez@puppetmaster1001: conftool action : set/pooled=yes; selector: service=nginx,name=ncredir3001.esams.wmnet
  • 08:24 ema: cp3062: varnish-frontend-restart to clear things up after child crash the past days
  • 02:11 jhuneidi@deploy1001: Pruned MediaWiki: 1.35.0-wmf.10 (duration: 04m 13s)
  • 00:45 catrope@deploy1001: Synchronized php-1.35.0-wmf.14/extensions/GrowthExperiments/: Expose tasktype/topic API parameter info (T240512) (duration: 01m 01s)
  • 00:35 shdubsh: restart prometheus on prometheus2004, enabling debug log

2020-01-09

  • 21:25 ebernhardson@deploy1001: Finished deploy [search/airflow@746c149]: Add skein to airflow venv (duration: 00m 55s)
  • 21:24 ebernhardson@deploy1001: Started deploy [search/airflow@746c149]: Add skein to airflow venv
  • 20:32 chasemp: add phabtest2 to #security temp to ensure reporting settings (T240605)
  • 20:06 jhuneidi@deploy1001: rebuilt and synchronized wikiversions files: all wikis to 1.35.0-wmf.14 refs T233862
  • 19:51 Urbanecm: Morning SWAT done
  • 19:51 urbanecm@deploy1001: Synchronized php-1.35.0-wmf.14/resources/Resources.php: SWAT: 39bc331: Enable mediawiki.page.patrol.ajax on mobile (T242310) (duration: 01m 05s)
  • 19:35 urbanecm@deploy1001: Synchronized php-1.35.0-wmf.14/extensions/MobileFrontend/: SWAT: 31d3be7: Hot fixes for mobile diff page (T242310) (duration: 01m 09s)
  • 19:13 urbanecm@deploy1001: Synchronized wmf-config/mobile.php: SWAT: 2f9ee90: Drop beta setting (T237290) (duration: 01m 06s)
  • 18:56 otto@deploy1001: Finished deploy [analytics/hdfs-tools/deploy@f8e9d6f]: (no justification provided) (duration: 00m 08s)
  • 18:55 otto@deploy1001: Started deploy [analytics/hdfs-tools/deploy@f8e9d6f]: (no justification provided)
  • 18:05 otto@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
  • 18:03 otto@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
  • 18:01 otto@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
  • 17:38 volans@cumin1001: conftool action : set/weight=10; selector: name=elastic106.*.eqiad.wmnet
  • 17:38 volans@cumin1001: conftool action : set/weight=10; selector: name=elastic105[3-9].eqiad.wmnet
  • 17:37 volans: confctl set/weight=10 for elastic10[53-67] - T242348
  • 15:46 ema: cp3058: varnish-frontend-restart to clear things up after child crash yesterday
  • 15:25 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1078', diff saved to https://phabricator.wikimedia.org/P10110 and previous config saved to /var/cache/conftool/dbconfig/20200109-152545-marostegui.json
  • 15:21 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1078', diff saved to https://phabricator.wikimedia.org/P10109 and previous config saved to /var/cache/conftool/dbconfig/20200109-152157-marostegui.json
  • 15:14 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1078', diff saved to https://phabricator.wikimedia.org/P10108 and previous config saved to /var/cache/conftool/dbconfig/20200109-151434-marostegui.json
  • 15:03 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1078', diff saved to https://phabricator.wikimedia.org/P10107 and previous config saved to /var/cache/conftool/dbconfig/20200109-150333-marostegui.json
  • 14:38 papaul: upgrading Firmware on backup2001
  • 14:27 marostegui: Upgrade db1078
  • 14:27 ema: cp3054: varnish-frontend-restart to clear things up after child crash yesterday
  • 14:10 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1078', diff saved to https://phabricator.wikimedia.org/P10105 and previous config saved to /var/cache/conftool/dbconfig/20200109-141057-marostegui.json
  • 14:04 moritzm: imported PHP 7.2.26 to component/php72 for stretch-wikimedia
  • 13:48 moritzm: upgrading mwdebug2002 to PHP 7.2.26 T241224
  • 13:47 moritzm: upgrading mwdebug2002 to PHP 7.2.26
  • 12:41 marostegui: Deploy schema change on s3 codfw, lag will appear on s3 codfw - T234052
  • 12:25 jynus: shutting down backup2001 T240177
  • 12:22 Urbanecm: EU SWAT done
  • 12:19 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: ed0357a: Set $wgArticleCountMethod to any for minwiktionary (T241694) (duration: 01m 08s)
  • 12:17 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: 06394ea: Add ipblock-exempt and extendedconfirmed to bot group on fawiki (T241904) (duration: 01m 05s)
  • 12:11 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Set wmgUseEntitySourceBasedFederation for test.wikidata.org (T241973) (duration: 01m 07s)
  • 11:23 moritzm: installing cyrus-sasl security updates
  • 11:04 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
  • 11:04 jmm@cumin2001: START - Cookbook sre.hosts.downtime
  • 10:09 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1106', diff saved to https://phabricator.wikimedia.org/P10104 and previous config saved to /var/cache/conftool/dbconfig/20200109-100948-marostegui.json
  • 10:05 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1106', diff saved to https://phabricator.wikimedia.org/P10103 and previous config saved to /var/cache/conftool/dbconfig/20200109-100552-marostegui.json
  • 09:56 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
  • 09:54 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1106', diff saved to https://phabricator.wikimedia.org/P10102 and previous config saved to /var/cache/conftool/dbconfig/20200109-095433-marostegui.json
  • 09:53 filippo@cumin1001: START - Cookbook sre.hosts.downtime
  • 09:52 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1106', diff saved to https://phabricator.wikimedia.org/P10101 and previous config saved to /var/cache/conftool/dbconfig/20200109-095249-marostegui.json
  • 09:48 marostegui: Upgrade db1106
  • 09:47 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1106 for upgrade', diff saved to https://phabricator.wikimedia.org/P10100 and previous config saved to /var/cache/conftool/dbconfig/20200109-094748-marostegui.json
  • 09:39 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1118', diff saved to https://phabricator.wikimedia.org/P10099 and previous config saved to /var/cache/conftool/dbconfig/20200109-093946-marostegui.json
  • 09:32 marostegui: Deploy schema change on db1106, this will generate a bit of lag on s1 labs
  • 09:31 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1118', diff saved to https://phabricator.wikimedia.org/P10098 and previous config saved to /var/cache/conftool/dbconfig/20200109-093119-marostegui.json
  • 08:22 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1118', diff saved to https://phabricator.wikimedia.org/P10097 and previous config saved to /var/cache/conftool/dbconfig/20200109-082243-marostegui.json
  • 08:16 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1118', diff saved to https://phabricator.wikimedia.org/P10096 and previous config saved to /var/cache/conftool/dbconfig/20200109-081629-marostegui.json
  • 07:40 XioNoX: enable traceoptions for BFD on cr2-eqdfw - T240659
  • 07:37 marostegui: Upgrade db1118
  • 07:37 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1118', diff saved to https://phabricator.wikimedia.org/P10094 and previous config saved to /var/cache/conftool/dbconfig/20200109-073713-marostegui.json
  • 06:27 marostegui: Remove revision partitions from db2088:3312 T239453
  • 06:26 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2088:3312 T239453', diff saved to https://phabricator.wikimedia.org/P10093 and previous config saved to /var/cache/conftool/dbconfig/20200109-062608-marostegui.json
  • 06:22 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1096:3315 db1096:3316 T239453', diff saved to https://phabricator.wikimedia.org/P10092 and previous config saved to /var/cache/conftool/dbconfig/20200109-062157-marostegui.json
  • 00:33 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: (no-op) set config page for newcomer tasks (T233465) (duration: 01m 05s)

2020-01-08

  • 23:44 jhuneidi@deploy1001: rebuilt and synchronized wikiversions files: Roll commonswiki forward to 1.35.0-wmf.14
  • 23:34 jforrester@deploy1001: Synchronized php-1.35.0-wmf.14/extensions/WikibaseMediaInfo/resources/statements/StatementWidget.js: T242286 Update StatementWidget initialization logic (duration: 01m 05s)
  • 23:14 XenoRyet: updated civicrm from 42e88f92a9 to 9ac771a913
  • 23:09 mutante: LDAP - added moushirael to 'wmf' (T242000)
  • 22:39 mutante: restarted zuul on contint1001
  • 21:56 arlolra: Updated Parsoid to f963e51 (T238934, T237318, T238022, T228217)
  • 21:46 XenoRyet: updated civicrm from 2468d85f95 to 42e88f92a9
  • 21:46 arlolra@deploy1001: Finished deploy [parsoid/deploy@45a4245]: Updating Parsoid to f963e51 (duration: 08m 00s)
  • 21:38 arlolra@deploy1001: Started deploy [parsoid/deploy@45a4245]: Updating Parsoid to f963e51
  • 21:30 mutante: phab1003 - running decom cookbook - shutdown host, removed from puppetmaster, debmonitor etc (T238957)
  • 21:30 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0)
  • 21:29 dzahn@cumin1001: START - Cookbook sre.hosts.decommission
  • 21:28 jhuneidi@deploy1001: rebuilt and synchronized wikiversions files: Revert "commonswiki to 1.35.0-wmf.11"
  • 21:21 halfak@deploy1001: Finished deploy [ores/deploy@039251f]: T242035 (duration: 16m 32s)
  • 21:07 otto@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-main' for release 'main' .
  • 21:04 halfak@deploy1001: Started deploy [ores/deploy@039251f]: T242035
  • 21:03 otto@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
  • 21:00 otto@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-analytics' for release 'analytics' .
  • 20:53 XenoRyet: updated civicrm from 51b6fca9b2 to 2468d85f95
  • 20:51 jhuneidi@deploy1001: Synchronized php: group1 wikis to 1.35.0-wmf.14 refs T233862 (duration: 01m 04s)
  • 20:50 jhuneidi@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.35.0-wmf.14 refs T233862
  • 20:40 mutante: contint1001 - restarting zuul service
  • 20:00 otto@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-analytics' for release 'analytics' .
  • 19:31 otto@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
  • 19:16 mutante: LDAP - added 'sihe' to 'wmde' and 'nda' (T242080)
  • 19:15 otto@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
  • 19:13 joal@deploy1001: Finished deploy [analytics/refinery@c205576] (thin): Regular analytics weekly deploy train [thin] (duration: 00m 07s)
  • 19:13 joal@deploy1001: Started deploy [analytics/refinery@c205576] (thin): Regular analytics weekly deploy train [thin]
  • 19:13 joal@deploy1001: Finished deploy [analytics/refinery@c205576]: Regular analytics weekly deploy train (duration: 08m 36s)
  • 19:04 joal@deploy1001: Started deploy [analytics/refinery@c205576]: Regular analytics weekly deploy train
  • 18:46 otto@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
  • 18:46 marostegui: Remove partitions from dewiki.revision on db1096:3315 T239453
  • 18:45 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1096:3315', diff saved to https://phabricator.wikimedia.org/P10090 and previous config saved to /var/cache/conftool/dbconfig/20200108-184510-marostegui.json
  • 18:43 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1097:3315', diff saved to https://phabricator.wikimedia.org/P10089 and previous config saved to /var/cache/conftool/dbconfig/20200108-184350-marostegui.json
  • 18:39 otto@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'eventgate-logging-external' for release 'logging-external' .
  • 18:36 ppchelko@deploy1001: Finished deploy [restbase/deploy@ebb1849]: Clean up Parsoid-PHP transition code & config T241756 (duration: 14m 27s)
  • 18:33 volans: restarted wikibugs
  • 18:22 ppchelko@deploy1001: Started deploy [restbase/deploy@ebb1849]: Clean up Parsoid-PHP transition code & config T241756
  • 18:21 ppchelko@deploy1001: Finished deploy [restbase/deploy@ebb1849] (dev-cluster): Clean up Parsoid-PHP transition code & config T241756 (duration: 02m 41s)
  • 18:18 ppchelko@deploy1001: Started deploy [restbase/deploy@ebb1849] (dev-cluster): Clean up Parsoid-PHP transition code & config T241756
  • 18:07 elukey@cumin1001: END (PASS) - Cookbook sre.aqs.roll-restart (exit_code=0)
  • 18:04 elukey@cumin1001: START - Cookbook sre.aqs.roll-restart
  • 18:03 elukey@cumin1001: END (FAIL) - Cookbook sre.aqs.roll-restart (exit_code=99)
  • 18:03 elukey@cumin1001: START - Cookbook sre.aqs.roll-restart
  • 16:25 _joe_: running puppet on deploy1001 to remove my hot-patch to scap.cfg
  • 16:20 ema: rolling ats-be restart on !text@eqiad, !text@esams to apply https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/562849/
  • 16:00 bblack: re-pooling esams text traffic in DNS
  • 15:45 ema: cumin -s10 -b1 'A:cp-text_eqiad' 'run-puppet-agent -q ; ats-backend-restart'
  • 15:40 vgutierrez: restarting ats-tls on esams text nodes
  • 15:37 ema: cumin -s10 -b1 'A:cp-text_esams' 'run-puppet-agent -q ; ats-backend-restart'
  • 15:37 bblack: authdns-update to depool esams
  • 15:26 otto@deploy1001: Synchronized wmf-config/ProductionServices.php: REVERT Make EventBus use TLS for eventgate-analytics - T242224 (duration: 00m 34s)
  • 15:24 otto@deploy1001: sync-file aborted: REVERT Make EventBus use TLS for eventgate-analytics - T242224 (duration: 03m 56s)
  • 15:20 otto@deploy1001: sync-file aborted: REVERT Make EventBus use TLS for eventgate-analytics - T242224 (duration: 06m 33s)
  • 15:12 otto@deploy1001: Scap failed!: 4/11 canaries failed their endpoint checks(http://en.wikipedia.org)
  • 15:11 otto@deploy1001: sync-file aborted: Make EventBus use TLS for eventgate-analytics - T242224 (duration: 00m 00s)
  • 15:10 otto@deploy1001: Synchronized wmf-config/ProductionServices.php: Make EventBus use TLS for eventgate-analytics - T242224 (duration: 06m 10s)
  • 15:02 XioNoX: Routinator 0.6.4 looking good on rpki2001, upgrading rpki1001 - T242197
  • 15:00 ottomata: deploying change to make EventBus use new TLS port for eventgate-analytics - T242224
  • 14:35 ema: repool cp4028 after successful X-Analytics-TLS patch test T237993
  • 14:23 ema: depool cp4028 to test X-Analytics-TLS patch T237993
  • 14:07 XioNoX: add routinator 0.6.4 to reprepro stretch-wikimedia - T242197
  • 14:00 ariel@deploy1001: Finished deploy [dumps/dumps@dbd0ecd]: don't regenerate existing 7z files on rerun of the 7z recompression job (duration: 00m 05s)
  • 14:00 ariel@deploy1001: Started deploy [dumps/dumps@dbd0ecd]: don't regenerate existing 7z files on rerun of the 7z recompression job
  • 12:46 _joe_: deleting releng/composer-php55:0.1.0 from the docker registry
  • 12:36 Lucas_WMDE: EU SWAT done
  • 12:34 lucaswerkmeister-wmde@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Update Skolt Sami language name (T223544) (duration: 01m 06s)
  • 12:30 lucaswerkmeister-wmde@deploy1001: Synchronized php-1.35.0-wmf.11/extensions/Cite: SWAT: Fix handling of `` (T241303) (duration: 01m 06s)
  • 12:17 tarrow@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable tainted references on test.wikidata.org (T239621) (duration: 01m 19s)
  • 12:08 kart_: Updated cxserver to 2020-01-06-070550-production (T233405)
  • 12:04 kartik@deploy1001: helmfile [EQIAD] Ran 'apply' command on namespace 'cxserver' for release 'production' .
  • 12:01 kartik@deploy1001: helmfile [CODFW] Ran 'apply' command on namespace 'cxserver' for release 'production' .
  • 12:00 kartik@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'cxserver' for release 'staging' .
  • 11:47 akosiaris@cumin1001: conftool action : set/pooled=yes; selector: name=kubernetes2001.*
  • 11:45 akosiaris@cumin1001: conftool action : set/weight=10; selector: service=echostore
  • 11:44 vgutierrez: uploaded varnish 5.1.3-1wm12 to apt.wikimedia.org (buster) - T242093
  • 11:44 akosiaris@cumin1001: conftool action : set/weight=10; selector: name=kubernetes1001.*
  • 11:44 akosiaris@cumin1001: conftool action : set/pooled=yes; selector: name=kubernetes1001.*
  • 11:07 moritzm: test failover of Ganeti master in eqsin T228099
  • 11:00 moritzm: drain ganeti5003 to test new Ganeti setup in eqsin T228099
  • 10:53 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
  • 10:53 aborrero@cumin1001: START - Cookbook sre.hosts.downtime
  • 10:41 moritzm: rebooting netflow5001 to pick up microcode
  • 10:08 moritzm: enabling spec-ctrl, ssbd, md-clear passthrough for new eqsin cluster T228099
  • 09:27 moritzm: installing urldownloader1002 T241979
  • 09:11 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1085', diff saved to https://phabricator.wikimedia.org/P10088 and previous config saved to /var/cache/conftool/dbconfig/20200108-091124-marostegui.json
  • 09:00 moritzm: installing urldownloader1001 T241979
  • 08:29 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1085', diff saved to https://phabricator.wikimedia.org/P10087 and previous config saved to /var/cache/conftool/dbconfig/20200108-082930-marostegui.json
  • 08:20 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1085', diff saved to https://phabricator.wikimedia.org/P10086 and previous config saved to /var/cache/conftool/dbconfig/20200108-082050-marostegui.json
  • 08:09 marostegui: Upgrade db1085
  • 08:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1085', diff saved to https://phabricator.wikimedia.org/P10085 and previous config saved to /var/cache/conftool/dbconfig/20200108-080853-marostegui.json
  • 08:07 marostegui: Deploy schema change on s1 codfw, there will be lag on s1 codfw - T234052
  • 07:58 marostegui: Deploy schema change on clouddb2001-dev.labtestwiki - T234052
  • 07:20 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1079', diff saved to https://phabricator.wikimedia.org/P10084 and previous config saved to /var/cache/conftool/dbconfig/20200108-072017-marostegui.json
  • 07:13 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1079', diff saved to https://phabricator.wikimedia.org/P10083 and previous config saved to /var/cache/conftool/dbconfig/20200108-071312-marostegui.json
  • 07:07 marostegui: Remove partitions from dewiki.revision on db1097:3315 T239453
  • 07:07 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1097:3315', diff saved to https://phabricator.wikimedia.org/P10082 and previous config saved to /var/cache/conftool/dbconfig/20200108-070712-marostegui.json
  • 07:06 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1079', diff saved to https://phabricator.wikimedia.org/P10081 and previous config saved to /var/cache/conftool/dbconfig/20200108-070614-marostegui.json
  • 07:00 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1079', diff saved to https://phabricator.wikimedia.org/P10080 and previous config saved to /var/cache/conftool/dbconfig/20200108-070009-marostegui.json
  • 06:56 marostegui: Upgrade db1079
  • 06:44 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1079', diff saved to https://phabricator.wikimedia.org/P10079 and previous config saved to /var/cache/conftool/dbconfig/20200108-064404-marostegui.json
  • 06:42 marostegui: Remove partitions from revision table on s6 for db1096:3316 - T239453
  • 06:41 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1096:3316 - T239453', diff saved to https://phabricator.wikimedia.org/P10078 and previous config saved to /var/cache/conftool/dbconfig/20200108-064144-marostegui.json
  • 06:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1098:3316 - T239453', diff saved to https://phabricator.wikimedia.org/P10077 and previous config saved to /var/cache/conftool/dbconfig/20200108-063550-marostegui.json
  • 05:41 XioNoX: enable netflow in eqsin
  • 03:54 volker-e@deploy1001: Finished deploy [design/style-guide@ad595d5]: Deploy design/style-guide: (duration: 00m 08s)
  • 03:54 volker-e@deploy1001: Started deploy [design/style-guide@ad595d5]: Deploy design/style-guide:
  • 00:38 ebernhardson@deploy1001: Finished deploy [wikimedia/discovery/analytics@024488f]: airflow: set mjolnir dag start date to today (20200108) (duration: 00m 42s)
  • 00:37 ebernhardson@deploy1001: Started deploy [wikimedia/discovery/analytics@024488f]: airflow: set mjolnir dag start date to today (20200108)
  • 00:21 reedy@deploy1001: Synchronized wmf-config/throttle.php: T240845 (duration: 01m 04s)

2020-01-07

  • 23:53 ebernhardson@deploy1001: Finished deploy [wikimedia/discovery/analytics@cb228ae]: Force python to use python3.5 dependencies (take two) (duration: 00m 10s)
  • 23:53 ebernhardson@deploy1001: Started deploy [wikimedia/discovery/analytics@cb228ae]: Force python to use python3.5 dependencies (take two)
  • 23:36 mutante: [puppetmaster2001:/var/run/confd-template] $ sudo rm .cloudceph*.err
  • 23:02 cdanis: cp3055.mgmt% racadm serveraction powercycle T240425
  • 20:42 ebernhardson@deploy1001: Finished deploy [search/mjolnir/deploy@6c1f455]: Bump to master: Allow cli to load without pyspark (duration: 05m 55s)
  • 20:40 jhuneidi@deploy1001: rebuilt and synchronized wikiversions files: group0 wikis to 1.35.0-wmf.14 refs T233862
  • 20:36 ebernhardson@deploy1001: Started deploy [search/mjolnir/deploy@6c1f455]: Bump to master: Allow cli to load without pyspark
  • 20:30 jhuneidi@deploy1001: Finished scap: testwikis wikis to 1.35.0-wmf.14 refs T233862 (duration: 29m 01s)
  • 20:12 ebernhardson@deploy1001: Finished deploy [search/mjolnir/deploy@867d674]: Bump to master: Allow cli to load without pyspark (duration: 05m 13s)
  • 20:06 ebernhardson@deploy1001: Started deploy [search/mjolnir/deploy@867d674]: Bump to master: Allow cli to load without pyspark
  • 20:01 jhuneidi@deploy1001: Started scap: testwikis wikis to 1.35.0-wmf.14 refs T233862
  • 19:28 James_F: mwscript createAndPromote.php foundationwiki 'Jdforrester (WMF)' --force --custom-groups=interface-admin for T241950
  • 19:02 James_F: 1.35.0-wmf.14 was branched at fb16374 T233862
  • 18:38 ebernhardson@deploy1001: Finished deploy [wikimedia/discovery/analytics@511f745]: [airflow] Force PYTHONPATH to use pyspark 3.5 deps (duration: 00m 14s)
  • 18:38 ebernhardson@deploy1001: Started deploy [wikimedia/discovery/analytics@511f745]: [airflow] Force PYTHONPATH to use pyspark 3.5 deps
  • 17:31 Urbanecm: Run scap pull at mwdebug1001, test over
  • 17:29 Urbanecm: Stashing at mwdebug1001
  • 17:28 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db2076 T241647', diff saved to https://phabricator.wikimedia.org/P10072 and previous config saved to /var/cache/conftool/dbconfig/20200107-172839-marostegui.json
  • 17:23 marostegui: Remove partitions from dewiki.revision from db2089:3315 T239453
  • 17:19 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1088', diff saved to https://phabricator.wikimedia.org/P10071 and previous config saved to /var/cache/conftool/dbconfig/20200107-171955-marostegui.json
  • 17:18 ebernhardson@deploy1001: Finished deploy [search/mjolnir/deploy@b378752]: bump numpy to 1.17.2 (duration: 05m 53s)
  • 17:18 vgutierrez: restarting pybal on lvs1015 - T240715
  • 17:13 vgutierrez: restarting pybal on lvs1016 - T240715
  • 17:12 ebernhardson@deploy1001: Started deploy [search/mjolnir/deploy@b378752]: bump numpy to 1.17.2
  • 17:10 vgutierrez@puppetmaster1001: conftool action : set/pooled=yes; selector: service=cloudceph,name=cloudcephmon1003.wikimedia.org
  • 17:10 vgutierrez@puppetmaster1001: conftool action : set/pooled=yes; selector: service=cloudceph,name=cloudcephmon1002.wikimedia.org
  • 17:10 vgutierrez@puppetmaster1001: conftool action : set/pooled=yes; selector: service=cloudceph,name=cloudcephmon1001.wikimedia.org
  • 16:43 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Disable banner on Special:Block for partial blocks early-adopter wikis T240300 (duration: 00m 57s)
  • 16:10 elukey: cr1/cr2-eqiad: set port 443 (was 8190) for term schema in analytics-in4
  • 15:45 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1088', diff saved to https://phabricator.wikimedia.org/P10070 and previous config saved to /var/cache/conftool/dbconfig/20200107-154529-marostegui.json
  • 15:44 papaul: shutting down db2076 for FW upgrade
  • 15:41 moritzm: installing urldownloader2002 T241979
  • 15:30 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
  • 15:27 jmm@cumin2001: START - Cookbook sre.hosts.downtime
  • 15:23 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1088', diff saved to https://phabricator.wikimedia.org/P10069 and previous config saved to /var/cache/conftool/dbconfig/20200107-152304-marostegui.json
  • 15:16 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1088', diff saved to https://phabricator.wikimedia.org/P10068 and previous config saved to /var/cache/conftool/dbconfig/20200107-151633-marostegui.json
  • 15:11 moritzm: installing urldownloader2001 T241979
  • 15:09 moritzm: reimaging mw2282
  • 15:04 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1088 for upgrade', diff saved to https://phabricator.wikimedia.org/P10067 and previous config saved to /var/cache/conftool/dbconfig/20200107-150440-marostegui.json
  • 14:39 _joe_: uploading python3-docker-report to {buster,stretch}-wikimedia, T241206
  • 14:35 marostegui: Power off db2076 for on-site maintenance T241647
  • 14:32 marostegui: Stop MySQL on db2076 for maintenance T241647
  • 14:22 marostegui: Deploy schema change on s7 codfw master, this will generate lag on s7 codfw - T234052
  • 14:21 marostegui: Deploy schema change on s2 codfw master, this will generate lag on s2 codfw - T234052
  • 14:05 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
  • 14:03 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1126', diff saved to https://phabricator.wikimedia.org/P10066 and previous config saved to /var/cache/conftool/dbconfig/20200107-140300-marostegui.json
  • 14:02 jmm@cumin2001: START - Cookbook sre.hosts.downtime
  • 13:43 moritzm: reimaging mw2282
  • 13:42 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1126', diff saved to https://phabricator.wikimedia.org/P10065 and previous config saved to /var/cache/conftool/dbconfig/20200107-134251-marostegui.json
  • 13:34 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1104', diff saved to https://phabricator.wikimedia.org/P10064 and previous config saved to /var/cache/conftool/dbconfig/20200107-133439-marostegui.json
  • 12:56 Lucas_WMDE: EU SWAT done
  • 12:56 lucaswerkmeister-wmde@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Fix WBRepoCanonicalUriProperty setting for testwikidatawiki (duration: 00m 54s)
  • 12:52 lucaswerkmeister-wmde@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Fix wgImportSources setting for wikidata dblist (duration: 00m 54s)
  • 12:39 Urbanecm: Run mwscript initSiteStats.php --wiki=tawiktionary --update (T241684)
  • 12:38 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: 5be01f0: Modify $wgArticleCountMethod to any for ta.wiktionary (T241684) (duration: 00m 55s)
  • 12:32 urbanecm@deploy1001: Synchronized static/images/project-logos/: SWAT: d6ee5fe: Modify ge.wikimedia project logos (T241327) (duration: 00m 57s)
  • 12:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1104', diff saved to https://phabricator.wikimedia.org/P10063 and previous config saved to /var/cache/conftool/dbconfig/20200107-122914-marostegui.json
  • 12:17 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Clean up unused configs in InitialiseSettings.php (T238154) (duration: 00m 54s)
  • 12:15 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Clean up unused configs in InitialiseSettings.php (T238154) (duration: 00m 55s)
  • 12:13 ladsgroup@deploy1001: Synchronized wmf-config/Wikibase.php: SWAT: Clean up unused configs in Wikibase.php (T238154) (duration: 00m 54s)
  • 12:12 ladsgroup@deploy1001: Synchronized wmf-config/Wikibase.php: SWAT: Clean up unused configs in Wikibase.php (T238154) (duration: 00m 54s)
  • 12:11 ladsgroup@deploy1001: Synchronized wmf-config/CommonSettings.php: SWAT: Clean up unused configs in Wikibase.php (T238154) (duration: 00m 56s)
  • 11:12 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
  • 11:12 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'termbox' for release 'staging' .
  • 11:12 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'termbox' for release 'test' .
  • 11:11 aborrero@cumin1001: START - Cookbook sre.hosts.downtime
  • 11:10 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'termbox' for release 'staging' .
  • 11:10 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'termbox' for release 'test' .
  • 10:53 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'termbox' for release 'staging' .
  • 10:53 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'termbox' for release 'test' .
  • 10:39 ladsgroup@deploy1001: Synchronized php-1.35.0-wmf.11/extensions/Wikibase/lib/includes/Store/Sql/Terms/DatabaseTermIdsAcquirer.php: Temporarily add metrics of the need to reinsert in the new term store (duration: 00m 57s)
  • 10:19 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'termbox' for release 'staging' .
  • 10:19 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'termbox' for release 'test' .
  • 10:07 marostegui@cumin1001: dbctl commit (dc=all): 'Fully repool db1092', diff saved to https://phabricator.wikimedia.org/P10062 and previous config saved to /var/cache/conftool/dbconfig/20200107-100743-marostegui.json
  • 10:05 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'termbox' for release 'staging' .
  • 10:05 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'termbox' for release 'test' .
  • 10:01 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1092', diff saved to https://phabricator.wikimedia.org/P10061 and previous config saved to /var/cache/conftool/dbconfig/20200107-100157-marostegui.json
  • 10:01 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
  • 10:01 aborrero@cumin1001: START - Cookbook sre.hosts.downtime
  • 09:55 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1092', diff saved to https://phabricator.wikimedia.org/P10060 and previous config saved to /var/cache/conftool/dbconfig/20200107-095501-marostegui.json
  • 09:52 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'termbox' for release 'staging' .
  • 09:52 akosiaris@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'termbox' for release 'test' .
  • 09:49 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1092', diff saved to https://phabricator.wikimedia.org/P10059 and previous config saved to /var/cache/conftool/dbconfig/20200107-094944-marostegui.json
  • 09:45 marostegui@cumin1001: dbctl commit (dc=all): 'Slowly repool db1092', diff saved to https://phabricator.wikimedia.org/P10058 and previous config saved to /var/cache/conftool/dbconfig/20200107-094506-marostegui.json
  • 09:22 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1092 for alter and upgrade', diff saved to https://phabricator.wikimedia.org/P10057 and previous config saved to /var/cache/conftool/dbconfig/20200107-092221-marostegui.json
  • 08:22 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1084 for compression', diff saved to https://phabricator.wikimedia.org/P10056 and previous config saved to /var/cache/conftool/dbconfig/20200107-082236-marostegui.json
  • 08:11 ayounsi@deploy1001: Finished deploy [librenms/librenms@7a0f7aa]: Upgrade LibreNMS to 1.59 - T241962 (duration: 00m 10s)
  • 08:11 ayounsi@deploy1001: Started deploy [librenms/librenms@7a0f7aa]: Upgrade LibreNMS to 1.59 - T241962
  • 07:42 marostegui@cumin1001: dbctl commit (dc=all): 'Repool es1019', diff saved to https://phabricator.wikimedia.org/P10055 and previous config saved to /var/cache/conftool/dbconfig/20200107-074159-marostegui.json
  • 07:40 marostegui@cumin1001: dbctl commit (dc=all): 'Depool es1019 for upgrade', diff saved to https://phabricator.wikimedia.org/P10054 and previous config saved to /var/cache/conftool/dbconfig/20200107-074035-marostegui.json
  • 07:39 marostegui@cumin1001: dbctl commit (dc=all): 'Repool es1013', diff saved to https://phabricator.wikimedia.org/P10053 and previous config saved to /var/cache/conftool/dbconfig/20200107-073922-marostegui.json
  • 07:35 marostegui@cumin1001: dbctl commit (dc=all): 'Depool es1013 for upgrade', diff saved to https://phabricator.wikimedia.org/P10052 and previous config saved to /var/cache/conftool/dbconfig/20200107-073543-marostegui.json
  • 07:35 marostegui@cumin1001: dbctl commit (dc=all): 'Repool es1018', diff saved to https://phabricator.wikimedia.org/P10051 and previous config saved to /var/cache/conftool/dbconfig/20200107-073508-marostegui.json
  • 07:29 marostegui@cumin1001: dbctl commit (dc=all): 'Depool es1018 for upgrade', diff saved to https://phabricator.wikimedia.org/P10050 and previous config saved to /var/cache/conftool/dbconfig/20200107-072930-marostegui.json
  • 07:15 marostegui: Remove partitions from s5: db2084:3315 T239453
  • 07:13 marostegui: Remove partitions from revision table on s6: db1098 T239453
  • 07:08 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1098:3316 - T239453', diff saved to https://phabricator.wikimedia.org/P10049 and previous config saved to /var/cache/conftool/dbconfig/20200107-070850-marostegui.json
  • 07:05 marostegui: Depool labsdb1011
  • 07:03 marostegui: Deploy schema change on s8 codfw (this will generate lag on s8 codfw) - T234052
  • 06:48 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db2089:3316 - T239453', diff saved to https://phabricator.wikimedia.org/P10048 and previous config saved to /var/cache/conftool/dbconfig/20200107-064846-marostegui.json
  • 01:17 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
  • 01:15 ebernhardson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: InitialiseSettings - clean up groupOverrides layout / spacing (sync again) (duration: 00m 53s)
  • 01:14 ebernhardson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: InitialiseSettings - clean up groupOverrides layout / spacing (duration: 00m 54s)
  • 01:12 mutante: ganeti - creating urldownloader2002.wikimedia.org in codfw_B with 1 CPU, 1 GB RAM, 10 GB disk, public IP (T241979)
  • 01:12 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm
  • 01:09 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
  • 01:04 mutante: ganeti - creating urldownloader2001.wikimedia.org in codfw_A with 1 CPU, 1 GB RAM, 10 GB disk, public IP (T241979)
  • 01:04 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm
  • 01:03 ebernhardson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Revert: "cirrus: Shift more_like to codfw cirrus cluster" (duration: 00m 54s)
  • 01:02 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
  • 00:59 mutante: ganeti - creating urldownloader1002.wikimedia.org in eqiad_C with 1 CPU, 1 GB RAM, 10 GB disk, public IP (T241979)
  • 00:58 mutante: ganeti - creating urldownloader1001.wikimedia.org in eqiad_A with 1 CPU, 1 GB RAM, 10 GB disk, public IP (T241979)
  • 00:57 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm
  • 00:57 ebernhardson@deploy1001: Synchronized wmf-config/CirrusSearch-common.php: Revert "reduce query load on cirrus elastic clusters" (duration: 00m 54s)
  • 00:57 dzahn@cumin1001: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0)
  • 00:56 ebernhardson@deploy1001: sync-file aborted: Revert (duration: 00m 00s)
  • 00:52 dzahn@cumin1001: START - Cookbook sre.ganeti.makevm
  • 00:46 tgr@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: GrowthExperiments: use local search in production (T235717) (duration: 00m 54s)
  • 00:45 tgr@deploy1001: Synchronized wmf-config/InitialiseSettings-labs.php: SWAT: GrowthExperiments: use local search in production (T235717) (duration: 00m 58s)
  • 00:27 tgr@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable Partial Blocks on every wiki excluding those that have opted-out (T218626) (duration: 00m 55s)

2020-01-06

  • 23:49 ejegg: updated payments-wiki from 827e3235dc to c3ca3ad6a7
  • 23:12 mutante: mailman - running /usr/local/sbin/rename_list wikimediamy wikimedia-my (T241988)
  • 22:34 eileen: civicrm revision changed from b7746c31aa to 51b6fca9b2, config revision is b8af24d7c8
  • 21:28 Amir1: starting rebuild of holes in new term store from Q1Mio to Q10Mio using screen in mwmaint1002 (T219123)
  • 20:06 ejegg: updated fundraising civicrm from 5642a92223 to b7746c31aa
  • 20:02 mutante: LDAP - added 'krli' (Kris Litson) to 'wmde' and 'nda' for superset access (T241722)
  • 19:39 ebernhardson@deploy1001: Finished deploy [search/mjolnir/deploy@41a22b8]: Bump to latest master (duration: 06m 57s)
  • 19:32 ebernhardson@deploy1001: Started deploy [search/mjolnir/deploy@41a22b8]: Bump to latest master
  • 19:26 Urbanecm: Morning SWAT done
  • 19:25 urbanecm@deploy1001: Synchronized dblists/commonsuploads.dblist: SWAT: 0f045c3: Enable local uploads on inh.wiki (T239925) (duration: 00m 54s)
  • 19:22 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: 7722ff3: 0bff587: Add www.digital.archives.go.jp/mediaphoto.mnhn.fr to the wgCopyUploadsDomains (T238476, T241637) (duration: 00m 54s)
  • 19:19 urbanecm@deploy1001: Synchronized wmf-config/throttle.php: SWAT: 1324af9: Add throttle rule for ECLAC editathon in Santiago, Chile (T241414) (duration: 00m 54s)
  • 19:14 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: c3a3248: Add sandboxlink for eswikivoyage (T241163) (duration: 00m 58s)
  • 19:08 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: d7a19ca: Enable GeoData extension in ruwikinews (T239000) (duration: 00m 56s)
  • 18:49 ebernhardson@deploy1001: Finished deploy [search/airflow@8db442c]: match cryptography package with debian buster (duration: 00m 53s)
  • 18:48 ebernhardson@deploy1001: Started deploy [search/airflow@8db442c]: match cryptography package with debian buster
  • 18:17 ebernhardson@deploy1001: Finished deploy [search/airflow@8ae8500]: Require apache-airflow[kerberos] python package (duration: 00m 27s)
  • 18:16 ebernhardson@deploy1001: Started deploy [search/airflow@8ae8500]: Require apache-airflow[kerberos] python package
  • 17:11 jakob@deploy1001: helmfile [STAGING] Ran 'sync' command on namespace 'termbox' for release 'test' .
  • 16:56 jakob@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'termbox' for release 'test' .
  • 16:27 jakob@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'termbox' for release 'test' .
  • 15:57 milimetric@deploy1001: Finished deploy [analytics/refinery@09133cf]: Fix for geoeditors monthly (duration: 11m 49s)
  • 15:47 herron: migrating mx1001 to secondary ganeti node T240906
  • 15:45 milimetric@deploy1001: Started deploy [analytics/refinery@09133cf]: Fix for geoeditors monthly
  • 15:30 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: [officewiki] Grant ipblock-exempt to all users T231943 (duration: 00m 56s)
  • 15:06 ariel@deploy1001: Finished deploy [dumps/dumps@db81d78]: avoid aborts on some symlink cleanup failures (duration: 00m 06s)
  • 15:06 ariel@deploy1001: Started deploy [dumps/dumps@db81d78]: avoid aborts on some symlink cleanup failures
  • 15:04 XioNoX: remove BGP to AS13285 in ulsfo (IXP not listed in peeringdb anymore)
  • 14:56 XioNoX: remove BGP to AS13285 in eqiad (IXP not listed in peeringdb anymore)
  • 14:32 reedy@deploy1001: Synchronized wmf-config/InitialiseSettings-labs.php: Enable WebAuthn everywhere (duration: 00m 54s)
  • 14:31 reedy@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Enable WebAuthn everywhere (duration: 00m 57s)
  • 13:56 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
  • 13:54 jmm@cumin2001: START - Cookbook sre.hosts.downtime
  • 13:35 moritzm: reimaging mw2282 to validate correctness of apt::package_from_component for fresh installs
  • 12:58 Urbanecm: EU SWAT done
  • 12:56 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: 88c800c: Add basic transwiki sources for ltwiki (T241288) (duration: 00m 54s)
  • 12:50 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: c44b4ff: Enable subpages for the main namespace on ge.wikimedia (T241329) (duration: 00m 55s)
  • 12:46 Urbanecm: mwscript namespaceDupes.php --wiki=napwikisource --fix (T231880)
  • 12:45 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: 864a2f8: Set Author and Author_talk aliases for Autore NS at napwikisource (T231880) (duration: 00m 55s)
  • 12:43 Urbanecm: mwscript namespaceDupes.php --wiki=zhwiktionary --fix (T241023)
  • 12:41 urbanecm@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: 0baf554: Add new namespace and aliases for zh.wiktionary (T241023) (duration: 00m 54s)
  • 12:39 urbanecm@deploy1001: sync-file aborted: SWAT: 0ac5032: Add throttle exception for Amical Wikimedia Workshop (T241705) (duration: 00m 01s)
  • 12:39 jakob@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'termbox' for release 'test' .
  • 12:37 urbanecm@deploy1001: Synchronized wmf-config/throttle.php: SWAT: 0ac5032: Add throttle exception for Amical Wikimedia Workshop (T241705) (duration: 00m 56s)
  • 12:31 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Don’t check constraints on P6685 statements Bypassing T236104 (duration: 00m 55s)
  • 12:28 ladsgroup@deploy1001: Synchronized php-1.35.0-wmf.11/maintenance/rebuildLocalisationCache.php: SWAT: Add option to override storeClass in rebuildLocalisationCache (T105683 T99740) (duration: 00m 55s)
  • 12:25 ladsgroup@deploy1001: Synchronized wmf-config/CommonSettings.php: SWAT: Revert "Add a bit for forcing LC caching backend in cli mode" (duration: 00m 54s)
  • 12:23 jakob@deploy1001: helmfile [STAGING] Ran 'apply' command on namespace 'termbox' for release 'test' .
  • 12:18 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Don’t check constraints on P6685 statements (T227865) (duration: 00m 55s)
  • 12:15 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Set read new for item term store up to Q100K (T219123) (duration: 00m 55s)
  • 12:07 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Enable wgCiteResponsiveReferences on cswiki (T241304) (duration: 00m 56s)
  • 11:42 jdrewniak@deploy1001: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 55s)
  • 11:41 jdrewniak@deploy1001: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 00m 55s)
  • 10:56 urbanecm@deploy1001: Synchronized private/PrivateSettings.php: Revert T227416 mitigations (duration: 01m 05s)
  • 10:39 moritzm: installing libbsd security updates
  • 09:53 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
  • 09:51 jmm@cumin2001: START - Cookbook sre.hosts.downtime
  • 09:32 moritzm: reimaging mw2282 to validate correctness of apt::package_from_component for fresh installs
  • 07:37 Amir1: ladsgroup@mwmaint1002:~$ mwscript extensions/Wikibase/repo/maintenance/rebuildItemTerms.php --wiki=wikidatawiki --batch-size=100 --sleep=2 --file=/tmp/1mio.lines (T219301)
  • 03:53 Amir1: ladsgroup@mwmaint1002:~$ mwscript extensions/Wikibase/repo/maintenance/rebuildItemTerms.php --wiki=wikidatawiki --batch-size=100 --sleep=2 --file=/tmp/100k.lines (T219301)
  • 00:06 effie: pool cp3065 T238305
  • 00:05 jiji@cumin1001: conftool action : set/pooled=yes; selector: name=cp3065.esams.wmnet

2020-01-05

  • 23:56 effie: powercycle cp3065.esams.wmnet T238305
  • 23:53 jiji@cumin1001: conftool action : set/pooled=no; selector: name=cp3065.esams.wmnet
  • 13:09 Urbanecm: mwmaint1002: mwscript importImages.php --wiki=commonswiki --comment-ext=txt --user=Coffeeandcrumbs /home/urbanecm/T241917 (T241917)

2020-01-04

  • 16:34 aborrero@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
  • 16:34 aborrero@cumin1001: START - Cookbook sre.hosts.downtime
  • 16:34 aborrero@cumin1001: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99)
  • 16:34 aborrero@cumin1001: START - Cookbook sre.hosts.downtime

2020-01-03

  • 22:14 volker-e@deploy1001: Finished deploy [design/style-guide@8054026]: Deploy design/style-guide: (duration: 00m 08s)
  • 22:14 volker-e@deploy1001: Started deploy [design/style-guide@8054026]: Deploy design/style-guide:
  • 17:44 jynus@cumin1001: dbctl commit (dc=all): 'Repool db2084 instances T241103', diff saved to https://phabricator.wikimedia.org/P10035 and previous config saved to /var/cache/conftool/dbconfig/20200103-174447-jynus.json
  • 16:54 ejegg: updated fundraising CiviCRM from 217a1f8c63 to 5642a92223
  • 16:36 jynus: stopping db2084
  • 15:04 marostegui: Upgrade db2107
  • 14:58 marostegui: Deploy schema changes on s2 and s4 eqiad hosts T234052
  • 14:56 jbond42: clean up old /etc/apt/preferences.d/smartmontools.pref file
  • 14:48 jbond42: clean up old /etc/apt/preferences.d/puppet_all.pref file
  • 14:45 jbond42: clean up old /etc/apt/preferences.d/facter.pref file
  • 14:15 Urbanecm: Run undelete.php on a couple of pages at plwikisource per T241824
  • 13:50 marostegui: Deploy schema change on s4 codfw (lag will appear on codfw s4) - T234052
  • 13:46 moritzm: restarting exim on MXes to pick up SASL security update
  • 11:00 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1074 after schema change', diff saved to https://phabricator.wikimedia.org/P10033 and previous config saved to /var/cache/conftool/dbconfig/20200103-110028-marostegui.json
  • 10:20 moritzm: restarting apache on cloudmetrics* to pick up SASL security update
  • 10:11 moritzm: installing cyrus-sasl2 security updates on stretch/buster
  • 09:42 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1074 schema change', diff saved to https://phabricator.wikimedia.org/P10032 and previous config saved to /var/cache/conftool/dbconfig/20200103-094252-marostegui.json
  • 09:38 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db1076 after schema change', diff saved to https://phabricator.wikimedia.org/P10031 and previous config saved to /var/cache/conftool/dbconfig/20200103-093829-marostegui.json
  • 09:21 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db1076 schema change', diff saved to https://phabricator.wikimedia.org/P10030 and previous config saved to /var/cache/conftool/dbconfig/20200103-092107-marostegui.json
  • 08:17 marostegui: Deploy schema change on labswiki (wikitech) T234052
  • 07:10 marostegui: Deploy schema change on s2 codfw master, lag will appear on codfw - T234052
  • 06:57 marostegui: Deploy schema change on s6 eqiad hosts - T234052
  • 06:23 marostegui: Deploy schema change on db2089:3316
  • 06:22 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2089:3316 - T239453', diff saved to https://phabricator.wikimedia.org/P10029 and previous config saved to /var/cache/conftool/dbconfig/20200103-062242-marostegui.json
  • 06:21 marostegui@cumin1001: dbctl commit (dc=all): 'Repool db2087:3316 - T239453', diff saved to https://phabricator.wikimedia.org/P10028 and previous config saved to /var/cache/conftool/dbconfig/20200103-062148-marostegui.json

2020-01-02

  • 23:33 ejegg: updated Fundraising CiviCRM from d534f4e966 to 217a1f8c63
  • 23:09 ejegg: updated Fundraising CiviCRM from 6936aa0262 to d534f4e966
  • 22:44 ejegg: updated fundraising CiviCRM from f4db7fdb31 to 6936aa0262
  • 20:48 ejegg: updated Fundraising CiviCRM from abf0019c44 to f4db7fdb31
  • 20:30 sbassett@deploy1001: Synchronized wmf-config/CommonSettings.php: Deploying revert of temporary patch for T241503 (permissions clean-up) (duration: 00m 53s)
  • 19:57 sbassett@deploy1001: Synchronized wmf-config/CommonSettings.php: Deploying temporary patch for T241503 (permissions clean-up) (duration: 00m 54s)
  • 18:53 ejegg: re-enabled fundraising cron jobs
  • 18:29 ejegg: disabled fundraising cron jobs
  • 16:15 moritzm: restarting Apache on graphite* hosts to pick up SASL security update
  • 16:11 moritzm: restarting Apache on webperf* hosts to pick up SASL security update
  • 15:52 moritzm: restarting Apache on puppetboard* hosts to pick up SASL security update
  • 15:46 moritzm: restarting FPM on parsoid canary to pick up SASL security update
  • 14:27 marostegui: Deploy schema change on s6 codfw master (db2129) with replication - T234052
  • 14:22 marostegui: Deploy schema change on s5 eqiad hosts - T234052
  • 14:05 moritzm: restarting PHP/Apache on mw canaries to pick up SASL security update
  • 13:47 moritzm: installing cyrus-sasl security updates on Stretch/Buster
  • 13:23 marostegui: Deploy schema change on s5 codfw master (db2123) with replication - T234052
  • 13:17 moritzm: upgrading jessie servers to intel-microcode 3.20191115.2
  • 13:14 foks: scramble password for Windy906
  • 13:00 XioNoX: enable BFD traceoptions on cr1-eqiad and cr3-knams - T240659
  • 12:41 moritzm: upgrade recently reimaged hosts to puppet 5 T239832
  • 12:32 moritzm: upgrade recently reimaged hosts to facter 3 T239832
  • 12:09 jmm@cumin2001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
  • 12:07 jmm@cumin2001: START - Cookbook sre.hosts.downtime
  • 11:53 moritzm: restarting FPM on scandium to clear opcache health
  • 11:42 moritzm: reimaging mw2277 to validate fix for puppet5/facter3 installation on new installs T239832
  • 11:23 arturo: import more openstack packages into stretch-wikimedia thirdparty/openstack-pike-stretch (T241347)
  • 10:43 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
  • 10:40 filippo@cumin1001: START - Cookbook sre.hosts.downtime
  • 09:58 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0)
  • 09:58 filippo@cumin1001: START - Cookbook sre.hosts.downtime
  • 08:58 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2076 T241647', diff saved to https://phabricator.wikimedia.org/P10021 and previous config saved to /var/cache/conftool/dbconfig/20200102-085806-marostegui.json
  • 08:35 marostegui: Upgrade db2090
  • 08:26 marostegui: Upgrade db2075
  • 08:10 marostegui: Deploy schema change on officewiki.flow_wiki_ref on s3 master (db1123) T241387
  • 07:49 marostegui: Deploy schema change on techconductwiki.flow_wiki_ref (empty table) on s3 master (db1123) T241387
  • 07:26 marostegui: Upgrade db2079
  • 07:18 marostegui: Deploy schema change on labswiki.flow_wiki_ref (empty table) T241387
  • 06:46 marostegui: Deploy schema change on db2131 - T241387
  • 06:44 marostegui: Repool labsdb1009
  • 06:30 marostegui: Upgrade labsdb1009
  • 06:29 marostegui: Remove revision partitions from db2087:3316 T239453
  • 06:26 marostegui@cumin1001: dbctl commit (dc=all): 'Depool db2087:3316 - T239453', diff saved to https://phabricator.wikimedia.org/P10020 and previous config saved to /var/cache/conftool/dbconfig/20200102-062650-marostegui.json
  • 06:22 marostegui: Depool labsdb1009
  • 00:22 ejegg: re-enabled fundraising cron jobs

2020-01-01

  • 21:13 ejegg: stopped fundraising cron jobs to calculate EOY summaries
  • 04:57 andrewbogott: depooling labweb1002 so I can hotfix labweb1001 for T240734


Archives

See Server admin log/Archives.