Server Admin Log: Difference between revisions


Revision as of 21:15, 23 November 2018

2018-11-23

  • 21:15 bawolff: deploy patch for T210192
  • 16:53 bblack: cleaned up remnants of globalsign-2017 unified cert (OCSP cache/config, unmanaged cert files, etc) on all cpNNNN - T206804
  • 14:02 gehel: restore wdqs-updater heap to 2G - T210235
  • 13:32 moritzm: installing confuse security updates
  • 12:04 gehel: manually increasing wdqs-updater heap to 4G - T210235
  • 11:37 gehel: restarting updater on all wdqs nodes
  • 08:41 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=.*,service=zotero,cluster=kubernetes,name=.*
  • 08:10 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1078 - T86339 (duration: 00m 46s)
  • 07:48 moritzm: installing libtirpc security updates
  • 07:14 marostegui: Deploy schema change db1078 - T86339
  • 07:14 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1078 - T86339 (duration: 00m 45s)
  • 07:00 marostegui: Deploy schema change db1095 - T86339
  • 06:59 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1123 - T86339 (duration: 00m 49s)
  • 06:32 marostegui: Deploy schema change db1123 - T86339
  • 06:31 marostegui: Deploy schema change dbstore1002:s3 - T86339
  • 06:29 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1123 - T86339 (duration: 00m 48s)
  • 06:15 marostegui: Deploy schema change on s3 codfw master (db2043) with replication - T86339
  • 06:13 marostegui: Deploy schema change on db1067 (s1) master - T86339
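
The entries above follow a recurring depool, schema change, repool pattern. As a rough illustration of how long a replica stayed out of rotation, the sketch below pairs the "Depool" and "Repool" sync entries per host and reports the gap in minutes. The regular expression and the function name `depool_windows` are assumptions made for this example, not part of any Wikimedia tooling, and the sketch only handles a single day's entries.

```python
import re
from datetime import datetime

# Matches sync entries such as:
#   "07:14 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1078 - T86339 (duration: 00m 45s)"
# The pattern is an assumption based on the entries in this log.
ENTRY = re.compile(
    r"(?P<time>\d{2}:\d{2}) \S+: Synchronized \S+: "
    r"(?P<action>Depool|Repool) (?P<host>\S+)"
)

def depool_windows(lines):
    """Return {host: minutes between its Depool and Repool sync} for one day."""
    depooled, windows = {}, {}
    # The log is newest-first, so walk it in reverse to get chronological order.
    for line in reversed(lines):
        m = ENTRY.search(line)
        if not m:
            continue  # not a Depool/Repool sync entry
        t = datetime.strptime(m["time"], "%H:%M")
        if m["action"] == "Depool":
            depooled[m["host"]] = t
        elif m["host"] in depooled:
            delta = t - depooled.pop(m["host"])
            windows[m["host"]] = int(delta.total_seconds() // 60)
    return windows

log = [
    "08:10 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1078 - T86339 (duration: 00m 46s)",
    "07:14 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1078 - T86339 (duration: 00m 45s)",
    "06:59 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1123 - T86339 (duration: 00m 49s)",
    "06:29 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1123 - T86339 (duration: 00m 48s)",
]
print(depool_windows(log))  # {'db1123': 30, 'db1078': 56}
```

Times are taken from the sync entries only, so the result approximates the out-of-rotation window rather than the duration of the schema change itself.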

2018-11-22

  • 21:08 godog: disable raid handler for ms-be2021 - T208096
  • 19:01 moritzm: installing uriparser security updates
  • 18:46 moritzm: installing openjpeg2 security updates
  • 18:45 arturo: enable puppet in all CloudVPS HW servers
  • 18:38 arturo: disable puppet in all CloudVPS HW servers to test a patch (T209948)
  • 17:46 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool es1016 (duration: 00m 46s)
  • 16:49 jynus: upgrading, and restarting es1016 (but not deleting, that was a mistake)
  • 16:49 jynus: upgrading, deleting at and restarting es1016
  • 16:35 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool es1016 (duration: 00m 47s)
  • 16:06 ema: trafficserver_8.0.0-1wm3 uploaded to stretch-wikimedia
  • 15:23 akosiaris@deploy1001: scap-helm zotero finished
  • 15:23 akosiaris@deploy1001: scap-helm zotero cluster codfw completed
  • 15:23 akosiaris@deploy1001: scap-helm zotero install --name production -f zotero-values-eqiad.yaml stable/zotero [namespace: zotero, clusters: codfw]
  • 15:22 akosiaris@deploy1001: scap-helm zotero finished
  • 15:22 akosiaris@deploy1001: scap-helm zotero cluster eqiad completed
  • 15:22 akosiaris@deploy1001: scap-helm zotero install --name production -f zotero-values-eqiad.yaml stable/zotero [namespace: zotero, clusters: eqiad]
  • 14:50 akosiaris@deploy1001: scap-helm zotero install --name production -f zotero-values-eqiad.yaml stable/zotero [namespace: zotero, clusters: eqiad]
  • 13:53 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1080 - T86339 (duration: 00m 45s)
  • 13:52 marostegui@deploy1001: sync-file aborted: Depool db1080 - T86339 (duration: 00m 00s)
  • 13:50 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1080 - T86339 (duration: 00m 46s)
  • 13:46 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1083 - T86339 (duration: 00m 46s)
  • 13:45 marostegui: Deploy schema change on s1 eqiad hosts T86339
  • 13:43 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1083 - T86339 (duration: 00m 46s)
  • 13:38 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1089 - T86339 (duration: 00m 45s)
  • 13:36 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1089 - T86339 (duration: 00m 46s)
  • 13:30 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1114 - T86339 (duration: 00m 45s)
  • 13:26 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1114 - T86339 (duration: 00m 43s)
  • 13:21 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1119 - T86339 (duration: 00m 46s)
  • 13:17 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1119 - T86339 (duration: 00m 47s)
  • 12:19 jynus: upgrading, deleting at and restarting dbstore2002
  • 10:21 jynus: upgrading and restarting dbstore1001
  • 10:01 godog: bounce rsyslog on lithium, tls listener timeout on icinga
  • 09:54 godog: bounce rsyslog on wezen, tls listener timeout on icinga
  • 09:49 jynus: stop and upgrade dbstore2001
  • 09:24 moritzm: installing ruby-l18n security updates
  • 09:20 moritzm: installing ruby-rack security updates
  • 09:06 moritzm: installing jasper security updates
  • 08:21 marostegui: Deploy schema change on s1 codfw master (db2048) with replication - T86339
  • 08:19 marostegui: Deploy schema change on db1062 (s7 master) - T86339
  • 08:19 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1094 - T86339 (duration: 00m 46s)
  • 08:17 marostegui: Deploy schema change on db1094 - T86339
  • 08:16 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1094 - T86339 (duration: 00m 49s)
  • 06:58 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1086 - T86339 (duration: 00m 46s)
  • 06:55 marostegui: Deploy schema change on db1086 - T86339
  • 06:55 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1086 - T86339 (duration: 00m 46s)
  • 06:51 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1079 - T86339 (duration: 00m 46s)
  • 06:48 marostegui: Deploy schema change on db1079 (sanitarium master) with replication - T86339
  • 06:48 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1079 - T86339 (duration: 00m 51s)
  • 02:25 mutante: stat1005 - started nagios-nrpe-server

2018-11-21

  • 23:34 mutante: rsyncing /home from rutherfordium.eqiad to people1001.eqiad (people.wikimedia.org) T210036
  • 21:16 robh: cp5001 is offline running hardware tests after firmware updates to see if memory error still exists. ref: T199675
  • 20:55 robh: cp5001 reboot for firmware update
  • 19:54 milimetric@deploy1001: Finished deploy [analytics/aqs/deploy@e114d99]: Fixing sorting bug on top endpoints (duration: 05m 34s)
  • 19:49 milimetric@deploy1001: Started deploy [analytics/aqs/deploy@e114d99]: Fixing sorting bug on top endpoints
  • 19:43 ejegg: updated fundraising internal dashboard from b01458b260 to 5e9fb9a3ef
  • 17:21 elukey: manually started systemd-journald.service on scb1001 after OOM
  • 17:20 jynus: stop and upgrade db2081
  • 17:04 jynus: stop and upgrade db2080
  • 16:40 jynus: stop and upgrade db2066
  • 16:37 bawolff: deploy patch T209794
  • 16:25 jynus: stop and upgrade db2063
  • 15:15 jynus: stop and upgrade db2073
  • 15:15 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T85757: repool db1085 (duration: 00m 46s)
  • 15:03 banyek: repooling db1085 after schema change (T85757)
  • 15:00 banyek: restarting replication on db1085 (T85757)
  • 14:31 banyek: stopping replication on db1085 (T85757)
  • 14:27 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T85757: depool db1085 (duration: 00m 46s)
  • 14:22 banyek: depooling db1085 due to schema change (T85757)
  • 13:39 jynus: stop and upgrade db2077
  • 13:05 XioNoX: remove BGP session to 2603 on cr4-ulsfo
  • 12:13 jynus: stop and upgrade db2076
  • 12:08 banyek: running schema change on dbstore1001:3316 (T85757)
  • 12:08 banyek: running schema change on dbstore1001 (T85757)
  • 12:04 jynus: stop and upgrade db2075
  • 11:53 banyek: running schema change on dbstore1002 (T85757)
  • 11:00 akosiaris: disable puppet on ores2* ores1* for gradual rollout of https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/474694/1/modules/ores/manifests/web.pp
  • 10:55 jynus: stop and upgrade db2074
  • 10:51 _joe_: uploading prometheus-php-fpm-exporter to stretch-wikimedia main, T209573
  • 10:44 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T85757: repool db1096:3316 (duration: 00m 46s)
  • 10:42 banyek: repooling db1096:3316 after schema change (T85757)
  • 10:30 jynus: stop and upgrade db2095
  • 10:26 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T85757: depool db1096:3316 (duration: 00m 45s)
  • 10:26 godog: initial weight for new ms-be2* hosts (all but ms-be2047) - T209395
  • 10:23 banyek: depooling db1096:3316 due to schema change (T85757)
  • 10:14 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T85757: repool db1098:3316 (duration: 00m 46s)
  • 10:13 banyek@deploy1001: sync-file aborted: T85757: depool db1098:3316 (duration: 00m 03s)
  • 10:11 banyek: repooling db1098 after schema change (T85757)
  • 09:57 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T85757: depool db1098:3316 (duration: 00m 46s)
  • 09:52 banyek: depooling db1098:3316 due to schema change (T85757)
  • 09:49 volans: restarted pdfrender on scb1003
  • 09:29 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T85757: repool db1113 (duration: 00m 46s)
  • 09:21 banyek: repooling db1113 after schema change (T85757)
  • 09:09 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T85757: depool db1113 (duration: 00m 46s)
  • 09:01 banyek: depooling db1113 due to schema change (T85757)
  • 08:48 marostegui: Deploy schema change on s7 codfw master - T86339
  • 08:44 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1104 - T86339 (duration: 00m 45s)
  • 08:40 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1104 - T86339 (duration: 00m 45s)
  • 08:33 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1109 - T86339 (duration: 00m 46s)
  • 08:30 marostegui: Deploy schema changes on s8 eqiad hosts - T86339
  • 08:28 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1109 - T86339 (duration: 00m 46s)
  • 07:50 marostegui: Deploy schema change on s8 codfw master (db2045) with replication - T86339
  • 07:46 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1091 - T86339 (duration: 00m 45s)
  • 07:42 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1091 - T86339 (duration: 00m 46s)
  • 07:36 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1084 - T86339 (duration: 00m 46s)
  • 07:33 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1084 - T86339 (duration: 00m 46s)
  • 07:32 marostegui: Deploy schema change on s4 eqiad hosts - T86339
  • 07:19 marostegui: Deploy schema change on db2051 (s4 codfw master) with replication - T86339
  • 07:10 marostegui: Drop foundationwiki.petition_data from s3 master (db1075) with replication - T208979
  • 07:06 marostegui: Deploy schema change on db1066 (s2 master) - T86339
  • 07:05 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1076 - T86339 (duration: 00m 46s)
  • 07:02 marostegui: Deploy schema change on db1076 - T86339
  • 07:02 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1076 - T86339 (duration: 00m 46s)
  • 06:53 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1074 - T86339 (duration: 00m 46s)
  • 06:48 marostegui: Deploy schema change on db1074 (sanitarium master) - T86339
  • 06:48 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1074 - T86339 (duration: 00m 46s)
  • 06:43 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1122 - T86339 (duration: 00m 46s)
  • 06:39 marostegui: Deploy schema change on db1122 - T86339
  • 06:38 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1122 - T86339 (duration: 00m 51s)
  • 06:30 marostegui: Drop schema change on db1103:3312 and db1105:3312 - T86339
  • 00:48 sbisson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: gerrit:474967 Disable FlaggedRevs, enable RC patrol and add rights on srwikinews (duration: 00m 47s)
  • 00:39 sbisson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: gerrit:475005 Enable SVGs in page in group1, rest of group0 (duration: 00m 46s)
  • 00:32 sbisson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: gerrit:474976 Enable suppressredirect on srwiki (duration: 00m 47s)
  • 00:24 sbisson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: gerrit:472744 Enable RCPatrol and add some rights on srwikibooks (duration: 00m 46s)
  • 00:07 sbisson@deploy1001: Synchronized php-1.33.0-wmf.4/extensions/GrowthExperiments/includes/Specials/SpecialWelcomeSurvey.php: gerrit:474946 WelcomeSurvey: indicate that the special page does write (duration: 00m 47s)

2018-11-20

  • 23:01 XioNoX: create vol.ans account on switches - T208726
  • 22:54 pmiazga@deploy1001: Synchronized wmf-config: SYNC: noop Doc: add repoConceptBaseUri comment (T209352); noop: Remove utf-8 characters from DOC comment for better readability (T209352); beta: Wikibase: override repoConceptBaseUri (T209352) (duration: 00m 49s)
  • 22:13 XioNoX: create volans account on routers - T208726
  • 21:12 eileen: civicrm revision changed from f4127d5316 to 013807a7b9, config revision is 684ec9b7c0
  • 21:08 jgleeson: civicrm changed from a31dbefc61 to f4127d5316
  • 18:56 ebernhardson: start loading dumps into elastic codfw omega and psi from mwmaint2001
  • 18:19 ladsgroup@deploy1001: Synchronized wmf-config/Wikibase.php: Create Federated Wikibase instance on Beta Commons, part II (T204748) (duration: 00m 47s)
  • 18:17 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Create Federated Wikibase instance on Beta Commons (T204748) (duration: 00m 48s)
  • 18:08 mholloway-shell@deploy1001: Finished deploy [mobileapps/deploy@7553087]: Deploy 2018 app fundraising announcement config (T204821) (duration: 03m 37s)
  • 18:04 mholloway-shell@deploy1001: Started deploy [mobileapps/deploy@7553087]: Deploy 2018 app fundraising announcement config (T204821)
  • 18:03 bstorm_: rebooting labsdb1006 for upgrades T209517
  • 17:25 bstorm_: rebooting labsdb1004 for upgrades T209517
  • 17:19 gehel: reload nginx configuration on elasticsearch codfw
  • 16:59 jynus@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2072 (duration: 00m 46s)
  • 16:53 XioNoX: rollback all BFD tests on cr1-codfw
  • 16:10 jynus: stop and upgrade db2072
  • 15:59 jynus@deploy1001: Synchronized wmf-config/db-codfw.php: Repool db2071, depool db2072 (duration: 00m 47s)
  • 15:51 banyek: repooling labsdb1011 (T209517)
  • 15:41 banyek: uploaded wmf-pt-kill_2.2.20-1+wmf5 packages to stretch-wikimedia (T209517)
  • 15:38 vgutierrez: switching to certcentral managed TLS certificate for librenms.wikimedia.org - T209856
  • 15:36 XioNoX: add test term allow BFD multihop on cr1-codfw loopback4 filter
  • 15:36 ejegg: updated fundraising CiviCRM from e648be0d9e to a31dbefc61
  • 15:20 moritzm: installing libopenmpt security updates
  • 15:12 XioNoX: enable bfd traceoptions on cr1-codfw
  • 15:02 XioNoX: Add BFD multihop support to Bird anycast DNS
  • 14:55 jijiki: libthumbor_1.3.2-0+wmf1+stretch1 uploaded to stretch-wikimedia T209886
  • 14:43 chasemp: puppet temp disable on es2001 for data transfer work
  • 14:32 jynus: stop and upgrade db2033
  • 13:19 jynus: stop and upgrade db2082
  • 13:06 banyek: depooling labsdb1011 (T209517)
  • 13:03 zeljkof: EU SWAT finished
  • 13:03 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings-labs.php: SWAT: deployment-prep: Update parsoid09 IP (T208101) (duration: 00m 47s)
  • 12:55 zfilipin@deploy1001: Synchronized wmf-config/db-labs.php: SWAT: deployment-prep: Update deployment-db* IPs (T208101) (duration: 00m 47s)
  • 12:55 banyek: setting innodb_flush_log_at_trx_commit to 2 on dbstore2002 (s3 instance only!) (T208320)
  • 12:53 banyek: setting innodb_flush_log_at_trx_commit to 2 on dbstore2002 (T208320)
  • 12:49 zfilipin@deploy1001: scap failed: average error rate on 4/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/db09a36be5ed3e81155041f7d46ad040 for details)
  • 12:45 zfilipin@deploy1001: Synchronized wmf-config/reverse-proxy-staging.php: SWAT: deployment-prep: Update cache-upload private IP (T208101) (duration: 00m 45s)
  • 12:30 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Use HD logos in InitialiseSettings.php for multiple projects (T150618) (duration: 00m 48s)
  • 12:25 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add tboverride permission to extendedmover group on enwiki (T209753) (duration: 00m 47s)
  • 12:19 jynus: powercycling db2087, stuck on reboot
  • 12:13 zfilipin@deploy1001: Synchronized static/images/project-logos/: SWAT: Upload HD logos for multiple projects (T150618) (duration: 00m 48s)
  • 11:55 moritzm: rolling reboot of proton hosts for kernel security update
  • 11:27 jynus: stop and upgrade db2087
  • 11:16 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T85757: (now really) repool db1093 (duration: 00m 47s)
  • 11:11 banyek: repooling db1093 (T85757)
  • 11:05 banyek: executing schema change on db1093 (T85757)
  • 11:00 jynus: stop and upgrade db2086
  • 10:59 banyek: db1093 was depooled; wrong message sent
  • 10:51 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T85757: repool db1093 (duration: 00m 47s)
  • 10:48 banyek: depooling db1093 (T85757)
  • 10:48 banyek: depooling db1093
  • 10:47 jynus@deploy1001: Synchronized wmf-config/db-codfw.php: Repool es2018 (duration: 00m 46s)
  • 10:17 jynus: upgrade and reboot es2018
  • 10:13 jynus@deploy1001: Synchronized wmf-config/db-codfw.php: Repool es2014, depool es2018 (duration: 00m 46s)
  • 09:34 marostegui: Deploy schema change on s2 hosts: dbstore1002, db1090:3312 and db1095:3312 - T86339
  • 09:26 marostegui: Deploy schema change on s2 codfw master (db2035) with replication - T86339
  • 09:25 jynus: upgrade and reboot es2014
  • 09:23 jynus@deploy1001: Synchronized wmf-config/db-codfw.php: Repool es2011, depool es2014 (duration: 00m 46s)
  • 09:23 godog: stress-test new ms-be hardware - T209395
  • 09:13 marostegui: Stop MySQL on pc2004, pc2005 and pc2006 for decommission - T209858
  • 09:05 gehel: powercycle elastic2021
  • 09:04 marostegui: Remove pc2004, pc2005 and pc2006 from tendril and zarcillo - T209858
  • 08:53 jynus: upgrade and reboot es2011
  • 08:48 jynus@deploy1001: Synchronized wmf-config/db-codfw.php: Depool es2011 (duration: 00m 47s)
  • 06:28 marostegui: Deploy schema change on db1070 (s5 master) - T86339
  • 06:28 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1082 - T86339 (duration: 00m 47s)
  • 06:21 marostegui: Deploy schema change on db1082 - T86339
  • 06:21 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1082 - T86339 (duration: 00m 52s)
  • 00:55 catrope@deploy1001: Synchronized static/images/project-logos/: Correct logos for Sindhi Wiktionary (duration: 00m 47s)
  • 00:51 mutante: Gerrit: added Jeena Huneidi to wmf-deployers (T209722)
  • 00:26 catrope@deploy1001: Synchronized php-1.33.0-wmf.4/resources/src/mediawiki.rcfilters/ui/mw.rcfilters.ui.FilterTagMultiselectWidget.js: RCFilters bug fix (T209657) (duration: 00m 47s)
  • 00:23 XioNoX: registering librenms IRC bot
  • 00:15 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Increase Schema.org page split test to 100% sampling (T208755) (duration: 00m 48s)

2018-11-19

  • 23:18 ejegg: updated fundraising CiviCRM from 275716f000 to e648be0d9e
  • 22:41 ejegg: updated fundraising CiviCRM from bbc0dddd1e to 275716f000
  • 22:21 ejegg: updated fundraising CiviCRM from 6b279509f8 to bbc0dddd1e
  • 21:58 XioNoX: restart bird on dns2001 to try to establish the BFD sessions
  • 20:27 catrope@deploy1001: Finished scap: Full scap for special alias changes for GrowthExperiments (duration: 21m 03s)
  • 20:06 catrope@deploy1001: Started scap: Full scap for special alias changes for GrowthExperiments
  • 19:50 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Enable WelcomeSurvey on cswiki and kowiki (T209725) (duration: 00m 46s)
  • 19:44 catrope@deploy1001: Synchronized php-1.33.0-wmf.4/extensions/WikimediaEvents/: EditorJourney fixes (T207307) (duration: 00m 46s)
  • 19:36 catrope@deploy1001: Synchronized php-1.33.0-wmf.4/extensions/GrowthExperiments/: WelcomeSurvey fixes (T206371) (duration: 00m 46s)
  • 19:25 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Enable WelcomeSurvey on testwiki (T209725) (duration: 00m 49s)
  • 18:29 cmjohnson1: connecting eqiad asw2-b fpc2 and fpc8
  • 18:17 onimisionipe@deploy1001: Finished deploy [wdqs/wdqs@a25eb30]: GUI Update, new executor limits and new blazegraph build (duration: 08m 53s)
  • 18:08 onimisionipe@deploy1001: Started deploy [wdqs/wdqs@a25eb30]: GUI Update, new executor limits and new blazegraph build
  • 15:45 ladsgroup@deploy1001: Finished deploy [ores/deploy@e957b24]: T209587 T170950 (duration: 17m 09s)
  • 15:28 ladsgroup@deploy1001: Started deploy [ores/deploy@e957b24]: T209587 T170950
  • 15:23 milimetric@deploy1001: Finished deploy [analytics/aqs/deploy@b399c34]: Removing empty fields from unique result (duration: 05m 17s)
  • 15:18 milimetric@deploy1001: Started deploy [analytics/aqs/deploy@b399c34]: Removing empty fields from unique result
  • 15:08 anomie@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Setting actor migration to write-both/read-old on group 0 (T188327) (duration: 00m 47s)
  • 14:02 joal@deploy1001: Finished deploy [analytics/aqs/deploy@7cde8c8]: Update unique-devices schema adding 2 fields (duration: 20m 57s)
  • 13:55 gtirloni: T207377 reboot cloudcontrol1004
  • 13:43 moritzm: installing chromium security update on proton* (tested new upstream release in deployment-prep)
  • 13:41 joal@deploy1001: Started deploy [analytics/aqs/deploy@7cde8c8]: Update unique-devices schema adding 2 fields
  • 13:39 fdans@deploy1001: Finished deploy [analytics/aqs/deploy@7cde8c8]: Deploying AQS to add two new fields to uniques (duration: 06m 18s)
  • 13:39 akosiaris: cumin -b1 -s 300 'ores2*' 'enable-puppet "merge of https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/474158/" ; puppet agent -t ; service uwsgi-ores restart ; service celery-ores-worker restart'
  • 13:36 akosiaris: disable puppet on ores1* and ores2* for slow deployment of https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/474158/
  • 13:33 fdans@deploy1001: Started deploy [analytics/aqs/deploy@7cde8c8]: Deploying AQS to add two new fields to uniques
  • 13:21 gtirloni: T207377 icinga downtime and reboot of labcontrol1001 and labservices1001
  • 13:08 arturo: T207377 icinga downtime and reboot of cloudcontrol1003 and cloudservices1003
  • 13:06 raynor: EU SWAT finished
  • 13:03 pmiazga@deploy1001: Synchronized wmf-config/CommonSettings.php: SWAT: gerrit:474679 In SecurePoll use gpg1 to avoid gpg-agent autostart (T209802) (duration: 00m 48s)
  • 12:50 raynor: EU SWAT reopened
  • 12:40 raynor: EU SWAT finished
  • 12:38 pmiazga@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: gerrit:472918 Enable autopatrol, patrol, rollback rights and RCPatrol on srwiktionary (T209252) (duration: 00m 46s)
  • 12:21 pmiazga@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: gerrit:474124 Remove wgMetaNamespaceTalk for shnwiki (T206777) (duration: 00m 46s)
  • 12:10 pmiazga@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: gerrit:473225 Enable Schema.org page split test at 50% sampling (T208755) (duration: 00m 46s)
  • 11:33 gtirloni: upgraded packages on labsdb1011 (pre-work T209517)
  • 11:20 elukey: restart memcached on mc1020 to apply -R 200 settings (shard wiped) - T208844
  • 10:41 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Fully repool db1078 T209754 (duration: 00m 46s)
  • 10:21 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Increase weight for db1078 T209754 (duration: 00m 46s)
  • 10:21 banyek: stopping replication on db2076 (T85757)
  • 10:11 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Increase weight for db1078 T209754 (duration: 00m 46s)
  • 09:57 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Fully repool db1123 and increase weight for db1078 T209754 (duration: 00m 46s)
  • 09:48 marostegui: Rename table foundationwiki.petition_data on db1078 - T208979
  • 09:46 marostegui: Drop empty testwiki.petition_data from db1075 with replication - T208979
  • 09:44 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Slowly repool db1123 and db1078 T209754 (duration: 00m 46s)
  • 09:10 Nikerabbit: Rebuilt message group stats cache for T208521
  • 08:43 banyek: executing schema change on db2095 (T85757)
  • 07:57 marostegui: Stop MySQL on db1123 - T209754
  • 07:56 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1123 to clone db1078 T209754 (duration: 00m 47s)
  • 06:21 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Add db1078 line back to config file but depooled T209754 (duration: 00m 51s)
  • 06:20 marostegui@deploy1001: sync-file aborted: Add db1078 line back to config file but depooled T209754 (duration: 00m 02s)
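
The 13:39 cumin entry above runs a command across 'ores2*' with `-b1 -s 300`, which appear to correspond to cumin's batch size and batch sleep options (one host at a time, 300 seconds between batches). The sketch below illustrates that rolling pattern in plain Python. It is not cumin code: `rolling_run`, the host names, and the no-op action are hypothetical stand-ins, and whether cumin also sleeps after the final batch is not something this log shows.

```python
import time

def rolling_run(hosts, action, batch_size=1, batch_sleep=300, sleep=time.sleep):
    """Apply `action` to hosts in batches, pausing between batches.

    batch_size and batch_sleep mirror the -b and -s values in the logged
    command; everything else is a hypothetical stand-in, not cumin code.
    """
    results = {}
    for start in range(0, len(hosts), batch_size):
        batch = hosts[start:start + batch_size]
        for host in batch:
            results[host] = action(host)
        # Assumption: pause only when another batch is still to come.
        if start + batch_size < len(hosts):
            sleep(batch_sleep)
    return results

# Hypothetical host names and a no-op action, for illustration only.
pauses = []
done = rolling_run(
    ["ores2001", "ores2002", "ores2003"],
    action=lambda host: "ok",
    sleep=pauses.append,  # record pauses instead of really sleeping
)
print(done)    # {'ores2001': 'ok', 'ores2002': 'ok', 'ores2003': 'ok'}
print(pauses)  # [300, 300]
```

Passing `sleep` as a parameter keeps the sketch testable without waiting; a real rollout would leave the default in place.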

2018-11-18

  • 17:26 andrewbogott: restarting cp1078 from mgmt console
  • 09:00 elukey: cleaned up analytics1039 and restarted Yarn

2018-11-17

  • off: 'reset modified attributes' on IcingaUI for db1078 (and mgmt) and all its services
  • 06:38 oblivian@deploy1001: Synchronized wmf-config/db-eqiad.php: Depooling db1078 (duration: 00m 59s)
  • 02:54 RoanKattouw: Deployed patches for T208112, T208109, T208110

2018-11-16

  • 23:13 smalyshev@deploy1001: Finished deploy [wdqs/wdqs@ee91c41]: Deploy on test wdqs1010 (duration: 00m 23s)
  • 23:12 smalyshev@deploy1001: Started deploy [wdqs/wdqs@ee91c41]: Deploy on test wdqs1010
  • 16:23 Trey314159: reindexing Chinese wikis on elastic@eqiad and elastic@codfw (T209156)
  • 15:46 moritzm: rebooting debmonitor* instances for kernel security update and to pick up SSBD
  • 15:01 hashar: restarting zuul with 1300 events to process
  • 14:56 marostegui: Create ipblocks_restrictions on labswiki and labtestwiki on db1073 - T209674
  • 14:56 ema: upgrade cp-ats to 8.0.0-1wm2 T204225 T204209
  • 14:39 ema: trafficserver 8.0.0-1wm2 uploaded to stretch-wikimedia T204225 T204209
  • 14:36 hashar: Gracefully stopping zuul (kill -SIGUSR1)
  • 14:29 _joe_: re-depooling mw1261 for php-fpm testing
  • 14:28 oblivian@puppetmaster1001: conftool action : set/pooled=yes; selector: name=mw126[1-6].*,dc=eqiad,cluster=appserver
  • 14:26 _joe_: repooling the mw canaries
  • 13:02 moritzm: installing spamassassin security update on mendelevium
  • 12:10 godog: reboot restbase1014, nothing on console
  • 11:13 kartik@deploy1001: Finished deploy [cxserver/deploy@473b0de]: Update cxserver to b7cdb26 (T208831, T203077, T203160, T206777) (duration: 04m 26s)
  • 11:09 kartik@deploy1001: Started deploy [cxserver/deploy@473b0de]: Update cxserver to b7cdb26 (T208831, T203077, T203160, T206777)
  • 10:39 akosiaris: upgrade OTRS to 5.0.32 T209691
  • 09:31 marostegui: Set back sync_binlog=1 and trx_commit=1 after dbstore2002:3313 has caught up
  • 09:25 moritzm: installing postgres updates on labsdb1006
  • 09:21 moritzm: removed labvirt1016 from debmonitor db, got renamed to cloudvirt1016
  • 08:40 moritzm: installing curl security updates on jessie
  • 07:32 elukey: forced reboot + fsck + removal of /var/lib/hadoop/data/l from fstab on analytics1029
  • 06:36 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Pool pc2009 in pc3 - T208383 (duration: 00m 56s)
  • 06:28 marostegui: Set sync_binlog=0 and trx_commit=2 on dbstore2002:3313 to let it catch up
  • 05:47 vgutierrez: uploaded certcentral 0.7 to apt.wikimedia.org (stretch) - T208967 T209475
  • 00:55 mutante: some users reported missing files in home dirs on mwmaint1002, reversed rsyncd/ferm setup and rsynced /home from mwmaint2001 to /root on mwmaint1002, restored individually where requested, rsync is not fully automatic but puppetized with rsync::quickdatacopy
  • 00:35 sbisson@deploy1001: Synchronized php-1.33.0-wmf.4/extensions/WikimediaEvents/includes/PageViews.php: SWAT: Exclude users where getRegistration() returns null (duration: 00m 47s)
  • 00:26 eileen: civicrm revision changed from 71755d021b to 6b279509f8, config revision is 684ec9b7c0 (lybunt report)
  • 00:22 sbisson@deploy1001: Synchronized php-1.33.0-wmf.4/extensions/GrowthExperiments: SWAT: gerrit:473843 gerrit:473844 gerrit:473845 (duration: 00m 49s)
  • 00:21 sbisson@deploy1001: sync aborted: php-1.33.0-wmf.4/extensions/GrowthExperiments SWAT: gerrit:473843 gerrit:473844 gerrit:473845 (duration: 06m 16s)
  • 00:15 sbisson@deploy1001: Started scap: php-1.33.0-wmf.4/extensions/GrowthExperiments SWAT: gerrit:473843 gerrit:473844 gerrit:473845

2018-11-15

  • 21:58 mutante: mwmaint1002 - restoring entire /home of mwmaint1001 from Bacula (job queued and to tmp dir, not directly into /home)
  • 21:06 hashar: Deleting Nodepool instances on contintcloud T209361
  • 21:05 hashar: Stopped nodepool on labnodepool1001.eqiad.wmnet. Service is no longer used. T209361 T209642
  • 20:22 mholloway-shell@deploy1001: Finished deploy [kartotherian/deploy@UNKNOWN]: Fix: Loosen WDQS content-type header check to unbreak maps (T209471) (duration: 03m 55s)
  • 20:18 mholloway-shell@deploy1001: Started deploy [kartotherian/deploy@UNKNOWN]: Fix: Loosen WDQS content-type header check to unbreak maps (T209471)
  • 20:17 mutante: re-added Chase to pwstore, signed .users file, re-encrypted all pwstore files, git pushed
  • 20:15 mholloway-shell@deploy1001: Finished deploy [kartotherian/deploy@48a1e83]: Fix: Loosen WDQS content-type header check to unbreak maps (T209471) (duration: 04m 26s)
  • 20:11 mholloway-shell@deploy1001: Started deploy [kartotherian/deploy@48a1e83]: Fix: Loosen WDQS content-type header check to unbreak maps (T209471)
  • 20:04 urandom: dropping disused keyspaces -- T208616
  • 20:04 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Enable data collection for UnderstandingFirstDay on cswiki and kowiki (duration: 00m 53s)
  • 19:51 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Configure sensitive namespaces for EditorJourney schema (T207307) (duration: 00m 53s)
  • 19:49 catrope@deploy1001: Synchronized php-1.33.0-wmf.4/extensions/WikimediaEvents/: cherry-picks for T208773 (duration: 00m 54s)
  • 18:55 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Hot-deploy I26f2dc2e: Don't over-ride default Wikibase string limits (duration: 00m 53s)
  • 18:50 jforrester@deploy1001: Synchronized wmf-config/Wikibase.php: Hot-deploy Ib10de2e3: Don't set Wikibase string limits when null (duration: 00m 55s)
  • 18:31 ladsgroup@deploy1001: Finished deploy [ores/deploy@dba11e9]: Another small update (duration: 13m 42s)
  • 18:18 ladsgroup@deploy1001: Started deploy [ores/deploy@dba11e9]: Another small update
  • 18:17 ladsgroup@deploy1001: Finished deploy [ores/deploy@51cdf6b]: T208623 (duration: 14m 41s)
  • 18:03 ladsgroup@deploy1001: Started deploy [ores/deploy@51cdf6b]: T208623
  • 17:27 anomie@mwmaint1002: Running refreshExternallinksIndex.php on section 7 wikis in group 2 for T209373. This may cause lag in codfw.
  • 17:27 anomie@mwmaint1002: Running refreshExternallinksIndex.php on section 6 wikis in group 2 for T209373. This may cause lag in codfw.
  • 17:27 anomie@mwmaint1002: Running refreshExternallinksIndex.php on section 5 wikis in group 2 for T209373. This may cause lag in codfw.
  • 17:27 anomie@mwmaint1002: Running refreshExternallinksIndex.php on section 3 wikis in group 2 for T209373. This may cause lag in codfw.
  • 17:27 anomie@mwmaint1002: Running refreshExternallinksIndex.php on section 2 wikis in group 2 for T209373. This may cause lag in codfw.
  • 17:27 anomie@mwmaint1002: Running refreshExternallinksIndex.php on section 1 wikis in group 2 for T209373. This may cause lag in codfw.
  • 17:20 gehel: upgrade prometheus-blazegraph-exporter on all wdqs nodes - T206123
  • 16:41 bstorm_: rebooted labsdb1007 for upgrades
  • 16:37 andrewbogott: rebuilding labvirt1015 and cloudvirt1015
  • 14:21 anomie@mwmaint1002: Running refreshExternallinksIndex.php on wikitech for T209373. This may cause lag in codfw.
  • 14:21 anomie@mwmaint1002: Running refreshExternallinksIndex.php on section 8 wikis in group 1 for T209373. This may cause lag in codfw.
  • 14:21 anomie@mwmaint1002: Running refreshExternallinksIndex.php on section 7 wikis in group 1 for T209373. This may cause lag in codfw.
  • 14:21 anomie@mwmaint1002: Running refreshExternallinksIndex.php on section 5 wikis in group 1 for T209373. This may cause lag in codfw.
  • 14:21 anomie@mwmaint1002: Running refreshExternallinksIndex.php on section 4 wikis in group 1 for T209373. This may cause lag in codfw.
  • 14:21 anomie@mwmaint1002: Running refreshExternallinksIndex.php on section 3 wikis in group 1 for T209373. This may cause lag in codfw.
  • 14:20 anomie@mwmaint1002: Running refreshExternallinksIndex.php on section 2 wikis in group 1 for T209373. This may cause lag in codfw.
  • 14:07 gehel: plugin and JVM upgrade on elasticsearch / cirrus / eqiad completed - T209293
  • 13:59 hashar@deploy1001: rebuilt and synchronized wikiversions files: all wikis to 1.33.0-wmf.4
  • 13:55 XioNoX: push firewall policies to pfw3-eqiad - T209421
  • 13:50 banyek: Deploy schema change on s6 codfw master, this will generate lag on s6 codfw - T85757
  • 13:48 moritzm: installing qemu security updates (which also backport support for SSBD passthrough) on ganeti clusters
  • 13:48 moritzm: installing qemu security updates (which also backport support for SSBD passthrough)
  • 13:12 mobrovac@deploy1001: Finished deploy [restbase/deploy@22cb0ec]: Add new wikis to RESTBase - T206777 T205710 T205546 T204477 (duration: 19m 56s)
  • 13:00 Lucas_WMDE: EU SWAT finished
  • 12:59 tarrow@deploy1001: Synchronized php-1.33.0-wmf.4/extensions/RevisionSlider/modules/ext.RevisionSlider.SliderView.js: gerrit:473710 Fix (accidentally?) reversed blue and yellow lines SWAT T208238 T162119 again (duration: 00m 55s)
  • 12:57 _joe_: upping pm.maxworkers to 40 on mw1261 on php7.2-fpm, benchmarking T206341
  • 12:56 tarrow@deploy1001: scap failed: average error rate on 3/11 canaries increased by 10x (rerun with --force to override this check, see https://logstash.wikimedia.org/goto/db09a36be5ed3e81155041f7d46ad040 for details)
  • 12:53 moritzm: installing nginx security updates
  • 12:52 mobrovac@deploy1001: Started deploy [restbase/deploy@22cb0ec]: Add new wikis to RESTBase - T206777 T205710 T205546 T204477
  • 12:47 lucaswerkmeister-wmde@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Remove wgWBQualityConstraintsCacheCheckConstraintsResults (T207854) (duration: 00m 54s)
  • 12:42 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Prod: increase Schema.org page split test to 25% sampling T208755 (duration: 00m 53s)
  • 12:40 arturo: T207377 downtime and reboot labmon1001
  • 12:38 lucaswerkmeister-wmde@deploy1001: Synchronized php-1.33.0-wmf.3/extensions/RevisionSlider/modules/ext.RevisionSlider.SliderView.js: Fix (accidentally?) reversed blue and yellow lines (T162119, T208238) (duration: 00m 54s)
  • 12:37 lucaswerkmeister-wmde@deploy1001: sync aborted: php-1.33.0-wmf.3/extensions/RevisionSlider/modules/ext.RevisionSlider.SliderView.js Fix (accidentally?) reversed blue and yellow lines (T162119, T208238) (duration: 00m 11s)
  • 12:37 lucaswerkmeister-wmde@deploy1001: Started scap: php-1.33.0-wmf.3/extensions/RevisionSlider/modules/ext.RevisionSlider.SliderView.js Fix (accidentally?) reversed blue and yellow lines (T162119, T208238)
  • 12:30 tarrow@deploy1001: Synchronized wmf-config/Wikibase.php: gerrit:473716 Read WikibaseStringLimit in Wikibase.php T154660 (duration: 00m 53s)
  • 12:30 moritzm: draining ganeti1001 for reboot/kernel security update
  • 12:26 tarrow@deploy1001: sync-file aborted: gerrit:473716 Read WikibaseStringLimit in Wikibase.php (duration: 00m 01s)
  • 12:21 tarrow@deploy1001: Synchronized wmf-config/InitialiseSettings.php: gerrit:473694 Set Wikibase string-limits for wikidata dblist T154660 (duration: 00m 54s)
  • 12:08 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Make AdvancedSearch the default on de-, fa-, ar-, and hu-wiki T207640 (duration: 00m 55s)
  • 11:34 _joe_: depooling mw1261 for early benchmarks of php7.2-fpm
  • 11:04 akosiaris: enable puppet across the fleet. puppetdb1001 reboot done, ganeti migration_downtime setting applied
  • 10:49 akosiaris: disable puppet across the fleet for puppetdb1001 reboot
  • 10:49 moritzm: fail over ganeti master in eqiad to ganeti1003
  • 10:34 akosiaris: set migration_downtime=2000 for puppetdb1001. Should help with migration stalls
  • 10:15 banyek: sanitizing db1124 ( T205714 T207584 T205713 T206916 )
  • 10:07 banyek: sanitizing db2094 ( T205714 T207584 T205713 T206916 )
  • 09:57 moritzm: draining ganeti1002 for reboot/kernel security update
  • 09:40 volans: restarting icinga on icinga1001
  • 09:32 vgutierrez: restarting icinga on icinga1001
  • 09:31 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1110 (duration: 00m 52s)
  • 09:22 moritzm: draining ganeti1003 for reboot/kernel security update
  • 09:16 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1110 (duration: 00m 53s)
  • 09:16 marostegui: Stop MySQL on db1110 for upgrade
  • 09:08 moritzm: reset failed debmonitor session in ms-be2038
  • 09:08 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T85757: repool db1088 (duration: 00m 53s)
  • 09:03 banyek: repooling db1088 (T85757)
  • 08:58 ema: upload fifo-log-demux 0.1 to stretch-wikimedia T204225
  • 08:56 banyek: Deploy schema change on db1088 (T85757)
  • 08:49 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T85757: depool db1088 (duration: 00m 53s)
  • 08:45 banyek: depooling db1088 due to a schema change (T85757)
  • 08:21 moritzm: draining ganeti1004 for reboot/kernel security update
  • 07:42 marostegui: Drop site_stats.ss_total_views from labswiki - T86339
  • 07:08 elukey: memcached on mc1019 restarted to apply -R 200 - T208844
  • 06:57 marostegui: Stop MySQL on db2071 to upgrade MySQL and kernel
  • 06:57 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool db2071 (duration: 00m 54s)
  • 06:06 marostegui: Stop MySQL on pc2006 to clone pc2009 - T208383
  • 06:06 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool pc2006 - T208383 (duration: 00m 53s)
  • 06:05 marostegui@deploy1001: sync-file aborted: Depool pc2006 - T208383 (duration: 00m 00s)
  • 05:59 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Pool pc2008 in pc2 - T208383 (duration: 00m 56s)
  • 00:50 niharika29@deploy1001: Synchronized wmf-config/InitialiseSettings.php: increase Schema.org page split test to 5% sampling T208755 (duration: 00m 54s)

2018-11-14

  • 22:46 ejegg: updated payments-wiki from 5751286f1c to d2b66c5bab
  • 21:34 thcipriani: restart gerrit to load JavaMelody dependency library
  • 21:28 thcipriani@deploy1001: Finished deploy [gerrit/gerrit@ab2fa18]: deploy javamelody on cobalt (duration: 00m 09s)
  • 21:28 thcipriani@deploy1001: Started deploy [gerrit/gerrit@ab2fa18]: deploy javamelody on cobalt
  • 21:27 thcipriani@deploy1001: Finished deploy [gerrit/gerrit@ab2fa18]: deploy javamelody on gerrit2001 (duration: 00m 11s)
  • 21:26 thcipriani@deploy1001: Started deploy [gerrit/gerrit@ab2fa18]: deploy javamelody on gerrit2001
  • 20:13 hashar@deploy1001: Synchronized php: group1 wikis to 1.33.0-wmf.4 (duration: 00m 52s)
  • 20:12 hashar@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.33.0-wmf.4
  • 19:14 hashar@deploy1001: Synchronized php-1.33.0-wmf.4/includes/jobqueue/JobQueue.php: Actually return the value from getRootJobCacheKey() - T209429 (duration: 00m 53s)
  • 18:33 hashar@deploy1001: Finished scap: php-1.33.0-wmf.4/includes/libs/objectcache/MemcachedPeclBagOStuff.php Add trace to debug memcached bad key error - T209429 (duration: 34m 07s)
  • 17:58 hashar@deploy1001: Started scap: php-1.33.0-wmf.4/includes/libs/objectcache/MemcachedPeclBagOStuff.php Add trace to debug memcached bad key error - T209429
  • 17:53 arturo: T207377 downtime and reboot cloudnet1003 (cloudnet1004 is the active one already)
  • 17:37 arturo: T207377 downtime and reboot cloudnet1004 (cloudnet1003 is the active one already)
  • 17:31 bawolff_: Running importImage.php for 'Opening ceremony of First accusation protest against presumption of guilt of judicial branch.webm' per request T209495
  • 17:26 sbisson@deploy1001: Synchronized dblists/wikidataclient.dblist: SWAT: Add incubatorwiki to wikidataclient.dblist (duration: 00m 48s)
  • 17:19 sbisson@deploy1001: Synchronized php-1.33.0-wmf.4/extensions/MobileFrontend/resources/mobile.editor.common/schemaEditAttemptStep.js: SWAT: schemaEditAttemptStep.js: Use correct config var name for sampling rate (duration: 00m 54s)
  • 17:12 sbisson@deploy1001: Synchronized php-1.33.0-wmf.4/extensions/WikimediaEvents/includes/WikimediaEventsHooks.php: SWAT: Fix EditAttemptStepSamplingRate variable export (duration: 00m 54s)
  • 16:33 bblack: [Done] replacement of GlobalSign unified TLS cert at US edges complete - T206804
  • 16:25 moritzm: rebooting ganeti1005 for kernel security update
  • 16:16 moritzm: rebooting restbase-dev1006 for kernel security update and OpenJDK security update
  • 16:10 bblack: disabling puppet as precaution on all caches (cumin A:cp) - T206804
  • 16:09 bblack: starting replacement of GlobalSign unified TLS cert at US edges (affects all public TLS termination for US traffic edges) - T206804
  • 16:08 moritzm: rebooting restbase-dev1005 for kernel security update and OpenJDK security update
  • 15:58 moritzm: rebooting restbase-dev1004 for kernel security update and OpenJDK security update
  • 15:53 jiji: Restarting pdfrender on scb*.eqiad.wmnet
  • 15:50 godog: roll restart swift-proxy in eqiad to apply statsd changes
  • 15:41 banyek@deploy1001: Synchronized wmf-config/db-codfw.php: T85757: repool db2046 (duration: 00m 52s)
  • 15:39 banyek: repooling db2046 (T85757)
  • 15:12 godog: roll-restart swift on ms-be1* to pick up statsd changes
  • 15:07 Amir1: ladsgroup@mwmaint1002:/srv/mediawiki-staging/php-1.33.0-wmf.4$ mwscript sql.php --wiki=incubatorwiki extensions/Wikibase/client/sql/entity_usage.sql (T209207)
  • 14:56 addshore@deploy1001: Synchronized wmf-config: Prod: Enable Schema.org page split test at 1% sampling (again) (duration: 00m 54s)
  • 14:29 godog: roll-restart swift-proxy in codfw to pick up statsd changes
  • 14:14 addshore@deploy1001: Synchronized wmf-config: Revert Prod: Enable Schema.org page split test at 1% sampling (duration: 00m 54s)
  • 14:10 Reedy: Wiki created T205714 T207584 T205713 T206916
  • 14:07 gehel: starting plugin and JVM upgrade on elasticsearch / cirrus / eqiad - T209293
  • 14:07 reedy@deploy1001: Synchronized wmf-config/interwiki.php: Updating interwiki cache (duration: 02m 25s)
  • 14:02 reedy@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Fix shnwiki TZ (duration: 00m 54s)
  • 13:57 reedy@deploy1001: rebuilt and synchronized wikiversions files: (no justification provided)
  • 13:55 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Add pc2010 as spare - T208383 (duration: 00m 53s)
  • 13:54 reedy@deploy1001: Synchronized multiversion/MWMultiVersion.php: (no justification provided) (duration: 00m 53s)
  • 13:51 reedy@deploy1001: Synchronized wmf-config/InitialiseSettings.php: new wikis (duration: 00m 53s)
  • 13:51 gehel: restarting tilerator on maps1004 for config change
  • 13:50 reedy@deploy1001: Synchronized static/images/project-logos/: (no justification provided) (duration: 00m 53s)
  • 13:49 reedy@deploy1001: Synchronized dblists/: new wikis! (duration: 00m 53s)
  • 13:48 reedy@deploy1001: Synchronized langlist: shn (duration: 00m 52s)
  • 13:45 gehel: plugin and JVM upgrade completed on elasticsearch / cirrus / codfw - T209293
  • 13:45 reedy@deploy1001: rebuilt and synchronized wikiversions files: (no justification provided)
  • 13:07 moritzm: installing ghostscript security updates on stretch
  • 13:01 reedy@deploy1001: Synchronized php-1.33.0-wmf.4/extensions/WikimediaMaintenance/addWiki.php: Fixing addshores code... (duration: 00m 53s)
  • 13:00 reedy@deploy1001: Synchronized php-1.33.0-wmf.3/extensions/WikimediaMaintenance/addWiki.php: Fixing addshores code... (duration: 00m 55s)
  • 12:56 moritzm: installing gettext "security" updates for trusty
  • 12:48 moritzm: installing python3.4 security updates on trusty (Debian already fixed)
  • 12:40 reedy@deploy1001: Synchronized php-1.33.0-wmf.3/extensions/WikimediaMaintenance/addWiki.php: Unbreak adding wiktionary (duration: 00m 52s)
  • 12:39 reedy@deploy1001: Synchronized php-1.33.0-wmf.4/extensions/WikimediaMaintenance/addWiki.php: Unbreak adding wiktionary (duration: 00m 53s)
  • 12:39 moritzm: installing python security updates on trusty
  • 12:36 Amir1: EU SWAT is done
  • 12:35 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Revert the language of votewiki to English (en) (T207560) (duration: 00m 55s)
  • 12:30 ladsgroup@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Start reading from change_tag_def on wikidatawiki (T208846) (duration: 00m 55s)
  • 12:19 pmiazga@deploy1001: Synchronized wmf-config: SWAT: [[gerrit:473079|Enable Schema.org page split test at 1% sampling (T208755)]] (duration: 00m 54s)
  • 12:02 reedy@deploy1001: Synchronized php-1.33.0-wmf.4/extensions/WikimediaMaintenance/addWiki.php: Unbreak adding TMH tables (duration: 00m 53s)
  • 12:01 reedy@deploy1001: Synchronized php-1.33.0-wmf.3/extensions/WikimediaMaintenance/addWiki.php: Unbreak adding TMH tables (duration: 00m 55s)
  • 11:11 moritzm: draining ganeti1005 for reboot/kernel security update
  • 11:09 banyek: Deploy schema change on db2046 (T85757)
  • 10:07 banyek@deploy1001: Synchronized wmf-config/db-codfw.php: T85757: depooling db2046 (duration: 00m 55s)
  • 09:59 banyek: depooling db2046 (T85757)
  • 09:22 moritzm: updated stretch netinst image for 9.6 point release
  • 09:17 marostegui: Deploy schema change on db2053 - T86339
  • 08:24 marostegui: Deploy schema change on s5 codfw master, this will generate lag on s5 codfw - T205913
  • 08:22 marostegui: Deploy schema change on s7 codfw master, this will generate lag on s7 codfw - T205913
  • 08:19 marostegui: Deploy schema change on s2 codfw master, this will generate lag on s2 codfw - T205913
  • 08:17 marostegui: Deploy schema change on s6 codfw master, this will generate lag on s6 codfw - T205913
  • 08:14 marostegui: Deploy schema change on s4 codfw master, this will generate lag on s4 codfw - T205913
  • 08:08 marostegui: Deploy schema change on s3 codfw master, this will generate lag on s3 codfw - T205913
  • 08:07 godog: rollout rsyslog_exporter to eqiad
  • 07:42 marostegui: Deploy schema change on s3 codfw master, this will generate lag on s3 codfw - T203709
  • 07:19 marostegui: Deploy schema change on s7 codfw master, this will generate lag on s7 codfw - T203709
  • 07:07 marostegui: Deploy schema change on s2 codfw master, this will generate lag on s2 codfw - T203709
  • 06:52 marostegui: Deploy schema change on s4 codfw master, this will generate lag on s4 codfw - T203709
  • 06:40 marostegui: Deploy schema change on s6 codfw master, this will generate lag on s6 codfw - T203709
  • 06:32 marostegui: Stop MySQL on pc2005 to clone it to pc2008 - T208383
  • 06:27 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool pc2005 - T208383 (duration: 01m 04s)
  • 05:46 _joe_: restarting gerrit
  • 01:02 thcipriani@deploy1001: Synchronized php-1.33.0-wmf.3/extensions/Wikibase/client/includes: SWAT: Update: use wikibase-debug logger instead of "PageRandomLookup" T208796 (duration: 00m 56s)
  • 00:42 mutante: restarted smokeping on netmon1002 and netmon2001

2018-11-13

  • 22:42 XioNoX: restart librenms irc bot
  • 22:24 XioNoX: add term labnet-nova-api to cloud-in4 on cr1/2-eqiad - T209424
  • 20:22 herron: updated labs realm smarthosts (via hiera) to mx-out0[12].wmflabs.org T41785
  • 19:49 otto@deploy1001: Finished deploy [analytics/refinery@62d6f4b]: Deploy hive jars from CDH 5.10.0 to workaround Refine bug: T209407 (duration: 05m 57s)
  • 19:43 otto@deploy1001: Started deploy [analytics/refinery@62d6f4b]: Deploy hive jars from CDH 5.10.0 to workaround Refine bug: T209407
  • 19:31 herron: uploaded librdkafka_0.11.6-1~bpo9+1+wikimedia1 packages to stretch-wikimedia T209300
  • 18:11 mutante: the CUSTOM message from ores.svc.codfw was the (one-time) test of the new Icinga server
  • 18:03 mutante: icinga migration has concluded, we are now on stretch and icinga1001, einsteinium is passive (T202782)
  • 17:27 mutante: re-enabled puppet on icinga1001, einsteinium becoming passive
  • 17:21 mutante: ran puppet on einsteinium; re-enabling puppet on tegmen and icinga1001
  • 17:13 bstorm_: Added 172.16.0.0/21 to the allowed connections for wikilabels postgresql on labsdb1004
  • 17:04 mutante: disabled puppet on all 3 icinga servers, re-enabling on einsteinium, going through https://wikitech.wikimedia.org/wiki/Icinga#Failover_Icinga_between_the_active_and_passive_servers
  • 17:02 ejegg: updated payments-wiki from 20542c9184 to 5751286f1c
  • 17:01 mutante: starting migration of icinga server - maintenance windows
  • 16:33 thcipriani: restarting gerrit service for upgrade to 2.15.6
  • 16:32 thcipriani@deploy1001: Finished deploy [gerrit/gerrit@d2763c6]: v2.15.6 to cobalt (duration: 00m 10s)
  • 16:32 thcipriani@deploy1001: Started deploy [gerrit/gerrit@d2763c6]: v2.15.6 to cobalt
  • 16:29 thcipriani@deploy1001: Finished deploy [gerrit/gerrit@d2763c6]: v2.15.6 to gerrit2001 (duration: 00m 11s)
  • 16:29 thcipriani@deploy1001: Started deploy [gerrit/gerrit@d2763c6]: v2.15.6 to gerrit2001
  • 16:22 anomie@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Setting actor migration to write-both/read-old on test wikis and mediawikiwiki (T188327) (duration: 00m 54s)
  • 16:07 anomie@mwmaint1002: Running refreshExternallinksIndex.php on labtestwiki for T209373
  • 16:07 anomie@mwmaint1002: Running refreshExternallinksIndex.php on section 3 wikis in group 0 for T209373
  • 15:48 _joe_: upgrading extensions on all appservers / jobrunners while upgrading to php 7.2
  • 15:45 gehel: restart tilerator on maps1004
  • 15:21 moritzm: draining ganeti1006 for reboot/kernel security update
  • 15:18 marostegui: Restore replication consistency options on dbstore2002:3313 as it has caught up - T208320
  • 14:59 akosiaris: increase the migration downtime for kafkamon1001. It should make live migration of these VMs easier and without the need for manual fiddling
  • 14:54 hashar@deploy1001: rebuilt and synchronized wikiversions files: group to 1.33.0-wmf.4 | T206658
  • 14:40 hashar@deploy1001: Finished scap: testwiki to php-1.33.0-wmf.4 | T206658 (duration: 19m 34s)
  • 14:27 moritzm: draining ganeti1007 for reboot/kernel security update
  • 14:20 hashar@deploy1001: Started scap: testwiki to php-1.33.0-wmf.4 | T206658
  • 14:20 akosiaris: reboot logstash1007, logstash1008, logstash1009 with 500 secs of sleep between them for the migration_downtime ganeti setting to be applied
  • 14:18 akosiaris: increase the migration downtime for logstash1007, logstash1008, logstash1009. It should make live migration of these VMs easier and without the need for manual fiddling
  • 14:15 hashar@deploy1001: Pruned MediaWiki: 1.32.0-wmf.24 (duration: 08m 55s)
  • 14:03 hashar: Applied security patches to 1.33.0-wmf.4 | T206658
  • 14:03 gehel: start plugin and JVM upgrade on elasticsearch / cirrus / codfw - T209293
  • 14:00 hashar: scap prep 1.33.0-wmf.4 # T206658
  • 13:58 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Pool pc2007 to replace pc2004 (duration: 00m 48s)
  • 13:41 marostegui: Deploy schema change on s8 codfw master (db2045) this will generate lag on s8 codfw - T203709
  • 13:40 hashar: Cutting wmf/1.33.0-wmf.4 branch | T206658
  • 13:30 moritzm: draining ganeti1008 for reboot/kernel security update
  • 12:51 phuedx: European Mid-day SWAT finished
  • 12:50 phuedx@deploy1001: Finished scap: SWAT: Define WikimediaMessages for Wikibase SEO change l18n refresh (duration: 21m 43s)
  • 12:28 phuedx@deploy1001: Started scap: SWAT: Define WikimediaMessages for Wikibase SEO change l18n refresh
  • 12:22 phuedx@deploy1001: Synchronized php-1.33.0-wmf.3/extensions/WikimediaMessages/: SWAT: Define WikimediaMessages for Wikibase SEO change (T208755) (duration: 00m 56s)
  • 10:57 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1092 (duration: 00m 52s)
  • 10:47 marostegui: Deploy schema change on db1116:3318 T203709
  • 10:40 godog: stop sending metrics to old graphite hardware
  • 10:15 gehel: restart elasticsearch on relforge for plugin upgrade - T209293
  • 09:54 moritzm: restarting jenkins on releases1001 to pick up Java security update
  • 09:25 _joe_: uploading new versions of php-msgpack, php-geoip compatible with both php 7.0 and php 7.2 to thirdparty/php72 T208433
  • 09:23 marostegui: Deploy schema change on db1092 T203709
  • 09:23 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1092 (duration: 00m 52s)
  • 09:20 elukey: rollout new prometheus-mcrouter-exporter to mw* - previous rollout didn't work as expected
  • 09:11 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1104 (duration: 00m 55s)
  • 08:37 moritzm: updating remaining rsyslog on stretch to 8.38.0-1~bpo9+1wmf1
  • 07:21 marostegui: Deploy schema change on db1104 T203709
  • 07:20 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1104 (duration: 00m 53s)
  • 07:16 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1109 (duration: 00m 54s)
  • 07:05 elukey: powercycle lvs2006 - mgmt/serial console blank, unresponsive for hours
  • 06:02 marostegui: Add ipb_sitewide column to db1073:labtestwiki
  • 05:43 marostegui: Stop MySQL on pc2004 to transfer its data to pc2007 - T208383
  • 05:42 marostegui@deploy1001: Synchronized wmf-config/db-codfw.php: Depool pc2004 - T208383 (duration: 00m 53s)
  • 05:39 marostegui: Deploy schema change on db2048 (s1 codfw master), this will create lag on s1 codfw - T114117
  • 05:34 marostegui: Deploy schema change on db1109 T203709
  • 05:34 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1109 (duration: 00m 55s)

2018-11-12

  • 19:22 bawolff@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT T208663 4ff32d1df - Enable moving files for users with patrol and rollbacker rights on srwiki (duration: 00m 54s)
  • 18:29 onimisionipe@deploy1001: Finished deploy [wdqs/wdqs@ee91c41]: GUI update, New Thesaurus endpoint, New updater build and blazegraph update (duration: 11m 28s)
  • 18:17 onimisionipe@deploy1001: Started deploy [wdqs/wdqs@ee91c41]: GUI update, New Thesaurus endpoint, New updater build and blazegraph update
  • 18:03 elukey: rolling restart of aqs on aqs* to pick up new druid datasource settings
  • 17:44 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Optimize s2 for throughput (duration: 00m 53s)
  • 17:19 jynus@deploy1001: Synchronized wmf-config/db-eqiad.php: Pool more resources into s2 api (duration: 00m 54s)
  • 17:15 _joe_: restarting HHVM on the high-cpu api hosts in eqiad, to ease the pressure and latencies
  • 17:10 _joe_: depooling mw1222 for debug
  • 16:41 banyek: disabling puppet on parsercache hosts (T208383)
  • 16:14 elukey: upgrade prometheus-mcrouter-exporter on all the mw* hosts to the new version
  • 16:09 phuedx: phuedx@mwmaint1002 running resetPageRandom.php maintenance script for large wikis
  • 16:02 volans: restarted proton on proton1002
  • 15:45 jynus: stop and upgrade db2094
  • 15:43 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1101:3318 (duration: 00m 53s)
  • 14:49 banyek: disabling puppet on parsercache hosts - pc[12]00[456] (T208383)
  • 14:17 phuedx: phuedx@mwmaint1002 running resetPageRandom.php maintenance script for medium wikis
  • 14:08 moritzm: updating rsyslog on stretch to 8.38.0-1~bpo9+1wmf1
  • 13:59 phuedx: phuedx@mwmaint1002 running resetPageRandom.php maintenance script for small wikis (small.dblist)
  • 13:59 marostegui: Deploy schema change on db1101:3318 - T203709
  • 13:55 hashar: Upgrading Jenkins on contint1001 , contint2001, releases1001 and releases2002 | T209264
  • 13:46 moritzm: updating libfastjson on stretch to 0.99.8-1~bpo9+1wmf1
  • 13:41 gehel: starting rolling restart of elasticsearch codfw for JVM upgrade
  • 13:32 phuedx: phuedx@mwmaint1002 running resetPageRandom.php maintenance script for mediawikiwiki
  • 13:23 phuedx: phuedx@mwmaint1002 running resetPageRandom.php maintenance script for testwiki
  • 13:17 zeljkof: EU SWAT finished
  • 13:16 phuedx@deploy1001: Synchronized php-1.33.0-wmf.3/maintenance/resetPageRandom.php: SWAT: Provide a script to reset the page_random column (T208909) (duration: 00m 53s)
  • 13:16 moritzm: updating liblognorm on stretch to 2.0.3-1~bpo9+1wmf1
  • 13:14 phuedx@deploy1001: Synchronized php-1.33.0-wmf.3/autoload.php: SWAT: Provide a script to reset the page_random column (T208909) (duration: 00m 55s)
  • 13:12 elukey: upgrade the Hadoop Analytics cluster to CDH 5.15 (downtime required)
  • 12:54 zfilipin@deploy1001: Synchronized wmf-config/throttle.php: SWAT: Add new throttle rule for Wikipedia event in Ireland on 2018-11-13 (T209037) (duration: 00m 53s)
  • 12:15 jiji: Restarting nutcracker on scb200[1-6] - T206450
  • 12:00 moritzm: uploaded jenkins 2.138.3 to apt.wikimedia.org (jessie and stretch)
  • 11:49 hashar: updating puppet CI job for mtail upgrade https://gerrit.wikimedia.org/r/#/c/integration/config/+/472962/
  • 11:37 hashar: contint1001 : cleaning disk | T209123 ?
  • 11:26 moritzm: installing Java security updates on elastic*
  • 10:51 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1101:3318 (duration: 00m 55s)
  • 10:48 godog: upload mtail 3.0.0~rc5-1~bpo9+1wmf1 to stretch-wikimedia
  • 10:45 marostegui: Deploy schema change on db2048 (s1 codfw master), this will generate lag on s1 codfw - T51191
  • 10:43 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Repool db1099:3318 (duration: 00m 53s)
  • 10:32 elukey: upload mcrouter exporter 0.0.0+git20181106 to stretch-wikimedia
  • 09:57 elukey: upgraded cdh packages (cdh 5.10 -> 5.15) for thirdparty/cloudera in jessie/stretch-wikimedia
  • 09:12 marostegui: Deploy schema change on db2048 (s1 codfw master) (replication will be stopped) - T67448
  • 08:53 marostegui: Deploy schema change on db1099:3318 - T203709
  • 08:52 marostegui@deploy1001: Synchronized wmf-config/db-eqiad.php: Depool db1099:3318 (duration: 01m 01s)
  • 08:41 marostegui: Change sync_binlog to 0 and trx_commit to 2 on dbstore2002:3313 to let it catch up
  • 08:26 _joe_: uploading new php-{luasandbox,wikidiff2} to stretch main component, rebuild php-{luasandbox,wikidiff2,geoip,msgpack} for php 7.2, upload to stretch component php72, T208433
  • 08:23 godog: temporarily disable puppet in codfw before enabling rsyslog_exporter

2018-11-10

  • 01:10 smalyshev@deploy1001: Finished deploy [wdqs/wdqs@7eeede7]: redeploy runUpdate.sh for better reporting (duration: 00m 39s)
  • 01:10 smalyshev@deploy1001: Started deploy [wdqs/wdqs@7eeede7]: redeploy runUpdate.sh for better reporting
  • 01:07 smalyshev@deploy1001: Finished deploy [wdqs/wdqs@7eeede7]: redeploy runUpdate.sh for better reporting (duration: 00m 04s)
  • 01:07 smalyshev@deploy1001: Started deploy [wdqs/wdqs@7eeede7]: redeploy runUpdate.sh for better reporting

2018-11-09

  • 21:46 SMalyshev: repooled wdqs1004 - looks like other servers feel worse so probably makes sense to share the load equally
  • 21:16 jiji: Reimaging rdb2003, rdb2004 - T206450
  • 20:46 SMalyshev: depooled wdqs1004 to let it catch up
  • 20:40 legoktm@deploy1001: Synchronized docroot/mediawiki/keys/: Add my (20after4) PGP key to mediawiki.org/keys/keys.(txt|html) (duration: 00m 55s)
  • 20:08 andrewbogott: restarted neutron-linuxbridge-agent on cloudvirt1018 and cloudvirt1023
  • away: repooling labsdb1011 (T189158)
  • 15:24 banyek: depooling labsdb1011 (T189158)
  • 15:23 banyek: depooling labsdb1011
  • 15:08 banyek: repooling labsdb1009 (T189158)
  • 15:06 bblack: cp1008/pinkunicorn: puppet disabled, public-facing testing of new globalsign 2018 certs
  • 15:04 ladsgroup@deploy1001: Finished deploy [ores/deploy@bb39f4b]: T191842 T209060, try II (duration: 14m 43s)
  • 14:50 andrewbogott: rebooting cloudvirt1024 to (I hope) cause a page
  • 14:49 ladsgroup@deploy1001: Started deploy [ores/deploy@bb39f4b]: T191842 T209060, try II
  • 14:48 ladsgroup@deploy1001: deploy aborted: T191842 T209060 (duration: 09m 32s)
  • 14:39 ladsgroup@deploy1001: Started deploy [ores/deploy@0728805]: T191842 T209060
  • 14:18 addshore@deploy1001: Synchronized wmf-config: BETA ONLY: Enable SSR termbox for wikibase on beta - T209143 (duration: 00m 56s)
  • 13:32 moritzm: rebooting acrab for some qemu tests
  • 13:21 godog: upload graphite-web_1.0.2+debian-2.1wmf1 to stretch-wikimedia - T208782
  • 13:10 moritzm: upgrading qemu on ganeti2001 (packages supporting SSBD passthrough)
  • 12:40 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T189158: repool db1106 (duration: 00m 53s)
  • 12:34 banyek: repooling db1106 (T208954)
  • 12:17 kartik@deploy1001: Finished deploy [cxserver/deploy@fc21164]: Update cxserver to 01686f6 (T208831) (duration: 01m 09s)
  • 12:16 kartik@deploy1001: Started deploy [cxserver/deploy@fc21164]: Update cxserver to 01686f6 (T208831)
  • 11:45 banyek: data load finished restarting replication on db1106 (T208954)
  • 11:43 akosiaris: set previous normal weight for scb1001 for apertium service T206439
  • 11:39 akosiaris@puppetmaster1001: conftool action : set/weight=8; selector: dc=eqiad,service=apertium,cluster=scb,name=scb1001.*
  • 11:30 akosiaris: upgrade apertium apertium-cat apertium-fra apertium-fra-cat apertium-lex-tools apertium-separable cg3 libapertium3-3.5-1 libcg3-1 lttoolbox on all scb boxes and restart apertium-apy
  • 11:26 akosiaris: upgrade apertium apertium-cat apertium-fra apertium-fra-cat apertium-lex-tools apertium-separable cg3 libapertium3-3.5-1 libcg3-1 lttoolbox on scb1002
  • 11:22 jiji: switch scb*.eqiad.wmnet nutcracker rdb1003:6382 with rdb1005:6379
  • 10:51 vgutierrez: uploaded certcentral 0.6 to apt.wikimedia.org (stretch) - T208859 T208948 T208967 T208970
  • 09:48 ema: repool cp2018, cp2025 (cache_upload) T208588
  • 09:45 banyek: truncating enwiki.archive on db1124 and labsdb hosts too (T208954)
  • 09:21 banyek: stopping replication on db1106 (T208954)
  • 09:21 banyek: stopping replication on db1106 (T208672)
  • 09:08 banyek@deploy1001: Synchronized wmf-config/db-eqiad.php: T189158: depool db1106 (duration: 00m 55s)
  • 09:02 banyek: depooling db1106 (T208954)
  • 08:28 moritzm: installing nginx security updates
  • 08:05 ema: repool cp2006, cp2012 (cache_text) T208588
  • 04:33 ejegg: restarted recurring donation charge jobs
  • 04:24 ejegg: updated fundraising CiviCRM from 1154cca3f2 to 71755d021b
  • 03:25 ejegg: updated fundraising CiviCRM from 02cc1f80d4 to 1154cca3f2
  • 00:07 ejegg: updated fundraising CiviCRM from 07183ed7cc to 02cc1f80d4
  • 00:03 ejegg: updated payments-wiki from 983ce3af0f to 20542c9184

2018-11-08

  • 22:48 mutante: gerrit - adding Thomas Arrow to 'wmf-deployment' group for +2 on mw-config for T208491 access request
  • 22:37 mutante: gerrit - adding Lucas Werkmeister (WMDE) to 'wmf-deployment' group for +2 on mw-config for T208518 access request
  • 20:28 thcipriani@deploy1001: rebuilt and synchronized wikiversions files: all wikis to 1.33.0-wmf.3
  • 19:35 thcipriani@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Revert "Disable wmgUseTwoColConflict everywhere" T205942 T208840 T209012 T209036 (duration: 00m 54s)
  • 18:19 shdubsh: update statsd-proxy to 0.0.9-2 on graphite1004
  • 17:29 banyek: depooling labsdb1009 (T189158)
  • 17:24 banyek: repooling labsdb1010 (T189158)
  • 17:07 godog: upload libfastjson 0.99.8-1~bpo9+1wmf1 version bump only
  • 16:59 akosiaris@deploy1001: scap-helm zotero finished
  • 16:59 akosiaris@deploy1001: scap-helm zotero cluster staging completed
  • 16:59 akosiaris@deploy1001: scap-helm zotero [namespace: zotero, clusters: staging]
  • 16:50 akosiaris@deploy1001: scap-helm zotero install --name alextest --set main_app.version=20181019165254-production --set monitoring.enable=true charts/zotero [namespace: zotero, clusters: staging]
  • 16:50 akosiaris@deploy1001: scap-helm zotero install --name alextest --set main_app.version=20181019165254-production --set monitoring.enable=true charts/zotero [namespace: zotero, clusters: staging]
  • 16:29 XioNoX: enable Zayo transit on cr3-ulsfo
  • 15:42 chasemp: disable /etc/logrotate.d/udp2log-mw for a bit on mwlog1001
  • 15:25 Amir1: rolling restart of celery on ores nodes (T209060)
  • 15:20 akosiaris: 'cd /srv/deployment/ores/deploy/submodules/wheels && sudo -u deploy-service git lfs pull' on all ores1* and ores2* hosts T209060
  • 15:07 XioNoX: zeroize asw-c8-codfw (decom)
  • 14:12 moritzm: rebooting releases2001 for some tests with ssbd for KVM
  • 13:52 moritzm: installing postgres updates on labsdb1006/1007
  • 13:38 jiji: Done reimaging rdb1006 - T206450
  • 13:37 moritzm: draining ganeti2001 for reboot/kernel security update
  • 13:36 moritzm: failing over ganeti master in codfw from ganeti2001 to ganeti2003
  • 13:13 godog: upload rsyslog 8.38.0-1~bpo9+1wmf1 to stretch-wikimedia, version bump only
  • 13:07 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Re-enable wmgUseTwoColConflict on dewiki - T205942 T208840 T209012 T209036 (duration: 00m 53s)
  • 12:56 akosiaris: increase weight of scb1001 for apertium to 99+%
  • 12:56 akosiaris@puppetmaster1001: conftool action : set/weight=3800; selector: dc=eqiad,service=apertium,cluster=scb,name=scb1001.*
  • 12:53 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Re-enable wmgUseTwoColConflict on group0 only T205942 T208840 T209012 T209036 (duration: 00m 54s)
  • 12:52 moritzm: draining ganeti2002 for reboot/kernel security update
  • 12:41 jiji: Shutdown and reimage rdb200[56] - T206450
  • 12:31 moritzm: draining ganeti2003 for reboot/kernel security update
  • 12:30 zeljkof: EU SWAT finished
  • 12:29 zfilipin@deploy1001: Synchronized php-1.33.0-wmf.3/extensions/TwoColConflict: SWAT: Fix harmless edits turning into conflicts (T205942 T208840 T209012 T209036) (duration: 00m 55s)
  • 12:19 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Set AdvancedSearch to default on group0 wikis (T207641) (duration: 00m 55s)
  • 12:18 moritzm: draining ganeti2004 for reboot/kernel security update
  • 11:57 moritzm: draining ganeti2005 for reboot/kernel security update
  • 11:51 akosiaris: increase weight of scb1001 for apertium to 50%
  • 11:50 akosiaris@puppetmaster1001: conftool action : set/weight=38; selector: dc=eqiad,service=apertium,cluster=scb,name=scb1001.*
  • 11:41 moritzm: draining ganeti2006 for reboot/kernel security update
  • 11:18 moritzm: draining ganeti2007 for reboot/kernel security update
  • 11:05 moritzm: draining ganeti2008 for reboot/kernel security update
  • 10:52 jiji: Reimaging rdb1006 to stretch - T206450
  • 10:52 addshore@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Disable wmgUseTwoColConflict everywhere T209012 T208840 T195724 (duration: 00m 58s)
  • 10:26 elukey: restart memcached on mc2029 (was depooled yesterday for network maintenance)
  • 10:23 jiji: restarting pdfrender on scb1003
  • 10:19 volans: restarting pdfrender on scb1004
  • 10:18 volans: restarting pdfrender on scb1002
  • 10:18 _joe_: restarting pdfrender on scb1001
  • 10:02 moritzm: installing ppp security updates on trusty
  • 09:37 godog: keep 2x not 3x copies of older (>15d) logstash elasticsearch indices
  • 09:29 moritzm: installing curl security updates
  • 09:29 godog: temporarily set elasticsearch logstash watermark to low:0.85 and high:0.9
  • 06:34 bawolff@deploy1001: Synchronized php-1.33.0-wmf.3/extensions/OpenStackManager/special/SpecialNovaSudoer.php: T203885 (duration: 00m 54s)
  • 05:30 bawolff: deployed patch T208881
  • 01:18 mutante: scb1004 - systemctl restart pdfrender (T174916)
  • 00:43 jforrester@deploy1001: Synchronized php-1.33.0-wmf.3/includes/resourceloader/ResourceLoader.php: ResourceLoader: Fail less hard when JSON serialization of config fails I673f59d93 (duration: 00m 53s)
  • 00:33 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T205368 Enable BotPasswords on Governance wiki (duration: 00m 55s)
  • 00:32 ejegg: updated fundraising CiviCRM from 769dcf6456 to 07183ed7cc
  • 00:26 James_F: Created the bot_passwords table for Governance wiki T205368
  • 00:21 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T208449 Disable wgWelcomeSurveyEnabled everywhere in production (duration: 00m 54s)
  • 00:18 jforrester@deploy1001: Synchronized wmf-config/extension-list: T208081 Drop the Petition extension from extension-list (duration: 00m 53s)
  • 00:16 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T208081 Drop the Petition extension from InitialiseSettings (duration: 00m 52s)
  • 00:14 jforrester@deploy1001: Synchronized wmf-config/CommonSettings.php: T208081 Drop the Petition extension from CommonSettings (duration: 00m 53s)
  • 00:12 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T208899 Enabling wgMediaInTargetLanguage for testwiki (duration: 00m 54s)
  • 00:00 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T208081 Disable the Petition extension in production (duration: 00m 52s)

2018-11-07

  • 23:48 catrope@deploy1001: Synchronized wmf-config/CommonSettings.php: Make GrowthExperiments flag operative in CommonSettings (duration: 00m 53s)
  • 23:44 catrope@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Add flag for GrowthExperiments to InitialiseSettings (duration: 00m 53s)
  • 23:37 catrope@deploy1001: Finished scap: Full scap to rebuild i18n for the addition of the GrowthExperiments extension (duration: 39m 40s)
  • 23:21 jiji: Disabled nagios checks on rdb1006 and rdb2005 due to rdb1005 reimaging - T206450
  • 22:57 catrope@deploy1001: Started scap: Full scap to rebuild i18n for the addition of the GrowthExperiments extension
  • 22:13 thcipriani@deploy1001: rebuilt and synchronized wikiversions files: Revert "labswiki rollback to 1.33.0-wmf.2"
  • 22:07 thcipriani@deploy1001: Synchronized php-1.33.0-wmf.3/extensions/LdapAuthentication/LdapAuthenticationPlugin.php: Expose methods used by OpenStackManager T208995 (duration: 00m 54s)
  • 22:06 XenoRyet: updated payments-wiki from 34506ce636 to 983ce3af0f
  • 22:02 thcipriani@deploy1001: Synchronized wmf-config/CommonSettings.php: Allow Cloud VPS 172.16.0.0/16 for $wmgAllowLabsAnonEdits wikis T208986 (duration: 00m 54s)
  • 22:02 arlolra: Updated Parsoid to 970751a (T206940)
  • 21:54 arlolra@deploy1001: Finished deploy [parsoid/deploy@4edc771]: Updating Parsoid to 970751a (duration: 09m 34s)
  • 21:45 arlolra@deploy1001: Started deploy [parsoid/deploy@4edc771]: Updating Parsoid to 970751a
  • 21:21 ladsgroup@deploy1001: Finished deploy [ores/deploy@25dfa4f]: T191842 T197096 (duration: 17m 24s)
  • 21:18 krinkle@deploy1001: Synchronized php-1.33.0-wmf.2/extensions/AbuseFilter/includes/AbuseFilter.php: T208144 - I0fdda5 (duration: 00m 53s)
  • 21:16 krinkle@deploy1001: Synchronized php-1.33.0-wmf.3/extensions/VipsScaler: Id9f82afd (duration: 00m 55s)
  • 21:06 krinkle@deploy1001: Synchronized php-1.33.0-wmf.3/extensions/AbuseFilter/includes/AbuseFilter.php: T208144 - I0fdda51010243 (duration: 00m 53s)
  • 21:06 banyek: stopping replication on db2072 (T208954)
  • 21:04 krinkle@deploy1001: Synchronized php-1.33.0-wmf.3/includes/jobqueue/jobs/RefreshLinksJob.php: T208147 - I7f5fafe9439d8a7b4 (duration: 00m 54s)
  • 21:03 ladsgroup@deploy1001: Started deploy [ores/deploy@25dfa4f]: T191842 T197096
  • 20:55 banyek: depool labsdb1010 (T189158)
  • 20:24 thcipriani@deploy1001: rebuilt and synchronized wikiversions files: rollback labswiki to 1.33.0-wmf.2
  • 20:12 thcipriani@deploy1001: Synchronized php: group1 wikis to 1.33.0-wmf.3 (duration: 00m 53s)
  • 20:11 thcipriani@deploy1001: rebuilt and synchronized wikiversions files: group1 wikis to 1.33.0-wmf.3
  • 20:00 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T206173 Adding namespaces to Governance wiki (duration: 00m 55s)
  • 19:50 chasemp: labstore1007:~# mkdir /srv/security/
  • 19:48 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Re-sync for skipped apaches due to maintenance (whoops) (duration: 00m 55s)
  • 19:48 XioNoX: Revert "Redirect eqsin/ulsfo caches to eqiad" - T208272
  • 19:47 XioNoX: repool codfw - T208272
  • 19:45 XioNoX: asw-c-codfw maintenance finished successfully - T208272
  • 18:51 jforrester@deploy1001: Synchronized wmf-config/InitialiseSettings.php: T201285: Disable wgRawHTML on Governance wiki (duration: 05m 12s)
  • 18:31 onimisionipe: restarting relforge-eqiad and relforge-eqiad-small-alpha clusters on relforge100[1-2]
  • 18:21 XioNoX: power down asw-c4-codfw - T208272
  • 17:31 sbisson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Add sourceswiki to wikidata clients (duration: 00m 53s)
  • 17:25 sbisson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Inject wikidata rc records on wikidata itself (duration: 00m 53s)
  • 17:21 sbisson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable WMEUnderstandingFirstDay on testwiki (duration: 00m 53s)
  • 16:37 XioNoX: remove asw-c-codfw FPC8 from config - T208272
  • 16:35 XioNoX: shutdown asw-c-codfw FPC8 - T208272
  • 16:33 jforrester@deploy1001: Synchronized php-1.33.0-wmf.3/resources/src/mediawiki.rcfilters/styles/mw.rcfilters.ui.less: T208898 Hot-deploy to wmf.3 (duration: 00m 53s)
  • 16:22 jforrester@deploy1001: Synchronized php-1.33.0-wmf.3/extensions/Echo/modules/styles: Hot-deploy T208930 to wmf.3 (duration: 00m 54s)
  • 16:20 XioNoX: Enable all VC ports (except uplinks) on spines - T208272
  • 15:58 XioNoX: Redirect eqsin/ulsfo caches to eqiad - T208272
  • 15:57 XioNoX: depool codfw for row C maintenance - T208272
  • 15:36 moritzm: installing Java security updates on relforge*
  • 15:29 jiji@deploy1001: Synchronized wmf-config/ProductionServices.php: Remove jobqueue_redis references, T198220 (duration: 00m 54s)
  • 15:21 akosiaris: T206439 direct 30% of the apertium.svc.eqiad.wmnet traffic to scb1001. Will increase tomorrow to 50%
  • 15:20 akosiaris@puppetmaster1001: conftool action : set/weight=15; selector: dc=eqiad,service=apertium,cluster=scb,name=scb1001.*
  • 15:16 akosiaris@puppetmaster1001: conftool action : set/pooled=yes; selector: dc=eqiad,service=apertium,cluster=scb,name=scb1001.*
  • 15:15 akosiaris: T206439 pool upgraded scb1001 to apertium.svc.eqiad.wmnet as a form of canary
  • 15:13 moritzm: uploaded nginx 1.13.6-2+wmf2 to apt.wikimedia.org/stretch-wikimedia
  • 14:55 akosiaris@puppetmaster1001: conftool action : set/pooled=no; selector: dc=eqiad,service=apertium,cluster=scb,name=scb1001.*
  • 14:37 oblivian@deploy1001: Synchronized docroot/wwwportal/w/search-redirect.php: Fixing redirects if no language is specified (duration: 00m 54s)
  • 14:33 moritzm: uploaded nginx 1.13.6-2+wmf2~jessie1 to apt.wikimedia.org/jessie-wikimedia
  • 14:32 akosiaris: T206439 upload apertium-cat_2.6.0-1+wmf1 apertium-fra-cat_1.5.0-1+wmf1 apertium-fra_1.5.0-1+wmf1 to apt.wikimedia.org/jessie-wikimedia/main
  • 14:26 bblack: rebooting graphite1004
  • 14:16 akosiaris: T206439 upload apertium-separable_0.3.2-1+wmf1 apertium-lex-tools_0.2.1-1+wmf1 to apt.wikimedia.org/jessie-wikimedia/main
  • 14:07 akosiaris: T206439 upload apertium_3.5.2-1+wmf1 to apt.wikimedia.org/jessie-wikimedia/main
  • 13:43 bblack: hi
  • 13:21 Amir1: ladsgroup@mwmaint1002:/srv/mediawiki/php-1.33.0-wmf.1$ mwscript sql.php --wiki=sourceswiki extensions/Wikibase/client/sql/entity_usage.sql (T208858)
  • 12:38 zeljkof: EU SWAT finished
  • 12:36 zfilipin@deploy1001: Synchronized wmf-config: SWAT: BC: Enable Schema.org page split test (T208763) (duration: 00m 54s)
  • 12:35 akosiaris: T206439 upload hfst-ospell_0.5.0-1+wmf1 to apt.wikimedia.org/jessie-wikimedia/main
  • 12:27 akosiaris: T206439 upload cg3_1.1.7-1+wmf1 to apt.wikimedia.org/jessie-wikimedia/main
  • 12:16 zfilipin@deploy1001: Synchronized wmf-config/InterwikiSortOrders.php: SWAT: Add dty, gor, inh, kbp and lfn to InterwikiSortOrders (T208217) (duration: 00m 53s)
  • 12:12 zfilipin@deploy1001: Synchronized wmf-config/throttle.php: SWAT: New throttle rule for Art&Feminism event in Chile (T208866) (duration: 00m 54s)
  • 12:12 akosiaris: T206439 upload hfst_3.15.0-1+wmf1 to apt.wikimedia.org/jessie-wikimedia/main
  • 12:08 zfilipin@deploy1001: Synchronized wmf-config/throttle.php: SWAT: Remove expired throttle rules (duration: 01m 05s)
  • 11:49 akosiaris: T206439 upload lttoolbox_3.5.0-1+wmf1 to apt.wikimedia.org/jessie-wikimedia/main
  • 11:49 akosiaris: T206439 upload lttoolbox_3.5.0-1+wmf1
  • 10:06 hashar: CI: switched operations/puppet job to be based on Stretch ( T208422 ) and to add python3 ( T208873 )
  • 09:25 _joe_: run systemctl reset-failed on ms-be1029, had a failed debmonitor session
  • 08:16 kartik@deploy1001: Finished deploy [cxserver/deploy@6f97d25]: Update cxserver to f9ffd24 (duration: 04m 59s)
  • 08:11 kartik@deploy1001: Started deploy [cxserver/deploy@6f97d25]: Update cxserver to f9ffd24
  • 02:55 ejegg: disabled recurring charge jobs
  • 00:46 mutante: tegmen - shutting down for renaming and reinstall (T208824)
  • 00:11 dereckson@deploy1001: Synchronized wmf-config/interwiki.php: Updating interwiki cache (duration: 02m 44s)

2018-11-06

  • 23:55 mutante: cp1084 - network went down, powercycled, probably T203194
  • 22:49 ejegg: updated fundraising CiviCRM from e0742d2210 to 769dcf6456
  • 21:50 mutante: icinga1001-"MediaWiki EtcdConfig up-to-date" checks were all UNKNOWN because systemd unit update-etcd-mw-config-lastindex was present but service not running. it was turned off in https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/427328/ on purpose. manually ran "systemctl start update-etcd-mw-config-lastindex" and the checks all work (T202782)
  • 21:49 mutante: icinga1001 - the "MediaWiki EtcdConfig up-to-date" checks were all unknown on the new icinga server, this was because systemd unit update-etcd-mw-config-lastindex was present but service not running. that was turned off in https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/427328/ on purpose. manually ran "systemctl start update-etcd-mw-config-lastindex" to start it and the checks
  • 21:40 thcipriani@deploy1001: rebuilt and synchronized wikiversions files: Group0 wikis to 1.33.0-wmf.3
  • 20:19 thcipriani@deploy1001: Finished scap: testwiki to 1.33.0-wmf.3 and rebuild l10n cache (duration: 34m 06s)
  • 19:45 thcipriani@deploy1001: Started scap: testwiki to 1.33.0-wmf.3 and rebuild l10n cache
  • 19:44 thcipriani@deploy1001: Pruned MediaWiki: 1.32.0-wmf.23 (duration: 07m 19s)
  • 18:36 thcipriani: cutting branch for MediaWiki and extensions version 1.33.0-wmf.3
  • 17:37 XioNoX: add vlan-analytics1-a-eqiad interface-range on asw2-a-eqiad
  • 16:47 XioNoX: enable cr4-ulsfo zayo transport to cr1-codfw
  • 16:23 akosiaris@deploy1001: scap-helm zotero install --name alextest --set main_app.version=20181019165254-production --set monitoring.enable=true charts/zotero [namespace: zotero, clusters: staging]
  • 16:23 akosiaris@deploy1001: scap-helm zotero install --set main_app.version=20181019165254-production --set monitoring.enable=true charts/zotero [namespace: zotero, clusters: staging]
  • 16:12 mutante: einsteinium - temp disabling icinga notifications and puppet, reloading icinga (for extra caution while deploying global NRPE change)
  • 16:07 mutante: planet1001 - disabling puppet, editing NRPE config, testing allowed_hosts change
  • 16:07 akosiaris@deploy1001: scap-helm -h finished
  • 16:07 akosiaris@deploy1001: scap-helm -h cluster staging completed
  • 16:07 akosiaris@deploy1001: scap-helm -h [namespace: -h, clusters: staging]
  • 15:59 banyek: updating facts for the puppet compilers
  • 15:37 akosiaris: create zotero namespace in eqiad, codfw, staging cluster T201611
  • 15:20 godog: switch all graphite read traffic to graphite1004
  • 15:16 XioNoX: push `lldp port-id-subtype interface-name` to all compatible switches - T208630
  • 15:14 jiji: scb1001/scb1002 switched nutcracker redis from rdb1001:6382 to rdb1009:6379
  • 15:08 XioNoX: push `lldp port-id-subtype interface-name` to all routers - T208630
  • 14:40 jynus_: reducing consistency temp. on db2048 to avoid lagging
  • 14:30 moritzm: installing Ruby 2.1 security updates
  • 14:22 godog: add graphite1004 to graphite cluster for reads
  • 14:21 moritzm: installing clamav security updates on mendelevium/ticket.wikimedia.org
  • 13:25 moritzm: restart HHVM on canaries to pick up new curl
  • 13:10 XioNoX: zeroize asw-b-eqiad (decom) - T208788
  • 12:33 moritzm: installing curl security updates
  • 12:17 moritzm: installing chromium security updates on proton* (new upstream release tested in deployment-prep)
  • 12:12 zeljkof: EU SWAT finished
  • 12:10 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Update several Wikidata-related configs (duration: 00m 55s)
  • 11:37 moritzm: installing Ruby 2.3 security updates
  • 11:37 moritzm: installing Ruby 2.3 security updates on trusty
  • 11:30 moritzm: installing Mono security updates
  • 11:23 moritzm: installing Ruby 1.9 security updates on trusty
  • 11:07 banyek: stopping replication on db2077 (T208672)
  • 07:25 _joe_: also restarting on the other eqiad nodes
  • 07:25 _joe_: restarting tilerator on maps1002
  • 04:47 kartik@deploy1001: Finished deploy [cxserver/deploy@ddb0031]: Update cxserver to 17f9a10 (T144467, T198699, T208386) (duration: 05m 26s)
  • 04:42 kartik@deploy1001: Started deploy [cxserver/deploy@ddb0031]: Update cxserver to 17f9a10 (T144467, T198699, T208386)
  • 03:40 eileen: civicrm revision changed from 99895316de to e0742d2210, config revision is e832b5a04a
  • 00:26 maxsem@deploy1001: Synchronized wmf-config/InitialiseSettings.php: https://gerrit.wikimedia.org/r/#/c/operations/mediawiki-config/+/471875/ (duration: 00m 51s)
  • 00:07 maxsem@deploy1001: Synchronized wmf-config: https://gerrit.wikimedia.org/r/#/c/operations/mediawiki-config/+/471874/ (duration: 00m 54s)
  • 00:03 eileen: civicrm revision changed from 042eeaeca9 to 99895316de, config revision is e832b5a04a

2018-11-05

  • 22:41 mutante: sodium - reboot after disk replacement (T202705)
  • 21:51 ppchelko@deploy1001: Finished deploy [cpjobqueue/deploy@a92fce5]: Increase cirrusSearchLinksUpdate concurrency to 100 (duration: 01m 02s)
  • 21:50 ppchelko@deploy1001: Started deploy [cpjobqueue/deploy@a92fce5]: Increase cirrusSearchLinksUpdate concurrency to 100
  • 21:49 arlolra: Updated Parsoid to 8ed698b (T205334, T208360)
  • 21:42 mobrovac@deploy1001: Started restart [zotero/translation-server@50f216a]: Free up some memory
  • 21:41 arlolra@deploy1001: Finished deploy [parsoid/deploy@96d739b]: Updating Parsoid to 8ed698b (duration: 10m 59s)
  • 21:38 thcipriani@deploy1001: Synchronized wmf-config/CommonSettings.php: Removed decommissioned citoid url T133001 (duration: 00m 53s)
  • 21:30 arlolra@deploy1001: Started deploy [parsoid/deploy@96d739b]: Updating Parsoid to 8ed698b
  • 21:23 ppchelko@deploy1001: Finished deploy [restbase/deploy@5b8ad3c]: Update deps, removed sections table, T207904 T206048 T207324 take 2 (duration: 09m 18s)
  • 21:14 ppchelko@deploy1001: Started deploy [restbase/deploy@5b8ad3c]: Update deps, removed sections table, T207904 T206048 T207324 take 2
  • 21:10 ppchelko@deploy1001: Finished deploy [restbase/deploy@5b8ad3c]: Update deps, removed sections table, T207904 T206048 T207324 (duration: 12m 15s)
  • 21:09 thcipriani@deploy1001: rebuilt and synchronized wikiversions files: Ensure all wikis on 1.33.0-wmf.2
  • 20:58 ppchelko@deploy1001: Started deploy [restbase/deploy@5b8ad3c]: Update deps, removed sections table, T207904 T206048 T207324
  • 20:38 ppchelko@deploy1001: Finished deploy [restbase/deploy@5b8ad3c] (dev-cluster): Update deps, removed sections table (duration: 03m 40s)
  • 20:35 ppchelko@deploy1001: Started deploy [restbase/deploy@5b8ad3c] (dev-cluster): Update deps, removed sections table
  • 19:53 akosiaris: do a depool, scap deploy, scap wikiversions-compile, hhvm restart and then a pool in eqiad mediawiki servers
  • 19:50 sbisson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Disable page issues A/B test (gerrit:470868) (duration: 00m 53s)
  • 19:44 sbisson@deploy1001: Synchronized php-1.33.0-wmf.2/includes/block/BlockRestriction.php: SWAT: BlockRestriction::update() unnecessarily does a SELECT on the page table. (duration: 01m 00s)
  • 19:19 sbisson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable error logging for WikimediaEvents (duration: 00m 52s)
  • 19:12 sbisson@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Enable GrowthExperiments logging channel (duration: 00m 53s)
  • 18:57 _joe_: restarting hhvm on mwdebug1002
  • 18:44 godog: pool graphite1004 for reads - T196484
  • 18:43 XioNoX: delete asw2-b - asw-b interface - T183585
  • 18:41 XioNoX: remove asw-b-eqiad from LibreNMS - T183585
  • 18:37 XioNoX: remove vrrp priority 70 on cr2-eqiad:ae2 to failback VIPs to cr2 - T183585
  • 18:26 XioNoX: re-enable ae2 on cr2-eqiad - T183585
  • 18:21 thcipriani: rollback mwdebug1001 group2 wikis
  • 18:13 thcipriani: testing php-1.33.0-wmf.2 on group2 wikis on mwdebug1001
  • 18:05 XioNoX: disable ae2 on cr2-eqiad - T183585
  • 18:02 XioNoX: set vrrp priority 70 on cr2-eqiad:ae2 to failover VIP to cr1 - T183585
  • 16:49 XioNoX: Update LLDP config on cr3-ulsfo - T208630
  • 16:48 vgutierrez: uploaded certcentral 0.5 to apt.wikimedia.org (stretch) - T208572 T208378
  • 16:06 anomie@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Setting MCR to read-new on all wikis (T198308) (duration: 00m 55s)
  • 13:57 jynus_: increase consistency of db2050, dbstore2002 s3 after they caught up on replication T208462
  • 12:33 ladsgroup@deploy1001: Finished deploy [ores/deploy@096ffb3]: T208577 T181632 T208608 (duration: 22m 58s)
  • 12:23 zeljkof: EU SWAT finished
  • 12:23 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Increase wikidata dispatchers to 3 (duration: 00m 54s)
  • 12:16 zfilipin@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Set wgForeignUploadTargets to [] for zhwiki (T208397) (duration: 00m 54s)
  • 12:10 ladsgroup@deploy1001: Started deploy [ores/deploy@096ffb3]: T208577 T181632 T208608
  • 12:05 zfilipin@deploy1001: Synchronized static/images/project-logos/: SWAT: Revert "Anniversary logo for cswiki" (T207589) (duration: 00m 58s)
  • 10:02 godog: reformat xfs filesystems on ms-be1040 - T199198
  • 09:17 elukey@deploy1001: Finished deploy [analytics/refinery@9d39efa]: fixing stat1004 (duration: 00m 04s)
  • 09:17 elukey@deploy1001: Started deploy [analytics/refinery@9d39efa]: fixing stat1004
  • 09:08 joal@deploy1001: Finished deploy [analytics/refinery@9d39efa]: regular analytics weekly deploy (duration: 05m 21s)
  • 09:02 joal@deploy1001: Started deploy [analytics/refinery@9d39efa]: regular analytics weekly deploy

2018-11-04

  • 23:42 jynus_: deleting the same row on all s8 broken servers
  • 23:39 jynus_: deleting one row on db1104
  • 20:38 krinkle@deploy1001: Synchronized php-1.33.0-wmf.2/extensions/FlaggedRevs/frontend/specialpages/reports/ProblemChanges_body.php: T176232 - Ia43626584e (duration: 01m 17s)
  • 18:32 jynus_: reduce temp. consistency level of s4, s5, and s6 codfw masters to prevent excessive lagging due to ongoing mediawiki core maintenance
  • 08:42 eileen: process-control config revision is e832b5a04a re-enable running job list (all jobs on again now)
  • 08:38 eileen: process-control config revision is e16b2c1c61 re-enable jobs
  • 02:00 eileen: I think I got the rest of the jobs off process-control config revision is 4422254128
  • 01:52 eileen: process-control config revision is 6ec67b3d01 - also turn off omnirecipient repair job
  • 01:40 eileen: process-control config revision is 5b72cfe874 - reapply turn off q jobs

2018-11-03

  • 09:35 elukey: run tcpdump on mc1035 to grab memcache traffic (rotating pcaps, ~30G maximum)

2018-11-02

  • 17:04 thcipriani: rollback group2 wikis to 1.33.0-wmf.1 on mwdebug100{1,2}
  • 16:54 thcipriani: deploying 1.33.0-wmf.2 to group2 wikis on mwdebug1002
  • 16:43 _joe_: live-hacking removal of time limit on mwdebug1001
  • 16:32 thcipriani: deploying 1.33.0-wmf.2 to group2 wikis on mwdebug1001
  • 15:12 jynus: restarting replication @ db2074 after db2094:s3 table fix T208565
  • 15:00 jynus: stopping replication on db2074 to fix db2094:s3 T208565
  • 14:01 vgutierrez: reimaging eeden.wikimedia.org as jessie test system - T208583
  • 11:43 jynus: ignoring cawikimedia.archive replication on db2094:s3 until a reimport happens T208565
  • 11:29 jijiki: Rebooting mw2244 (spare system) for maintenance
  • 10:52 ema: restart varnish-be on cp3032 T208574
  • 08:19 jynus: performing alter table on dbstore2002 s3 and reducing consistency to improve recovery time T208462 T204006
  • 08:01 jynus: reducing consistency on db2050 to improve recovery time T208462
  • 07:59 jynus: performing alter table on db2050 T208462 T204006
  • 07:38 godog: reformat ms-be1043 xfs filesystems - T199198
  • 07:38 jynus: reducing consistency temporarily (flush, binlog sync) at db2040 to prevent lagging
  • 07:26 jynus: reducing consistency temporarily (flush, binlog sync) at db2035 to prevent lagging

2018-11-01

  • 23:01 shdubsh: restart hhvm on mw1261
  • 22:29 ejegg: restarted fundraising queue consumer jobs
  • 22:21 ejegg: updated fundraising CiviCRM from 65130ef3dd to 042eeaeca9
  • 22:18 ejegg: turned off fundraising queue jobs for civi update
  • 22:12 _joe_: rolling restart of hhvm on appservers and api in eqiad
  • 22:09 shdubsh: cumin -b 2 -s 30 "O:mediawiki::appserver and *.eqiad.wmnet" "restart-hhvm"
  • 22:05 _joe_: restarting hhvm on mw1238,1240
  • 22:02 _joe_: restart hhvm on mw1244
  • 21:52 shdubsh: restart hhvm on mw1247
  • 21:49 _joe_: depooling mw1238 for debugging
  • 21:09 thcipriani@deploy1001: rebuilt and synchronized wikiversions files: group2 back to 1.33.0-wmf.1
  • 20:55 hoo: Restarted hhvm on mwdebug2002
  • 19:40 hoo: Ran "UPDATE wb_changes_dispatch SET chd_seen = '775203911' WHERE chd_site LIKE '%wikt%' AND chd_seen < '775180000';" on wikidata master (dispatching for wiktionaries)
  • 19:00 hoo@deploy1001: Synchronized php-1.33.0-wmf.1/includes/export/WikiExporter.php: Fix for missing end tag </page> on some exports (T207974) (duration: 01m 01s)
  • 18:38 hoo@deploy1001: Synchronized php-1.33.0-wmf.2/includes/export/WikiExporter.php: Fix for missing end tag </page> on some exports (T207974) (duration: 00m 55s)
  • 18:25 jijiki: Enabling puppet on mw servers (T206923)
  • 18:19 hoo@deploy1001: Synchronized wmf-config/InitialiseSettings.php: Remove now redundant Wikidata config for wiktionary (T208317) (duration: 00m 54s)
  • 18:12 hoo@deploy1001: Synchronized dblists/wikidataclient.dblist: Add all wiktionaries to wikidataclient.dblist, sort list (T208317) (duration: 00m 57s)
  • 18:02 gehel: restart nginx on relforge100*
  • 17:57 jijiki: Disabling puppet on mw servers (T206923)
  • 16:07 anomie@mwmaint1002: Running migrateComments.php on section 4 wikis for T166733
  • 13:46 anomie@mwmaint1002: Running migrateComments.php on remaining section 3 wikis for T166733
  • 13:37 anomie@mwmaint1002: Running migrateComments.php on section 7 wikis for T166733
  • 13:37 anomie@mwmaint1002: Running migrateComments.php on wikitech for T166733
  • 13:37 anomie@mwmaint1002: Running migrateImageCommentTemp.php on wikitech for T188132
  • 13:37 anomie@mwmaint1002: Running migrateComments.php on section 6 wikis for T166733
  • 13:37 anomie@mwmaint1002: Running migrateComments.php on section 8 wikis for T166733
  • 13:37 anomie@mwmaint1002: Running migrateImageCommentTemp.php on section 8 wikis for T188132
  • 13:37 anomie@mwmaint1002: Running migrateImageCommentTemp.php on section 7 wikis for T188132
  • 13:37 anomie@mwmaint1002: Running migrateImageCommentTemp.php on section 6 wikis for T188132
  • 13:36 anomie@mwmaint1002: Running migrateComments.php on section 5 wikis for T166733
  • 13:36 anomie@mwmaint1002: Running migrateComments.php on section 1 wikis for T166733
  • 13:36 anomie@mwmaint1002: Running migrateComments.php on section 2 wikis for T166733
  • 13:36 anomie@mwmaint1002: Running migrateImageCommentTemp.php on section 5 wikis for T188132
  • 13:36 anomie@mwmaint1002: Running migrateImageCommentTemp.php on section 4 wikis for T188132
  • 13:36 anomie@mwmaint1002: Running migrateImageCommentTemp.php on remaining section 3 wikis for T188132
  • 13:36 anomie@mwmaint1002: Running migrateImageCommentTemp.php on section 2 wikis for T188132
  • 13:35 anomie@mwmaint1002: Running migrateImageCommentTemp.php on section 1 wikis for T188132
  • 12:50 addshore@deploy1001: Synchronized wmf-config/CommonSettings.php: List wikidataclient-test in CS.php dblists T208488 (duration: 00m 57s)
  • 09:10 elukey: added a tmux session on mw1314, mw1344, mw1316 that checks mcrouter stats every 10s
  • 00:58 onimisionipe: repooling wdqs1004. It has caught up on lag with others
  • 00:22 tgr@deploy1001: Synchronized php-1.33.0-wmf.2/extensions/SyntaxHighlight_GeSHi/extension.json: SWAT: Follow-up I3daca6fb: Fix exception thrown when inserting new code block (invalidate RL cache) (duration: 00m 53s)
  • 00:20 tgr@deploy1001: Synchronized php-1.33.0-wmf.2/extensions/SyntaxHighlight_GeSHi/modules/ve-syntaxhighlight/ve.ui.MWSyntaxHighlightWindow.js: SWAT: Follow-up I3daca6fb: Fix exception thrown when inserting new code block (duration: 00m 54s)
  • 00:13 mobrovac@deploy1001: Finished deploy [restbase/deploy@e8f3a85] (dev-cluster): Add title normalisation and remove Accept-Language header duplicates (duration: 03m 00s)
  • 00:10 mobrovac@deploy1001: Started deploy [restbase/deploy@e8f3a85] (dev-cluster): Add title normalisation and remove Accept-Language header duplicates
  • 00:07 tgr@deploy1001: Synchronized wmf-config/CommonSettings.php: SWAT: Move auth logging to different channels for easier counting (T150300, T123243) (duration: 00m 53s)
  • 00:05 tgr@deploy1001: Synchronized wmf-config/InitialiseSettings.php: SWAT: Move auth logging to different channels for easier counting (T150300, T123243) (duration: 00m 53s)


Archives

See Server admin log/Archives.