You are browsing a read-only backup copy of Wikitech. The primary site can be found at wikitech.wikimedia.org

Server Admin Log: Revision history

3 February 2023

  • 00:35, 3 February 2023 imported>Stashbot 71,409 bytes +23,889 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1080.eqiad.wmnet with OS bullseye

2 February 2023

  • 01:24, 2 February 2023 imported>Stashbot 47,520 bytes −622,512 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp1075.eqiad.wmnet with reason: host reimage

1 February 2023

  • 00:38, 1 February 2023 imported>Stashbot 670,032 bytes +34,224 brett@cumin2002: conftool action : set/pooled=yes; selector: name=cp3055.esams.wmnet

31 January 2023

  • 00:50, 31 January 2023 imported>Stashbot 635,808 bytes +57,101 brett@cumin1001: conftool action : set/pooled=yes; selector: name=cp5027.eqsin.wmnet

29 January 2023

  • 14:46, 29 January 2023 imported>Stashbot 578,707 bytes +458 ariel@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dumpsdata1002.eqiad.wmnet

28 January 2023

  • 00:36, 28 January 2023 imported>Stashbot 578,249 bytes +18,571 brett@cumin1001: conftool action : set/pooled=yes; selector: name=cp4050.ulsfo.wmnet

27 January 2023

  • 01:16, 27 January 2023 imported>Stashbot 559,678 bytes +62,951 eevans@cumin1001: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching cassandra-dev2*: Applying configuration change to cassandra-dev cluster - eevans@cumin1001

26 January 2023

  • 01:24, 26 January 2023 imported>Stashbot 496,727 bytes +35,241 ejegg: payments-wiki upgraded from 15395d05 to 08b8c3bc (upgraded from MW 1.35 to MW 1.39)

25 January 2023

  • 01:17, 25 January 2023 imported>Stashbot 461,486 bytes +48,935 legoktm: adjusting Gerrit group "Campaigns Team" so it is not recursively a member of itself

24 January 2023

  • 01:16, 24 January 2023 imported>Stashbot 412,551 bytes +38,290 eevans@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase1031.eqiad.wmnet

20 January 2023

18 January 2023

  • 23:47, 18 January 2023 imported>Stashbot 352,750 bytes +20,082 zabe: run populateCulComment.php on all group0 and group1 wikis # T327290
  • 01:13, 18 January 2023 imported>Stashbot 332,668 bytes +24,670 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp2031.codfw.wmnet

16 January 2023

14 January 2023

13 January 2023

  • 23:39, 13 January 2023 imported>Stashbot 300,244 bytes +10,059 mutante: people2002 - systemctl reset-failed after removing auto_restart_rsync timers
  • 01:26, 13 January 2023 imported>Stashbot 290,185 bytes +53,216 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=False) upgrade firmware for hosts ['an-mariadb1002']

12 January 2023

  • 01:15, 12 January 2023 imported>Stashbot 236,969 bytes +44,353 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T321391)', diff saved to https://phabricator.wikimedia.org/P43056 and previous config saved to /var/cache/conftool/dbconfig/20230112-011526-marostegui.json

10 January 2023

  • 23:58, 10 January 2023 imported>Stashbot 192,616 bytes +46,411 krinkle@deploy1002: Finished deploy [integration/docroot@b7c82a3]: (no justification provided) (duration: 00m 15s)
  • 00:48, 10 January 2023 imported>Stashbot 146,205 bytes +26,510 bking@cumin1001: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.UPGRADE (3 nodes at a time) for ElasticSearch cluster search_codfw: plugin upgrade - bking@cumin1001 - T324247

6 January 2023

4 January 2023

3 January 2023

2 January 2023

  • 10:04, 2 January 2023 imported>Stashbot 345 bytes +229 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host otrs1001.eqiad.wmnet

1 January 2023

31 December 2022

  • 19:11, 31 December 2022 imported>Stashbot 552,815 bytes +154 AndyRussG: payments-wiki upgraded c212825e -> f02e3585, config c1c4a9f6 -> 8103bce6

30 December 2022

  • 21:36, 30 December 2022 imported>Stashbot 552,661 bytes +126 dcausse: restarting blazegraph on wdqs1006 and wdqs1013 (BlazegraphFreeAllocatorsDecreasingRapidly)

29 December 2022

  • 23:26, 29 December 2022 imported>Stashbot 552,535 bytes +629 ryankemper@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99)

22 December 2022

  • 18:27, 22 December 2022 imported>Stashbot 551,906 bytes +7,038 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1015.eqiad.wmnet with OS bullseye

21 December 2022

  • 23:41, 21 December 2022 imported>Stashbot 544,868 bytes +1,082 ejegg: civicrm upgraded from d80f9550 to e3405a4e
  • 00:10, 21 December 2022 imported>Stashbot 543,786 bytes +7,035 eileen: + fc0536195a12df9bc7a896f77db6b2c8e609352e Check for correct fin type name in hasEndowment

20 December 2022

18 December 2022

  • 19:40, 18 December 2022 imported>Stashbot 520,139 bytes +785 sukhe: ran sudo cumin -b 1 -s 30 'A:mw-api and A:eqiad' 'restart-php7.4-fpm' [at 18:55 UTC]: T325477

17 December 2022

  • 14:36, 17 December 2022 imported>Stashbot 519,354 bytes +264 aikochou@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .

16 December 2022

  • 19:55, 16 December 2022 imported>Stashbot 519,090 bytes +5,844 robh@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti4007.ulsfo.wmnet with OS bullseye
  • 00:51, 16 December 2022 imported>Stashbot 513,246 bytes +24,332 mutante: puppetmasters - merged gerrit:868481 to "revert" gerrit:866644,ran puppet and 'systemctl reset-failed' via cumin on 10 masters, resolved monitoring alerts

15 December 2022

  • 00:58, 15 December 2022 imported>Stashbot 488,914 bytes +47,486 cwhite@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on logstash2026.codfw.wmnet with reason: host reimage

14 December 2022

  • 01:22, 14 December 2022 imported>Stashbot 441,428 bytes +24,146 cwhite@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host logstash2037.codfw.wmnet with OS bullseye

12 December 2022

10 December 2022

  • 03:46, 10 December 2022 imported>Stashbot 402,542 bytes +1,087 ryankemper@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.UPGRADE (3 nodes at a time) for ElasticSearch cluster search_codfw: search_codfw elasticsearch and plugin upgrade - ryankemper@cumin2002

9 December 2022

  • 23:59, 9 December 2022 imported>Stashbot 401,455 bytes +16,799 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99)
  • 01:11, 9 December 2022 imported>Stashbot 384,656 bytes +64,721 eevans@cumin1001: START - Cookbook sre.hosts.reimage for host cassandra-dev2002.codfw.wmnet with OS buster

8 December 2022

  • 01:05, 8 December 2022 imported>Stashbot 319,935 bytes +71,471 bblack: lvsNNNN: restart pybal to apply etcd key changes on all "high-traffic1" lvs at all sites - T324336

7 December 2022

  • 00:21, 7 December 2022 imported>Stashbot 248,464 bytes +42,999 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kubernetes1024.eqiad.wmnet with OS bullseye

6 December 2022

  • 01:25, 6 December 2022 imported>Stashbot 205,465 bytes +81,524 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P42379 and previous config saved to /var/cache/conftool/dbconfig/20221206-012539-ladsgroup.json

4 December 2022

  • 04:19, 4 December 2022 imported>Stashbot 123,941 bytes +177 TheresNoTime: T302486 : `[samtar@mwmaint1002 ~]$ mwscript maintenance/fixMergeHistoryCorruption.php --wiki enwiki --dry-run --ns 828`

3 December 2022

  • 00:17, 3 December 2022 imported>Stashbot 123,764 bytes +12,040 cwhite: draining shards from logstash1010, logstash1033, logstash1034, logstash1035 - T321410

2 December 2022

  • 00:09, 2 December 2022 imported>Stashbot 111,724 bytes −1,825,217 rzl@cumin1001: conftool action : set/pooled=no; selector: name=mw14(45|46).eqiad.wmnet,cluster=jobrunner

30 November 2022

  • 01:22, 30 November 2022 imported>Stashbot 1,936,941 bytes +129,140 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T321126)', diff saved to https://phabricator.wikimedia.org/P41834 and previous config saved to /var/cache/conftool/dbconfig/20221130-012218-marostegui.json

29 November 2022

  • 01:17, 29 November 2022 imported>Stashbot 1,807,801 bytes +121,792 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T322618)', diff saved to https://phabricator.wikimedia.org/P41530 and previous config saved to /var/cache/conftool/dbconfig/20221129-011707-ladsgroup.json

27 November 2022

  • 03:01, 27 November 2022 imported>Stashbot 1,686,009 bytes +943 ladsgroup@cumin1001: dbctl commit (dc=all): 'db2105 (re)pooling @ 100%: Maint', diff saved to https://phabricator.wikimedia.org/P41257 and previous config saved to /var/cache/conftool/dbconfig/20221127-030126-ladsgroup.json

26 November 2022

  • 21:34, 26 November 2022 imported>Stashbot 1,685,066 bytes +3,790 urandom: initiating Cassandra bootstrap, aqs1021-b -- T307802
  • 01:16, 26 November 2022 imported>Stashbot 1,681,276 bytes +58,470 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P41244 and previous config saved to /var/cache/conftool/dbconfig/20221126-011647-ladsgroup.json

25 November 2022

  • 01:18, 25 November 2022 imported>Stashbot 1,622,806 bytes +76,429 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P41071 and previous config saved to /var/cache/conftool/dbconfig/20221125-011818-marostegui.json

24 November 2022

  • 01:26, 24 November 2022 imported>Stashbot 1,546,377 bytes +91,450 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317 (T321126)', diff saved to https://phabricator.wikimedia.org/P40894 and previous config saved to /var/cache/conftool/dbconfig/20221124-012652-marostegui.json

23 November 2022

  • 01:16, 23 November 2022 imported>Stashbot 1,454,927 bytes +128,094 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp2041.codfw.wmnet with OS bullseye

22 November 2022

  • 01:14, 22 November 2022 imported>Stashbot 1,326,833 bytes +77,259 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T323214)', diff saved to https://phabricator.wikimedia.org/P40411 and previous config saved to /var/cache/conftool/dbconfig/20221122-011404-ladsgroup.json

21 November 2022

  • 01:08, 21 November 2022 imported>Stashbot 1,249,574 bytes +1,268 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp5029.eqsin.wmnet with OS buster

19 November 2022

  • 22:51, 19 November 2022 imported>Stashbot 1,248,306 bytes +1,417 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5020.eqsin.wmnet with OS buster
  • 01:24, 19 November 2022 imported>Stashbot 1,246,889 bytes +59,627 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5018.eqsin.wmnet with reason: host reimage

18 November 2022

  • 01:26, 18 November 2022 imported>Stashbot 1,187,262 bytes +56,176 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['db2173']

17 November 2022

  • 00:59, 17 November 2022 imported>Stashbot 1,131,086 bytes +67,747 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2125 (T323214)', diff saved to https://phabricator.wikimedia.org/P40031 and previous config saved to /var/cache/conftool/dbconfig/20221117-005929-ladsgroup.json

16 November 2022

  • 01:15, 16 November 2022 imported>Stashbot 1,063,339 bytes +102,599 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp2042.codfw.wmnet with OS bullseye

15 November 2022

  • 01:07, 15 November 2022 imported>Stashbot 960,740 bytes +99,923 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P39630 and previous config saved to /var/cache/conftool/dbconfig/20221115-010745-marostegui.json

12 November 2022

  • 23:34, 12 November 2022 imported>Stashbot 860,817 bytes +12,581 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T318605)', diff saved to https://phabricator.wikimedia.org/P39371 and previous config saved to /var/cache/conftool/dbconfig/20221112-233420-ladsgroup.json
  • 01:21, 12 November 2022 imported>Stashbot 848,236 bytes +62,212 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317', diff saved to https://phabricator.wikimedia.org/P39331 and previous config saved to /var/cache/conftool/dbconfig/20221112-012122-marostegui.json

11 November 2022

  • 01:22, 11 November 2022 imported>Stashbot 786,024 bytes +125,544 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317', diff saved to https://phabricator.wikimedia.org/P39163 and previous config saved to /var/cache/conftool/dbconfig/20221111-012250-ladsgroup.json

10 November 2022

8 November 2022

  • 22:00, 8 November 2022 imported>Stashbot 604,525 bytes +104,241 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • 01:25, 8 November 2022 imported>Stashbot 500,284 bytes +154,243 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)

6 November 2022

5 November 2022

  • 12:56, 5 November 2022 imported>Stashbot 345,478 bytes +537 mfossati@deploy1002: Finished deploy [airflow-dags/platform_eng@c849762]: (no justification provided) (duration: 00m 49s)

4 November 2022

  • 18:31, 4 November 2022 imported>Stashbot 344,941 bytes +23,994 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4052.ulsfo.wmnet with OS buster

3 November 2022

  • 22:45, 3 November 2022 imported>Stashbot 320,947 bytes +90,964 andrew@deploy1002: Finished deploy [horizon/deploy@9d02cd6]: (no justification provided) (duration: 01m 00s)

2 November 2022

  • 23:25, 2 November 2022 imported>Stashbot 229,983 bytes +129,100 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T318605)', diff saved to https://phabricator.wikimedia.org/P37874 and previous config saved to /var/cache/conftool/dbconfig/20221102-232540-ladsgroup.json
  • 01:19, 2 November 2022 imported>Stashbot 100,883 bytes −1,663,965 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P37541 and previous config saved to /var/cache/conftool/dbconfig/20221102-011937-ladsgroup.json

31 October 2022

  • 22:23, 31 October 2022 imported>Stashbot 1,764,848 bytes +82,444 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance

29 October 2022

28 October 2022

  • 20:42, 28 October 2022 imported>Stashbot 1,681,955 bytes +34,207 mutante: clouddumps* - deployed gerrit:848444 - as kind of expected it fails - most likely the project dirs are not automatically created before rsync runs the first time - T57503
  • 01:15, 28 October 2022 imported>Stashbot 1,647,748 bytes +124,757 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T318950)', diff saved to https://phabricator.wikimedia.org/P36942 and previous config saved to /var/cache/conftool/dbconfig/20221028-011505-ladsgroup.json

27 October 2022

  • 01:16, 27 October 2022 imported>Stashbot 1,522,991 bytes +90,843 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp4043.ulsfo.wmnet with OS buster

25 October 2022

  • 22:33, 25 October 2022 imported>Stashbot 1,432,148 bytes +92,794 robh@cumin2002: START - Cookbook sre.hosts.provision for host cp4052.mgmt.ulsfo.wmnet with reboot policy FORCED
  • 01:09, 25 October 2022 imported>Stashbot 1,339,354 bytes +82,471 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317 (T321312)', diff saved to https://phabricator.wikimedia.org/P36164 and previous config saved to /var/cache/conftool/dbconfig/20221025-010943-ladsgroup.json

22 October 2022

  • 03:16, 22 October 2022 imported>Stashbot 1,256,883 bytes +83 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99)
  • 00:03, 22 October 2022 imported>Stashbot 1,256,800 bytes +69,045 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance

21 October 2022

  • 01:14, 21 October 2022 imported>Stashbot 1,187,755 bytes +85,034 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2138:3314 (T321312)', diff saved to https://phabricator.wikimedia.org/P35798 and previous config saved to /var/cache/conftool/dbconfig/20221021-011452-ladsgroup.json

19 October 2022

  • 23:33, 19 October 2022 imported>Stashbot 1,102,721 bytes +35,546 wfan: civicrm upgraded from 477323fe to c96dd3ae
  • 01:20, 19 October 2022 imported>Stashbot 1,067,175 bytes +11,210 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-presto1011.eqiad.wmnet with reason: host reimage

17 October 2022

  • 23:16, 17 October 2022 imported>Stashbot 1,055,965 bytes +26,027 bblack@puppetmaster2001: conftool action : set/pooled=yes; selector: service=git-ssh

15 October 2022

  • 23:27, 15 October 2022 imported>Stashbot 1,029,938 bytes +932 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depool db1131 T320879', diff saved to https://phabricator.wikimedia.org/P35497 and previous config saved to /var/cache/conftool/dbconfig/20221015-232716-ladsgroup.json

14 October 2022

  • 22:56, 14 October 2022 imported>Stashbot 1,029,006 bytes +9,465 mutante: pcc-worker1003.puppet-diffs.eqiad1.wikimedia.cloud - out of disk space again - deleted 3.5GB job "1460" to unblock puppet compiling
  • 01:26, 14 October 2022 imported>Stashbot 1,019,541 bytes +26,484 tstarling@deploy1002: Synchronized wmf-config/CommonSettings.php: (no justification provided) (duration: 03m 36s)

13 October 2022

  • 00:58, 13 October 2022 imported>Stashbot 993,057 bytes +28,159 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4045.ulsfo.wmnet with OS buster

11 October 2022

  • 21:36, 11 October 2022 imported>Stashbot 964,898 bytes +24,881 mutante: phab1001 / phab2001 - temp. disabled puppet; stopped ssh-phab service; scheduled icinga downtimes for ssh-phab pybal backend alerts - effectively "soft shutting down" the service - T296022

10 October 2022

  • 21:19, 10 October 2022 imported>Stashbot 940,017 bytes +15,468 robh@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dns4004.wikimedia.org with OS bullseye

8 October 2022

  • 06:56, 8 October 2022 imported>Stashbot 924,549 bytes +110 hashar: Restarting Gerrit to fix up replicaton to GitHub - T320305

7 October 2022

  • 21:29, 7 October 2022 imported>Stashbot 924,439 bytes +12,004 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: debugging

6 October 2022

  • 21:13, 6 October 2022 imported>Stashbot 912,435 bytes +26,746 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 01:12, 6 October 2022 imported>Stashbot 885,689 bytes +23,589 reedy@deploy1002: Finished deploy [integration/docroot@dc380cb]: Update jQuery (duration: 00m 11s)

5 October 2022

  • 00:05, 5 October 2022 imported>Stashbot 862,100 bytes +37,349 sukhe: disable puppet on dns4003 till we resolve the puppet failures

3 October 2022

  • 21:45, 3 October 2022 imported>Stashbot 824,751 bytes +32,264 robh@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)

2 October 2022

  • 08:13, 2 October 2022 imported>Stashbot 792,487 bytes +109 elukey: `apt-get clean` on an-airflow1001 to free some space on the root partition

1 October 2022

  • 13:24, 1 October 2022 imported>Stashbot 792,378 bytes +460 fab@deploy1002: Finished deploy [airflow-dags/research@44a1158]: (no justification provided) (duration: 00m 08s)

30 September 2022

  • 23:26, 30 September 2022 imported>Stashbot 791,918 bytes +15,804 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • 00:31, 30 September 2022 imported>Stashbot 776,114 bytes +53,878 robh@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4045.ulsfo.wmnet with OS bullseye

29 September 2022

  • 01:01, 29 September 2022 imported>Stashbot 722,236 bytes +72,509 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host logstash2037.codfw.wmnet with OS buster

28 September 2022

  • 01:22, 28 September 2022 imported>Stashbot 649,727 bytes +31,209 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T314041)', diff saved to https://phabricator.wikimedia.org/P34972 and previous config saved to /var/cache/conftool/dbconfig/20220928-012205-ladsgroup.json

27 September 2022

25 September 2022

  • 17:29, 25 September 2022 imported>Stashbot 587,776 bytes +2,487 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1053.eqiad.wmnet with OS bullseye

23 September 2022

  • 19:10, 23 September 2022 imported>Stashbot 585,289 bytes +2,764 mforns@deploy1002: Finished deploy [airflow-dags/analytics@4c973d6]: (no justification provided) (duration: 00m 12s)

22 September 2022

  • 22:20, 22 September 2022 imported>Stashbot 582,525 bytes +8,984 joal@deploy1002: Finished deploy [airflow-dags/analytics@901f810]: (no justification provided) (duration: 00m 11s)

21 September 2022

  • 20:51, 21 September 2022 imported>Stashbot 573,541 bytes +10,680 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply

20 September 2022

19 September 2022

  • 22:59, 19 September 2022 imported>Stashbot 549,728 bytes +6,720 ebernhardson: T317200 start cirrussearch in-place reindex process for eqiad, codfw and cloudelastic

17 September 2022

16 September 2022

  • 21:29, 16 September 2022 imported>Stashbot 538,373 bytes +24,573 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 00:14, 16 September 2022 imported>Stashbot 513,800 bytes +30,264 bmansurov@deploy1002: Finished deploy [airflow-dags/research@b9be20d]: (no justification provided) (duration: 00m 17s)

14 September 2022

  • 22:08, 14 September 2022 imported>Stashbot 483,536 bytes +39,587 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1190 (T314041)', diff saved to https://phabricator.wikimedia.org/P34739 and previous config saved to /var/cache/conftool/dbconfig/20220914-220822-ladsgroup.json
  • 01:16, 14 September 2022 imported>Stashbot 443,949 bytes +58,148 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P34670 and previous config saved to /var/cache/conftool/dbconfig/20220914-011637-ladsgroup.json

13 September 2022

  • 00:50, 13 September 2022 imported>Stashbot 385,801 bytes +62,019 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on phab2001.codfw.wmnet with reason: syntax error in sudo

12 September 2022

  • 01:21, 12 September 2022 imported>Stashbot 323,782 bytes +10,360 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T314041)', diff saved to https://phabricator.wikimedia.org/P34424 and previous config saved to /var/cache/conftool/dbconfig/20220912-012118-ladsgroup.json

10 September 2022

  • 21:33, 10 September 2022 imported>Stashbot 313,422 bytes +11,937 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2146 (T312863)', diff saved to https://phabricator.wikimedia.org/P34396 and previous config saved to /var/cache/conftool/dbconfig/20220910-213300-ladsgroup.json
  • 00:50, 10 September 2022 imported>Stashbot 301,485 bytes +17,350 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2137:3314 (T314041)', diff saved to https://phabricator.wikimedia.org/P34358 and previous config saved to /var/cache/conftool/dbconfig/20220910-005046-ladsgroup.json

9 September 2022

  • 00:09, 9 September 2022 imported>Stashbot 284,135 bytes +69,901 bmansurov@deploy1002: Finished deploy [airflow-dags/research@b9be20d]: (no justification provided) (duration: 00m 22s)

8 September 2022

6 September 2022

  • 23:38, 6 September 2022 imported>Stashbot 155,293 bytes +69,531 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2114 (T314041)', diff saved to https://phabricator.wikimedia.org/P33981 and previous config saved to /var/cache/conftool/dbconfig/20220906-233809-ladsgroup.json
  • 01:03, 6 September 2022 imported>Stashbot 85,762 bytes +31,159 TimStarling: multi-DC stage 3: 2% of codfw/ulsfo/eqsin traffic going to codfw appservers, rolling out via puppet 00:54-01:24

5 September 2022

  • 00:36, 5 September 2022 imported>Stashbot 54,603 bytes −976,269 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1107.eqiad.wmnet with reason: Maintenance

3 September 2022

  • 23:50, 3 September 2022 imported>Stashbot 1,030,872 bytes +5,252 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 (T312863)', diff saved to https://phabricator.wikimedia.org/P33759 and previous config saved to /var/cache/conftool/dbconfig/20220903-235001-ladsgroup.json

2 September 2022

  • 19:03, 2 September 2022 imported>Stashbot 1,025,620 bytes +16,632 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply

1 September 2022

  • 20:51, 1 September 2022 imported>Stashbot 1,008,988 bytes +25,876 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 01:20, 1 September 2022 imported>Stashbot 983,112 bytes +32,085 eevans@cumin1001: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching restbase201[3-8].codfw.wmnet: Restart to apply new certificates (T316697) - eevans@cumin1001

31 August 2022

  • 00:15, 31 August 2022 imported>Stashbot 951,027 bytes +47,602 ryankemper@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw es7 cluster upgrade - ryankemper@cumin2002 - T316719

30 August 2022

  • 01:04, 30 August 2022 imported>Stashbot 903,425 bytes +47,117 TimStarling: setting scaling_governor=performance on all mediawiki servers, via puppet gerrit 826405

28 August 2022

  • 21:03, 28 August 2022 imported>Stashbot 856,308 bytes +47,266 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P33561 and previous config saved to /var/cache/conftool/dbconfig/20220828-210336-ladsgroup.json
  • 01:05, 28 August 2022 imported>Stashbot 809,042 bytes +25,117 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P33423 and previous config saved to /var/cache/conftool/dbconfig/20220828-010522-ladsgroup.json

27 August 2022

  • 01:03, 27 August 2022 imported>Stashbot 783,925 bytes +51,766 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T316186)', diff saved to https://phabricator.wikimedia.org/P33346 and previous config saved to /var/cache/conftool/dbconfig/20220827-010313-ladsgroup.json

26 August 2022

  • 00:38, 26 August 2022 imported>Stashbot 732,159 bytes +75,646 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2131 (T312160)', diff saved to https://phabricator.wikimedia.org/P33172 and previous config saved to /var/cache/conftool/dbconfig/20220826-003819-ladsgroup.json

25 August 2022

  • 01:25, 25 August 2022 imported>Stashbot 656,513 bytes +49,388 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P32985 and previous config saved to /var/cache/conftool/dbconfig/20220825-012538-ladsgroup.json

23 August 2022

  • 22:31, 23 August 2022 imported>Stashbot 607,125 bytes +44,996 mutante: mwmaint1002 - find /var/lib/puppet/clientbucket -type f -size +100M -delete
  • 01:11, 23 August 2022 imported>Stashbot 562,129 bytes +42,364 TimStarling: on mw1411, mw1413, mw1419, mw1429, mw1431, mw1433: set scaling_governor to powersave and energy_performance_preference to performance

22 August 2022

  • 00:38, 22 August 2022 imported>Stashbot 519,765 bytes +4,589 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply

20 August 2022

  • 22:18, 20 August 2022 imported>Stashbot 515,176 bytes +5,114 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3312 (T314041)', diff saved to https://phabricator.wikimedia.org/P32637 and previous config saved to /var/cache/conftool/dbconfig/20220820-221826-ladsgroup.json
  • 01:26, 20 August 2022 imported>Stashbot 510,062 bytes +33,432 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1129 (T314041)', diff saved to https://phabricator.wikimedia.org/P32622 and previous config saved to /var/cache/conftool/dbconfig/20220820-012602-ladsgroup.json

18 August 2022

  • 23:33, 18 August 2022 imported>Stashbot 476,630 bytes +51,439 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 00:49, 18 August 2022 imported>Stashbot 425,191 bytes +51,429 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kubernetes2023.codfw.wmnet']

17 August 2022

  • 01:23, 17 August 2022 imported>Stashbot 373,762 bytes +37,026 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-logging2005']

16 August 2022

  • curprev 00:1800:18, 16 August 2022imported>Stashbot 336,736 bytes +19,280 tstarling@deploy1002: Synchronized wmf-config/InitialiseSettings.php: replaceableSettings g 820247 (duration: 03m 18s)

14 August 2022

  • curprev 08:5408:54, 14 August 2022imported>Stashbot 317,456 bytes +542 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1141 (T312863)', diff saved to https://phabricator.wikimedia.org/P32380 and previous config saved to /var/cache/conftool/dbconfig/20220814-085443-ladsgroup.json

13 August 2022

  • curprev 13:3713:37, 13 August 2022imported>Stashbot 316,914 bytes +1,309 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1145.eqiad.wmnet with reason: Maintenance

12 August 2022

  • curprev 23:4123:41, 12 August 2022imported>Stashbot 315,605 bytes +10,763 mutante: wikistats-bullseye:~$ /usr/lib/wikistats/update.php wp prefix blk ; /usr/lib/wikistats/update.php wp prefix kcg T315121
  • curprev 01:0301:03, 12 August 2022imported>Stashbot 304,842 bytes +25,321 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1121 (T312863)', diff saved to https://phabricator.wikimedia.org/P32369 and previous config saved to /var/cache/conftool/dbconfig/20220812-010312-ladsgroup.json

11 August 2022

  • curprev 00:5800:58, 11 August 2022imported>Stashbot 279,521 bytes +35,722 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp2042.codfw.wmnet,service=varnish-fe

9 August 2022

  • curprev 23:1723:17, 9 August 2022imported>Stashbot 243,799 bytes +16,655 bking@cumin1001: conftool action : set/weight=10:pooled=yes; selector: name=wdqs1011.eqiad.wmnet

8 August 2022

  • curprev 23:5223:52, 8 August 2022imported>Stashbot 227,144 bytes +14,190 tstarling@deploy1002: Synchronized wmf-config/InitialiseSettings.php: clean up testwiki experiments T314750 (duration: 03m 19s)

7 August 2022

  • curprev 19:5819:58, 7 August 2022imported>Stashbot 212,954 bytes +3,230 taavi: taavi@mwmaint1002 ~ $ echo "https://upload.wikimedia.org/wikipedia/commons/1/15/Keep_tidy_ask.svg" | mwscript purgeList.php --wiki enwiki # T314712

6 August 2022

  • curprev 17:5917:59, 6 August 2022imported>Stashbot 209,724 bytes +2,395 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1149 (T312863)', diff saved to https://phabricator.wikimedia.org/P32295 and previous config saved to /var/cache/conftool/dbconfig/20220806-175916-ladsgroup.json

5 August 2022

  • curprev 22:2022:20, 5 August 2022imported>Stashbot 207,329 bytes +9,421 dcausse@deploy1002: Finished deploy [wikimedia/discovery/analytics@71fe016]: Fix schedule_interval for image_recommendation_weekly (duration: 02m 01s)
  • curprev 00:5300:53, 5 August 2022imported>Stashbot 197,908 bytes +49,951 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8 days, 0:00:00 on gerrit2001.wikimedia.org with reason: decom, replaced by gerrit2002

4 August 2022

  • curprev 01:2301:23, 4 August 2022imported>Stashbot 147,957 bytes +66,383 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T312972)', diff saved to https://phabricator.wikimedia.org/P32278 and previous config saved to /var/cache/conftool/dbconfig/20220804-012341-marostegui.json

2 August 2022

  • curprev 22:3922:39, 2 August 2022imported>Stashbot 81,574 bytes +58,081 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • curprev 00:4100:41, 2 August 2022imported>Stashbot 23,493 bytes −741,414 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply

1 August 2022

  • curprev 01:0001:00, 1 August 2022imported>Stashbot 764,907 bytes +3,017 krinkle@deploy1002: Synchronized multiversion/: Ic0dbcba9f60f20a (duration: 03m 31s)

30 July 2022

  • curprev 01:4401:44, 30 July 2022imported>Stashbot 761,890 bytes +392 bking@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.REIMAGE (1 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster reimage (bullseye upgrade) - bking@cumin1001 - T289135
  • curprev 00:5500:55, 30 July 2022imported>Stashbot 761,498 bytes +10,804 bking@cumin1001: START - Cookbook sre.hosts.reimage for host elastic2028.codfw.wmnet with OS bullseye

29 July 2022

  • curprev 00:4800:48, 29 July 2022imported>Stashbot 750,694 bytes +35,853 TimStarling: slowly restarting (with batch 1 sleep 5) trafficserver on text caches to fully deploy g 817086 T313578

28 July 2022

  • curprev 01:2601:26, 28 July 2022imported>Stashbot 714,841 bytes +36,833 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply

26 July 2022

  • curprev 23:5923:59, 26 July 2022imported>Stashbot 678,008 bytes +28,393 tzatziki: removing one file for legal compliance
  • curprev 00:1100:11, 26 July 2022imported>Stashbot 649,615 bytes +31,213 TimStarling: restarted php7.2-fpm on the 9 canary hosts in eqiad T313770

24 July 2022

  • curprev 20:5420:54, 24 July 2022imported>Stashbot 618,402 bytes +4,271 btullis@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM archiva1002.wikimedia.org
  • curprev 00:3700:37, 24 July 2022imported>Stashbot 614,131 bytes +18,168 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1158 (T312863)', diff saved to https://phabricator.wikimedia.org/P31802 and previous config saved to /var/cache/conftool/dbconfig/20220724-003718-ladsgroup.json

23 July 2022

  • curprev 01:3701:37, 23 July 2022imported>Stashbot 595,963 bytes +27,357 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P31750 and previous config saved to /var/cache/conftool/dbconfig/20220723-013755-ladsgroup.json

22 July 2022

  • curprev 00:4400:44, 22 July 2022imported>Stashbot 568,606 bytes +68,956 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply

21 July 2022

  • curprev 00:4400:44, 21 July 2022imported>Stashbot 499,650 bytes +51,704 bking@cumin1001: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REIMAGE (1 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster reimage (bullseye upgrade) - bking@cumin1001 - T289135

20 July 2022

  • curprev 01:2701:27, 20 July 2022imported>Stashbot 447,946 bytes +56,169 bking@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2052.codfw.wmnet with reason: host reimage

18 July 2022

  • curprev 23:5823:58, 18 July 2022imported>Stashbot 391,777 bytes +61,137 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudvirt1050.eqiad.wmnet

17 July 2022

  • curprev 18:0518:05, 17 July 2022imported>Stashbot 330,640 bytes +10,758 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T312984)', diff saved to https://phabricator.wikimedia.org/P31256 and previous config saved to /var/cache/conftool/dbconfig/20220717-180539-ladsgroup.json
  • curprev 00:4800:48, 17 July 2022imported>Stashbot 319,882 bytes +13,275 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P31225 and previous config saved to /var/cache/conftool/dbconfig/20220717-004804-ladsgroup.json

16 July 2022

  • curprev 00:4700:47, 16 July 2022imported>Stashbot 306,607 bytes +27,543 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2064.codfw.wmnet with OS bullseye

15 July 2022

  • curprev 00:3000:30, 15 July 2022imported>Stashbot 279,064 bytes +31,156 TimStarling: on ms-fe1010 restarting swift-proxy

14 July 2022

  • curprev 00:4400:44, 14 July 2022imported>Stashbot 247,908 bytes +16,545 krinkle@deploy1002: Synchronized php-1.39.0-wmf.19/includes/ResourceLoader/: Ie11bdfdcf5e6724 (duration: 02m 55s)

12 July 2022

  • curprev 22:3222:32, 12 July 2022imported>Stashbot 231,363 bytes +15,830 bking@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2039.codfw.wmnet with OS bullseye
  • curprev 00:1000:10, 12 July 2022imported>Stashbot 215,533 bytes +11,642 ejegg: updated payments-wiki from 53a7b7bd to 2f95d8b4

11 July 2022

  • curprev 00:2300:23, 11 July 2022imported>Stashbot 203,891 bytes +379 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply

9 July 2022

  • curprev 13:3413:34, 9 July 2022imported>Stashbot 203,512 bytes +504 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • curprev 01:4401:44, 9 July 2022imported>Stashbot 203,008 bytes +12,736 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply

8 July 2022

  • curprev 00:0200:02, 8 July 2022imported>Stashbot 190,272 bytes +47,463 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2181.codfw.wmnet with OS bullseye

7 July 2022

  • curprev 00:5800:58, 7 July 2022imported>Stashbot 142,809 bytes +50,193 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply

5 July 2022

  • curprev 23:3023:30, 5 July 2022imported>Stashbot 92,616 bytes +32,932 ebernhardson: start restore of commonswiki_file from thanos-swift to cloudelastic

4 July 2022

  • curprev 20:0920:09, 4 July 2022imported>Stashbot 59,684 bytes +22,620 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cloudcontrol1004.wikimedia.org

3 July 2022

  • curprev 11:3611:36, 3 July 2022imported>Stashbot 37,064 bytes +255 _joe_: temporarily raised replicas for shellbox to 24

2 July 2022

  • curprev 05:3605:36, 2 July 2022imported>Stashbot 36,809 bytes +2,607 bmansurov@deploy1002: Finished deploy [airflow-dags/research@b3fe77c]: (no justification provided) (duration: 00m 09s)
  • curprev 00:4500:45, 2 July 2022imported>Stashbot 34,202 bytes +30,284 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance

1 July 2022

  • curprev 01:3901:39, 1 July 2022imported>Stashbot 3,918 bytes −776,405 krinkle@deploy1002: Synchronized tests/: I60edfb0f60 (1/3) (duration: 03m 32s)

30 June 2022

  • curprev 01:3601:36, 30 June 2022imported>Stashbot 780,323 bytes +52,007 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2158.codfw.wmnet with OS bullseye

29 June 2022

  • curprev 00:1800:18, 29 June 2022imported>Stashbot 728,316 bytes +67,167 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudgw2003-dev.codfw.wmnet with OS bullseye

27 June 2022

  • curprev 23:5123:51, 27 June 2022imported>Stashbot 661,149 bytes +85,388 cmjohnson@cumin1001: START - Cookbook sre.hosts.provision for host mw1474.mgmt.eqiad.wmnet with reboot policy FORCED
  • curprev 01:2501:25, 27 June 2022imported>Stashbot 575,761 bytes +7,333 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1008.mgmt.eqiad.wmnet with reboot policy FORCED

25 June 2022

  • curprev 18:1718:17, 25 June 2022imported>Stashbot 568,428 bytes +2,028 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest1002.eqiad.wmnet

24 June 2022

  • curprev 19:3519:35, 24 June 2022imported>Stashbot 566,400 bytes +54,579 dancy@deploy1002: backport aborted: (duration: 00m 12s)

23 June 2022

  • curprev 21:2321:23, 23 June 2022imported>Stashbot 511,821 bytes +39,618 mutante: restbase-dev1006 has manually installed packages (wrk, maybe others)
  • curprev 00:3500:35, 23 June 2022imported>Stashbot 472,203 bytes +29,515 brennen: end of phabricator maintenance window

22 June 2022

  • curprev 01:1801:18, 22 June 2022imported>Stashbot 442,688 bytes +15,474 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply

20 June 2022

  • curprev 07:1407:14, 20 June 2022imported>Stashbot 427,214 bytes +308 SandraEbele: Started Airflow 3 Wikidata metrics jobs (Articleplaceholder, Reliability and SpecialEntityData metrics).

19 June 2022

  • curprev 10:2810:28, 19 June 2022imported>Stashbot 426,906 bytes +493 ayounsi@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1132.eqiad.wmnet with reason: depooled

17 June 2022

  • curprev 22:0522:05, 17 June 2022imported>Stashbot 426,413 bytes +16,273 AndyRussG: update payments-wiki revision 10304f69 -> ef53c82e
  • curprev 01:4301:43, 17 June 2022imported>Stashbot 410,140 bytes +29,970 pt1979@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aqs1017.eqiad.wmnet with reason: host reimage

15 June 2022

  • curprev 22:4822:48, 15 June 2022imported>Stashbot 380,170 bytes +61,049 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T310011)', diff saved to https://phabricator.wikimedia.org/P29867 and previous config saved to /var/cache/conftool/dbconfig/20220615-224845-marostegui.json

14 June 2022

  • curprev 23:5223:52, 14 June 2022imported>Stashbot 319,121 bytes +44,141 mutante: gitlab-runner1001/1002 - clean revert not possible, icinga alerting about failed buildkitd service, manually deleting systemd unit and trying to clean up T308271
  • curprev 00:3600:36, 14 June 2022imported>Stashbot 274,980 bytes +45,898 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 (T310011)', diff saved to https://phabricator.wikimedia.org/P29701 and previous config saved to /var/cache/conftool/dbconfig/20220614-003608-marostegui.json

12 June 2022

  • curprev 18:3118:31, 12 June 2022imported>Stashbot 229,082 bytes +4,306 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host clouddumps1002.wikimedia.org with OS bullseye
  • curprev 01:4601:46, 12 June 2022imported>Stashbot 224,776 bytes +4,304 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on clouddumps1001.wikimedia.org with reason: host reimage

11 June 2022

  • curprev 01:1701:17, 11 June 2022imported>Stashbot 220,472 bytes +8,628 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply

10 June 2022

  • curprev 00:3300:33, 10 June 2022imported>Stashbot 211,844 bytes +35,139 ejegg: rolled back payments-wiki from 05139a0c to 8c6208c2

9 June 2022

  • curprev 00:4900:49, 9 June 2022imported>Stashbot 176,705 bytes +52,552 krinkle@deploy1002: Synchronized php-1.39.0-wmf.15/includes/libs/rdbms/: I99b817b3d50ffcdf56, T310214 (duration: 03m 23s)

8 June 2022

  • curprev 01:4301:43, 8 June 2022imported>Stashbot 124,153 bytes +33,565 cstone: civicrm revision changed from de12571a to b0b400ae

6 June 2022

  • curprev 23:1723:17, 6 June 2022imported>Stashbot 90,588 bytes +16,595 tzatziki: removing one file for legal compliance

5 June 2022

  • curprev 22:2122:21, 5 June 2022imported>Stashbot 73,993 bytes +6,438 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3314 (T298560)', diff saved to https://phabricator.wikimedia.org/P29417 and previous config saved to /var/cache/conftool/dbconfig/20220605-222110-ladsgroup.json
  • curprev 01:3701:37, 5 June 2022imported>Stashbot 67,555 bytes +6,227 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host clouddumps1001.wikimedia.org with OS bullseye

3 June 2022

  • curprev 22:1922:19, 3 June 2022imported>Stashbot 61,328 bytes +9,538 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • curprev 01:2001:20, 3 June 2022imported>Stashbot 51,790 bytes +28,593 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1121 (T298560)', diff saved to https://phabricator.wikimedia.org/P29365 and previous config saved to /var/cache/conftool/dbconfig/20220603-012045-ladsgroup.json

2 June 2022

  • curprev 01:4701:47, 2 June 2022imported>Stashbot 23,197 bytes −1,118,222 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply

1 June 2022

  • curprev 01:4101:41, 1 June 2022imported>Stashbot 1,141,419 bytes +62,344 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance

31 May 2022

  • curprev 00:4000:40, 31 May 2022imported>Stashbot 1,079,075 bytes +101,499 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance

30 May 2022

  • curprev 01:4501:45, 30 May 2022imported>Stashbot 977,576 bytes +7,857 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P28904 and previous config saved to /var/cache/conftool/dbconfig/20220530-014458-ladsgroup.json

28 May 2022

  • curprev 23:3623:36, 28 May 2022imported>Stashbot 969,719 bytes +50,883 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3317 (T298560)', diff saved to https://phabricator.wikimedia.org/P28882 and previous config saved to /var/cache/conftool/dbconfig/20220528-233650-ladsgroup.json
  • curprev 01:3201:32, 28 May 2022imported>Stashbot 918,836 bytes +45,130 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1122 (T309311)', diff saved to https://phabricator.wikimedia.org/P28737 and previous config saved to /var/cache/conftool/dbconfig/20220528-013212-ladsgroup.json

27 May 2022

  • curprev 00:4500:45, 27 May 2022imported>Stashbot 873,706 bytes +31,398 mutante: rsyncing /srv/gitlab-backup from gitlab1004 to gitlab2002 | systemctl status full-backup ..in progress on gitlab1001 - T274463

26 May 2022

  • curprev 00:5800:58, 26 May 2022imported>Stashbot 842,308 bytes +49,509 mutante: gitlab1001 - T308089 T274463 - gitlab1001 - systemctl start full-backup

25 May 2022

  • curprev 00:1500:15, 25 May 2022imported>Stashbot 792,799 bytes +52,401 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3315 (T298560)', diff saved to https://phabricator.wikimedia.org/P28462 and previous config saved to /var/cache/conftool/dbconfig/20220525-001552-ladsgroup.json

24 May 2022

  • curprev 00:5200:52, 24 May 2022imported>Stashbot 740,398 bytes +67,605 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P28379 and previous config saved to /var/cache/conftool/dbconfig/20220524-005257-ladsgroup.json

22 May 2022

  • curprev 20:4620:46, 22 May 2022imported>Stashbot 672,793 bytes +13,528 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • curprev 00:2100:21, 22 May 2022imported>Stashbot 659,265 bytes +20,709 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3312 (T298560)', diff saved to https://phabricator.wikimedia.org/P28249 and previous config saved to /var/cache/conftool/dbconfig/20220522-002120-ladsgroup.json

21 May 2022

  • curprev 01:0601:06, 21 May 2022imported>Stashbot 638,556 bytes +27,942 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1158 (T298555)', diff saved to https://phabricator.wikimedia.org/P28208 and previous config saved to /var/cache/conftool/dbconfig/20220521-010640-ladsgroup.json

20 May 2022

  • curprev 01:3101:31, 20 May 2022imported>Stashbot 610,614 bytes +72,169 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)

19 May 2022

  • curprev 00:5800:58, 19 May 2022imported>Stashbot 538,445 bytes +60,753 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance

18 May 2022

  • curprev 01:0501:05, 18 May 2022imported>Stashbot 477,692 bytes +34,747 ejegg: updated fundraising CiviCRM from d45afdfc to b8b8c177

16 May 2022

  • curprev 22:1422:14, 16 May 2022imported>Stashbot 442,945 bytes +16,328 jhathaway@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx2001.wikimedia.org with reason: exim debugging

15 May 2022

  • curprev 21:4721:47, 15 May 2022imported>Stashbot 426,617 bytes +1,183 aqu@deploy1002: Finished deploy [airflow-dags/analytics_test@378e7ca]: (no justification provided) (duration: 00m 07s)

14 May 2022

  • curprev 08:3408:34, 14 May 2022imported>Stashbot 425,434 bytes +205 jynus@cumin1001: dbctl commit (dc=all): 'Depool db1172', diff saved to https://phabricator.wikimedia.org/P27830 and previous config saved to /var/cache/conftool/dbconfig/20220514-083421-jynus.json
  • curprev 00:5300:53, 14 May 2022imported>Stashbot 425,229 bytes +4,537 razzi@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on an-tool1005.eqiad.wmnet with reason: Server need to be downgraded to stretch, on monday

12 May 2022

  • curprev 21:5621:56, 12 May 2022imported>Stashbot 420,692 bytes +26,145 razzi@deploy1002: Finished deploy [analytics/turnilo/deploy@a2bdc3e]: (no justification provided) (duration: 02m 08s)

11 May 2022

  • curprev 22:2822:28, 11 May 2022imported>Stashbot 394,547 bytes +16,527 robh: cp305[67] returned to service and all green in icinga, cp305[89] depooling for firmware update T243167
  • curprev 01:4101:41, 11 May 2022imported>Stashbot 378,020 bytes +25,757 mutante: gitlab2001 - starting backup-restore service that had failed on previous automatic run

9 May 2022

  • curprev 21:5821:58, 9 May 2022imported>Stashbot 352,263 bytes +25,329 jhathaway@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx2001.wikimedia.org with reason: new kernel round deux

8 May 2022

  • curprev 07:1607:16, 8 May 2022imported>Stashbot 326,934 bytes +81 godog: silence probedown for thumbor:8800 until monday

7 May 2022

  • curprev 21:2921:29, 7 May 2022imported>Stashbot 326,853 bytes +2,312 andrew@deploy1002: Finished deploy [horizon/deploy@9d02cd6]: seeking consistency between codfw1dev and eqiad1 (duration: 04m 04s)