You are browsing a read-only backup copy of Wikitech. The live site can be found at wikitech.wikimedia.org

Server Admin Log: Revision history

Legend: (cur) = difference with latest revision, (prev) = difference with preceding revision, m = minor edit.

9 June 2023

  • curprev 01:0801:08, 9 June 2023imported>Stashbot 340,662 bytes +52,887 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - pt1979@cumin2002"

8 June 2023

  • curprev 01:0801:08, 8 June 2023imported>Stashbot 287,775 bytes +111,676 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T336886)', diff saved to https://phabricator.wikimedia.org/P49233 and previous config saved to /var/cache/conftool/dbconfig/20230608-010853-ladsgroup.json

7 June 2023

  • curprev 01:1601:16, 7 June 2023imported>Stashbot 176,099 bytes +73,106 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2111', diff saved to https://phabricator.wikimedia.org/P48979 and previous config saved to /var/cache/conftool/dbconfig/20230607-011602-ladsgroup.json

6 June 2023

  • curprev 01:2101:21, 6 June 2023imported>Stashbot 102,993 bytes +67,142 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317 (T336886)', diff saved to https://phabricator.wikimedia.org/P48826 and previous config saved to /var/cache/conftool/dbconfig/20230606-012058-ladsgroup.json

3 June 2023

  • curprev 13:4113:41, 3 June 2023imported>Stashbot 35,851 bytes +556 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on an-test-worker1001.eqiad.wmnet with reason: Host under testing/upgrade

2 June 2023

  • curprev 20:1620:16, 2 June 2023imported>Stashbot 35,295 bytes +4,802 apergos: rsync in ariel screen session, bwlimit 100000, running on dumpsdata1003, pulling from dumpsdata1002, copying over 'other dumps'

1 June 2023

30 May 2023

29 May 2023

  • curprev 15:1915:19, 29 May 2023imported>Stashbot 851,144 bytes +7,988 eoghan@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on vrts2001.codfw.wmnet with reason: This is being worked on

28 May 2023

  • curprev 13:1913:19, 28 May 2023imported>Stashbot 843,156 bytes +471 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/thumbor: sync

27 May 2023

  • curprev 21:4021:40, 27 May 2023imported>Stashbot 842,685 bytes +225 Amir1: insert into templatelinks (tl_from, tl_from_namespace, tl_target_id) values (686, 0, 199); on db1154:3113 (T337446)
  • curprev 00:0300:03, 27 May 2023imported>Stashbot 842,460 bytes +17,800 tzatziki: removing 1 file for legal compliance

25 May 2023

24 May 2023

22 May 2023

  • curprev 23:2923:29, 22 May 2023imported>Stashbot 720,989 bytes +27,509 eileen: civicrm upgraded from cc9593d0 to 7eae24d5

21 May 2023

  • curprev 07:4507:45, 21 May 2023imported>Stashbot 693,480 bytes +523 jelto@deploy1002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply

20 May 2023

19 May 2023

  • curprev 21:2221:22, 19 May 2023imported>Stashbot 691,547 bytes +18,498 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)

18 May 2023

  • curprev 23:2623:26, 18 May 2023imported>Stashbot 673,049 bytes +20,985 brennen@deploy1002: rebuilt and synchronized wikiversions files: group0 wikis to 1.41.0-wmf.9 refs T330215

17 May 2023

  • curprev 22:3022:30, 17 May 2023imported>Stashbot 652,064 bytes +51,758 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • curprev 00:2100:21, 17 May 2023imported>Stashbot 600,306 bytes +39,703 krinkle@deploy1002: Synchronized src/: I4cfa4a2474b4e (duration: 06m 01s)

15 May 2023

  • curprev 23:3723:37, 15 May 2023imported>Stashbot 560,603 bytes +15,254 eileen: civicrm upgraded from db6e8d69 to ef7b3822

12 May 2023

  • curprev 22:5922:59, 12 May 2023imported>Stashbot 545,349 bytes +13,865 pt1979@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudswift1001.eqiad.wmnet with OS bullseye
  • curprev 01:0801:08, 12 May 2023imported>Stashbot 531,484 bytes +31,985 denisse@cumin1001: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts prometheus6001.drmrs.wmnet

10 May 2023

  • curprev 22:0822:08, 10 May 2023imported>Stashbot 499,499 bytes +53,507 bking@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs2021.codfw.wmnet with OS buster

9 May 2023

  • curprev 23:4323:43, 9 May 2023imported>Stashbot 445,992 bytes +64,686 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_eqiad
  • curprev 00:4300:43, 9 May 2023imported>Stashbot 381,306 bytes +82,838 eileen: civicrm upgraded from d5229d22 to 301e24e4

7 May 2023

  • curprev 00:5400:54, 7 May 2023imported>Stashbot 298,468 bytes +1,045 sukhe: restart haproxy on cp1087: T334448

5 May 2023

  • curprev 23:2423:24, 5 May 2023imported>Stashbot 297,423 bytes +37,811 tzatziki: removing emails from 230 users per self-requests
  • curprev 01:1701:17, 5 May 2023imported>Stashbot 259,612 bytes +123,759 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2126', diff saved to https://phabricator.wikimedia.org/P47724 and previous config saved to /var/cache/conftool/dbconfig/20230505-011700-ladsgroup.json

3 May 2023

  • curprev 23:5523:55, 3 May 2023imported>Stashbot 135,853 bytes +103,007 eileen: config revision changed from 2995f558 to 26147e89
  • curprev 01:2401:24, 3 May 2023imported>Stashbot 32,846 bytes +22,306 eileen: civicrm upgraded from 09d2eefd to c6149ad2

1 May 2023

29 April 2023

  • curprev 23:0323:03, 29 April 2023imported>Stashbot 738,510 bytes +490 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1132.eqiad.wmnet with reason: Maint

28 April 2023

  • curprev 22:4622:46, 28 April 2023imported>Stashbot 738,020 bytes +9,846 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)

27 April 2023

  • curprev 22:1722:17, 27 April 2023imported>Stashbot 728,174 bytes +23,740 zabe@deploy1002: Finished scap: T334295 (duration: 06m 58s)
  • curprev 00:0100:01, 27 April 2023imported>Stashbot 704,434 bytes +39,904 zabe@deploy1002: Finished scap: T334295 (duration: 06m 53s)

25 April 2023

  • curprev 21:4021:40, 25 April 2023imported>Stashbot 664,530 bytes +14,368 mutante: gerrit1003 - chown -R gerrit2:gerrit2 /var/lib/gerrit2/review_site/ - T326368

24 April 2023

  • curprev 23:1523:15, 24 April 2023imported>Stashbot 650,162 bytes +20,174 eileen: civicrm upgraded from c17c8db2 to 26150ed4

22 April 2023

  • curprev 05:4105:41, 22 April 2023imported>Stashbot 629,988 bytes +821 joe: <thumbor/codfw>$ helmfile --state-values-set roll_restart=1 -e codfw sync

21 April 2023

  • curprev 18:2818:28, 21 April 2023imported>Stashbot 629,167 bytes +2,295 mvernon@cumin2002: END (FAIL) - Cookbook sre.swift.remove-ghost-objects (exit_code=99) from container wikipedia-en-local-public.a8 in codfw
  • curprev 00:3700:37, 21 April 2023imported>Stashbot 626,872 bytes +23,363 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host backup2010.codfw.wmnet with OS bullseye

20 April 2023

  • curprev 00:0200:02, 20 April 2023imported>Stashbot 603,509 bytes +50,285 mutante: LDAP - adding uid fnavas-foundation to group wmf - T331482

19 April 2023

  • curprev 01:1701:17, 19 April 2023imported>Stashbot 553,224 bytes +56,948 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1210 (T333332)', diff saved to https://phabricator.wikimedia.org/P47161 and previous config saved to /var/cache/conftool/dbconfig/20230419-011754-ladsgroup.json

18 April 2023

  • curprev 00:5400:54, 18 April 2023imported>Stashbot 496,276 bytes +58,776 eevans@cumin1001: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cassandra-dev2001.codfw.wmnet

16 April 2023

  • curprev 07:5407:54, 16 April 2023imported>Stashbot 437,500 bytes +394 vgutierrez: restart haproxy on cp2033 to clear unexpected service restart alerts - T334448

15 April 2023

  • curprev 07:1307:13, 15 April 2023imported>Stashbot 437,106 bytes +11,341 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2181 (T333332)', diff saved to https://phabricator.wikimedia.org/P46929 and previous config saved to /var/cache/conftool/dbconfig/20230415-071327-ladsgroup.json
  • curprev 01:2301:23, 15 April 2023imported>Stashbot 425,765 bytes +83,532 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P46892 and previous config saved to /var/cache/conftool/dbconfig/20230415-012305-ladsgroup.json

14 April 2023

  • curprev 01:0801:08, 14 April 2023imported>Stashbot 342,233 bytes +39,811 fab@deploy2002: Finished deploy [airflow-dags/research@f8dad05]: (no justification provided) (duration: 00m 10s)

13 April 2023

  • curprev 01:2301:23, 13 April 2023imported>Stashbot 302,422 bytes +92,852 fab@deploy2002: Finished deploy [airflow-dags/research@f8dad05]: (no justification provided) (duration: 00m 10s)

12 April 2023

  • curprev 01:1601:16, 12 April 2023imported>Stashbot 209,570 bytes +52,844 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1141 (T333332)', diff saved to https://phabricator.wikimedia.org/P46380 and previous config saved to /var/cache/conftool/dbconfig/20230412-011619-ladsgroup.json

11 April 2023

  • curprev 00:3700:37, 11 April 2023imported>Stashbot 156,726 bytes +28,171 eileen: civicrm upgraded from 001e156a to bc2f5ccc

8 April 2023

  • curprev 17:5717:57, 8 April 2023imported>Stashbot 128,555 bytes +152 jclark@cumin1001: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['ms-be1073']

7 April 2023

  • curprev 18:1918:19, 7 April 2023imported>Stashbot 128,403 bytes +1,699 xcollazo@deploy2002: Finished deploy [airflow-dags/platform_eng@5c4ebda]: (no justification provided) (duration: 00m 35s)
  • curprev 01:1701:17, 7 April 2023imported>Stashbot 126,704 bytes +39,724 urandom: rebooting sessionstore1001 — T327954

6 April 2023

  • curprev 00:5000:50, 6 April 2023imported>Stashbot 86,980 bytes +44,195 urandom: rebooting sessionstore1001 — T327954

4 April 2023

  • curprev 23:4023:40, 4 April 2023imported>Stashbot 42,785 bytes +27,465 jclark@cumin1001: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudvirtlocal1002.mgmt.eqiad.wmnet with reboot policy FORCED

3 April 2023

  • curprev 21:5321:53, 3 April 2023imported>Stashbot 15,320 bytes +15,053 ryankemper: T331896 `sudo -E cumin -b 4 'wdqs*' 'sudo run-puppet-agent'`

1 April 2023

31 March 2023

  • curprev 01:0801:08, 31 March 2023imported>Stashbot 921,204 bytes +54,765 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host gerrit1003.wikimedia.org with OS bullseye

30 March 2023

  • curprev 00:2700:27, 30 March 2023imported>Stashbot 866,439 bytes +44,733 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['db1207']

29 March 2023

  • curprev 00:4200:42, 29 March 2023imported>Stashbot 821,706 bytes +28,993 sukhe@puppetmaster1001: conftool action : set/pooled=yes; selector: name=cp2035.codfw.wmnet,service=ats-be

27 March 2023

  • curprev 23:4723:47, 27 March 2023imported>Stashbot 792,713 bytes +20,868 mutante: people1003 - taking down apache to provoke monitoring alert (inactive instances) and confirm IRC alerting change works

25 March 2023

  • curprev 07:5407:54, 25 March 2023imported>Stashbot 771,845 bytes +267 hashar@deploy2002: Finished deploy [integration/docroot@ab848e3]: build: Updating eslint-config-wikimedia to 0.24.0 (duration: 00m 08s)
  • curprev 00:5900:59, 25 March 2023imported>Stashbot 771,578 bytes +6,013 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on doc1002.eqiad.wmnet with reason: WIP-known-to-be-debugged-new-host

24 March 2023

  • curprev 00:3800:38, 24 March 2023imported>Stashbot 765,565 bytes +28,346 gmodena@deploy1002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mediawiki-page-content-change-enrichment: apply

23 March 2023

  • curprev 01:0501:05, 23 March 2023imported>Stashbot 737,219 bytes +29,786 denisse@cumin1001: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "doc1003 - denisse@cumin1001 - T332812"

22 March 2023

  • curprev 00:5700:57, 22 March 2023imported>Stashbot 707,433 bytes +22,655 zabe@deploy2002: Finished scap: update interwiki cache (duration: 07m 02s)

20 March 2023

19 March 2023

  • curprev 18:2718:27, 19 March 2023imported>Stashbot 664,250 bytes +466 AndyRussG: update config (to re-enable old PayPal orphan slayer job) 27a5b481 -> 6359222d
  • curprev 00:1700:17, 19 March 2023imported>Stashbot 663,784 bytes +1,354 fab@deploy2002: Finished deploy [airflow-dags/research@5edcd7b]: (no justification provided) (duration: 00m 05s)

18 March 2023

17 March 2023

  • curprev 01:0501:05, 17 March 2023imported>Stashbot 655,551 bytes +22,974 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs2010.codfw.wmnet

15 March 2023

14 March 2023

12 March 2023

  • curprev 10:4710:47, 12 March 2023imported>Stashbot 500,011 bytes +444 elukey: reset offsets on kafka jumbo for benthos webrequest live (as indicated in https://phabricator.wikimedia.org/T331801#8685569)

10 March 2023

  • curprev 22:4322:43, 10 March 2023imported>Stashbot 499,567 bytes +21,702 jhathaway@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • curprev 00:3300:33, 10 March 2023imported>Stashbot 477,865 bytes +75,245 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host sretest2002.codfw.wmnet with OS bullseye

9 March 2023

  • curprev 01:1201:12, 9 March 2023imported>Stashbot 402,620 bytes +115,579 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2167:3318 (T329260)', diff saved to https://phabricator.wikimedia.org/P45600 and previous config saved to /var/cache/conftool/dbconfig/20230309-011251-marostegui.json

8 March 2023

  • curprev 01:1601:16, 8 March 2023imported>Stashbot 287,041 bytes +135,771 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P45358 and previous config saved to /var/cache/conftool/dbconfig/20230308-011624-marostegui.json

7 March 2023

  • curprev 01:1101:11, 7 March 2023imported>Stashbot 151,270 bytes +84,939 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P45146 and previous config saved to /var/cache/conftool/dbconfig/20230307-011117-marostegui.json

4 March 2023

  • curprev 14:5614:56, 4 March 2023imported>Stashbot 66,331 bytes +1,014 andrew@deploy1002: Finished deploy [horizon/deploy@9d02cd6]: Updating member dashboard to reflect new role names -- T330759 (duration: 02m 17s)

3 March 2023

  • curprev 20:5820:58, 3 March 2023imported>Stashbot 65,317 bytes +20,951 inflatador: bking@cumin2002 persistently unban all elastic nodes in eqiad T322082
  • curprev 01:1201:12, 3 March 2023imported>Stashbot 44,366 bytes +15,955 mutante: releases1002: deleting /usr/local/sbin/sync-srv-org-wikimedia-reprepro-releases1002.eqiad.wmnet which confusingly contains an rsync command to rsync from releases1001 which does not exist anymore T330960

2 March 2023

  • curprev 01:0801:08, 2 March 2023imported>Stashbot 28,411 bytes −866,644 mutante: releases2002 - stopping apache2 to test alerting (active server is 1002 but should be switched) T327975 T330960

1 March 2023

  • curprev 00:2500:25, 1 March 2023imported>Stashbot 895,055 bytes +37,440 ejegg: civicrm rolled back from d199694e to ffc16d2d

27 February 2023

  • curprev 23:5423:54, 27 February 2023imported>Stashbot 857,615 bytes +59,717 bking@cumin1001: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0)

26 February 2023

  • curprev 02:0702:07, 26 February 2023imported>Stashbot 797,898 bytes +147 Amir1: foreachwikiindblist s5 maintenance/migrateExternallinks.php --batch-size=100 --sleep 1 (T326314)

25 February 2023

  • curprev 15:3015:30, 25 February 2023imported>Stashbot 797,751 bytes +618 apergos: resized lvm and filesystem for /data on dumpsdata1004,5,7; was <100G, now is 38T usable (left some room for growth later)

24 February 2023

  • curprev 23:1523:15, 24 February 2023imported>Stashbot 797,133 bytes +27,610 mutante: people2002 - for each user who has a public_html dir that is not empty (for pubdir in $(find . -name public_html -type d -not -empty); ..); rsync it from people1003 with --delete (rsync -avp rsync://people1003.eqiad.wmnet/people-home/${pubdiruser}/public_html/ /home/${pubdiruser}/public_html/); T330091
  • curprev 01:1301:13, 24 February 2023imported>Stashbot 769,523 bytes +31,636 pt1979@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs2017.codfw.wmnet with OS bullseye

23 February 2023

21 February 2023

18 February 2023

  • curprev 08:2908:29, 18 February 2023imported>Stashbot 652,075 bytes +417 elukey: kill leftover processes of user `mepps` (offboarded) from stat100[4,5] to unblock puppet

17 February 2023

  • curprev 22:4522:45, 17 February 2023imported>Stashbot 651,658 bytes +16,147 bking@cumin1001: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.RESTART (1 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic cluster restart - bking@cumin1001 - T329957
  • curprev 00:4700:47, 17 February 2023imported>Stashbot 635,511 bytes +15,836 eevans@cumin1001: START - Cookbook sre.cassandra.roll-restart for nodes matching restbase[2012,2015-2018,2020,2022,2023,2025-2027].codfw.wmnet: Restarting Cassandra to apply JVM 1.8.0_362 - eevans@cumin1001

15 February 2023

  • curprev 23:3023:30, 15 February 2023imported>Stashbot 619,675 bytes +27,557 dduvall@deploy1002: Synchronized php: group1 wikis to 1.40.0-wmf.23 refs T325586 (duration: 06m 43s)
  • curprev 01:1101:11, 15 February 2023imported>Stashbot 592,118 bytes +71,533 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T328255)', diff saved to https://phabricator.wikimedia.org/P44663 and previous config saved to /var/cache/conftool/dbconfig/20230215-011110-ladsgroup.json

14 February 2023

  • curprev 01:1701:17, 14 February 2023imported>Stashbot 520,585 bytes +90,483 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on clouddb[1014,1018,1021].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance

11 February 2023

  • curprev 01:5501:55, 11 February 2023imported>Stashbot 430,102 bytes +2,087 marostegui@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • curprev 01:1001:10, 11 February 2023imported>Stashbot 428,015 bytes +93,273 pt1979@cumin2002: START - Cookbook sre.hosts.provision for host mw2448.mgmt.codfw.wmnet with reboot policy FORCED

9 February 2023

8 February 2023

  • curprev 01:0701:07, 8 February 2023imported>Stashbot 172,368 bytes +39,780 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['mw2435']

6 February 2023

  • curprev 23:1723:17, 6 February 2023imported>Stashbot 132,588 bytes +48,055 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host mw2421.mgmt.codfw.wmnet with reboot policy FORCED

5 February 2023

  • curprev 22:2822:28, 5 February 2023imported>Stashbot 84,533 bytes +385 topranks: Re-enabling peering to Seabone/Telecom Italia AS 6762 on cr2-esams at AMS-IX

3 February 2023

  • curprev 21:0521:05, 3 February 2023imported>Stashbot 84,148 bytes +12,739 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • curprev 00:3500:35, 3 February 2023imported>Stashbot 71,409 bytes +23,889 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1080.eqiad.wmnet with OS bullseye

2 February 2023

  • curprev 01:2401:24, 2 February 2023imported>Stashbot 47,520 bytes −622,512 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp1075.eqiad.wmnet with reason: host reimage

1 February 2023

  • curprev 00:3800:38, 1 February 2023imported>Stashbot 670,032 bytes +34,224 brett@cumin2002: conftool action : set/pooled=yes; selector: name=cp3055.esams.wmnet

31 January 2023

  • curprev 00:5000:50, 31 January 2023imported>Stashbot 635,808 bytes +57,101 brett@cumin1001: conftool action : set/pooled=yes; selector: name=cp5027.eqsin.wmnet

29 January 2023

  • curprev 14:4614:46, 29 January 2023imported>Stashbot 578,707 bytes +458 ariel@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dumpsdata1002.eqiad.wmnet

28 January 2023

  • curprev 00:3600:36, 28 January 2023imported>Stashbot 578,249 bytes +18,571 brett@cumin1001: conftool action : set/pooled=yes; selector: name=cp4050.ulsfo.wmnet

27 January 2023

  • curprev 01:1601:16, 27 January 2023imported>Stashbot 559,678 bytes +62,951 eevans@cumin1001: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching cassandra-dev2*: Applying configuration change to cassandra-dev cluster - eevans@cumin1001

26 January 2023

  • curprev 01:2401:24, 26 January 2023imported>Stashbot 496,727 bytes +35,241 ejegg: payments-wiki upgraded from 15395d05 to 08b8c3bc (upgraded from MW 1.35 to MW 1.39)

25 January 2023

  • curprev 01:1701:17, 25 January 2023imported>Stashbot 461,486 bytes +48,935 legoktm: adjusting Gerrit group "Campaigns Team" so it is not recursively a member of itself

24 January 2023

  • curprev 01:1601:16, 24 January 2023imported>Stashbot 412,551 bytes +38,290 eevans@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host restbase1031.eqiad.wmnet

20 January 2023

18 January 2023

  • curprev 23:4723:47, 18 January 2023imported>Stashbot 352,750 bytes +20,082 zabe: run populateCulComment.php on all group0 and group1 wikis # T327290
  • curprev 01:1301:13, 18 January 2023imported>Stashbot 332,668 bytes +24,670 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp2031.codfw.wmnet

16 January 2023

14 January 2023

13 January 2023

  • curprev 23:3923:39, 13 January 2023imported>Stashbot 300,244 bytes +10,059 mutante: people2002 - systemctl reset-failed after removing auto_restart_rsync timers
  • curprev 01:2601:26, 13 January 2023imported>Stashbot 290,185 bytes +53,216 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=False) upgrade firmware for hosts ['an-mariadb1002']

12 January 2023

  • curprev 01:1501:15, 12 January 2023imported>Stashbot 236,969 bytes +44,353 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T321391)', diff saved to https://phabricator.wikimedia.org/P43056 and previous config saved to /var/cache/conftool/dbconfig/20230112-011526-marostegui.json

10 January 2023

  • curprev 23:5823:58, 10 January 2023imported>Stashbot 192,616 bytes +46,411 krinkle@deploy1002: Finished deploy [integration/docroot@b7c82a3]: (no justification provided) (duration: 00m 15s)
  • curprev 00:4800:48, 10 January 2023imported>Stashbot 146,205 bytes +26,510 bking@cumin1001: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) Operation.UPGRADE (3 nodes at a time) for ElasticSearch cluster search_codfw: plugin upgrade - bking@cumin1001 - T324247

6 January 2023

4 January 2023

3 January 2023

2 January 2023

  • curprev 10:0410:04, 2 January 2023imported>Stashbot 345 bytes +229 jelto@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host otrs1001.eqiad.wmnet

1 January 2023

31 December 2022

  • curprev 19:1119:11, 31 December 2022imported>Stashbot 552,815 bytes +154 AndyRussG: payments-wiki upgraded c212825e -> f02e3585, config c1c4a9f6 -> 8103bce6

30 December 2022

  • curprev 21:3621:36, 30 December 2022imported>Stashbot 552,661 bytes +126 dcausse: restarting blazegraph on wdqs1006 and wdqs1013 (BlazegraphFreeAllocatorsDecreasingRapidly)

29 December 2022

  • curprev 23:2623:26, 29 December 2022imported>Stashbot 552,535 bytes +629 ryankemper@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99)

22 December 2022

  • curprev 18:2718:27, 22 December 2022imported>Stashbot 551,906 bytes +7,038 btullis@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kafka-jumbo1015.eqiad.wmnet with OS bullseye

21 December 2022

  • curprev 23:4123:41, 21 December 2022imported>Stashbot 544,868 bytes +1,082 ejegg: civicrm upgraded from d80f9550 to e3405a4e
  • curprev 00:1000:10, 21 December 2022imported>Stashbot 543,786 bytes +7,035 eileen: + fc0536195a12df9bc7a896f77db6b2c8e609352e Check for correct fin type name in hasEndowment

20 December 2022

18 December 2022

  • curprev 19:4019:40, 18 December 2022imported>Stashbot 520,139 bytes +785 sukhe: ran sudo cumin -b 1 -s 30 'A:mw-api and A:eqiad' 'restart-php7.4-fpm' [at 18:55 UTC]: T325477

17 December 2022

  • curprev 14:3614:36, 17 December 2022imported>Stashbot 519,354 bytes +264 aikochou@deploy1002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .

16 December 2022

  • curprev 19:5519:55, 16 December 2022imported>Stashbot 519,090 bytes +5,844 robh@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti4007.ulsfo.wmnet with OS bullseye
  • curprev 00:5100:51, 16 December 2022imported>Stashbot 513,246 bytes +24,332 mutante: puppetmasters - merged gerrit:868481 to "revert" gerrit:866644,ran puppet and 'systemctl reset-failed' via cumin on 10 masters, resolved monitoring alerts

15 December 2022

  • curprev 00:5800:58, 15 December 2022imported>Stashbot 488,914 bytes +47,486 cwhite@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on logstash2026.codfw.wmnet with reason: host reimage

14 December 2022

  • curprev 01:2201:22, 14 December 2022imported>Stashbot 441,428 bytes +24,146 cwhite@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host logstash2037.codfw.wmnet with OS bullseye

12 December 2022

10 December 2022

  • curprev 03:4603:46, 10 December 2022imported>Stashbot 402,542 bytes +1,087 ryankemper@cumin2002: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.UPGRADE (3 nodes at a time) for ElasticSearch cluster search_codfw: search_codfw elasticsearch and plugin upgrade - ryankemper@cumin2002

9 December 2022

  • curprev 23:5923:59, 9 December 2022imported>Stashbot 401,455 bytes +16,799 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99)
  • curprev 01:1101:11, 9 December 2022imported>Stashbot 384,656 bytes +64,721 eevans@cumin1001: START - Cookbook sre.hosts.reimage for host cassandra-dev2002.codfw.wmnet with OS buster

8 December 2022

  • curprev 01:0501:05, 8 December 2022imported>Stashbot 319,935 bytes +71,471 bblack: lvsNNNN: restart pybal to apply etcd key changes on all "high-traffic1" lvs at all sites - T324336

7 December 2022

  • curprev 00:2100:21, 7 December 2022imported>Stashbot 248,464 bytes +42,999 cmjohnson@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host kubernetes1024.eqiad.wmnet with OS bullseye

6 December 2022

  • curprev 01:2501:25, 6 December 2022imported>Stashbot 205,465 bytes +81,524 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P42379 and previous config saved to /var/cache/conftool/dbconfig/20221206-012539-ladsgroup.json

4 December 2022

  • curprev 04:1904:19, 4 December 2022imported>Stashbot 123,941 bytes +177 TheresNoTime: T302486 : `[samtar@mwmaint1002 ~]$ mwscript maintenance/fixMergeHistoryCorruption.php --wiki enwiki --dry-run --ns 828`

3 December 2022

  • curprev 00:1700:17, 3 December 2022imported>Stashbot 123,764 bytes +12,040 cwhite: draining shards from logstash1010, logstash1033, logstash1034, logstash1035 - T321410

2 December 2022

  • curprev 00:0900:09, 2 December 2022imported>Stashbot 111,724 bytes −1,825,217 rzl@cumin1001: conftool action : set/pooled=no; selector: name=mw14(45|46).eqiad.wmnet,cluster=jobrunner

30 November 2022

  • curprev 01:2201:22, 30 November 2022imported>Stashbot 1,936,941 bytes +129,140 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2176 (T321126)', diff saved to https://phabricator.wikimedia.org/P41834 and previous config saved to /var/cache/conftool/dbconfig/20221130-012218-marostegui.json

29 November 2022

  • curprev 01:1701:17, 29 November 2022imported>Stashbot 1,807,801 bytes +121,792 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T322618)', diff saved to https://phabricator.wikimedia.org/P41530 and previous config saved to /var/cache/conftool/dbconfig/20221129-011707-ladsgroup.json

27 November 2022

  • curprev 03:0103:01, 27 November 2022imported>Stashbot 1,686,009 bytes +943 ladsgroup@cumin1001: dbctl commit (dc=all): 'db2105 (re)pooling @ 100%: Maint', diff saved to https://phabricator.wikimedia.org/P41257 and previous config saved to /var/cache/conftool/dbconfig/20221127-030126-ladsgroup.json

26 November 2022

  • curprev 21:3421:34, 26 November 2022imported>Stashbot 1,685,066 bytes +3,790 urandom: initiating Cassandra bootstrap, aqs1021-b -- T307802
  • curprev 01:1601:16, 26 November 2022imported>Stashbot 1,681,276 bytes +58,470 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P41244 and previous config saved to /var/cache/conftool/dbconfig/20221126-011647-ladsgroup.json

25 November 2022

  • curprev 01:1801:18, 25 November 2022imported>Stashbot 1,622,806 bytes +76,429 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P41071 and previous config saved to /var/cache/conftool/dbconfig/20221125-011818-marostegui.json

24 November 2022

  • curprev 01:2601:26, 24 November 2022imported>Stashbot 1,546,377 bytes +91,450 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317 (T321126)', diff saved to https://phabricator.wikimedia.org/P40894 and previous config saved to /var/cache/conftool/dbconfig/20221124-012652-marostegui.json

23 November 2022

  • curprev 01:1601:16, 23 November 2022imported>Stashbot 1,454,927 bytes +128,094 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp2041.codfw.wmnet with OS bullseye

22 November 2022

  • curprev 01:1401:14, 22 November 2022imported>Stashbot 1,326,833 bytes +77,259 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T323214)', diff saved to https://phabricator.wikimedia.org/P40411 and previous config saved to /var/cache/conftool/dbconfig/20221122-011404-ladsgroup.json

21 November 2022

  • curprev 01:0801:08, 21 November 2022imported>Stashbot 1,249,574 bytes +1,268 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp5029.eqsin.wmnet with OS buster

19 November 2022

  • curprev 22:5122:51, 19 November 2022imported>Stashbot 1,248,306 bytes +1,417 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5020.eqsin.wmnet with OS buster
  • curprev 01:2401:24, 19 November 2022imported>Stashbot 1,246,889 bytes +59,627 sukhe@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5018.eqsin.wmnet with reason: host reimage

18 November 2022

  • curprev 01:2601:26, 18 November 2022imported>Stashbot 1,187,262 bytes +56,176 pt1979@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['db2173']

17 November 2022

  • curprev 00:5900:59, 17 November 2022imported>Stashbot 1,131,086 bytes +67,747 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2125 (T323214)', diff saved to https://phabricator.wikimedia.org/P40031 and previous config saved to /var/cache/conftool/dbconfig/20221117-005929-ladsgroup.json

16 November 2022

  • curprev 01:1501:15, 16 November 2022imported>Stashbot 1,063,339 bytes +102,599 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp2042.codfw.wmnet with OS bullseye

15 November 2022

  • curprev 01:0701:07, 15 November 2022imported>Stashbot 960,740 bytes +99,923 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2136', diff saved to https://phabricator.wikimedia.org/P39630 and previous config saved to /var/cache/conftool/dbconfig/20221115-010745-marostegui.json

12 November 2022

  • curprev 23:3423:34, 12 November 2022imported>Stashbot 860,817 bytes +12,581 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T318605)', diff saved to https://phabricator.wikimedia.org/P39371 and previous config saved to /var/cache/conftool/dbconfig/20221112-233420-ladsgroup.json
  • curprev 01:2101:21, 12 November 2022imported>Stashbot 848,236 bytes +62,212 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317', diff saved to https://phabricator.wikimedia.org/P39331 and previous config saved to /var/cache/conftool/dbconfig/20221112-012122-marostegui.json

11 November 2022

  • curprev 01:2201:22, 11 November 2022imported>Stashbot 786,024 bytes +125,544 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2168:3317', diff saved to https://phabricator.wikimedia.org/P39163 and previous config saved to /var/cache/conftool/dbconfig/20221111-012250-ladsgroup.json

10 November 2022

8 November 2022

  • curprev 22:0022:00, 8 November 2022imported>Stashbot 604,525 bytes +104,241 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
  • curprev 01:2501:25, 8 November 2022imported>Stashbot 500,284 bytes +154,243 cmjohnson@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)

6 November 2022

5 November 2022

  • curprev 12:5612:56, 5 November 2022imported>Stashbot 345,478 bytes +537 mfossati@deploy1002: Finished deploy [airflow-dags/platform_eng@c849762]: (no justification provided) (duration: 00m 49s)

4 November 2022

  • curprev 18:3118:31, 4 November 2022imported>Stashbot 344,941 bytes +23,994 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4052.ulsfo.wmnet with OS buster

3 November 2022

  • curprev 22:4522:45, 3 November 2022imported>Stashbot 320,947 bytes +90,964 andrew@deploy1002: Finished deploy [horizon/deploy@9d02cd6]: (no justification provided) (duration: 01m 00s)

2 November 2022

  • curprev 23:2523:25, 2 November 2022imported>Stashbot 229,983 bytes +129,100 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T318605)', diff saved to https://phabricator.wikimedia.org/P37874 and previous config saved to /var/cache/conftool/dbconfig/20221102-232540-ladsgroup.json
  • curprev 01:1901:19, 2 November 2022imported>Stashbot 100,883 bytes −1,663,965 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P37541 and previous config saved to /var/cache/conftool/dbconfig/20221102-011937-ladsgroup.json

31 October 2022

  • curprev 22:2322:23, 31 October 2022imported>Stashbot 1,764,848 bytes +82,444 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance

29 October 2022

28 October 2022

  • curprev 20:4220:42, 28 October 2022imported>Stashbot 1,681,955 bytes +34,207 mutante: clouddumps* - deployed gerrit:848444 - as kind of expected it fails - most likely the project dirs are not automatically created before rsync runs the first time - T57503
  • curprev 01:1501:15, 28 October 2022imported>Stashbot 1,647,748 bytes +124,757 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135 (T318950)', diff saved to https://phabricator.wikimedia.org/P36942 and previous config saved to /var/cache/conftool/dbconfig/20221028-011505-ladsgroup.json

27 October 2022

  • curprev 01:1601:16, 27 October 2022imported>Stashbot 1,522,991 bytes +90,843 sukhe@cumin2002: START - Cookbook sre.hosts.reimage for host cp4043.ulsfo.wmnet with OS buster

25 October 2022

  • curprev 22:3322:33, 25 October 2022imported>Stashbot 1,432,148 bytes +92,794 robh@cumin2002: START - Cookbook sre.hosts.provision for host cp4052.mgmt.ulsfo.wmnet with reboot policy FORCED
  • curprev 01:0901:09, 25 October 2022imported>Stashbot 1,339,354 bytes +82,471 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2169:3317 (T321312)', diff saved to https://phabricator.wikimedia.org/P36164 and previous config saved to /var/cache/conftool/dbconfig/20221025-010943-ladsgroup.json

22 October 2022

  • curprev 03:1603:16, 22 October 2022imported>Stashbot 1,256,883 bytes +83 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-reload (exit_code=99)
  • curprev 00:0300:03, 22 October 2022imported>Stashbot 1,256,800 bytes +69,045 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1005.eqiad.wmnet with reason: Maintenance

21 October 2022

  • curprev 01:1401:14, 21 October 2022imported>Stashbot 1,187,755 bytes +85,034 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2138:3314 (T321312)', diff saved to https://phabricator.wikimedia.org/P35798 and previous config saved to /var/cache/conftool/dbconfig/20221021-011452-ladsgroup.json

19 October 2022

  • curprev 23:3323:33, 19 October 2022imported>Stashbot 1,102,721 bytes +35,546 wfan: civicrm upgraded from 477323fe to c96dd3ae
  • curprev 01:2001:20, 19 October 2022imported>Stashbot 1,067,175 bytes +11,210 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-presto1011.eqiad.wmnet with reason: host reimage

17 October 2022

  • curprev 23:1623:16, 17 October 2022imported>Stashbot 1,055,965 bytes +26,027 bblack@puppetmaster2001: conftool action : set/pooled=yes; selector: service=git-ssh

15 October 2022

  • curprev 23:2723:27, 15 October 2022imported>Stashbot 1,029,938 bytes +932 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depool db1131 T320879', diff saved to https://phabricator.wikimedia.org/P35497 and previous config saved to /var/cache/conftool/dbconfig/20221015-232716-ladsgroup.json

14 October 2022

  • curprev 22:5622:56, 14 October 2022imported>Stashbot 1,029,006 bytes +9,465 mutante: pcc-worker1003.puppet-diffs.eqiad1.wikimedia.cloud - out of disk space again - deleted 3.5GB job "1460" to unblock puppet compiling
  • curprev 01:2601:26, 14 October 2022imported>Stashbot 1,019,541 bytes +26,484 tstarling@deploy1002: Synchronized wmf-config/CommonSettings.php: (no justification provided) (duration: 03m 36s)

13 October 2022

  • curprev 00:5800:58, 13 October 2022imported>Stashbot 993,057 bytes +28,159 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4045.ulsfo.wmnet with OS buster

11 October 2022

  • curprev 21:3621:36, 11 October 2022imported>Stashbot 964,898 bytes +24,881 mutante: phab1001 / phab2001 - temp. disabled puppet; stopped ssh-phab service; scheduled icinga downtimes for ssh-phab pybal backend alerts - effectively "soft shutting down" the service - T296022

10 October 2022

  • curprev 21:1921:19, 10 October 2022imported>Stashbot 940,017 bytes +15,468 robh@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dns4004.wikimedia.org with OS bullseye

8 October 2022

  • curprev 06:5606:56, 8 October 2022imported>Stashbot 924,549 bytes +110 hashar: Restarting Gerrit to fix up replication to GitHub - T320305

7 October 2022

  • curprev 21:2921:29, 7 October 2022imported>Stashbot 924,439 bytes +12,004 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: debugging

6 October 2022

  • curprev 21:1321:13, 6 October 2022imported>Stashbot 912,435 bytes +26,746 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • curprev 01:1201:12, 6 October 2022imported>Stashbot 885,689 bytes +23,589 reedy@deploy1002: Finished deploy [integration/docroot@dc380cb]: Update jQuery (duration: 00m 11s)

5 October 2022

  • curprev 00:0500:05, 5 October 2022imported>Stashbot 862,100 bytes +37,349 sukhe: disable puppet on dns4003 till we resolve the puppet failures

3 October 2022

  • curprev 21:4521:45, 3 October 2022imported>Stashbot 824,751 bytes +32,264 robh@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)

2 October 2022

  • curprev 08:1308:13, 2 October 2022imported>Stashbot 792,487 bytes +109 elukey: `apt-get clean` on an-airflow1001 to free some space on the root partition

1 October 2022

  • curprev 13:2413:24, 1 October 2022imported>Stashbot 792,378 bytes +460 fab@deploy1002: Finished deploy [airflow-dags/research@44a1158]: (no justification provided) (duration: 00m 08s)

30 September 2022

  • curprev 23:2623:26, 30 September 2022imported>Stashbot 791,918 bytes +15,804 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1003.eqiad.wmnet with reason: Maintenance
  • curprev 00:3100:31, 30 September 2022imported>Stashbot 776,114 bytes +53,878 robh@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4045.ulsfo.wmnet with OS bullseye

29 September 2022

  • curprev 01:0101:01, 29 September 2022imported>Stashbot 722,236 bytes +72,509 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host logstash2037.codfw.wmnet with OS buster

28 September 2022

  • curprev 01:2201:22, 28 September 2022imported>Stashbot 649,727 bytes +31,209 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2145 (T314041)', diff saved to https://phabricator.wikimedia.org/P34972 and previous config saved to /var/cache/conftool/dbconfig/20220928-012205-ladsgroup.json

27 September 2022

25 September 2022

  • curprev 17:2917:29, 25 September 2022imported>Stashbot 587,776 bytes +2,487 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1053.eqiad.wmnet with OS bullseye

23 September 2022

  • curprev 19:1019:10, 23 September 2022imported>Stashbot 585,289 bytes +2,764 mforns@deploy1002: Finished deploy [airflow-dags/analytics@4c973d6]: (no justification provided) (duration: 00m 12s)

22 September 2022

  • curprev 22:2022:20, 22 September 2022imported>Stashbot 582,525 bytes +8,984 joal@deploy1002: Finished deploy [airflow-dags/analytics@901f810]: (no justification provided) (duration: 00m 11s)

21 September 2022

  • curprev 20:5120:51, 21 September 2022imported>Stashbot 573,541 bytes +10,680 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply

20 September 2022

19 September 2022

  • curprev 22:5922:59, 19 September 2022imported>Stashbot 549,728 bytes +6,720 ebernhardson: T317200 start cirrussearch in-place reindex process for eqiad, codfw and cloudelastic

17 September 2022

16 September 2022

  • curprev 21:2921:29, 16 September 2022imported>Stashbot 538,373 bytes +24,573 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • curprev 00:1400:14, 16 September 2022imported>Stashbot 513,800 bytes +30,264 bmansurov@deploy1002: Finished deploy [airflow-dags/research@b9be20d]: (no justification provided) (duration: 00m 17s)

14 September 2022

  • curprev 22:0822:08, 14 September 2022imported>Stashbot 483,536 bytes +39,587 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1190 (T314041)', diff saved to https://phabricator.wikimedia.org/P34739 and previous config saved to /var/cache/conftool/dbconfig/20220914-220822-ladsgroup.json
  • curprev 01:1601:16, 14 September 2022imported>Stashbot 443,949 bytes +58,148 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P34670 and previous config saved to /var/cache/conftool/dbconfig/20220914-011637-ladsgroup.json

13 September 2022

  • curprev 00:5000:50, 13 September 2022imported>Stashbot 385,801 bytes +62,019 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on phab2001.codfw.wmnet with reason: syntax error in sudo

12 September 2022

  • curprev 01:2101:21, 12 September 2022imported>Stashbot 323,782 bytes +10,360 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2179 (T314041)', diff saved to https://phabricator.wikimedia.org/P34424 and previous config saved to /var/cache/conftool/dbconfig/20220912-012118-ladsgroup.json

10 September 2022

  • curprev 21:3321:33, 10 September 2022imported>Stashbot 313,422 bytes +11,937 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2146 (T312863)', diff saved to https://phabricator.wikimedia.org/P34396 and previous config saved to /var/cache/conftool/dbconfig/20220910-213300-ladsgroup.json
  • curprev 00:5000:50, 10 September 2022imported>Stashbot 301,485 bytes +17,350 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db2137:3314 (T314041)', diff saved to https://phabricator.wikimedia.org/P34358 and previous config saved to /var/cache/conftool/dbconfig/20220910-005046-ladsgroup.json

9 September 2022

  • curprev 00:0900:09, 9 September 2022imported>Stashbot 284,135 bytes +69,901 bmansurov@deploy1002: Finished deploy [airflow-dags/research@b9be20d]: (no justification provided) (duration: 00m 22s)

8 September 2022

6 September 2022

  • curprev 23:3823:38, 6 September 2022imported>Stashbot 155,293 bytes +69,531 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2114 (T314041)', diff saved to https://phabricator.wikimedia.org/P33981 and previous config saved to /var/cache/conftool/dbconfig/20220906-233809-ladsgroup.json
  • curprev 01:0301:03, 6 September 2022imported>Stashbot 85,762 bytes +31,159 TimStarling: multi-DC stage 3: 2% of codfw/ulsfo/eqsin traffic going to codfw appservers, rolling out via puppet 00:54-01:24

5 September 2022

  • curprev 00:3600:36, 5 September 2022imported>Stashbot 54,603 bytes −976,269 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1107.eqiad.wmnet with reason: Maintenance

3 September 2022

  • curprev 23:5023:50, 3 September 2022imported>Stashbot 1,030,872 bytes +5,252 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106 (T312863)', diff saved to https://phabricator.wikimedia.org/P33759 and previous config saved to /var/cache/conftool/dbconfig/20220903-235001-ladsgroup.json

2 September 2022

  • curprev 19:0319:03, 2 September 2022imported>Stashbot 1,025,620 bytes +16,632 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply

1 September 2022

  • curprev 20:5120:51, 1 September 2022imported>Stashbot 1,008,988 bytes +25,876 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • curprev 01:2001:20, 1 September 2022imported>Stashbot 983,112 bytes +32,085 eevans@cumin1001: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching restbase201[3-8].codfw.wmnet: Restart to apply new certificates (T316697) - eevans@cumin1001

31 August 2022

  • curprev 00:1500:15, 31 August 2022imported>Stashbot 951,027 bytes +47,602 ryankemper@cumin2002: START - Cookbook sre.elasticsearch.rolling-operation Operation.UPGRADE (3 nodes at a time) for ElasticSearch cluster search_codfw: codfw es7 cluster upgrade - ryankemper@cumin2002 - T316719

30 August 2022

  • curprev 01:0401:04, 30 August 2022imported>Stashbot 903,425 bytes +47,117 TimStarling: setting scaling_governor=performance on all mediawiki servers, via puppet gerrit 826405

28 August 2022

  • curprev 21:0321:03, 28 August 2022imported>Stashbot 856,308 bytes +47,266 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129', diff saved to https://phabricator.wikimedia.org/P33561 and previous config saved to /var/cache/conftool/dbconfig/20220828-210336-ladsgroup.json
  • curprev 01:0501:05, 28 August 2022imported>Stashbot 809,042 bytes +25,117 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P33423 and previous config saved to /var/cache/conftool/dbconfig/20220828-010522-ladsgroup.json

27 August 2022

  • curprev 01:0301:03, 27 August 2022imported>Stashbot 783,925 bytes +51,766 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T316186)', diff saved to https://phabricator.wikimedia.org/P33346 and previous config saved to /var/cache/conftool/dbconfig/20220827-010313-ladsgroup.json

26 August 2022

  • curprev 00:3800:38, 26 August 2022imported>Stashbot 732,159 bytes +75,646 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2131 (T312160)', diff saved to https://phabricator.wikimedia.org/P33172 and previous config saved to /var/cache/conftool/dbconfig/20220826-003819-ladsgroup.json

25 August 2022

  • curprev 01:2501:25, 25 August 2022imported>Stashbot 656,513 bytes +49,388 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P32985 and previous config saved to /var/cache/conftool/dbconfig/20220825-012538-ladsgroup.json

23 August 2022

  • curprev 22:3122:31, 23 August 2022imported>Stashbot 607,125 bytes +44,996 mutante: mwmaint1002 - find /var/lib/puppet/clientbucket -type f -size +100M -delete
  • curprev 01:1101:11, 23 August 2022imported>Stashbot 562,129 bytes +42,364 TimStarling: on mw1411, mw1413, mw1419, mw1429, mw1431, mw1433: set scaling_governor to powersave and energy_performance_preference to performance

22 August 2022

  • curprev 00:3800:38, 22 August 2022imported>Stashbot 519,765 bytes +4,589 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply

20 August 2022

  • curprev 22:1822:18, 20 August 2022imported>Stashbot 515,176 bytes +5,114 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3312 (T314041)', diff saved to https://phabricator.wikimedia.org/P32637 and previous config saved to /var/cache/conftool/dbconfig/20220820-221826-ladsgroup.json
  • curprev 01:2601:26, 20 August 2022imported>Stashbot 510,062 bytes +33,432 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1129 (T314041)', diff saved to https://phabricator.wikimedia.org/P32622 and previous config saved to /var/cache/conftool/dbconfig/20220820-012602-ladsgroup.json

18 August 2022

  • curprev 23:3323:33, 18 August 2022imported>Stashbot 476,630 bytes +51,439 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • curprev 00:4900:49, 18 August 2022imported>Stashbot 425,191 bytes +51,429 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kubernetes2023.codfw.wmnet']

17 August 2022

  • curprev 01:2301:23, 17 August 2022imported>Stashbot 373,762 bytes +37,026 pt1979@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['kafka-logging2005']

16 August 2022

  • curprev 00:1800:18, 16 August 2022imported>Stashbot 336,736 bytes +19,280 tstarling@deploy1002: Synchronized wmf-config/InitialiseSettings.php: replaceableSettings g 820247 (duration: 03m 18s)

14 August 2022

  • curprev 08:5408:54, 14 August 2022imported>Stashbot 317,456 bytes +542 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1141 (T312863)', diff saved to https://phabricator.wikimedia.org/P32380 and previous config saved to /var/cache/conftool/dbconfig/20220814-085443-ladsgroup.json

13 August 2022

  • curprev 13:3713:37, 13 August 2022imported>Stashbot 316,914 bytes +1,309 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db1145.eqiad.wmnet with reason: Maintenance

12 August 2022

  • curprev 23:4123:41, 12 August 2022imported>Stashbot 315,605 bytes +10,763 mutante: wikistats-bullseye:~$ /usr/lib/wikistats/update.php wp prefix blk ; /usr/lib/wikistats/update.php wp prefix kcg T315121
  • curprev 01:0301:03, 12 August 2022imported>Stashbot 304,842 bytes +25,321 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1121 (T312863)', diff saved to https://phabricator.wikimedia.org/P32369 and previous config saved to /var/cache/conftool/dbconfig/20220812-010312-ladsgroup.json

11 August 2022

  • curprev 00:5800:58, 11 August 2022imported>Stashbot 279,521 bytes +35,722 sukhe@puppetmaster1001: conftool action : set/pooled=no; selector: name=cp2042.codfw.wmnet,service=varnish-fe

9 August 2022

  • curprev 23:1723:17, 9 August 2022imported>Stashbot 243,799 bytes +16,655 bking@cumin1001: conftool action : set/weight=10:pooled=yes; selector: name=wdqs1011.eqiad.wmnet

8 August 2022

  • curprev 23:5223:52, 8 August 2022imported>Stashbot 227,144 bytes +14,190 tstarling@deploy1002: Synchronized wmf-config/InitialiseSettings.php: clean up testwiki experiments T314750 (duration: 03m 19s)

7 August 2022

  • curprev 19:5819:58, 7 August 2022imported>Stashbot 212,954 bytes +3,230 taavi: taavi@mwmaint1002 ~ $ echo "https://upload.wikimedia.org/wikipedia/commons/1/15/Keep_tidy_ask.svg" | mwscript purgeList.php --wiki enwiki # T314712

6 August 2022

  • curprev 17:5917:59, 6 August 2022imported>Stashbot 209,724 bytes +2,395 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1149 (T312863)', diff saved to https://phabricator.wikimedia.org/P32295 and previous config saved to /var/cache/conftool/dbconfig/20220806-175916-ladsgroup.json

5 August 2022

  • curprev 22:2022:20, 5 August 2022imported>Stashbot 207,329 bytes +9,421 dcausse@deploy1002: Finished deploy [wikimedia/discovery/analytics@71fe016]: Fix schedule_interval for image_recommendation_weekly (duration: 02m 01s)
  • curprev 00:5300:53, 5 August 2022imported>Stashbot 197,908 bytes +49,951 dzahn@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8 days, 0:00:00 on gerrit2001.wikimedia.org with reason: decom, replaced by gerrit2002

4 August 2022

  • curprev 01:2301:23, 4 August 2022imported>Stashbot 147,957 bytes +66,383 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1146:3312 (T312972)', diff saved to https://phabricator.wikimedia.org/P32278 and previous config saved to /var/cache/conftool/dbconfig/20220804-012341-marostegui.json

2 August 2022

  • curprev 22:3922:39, 2 August 2022imported>Stashbot 81,574 bytes +58,081 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • curprev 00:4100:41, 2 August 2022imported>Stashbot 23,493 bytes −741,414 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply

1 August 2022

  • curprev 01:0001:00, 1 August 2022imported>Stashbot 764,907 bytes +3,017 krinkle@deploy1002: Synchronized multiversion/: Ic0dbcba9f60f20a (duration: 03m 31s)

30 July 2022

  • curprev 01:4401:44, 30 July 2022imported>Stashbot 761,890 bytes +392 bking@cumin1001: END (FAIL) - Cookbook sre.elasticsearch.rolling-operation (exit_code=99) Operation.REIMAGE (1 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster reimage (bullseye upgrade) - bking@cumin1001 - T289135
  • curprev 00:5500:55, 30 July 2022imported>Stashbot 761,498 bytes +10,804 bking@cumin1001: START - Cookbook sre.hosts.reimage for host elastic2028.codfw.wmnet with OS bullseye

29 July 2022

  • curprev 00:4800:48, 29 July 2022imported>Stashbot 750,694 bytes +35,853 TimStarling: slowly restarting (with batch 1 sleep 5) trafficserver on text caches to fully deploy g 817086 T313578

28 July 2022

  • curprev 01:2601:26, 28 July 2022imported>Stashbot 714,841 bytes +36,833 mwdebug-deploy@deploy1002: helmfile [codfw] START helmfile.d/services/mwdebug: apply

26 July 2022

  • curprev 23:5923:59, 26 July 2022imported>Stashbot 678,008 bytes +28,393 tzatziki: removing one file for legal compliance
  • curprev 00:1100:11, 26 July 2022imported>Stashbot 649,615 bytes +31,213 TimStarling: restarted php7.2-fpm on the 9 canary hosts in eqiad T313770

24 July 2022

  • curprev 20:5420:54, 24 July 2022imported>Stashbot 618,402 bytes +4,271 btullis@cumin1001: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM archiva1002.wikimedia.org
  • curprev 00:3700:37, 24 July 2022imported>Stashbot 614,131 bytes +18,168 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1158 (T312863)', diff saved to https://phabricator.wikimedia.org/P31802 and previous config saved to /var/cache/conftool/dbconfig/20220724-003718-ladsgroup.json

23 July 2022

  • curprev 01:3701:37, 23 July 2022imported>Stashbot 595,963 bytes +27,357 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1170:3317', diff saved to https://phabricator.wikimedia.org/P31750 and previous config saved to /var/cache/conftool/dbconfig/20220723-013755-ladsgroup.json

22 July 2022

  • curprev 00:4400:44, 22 July 2022imported>Stashbot 568,606 bytes +68,956 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply

21 July 2022

  • curprev 00:4400:44, 21 July 2022imported>Stashbot 499,650 bytes +51,704 bking@cumin1001: END (ERROR) - Cookbook sre.elasticsearch.rolling-operation (exit_code=97) Operation.REIMAGE (1 nodes at a time) for ElasticSearch cluster search_codfw: codfw cluster reimage (bullseye upgrade) - bking@cumin1001 - T289135

20 July 2022

  • curprev 01:2701:27, 20 July 2022imported>Stashbot 447,946 bytes +56,169 bking@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on elastic2052.codfw.wmnet with reason: host reimage

18 July 2022

  • curprev 23:5823:58, 18 July 2022imported>Stashbot 391,777 bytes +61,137 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudvirt1050.eqiad.wmnet

17 July 2022

  • curprev 18:0518:05, 17 July 2022imported>Stashbot 330,640 bytes +10,758 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T312984)', diff saved to https://phabricator.wikimedia.org/P31256 and previous config saved to /var/cache/conftool/dbconfig/20220717-180539-ladsgroup.json
  • curprev 00:4800:48, 17 July 2022imported>Stashbot 319,882 bytes +13,275 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1101:3317', diff saved to https://phabricator.wikimedia.org/P31225 and previous config saved to /var/cache/conftool/dbconfig/20220717-004804-ladsgroup.json

16 July 2022

  • curprev 00:4700:47, 16 July 2022imported>Stashbot 306,607 bytes +27,543 ryankemper@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2064.codfw.wmnet with OS bullseye

15 July 2022

  • curprev 00:3000:30, 15 July 2022imported>Stashbot 279,064 bytes +31,156 TimStarling: on ms-fe1010 restarting swift-proxy

14 July 2022

  • curprev 00:4400:44, 14 July 2022imported>Stashbot 247,908 bytes +16,545 krinkle@deploy1002: Synchronized php-1.39.0-wmf.19/includes/ResourceLoader/: Ie11bdfdcf5e6724 (duration: 02m 55s)

12 July 2022

  • curprev 22:3222:32, 12 July 2022imported>Stashbot 231,363 bytes +15,830 bking@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2039.codfw.wmnet with OS bullseye
  • curprev 00:1000:10, 12 July 2022imported>Stashbot 215,533 bytes +11,642 ejegg: updated payments-wiki from 53a7b7bd to 2f95d8b4

11 July 2022

  • curprev 00:2300:23, 11 July 2022imported>Stashbot 203,891 bytes +379 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply

9 July 2022

  • curprev 13:3413:34, 9 July 2022imported>Stashbot 203,512 bytes +504 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • curprev 01:4401:44, 9 July 2022imported>Stashbot 203,008 bytes +12,736 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply

8 July 2022

  • curprev 00:0200:02, 8 July 2022imported>Stashbot 190,272 bytes +47,463 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2181.codfw.wmnet with OS bullseye

7 July 2022

  • curprev 00:5800:58, 7 July 2022imported>Stashbot 142,809 bytes +50,193 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply

5 July 2022

  • curprev 23:3023:30, 5 July 2022imported>Stashbot 92,616 bytes +32,932 ebernhardson: start restore of commonswiki_file from thanos-swift to cloudelastic

4 July 2022

  • curprev 20:0920:09, 4 July 2022imported>Stashbot 59,684 bytes +22,620 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cloudcontrol1004.wikimedia.org

3 July 2022

  • curprev 11:3611:36, 3 July 2022imported>Stashbot 37,064 bytes +255 _joe_: temporarily raised replicas for shellbox to 24

2 July 2022

  • curprev 05:3605:36, 2 July 2022imported>Stashbot 36,809 bytes +2,607 bmansurov@deploy1002: Finished deploy [airflow-dags/research@b3fe77c]: (no justification provided) (duration: 00m 09s)
  • curprev 00:4500:45, 2 July 2022imported>Stashbot 34,202 bytes +30,284 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance

1 July 2022

  • curprev 01:3901:39, 1 July 2022imported>Stashbot 3,918 bytes −776,405 krinkle@deploy1002: Synchronized tests/: I60edfb0f60 (1/3) (duration: 03m 32s)

30 June 2022

  • curprev 01:3601:36, 30 June 2022imported>Stashbot 780,323 bytes +52,007 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2158.codfw.wmnet with OS bullseye

29 June 2022

  • curprev 00:1800:18, 29 June 2022imported>Stashbot 728,316 bytes +67,167 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudgw2003-dev.codfw.wmnet with OS bullseye

27 June 2022

  • curprev 23:5123:51, 27 June 2022imported>Stashbot 661,149 bytes +85,388 cmjohnson@cumin1001: START - Cookbook sre.hosts.provision for host mw1474.mgmt.eqiad.wmnet with reboot policy FORCED
  • curprev 01:2501:25, 27 June 2022imported>Stashbot 575,761 bytes +7,333 cmjohnson@cumin1001: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1008.mgmt.eqiad.wmnet with reboot policy FORCED

25 June 2022

  • curprev 18:1718:17, 25 June 2022imported>Stashbot 568,428 bytes +2,028 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest1002.eqiad.wmnet

24 June 2022

  • curprev 19:3519:35, 24 June 2022imported>Stashbot 566,400 bytes +54,579 dancy@deploy1002: backport aborted: (duration: 00m 12s)

23 June 2022

  • curprev 21:2321:23, 23 June 2022imported>Stashbot 511,821 bytes +39,618 mutante: restbase-dev1006 has manually installed packages (wrk, maybe others)
  • curprev 00:3500:35, 23 June 2022imported>Stashbot 472,203 bytes +29,515 brennen: end of phabricator maintenance window

22 June 2022

  • curprev 01:1801:18, 22 June 2022imported>Stashbot 442,688 bytes +15,474 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply

20 June 2022

  • curprev 07:1407:14, 20 June 2022imported>Stashbot 427,214 bytes +308 SandraEbele: Started Airflow 3 Wikidata metrics jobs (Articleplaceholder, Reliability and SpecialEntityData metrics).

19 June 2022

  • curprev 10:2810:28, 19 June 2022imported>Stashbot 426,906 bytes +493 ayounsi@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on db1132.eqiad.wmnet with reason: depooled

17 June 2022

  • curprev 22:0522:05, 17 June 2022imported>Stashbot 426,413 bytes +16,273 AndyRussG: update payments-wiki revision 10304f69 -> ef53c82e
  • curprev 01:4301:43, 17 June 2022imported>Stashbot 410,140 bytes +29,970 pt1979@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aqs1017.eqiad.wmnet with reason: host reimage

15 June 2022

  • curprev 22:4822:48, 15 June 2022imported>Stashbot 380,170 bytes +61,049 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T310011)', diff saved to https://phabricator.wikimedia.org/P29867 and previous config saved to /var/cache/conftool/dbconfig/20220615-224845-marostegui.json

14 June 2022

  • curprev 23:5223:52, 14 June 2022imported>Stashbot 319,121 bytes +44,141 mutante: gitlab-runner1001/1002 - clean revert not possible, icinga alerting about failed buildkitd service, manually deleting systemd unit and trying to clean up T308271
  • curprev 00:3600:36, 14 June 2022imported>Stashbot 274,980 bytes +45,898 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 (T310011)', diff saved to https://phabricator.wikimedia.org/P29701 and previous config saved to /var/cache/conftool/dbconfig/20220614-003608-marostegui.json

12 June 2022

  • curprev 18:3118:31, 12 June 2022imported>Stashbot 229,082 bytes +4,306 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host clouddumps1002.wikimedia.org with OS bullseye
  • curprev 01:4601:46, 12 June 2022imported>Stashbot 224,776 bytes +4,304 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on clouddumps1001.wikimedia.org with reason: host reimage

11 June 2022

  • curprev 01:1701:17, 11 June 2022imported>Stashbot 220,472 bytes +8,628 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply

10 June 2022

  • curprev 00:3300:33, 10 June 2022imported>Stashbot 211,844 bytes +35,139 ejegg: rolled back payments-wiki from 05139a0c to 8c6208c2

9 June 2022

  • curprev 00:4900:49, 9 June 2022imported>Stashbot 176,705 bytes +52,552 krinkle@deploy1002: Synchronized php-1.39.0-wmf.15/includes/libs/rdbms/: I99b817b3d50ffcdf56, T310214 (duration: 03m 23s)

8 June 2022

  • curprev 01:4301:43, 8 June 2022imported>Stashbot 124,153 bytes +33,565 cstone: civicrm revision changed from de12571a to b0b400ae

6 June 2022

  • curprev 23:1723:17, 6 June 2022imported>Stashbot 90,588 bytes +16,595 tzatziki: removing one file for legal compliance

5 June 2022

  • curprev 22:2122:21, 5 June 2022imported>Stashbot 73,993 bytes +6,438 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3314 (T298560)', diff saved to https://phabricator.wikimedia.org/P29417 and previous config saved to /var/cache/conftool/dbconfig/20220605-222110-ladsgroup.json
  • curprev 01:3701:37, 5 June 2022imported>Stashbot 67,555 bytes +6,227 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host clouddumps1001.wikimedia.org with OS bullseye

3 June 2022

  • curprev 22:1922:19, 3 June 2022imported>Stashbot 61,328 bytes +9,538 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • curprev 01:2001:20, 3 June 2022imported>Stashbot 51,790 bytes +28,593 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1121 (T298560)', diff saved to https://phabricator.wikimedia.org/P29365 and previous config saved to /var/cache/conftool/dbconfig/20220603-012045-ladsgroup.json

2 June 2022

  • curprev 01:4701:47, 2 June 2022imported>Stashbot 23,197 bytes −1,118,222 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply

1 June 2022

  • curprev 01:4101:41, 1 June 2022imported>Stashbot 1,141,419 bytes +62,344 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance

31 May 2022

  • curprev 00:4000:40, 31 May 2022imported>Stashbot 1,079,075 bytes +101,499 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1133.eqiad.wmnet with reason: Maintenance

30 May 2022

  • curprev 01:4501:45, 30 May 2022imported>Stashbot 977,576 bytes +7,857 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3316', diff saved to https://phabricator.wikimedia.org/P28904 and previous config saved to /var/cache/conftool/dbconfig/20220530-014458-ladsgroup.json

28 May 2022

  • curprev 23:3623:36, 28 May 2022imported>Stashbot 969,719 bytes +50,883 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1101:3317 (T298560)', diff saved to https://phabricator.wikimedia.org/P28882 and previous config saved to /var/cache/conftool/dbconfig/20220528-233650-ladsgroup.json
  • curprev 01:3201:32, 28 May 2022imported>Stashbot 918,836 bytes +45,130 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1122 (T309311)', diff saved to https://phabricator.wikimedia.org/P28737 and previous config saved to /var/cache/conftool/dbconfig/20220528-013212-ladsgroup.json

27 May 2022

  • curprev 00:4500:45, 27 May 2022imported>Stashbot 873,706 bytes +31,398 mutante: rsyncing /srv/gitlab-backup from gitlab1004 to gitlab2002 | systemctl status full-backup ..in progress on gitlab1001 - T274463

26 May 2022

  • curprev 00:5800:58, 26 May 2022imported>Stashbot 842,308 bytes +49,509 mutante: gitlab1001 - T308089 T274463 - gitlab1001 - systemctl start full-backup

25 May 2022

  • curprev 00:1500:15, 25 May 2022imported>Stashbot 792,799 bytes +52,401 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1096:3315 (T298560)', diff saved to https://phabricator.wikimedia.org/P28462 and previous config saved to /var/cache/conftool/dbconfig/20220525-001552-ladsgroup.json

24 May 2022

  • curprev 00:5200:52, 24 May 2022imported>Stashbot 740,398 bytes +67,605 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1113:3315', diff saved to https://phabricator.wikimedia.org/P28379 and previous config saved to /var/cache/conftool/dbconfig/20220524-005257-ladsgroup.json

22 May 2022

  • curprev 20:4620:46, 22 May 2022imported>Stashbot 672,793 bytes +13,528 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • curprev 00:2100:21, 22 May 2022imported>Stashbot 659,265 bytes +20,709 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3312 (T298560)', diff saved to https://phabricator.wikimedia.org/P28249 and previous config saved to /var/cache/conftool/dbconfig/20220522-002120-ladsgroup.json

21 May 2022

  • curprev 01:0601:06, 21 May 2022imported>Stashbot 638,556 bytes +27,942 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1158 (T298555)', diff saved to https://phabricator.wikimedia.org/P28208 and previous config saved to /var/cache/conftool/dbconfig/20220521-010640-ladsgroup.json

20 May 2022

  • curprev 01:3101:31, 20 May 2022imported>Stashbot 610,614 bytes +72,169 cmooney@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)

19 May 2022

  • curprev 00:5800:58, 19 May 2022imported>Stashbot 538,445 bytes +60,753 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance

18 May 2022

  • curprev 01:0501:05, 18 May 2022imported>Stashbot 477,692 bytes +34,747 ejegg: updated fundraising CiviCRM from d45afdfc to b8b8c177

16 May 2022

  • curprev 22:1422:14, 16 May 2022imported>Stashbot 442,945 bytes +16,328 jhathaway@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx2001.wikimedia.org with reason: exim debugging

15 May 2022

  • curprev 21:4721:47, 15 May 2022imported>Stashbot 426,617 bytes +1,183 aqu@deploy1002: Finished deploy [airflow-dags/analytics_test@378e7ca]: (no justification provided) (duration: 00m 07s)

14 May 2022

  • curprev 08:3408:34, 14 May 2022imported>Stashbot 425,434 bytes +205 jynus@cumin1001: dbctl commit (dc=all): 'Depool db1172', diff saved to https://phabricator.wikimedia.org/P27830 and previous config saved to /var/cache/conftool/dbconfig/20220514-083421-jynus.json
  • curprev 00:5300:53, 14 May 2022imported>Stashbot 425,229 bytes +4,537 razzi@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on an-tool1005.eqiad.wmnet with reason: Server need to be downgraded to stretch, on monday

12 May 2022

  • curprev 21:5621:56, 12 May 2022imported>Stashbot 420,692 bytes +26,145 razzi@deploy1002: Finished deploy [analytics/turnilo/deploy@a2bdc3e]: (no justification provided) (duration: 02m 08s)

11 May 2022

  • curprev 22:2822:28, 11 May 2022imported>Stashbot 394,547 bytes +16,527 robh: cp305[67] returned to service and all green in icinga, cp305[89] depooling for firmware update T243167
  • curprev 01:4101:41, 11 May 2022imported>Stashbot 378,020 bytes +25,757 mutante: gitlab2001 - starting backup-restore service that had failed on previous automatic run

9 May 2022

  • curprev 21:5821:58, 9 May 2022imported>Stashbot 352,263 bytes +25,329 jhathaway@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx2001.wikimedia.org with reason: new kernel round deux

8 May 2022

  • curprev 07:1607:16, 8 May 2022imported>Stashbot 326,934 bytes +81 godog: silence probedown for thumbor:8800 until monday

7 May 2022

  • curprev 21:2921:29, 7 May 2022imported>Stashbot 326,853 bytes +2,312 andrew@deploy1002: Finished deploy [horizon/deploy@9d02cd6]: seeking consistency between codfw1dev and eqiad1 (duration: 04m 04s)

6 May 2022

  • curprev 19:1619:16, 6 May 2022imported>Stashbot 324,541 bytes +16,729 razzi@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-airflow1002.eqiad.wmnet
  • curprev 00:4600:46, 6 May 2022imported>Stashbot 307,812 bytes +83,868 rook@cumin1001: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host cloudvirt1016.eqiad.wmnet

5 May 2022

  • curprev 01:4201:42, 5 May 2022imported>Stashbot 223,944 bytes +94,859 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P27586 and previous config saved to /var/cache/conftool/dbconfig/20220505-014205-ladsgroup.json

4 May 2022

  • curprev 00:5000:50, 4 May 2022imported>Stashbot 129,085 bytes +36,153 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on db1100.eqiad.wmnet with reason: Maintenance

2 May 2022

  • curprev 23:1523:15, 2 May 2022imported>Stashbot 92,932 bytes +77,432 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host krb2002.codfw.wmnet with OS bullseye
  • curprev 00:5900:59, 2 May 2022imported>Stashbot 15,500 bytes +15,385 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1096:3315', diff saved to https://phabricator.wikimedia.org/P27203 and previous config saved to /var/cache/conftool/dbconfig/20220502-005940-ladsgroup.json

29 April 2022

  • curprev 23:1123:11, 29 April 2022imported>Stashbot 1,095,940 bytes +87,748 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1149 (T306560)', diff saved to https://phabricator.wikimedia.org/P27163 and previous config saved to /var/cache/conftool/dbconfig/20220429-231136-ladsgroup.json
  • curprev 00:5700:57, 29 April 2022imported>Stashbot 1,008,192 bytes +87,341 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P26967 and previous config saved to /var/cache/conftool/dbconfig/20220429-005702-ladsgroup.json

28 April 2022

  • curprev 01:4701:47, 28 April 2022imported>Stashbot 920,851 bytes +89,525 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1122', diff saved to https://phabricator.wikimedia.org/P26857 and previous config saved to /var/cache/conftool/dbconfig/20220428-014723-ladsgroup.json

27 April 2022

  • curprev 01:4301:43, 27 April 2022imported>Stashbot 831,326 bytes +86,215 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P26663 and previous config saved to /var/cache/conftool/dbconfig/20220427-014355-ladsgroup.json

25 April 2022

  • curprev 23:0523:05, 25 April 2022imported>Stashbot 745,111 bytes +49,942 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • curprev 00:5400:54, 25 April 2022imported>Stashbot 695,169 bytes +25,636 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3316', diff saved to https://phabricator.wikimedia.org/P26408 and previous config saved to /var/cache/conftool/dbconfig/20220425-005432-ladsgroup.json

24 April 2022

  • curprev 01:2801:28, 24 April 2022imported>Stashbot 669,533 bytes +28,655 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on 12 hosts with reason: Maintenance

23 April 2022

  • curprev 01:3401:34, 23 April 2022imported>Stashbot 640,878 bytes +45,596 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T306560)', diff saved to https://phabricator.wikimedia.org/P26246 and previous config saved to /var/cache/conftool/dbconfig/20220423-013450-ladsgroup.json

22 April 2022

  • curprev 01:4701:47, 22 April 2022imported>Stashbot 595,282 bytes −356,134 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance

21 April 2022

  • curprev 00:5200:52, 21 April 2022imported>Stashbot 951,416 bytes +154,237 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P25837 and previous config saved to /var/cache/conftool/dbconfig/20220421-005225-ladsgroup.json

20 April 2022

  • curprev 01:3101:31, 20 April 2022imported>Stashbot 797,179 bytes +136,676 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)

19 April 2022

  • curprev 00:5300:53, 19 April 2022imported>Stashbot 660,503 bytes +82,833 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1138 (T298565)', diff saved to https://phabricator.wikimedia.org/P25214 and previous config saved to /var/cache/conftool/dbconfig/20220419-005334-ladsgroup.json

18 April 2022

  • curprev 01:4001:40, 18 April 2022imported>Stashbot 577,670 bytes +74,955 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P24982 and previous config saved to /var/cache/conftool/dbconfig/20220418-014003-ladsgroup.json

17 April 2022

  • curprev 00:5100:51, 17 April 2022imported>Stashbot 502,715 bytes +28,006 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1147 (T298565)', diff saved to https://phabricator.wikimedia.org/P24761 and previous config saved to /var/cache/conftool/dbconfig/20220417-005150-ladsgroup.json

16 April 2022

  • curprev 00:3500:35, 16 April 2022imported>Stashbot 474,709 bytes +12,679 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1129 (T298565)', diff saved to https://phabricator.wikimedia.org/P24681 and previous config saved to /var/cache/conftool/dbconfig/20220416-003538-ladsgroup.json

14 April 2022

  • curprev 22:2822:28, 14 April 2022imported>Stashbot 462,030 bytes +16,537 mutante: gitlab - deleting runner-1018, runner-1019, creating runner-1029, runner-1030 T297659
  • curprev 00:3700:37, 14 April 2022imported>Stashbot 445,493 bytes +50,258 ladsgroup@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1140.eqiad.wmnet with reason: Maintenance

13 April 2022

  • curprev 01:4201:42, 13 April 2022imported>Stashbot 395,235 bytes +39,686 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1135', diff saved to https://phabricator.wikimedia.org/P24545 and previous config saved to /var/cache/conftool/dbconfig/20220413-014214-ladsgroup.json

12 April 2022

  • curprev 00:4900:49, 12 April 2022imported>Stashbot 355,549 bytes +52,260 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1106', diff saved to https://phabricator.wikimedia.org/P24477 and previous config saved to /var/cache/conftool/dbconfig/20220412-004933-ladsgroup.json

11 April 2022

  • curprev 01:4301:43, 11 April 2022imported>Stashbot 303,289 bytes +5,989 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 (T298565)', diff saved to https://phabricator.wikimedia.org/P24355 and previous config saved to /var/cache/conftool/dbconfig/20220411-014316-ladsgroup.json

9 April 2022

  • curprev 12:3912:39, 9 April 2022imported>Stashbot 297,300 bytes +1,710 godog: bounce prometheus@ops on prometheus5001
  • curprev 00:5300:53, 9 April 2022imported>Stashbot 295,590 bytes +31,885 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1106 (T298565)', diff saved to https://phabricator.wikimedia.org/P24333 and previous config saved to /var/cache/conftool/dbconfig/20220409-005351-ladsgroup.json

7 April 2022

  • curprev 22:1822:18, 7 April 2022imported>Stashbot 263,705 bytes +77,590 ejegg: restarted fundraising scheduled jobs
  • curprev 00:5800:58, 7 April 2022imported>Stashbot 186,115 bytes +64,418 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1129 (T297189)', diff saved to https://phabricator.wikimedia.org/P24195 and previous config saved to /var/cache/conftool/dbconfig/20220407-005817-marostegui.json

6 April 2022

  • curprev 01:3401:34, 6 April 2022imported>Stashbot 121,697 bytes +61,769 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1105:3311', diff saved to https://phabricator.wikimedia.org/P24142 and previous config saved to /var/cache/conftool/dbconfig/20220406-013420-ladsgroup.json

5 April 2022

  • curprev 00:5800:58, 5 April 2022imported>Stashbot 59,928 bytes +52,310 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp4034.ulsfo.wmnet

2 April 2022

  • curprev 11:2611:26, 2 April 2022imported>Stashbot 7,618 bytes +272 akosiaris: disable zotero paging until T291707 is resolved.

1 April 2022

  • curprev 23:2523:25, 1 April 2022imported>Stashbot 7,346 bytes −1,236,074 mutante: DNS - new project language 'kcg'. 'Tyap is a regionally important dialect cluster of Plateau languages in Nigeria's Middle Belt, named after its prestige dialect. It is also known by its Hausa exonym as Katab or Kataf.' T305279

31 March 2022

  • curprev 23:4523:45, 31 March 2022imported>Stashbot 1,243,420 bytes +66,130 mutante: gitlab2001 - fdisk /dev/vdb (g, w) (create partition table), (n, w) (create partition) ; mkfs.ext4 /dev/vdb1 (create filesystem); systemctl reset-failed (fix Icinga alert); mkdir /mnt/gitlab-backup; mount /dev/vdb1 /mnt/gitlab-backup ; blkid (get UUID); edit /etc/fstab and insert "UUID=c5235682-ac21-46a9-85ee-9603f694a6a4 /mnt/gitlab-backup ext4 errors=remount-ro 0 2" T274463
  • curprev 01:4401:44, 31 March 2022imported>Stashbot 1,177,290 bytes +118,543 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P23948 and previous config saved to /var/cache/conftool/dbconfig/20220331-014403-ladsgroup.json

30 March 2022

  • curprev 01:4601:46, 30 March 2022imported>Stashbot 1,058,747 bytes +98,930 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1146:3312 (T298565)', diff saved to https://phabricator.wikimedia.org/P23664 and previous config saved to /var/cache/conftool/dbconfig/20220330-014621-ladsgroup.json

28 March 2022

  • curprev 23:1523:15, 28 March 2022imported>Stashbot 959,817 bytes +54,742 eileen: civicrm revision 15d22bd1 -> 1c5d10e1
  • curprev 00:5500:55, 28 March 2022imported>Stashbot 905,075 bytes +30,248 ladsgroup@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P23315 and previous config saved to /var/cache/conftool/dbconfig/20220328-005533-ladsgroup.json

27 March 2022

  • curprev 00:5000:50, 27 March 2022imported>Stashbot 874,827 bytes +29,440 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1156 (T298565)', diff saved to https://phabricator.wikimedia.org/P23228 and previous config saved to /var/cache/conftool/dbconfig/20220327-005010-ladsgroup.json

26 March 2022

  • curprev 01:1201:12, 26 March 2022imported>Stashbot 845,387 bytes +31,577 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1129 (T298565)', diff saved to https://phabricator.wikimedia.org/P23147 and previous config saved to /var/cache/conftool/dbconfig/20220326-011216-ladsgroup.json

25 March 2022

  • curprev 00:3900:39, 25 March 2022imported>Stashbot 813,810 bytes +36,424 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host restbase2027.codfw.wmnet with OS buster

24 March 2022

  • curprev 00:3300:33, 24 March 2022imported>Stashbot 777,386 bytes +38,343 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1046.eqiad.wmnet with OS bullseye

23 March 2022

  • curprev 01:2001:20, 23 March 2022imported>Stashbot 739,043 bytes +42,681 ejegg: updated payments-wiki from 3048f0aa to 28e24856

22 March 2022

  • curprev 01:3501:35, 22 March 2022imported>Stashbot 696,362 bytes +39,479 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1047.eqiad.wmnet with OS bullseye

20 March 2022

  • curprev 23:4423:44, 20 March 2022imported>Stashbot 656,883 bytes +3,079 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3314 (T300775)', diff saved to https://phabricator.wikimedia.org/P22857 and previous config saved to /var/cache/conftool/dbconfig/20220320-234358-marostegui.json

19 March 2022

  • curprev 17:1817:18, 19 March 2022imported>Stashbot 653,804 bytes +4,978 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1148 (T300775)', diff saved to https://phabricator.wikimedia.org/P22845 and previous config saved to /var/cache/conftool/dbconfig/20220319-171757-marostegui.json
  • curprev 01:4601:46, 19 March 2022imported>Stashbot 648,826 bytes +15,864 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1016.eqiad.wmnet with reason: host reimage

17 March 2022

  • curprev 22:5522:55, 17 March 2022imported>Stashbot 632,962 bytes +44,817 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • curprev 01:1101:11, 17 March 2022imported>Stashbot 588,145 bytes +54,021 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1016.eqiad.wmnet with OS bullseye

16 March 2022

  • curprev 00:3600:36, 16 March 2022imported>Stashbot 534,124 bytes +72,992 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6011.drmrs.wmnet with reason: host reimage

15 March 2022

  • curprev 01:3001:30, 15 March 2022imported>Stashbot 461,132 bytes +46,179 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1156 (T300775)', diff saved to https://phabricator.wikimedia.org/P22465 and previous config saved to /var/cache/conftool/dbconfig/20220315-013013-marostegui.json

11 March 2022

  • curprev 15:5615:56, 11 March 2022imported>Stashbot 414,953 bytes +11,582 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubernetes2014.codfw.wmnet with OS bullseye
  • curprev 00:3300:33, 11 March 2022imported>Stashbot 403,371 bytes +68,747 TimStarling: on mwmaint1002 running populateGlobalEditCount.php

10 March 2022

  • curprev 00:2600:26, 10 March 2022imported>Stashbot 334,624 bytes +40,477 ebysans@deploy1002: Finished deploy [airflow-dags/analytics@7975c27]: (no justification provided) (duration: 00m 08s)

9 March 2022

  • curprev 01:3201:32, 9 March 2022imported>Stashbot 294,147 bytes +75,530 marostegui@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P22170 and previous config saved to /var/cache/conftool/dbconfig/20220309-013256-marostegui.json

8 March 2022

  • curprev 00:3400:34, 8 March 2022imported>Stashbot 218,617 bytes +83,815 ebysans@deploy1002: Finished deploy [airflow-dags/analytics@c8a753b]: (no justification provided) (duration: 00m 07s)

4 March 2022

  • curprev 17:5917:59, 4 March 2022imported>Stashbot 134,802 bytes +23,275 btullis@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • curprev 01:3501:35, 4 March 2022imported>Stashbot 111,527 bytes +40,357 rzl@deploy1002: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply

3 March 2022

  • curprev 01:4201:42, 3 March 2022imported>Stashbot 71,170 bytes +51,552 razzi@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on datahubsearch[1001-1003].eqiad.wmnet with reason: Still having errors setting up opensearch

2 March 2022

  • curprev 00:1500:15, 2 March 2022imported>Stashbot 19,618 bytes +18,689 topranks: Re-enabling Lumen AS3356 BGP session over IPv4 on cr3-ulsfo to assess affect on currently broken routing to ulsfo.

1 March 2022

  • curprev 01:1401:14, 1 March 2022imported>Stashbot 929 bytes −955,327 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1104 (T302185)', diff saved to https://phabricator.wikimedia.org/P21614 and previous config saved to /var/cache/conftool/dbconfig/20220301-011404-ladsgroup.json

25 February 2022

  • curprev 23:3223:32, 25 February 2022imported>Stashbot 956,175 bytes +19,462 dzahn@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: apply

24 February 2022

  • curprev 23:3523:35, 24 February 2022imported>Stashbot 936,713 bytes +51,509 ryankemper: T302526 Deployed https://gerrit.wikimedia.org/r/765652 and ran puppet across wcqs*
  • curprev 00:5900:59, 24 February 2022imported>Stashbot 885,204 bytes +60,442 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2074.codfw.wmnet with OS bullseye

23 February 2022

  • curprev 01:4101:41, 23 February 2022imported>Stashbot 824,762 bytes +60,474 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage

21 February 2022

  • curprev 22:3022:30, 21 February 2022imported>Stashbot 764,288 bytes +74,792 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 (T300381)', diff saved to https://phabricator.wikimedia.org/P21231 and previous config saved to /var/cache/conftool/dbconfig/20220221-223015-marostegui.json
  • curprev 01:3901:39, 21 February 2022imported>Stashbot 689,496 bytes +3,448 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db2152.codfw.wmnet with OS bullseye

19 February 2022

  • curprev 16:5016:50, 19 February 2022imported>Stashbot 686,048 bytes +5,104 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • curprev 00:5900:59, 19 February 2022imported>Stashbot 680,944 bytes +28,538 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubernetes2020.codfw.wmnet with OS bullseye

17 February 2022

  • curprev 22:2822:28, 17 February 2022imported>Stashbot 652,406 bytes +28,397 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • curprev 01:3601:36, 17 February 2022imported>Stashbot 624,009 bytes +52,457 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318 (T300381)', diff saved to https://phabricator.wikimedia.org/P20954 and previous config saved to /var/cache/conftool/dbconfig/20220217-013607-marostegui.json

15 February 2022

  • curprev 23:4723:47, 15 February 2022imported>Stashbot 571,552 bytes +59,268 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host restbase-dev2003.mgmt.codfw.wmnet with reboot policy FORCED

14 February 2022

  • curprev 22:0422:04, 14 February 2022imported>Stashbot 512,284 bytes +52,126 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)

13 February 2022

  • curprev 23:1723:17, 13 February 2022imported>Stashbot 460,158 bytes +3,305 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T300775)', diff saved to https://phabricator.wikimedia.org/P20627 and previous config saved to /var/cache/conftool/dbconfig/20220213-231742-marostegui.json

12 February 2022

  • curprev 22:5822:58, 12 February 2022imported>Stashbot 456,853 bytes +2,897 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1161 (T300775)', diff saved to https://phabricator.wikimedia.org/P20617 and previous config saved to /var/cache/conftool/dbconfig/20220212-225806-marostegui.json

10 February 2022

  • curprev 00:4200:42, 10 February 2022imported>Stashbot 362,781 bytes +25,944 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn

8 February 2022

  • curprev 23:5223:52, 8 February 2022imported>Stashbot 336,837 bytes +73,681 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2055.codfw.wmnet with OS buster
  • curprev 00:1200:12, 8 February 2022imported>Stashbot 263,156 bytes +33,177 ryankemper: T294805 Re-enabling puppet across eqiad elastic fleet: `ryankemper@cumin1001:~$ sudo cumin -b 8 'elastic1*' 'sudo enable-puppet "Add new eqiad replacement hosts elastic10[68-83] - T294805 - root" && sudo run-puppet-agent'` tmux session `elastic`

5 February 2022

  • curprev 22:1022:10, 5 February 2022imported>Stashbot 229,979 bytes +1,284 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt2003-dev.codfw.wmnet with OS bullseye

4 February 2022

  • curprev 23:4323:43, 4 February 2022imported>Stashbot 228,695 bytes +5,568 jhathaway@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mirror1001.wikimedia.org with reason: new kernel
  • curprev 01:0801:08, 4 February 2022imported>Stashbot 223,127 bytes +72,959 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn

2 February 2022

  • curprev 00:5300:53, 2 February 2022imported>Stashbot 90,142 bytes −738,181 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn

1 February 2022

  • curprev 00:3100:31, 1 February 2022imported>Stashbot 828,323 bytes +72,250 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn

29 January 2022

  • curprev 21:0821:08, 29 January 2022imported>Stashbot 756,073 bytes +1,014 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudservices2003-dev.wikimedia.org with OS bullseye
  • curprev 00:1400:14, 29 January 2022imported>Stashbot 755,059 bytes +14,112 ebernhardson: restart elasticsearch_6@production-search-psi-eqiad on elastic1049 to address CirrusSearchJVMGCOldPoolFlatlined alert

28 January 2022

  • curprev 01:4701:47, 28 January 2022imported>Stashbot 740,947 bytes +73,300 andrew@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudcontrol2001-dev.wikimedia.org with OS bullseye

25 January 2022

  • curprev 00:3100:31, 25 January 2022imported>Stashbot 525,956 bytes +53,597 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn

23 January 2022

  • curprev 22:0222:02, 23 January 2022imported>Stashbot 472,359 bytes +500 ebysans@deploy1002: Finished deploy [airflow-dags/analytics-test@37937f6]: (no justification provided) (duration: 00m 08s)

22 January 2022

  • curprev 22:3822:38, 22 January 2022imported>Stashbot 471,859 bytes +812 jhathaway@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mx1001.wikimedia.org with reason: kernel testing
  • curprev 01:3001:30, 22 January 2022imported>Stashbot 471,047 bytes +18,324 jhathaway@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on mx1001.wikimedia.org with reason: kernel testing

20 January 2022

  • 22:40, 20 January 2022 imported>Stashbot 452,723 bytes +39,372 inflatador: running puppet-merge for https://gerrit.wikimedia.org/r/755810

19 January 2022

17 January 2022

  • 23:27, 17 January 2022 imported>Stashbot 327,176 bytes +12,624 jynus: forced session revocation on phab for a user T299315

16 January 2022

  • 08:21, 16 January 2022 imported>Stashbot 314,552 bytes +684 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: sync on production

15 January 2022

  • 08:55, 15 January 2022 imported>Stashbot 313,868 bytes +1,296 legoktm: finished running recountCategories on s4 wikis (T299244)
  • 01:22, 15 January 2022 imported>Stashbot 312,572 bytes +10,517 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn

14 January 2022

  • 00:36, 14 January 2022 imported>Stashbot 302,055 bytes +32,093 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn

13 January 2022

  • 00:35, 13 January 2022 imported>Stashbot 269,962 bytes +64,421 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn

12 January 2022

  • 00:55, 12 January 2022 imported>Stashbot 205,541 bytes +59,425 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn

11 January 2022

8 January 2022

  • 10:51, 8 January 2022 imported>Stashbot 107,107 bytes +180 elukey: restart hive daemons on an-coord1002 (after my last upgrade/rollback of packages the prometheus agent settings were not picked up, so no metrics)

7 January 2022

6 January 2022

5 January 2022

  • 00:59, 5 January 2022 imported>Stashbot 64,897 bytes +32,691 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn

4 January 2022

  • 00:54, 4 January 2022 imported>Stashbot 32,206 bytes +32,091 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P18329 and previous config saved to /var/cache/conftool/dbconfig/20220104-005456-marostegui.json

1 January 2022

29 December 2021

  • 10:30, 29 December 2021 imported>Stashbot 664,919 bytes +126 elukey: kill tcpdump process on kubestagemaster1001 (kept a big pcap file opened that kept growing)

28 December 2021

24 December 2021

  • 20:08, 24 December 2021 imported>Stashbot 663,863 bytes +325 mforns@deploy1002: Finished deploy [airflow-dags/analytics@e282d2d]: (no justification provided) (duration: 00m 06s)
  • 00:57, 24 December 2021 imported>Stashbot 663,538 bytes +3,353 ejegg: updated fundraising CiviCRM from 47dd67f2 to aaceb4ab

23 December 2021

  • 00:04, 23 December 2021 imported>Stashbot 660,185 bytes +4,302 bking@cumin1001: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) restart without plugin upgrade (3 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic cluster restart - bking@cumin1001 - T297986

21 December 2021

19 December 2021

18 December 2021

  • 13:57, 18 December 2021 imported>Stashbot 633,583 bytes +93 dcausse: restarting blazegraph on wdqs1013 (jvm stuck for 10hours)

17 December 2021

16 December 2021

  • 00:37, 16 December 2021 imported>Stashbot 600,611 bytes +14,349 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .

15 December 2021

14 December 2021

  • 01:42, 14 December 2021 imported>Stashbot 545,410 bytes +41,246 ryankemper: T297468 `sudo cookbook sre.elasticsearch.rolling-operation search_eqiad "eqiad rolling restart" --nodes-per-run 3 --start-datetime 2021-12-14T01:27:58 --task-id T297468` on `ryankemper@cumin1001` tmux `elastic_restarts`

12 December 2021

  • 14:35, 12 December 2021 imported>Stashbot 504,164 bytes +844 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host graphite1004.eqiad.wmnet

11 December 2021

  • 19:04, 11 December 2021 imported>Stashbot 503,320 bytes +131 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirt1028.eqiad.wmnet with OS buster
  • 00:04, 11 December 2021 imported>Stashbot 503,189 bytes +18,770 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .

10 December 2021

9 December 2021

  • 00:26, 9 December 2021 imported>Stashbot 469,358 bytes +7,967 rzl: graphite1004.mgmt: /admin1-> racadm serveraction powercycle (T297265)

8 December 2021

  • 00:51, 8 December 2021 imported>Stashbot 461,391 bytes +29,464 ebernhardson@deploy1002: Synchronized php-1.38.0-wmf.12/extensions/GrowthExperiments/includes/NewcomerTasks/AddImage/AddImageSubmissionHandler.php: backport window for 744896 (duration: 01m 05s)

7 December 2021
