You are browsing a read-only backup copy of Wikitech. The live site can be found at wikitech.wikimedia.org

Server Admin Log: Revision history

25 March 2022

  • 00:39, 25 March 2022 imported>Stashbot 813,810 bytes +36,424 pt1979@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host restbase2027.codfw.wmnet with OS buster

24 March 2022

  • 00:33, 24 March 2022 imported>Stashbot 777,386 bytes +38,343 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt1046.eqiad.wmnet with OS bullseye

23 March 2022

  • 01:20, 23 March 2022 imported>Stashbot 739,043 bytes +42,681 ejegg: updated payments-wiki from 3048f0aa to 28e24856

22 March 2022

  • 01:35, 22 March 2022 imported>Stashbot 696,362 bytes +39,479 cmjohnson@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1047.eqiad.wmnet with OS bullseye

20 March 2022

  • 23:44, 20 March 2022 imported>Stashbot 656,883 bytes +3,079 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1144:3314 (T300775)', diff saved to https://phabricator.wikimedia.org/P22857 and previous config saved to /var/cache/conftool/dbconfig/20220320-234358-marostegui.json

19 March 2022

  • 17:18, 19 March 2022 imported>Stashbot 653,804 bytes +4,978 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1148 (T300775)', diff saved to https://phabricator.wikimedia.org/P22845 and previous config saved to /var/cache/conftool/dbconfig/20220319-171757-marostegui.json
  • 01:46, 19 March 2022 imported>Stashbot 648,826 bytes +15,864 andrew@cumin1001: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudvirt1016.eqiad.wmnet with reason: host reimage

17 March 2022

  • 22:55, 17 March 2022 imported>Stashbot 632,962 bytes +44,817 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 01:11, 17 March 2022 imported>Stashbot 588,145 bytes +54,021 andrew@cumin1001: START - Cookbook sre.hosts.reimage for host cloudvirt1016.eqiad.wmnet with OS bullseye

16 March 2022

  • 00:36, 16 March 2022 imported>Stashbot 534,124 bytes +72,992 sukhe@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6011.drmrs.wmnet with reason: host reimage

15 March 2022

  • 01:30, 15 March 2022 imported>Stashbot 461,132 bytes +46,179 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1156 (T300775)', diff saved to https://phabricator.wikimedia.org/P22465 and previous config saved to /var/cache/conftool/dbconfig/20220315-013013-marostegui.json

11 March 2022

  • 15:56, 11 March 2022 imported>Stashbot 414,953 bytes +11,582 elukey@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubernetes2014.codfw.wmnet with OS bullseye
  • 00:33, 11 March 2022 imported>Stashbot 403,371 bytes +68,747 TimStarling: on mwmaint1002 running populateGlobalEditCount.php

10 March 2022

  • 00:26, 10 March 2022 imported>Stashbot 334,624 bytes +40,477 ebysans@deploy1002: Finished deploy [airflow-dags/analytics@7975c27]: (no justification provided) (duration: 00m 08s)

9 March 2022

  • 01:32, 9 March 2022 imported>Stashbot 294,147 bytes +75,530 marostegui@cumin2002: dbctl commit (dc=all): 'Repooling after maintenance db1105:3312', diff saved to https://phabricator.wikimedia.org/P22170 and previous config saved to /var/cache/conftool/dbconfig/20220309-013256-marostegui.json

8 March 2022

  • 00:34, 8 March 2022 imported>Stashbot 218,617 bytes +83,815 ebysans@deploy1002: Finished deploy [airflow-dags/analytics@c8a753b]: (no justification provided) (duration: 00m 07s)

4 March 2022

  • 17:59, 4 March 2022 imported>Stashbot 134,802 bytes +23,275 btullis@cumin1001: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 01:35, 4 March 2022 imported>Stashbot 111,527 bytes +40,357 rzl@deploy1002: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply

3 March 2022

  • 01:42, 3 March 2022 imported>Stashbot 71,170 bytes +51,552 razzi@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on datahubsearch[1001-1003].eqiad.wmnet with reason: Still having errors setting up opensearch

2 March 2022

  • 00:15, 2 March 2022 imported>Stashbot 19,618 bytes +18,689 topranks: Re-enabling Lumen AS3356 BGP session over IPv4 on cr3-ulsfo to assess affect on currently broken routing to ulsfo.

1 March 2022

  • 01:14, 1 March 2022 imported>Stashbot 929 bytes −955,327 ladsgroup@cumin1001: dbctl commit (dc=all): 'Depooling db1104 (T302185)', diff saved to https://phabricator.wikimedia.org/P21614 and previous config saved to /var/cache/conftool/dbconfig/20220301-011404-ladsgroup.json

27 February 2022

25 February 2022

  • 23:32, 25 February 2022 imported>Stashbot 956,175 bytes +19,462 dzahn@deploy1002: helmfile [staging] DONE helmfile.d/services/miscweb: apply

24 February 2022

  • 23:35, 24 February 2022 imported>Stashbot 936,713 bytes +51,509 ryankemper: T302526 Deployed https://gerrit.wikimedia.org/r/765652 and ran puppet across wcqs*
  • 00:59, 24 February 2022 imported>Stashbot 885,204 bytes +60,442 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host elastic2074.codfw.wmnet with OS bullseye

23 February 2022

  • 01:41, 23 February 2022 imported>Stashbot 824,762 bytes +60,474 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2068.codfw.wmnet with reason: host reimage

21 February 2022

  • 22:30, 21 February 2022 imported>Stashbot 764,288 bytes +74,792 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1164 (T300381)', diff saved to https://phabricator.wikimedia.org/P21231 and previous config saved to /var/cache/conftool/dbconfig/20220221-223015-marostegui.json
  • 01:39, 21 February 2022 imported>Stashbot 689,496 bytes +3,448 ladsgroup@cumin1001: START - Cookbook sre.hosts.reimage for host db2152.codfw.wmnet with OS bullseye

19 February 2022

  • 16:50, 19 February 2022 imported>Stashbot 686,048 bytes +5,104 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: apply
  • 00:59, 19 February 2022 imported>Stashbot 680,944 bytes +28,538 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host kubernetes2020.codfw.wmnet with OS bullseye

17 February 2022

  • 22:28, 17 February 2022 imported>Stashbot 652,406 bytes +28,397 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 01:36, 17 February 2022 imported>Stashbot 624,009 bytes +52,457 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1099:3318 (T300381)', diff saved to https://phabricator.wikimedia.org/P20954 and previous config saved to /var/cache/conftool/dbconfig/20220217-013607-marostegui.json

15 February 2022

  • 23:47, 15 February 2022 imported>Stashbot 571,552 bytes +59,268 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host restbase-dev2003.mgmt.codfw.wmnet with reboot policy FORCED

14 February 2022

  • 22:04, 14 February 2022 imported>Stashbot 512,284 bytes +52,126 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)

13 February 2022

  • 23:17, 13 February 2022 imported>Stashbot 460,158 bytes +3,305 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1144:3315 (T300775)', diff saved to https://phabricator.wikimedia.org/P20627 and previous config saved to /var/cache/conftool/dbconfig/20220213-231742-marostegui.json

12 February 2022

  • 22:58, 12 February 2022 imported>Stashbot 456,853 bytes +2,897 marostegui@cumin1001: dbctl commit (dc=all): 'Depooling db1161 (T300775)', diff saved to https://phabricator.wikimedia.org/P20617 and previous config saved to /var/cache/conftool/dbconfig/20220212-225806-marostegui.json

11 February 2022

10 February 2022

  • 00:42, 10 February 2022 imported>Stashbot 362,781 bytes +25,944 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn

8 February 2022

  • 23:52, 8 February 2022 imported>Stashbot 336,837 bytes +73,681 pt1979@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mc2055.codfw.wmnet with OS buster
  • 00:12, 8 February 2022 imported>Stashbot 263,156 bytes +33,177 ryankemper: T294805 Re-enabling puppet across eqiad elastic fleet: `ryankemper@cumin1001:~$ sudo cumin -b 8 'elastic1*' 'sudo enable-puppet "Add new eqiad replacement hosts elastic10[68-83] - T294805 - root" && sudo run-puppet-agent'` tmux session `elastic`

5 February 2022

  • 22:10, 5 February 2022 imported>Stashbot 229,979 bytes +1,284 andrew@cumin1001: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudvirt2003-dev.codfw.wmnet with OS bullseye

4 February 2022

  • 23:43, 4 February 2022 imported>Stashbot 228,695 bytes +5,568 jhathaway@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mirror1001.wikimedia.org with reason: new kernel
  • 01:08, 4 February 2022 imported>Stashbot 223,127 bytes +72,959 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn

3 February 2022

2 February 2022

  • 00:53, 2 February 2022 imported>Stashbot 90,142 bytes −738,181 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn

1 February 2022

  • 00:31, 1 February 2022 imported>Stashbot 828,323 bytes +72,250 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn

29 January 2022

  • 21:08, 29 January 2022 imported>Stashbot 756,073 bytes +1,014 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudservices2003-dev.wikimedia.org with OS bullseye
  • 00:14, 29 January 2022 imported>Stashbot 755,059 bytes +14,112 ebernhardson: restart elasticsearch_6@production-search-psi-eqiad on elastic1049 to address CirrusSearchJVMGCOldPoolFlatlined alert

28 January 2022

  • 01:47, 28 January 2022 imported>Stashbot 740,947 bytes +73,300 andrew@cumin1001: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudcontrol2001-dev.wikimedia.org with OS bullseye

27 January 2022

26 January 2022

25 January 2022

  • 00:31, 25 January 2022 imported>Stashbot 525,956 bytes +53,597 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn

23 January 2022

  • 22:02, 23 January 2022 imported>Stashbot 472,359 bytes +500 ebysans@deploy1002: Finished deploy [airflow-dags/analytics-test@37937f6]: (no justification provided) (duration: 00m 08s)

22 January 2022

  • 22:38, 22 January 2022 imported>Stashbot 471,859 bytes +812 jhathaway@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mx1001.wikimedia.org with reason: kernel testing
  • 01:30, 22 January 2022 imported>Stashbot 471,047 bytes +18,324 jhathaway@cumin1001: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on mx1001.wikimedia.org with reason: kernel testing

20 January 2022

  • 22:40, 20 January 2022 imported>Stashbot 452,723 bytes +39,372 inflatador: running puppet-merge for https://gerrit.wikimedia.org/r/755810

19 January 2022

17 January 2022

  • 23:27, 17 January 2022 imported>Stashbot 327,176 bytes +12,624 jynus: forced session revocation on phab for a user T299315

16 January 2022

  • 08:21, 16 January 2022 imported>Stashbot 314,552 bytes +684 oblivian@deploy1002: helmfile [eqiad] DONE helmfile.d/services/changeprop-jobqueue: sync on production

15 January 2022

  • 08:55, 15 January 2022 imported>Stashbot 313,868 bytes +1,296 legoktm: finished running recountCategories on s4 wikis (T299244)
  • 01:22, 15 January 2022 imported>Stashbot 312,572 bytes +10,517 mwdebug-deploy@deploy1002: helmfile [eqiad] START helmfile.d/services/mwdebug: apply on pinkunicorn

14 January 2022

  • 00:36, 14 January 2022 imported>Stashbot 302,055 bytes +32,093 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn

13 January 2022

  • 00:35, 13 January 2022 imported>Stashbot 269,962 bytes +64,421 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn

12 January 2022

  • 00:55, 12 January 2022 imported>Stashbot 205,541 bytes +59,425 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn

11 January 2022

8 January 2022

  • 10:51, 8 January 2022 imported>Stashbot 107,107 bytes +180 elukey: restart hive daemons on an-coord1002 (after my last upgrade/rollback of packages the prometheus agent settings were not picked up, so no metrics)

7 January 2022

6 January 2022

5 January 2022

  • 00:59, 5 January 2022 imported>Stashbot 64,897 bytes +32,691 mwdebug-deploy@deploy1002: helmfile [codfw] DONE helmfile.d/services/mwdebug: sync on pinkunicorn

4 January 2022

  • 00:54, 4 January 2022 imported>Stashbot 32,206 bytes +32,091 marostegui@cumin1001: dbctl commit (dc=all): 'Repooling after maintenance db1098:3317', diff saved to https://phabricator.wikimedia.org/P18329 and previous config saved to /var/cache/conftool/dbconfig/20220104-005456-marostegui.json

1 January 2022

29 December 2021

  • 10:30, 29 December 2021 imported>Stashbot 664,919 bytes +126 elukey: kill tcpdump process on kubestagemaster1001 (kept a big pcap file opened that kept growing)

28 December 2021

24 December 2021

  • 20:08, 24 December 2021 imported>Stashbot 663,863 bytes +325 mforns@deploy1002: Finished deploy [airflow-dags/analytics@e282d2d]: (no justification provided) (duration: 00m 06s)
  • 00:57, 24 December 2021 imported>Stashbot 663,538 bytes +3,353 ejegg: updated fundraising CiviCRM from 47dd67f2 to aaceb4ab

23 December 2021

  • 00:04, 23 December 2021 imported>Stashbot 660,185 bytes +4,302 bking@cumin1001: END (PASS) - Cookbook sre.elasticsearch.rolling-operation (exit_code=0) restart without plugin upgrade (3 nodes at a time) for ElasticSearch cluster cloudelastic: cloudelastic cluster restart - bking@cumin1001 - T297986

21 December 2021

19 December 2021

18 December 2021

  • 13:57, 18 December 2021 imported>Stashbot 633,583 bytes +93 dcausse: restarting blazegraph on wdqs1013 (jvm stuck for 10hours)

17 December 2021

16 December 2021

  • 00:37, 16 December 2021 imported>Stashbot 600,611 bytes +14,349 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .

15 December 2021

14 December 2021

  • 01:42, 14 December 2021 imported>Stashbot 545,410 bytes +41,246 ryankemper: T297468 `sudo cookbook sre.elasticsearch.rolling-operation search_eqiad "eqiad rolling restart" --nodes-per-run 3 --start-datetime 2021-12-14T01:27:58 --task-id T297468` on `ryankemper@cumin1001` tmux `elastic_restarts`

12 December 2021

  • 14:35, 12 December 2021 imported>Stashbot 504,164 bytes +844 filippo@cumin1001: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host graphite1004.eqiad.wmnet

11 December 2021

  • 19:04, 11 December 2021 imported>Stashbot 503,320 bytes +131 andrew@cumin1001: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudvirt1028.eqiad.wmnet with OS buster
  • 00:04, 11 December 2021 imported>Stashbot 503,189 bytes +18,770 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .

10 December 2021

9 December 2021

  • 00:26, 9 December 2021 imported>Stashbot 469,358 bytes +7,967 rzl: graphite1004.mgmt: /admin1-> racadm serveraction powercycle (T297265)

8 December 2021

  • 00:51, 8 December 2021 imported>Stashbot 461,391 bytes +29,464 ebernhardson@deploy1002: Synchronized php-1.38.0-wmf.12/extensions/GrowthExperiments/includes/NewcomerTasks/AddImage/AddImageSubmissionHandler.php: backport window for 744896 (duration: 01m 05s)

7 December 2021

4 December 2021

  • 01:14, 4 December 2021 imported>Stashbot 424,523 bytes +12,137 mutante: mx2001 - did not come back from reboot, did not get IP on interface, could not start ferm, logged in via console with root password, in /etc/network/interfaces replaced all "ens5" with "ens13", rebooted again, selected previous kernel version

3 December 2021

2 December 2021

  • 01:21, 2 December 2021 imported>Stashbot 394,618 bytes +24,122 ryankemper: T280001 Rolling restart of low-traffic pybal hosts complete. All of `wcqs` is pooled and the pybal / ipvs related alerts have cleared

1 December 2021

  • 00:35, 1 December 2021 imported>Stashbot 370,496 bytes +9,705 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .

30 November 2021

  • 00:22, 30 November 2021 imported>Stashbot 360,791 bytes +17,004 mwdebug-deploy@deploy1002: helmfile [codfw] Ran 'sync' command on namespace 'mwdebug' for release 'pinkunicorn' .