You are browsing a read-only backup copy of Wikitech. The primary site can be found at wikitech.wikimedia.org

Nova Resource:Tools/SAL: Revision history

Legend: (cur) = difference with latest revision, (prev) = difference with preceding revision, m = minor edit.

28 September 2022

  • curprev 21:2321:23, 28 September 2022imported>Stashbot 25,736 bytes +391 lucaswerkmeister: on tools-sgebastion-10: run-puppet-agent # T318858

22 September 2022

  • curprev 12:3012:30, 22 September 2022imported>Stashbot 25,345 bytes +221 taavi: add TheresNoTime to the 'toollabs-trusted' gerrit group T317438

10 September 2022

  • curprev 07:3907:39, 10 September 2022imported>Stashbot 25,124 bytes +105 wm-bot2: removing instance tools-prometheus-03 - cookbook ran by taavi@runko

7 September 2022

  • curprev 10:2210:22, 7 September 2022imported>Stashbot 25,019 bytes +134 dcaro: Pushing the new toolforge builder image based on the new 0.8 buildpacks (T316854)

6 September 2022

  • curprev 08:0608:06, 6 September 2022imported>Stashbot 24,885 bytes +178 dcaro_away: Published new toolforge-bullseye0-run and toolforge-bullseye0-build images for the toolforge buildpack builder (T316854)

25 August 2022

  • curprev 10:4010:40, 25 August 2022imported>Stashbot 24,707 bytes +158 taavi: tagged new version of the python39-web container with a shell implementation of webservice-runner T293552

24 August 2022

  • curprev 12:2012:20, 24 August 2022imported>Stashbot 24,549 bytes +230 wm-bot2: deployed kubernetes component https://gitlab.wikimedia.org/repos/cloud/toolforge/ingress-nginx (eba66bc) - cookbook ran by taavi@runko

20 August 2022

18 August 2022

  • curprev 14:4514:45, 18 August 2022imported>Stashbot 23,900 bytes +207 andrewbogott: adding lucaswerkmeister as projectadmin (T314527)

17 August 2022

  • curprev 16:3416:34, 17 August 2022imported>Stashbot 23,693 bytes +210 taavi: kubectl sudo delete cm -n tool-wdml maintain-kubeusers # T315459

16 August 2022

  • curprev 17:2817:28, 16 August 2022imported>Stashbot 23,483 bytes +107 taavi: fail over docker-registry, tools-docker-registry-06->docker-registry-05

11 August 2022

  • curprev 16:5816:58, 11 August 2022imported>Stashbot 23,376 bytes +204 wm-bot2: cleaned up grid queue errors on tools-sgegrid-master - cookbook ran by taavi@runko

5 August 2022

  • curprev 15:0815:08, 5 August 2022imported>Stashbot 23,172 bytes +386 wm-bot2: removing grid node tools-sgewebgen-10-1.tools.eqiad1.wikimedia.cloud - cookbook ran by taavi@runko

3 August 2022

  • curprev 15:5115:51, 3 August 2022imported>Stashbot 22,786 bytes +262 dhinus: recreated jobs-api pods to pick up new ConfigMap

20 July 2022

  • curprev 19:3119:31, 20 July 2022imported>Stashbot 22,524 bytes +234 taavi: reboot toolserver-proxy-01 to free up disk space probably held by stale file handles

19 July 2022

  • curprev 17:5317:53, 19 July 2022imported>Stashbot 22,290 bytes +587 wm-bot2: created node tools-sgeexec-10-21.tools.eqiad1.wikimedia.cloud and added it to the grid - cookbook ran by taavi@runko

17 July 2022

  • curprev 15:5215:52, 17 July 2022imported>Stashbot 21,703 bytes +383 wm-bot2: removing grid node tools-sgeexec-10-10.tools.eqiad1.wikimedia.cloud - cookbook ran by taavi@runko

14 July 2022

13 July 2022

  • curprev 12:0912:09, 13 July 2022imported>Stashbot 21,256 bytes +123 wm-bot2: cleaned up grid queue errors on tools-sgegrid-master - cookbook ran by dcaro@vulcanus

11 July 2022

  • curprev 16:0616:06, 11 July 2022imported>Stashbot 21,133 bytes +170 wm-bot2: Increased quotas by {self.increases} (T312692) - cookbook ran by nskaggs@x1carbon

7 July 2022

  • curprev 07:3407:34, 7 July 2022imported>Stashbot 20,963 bytes +123 wm-bot2: cleaned up grid queue errors on tools-sgegrid-master - cookbook ran by dcaro@vulcanus

28 June 2022

  • curprev 17:3417:34, 28 June 2022imported>Stashbot 20,840 bytes +213 wm-bot2: cleaned up grid queue errors on tools-sgegrid-master (T311538) - cookbook ran by dcaro@vulcanus

27 June 2022

  • curprev 18:1418:14, 27 June 2022imported>Stashbot 20,627 bytes +956 taavi: restart calico, appears to have got stuck after the ca replacement operation

23 June 2022

  • curprev 17:5117:51, 23 June 2022imported>Stashbot 19,671 bytes +1,622 wm-bot2: removing grid node tools-sgeexec-0941.tools.eqiad1.wikimedia.cloud - cookbook ran by taavi@runko

22 June 2022

  • curprev 15:5415:54, 22 June 2022imported>Stashbot 18,049 bytes +524 wm-bot2: removing grid node tools-sgewebgrid-lighttpd-0917.tools.eqiad1.wikimedia.cloud - cookbook ran by taavi@runko

21 June 2022

  • curprev 15:2315:23, 21 June 2022imported>Stashbot 17,525 bytes +524 wm-bot2: removing grid node tools-sgewebgrid-lighttpd-0914.tools.eqiad1.wikimedia.cloud - cookbook ran by taavi@runko

3 June 2022

  • curprev 20:0720:07, 3 June 2022imported>Stashbot 17,001 bytes +2,706 wm-bot2: created node tools-sgeweblight-10-26.tools.eqiad1.wikimedia.cloud and added it to the grid - cookbook ran by andrew@buster

2 June 2022

  • curprev 22:2622:26, 2 June 2022imported>Stashbot 14,295 bytes +1,706 bd808: Rebooting tools-sgeweblight-10-1.tools.eqiad1.wikimedia.cloud. Node is full of jobs that are not tracked by grid master and failing to spawn new jobs sent by the scheduler

1 June 2022

  • curprev 11:1811:18, 1 June 2022imported>Stashbot 12,589 bytes +77 taavi: depool and remove tools-sgeexec-09[07-14]

31 May 2022

  • curprev 16:5116:51, 31 May 2022imported>Stashbot 12,512 bytes +106 taavi: delete tools-sgeexec-0904 for T309525 experimentation

30 May 2022

  • curprev 08:2408:24, 30 May 2022imported>Stashbot 12,406 bytes +109 taavi: depool tools-sgeexec-[0901-0909] (7 nodes total) T277653

26 May 2022

  • curprev 15:3915:39, 26 May 2022imported>Stashbot 12,297 bytes +211 wm-bot2: deployed kubernetes component https://gerrit.wikimedia.org/r/cloud/toolforge/jobs-framework-api (e6fa299) (T309146) - cookbook ran by taavi@runko

22 May 2022

  • curprev 17:0417:04, 22 May 2022imported>Stashbot 12,086 bytes +245 taavi: failover tools-redis to the updated cluster T278541

16 May 2022

  • curprev 14:0214:02, 16 May 2022imported>Stashbot 11,841 bytes +183 wm-bot2: deployed kubernetes component https://gitlab.wikimedia.org/repos/cloud/toolforge/ingress-nginx (7037eca) - cookbook ran by taavi@runko

14 May 2022

  • curprev 10:4710:47, 14 May 2022imported>Stashbot 11,658 bytes +80 taavi: hard reboot unresponsive tools-sgeexec-0940

12 May 2022

10 May 2022

  • curprev 15:1815:18, 10 May 2022imported>Stashbot 11,163 bytes +162 taavi: depool tools-k8s-worker-42 for experiments

6 May 2022

  • curprev 19:4619:46, 6 May 2022imported>Stashbot 11,001 bytes +155 bd808: Rebuilt toolforge-perl532-sssd-base & toolforge-perl532-sssd-web to add liblocale-codes-perl (T307812)

5 May 2022

  • curprev 17:2817:28, 5 May 2022imported>Stashbot 10,846 bytes +89 taavi: deploy tools-webservice 0.83 T307693

3 May 2022

  • curprev 08:2008:20, 3 May 2022imported>Stashbot 10,757 bytes +123 taavi: redis: start replication from the old cluster to the new one (T278541)

2 May 2022

25 April 2022

  • curprev 14:5614:56, 25 April 2022imported>Stashbot 10,547 bytes +180 bd808: Rebuilding all docker images to pick up toolforge-webservice v0.82 (T214343)

23 April 2022

  • curprev 16:5116:51, 23 April 2022imported>Stashbot 10,367 bytes +160 bd808: Built new perl532-sssd/{base,web} images and pushed to registry (T214343)

20 April 2022

  • curprev 16:5816:58, 20 April 2022imported>Stashbot 10,207 bytes +345 taavi: reboot toolserver-proxy-01 to free up disk space from stale file handles(?)

16 April 2022

  • curprev 18:5318:53, 16 April 2022imported>Stashbot 9,862 bytes +187 wm-bot: deployed kubernetes component https://gitlab.wikimedia.org/repos/cloud/toolforge/kubernetes-metrics (2c485e9) - cookbook ran by taavi@runko

12 April 2022

  • curprev 21:3221:32, 12 April 2022imported>Stashbot 9,675 bytes +257 bd808: Added komla to Gerrit group 'toollabs-trusted' (T305986)

10 April 2022

  • curprev 18:4318:43, 10 April 2022imported>Stashbot 9,418 bytes +152 taavi: deleted `/tmp/dwl02.out-20210915` on tools-sgebastion-07 (not touched since September, taking up 1.3G of disk space)

9 April 2022

  • curprev 15:3015:30, 9 April 2022imported>Stashbot 9,266 bytes +109 taavi: manually prune user.log on tools-prometheus-03 to free up some space on /

8 April 2022

  • curprev 10:4410:44, 8 April 2022imported>Stashbot 9,157 bytes +90 arturo: disabled debug mode on the k8s jobs-emailer component

5 April 2022

  • curprev 07:5207:52, 5 April 2022imported>Stashbot 9,067 bytes +483 wm-bot: deployed kubernetes component https://gerrit.wikimedia.org/r/cloud/toolforge/jobs-framework-api (d7d3463) - cookbook ran by arturo@nostromo

4 April 2022

  • curprev 17:0517:05, 4 April 2022imported>Stashbot 8,584 bytes +529 wm-bot: deployed kubernetes component https://gerrit.wikimedia.org/r/cloud/toolforge/jobs-framework-api (cbcfc47) - cookbook ran by arturo@nostromo

28 March 2022

  • curprev 09:3209:32, 28 March 2022imported>Stashbot 8,055 bytes +179 wm-bot: cleaned up grid queue errors on tools-sgegrid-master.tools.eqiad1.wikimedia.cloud (T304816) - cookbook ran by arturo@nostromo

15 March 2022

  • curprev 16:5716:57, 15 March 2022imported>Stashbot 7,876 bytes +410 wm-bot: build & push docker image docker-registry.tools.wmflabs.org/toolforge-jobs-framework-emailer:latest from https://gerrit.wikimedia.org/r/cloud/toolforge/jobs-framework-emailer (084ee51) - cookbook ran by arturo@nostromo

14 March 2022

  • curprev 11:4411:44, 14 March 2022imported>Stashbot 7,466 bytes +270 arturo: deploy jobs-framework-emailer 9470a5f339fd5a44c97c69ce97239aef30f5ee41 (T286135)

10 March 2022

  • curprev 09:4209:42, 10 March 2022imported>Stashbot 7,196 bytes +99 arturo: cleaned grid queue error state @ tools-sgewebgrid-generic-0902

1 March 2022

  • curprev 13:4113:41, 1 March 2022imported>Stashbot 7,097 bytes +300 dcaro: rebooting tools-sgeexec-0916 to clear any state (T302702)

28 February 2022

17 February 2022

16 February 2022

10 February 2022

9 February 2022

  • curprev 19:2919:29, 9 February 2022imported>Stashbot 4,202 bytes +652 taavi: installed tools-puppetdb-1, not configured on puppetmaster side yet T214427

8 February 2022

  • curprev 15:0115:01, 8 February 2022imported>Nhatminh01 3,550 bytes −239,922 Replaced content with "=== 2022-02-07 === * 17:37 taavi: generated authdns_acmechief ssh key and stored password in a text file in local labs/private repository (T288406) * 12:52 taavi: updated maintain-kubeusers for T301081 === 2022-02-04 === * 22:33 taavi: `root@tools-sgebastion-10:/data/project/ru_monuments/.kube# mv config old_config` # experimenting with T301015 * 21:36 taavi: clear error state from some webgrid nodes === 2022-02-03..."

7 February 2022

  • curprev 17:3717:37, 7 February 2022imported>Stashbot 243,472 bytes +237 taavi: generated authdns_acmechief ssh key and stored password in a text file in local labs/private repository (T288406)

4 February 2022

  • curprev 22:3322:33, 4 February 2022imported>Stashbot 243,235 bytes +220 taavi: `root@tools-sgebastion-10:/data/project/ru_monuments/.kube# mv config old_config` # experimenting with T301015

3 February 2022

  • curprev 09:0609:06, 3 February 2022imported>Stashbot 243,015 bytes +191 taavi: run `sudo apt-get clean` on login-buster/dev-buster to clean up disk space

30 January 2022

  • curprev 14:4114:41, 30 January 2022imported>Stashbot 242,824 bytes +248 taavi: created a neutron port with ip 172.16.2.46 for a service ip for toolforge redis automatic failover T278541

26 January 2022

  • curprev 18:3318:33, 26 January 2022imported>Stashbot 242,576 bytes +1,010 wm-bot: depooled grid node tools-sgeexec-10-10 - cookbook ran by arturo@nostromo

25 January 2022

  • curprev 11:5011:50, 25 January 2022imported>Stashbot 241,566 bytes +226 wm-bot: reconfiguring the grid by using grid-configurator - cookbook ran by arturo@nostromo

24 January 2022

  • curprev 17:4417:44, 24 January 2022imported>Stashbot 241,340 bytes +209 wm-bot: reconfiguring the grid by using grid-configurator - cookbook ran by arturo@nostromo

20 January 2022

  • curprev 17:0517:05, 20 January 2022imported>Stashbot 241,131 bytes +205 arturo: drop 9 of the 10 buster exec nodes created earlier. They didn't get DNS records

19 January 2022

  • curprev 17:3417:34, 19 January 2022imported>Stashbot 240,926 bytes +161 andrewbogott: rebooting tools-sgeexec-0913.tools.eqiad1.wikimedia.cloud to recover from (presumed) fallout from the scratch/nfs move

14 January 2022

  • curprev 19:0919:09, 14 January 2022imported>Stashbot 240,765 bytes +131 taavi: set /var/run/lighttpd as world-writable on all lighttpd webgrid nodes, T299243

12 January 2022

  • curprev 11:2711:27, 12 January 2022imported>Stashbot 240,634 bytes +218 arturo: created puppet prefix `tools-sgeweblight`, drop `tools-sgeweblig`

4 January 2022

  • curprev 17:1817:18, 4 January 2022imported>Stashbot 240,416 bytes +154 bd808: tools-acme-chief-01: sudo service acme-chief restart

31 December 2021

  • curprev 19:4819:48, 31 December 2021imported>Stashbot 240,262 bytes +110 taavi: reset grid error status on webgrid-lighttpd@tools-sgewebgrid-lighttpd-0915

28 December 2021

24 December 2021

23 December 2021

20 December 2021

14 December 2021

  • curprev 09:4609:46, 14 December 2021imported>Stashbot 232,768 bytes +126 majavah: testing delete-crashing-pods emailer component with a test tool T292925

8 December 2021

7 December 2021

  • curprev 11:1111:11, 7 December 2021imported>Stashbot 232,558 bytes +116 arturo: updated member roles in github.com/toolforge: remove brooke as owner, add dcaro

6 December 2021

  • curprev 13:2313:23, 6 December 2021imported>Stashbot 232,442 bytes +141 majavah: root@toolserver-proxy-01:~# systemctl restart apache2.service # working around T293826

4 December 2021

  • curprev 12:1812:18, 4 December 2021imported>Stashbot 232,301 bytes +109 majavah: deploying delete-crashing-pods in dry run mode T292925

28 November 2021

  • curprev 17:4617:46, 28 November 2021imported>Stashbot 232,192 bytes +165 andrewbogott: moving tools-k8s-etcd-13 to cloudvirt1020; cloudvirt1018 (its old host) has a degraded raid which is affecting performance

19 November 2021

  • curprev 13:1613:16, 19 November 2021imported>Stashbot 232,027 bytes +97 majavah: manually add 3 project members after ldap issues were fixed

16 November 2021

  • curprev 12:3112:31, 16 November 2021imported>Stashbot 231,930 bytes +197 majavah: uploading calico 3.21.0 to the internal docker registry T292698

11 November 2021

5 November 2021

29 October 2021

  • curprev 23:5823:58, 29 October 2021imported>Stashbot 231,552 bytes +129 andrewbogott: deleting all files older than 14 days in /srv/tools/shared/tools/project/.shared/cache

28 October 2021

  • curprev 12:4212:42, 28 October 2021imported>Stashbot 231,423 bytes +122 arturo: set `allow-snippet-annotations: "false"` for ingress-nginx (T294330)

26 October 2021

  • curprev 18:0018:00, 26 October 2021imported>Stashbot 231,301 bytes +238 majavah: deleting legacy ingresses for tools.wmflabs.org urls

25 October 2021

  • curprev 14:3314:33, 25 October 2021imported>Stashbot 231,063 bytes +262 majavah: copy nginx-ingress controller v1.0.4 to internal registry T292771

22 October 2021

  • curprev 15:3515:35, 22 October 2021imported>Stashbot 230,801 bytes +240 majavah: remove "^tools-k8s-master-[0-9]+\.tools\.eqiad\.wmflabs$" from authorized_regexes for the main certificate

21 October 2021

20 October 2021

  • curprev 15:4115:41, 20 October 2021imported>Stashbot 230,488 bytes +276 majavah: removing toollabs-webservice from grid exec and master nodes where it's not needed and not managed by puppet

15 October 2021

  • curprev 15:0115:01, 15 October 2021imported>Stashbot 230,212 bytes +129 arturo: add updated ingress-nginx docker image in the registry (v1.0.1) for T293472

7 October 2021

  • curprev 09:1309:13, 7 October 2021imported>Stashbot 230,083 bytes +247 majavah: disabling settings api, now that all pod presets are gone T279106

6 October 2021

  • curprev 06:4606:46, 6 October 2021imported>Stashbot 229,836 bytes +154 majavah: taavi@toolserver-proxy-01:~$ sudo systemctl restart apache2.service # see if it helps with toolserver.org ssl alerts

3 October 2021

  • curprev 21:3121:31, 3 October 2021imported>Stashbot 229,682 bytes +254 bstorm: rebuilding buster containers since they are also affected T291387 T292355

1 October 2021

  • curprev 21:5921:59, 1 October 2021imported>Stashbot 229,428 bytes +347 bd808: clush -w @all -b 'sudo sed -i "s#mozilla/DST_Root_CA_X3.crt#!mozilla/DST_Root_CA_X3.crt#" /etc/ca-certificates.conf && sudo update-ca-certificates' for T292289

29 September 2021

  • curprev 22:3922:39, 29 September 2021imported>Stashbot 229,081 bytes +265 bstorm: finished deploy of the toollabs-webservice 0.77 and updating labels across the k8s cluster to match

27 September 2021

  • curprev 16:1916:19, 27 September 2021imported>Stashbot 228,816 bytes +257 majavah: deploy volume-admission fix for containers for some volumes mounted

23 September 2021

  • curprev 17:2017:20, 23 September 2021imported>Stashbot 228,559 bytes +118 majavah: deploying new maintain-kubeusers for lack of podpresets T279106

22 September 2021

  • curprev 18:0618:06, 22 September 2021imported>Stashbot 228,441 bytes +257 bstorm: launching tools-nfs-test-client-01 to run a "fair" test battery against T291406

20 September 2021

  • curprev 12:4412:44, 20 September 2021imported>Stashbot 228,184 bytes +130 majavah: deploying volume-admission to tools, should not affect anything yet T279106

15 September 2021

14 September 2021

  • curprev 10:3610:36, 14 September 2021imported>Stashbot 227,987 bytes +104 arturo: add toolforge-jobs-framework-cli v5 to aptly buster-tools/toolsbeta

13 September 2021

11 September 2021

10 September 2021

  • curprev 23:2623:26, 10 September 2021imported>Stashbot 227,529 bytes +359 bstorm: cleared error state for tools-sgeexec-0907.tools.eqiad.wmflabs

9 September 2021

  • curprev 16:2016:20, 9 September 2021imported>Stashbot 227,170 bytes +155 arturo: 70017ec0ac root@tools-k8s-control-3:~# kubectl apply -f /etc/kubernetes/psp/base-pod-security-policies.yaml

7 September 2021

6 September 2021

3 September 2021

2 September 2021

  • curprev 01:0201:02, 2 September 2021imported>Stashbot 226,557 bytes +140 bstorm: deployed new version of maintain-kubeusers with new count quotas for new tools T286784

20 August 2021

  • curprev 19:1019:10, 20 August 2021imported>Stashbot 226,417 bytes +236 majavah: rebuilding node12-sssd/{base,web} to use debian packaged npm 7

18 August 2021

  • curprev 21:3221:32, 18 August 2021imported>Stashbot 226,181 bytes +203 bstorm: rebooted tools-sgecron-01 due to RAM filling up and killing everything

16 August 2021

  • curprev 17:0017:00, 16 August 2021imported>Stashbot 225,978 bytes +316 majavah: remove and re-add toollabs-webservice 0.75 on stretch-toolsbeta repository

15 August 2021

  • curprev 17:3017:30, 15 August 2021imported>Stashbot 225,662 bytes +546 majavah: deploying updated jobs-framework-api container list to include bullseye images

12 August 2021

7 August 2021

  • curprev 05:5905:59, 7 August 2021imported>Stashbot 224,739 bytes +134 majavah: restart nginx on toolserver-proxy-01 to see if that helps with the flapping icinga certificate expiry check

6 August 2021

  • curprev 16:1716:17, 6 August 2021imported>Stashbot 224,605 bytes +104 bstorm: failed over to tools-docker-registry-06 (which has more space) T288229
  • curprev 00:4300:43, 6 August 2021imported>Stashbot 224,501 bytes +430 bstorm: set up sync between the new registry host and the existing one T288229

29 July 2021

  • curprev 18:0418:04, 29 July 2021imported>Stashbot 224,071 bytes +133 majavah: reset sul account mapping on striker for developer account "Derek Zax" T287369

28 July 2021

  • curprev 21:3321:33, 28 July 2021imported>Stashbot 223,938 bytes +111 majavah: add mdipietro as projectadmin and to sudo policy T287287

27 July 2021

  • curprev 16:2016:20, 27 July 2021imported>Stashbot 223,827 bytes +84 bstorm: built new php images with python2 on board T287421
  • curprev 00:0400:04, 27 July 2021imported>Stashbot 223,743 bytes +381 bstorm: deploy a version of the php3.7 web image that includes the python2 package with tag :testing T287421

23 July 2021

  • curprev 07:1507:15, 23 July 2021imported>Stashbot 223,362 bytes +109 majavah: restart nginx on tools-static-14 to see if it helps with fontcdn issues

22 July 2021

  • curprev 23:3523:35, 22 July 2021imported>Stashbot 223,253 bytes +336 bstorm: deleted tools-sgebastion-09 since it has been shut off since March anyway

21 July 2021

  • curprev 20:0120:01, 21 July 2021imported>Stashbot 222,917 bytes +817 bstorm: deployed new maintain-kubeusers to toolforge T285011

20 July 2021

  • curprev 18:4218:42, 20 July 2021imported>Stashbot 222,100 bytes +451 majavah: deploying systemd security tools on toolforge public stretch machines T287004

19 July 2021

  • curprev 23:2423:24, 19 July 2021imported>Stashbot 221,649 bytes +248 bstorm: applied matchPolicy: equivalent to tools ingress validation controller T280360

16 July 2021

  • curprev 14:0414:04, 16 July 2021imported>Stashbot 221,401 bytes +352 arturo: deployed jobs-framework-api 42b7a885a5bc1bf00c300e8d77bd92e1430a8327 (T286132)

15 July 2021

  • curprev 16:1216:12, 15 July 2021imported>Stashbot 221,049 bytes +417 arturo: deploy toolforge-jobs-framework-api git version d85d93ee1c5d4be6a526cf83e806b2679dde3875 (T285944, T286107, T285979, T286485, T286107)

14 July 2021

  • curprev 23:2923:29, 14 July 2021imported>Stashbot 220,632 bytes +250 bstorm: mounted nfs on tools-services-05 and backing up aptly to NFS dir T286003

12 July 2021

2 July 2021

  • curprev 18:4618:46, 2 July 2021imported>Stashbot 220,239 bytes +99 bstorm: cleared error state for tools-sgeexec-0940.tools.eqiad.wmflabs

1 July 2021

29 June 2021

  • curprev 21:5821:58, 29 June 2021imported>Stashbot 219,695 bytes +594 bstorm: clearing one errored queue and a stack of discarded jobs

15 June 2021

14 June 2021

  • curprev 22:2122:21, 14 June 2021imported>Stashbot 218,920 bytes +229 bstorm: push docker-registry.tools.wmflabs.org/toolforge-python37-sssd-web:testing to test staged os.execv (and other patches) using toolsbeta toollabs-webservice version 0.75 T282975

13 June 2021

  • curprev 08:1508:15, 13 June 2021imported>Stashbot 218,691 bytes +124 majavah: clear grid error state from tools-sgeexec-0907, tools-sgeexec-0916, tools-sgeexec-0940

12 June 2021

  • curprev 14:3914:39, 12 June 2021imported>Stashbot 218,567 bytes +267 majavah: remove nonexistent tools-prometheus-04 and add tools-prometheus-05 to hiera key "prometheus_nodes"

10 June 2021

  • curprev 17:3817:38, 10 June 2021imported>Stashbot 218,300 bytes +104 majavah: clear error state from tools-sgeexec-0907, task@tools-sgeexec-0939

9 June 2021

  • curprev 13:5713:57, 9 June 2021imported>Stashbot 218,196 bytes +135 majavah: clear error state from exec nodes tools-sgeexec-0913, tools-sgeexec-0936, task@tools-sgeexec-0940

7 June 2021

  • curprev 18:3918:39, 7 June 2021imported>Stashbot 218,061 bytes +334 bstorm: cleaning up more error conditions on grid queues

4 June 2021

  • curprev 21:3021:30, 4 June 2021imported>Stashbot 217,727 bytes +193 bstorm: deleting "tools-k8s-ingress-3", "tools-k8s-ingress-2", "tools-k8s-ingress-1" T264221

3 June 2021

  • curprev 18:2718:27, 3 June 2021imported>Stashbot 217,534 bytes +181 majavah: renew prometheus kubernetes certificate T280301

1 June 2021

  • curprev 10:1010:10, 1 June 2021imported>Stashbot 217,353 bytes +238 majavah: properly clean up deleted vms tools-k8s-haproxy-[1,2], tools-checker-03 from puppet after using the wrong fqdn first time

30 May 2021

  • curprev 18:5818:58, 30 May 2021imported>Stashbot 217,115 bytes +75 majavah: clear grid error state from 14 queues

27 May 2021

  • curprev 18:0318:03, 27 May 2021imported>Stashbot 217,040 bytes +283 bstorm: adjusted profile::wmcs::kubeadm::etcd_latency_ms from 30 back to the default (10)

24 May 2021

  • curprev 10:3610:36, 24 May 2021imported>Stashbot 216,757 bytes +230 arturo: rebased labs/private.git after merge conflict

22 May 2021

  • curprev 14:4714:47, 22 May 2021imported>Stashbot 216,527 bytes +389 majavah: manually remove jeh admin certificates and from maintain-kubeusers configmap T282725

21 May 2021

20 May 2021

  • curprev 17:0517:05, 20 May 2021imported>Stashbot 215,512 bytes +488 Majavah: pool tools-k8s-ingress-5 as an ingress node, depool ingress-1 T264221

19 May 2021

16 May 2021

  • curprev 16:5216:52, 16 May 2021imported>Stashbot 214,761 bytes +136 Majavah: clear error state from tools-sgeexec-0905 tools-sgeexec-0907 tools-sgeexec-0936 tools-sgeexec-0941

14 May 2021

  • curprev 19:1819:18, 14 May 2021imported>Stashbot 214,625 bytes +379 bstorm: adjusting the rate limits for bastions nfs_write upward a lot to make NFS writes faster now that the cluster is finally using 10Gb on the backend and frontend T218338

12 May 2021

11 May 2021

  • curprev 17:1717:17, 11 May 2021imported>Stashbot 213,862 bytes +593 Majavah: shutdown and delete tools-checker-03 T278540

10 May 2021

9 May 2021

  • curprev 06:5506:55, 9 May 2021imported>Stashbot 212,514 bytes +79 Majavah: clear error state from tools-sgeexec-0916

8 May 2021

  • curprev 10:5710:57, 8 May 2021imported>Stashbot 212,435 bytes +214 Majavah: import docker image k8s.gcr.io/ingress-nginx/controller:v0.46.0 to local registry as docker-registry.tools.wmflabs.org/nginx-ingress-controller:v0.46.0 T264221

7 May 2021

  • curprev 18:0718:07, 7 May 2021imported>Stashbot 212,221 bytes +665 Majavah: generate and add k8s haproxy keepalived password (profile::toolforge::k8s::haproxy::keepalived_password) to private puppet repo

6 May 2021

  • curprev 14:4314:43, 6 May 2021imported>Stashbot 211,556 bytes +296 Majavah: clear error states from all currently erroring exec nodes

5 May 2021

  • curprev 19:2719:27, 5 May 2021imported>Stashbot 211,260 bytes +120 andrewbogott: adding taavi as a sudo root to project toolforge for T278390

4 May 2021

  • curprev 15:2315:23, 4 May 2021imported>Stashbot 211,140 bytes +151 arturo: upgrading exim4-daemon-heavy in tools-mail-03

3 May 2021

  • curprev 16:2416:24, 3 May 2021imported>Stashbot 210,989 bytes +360 dcaro: started tools-sgeexec-0907, was stuck on initramfs due to an unclean fs (/dev/vda3, root), ran fsck manually fixing all the errors and booted up correctly after (T280641)

29 April 2021

  • curprev 18:2318:23, 29 April 2021imported>Stashbot 210,629 bytes +178 bstorm: removing one more etcd node via cookbook T279723

27 April 2021

  • curprev 16:4016:40, 27 April 2021imported>Stashbot 210,451 bytes +170 bstorm: deleted all the errored out grid jobs stuck in queue wait

26 April 2021

  • curprev 12:1712:17, 26 April 2021imported>Stashbot 210,281 bytes +110 arturo: allowing more tools into the legacy redirector (T281003)

22 April 2021

20 April 2021

  • curprev 22:2022:20, 20 April 2021imported>Stashbot 209,964 bytes +818 bd808: `clush -w @all -b "sudo exiqgrep -z -i | xargs sudo exim -Mt"`

19 April 2021

  • curprev 10:5310:53, 19 April 2021imported>Stashbot 209,146 bytes +205 dcaro: reverting setting prometheus data source in grafana to 'server', can't connect,

16 April 2021

  • curprev 23:1523:15, 16 April 2021imported>Stashbot 208,941 bytes +622 bstorm: cleaned up all source files for the grid with the old domain name to enable future node creation T277653

13 April 2021

  • curprev 13:2613:26, 13 April 2021imported>Stashbot 208,319 bytes +513 dcaro: upgrade puppet and python-wmflib on tools-prometheus-03

11 April 2021

  • curprev 16:0716:07, 11 April 2021imported>Stashbot 207,806 bytes +194 bstorm: cleared E state from tools-sgeexec-0917 tools-sgeexec-0933 tools-sgeexec-0934 tools-sgeexec-0937 from failures of jobs 761759, 815031, 815056, 855676, 898936

8 April 2021

  • curprev 18:2518:25, 8 April 2021imported>Stashbot 207,612 bytes +706 bstorm: cleaned up the deprecated entries in /data/project/.system_sge/gridengine/etc/submithosts for tools-sgegrid-master and tools-sgegrid-shadow using the old fqdns T277653

7 April 2021

  • curprev 04:3504:35, 7 April 2021imported>Stashbot 206,906 bytes +182 andrewbogott: replacing the mx record '10 mail.tools.wmcloud.org' with '10 mail.tools.wmcloud.org.' — trying to fix axfr for the tools.wmcloud.org zone

6 April 2021

  • curprev 15:1615:16, 6 April 2021imported>Stashbot 206,724 bytes +1,295 bstorm: cleared queue state since a few had "errored" for failed jobs.

5 April 2021

  • curprev 17:0217:02, 5 April 2021imported>Stashbot 205,429 bytes +205 bstorm: chowned the data volume for the docker registry to docker-registry:docker-registry

1 April 2021

  • curprev 20:4320:43, 1 April 2021imported>Stashbot 205,224 bytes +555 bstorm: cleared error state from the grid queues caused by unspecified job errors

31 March 2021

  • curprev 15:5715:57, 31 March 2021imported>Stashbot 204,669 bytes +891 arturo: rebooting `tools-mail-03` after enabling NFS (T267082, T278538)

30 March 2021

  • curprev 16:1516:15, 30 March 2021imported>Stashbot 203,778 bytes +821 bstorm: added `labstore::traffic_shaping::egress: 800mbps` to tools-static prefix T278539

28 March 2021

  • curprev 19:3119:31, 28 March 2021imported>Stashbot 202,957 bytes +127 legoktm: legoktm@tools-sgebastion-08:~$ sudo qdel -f 9999704 # T278645

27 March 2021

26 March 2021

  • curprev 12:2112:21, 26 March 2021imported>Stashbot 202,749 bytes +136 arturo: shutdown tools-package-builder-02 (stretch), we keep -03 which is buster (T275864)

25 March 2021

  • curprev 19:3019:30, 25 March 2021imported>Stashbot 202,613 bytes +909 bstorm: forced deletion of all jobs stuck in a deleting state T277653

24 March 2021

  • curprev 12:4612:46, 24 March 2021imported>Stashbot 201,704 bytes +1,273 arturo: shutoff the old stretch VMs `tools-docker-registry-03` and `tools-docker-registry-04` (T278303)

23 March 2021

  • curprev 12:4612:46, 23 March 2021imported>Stashbot 200,431 bytes +421 arturo: aborrero@tools-sgegrid-master:~$ sudo systemctl restart gridengine-master.service

18 March 2021

  • curprev 19:2419:24, 18 March 2021imported>Stashbot 200,010 bytes +868 bstorm: set profile::toolforge::infrastructure across the entire project with login_server set on the bastion and exec node-related prefixes
  • curprev 01:4601:46, 18 March 2021imported>Stashbot 199,142 bytes +341 bstorm: killed the toolschecker cron job, which had an LDAP error, and ran it again by hand

16 March 2021

12 March 2021

11 March 2021

10 March 2021

  • curprev 10:5610:56, 10 March 2021imported>Stashbot 198,019 bytes +96 arturo: briefly stopped VM tools-k8s-etcd-7 to disable VMX cpu flag

9 March 2021

  • curprev 13:3113:31, 9 March 2021imported>Stashbot 197,923 bytes +261 arturo: hard-reboot tools-docker-registry-04 because of issues related to T276922

5 March 2021

4 March 2021

  • curprev 11:2511:25, 4 March 2021imported>Stashbot 197,523 bytes +219 arturo: rebooted tools-sgewebgrid-generic-0901, repool it again

3 March 2021

  • curprev 15:1715:17, 3 March 2021imported>Stashbot 197,304 bytes +471 arturo: shutting down tools-sgebastion-07 in an attempt to fix nova state and finish hypervisor migration

2 March 2021

  • curprev 15:2415:24, 2 March 2021imported>Stashbot 196,833 bytes +238 bstorm: depooling tools-sgewebgrid-lighttpd-0914.tools.eqiad.wmflabs for reboot. It isn't communicating right

27 February 2021

  • curprev 02:2302:23, 27 February 2021imported>Stashbot 196,595 bytes +252 bstorm: deployed typo fix to maintain-kubeusers in an innocent effort to make the weekend better T275910

26 February 2021

  • curprev 22:0422:04, 26 February 2021imported>Stashbot 196,343 bytes +338 bstorm: cleaned up grid jobs 1230666,1908277,1908299,2441500,2441513

24 February 2021

  • curprev 18:3018:30, 24 February 2021imported>Stashbot 196,005 bytes +212 bd808: `sudo wmcs-openstack role remove --user zfilipin --project tools user` T267313

23 February 2021

  • curprev 23:1123:11, 23 February 2021imported>Stashbot 195,793 bytes +227 bstorm: draining a bunch of k8s workers to clean up after dumps changes T272397

22 February 2021

19 February 2021

  • curprev 12:3112:31, 19 February 2021imported>Stashbot 194,925 bytes +100 arturo: deploying new version of toolforge ingress admission controller

17 February 2021

  • curprev 21:2621:26, 17 February 2021imported>Stashbot 194,825 bytes +118 bstorm: deleted tools-puppetdb-01 since it is unused at this time (and undersized anyway)

4 February 2021

26 January 2021

  • curprev 16:2716:27, 26 January 2021imported>Stashbot 194,636 bytes +110 bd808: Hard reboot of tools-sgeexec-0906 via Horizon for T272978

22 January 2021

  • curprev 09:5909:59, 22 January 2021imported>Stashbot 194,526 bytes +146 dcaro: added the record redis.svc.tools.eqiad1.wikimedia.cloud pointing to tools-redis1003 (T272679)

21 January 2021

19 January 2021

  • curprev 22:5722:57, 19 January 2021imported>Stashbot 194,278 bytes +503 bstorm: truncated 75GB error log /data/project/robokobot/virgule.err T272247

14 January 2021

  • curprev 20:5620:56, 14 January 2021imported>Stashbot 193,775 bytes +367 bstorm: setting bastions to have mostly-uncapped egress network and 40MBps nfs_read for better shared use

13 January 2021

  • curprev 10:0210:02, 13 January 2021imported>Stashbot 193,408 bytes +107 arturo: delete floating IP allocation 185.15.56.245 (T271867)

12 January 2021

  • curprev 18:1618:16, 12 January 2021imported>Stashbot 193,301 bytes +134 bstorm: deleted wedged CSR tool-adhs-wde to get maintain-kubeusers working again T271842

5 January 2021

  • curprev 18:4918:49, 5 January 2021imported>Stashbot 193,167 bytes +134 bstorm: changing the limits on k8s etcd nodes again, so disabling puppet on them T267966

4 January 2021

  • curprev 18:2118:21, 4 January 2021imported>Stashbot 193,033 bytes +191 bstorm: ran 'sudo systemctl stop getty@ttyS1.service && sudo systemctl disable getty@ttyS1.service' on tools-k8s-etcd-5 I have no idea why that keeps coming back.

22 December 2020

  • curprev 18:2218:22, 22 December 2020imported>Stashbot 192,842 bytes +190 bstorm: rebooting the grid master because it is misbehaving following the NFS outage

18 December 2020

17 December 2020

  • curprev 21:4221:42, 17 December 2020imported>Stashbot 192,543 bytes +2,476 bstorm: doing the same procedure to increase the timeouts more T267966

11 December 2020

  • curprev 18:2918:29, 11 December 2020imported>Stashbot 190,067 bytes +1,158 bstorm: certificatesigningrequest.certificates.k8s.io "tool-production-error-tasks-metrics" deleted to stop maintain-kubeusers issues

10 December 2020

8 December 2020

  • curprev 19:0119:01, 8 December 2020imported>Stashbot 187,730 bytes +140 bstorm: pushed updated calico node image (v3.14.0) to internal docker registry as well T269016

7 December 2020

  • curprev 22:5622:56, 7 December 2020imported>Stashbot 187,590 bytes +182 bstorm: pushed updated local copies of the typha, calico-cni and calico-pod2daemon-flexvol images to the tools internal registry T269016

3 December 2020

  • curprev 09:1809:18, 3 December 2020imported>Stashbot 187,408 bytes +312 arturo: restarted kubelet systemd service on tools-k8s-worker-38. Node was NotReady, complaining about 'use of closed network connection'

28 November 2020

  • curprev 23:3523:35, 28 November 2020imported>Stashbot 187,096 bytes +326 Krenair: Re-scheduled 4 continuous jobs from tools-sgeexec-0908 as it appears to be broken, at about 23:20 UTC

24 November 2020

10 November 2020

2 November 2020

29 October 2020

  • curprev 21:3321:33, 29 October 2020imported>Stashbot 186,307 bytes +489 legoktm: published docker-registry.tools.wmflabs.org/toolbeta-test image (T265681)

28 October 2020

  • curprev 23:4223:42, 28 October 2020imported>Stashbot 185,818 bytes +363 bstorm: dramatically elevated the egress cap on tools-k8s-ingress nodes that were affected by the NFS settings T266506

23 October 2020

  • curprev 22:2222:22, 23 October 2020imported>Stashbot 185,455 bytes +115 legoktm: imported pack_0.14.2-1_amd64.deb into buster-tools (T266270)

21 October 2020

  • curprev 17:5817:58, 21 October 2020imported>Stashbot 185,340 bytes +141 legoktm: pushed toolforge-buster0-{build,run}:latest images to docker registry

15 October 2020

  • curprev 22:0022:00, 15 October 2020imported>Stashbot 185,199 bytes +355 bstorm: manually removing nscd from tools-sgebastion-08 and running puppet

14 October 2020

  • curprev 21:0021:00, 14 October 2020imported>Stashbot 184,844 bytes +753 andrewbogott: repooling tools-sgewebgrid-generic-0901 and tools-sgewebgrid-lighttpd-0915

10 October 2020

  • curprev 17:0717:07, 10 October 2020imported>Stashbot 184,091 bytes +123 bstorm: cleared errors on tools-sgeexec-0912.tools.eqiad.wmflabs to get the queue moving again

8 October 2020

  • curprev 17:0717:07, 8 October 2020imported>Stashbot 183,968 bytes +103 bstorm: rebuilding docker images with locales-all T263339

6 October 2020

2 October 2020

  • curprev 21:0921:09, 2 October 2020imported>Stashbot 183,631 bytes +281 bstorm: rebooting tools-k8s-worker-70 because it seems to be unable to recover from an old NFS disconnect

1 October 2020

30 September 2020

23 September 2020

  • curprev 21:3821:38, 23 September 2020imported>Stashbot 182,914 bytes +111 bstorm: ran an 'apt clean' across the fleet to get ahead of the new locale install

18 September 2020

  • curprev 19:4119:41, 18 September 2020imported>Stashbot 182,803 bytes +1,384 andrewbogott: repooling tools-k8s-worker-30, 33, 34, 57, 60
  • curprev 01:0001:00, 18 September 2020imported>Stashbot 181,419 bytes +1,961 andrewbogott: depooling tools-sgeexec-0917, tools-sgeexec-0918, tools-sgeexec-0919, tools-sgeexec-0920 for flavor update

16 September 2020

  • curprev 23:2023:20, 16 September 2020imported>Stashbot 179,458 bytes +512 andrewbogott: repooled tools-sgeexec-0941 and tools-sgeexec-0939 for move to ceph

10 September 2020

9 September 2020

  • curprev 11:1211:12, 9 September 2020imported>Stashbot 178,587 bytes +560 arturo: new ingress nodes added to the cluster, and tainted/labeled per the docs https://wikitech.wikimedia.org/wiki/Portal:Toolforge/Admin/Kubernetes/Deploying#ingress_nodes (T250172)

8 September 2020
