Nova Resource:Tools/SAL: Difference between revisions
imported>Stashbot (taavi: deleted `/tmp/dwl02.out-20210915` on tools-sgebastion-07 (not touched since september, taking up 1.3G of disk space))
imported>Stashbot (wm-bot2: deployed kubernetes component https://gitlab.wikimedia.org/repos/cloud/toolforge/ingress-nginx (7037eca) - cookbook ran by taavi@runko)
(12 intermediate revisions by the same user not shown)
Revision as of 14:02, 16 May 2022
2022-05-16
- 14:02 wm-bot2: deployed kubernetes component https://gitlab.wikimedia.org/repos/cloud/toolforge/ingress-nginx (7037eca) - cookbook ran by taavi@runko
2022-05-14
- 10:47 taavi: hard reboot unresponsive tools-sgeexec-0940
2022-05-12
- 12:36 taavi: re-enable CronJobControllerV2 T308205
- 09:28 taavi: deploy jobs-api update T308204
- 09:15 wm-bot2: build & push docker image docker-registry.tools.wmflabs.org/toolforge-jobs-framework-api:latest from https://gerrit.wikimedia.org/r/cloud/toolforge/jobs-framework-api (e6fa299) (T308204) - cookbook ran by taavi@runko
2022-05-10
- 15:18 taavi: depool tools-k8s-worker-42 for experiments
- 13:54 taavi: enable distro-wikimedia unattended upgrades T290494
2022-05-06
- 19:46 bd808: Rebuilt toolforge-perl532-sssd-base & toolforge-perl532-sssd-web to add liblocale-codes-perl (T307812)
2022-05-05
- 17:28 taavi: deploy tools-webservice 0.83 T307693
2022-05-03
- 08:20 taavi: redis: start replication from the old cluster to the new one (T278541)
2022-05-02
- 08:54 taavi: restart acme-chief.service T307333
2022-04-25
- 14:56 bd808: Rebuilding all docker images to pick up toolforge-webservice v0.82 (T214343)
- 14:46 bd808: Building toolforge-webservice v0.82
2022-04-23
- 16:51 bd808: Built new perl532-sssd/{base,web} images and pushed to registry (T214343)
2022-04-20
- 16:58 taavi: reboot toolserver-proxy-01 to free up disk space from stale file handles(?)
- 07:51 wm-bot: build & push docker image docker-registry.tools.wmflabs.org/toolforge-jobs-framework-api:latest from https://gerrit.wikimedia.org/r/cloud/toolforge/jobs-framework-api (8f37a04) - cookbook ran by taavi@runko
2022-04-16
- 18:53 wm-bot: deployed kubernetes component https://gitlab.wikimedia.org/repos/cloud/toolforge/kubernetes-metrics (2c485e9) - cookbook ran by taavi@runko
2022-04-12
- 21:32 bd808: Added komla to Gerrit group 'toollabs-trusted' (T305986)
- 21:27 bd808: Added komla to 'roots' sudoers policy (T305986)
- 21:24 bd808: Add komla as projectadmin (T305986)
2022-04-10
- 18:43 taavi: deleted `/tmp/dwl02.out-20210915` on tools-sgebastion-07 (not touched since september, taking up 1.3G of disk space)
2022-04-09
- 15:30 taavi: manually prune user.log on tools-prometheus-03 to free up some space on /
2022-04-08
- 10:44 arturo: disabled debug mode on the k8s jobs-emailer component
2022-04-05
- 07:52 wm-bot: deployed kubernetes component https://gerrit.wikimedia.org/r/cloud/toolforge/jobs-framework-api (d7d3463) - cookbook ran by arturo@nostromo
- 07:44 wm-bot: build & push docker image docker-registry.tools.wmflabs.org/toolforge-jobs-framework-api:latest from https://gerrit.wikimedia.org/r/cloud/toolforge/jobs-framework-api (d7d3463) - cookbook ran by arturo@nostromo
- 07:21 arturo: deploying toolforge-jobs-framework-cli v7
2022-04-04
- 17:05 wm-bot: deployed kubernetes component https://gerrit.wikimedia.org/r/cloud/toolforge/jobs-framework-api (cbcfc47) - cookbook ran by arturo@nostromo
- 16:56 wm-bot: build & push docker image docker-registry.tools.wmflabs.org/toolforge-jobs-framework-api:latest from https://gerrit.wikimedia.org/r/cloud/toolforge/jobs-framework-api (cbcfc47) - cookbook ran by arturo@nostromo
- 09:28 arturo: deployed toolforge-jobs-framework-cli v6 into aptly and installed it on buster bastions
2022-03-28
- 09:32 wm-bot: cleaned up grid queue errors on tools-sgegrid-master.tools.eqiad1.wikimedia.cloud (T304816) - cookbook ran by arturo@nostromo
2022-03-15
- 16:57 wm-bot: build & push docker image docker-registry.tools.wmflabs.org/toolforge-jobs-framework-emailer:latest from https://gerrit.wikimedia.org/r/cloud/toolforge/jobs-framework-emailer (084ee51) - cookbook ran by arturo@nostromo
- 11:24 arturo: cleared error state on queue continuous@tools-sgeexec-0939.tools.eqiad.wmflabs (a job took a very long time to be scheduled...)
2022-03-14
- 11:44 arturo: deploy jobs-framework-emailer 9470a5f (T286135)
- 10:48 dcaro: pushed v0.33.2 tekton control and webhook images, and bash 5.1.4 to the local repo (T297090)
2022-03-10
- 09:42 arturo: cleaned grid queue error state @ tools-sgewebgrid-generic-0902
2022-03-01
- 13:41 dcaro: rebooting tools-sgeexec-0916 to clear any state (T302702)
- 12:11 dcaro: Cleared error state queues for sgeexec-0916 (T302702)
- 10:23 arturo: tools-sgeexec-0913/0916 are depooled, queue errors. Reboot them and clean errors by hand
2022-02-28
- 08:02 taavi: reboot sgeexec-0916
- 07:49 taavi: depool tools-sgeexec-0916.tools as it is out of disk space on /
2022-02-17
- 08:23 taavi: deleted tools-clushmaster-02
- 08:14 taavi: made tools-puppetmaster-02 its own client to fix `puppet node deactivate` puppetdb access
2022-02-16
- 00:12 bd808: Image builds completed.
2022-02-15
- 23:17 bd808: Image builds failed in buster php image with an apt error. The error looks transient, so starting builds over.
- 23:06 bd808: Started full rebuild of Toolforge containers to pick up webservice 0.81 and other package updates in tmux session on tools-docker-imagebuilder-01
- 22:58 bd808: `sudo apt-get update && sudo apt-get install toolforge-webservice` on all bastions to pick up 0.81
- 22:50 bd808: Built new toollabs-webservice 0.81
- 18:43 bd808: Enabled puppet on tools-proxy-05
- 18:38 bd808: Disabled puppet on tools-proxy-05 for manual testing of nginx config changes
- 18:21 taavi: delete tools-package-builder-03
- 11:49 arturo: invalidate sssd cache in all bastions to debug T301736
- 11:16 arturo: purge debian package `unscd` on tools-sgebastion-10/11 for T301736
- 11:15 arturo: reboot tools-sgebastion-10 for T301736
2022-02-10
- 15:07 taavi: shutdown tools-clushmaster-02 T298191
- 13:25 wm-bot: trying to join node tools-sgewebgen-10-2 to the grid cluster in tools. - cookbook ran by arturo@nostromo
- 13:24 wm-bot: trying to join node tools-sgewebgen-10-1 to the grid cluster in tools. - cookbook ran by arturo@nostromo
- 13:07 wm-bot: trying to join node tools-sgeweblight-10-5 to the grid cluster in tools. - cookbook ran by arturo@nostromo
- 13:06 wm-bot: trying to join node tools-sgeweblight-10-4 to the grid cluster in tools. - cookbook ran by arturo@nostromo
- 13:05 wm-bot: trying to join node tools-sgeweblight-10-3 to the grid cluster in tools. - cookbook ran by arturo@nostromo
- 13:03 wm-bot: trying to join node tools-sgeweblight-10-2 to the grid cluster in tools. - cookbook ran by arturo@nostromo
- 12:54 wm-bot: trying to join node tools-sgeweblight-10-1.tools.eqiad1.wikimedia.cloud to the grid cluster in tools. - cookbook ran by arturo@nostromo
- 08:45 taavi: set `profile::base::manage_ssh_keys: true` globally T214427
- 08:16 taavi: enable puppetdb and re-enable puppet with puppetdb ssh key management disabled (profile::base::manage_ssh_keys: false) - T214427
- 08:06 taavi: disable puppet globally for enabling puppetdb T214427
2022-02-09
- 19:29 taavi: installed tools-puppetdb-1, not configured on puppetmaster side yet T214427
- 18:56 wm-bot: pooled 10 grid nodes tools-sgeweblight-10-[1-5],tools-sgewebgen-10-[1,2],tools-sgeexec-10-[1-10] (T277653) - cookbook ran by arturo@nostromo
- 18:30 wm-bot: pooled 9 grid nodes tools-sgeexec-10-[2-10],tools-sgewebgen-[3,15] - cookbook ran by arturo@nostromo
- 18:25 arturo: ignore last message
- 18:24 wm-bot: pooled 9 grid nodes tools-sgeexec-10-[2-10],tools-sgewebgen-[3,15] - cookbook ran by arturo@nostromo
- 14:04 taavi: created tools-cumin-1/toolsbeta-cumin-1 T298191
2022-02-07
- 17:37 taavi: generated authdns_acmechief ssh key and stored password in a text file in local labs/private repository (T288406)
- 12:52 taavi: updated maintain-kubeusers for T301081
2022-02-04
- 22:33 taavi: `root@tools-sgebastion-10:/data/project/ru_monuments/.kube# mv config old_config` # experimenting with T301015
- 21:36 taavi: clear error state from some webgrid nodes
2022-02-03
- 09:06 taavi: run `sudo apt-get clean` on login-buster/dev-buster to clean up disk space
- 08:01 taavi: restart acme-chief to force renewal of toolserver.org certificate
2022-01-30
- 14:41 taavi: created a neutron port with ip 172.16.2.46 for a service ip for toolforge redis automatic failover T278541
- 14:22 taavi: creating a cluster of 3 bullseye redis hosts for T278541
2022-01-26
- 18:33 wm-bot: depooled grid node tools-sgeexec-10-10 - cookbook ran by arturo@nostromo
- 18:33 wm-bot: depooled grid node tools-sgeexec-10-9 - cookbook ran by arturo@nostromo
- 18:33 wm-bot: depooled grid node tools-sgeexec-10-8 - cookbook ran by arturo@nostromo
- 18:32 wm-bot: depooled grid node tools-sgeexec-10-7 - cookbook ran by arturo@nostromo
- 18:32 wm-bot: depooled grid node tools-sgeexec-10-6 - cookbook ran by arturo@nostromo
- 18:31 wm-bot: depooled grid node tools-sgeexec-10-5 - cookbook ran by arturo@nostromo
- 18:30 wm-bot: depooled grid node tools-sgeexec-10-4 - cookbook ran by arturo@nostromo
- 18:28 wm-bot: depooled grid node tools-sgeexec-10-3 - cookbook ran by arturo@nostromo
- 18:27 wm-bot: depooled grid node tools-sgeexec-10-2 - cookbook ran by arturo@nostromo
- 18:27 wm-bot: depooled grid node tools-sgeexec-10-1 - cookbook ran by arturo@nostromo
- 13:55 arturo: scaling up the buster web grid with 5 lighttpd and 2 generic nodes (T277653)
2022-01-25
- 11:50 wm-bot: reconfiguring the grid by using grid-configurator - cookbook ran by arturo@nostromo
- 11:44 arturo: rebooting buster exec nodes
- 08:34 taavi: sign puppet certificate for tools-sgeexec-10-4
2022-01-24
- 17:44 wm-bot: reconfiguring the grid by using grid-configurator - cookbook ran by arturo@nostromo
- 15:23 arturo: scaling up the grid with 10 buster exec nodes (T277653)
2022-01-20
- 17:05 arturo: drop 9 of the 10 buster exec nodes created earlier. They didn't get DNS records
- 12:56 arturo: scaling up the grid with 10 buster exec nodes (T277653)
2022-01-19
- 17:34 andrewbogott: rebooting tools-sgeexec-0913.tools.eqiad1.wikimedia.cloud to recover from (presumed) fallout from the scratch/nfs move
2022-01-14
- 19:09 taavi: set /var/run/lighttpd as world-writable on all lighttpd webgrid nodes, T299243
2022-01-12
- 11:27 arturo: created puppet prefix `tools-sgeweblight`, drop `tools-sgeweblig`
- 11:03 arturo: created puppet prefix 'tools-sgeweblig'
- 11:02 arturo: created puppet prefix 'toolsbeta-sgeweblig'
2022-01-04
- 17:18 bd808: tools-acme-chief-01: sudo service acme-chief restart
- 08:12 taavi: disable puppet & exim4 on T298501