You are browsing a read-only backup copy of Wikitech. The live site can be found at wikitech.wikimedia.org
Nova Resource:Tools/SAL: Difference between revisions
imported>Stashbot (wm-bot: reconfiguring the grid by using grid-configurator - cookbook ran by arturo@nostromo)
imported>Stashbot (wm-bot2: deployed kubernetes component https://gitlab.wikimedia.org/repos/cloud/toolforge/ingress-nginx (7037eca) - cookbook ran by taavi@runko)
(34 intermediate revisions by 2 users not shown)
=== 2022-05-16 ===
* 14:02 wm-bot2: deployed kubernetes component https://gitlab.wikimedia.org/repos/cloud/toolforge/ingress-nginx ({{Gerrit|7037eca}}) - cookbook ran by taavi@runko
=== 2022-05-14 ===
* 10:47 taavi: hard reboot unresponsive tools-sgeexec-0940
=== 2022-05-12 ===
* 12:36 taavi: re-enable CronJobControllerV2 [[phab:T308205|T308205]]
* 09:28 taavi: deploy jobs-api update [[phab:T308204|T308204]]
* 09:15 wm-bot2: build & push docker image docker-registry.tools.wmflabs.org/toolforge-jobs-framework-api:latest from https://gerrit.wikimedia.org/r/cloud/toolforge/jobs-framework-api ({{Gerrit|e6fa299}}) ([[phab:T308204|T308204]]) - cookbook ran by taavi@runko
=== 2022-05-10 ===
* 15:18 taavi: depool tools-k8s-worker-42 for experiments
* 13:54 taavi: enable distro-wikimedia unattended upgrades [[phab:T290494|T290494]]
=== 2022-05-06 ===
* 19:46 bd808: Rebuilt toolforge-perl532-sssd-base & toolforge-perl532-sssd-web to add liblocale-codes-perl ([[phab:T307812|T307812]])
=== 2022-05-05 ===
* 17:28 taavi: deploy tools-webservice 0.83 [[phab:T307693|T307693]]
=== 2022-05-03 ===
* 08:20 taavi: redis: start replication from the old cluster to the new one ([[phab:T278541|T278541]])
=== 2022-05-02 ===
* 08:54 taavi: restart acme-chief.service [[phab:T307333|T307333]]
=== 2022-04-25 ===
* 14:56 bd808: Rebuilding all docker images to pick up toolforge-webservice v0.82 ([[phab:T214343|T214343]])
* 14:46 bd808: Building toolforge-webservice v0.82
=== 2022-04-23 ===
* 16:51 bd808: Built new perl532-sssd/<nowiki>{</nowiki>base,web<nowiki>}</nowiki> images and pushed to registry ([[phab:T214343|T214343]])
=== 2022-04-20 ===
* 16:58 taavi: reboot toolserver-proxy-01 to free up disk space from stale file handles(?)
* 07:51 wm-bot: build & push docker image docker-registry.tools.wmflabs.org/toolforge-jobs-framework-api:latest from https://gerrit.wikimedia.org/r/cloud/toolforge/jobs-framework-api ({{Gerrit|8f37a04}}) - cookbook ran by taavi@runko
=== 2022-04-16 ===
* 18:53 wm-bot: deployed kubernetes component https://gitlab.wikimedia.org/repos/cloud/toolforge/kubernetes-metrics ({{Gerrit|2c485e9}}) - cookbook ran by taavi@runko
=== 2022-04-12 ===
* 21:32 bd808: Added komla to Gerrit group 'toollabs-trusted' ([[phab:T305986|T305986]])
* 21:27 bd808: Added komla to 'roots' sudoers policy ([[phab:T305986|T305986]])
* 21:24 bd808: Add komla as projectadmin ([[phab:T305986|T305986]])
=== 2022-04-10 ===
* 18:43 taavi: deleted `/tmp/dwl02.out-20210915` on tools-sgebastion-07 (not touched since September, taking up 1.3G of disk space)
=== 2022-04-09 ===
* 15:30 taavi: manually prune user.log on tools-prometheus-03 to free up some space on /
=== 2022-04-08 ===
* 10:44 arturo: disabled debug mode on the k8s jobs-emailer component
=== 2022-04-05 ===
* 07:52 wm-bot: deployed kubernetes component https://gerrit.wikimedia.org/r/cloud/toolforge/jobs-framework-api ({{Gerrit|d7d3463}}) - cookbook ran by arturo@nostromo
* 07:44 wm-bot: build & push docker image docker-registry.tools.wmflabs.org/toolforge-jobs-framework-api:latest from https://gerrit.wikimedia.org/r/cloud/toolforge/jobs-framework-api ({{Gerrit|d7d3463}}) - cookbook ran by arturo@nostromo
* 07:21 arturo: deploying toolforge-jobs-framework-cli v7
=== 2022-04-04 ===
* 17:05 wm-bot: deployed kubernetes component https://gerrit.wikimedia.org/r/cloud/toolforge/jobs-framework-api ({{Gerrit|cbcfc47}}) - cookbook ran by arturo@nostromo
* 16:56 wm-bot: build & push docker image docker-registry.tools.wmflabs.org/toolforge-jobs-framework-api:latest from https://gerrit.wikimedia.org/r/cloud/toolforge/jobs-framework-api ({{Gerrit|cbcfc47}}) - cookbook ran by arturo@nostromo
* 09:28 arturo: deployed toolforge-jobs-framework-cli v6 into aptly and installed it on buster bastions
=== 2022-03-28 ===
* 09:32 wm-bot: cleaned up grid queue errors on tools-sgegrid-master.tools.eqiad1.wikimedia.cloud ([[phab:T304816|T304816]]) - cookbook ran by arturo@nostromo
=== 2022-03-15 ===
* 16:57 wm-bot: build & push docker image docker-registry.tools.wmflabs.org/toolforge-jobs-framework-emailer:latest from https://gerrit.wikimedia.org/r/cloud/toolforge/jobs-framework-emailer ({{Gerrit|084ee51}}) - cookbook ran by arturo@nostromo
* 11:24 arturo: cleared error state on queue continuous@tools-sgeexec-0939.tools.eqiad.wmflabs (a job took a very long time to be scheduled...)
=== 2022-03-14 ===
* 11:44 arturo: deploy jobs-framework-emailer {{Gerrit|9470a5f339fd5a44c97c69ce97239aef30f5ee41}} ([[phab:T286135|T286135]])
* 10:48 dcaro: pushed v0.33.2 tekton control and webhook images, and bashA5.1.4 to the local repo ([[phab:T297090|T297090]])
=== 2022-03-10 ===
* 09:42 arturo: cleaned grid queue error state @ tools-sgewebgrid-generic-0902
=== 2022-03-01 ===
* 13:41 dcaro: rebooting tools-sgeexec-0916 to clear any state ([[phab:T302702|T302702]])
* 12:11 dcaro: Cleared error state queues for sgeexec-0916 ([[phab:T302702|T302702]])
* 10:23 arturo: tools-sgeexec-0913/0916 are depooled, queue errors. Reboot them and clean errors by hand
=== 2022-02-28 ===
* 08:02 taavi: reboot sgeexec-0916
* 07:49 taavi: depool tools-sgeexec-0916.tools as it is out of disk space on /
=== 2022-02-17 ===
* 08:23 taavi: deleted tools-clushmaster-02
* 08:14 taavi: made tools-puppetmaster-02 its own client to fix `puppet node deactivate` puppetdb access
=== 2022-02-16 ===
* 00:12 bd808: Image builds completed.
=== 2022-02-15 ===
* 23:17 bd808: Image builds failed in buster php image with an apt error. The error looks transient, so starting builds over.
* 23:06 bd808: Started full rebuild of Toolforge containers to pick up webservice 0.81 and other package updates in tmux session on tools-docker-imagebuilder-01
* 22:58 bd808: `sudo apt-get update && sudo apt-get install toolforge-webservice` on all bastions to pick up 0.81
* 22:50 bd808: Built new toollabs-webservice 0.81
* 18:43 bd808: Enabled puppet on tools-proxy-05
* 18:38 bd808: Disabled puppet on tools-proxy-05 for manual testing of nginx config changes
* 18:21 taavi: delete tools-package-builder-03
* 11:49 arturo: invalidate sssd cache in all bastions to debug [[phab:T301736|T301736]]
* 11:16 arturo: purge debian package `unscd` on tools-sgebastion-10/11 for [[phab:T301736|T301736]]
* 11:15 arturo: reboot tools-sgebastion-10 for [[phab:T301736|T301736]]
=== 2022-02-10 ===
* 15:07 taavi: shutdown tools-clushmaster-02 [[phab:T298191|T298191]]
* 13:25 wm-bot: trying to join node tools-sgewebgen-10-2 to the grid cluster in tools. - cookbook ran by arturo@nostromo
* 13:24 wm-bot: trying to join node tools-sgewebgen-10-1 to the grid cluster in tools. - cookbook ran by arturo@nostromo
* 13:07 wm-bot: trying to join node tools-sgeweblight-10-5 to the grid cluster in tools. - cookbook ran by arturo@nostromo
* 13:06 wm-bot: trying to join node tools-sgeweblight-10-4 to the grid cluster in tools. - cookbook ran by arturo@nostromo
* 13:05 wm-bot: trying to join node tools-sgeweblight-10-3 to the grid cluster in tools. - cookbook ran by arturo@nostromo
* 13:03 wm-bot: trying to join node tools-sgeweblight-10-2 to the grid cluster in tools. - cookbook ran by arturo@nostromo
* 12:54 wm-bot: trying to join node tools-sgeweblight-10-1.tools.eqiad1.wikimedia.cloud to the grid cluster in tools. - cookbook ran by arturo@nostromo
* 08:45 taavi: set `profile::base::manage_ssh_keys: true` globally [[phab:T214427|T214427]]
* 08:16 taavi: enable puppetdb and re-enable puppet with puppetdb ssh key management disabled (profile::base::manage_ssh_keys: false) - [[phab:T214427|T214427]]
* 08:06 taavi: disable puppet globally for enabling puppetdb [[phab:T214427|T214427]]
=== 2022-02-09 ===
* 19:29 taavi: installed tools-puppetdb-1, not configured on puppetmaster side yet [[phab:T214427|T214427]]
* 18:56 wm-bot: pooled 10 grid nodes tools-sgeweblight-10-[1-5],tools-sgewebgen-10-[1,2],tools-sgeexec-10-[1-10] ([[phab:T277653|T277653]]) - cookbook ran by arturo@nostromo
* 18:30 wm-bot: pooled 9 grid nodes tools-sgeexec-10-[2-10],tools-sgewebgen-[3,15] - cookbook ran by arturo@nostromo
* 18:25 arturo: ignore last message
* 18:24 wm-bot: pooled 9 grid nodes tools-sgeexec-10-[2-10],tools-sgewebgen-[3,15] - cookbook ran by arturo@nostromo
* 14:04 taavi: created tools-cumin-1/toolsbeta-cumin-1 [[phab:T298191|T298191]]
=== 2022-02-07 ===
* 17:37 taavi: generated authdns_acmechief ssh key and stored password in a text file in local labs/private repository ([[phab:T288406|T288406]])
* 12:52 taavi: updated maintain-kubeusers for [[phab:T301081|T301081]]
=== 2022-02-04 ===
* 22:33 taavi: `root@tools-sgebastion-10:/data/project/ru_monuments/.kube# mv config old_config` # experimenting with [[phab:T301015|T301015]]
* 21:36 taavi: clear error state from some webgrid nodes
=== 2022-02-03 ===
* 09:06 taavi: run `sudo apt-get clean` on login-buster/dev-buster to clean up disk space
* 08:01 taavi: restart acme-chief to force renewal of toolserver.org certificate
=== 2022-01-30 ===
* 14:41 taavi: created a neutron port with ip 172.16.2.46 for a service ip for toolforge redis automatic failover [[phab:T278541|T278541]]
* 14:22 taavi: creating a cluster of 3 bullseye redis hosts for [[phab:T278541|T278541]]
=== 2022-01-26 ===
* 18:33 wm-bot: depooled grid node tools-sgeexec-10-10 - cookbook ran by arturo@nostromo
* 18:33 wm-bot: depooled grid node tools-sgeexec-10-9 - cookbook ran by arturo@nostromo
* 18:33 wm-bot: depooled grid node tools-sgeexec-10-8 - cookbook ran by arturo@nostromo
* 18:32 wm-bot: depooled grid node tools-sgeexec-10-7 - cookbook ran by arturo@nostromo
* 18:32 wm-bot: depooled grid node tools-sgeexec-10-6 - cookbook ran by arturo@nostromo
* 18:31 wm-bot: depooled grid node tools-sgeexec-10-5 - cookbook ran by arturo@nostromo
* 18:30 wm-bot: depooled grid node tools-sgeexec-10-4 - cookbook ran by arturo@nostromo
* 18:28 wm-bot: depooled grid node tools-sgeexec-10-3 - cookbook ran by arturo@nostromo
* 18:27 wm-bot: depooled grid node tools-sgeexec-10-2 - cookbook ran by arturo@nostromo
* 18:27 wm-bot: depooled grid node tools-sgeexec-10-1 - cookbook ran by arturo@nostromo
* 13:55 arturo: scaling up the buster web grid with 5 lighttpd and 2 generic nodes ([[phab:T277653|T277653]])
=== 2022-01-25 ===
* 11:50 wm-bot: reconfiguring the grid by using grid-configurator - cookbook ran by arturo@nostromo
* 11:44 arturo: rebooting buster exec nodes
* 08:34 taavi: sign puppet certificate for tools-sgeexec-10-4
=== 2022-01-24 ===
* 17:44 wm-bot: reconfiguring the grid by using grid-configurator - cookbook ran by arturo@nostromo
* 15:23 arturo: scaling up the grid with 10 buster exec nodes ([[phab:T277653|T277653]])
=== 2022-01-20 ===
* 17:05 arturo: drop 9 of the 10 buster exec nodes created earlier. They didn't get DNS records
* 12:56 arturo: scaling up the grid with 10 buster exec nodes ([[phab:T277653|T277653]])
=== 2022-01-19 ===
* 17:34 andrewbogott: rebooting tools-sgeexec-0913.tools.eqiad1.wikimedia.cloud to recover from (presumed) fallout from the scratch/nfs move
=== 2022-01-14 ===
* 19:09 taavi: set /var/run/lighttpd as world-writable on all lighttpd webgrid nodes, [[phab:T299243|T299243]]
=== 2022-01-12 ===
* 11:27 arturo: created puppet prefix `tools-sgeweblight`, drop `tools-sgeweblig`
* 11:03 arturo: created puppet prefix 'tools-sgeweblig'
* 11:02 arturo: created puppet prefix 'toolsbeta-sgeweblig'
=== 2022-01-04 ===
* 17:18 bd808: tools-acme-chief-01: sudo service acme-chief restart
* 08:12 taavi: disable puppet & exim4 on [[phab:T298501|T298501]]
==Archives==
* [[Nova Resource:Tools/SAL/Archive 1|Archive 1]] (2013-2014)
* [[Nova Resource:Tools/SAL/Archive 2|Archive 2]] (2015-2017)
* [[Nova Resource:Tools/SAL/Archive 3|Archive 3]] (2018-2019)
* [[Nova Resource:Tools/SAL/Archive 4|Archive 4]] (2020-2021)
</noinclude>
{{SAL|Project Name=tools}}
<noinclude>[[Category:SAL]]</noinclude>
Revision as of 14:02, 16 May 2022
2022-05-16
- 14:02 wm-bot2: deployed kubernetes component https://gitlab.wikimedia.org/repos/cloud/toolforge/ingress-nginx (7037eca) - cookbook ran by taavi@runko
2022-05-14
- 10:47 taavi: hard reboot unresponsive tools-sgeexec-0940
2022-05-12
- 12:36 taavi: re-enable CronJobControllerV2 T308205
- 09:28 taavi: deploy jobs-api update T308204
- 09:15 wm-bot2: build & push docker image docker-registry.tools.wmflabs.org/toolforge-jobs-framework-api:latest from https://gerrit.wikimedia.org/r/cloud/toolforge/jobs-framework-api (e6fa299) (T308204) - cookbook ran by taavi@runko
2022-05-10
- 15:18 taavi: depool tools-k8s-worker-42 for experiments
- 13:54 taavi: enable distro-wikimedia unattended upgrades T290494
2022-05-06
- 19:46 bd808: Rebuilt toolforge-perl532-sssd-base & toolforge-perl532-sssd-web to add liblocale-codes-perl (T307812)
2022-05-05
- 17:28 taavi: deploy tools-webservice 0.83 T307693
2022-05-03
- 08:20 taavi: redis: start replication from the old cluster to the new one (T278541)
2022-05-02
- 08:54 taavi: restart acme-chief.service T307333
2022-04-25
- 14:56 bd808: Rebuilding all docker images to pick up toolforge-webservice v0.82 (T214343)
- 14:46 bd808: Building toolforge-webservice v0.82
2022-04-23
- 16:51 bd808: Built new perl532-sssd/{base,web} images and pushed to registry (T214343)
2022-04-20
- 16:58 taavi: reboot toolserver-proxy-01 to free up disk space from stale file handles(?)
- 07:51 wm-bot: build & push docker image docker-registry.tools.wmflabs.org/toolforge-jobs-framework-api:latest from https://gerrit.wikimedia.org/r/cloud/toolforge/jobs-framework-api (8f37a04) - cookbook ran by taavi@runko
2022-04-16
- 18:53 wm-bot: deployed kubernetes component https://gitlab.wikimedia.org/repos/cloud/toolforge/kubernetes-metrics (2c485e9) - cookbook ran by taavi@runko
2022-04-12
- 21:32 bd808: Added komla to Gerrit group 'toollabs-trusted' (T305986)
- 21:27 bd808: Added komla to 'roots' sudoers policy (T305986)
- 21:24 bd808: Add komla as projectadmin (T305986)
2022-04-10
- 18:43 taavi: deleted `/tmp/dwl02.out-20210915` on tools-sgebastion-07 (not touched since September, taking up 1.3G of disk space)
2022-04-09
- 15:30 taavi: manually prune user.log on tools-prometheus-03 to free up some space on /