Nova Resource:Toolsbeta/SAL
2020-03-16
- 21:38 bstorm_: removed lots of hiera related to the legacy k8s cluster T246689
- 19:45 bstorm_: deleting toolsbeta-worker-1001, toolsbeta-k8s-master, toolsbeta-flannel-etcd-01 and toolsbeta-k8s-etcd-01 T246689
- 19:07 bstorm_: shutting down toolsbeta-flannel-etcd-01 T246689
- 19:06 bstorm_: shutting down toolsbeta-worker-1001, toolsbeta-k8s-master and toolsbeta-k8s-etcd T246689
- 14:37 arturo: live-hacking the toollabs-webservice package in toolsbeta-sgewebgrid-lighttpd-0901 as well
- 14:22 arturo: live-hacking the toollabs-webservice package in toolsbeta*-sgebastion-04 with https://gerrit.wikimedia.org/r/c/operations/software/tools-webservice/+/578413 (T234617)
- 14:22 arturo: live-hacking the toollabs-webservice package in tools-sgebastion-04 with https://gerrit.wikimedia.org/r/c/operations/software/tools-webservice/+/578413 (T234617)
- 13:49 arturo: deleting 50 jobs of the `test` tool in the grid to leave room for other tests
- 13:18 arturo: live-hack toolsbeta-puppetmaster-02 with https://gerrit.wikimedia.org/r/c/operations/puppet/+/578406 (T234617)
2020-03-11
- 21:32 bstorm_: deployed jobutils_1.39 and miscutils_1.39 to toolsbeta
2020-03-09
- 13:11 arturo: created VM `toolsbeta-legacy-redirector` (T247236)
- 13:08 arturo: instance quota was full, bump it from 35 to 40
2020-03-06
- 16:22 bstorm_: updating maintain-kubeusers image to filter invalid tool names
2020-03-05
- 21:22 bstorm_: updated maintain-kubeusers to the latest version for toolsbeta only to live test
2020-02-27
- 19:19 bstorm_: upgraded toollabs-webservice to 0.64 on stretch-toolsbeta for testing
- 16:03 jeh: create 3 new VMs toolsbeta-elastic7-0[1,2,3]
- 16:00 jeh: increase CloudVPS quota instance count for new elasticsearch servers
2020-02-26
- 20:35 bstorm_: hard rebooting the grid master for toolsbeta
- 20:20 jeh: restart toolsbeta-sgegrid-shadow
2020-02-18
- 23:20 bstorm_: added toolsbeta-sgegrid-master.toolsbeta.eqiad1.wikimedia.cloud and toolsbeta-sgegrid-shadow.toolsbeta.eqiad1.wikimedia.cloud to gridengine admin host lists
2020-02-10
- 21:19 bstorm_: upgraded toollabs-webservice package for stretch toolsbeta to 0.62 T244293 T244289 T234617 T156626
2020-02-07
- 23:07 bstorm_: upgraded toollabs-webservice for stretch toolsbeta to 0.60 T244611
- 21:09 bstorm_: upgraded toollabs-webservice package for stretch toolsbeta to 0.59 T244293 T244289 T234617 T156626
2020-01-23
- 03:14 bd808: Demoted projectadmins not listed in the "roots" sudoer policy to project members just to avoid random confusion
- 03:06 bd808: Added legoktm to "roots" sudoer policy
- 02:53 bd808: Added legoktm as project admin
2020-01-22
- 11:59 arturo: remove toolviews scripts from toolsbeta-proxy-{1,2}, source of cronspam
2020-01-21
- 12:49 arturo: cleanup livehackings in toolsbeta-sgebastion-04 and toolsbeta-proxy-1
- 09:40 arturo: livehacking toolsbeta-sgebastion-04 (https://gerrit.wikimedia.org/r/c/566045 and https://gerrit.wikimedia.org/r/c/565575) and toolsbeta-proxy-1 (https://gerrit.wikimedia.org/r/c/565556) for testing T234617
2020-01-17
- 12:52 arturo: livehack toolsbeta-puppetmaster-02 to test https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/565556 (T234617)
- 10:37 arturo: enabling puppet agent in toolsbeta-proxy-1 which was disabled without reason since 2019-12-02 (probably by me)
2020-01-16
- 23:13 bstorm_: updated toollabs-webservice to 0.58 for stretch to test things out
- 12:07 arturo: live-hack tools-webservice in tools-sgebastion-04 to test https://gerrit.wikimedia.org/r/c/565259 (T242719)
2020-01-14
- 02:15 andrewbogott: rebooting toolsbeta-sgecron-01 and toolsbeta-test-k8s-etcd-3 to get nfs unstuck
2020-01-13
- 16:41 bstorm_: The filesystem was unclean, among other problems, on the "old cluster" worker node 1001. Rebooting it in case that helps.
2020-01-10
- 21:05 bstorm_: updated toollabs-webservice package to 0.55 for testing
2020-01-07
- 15:51 bstorm_: changed kubeadm-config to use a list instead of a hash for extravols on the apiserver in the new k8s cluster T242067
2020-01-06
- 21:42 bstorm_: disabled rpcbind on toolsbeta-sgebastion-04 to test some things
2020-01-03
- 17:46 bstorm_: stashed uncommitted changes on the puppetmaster because they seem to be things that are already merged
- 11:27 arturo: [new k8s] cadvisor is running in the metrics namespace now (T237643)
2020-01-02
- 22:37 bstorm_: Deleting the massive number of test ingresses for tool-fourohfour so the ingress controllers aren't moving so slowly.
- 22:19 bstorm_: Changed the ingress-admission ValidatingWebhookConfiguration to check extensions as well as networking API groups
2019-12-17
- 00:14 bstorm_: Fully enabled encryption at rest for toolsbeta kubernetes
2019-12-16
- 23:03 bstorm_: updated the kubeadm-config configmap to match the new init file
2019-12-04
- 13:02 arturo: drop puppet prefix `toolsbeta-grid-master`, deprecated and no longer in use
- 12:50 arturo: drop puppet prefix `toolsbeta-bastion`, deprecated and no longer in use
2019-12-02
- 10:38 arturo: create wildcard DNS record for `*.toolsbeta.wmflabs.org` for use by the new k8s cluster
- 10:34 arturo: manually scale nginx-ingress deployment to 5 replicas (T239405)
2019-11-25
- 10:30 arturo: add puppet cert SANs via hiera to toolsbeta-test-k8s-etcd nodes (T238655)
2019-11-21
- 14:15 arturo: upgrade new k8s cluster to 1.15.6 using kubeadm (plus kubelet)
2019-11-15
- 14:46 arturo: stop live-hacks on toolsbeta-test-k8s-haproxy-1 T237643
2019-11-14
- 10:32 arturo: live-hacking toolsbeta-test-k8s-haproxy-1 to point to just the k8s apiserver in control-1. Turn on --v=10 in control-1 for extended debug
2019-11-08
- 19:36 bstorm_: rebooted the proxy server just in case that fixes something.
- 11:58 arturo: adding `profile::toolforge::bastion::nproc: 100` to puppet prefix `toolsbeta-sgebastion` (T236202)
- 11:38 arturo: new k8s: refresh deployment for nginx-ingress with latest changes from puppet
2019-11-07
- 21:55 bstorm_: killed pods for ingress admission controller to upgrade to new image T215531
2019-11-06
- 22:39 bstorm_: upgraded repo version of toollabs-webservice in toolsbeta-stretch to 0.49 -- changes for the new k8s cluster T215531
- 19:09 bstorm_: added profile::toolforge::proxies in global hiera to try and figure out why it won't let anything use redis T237443
- 18:53 bstorm_: launching toolsbeta-proxy-2 on a hunch that the config doesn't work well as a standalone T237443
- 18:46 bstorm_: rebooting toolsbeta-proxy-1 trying to convince redis it is not a read replica T237443
- 18:29 bstorm_: stopped broken kube-proxy service on toolsbeta-proxy-1 (should probably be puppetized)
- 17:35 bstorm_: changing some hiera to work with new proxy host
- 12:44 arturo: created VM toolsbeta-proxy-1 (T237443)
2019-11-05
2019-10-25
- 23:41 bstorm_: Deployed custom webhook controllers for registry and ingress checking to toolsbeta-test kubernetes cluster T215531 T215678 T234231
- 16:15 bstorm_: rebooting toolsbeta-test-k8s-worker-1 and -2
2019-10-23
- 12:04 arturo: created 2 new VMs `toolsbeta-test-k8s-worker-[1,2]` T236074
- 11:56 arturo: point FQDN `k8s.toolsbeta.eqiad1.wikimedia.cloud` to `toolsbeta-test-k8s-haproxy-1` (T236074)
- 11:20 arturo: re-create VM `toolsbeta-test-k8s-haproxy-1` to use new puppet profile (T236074)
- 11:10 arturo: re-create VM `toolsbeta-test-k8s-haproxy-2` to test https://gerrit.wikimedia.org/r/545532 (T236074)
2019-10-22
- 17:43 arturo: re-create VM `toolsbeta-test-k8s-control-1` T236074
- 15:48 arturo: point DNS record `k8s.toolsbeta.eqiad1.wikimedia.cloud` to the first controller node for the bootstrap T236074
- 15:30 arturo: created puppet prefix `toolsbeta-test-k8s-control` and delete `toolsbeta-test-k8s-master` T236074
- 12:27 arturo: refreshed puppet prefix `toolsbeta-test-k8s-control` with latest info T236074
- 12:26 arturo: created 3 VMs `toolsbeta-test-k8s-control-{1,2,3}` T236074
- 12:15 arturo: refresh IP addr of FQDN `k8s.toolsbeta.eqiad1.wikimedia.cloud` T236074
- 12:14 arturo: delete FQDN `toolsbeta-k8s-master.toolsbeta.wmflabs.org` T236074
- 11:57 arturo: created 2 new VMs `toolsbeta-test-k8s-haproxy-{1,2}` T236074
- 11:54 arturo: created puppet prefix `toolsbeta-test-k8s-haproxy` and delete `toolsbeta-test-k8s-lb` T236074
2019-10-21
- 15:13 arturo: refresh config in prefix puppet `toolsbeta-test-k8s-etcd` to account for new servers T236074
- 15:07 arturo: create 3 VMs toolsbeta-test-k8s-etcd-{1,2,3} T236074
- 14:58 arturo: deleting all toolsbeta-test-* VMs (master, worker, etcd, lb) T236074
2019-10-18
- 16:33 arturo: created DNS zone `toolsbeta.eqiad1.wikimedia.cloud`
- 09:06 arturo: remove puppet prefix toolsbeta-valhallasw-puppet-compiler (unused)
- 09:00 arturo: remove puppet prefix toolsbeta-arturo-k8s-{etcd,master,worker} (unused)
- 08:59 arturo: refresh role for servers in toolsbeta-test-k8s-{master,worker}
- 08:58 arturo: remove puppet prefix etcd-k8s-ctest (unused)
2019-10-14
- 12:26 arturo: delete VM `toolsbeta-test-proxy-01` no longer required
- 12:26 arturo: created security group arturo-test-dynamicproxy-backend to test stuff related to T234037
2019-10-09
- 11:59 arturo: re-create toolsbeta-test-proxy-01 as Debian Buster (T235059)
2019-10-08
- 14:14 arturo: created puppet prefix `toolsbeta-test-proxy` for testing stuff related to T234037
- 12:27 arturo: created VM toolsbeta-test-proxy-01 for testing stuff related to T234037
2019-10-07
- 19:12 Krenair: reboot toolsbeta-sgecron-01 toolsbeta-sgewebgrid-generic-0901 toolsbeta-sgewebgrid-lighttpd-0901 due to nfs stale issue
2019-09-25
- 23:31 bd808: Updated user list for "roots" sudoer policy
- 23:30 bd808: Granted Krenair projectadmin
2019-09-05
- 15:08 zhuyifei1999_: `sudo truncate -s 0 /var/log/exim4/paniclog` on toolsbeta-{sgewebgrid-{lighttpd,generic}-0901,sgecron-01}.toolsbeta.eqiad.wmflabs because of email spam
2019-08-12
- 20:40 phamhi: toolsbeta-test-puppet-sandbox instance created for T230147
2019-08-09
- 10:51 arturo: rebalance load: reallocating toolsbeta-sgewebgrid-lighttpd-0901 from cloudvirt1018 to cloudvirt1003
2019-07-24
- 20:48 bstorm_: rebuilt toolsbeta-test cluster with the internal version of the pause container T228887 T215531
- 19:02 bstorm_: doing a clean rebuild of the toolsbeta-test-k8s cluster
2019-07-18
- 16:04 arturo: re-create VMs toolsbeta-test-k8s-{master,worker}-*
- 12:47 arturo: create toolsbeta-test-k8s-etcd-2 as buster to check status of latest puppet code (T226098)
- 12:00 arturo: create toolsbeta-test-k8s-worker-2 as buster to check status of latest puppet code
- 09:28 arturo: re-create toolsbeta-test-k8s-master-{1,2,3} as buster to test T228267
2019-07-17
- 09:51 arturo: re-create VM toolsbeta-test-k8s-worker-1 as Debian Buster T215531
- 09:13 arturo: create VM toolsbeta-test-k8s-master-4 (Debian Buster) T215531
2019-07-15
- 12:29 arturo: create `toolsbeta-test-k8s-etcd` puppet prefix
- 12:27 arturo: create `toolsbeta-test-k8s-etcd-1` VM T215531
2019-07-03
- 10:49 arturo: recreate `toolsbeta-test-k8s-master-1` VM (T215531)
- 09:32 arturo: create `toolsbeta-test-k8s-worker-1` VM and a puppet prefix for it (T215531)
- 09:22 arturo: delete all `toolsbeta-arturo-k8s-*` instances. We no longer require them per new approach at T215531
2019-07-02
- 17:24 arturo: `aborrero@toolsbeta-test-k8s-lb-01:~ $ sudo generate_haproxy_default.sh` (T215531)
- 10:32 arturo: re-creating toolsbeta-test-k8s-master-1 (T215531) for it to be created without swap
2019-07-01
- 17:13 arturo: re-creating instance `toolsbeta-test-k8s-master-1` with more CPU for T215531
- 17:03 arturo: updated FQDN `toolsbeta-k8s-master.toolsbeta.wmflabs.org` with 172.16.6.9 (the new LB VM) for T215531
- 17:02 arturo: re-creating instance `toolsbeta-test-k8s-lb-01` with more CPU for T215531
- 16:58 arturo: add puppet prefix `toolsbeta-test-k8s-lb` for T215531
- 11:50 arturo: add sssd hiera config for `toolsbeta-test-k8s-master` prefix
2019-06-28
- 19:10 bstorm_: T215531 removed toolsbeta-arturo-k8s-master-2/3 and added toolsbeta-test-k8s-master-1 for testing kubeadm
2019-06-25
- 10:35 arturo: create puppet prefix `toolsbeta-arturo-k8s-worker` for T215531
- 10:35 arturo: create 2 VMs toolsbeta-arturo-k8s-worker-[1,2] for T215531
2019-06-21
- 11:42 arturo: re-create 3 VMs toolsbeta-arturo-k8s-etcd-[1-3] to test latest puppet code in T226098
2019-06-19
- 10:39 arturo: add myself to the `toolsbeta.admin` LDAP group (T225303)
2019-06-14
- 16:24 bstorm_: Manually failed "back" to the toolsbeta-sgegrid-master to get the grid functioning again in toolsbeta
- 16:03 bstorm_: T221721 hard rebooted toolsbeta-sgegrid-master because it had oomkilled basically everything
- 15:55 bstorm_: T221721 deleted toolsbeta-proxy-01 until it can be actively worked on.
- 15:51 bstorm_: deleted toolsbeta-k8s-lb-01 since it isn't being actively worked on just now
2019-06-06
- 12:14 arturo: T215531 create 3 VMs `toolsbeta-arturo-k8s-etcd-[1-3]`
- 12:13 arturo: T215531 add `toolsbeta-arturo-k8s-etcd`* puppet prefix
- 12:12 arturo: T215531 add `toolsbeta-arturo-k8s-test` puppet prefix
2019-06-05
- 12:40 arturo: rebase git repos in toolsbeta-puppetmaster-02. There were some rebase problems in labs/private that required me to re-create by hand one of the [local] patches (puppetdb secrets)
- 12:33 arturo: drop VM instances toolsbeta-k8s-master-arturo-[1-3] and create toolsbeta-arturo-k8s-master-[1-3] T215531
- 12:32 arturo: drop puppet prefix `toolsbeta-k8s-master-arturo` and create `toolsbeta-arturo-k8s-master` since there is also `toolsbeta-k8s-master` which gets applied to my VMs T215531
- 11:42 arturo: create VM `toolsbeta-k8s-master-arturo-3` for T215531 (so I have 3 master nodes in this k8s deployment)
- 11:38 arturo: delete instances arturo-sgeexec-sssd-test-2, arturo-sgeexec-sssd-test-1, arturo-bastion-sssd-test, unused
2019-05-24
- 11:49 arturo: T224273 create `toolsbeta-k8s-master-arturo` puppet prefix in horizon
- 11:45 arturo: T224273 create toolsbeta-k8s-master-arturo-[12] stretch VMs
- 11:17 arturo: install by hand some openstack client packages that puppet would refuse to install in toolsbeta-k8s-master-01
- 11:12 arturo: mangle sources.list to handle some apt warnings related to missing repos, etc in toolsbeta-k8s-master-01
2019-05-07
- 10:22 arturo: T219362 drop the `toolsbeta-exec` puppet prefix
- 10:20 arturo: T219362 drop the `toolsbeta-webgrid-generic` puppet prefix
- 10:19 arturo: T219362 drop the `toolsbeta-webgrid-lighttpd` puppet prefix
2019-04-25
- 04:17 andrewbogott: edited resolv.conf on unpuppetized instances to use the new nameserver: toolsbeta-docker-registry-01, toolsbeta-k8s-lb-01, toolsbeta-proxy-01, toolsbeta-puppetdb-01, toolsbeta-sgegrid-master
2019-04-12
- 23:34 mutante: toolsbeta-k8s-master-01 was out of disk space on /, puppet failed to run because out of disk; rename existing syslog.1.gz, gzip syslog.1, rename existing daemon.log.1.gz, gzip daemon.log.1
- 00:05 andrewbogott: migrating remaining VMs to eqiad1-r
2019-03-25
- 18:00 bd808: All Trusty instances shutdown and now in process of deleting
- 17:42 bd808: Preparing to shutdown beta Trusty job grid
2019-03-22
- 13:59 arturo: create VMs arturo-sgeexec-sssd-test-[12] for testing T218126
2019-03-15
- 10:23 arturo: create VM `arturo-bastion-sssd-test` (T218126)
2019-02-20
- 14:58 andrewbogott: moving toolsbeta-grid-master and toolsbeta-puppetmaster-02 to labvirt1003
2019-02-14
- 18:30 andrewbogott: moving toolsbeta-puppetdb-01 to labvirt1002
2018-12-04
- 18:43 arturo: some hiera keys reallocated, see https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/477607/
2018-11-26
- 13:26 arturo: T210098 VM=toolsbeta-sgebastion-03
- 13:25 arturo: T210098 install systemd239 from stretch-backports and restart VM
2018-11-08
- 10:01 arturo: make myself projectadmin to test toolforge stuff on stretch (specifically T207970)
2018-10-22
- 21:20 bstorm_: launched a stretch/sonofgridengine master server
2018-09-19
- 20:11 bstorm_: toolsbeta-puppetmaster-02 is now the puppetmaster and puppetdb works for toolsbeta -- T200557
- 17:24 bstorm_: new puppetmaster is toolsbeta-puppetmaster-02, however, manual changes are required on each client, so it will be broken for a bit (enabling puppetdb for T200557)
- 17:06 bstorm_: working on replacing puppetmaster with one running stretch, as part of adding puppetdb
2018-07-22
- 14:28 zhuyifei1999_: backed up Neha16's changes to toolsbeta-bastion-01:/usr/lib/python2.7/dist-packages/toollabs to toollabs.bak in the same dir via cp -a, and re-install webservice code on the bastion to debug T156626
2018-07-18
- 10:46 harej: Deleted toolsbeta-flynn-01
2018-07-12
- 23:06 bstorm_: Got the grid master running
2018-06-28
- 16:34 chicocvenancio: adding harej as root for flynn testing
2018-06-27
- 22:35 chicocvenancio: add harej as project admin to test Flynn stuff
2018-06-22
- 22:26 chicocvenancio: reconfigured toolsbeta-paws-master-01 kubelet to test image pruning
- 09:39 zhuyifei1999_: fixed that by running `sudo mv /var/lib/puppet/ssl /var/lib/puppet/ssl.bak` then following the red instructions
- 09:33 zhuyifei1999_: puppet is broken on toolsbeta-bastion-01, investigating
- 09:03 zhuyifei1999_: killing and rebuilding toolsbeta-bastion-01
- 08:31 zhuyifei1999_: on toolsbeta-bastion-01, killed /etc/apt/sources.list.d/jonathonf-python-2_7-trusty.list ppa, downgraded python from 2.7.14 to 2.7.5, and reinstalled toollabs-webservice
- 07:56 andrewbogott: someone removed /usr/bin/webservice
2018-05-15
- 07:26 zhuyifei1999_: applied 5324236 via toolsbeta-puppetmaster-01 T190893
- 05:28 zhuyifei1999_: Making project puppetmaster at toolsbeta-puppetmaster-01
2018-05-08
- 02:18 zhuyifei1999_: manually created flannel etcd key T190893
2018-05-07
- 19:01 zhuyifei1999_: install kubernetes-client on toolsbeta-worker-1001 to debug stuffs
- 18:41 zhuyifei1999_: rebuilding toolsbeta-k8s-etcd-01
- 17:58 zhuyifei1999_: cleanup from maintain-kubeusers using the wrong project to create tool home dirs: `find /data/project/ -mindepth 1 -maxdepth 1 -type d \! -user 0 | (while read dir; do id toolsbeta.`basename $dir` 2> /dev/null || sudo rm -rfv $dir; done)`
- 16:41 zhuyifei1999_: rebuild toolsbeta-k8s-master-01 because I can't figure out why puppet can't update maintain-kubeusers.systemd
2018-05-06
- 04:06 zhuyifei1999_: locally patched `/usr/lib/python2.7/dist-packages/toollabs/common/tool.py` on bastion and webgrid-lighttpd
2018-05-05
- 19:51 zhuyifei1999_: `systemctl mask maintain-kubeusers` because it's causing a mess, tries to get the tool list from toolforge T190893
- 18:40 zhuyifei1999_: to unblock k8s testing while waiting on https://gerrit.wikimedia.org/r/430539, installed the package directly on `toolsbeta-k8s-master-01` with `$ sudo apt install python3-yaml`
2018-05-02
- 21:02 zhuyifei1999_: copy over labs/private:/hieradata/labs/tools/common.yaml to project puppet hiera
- 20:37 bd808: Added Neha16 as a project admin for work on T175768
- 20:31 zhuyifei1999_: nuke webservice instances and rebuild
- 20:31 zhuyifei1999_: Added k8s_infrastructure_users to project hiera on horizon T192618
2018-04-20
- 00:20 zhuyifei1999_: deleted all instances I just created except k8s master because of chicken-and-egg problem
2018-04-19
- 22:10 zhuyifei1999_: the trusty instances ask me for my password. the jessie instances don't like my ssh key. :(
- 21:59 zhuyifei1999_: got 'Error: RecordSet belongs in a child zone: toolsbeta.wmflabs.org', using tools-beta.wmflabs.org instead
- 21:57 zhuyifei1999_: Add proxy toolsbeta.wmflabs.org => toolsbeta-proxy-01.toolsbeta.eqiad.wmflabs
- 21:43 zhuyifei1999_: Start creating instances for webservice setup T190893
2018-03-30
- 22:40 zhuyifei1999_: copied over many prefix puppet configuration in horizon from toolforge T190893
2018-03-14
- 18:07 chicocvenancio: updated paws-beta k8s cluster and nodes to v1.9.4 for T189680
2018-03-05
- 19:33 chicocvenancio: added Zhuyifei1999 as project admin
2018-02-09
- 01:11 bd808: Removed Yuvipanda at user request (T186289)
2017-08-07
- 14:09 andrewbogott: deleted etcd-k8s-CTEST and k8s-master-CTEST
2017-04-26
- 15:38 madhuvishy: add Madhuvishy as projectadmin
2016-10-07
- 19:30 valhallasw`cloud: (puppet certs, to be precise)
- 19:30 valhallasw`cloud: fixed certs on toolsbeta-vagrant3-scfc.toolsbeta.eqiad.wmflabs
2016-10-04
- 19:31 valhallasw`cloud: puppet is broken due to incorrect certificates. Cleaning up ('puppet cert clean toolsbeta-webgrid-lighttpd-1406.toolsbeta.eqiad.wmflabs' on puppetmaster3, 'rm -f /var/lib/puppet/client/ssl/certs/toolsbeta-webgrid-lighttpd-1406.toolsbeta.eqiad.wmflabs.pem' on host, for all hosts that I got emails for)
2016-09-08
- 17:11 bd808: Added BryanDavis (self) to project as admin
2016-08-29
- 19:20 yuvipanda: reboot toolsbeta-master, seems, uh, stuck
- 19:18 yuvipanda: reboot toolsbeta-mail, seems, uh, stuck
- 18:48 yuvipanda: reboot toolsbeta-puppetmaster3, puppet run process became Zommmmbiiiieeee, ate all my brains
2016-07-03
- 15:02 yuvipanda: migrating toolsbeta-valhallasw-puppet-compiler to labvirt1011 to ease pressure on labvirt1010
2016-05-27
- 18:57 valhallasw`cloud: sudo qconf -Ae /var/lib/gridengine/etc/exechosts/toolsbeta-exec-1209.toolsbeta.eqiad.wmflabs
2016-05-26
- 15:08 valhallasw`cloud: toolsbeta-mail has high load (1.0) without clear origin, so rebooting the host
2015-10-13
- 19:21 valhallasw`cloud: started building toolsbeta-bastion.
2015-09-07
- 18:50 valhallasw`cloud: role::bastion is now applied on -exec-101. Now for the package_builder manifest...
- 18:30 valhallasw`cloud: applied role::toollabs::bastion on toolsbeta-exec-101 (spinning up a whole new instance will take ages)
July 4
- 12:57 valhallasw`cloud: restarting toolsbeta-webproxy, no response on port 22
July 2
- 14:55 valhallasw`cloud: toolsbeta-webproxy does not respond at all to SSH; rebooting
July 1
- 19:47 valhallasw`cloud: still can't login :/ not sure if this is a remainder of the NFS failure or something else; maybe a puppet run will solve it?
- 19:44 valhallasw`cloud: restarting toolsbeta-exec-01 and toolsbeta-mail as I can't login
June 7
- 14:44 valhallasw: updated /var/lib/git/operations/puppet to make sure the other hosts get the memo
- 14:42 YuviPanda: run sudo sed -i 's/GlobalSign_CA.pem/ca-certificates.crt/' /etc/ldap/ldap.conf on toolsbeta-puppetmaster3 to fix broken LDAP TLS config
May 11
- 18:14 valhallasw: building toolsbeta-pbuilder to experiment with pbuilder for building packages
May 2
- 11:11 valhallasw`cloud: commenting out include ::elasticsearch::ganglia in role::logstash seems to work. I think we have to write our own tools logstash roles anyway in the end, as the role::logstash code contains e.g. mediawiki specific code
- 10:37 valhallasw`cloud: that doesn't seem to be applied... setting has_ganglia: false manually in wikitech hiera
- 10:30 valhallasw`cloud: pulled new changes into puppetmaster to get https://github.com/wikimedia/operations-puppet/commit/4afd23d8e2905a84ef211ad92e8314173eb743ba in
- 10:25 valhallasw`cloud: set Hiera variable "elasticsearch::cluster_name": toolsbeta-logstash-eqiad
- 10:09 valhallasw`cloud: created toolsbeta-logstash to play around with logstash and figure out what we need for tools (phab:T97861)
April 26
- 18:18 valhallasw`cloud: having some issues with puppet-test, so postponing for now
- 17:12 valhallasw`cloud: deploying https://gerrit.wikimedia.org/r/#/c/206118/ on tools-beta using puppet-test
March 31
- 00:27 andrewbogott: shut down toolsbeta-webgrid-03 to conserve resources. It can be restarted when needed.
September 20
- 20:09 andrewbogott_afk: moved toolsbeta-exec-01 and toolsbeta-scfc-icinga-test off of virt1006
July 22
- 11:36 scfc_de: Removed andrewbogott_afk, Coren, petan, YuviPanda from service group admin to prevent further spamming :-)
August 19
- 12:44 petan: rebooting apache, it seems to be frozen
August 4
- 23:50 scfc_de: Added scfc_de to local-admin so I don't log myself out again :-)
July 6
- 19:42 petan: rebooting login
June 26
- 08:03 wm-bot: petrb: updating logsplitter
June 24
- 14:47 wm-bot: petrb: rebooting exec-01 to fix the grid weird info
- 13:43 scfc_de: Made scfc root.
- 13:42 scfc_de: Created toolsbeta-puppetmaster.
- 11:09 YuviPanda: Granted yuvipanda root on toolsbeta
June 21
- 13:46 wm-bot: petrb: rebooting all servers
June 17
- 08:31 petan: switching all instances to nfs
June 16
- 15:37 petan: importing sudo policies of tools
- 15:36 petan: importing security groups of tools
- 15:36 petan: blah