You are browsing a read-only backup copy of Wikitech. The live site can be found at wikitech.wikimedia.org

Difference between revisions of "Nova Resource:Toolsbeta/SAL"

From Wikitech-static
imported>Stashbot
(chicocvenancio: updated paws-beta k8s cluster and nodes to v1.9.4 for T189680)
imported>Stashbot
(zhuyifei1999_: backed up Neha16's changes to toolsbeta-bastion-01:/usr/lib/python2.7/dist-packages/toollabs to toollabs.bak in the same dir via cp -a, and re-install webservice code on the bastion to debug T156626)
=== 2018-07-22 ===
* 14:28 zhuyifei1999_: backed up Neha16's changes to toolsbeta-bastion-01:/usr/lib/python2.7/dist-packages/toollabs to toollabs.bak in the same dir via cp -a, and re-install webservice code on the bastion to debug [[phab:T156626|T156626]]
=== 2018-07-18 ===
* 10:46 harej: Deleted toolsbeta-flynn-01
=== 2018-07-12 ===
* 23:06 bstorm_: Got the grid master running
=== 2018-06-28 ===
* 16:34 chicocvenancio: adding harej as root for flynn testing
=== 2018-06-27 ===
* 22:35 chicocvenancio: add harej as project admin to test Flynn stuff
=== 2018-06-22 ===
* 22:26 chicocvenancio: reconfigured toolsbeta-paws-master-01 kubelet to test image pruning
* 09:39 zhuyifei1999_: fixed that by running `sudo mv /var/lib/puppet/ssl /var/lib/puppet/ssl.bak` then following the red instructions
* 09:33 zhuyifei1999_: puppet is broken on toolsbeta-bastion-01, investigating
* 09:03 zhuyifei1999_: killing and rebuilding toolsbeta-bastion-01
* 08:31 zhuyifei1999_: on toolsbeta-bastion-01, killed /etc/apt/sources.list.d/jonathonf-python-2_7-trusty.list ppa, downgraded python from 2.7.14 to 2.7.5, and reinstalled toollabs-webservice
* 07:56 andrewbogott: someone removed /usr/bin/webservice
=== 2018-05-15 ===
* 07:26 zhuyifei1999_: applied {{Gerrit|5324236}} via toolsbeta-puppetmaster-01 [[phab:T190893|T190893]]
* 05:28 zhuyifei1999_: Making project puppetmaster at toolsbeta-puppetmaster-01
=== 2018-05-08 ===
* 02:18 zhuyifei1999_: manually created flannel etcd key [[phab:T190893|T190893]]
=== 2018-05-07 ===
* 19:01 zhuyifei1999_: install kubernetes-client on toolsbeta-worker-1001 to debug stuffs
* 18:41 zhuyifei1999_: rebuilding toolsbeta-k8s-etcd-01
* 17:58 zhuyifei1999_: cleanup from maintain-kubeusers using the wrong project to create tool home dirs: `find /data/project/ -mindepth 1 -maxdepth 1 -type d \! -user 0 {{!}} (while read dir; do id toolsbeta.`basename $dir` 2> /dev/null {{!}}{{!}} sudo rm -rfv $dir; done)`
* 16:41 zhuyifei1999_: rebuild toolsbeta-k8s-master-01 because I can't figure out why puppet can't update maintain-kubeusers.systemd
=== 2018-05-06 ===
* 04:06 zhuyifei1999_: locally patched `/usr/lib/python2.7/dist-packages/toollabs/common/tool.py` on bastion and webgrid-lighttpd
=== 2018-05-05 ===
* 19:51 zhuyifei1999_: `systemctl mask maintain-kubeusers` because it's causing a mess, tries to get the tool list from toolforge [[phab:T190893|T190893]]
* 18:40 zhuyifei1999_: to unblock k8s testing while waiting on https://gerrit.wikimedia.org/r/430539, installed the package directly on `toolsbeta-k8s-master-01` with `$ sudo apt install python3-yaml`
=== 2018-05-02 ===
* 21:02 zhuyifei1999_: copy over labs/private:/hieradata/labs/tools/common.yaml to project puppet hiera
* 20:37 bd808: Added Neha16 as a project admin for work on [[phab:T175768|T175768]]
* 20:31 zhuyifei1999_: nuke webservice instances and rebuild
* 20:31 zhuyifei1999_: Added k8s_infrastructure_users to project hiera on horizon [[phab:T192618|T192618]]
=== 2018-04-20 ===
* 00:20 zhuyifei1999_: deleted all instances I just created except k8s master because of chicken-and-egg problem
=== 2018-04-19 ===
* 22:10 zhuyifei1999_: the trusty instances ask me for my password. the jessie instances don't like my ssh key. :(
* 21:59 zhuyifei1999_: got 'Error: RecordSet belongs in a child zone: toolsbeta.wmflabs.org', using tools-beta.wmflabs.org instead
* 21:57 zhuyifei1999_: Add proxy toolsbeta.wmflabs.org => toolsbeta-proxy-01.toolsbeta.eqiad.wmflabs
* 21:43 zhuyifei1999_: Start creating instances for webservice setup [[phab:T190893|T190893]]
=== 2018-03-30 ===
* 22:40 zhuyifei1999_: copied over many prefix puppet configuration in horizon from toolforge [[phab:T190893|T190893]]
=== 2018-03-14 ===
* 18:07 chicocvenancio: updated paws-beta k8s cluster and nodes to v1.9.4 for [[phab:T189680|T189680]]

Revision as of 14:28, 22 July 2018

2018-07-22

  • 14:28 zhuyifei1999_: backed up Neha16's changes to toolsbeta-bastion-01:/usr/lib/python2.7/dist-packages/toollabs to toollabs.bak in the same dir via cp -a, and re-install webservice code on the bastion to debug T156626

2018-07-18

  • 10:46 harej: Deleted toolsbeta-flynn-01

2018-07-12

  • 23:06 bstorm_: Got the grid master running

2018-06-28

  • 16:34 chicocvenancio: adding harej as root for flynn testing

2018-06-27

  • 22:35 chicocvenancio: add harej as project admin to test Flynn stuff

2018-06-22

  • 22:26 chicocvenancio: reconfigured toolsbeta-paws-master-01 kubelet to test image pruning
  • 09:39 zhuyifei1999_: fixed that by running `sudo mv /var/lib/puppet/ssl /var/lib/puppet/ssl.bak` then following the red instructions
  • 09:33 zhuyifei1999_: puppet is broken on toolsbeta-bastion-01, investigating
  • 09:03 zhuyifei1999_: killing and rebuilding toolsbeta-bastion-01
  • 08:31 zhuyifei1999_: on toolsbeta-bastion-01, killed /etc/apt/sources.list.d/jonathonf-python-2_7-trusty.list ppa, downgraded python from 2.7.14 to 2.7.5, and reinstalled toollabs-webservice
  • 07:56 andrewbogott: someone removed /usr/bin/webservice

2018-05-15

  • 07:26 zhuyifei1999_: applied 5324236 via toolsbeta-puppetmaster-01 T190893
  • 05:28 zhuyifei1999_: Making project puppetmaster at toolsbeta-puppetmaster-01

2018-05-08

  • 02:18 zhuyifei1999_: manually created flannel etcd key T190893

2018-05-07

  • 19:01 zhuyifei1999_: install kubernetes-client on toolsbeta-worker-1001 to debug stuffs
  • 18:41 zhuyifei1999_: rebuilding toolsbeta-k8s-etcd-01
  • 17:58 zhuyifei1999_: cleanup from maintain-kubeusers using the wrong project to create tool home dirs: `find /data/project/ -mindepth 1 -maxdepth 1 -type d \! -user 0 | (while read dir; do id toolsbeta.`basename $dir` 2> /dev/null || sudo rm -rfv $dir; done)`
  • 16:41 zhuyifei1999_: rebuild toolsbeta-k8s-master-01 because I can't figure out why puppet can't update maintain-kubeusers.systemd
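
The 17:58 cleanup one-liner above deletes as it goes. A dry-run variant that only prints the candidate directories might look like the sketch below. `list_orphan_dirs` is a hypothetical helper, not part of any Toolforge tooling, and the third argument (the uid whose directories are skipped) is an assumption added so the owner filter can be adjusted; it defaults to 0 to mirror `\! -user 0`.

```shell
#!/bin/sh
# Dry-run sketch: print top-level directories that have no matching account.
# list_orphan_dirs is a hypothetical helper name, not an existing tool.
# Arguments: root directory, account prefix, uid whose dirs are skipped
# (default 0, mirroring `\! -user 0` in the logged command).
list_orphan_dirs() {
    root=$1
    prefix=$2
    skip_uid=${3:-0}
    find "$root" -mindepth 1 -maxdepth 1 -type d ! -user "$skip_uid" |
    while IFS= read -r dir; do
        # `id` exits non-zero when the account does not exist.
        if ! id "$prefix.$(basename "$dir")" >/dev/null 2>&1; then
            printf '%s\n' "$dir"
        fi
    done
}
```

Calling `list_orphan_dirs /data/project toolsbeta` would list the directories the logged command would have removed, so the output can be reviewed before anything is deleted. Quoting `"$dir"` also keeps names with spaces intact, which the original one-liner does not.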

2018-05-06

  • 04:06 zhuyifei1999_: locally patched `/usr/lib/python2.7/dist-packages/toollabs/common/tool.py` on bastion and webgrid-lighttpd

2018-05-05

  • 19:51 zhuyifei1999_: `systemctl mask maintain-kubeusers` because it's causing a mess, tries to get the tool list from toolforge T190893
  • 18:40 zhuyifei1999_: to unblock k8s testing while waiting on https://gerrit.wikimedia.org/r/430539, installed the package directly on `toolsbeta-k8s-master-01` with `$ sudo apt install python3-yaml`

2018-05-02

  • 21:02 zhuyifei1999_: copy over labs/private:/hieradata/labs/tools/common.yaml to project puppet hiera
  • 20:37 bd808: Added Neha16 as a project admin for work on T175768
  • 20:31 zhuyifei1999_: nuke webservice instances and rebuild
  • 20:31 zhuyifei1999_: Added k8s_infrastructure_users to project hiera on horizon T192618

2018-04-20

  • 00:20 zhuyifei1999_: deleted all instances I just created except k8s master because of chicken-and-egg problem

2018-04-19

  • 22:10 zhuyifei1999_: the trusty instances ask me for my password. the jessie instances don't like my ssh key. :(
  • 21:59 zhuyifei1999_: got 'Error: RecordSet belongs in a child zone: toolsbeta.wmflabs.org', using tools-beta.wmflabs.org instead
  • 21:57 zhuyifei1999_: Add proxy toolsbeta.wmflabs.org => toolsbeta-proxy-01.toolsbeta.eqiad.wmflabs
  • 21:43 zhuyifei1999_: Start creating instances for webservice setup T190893

2018-03-30

  • 22:40 zhuyifei1999_: copied over many prefix puppet configuration in horizon from toolforge T190893

2018-03-14

  • 18:07 chicocvenancio: updated paws-beta k8s cluster and nodes to v1.9.4 for T189680

2018-03-05

  • 19:33 chicocvenancio: added Zhuyifei1999 as project admin

2018-02-09

  • 01:11 bd808: Removed Yuvipanda at user request (T186289)

2017-08-07

  • 14:09 andrewbogott: deleted etcd-k8s-CTEST and k8s-master-CTEST

2017-04-26

  • 15:38 madhuvishy: add Madhuvishy as projectadmin

2016-10-07

  • 19:30 valhallasw`cloud: (puppet certs, to be precise)
  • 19:30 valhallasw`cloud: fixed certs on toolsbeta-vagrant3-scfc.toolsbeta.eqiad.wmflabs

2016-10-04

  • 19:31 valhallasw`cloud: puppet is broken due to incorrect certificates. Cleaning up ('puppet cert clean toolsbeta-webgrid-lighttpd-1406.toolsbeta.eqiad.wmflabs' on puppetmaster3, 'rm -f /var/lib/puppet/client/ssl/certs/toolsbeta-webgrid-lighttpd-1406.toolsbeta.eqiad.wmflabs.pem' on host, for all hosts that I got emails for)
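
For a batch of affected hosts, the two cleanup steps quoted in the entry above could be generated rather than typed by hand. The sketch below only prints the commands and runs nothing; `cert_cleanup_cmds` is a hypothetical helper name, and the `puppetmaster:` label is just a placeholder for wherever the first command is meant to run (puppetmaster3 in the log).

```shell
#!/bin/sh
# Print, for each FQDN read from stdin, the two cleanup steps from the log:
# one to run on the puppetmaster and one to run on the affected host.
# cert_cleanup_cmds is a hypothetical helper; it executes nothing itself.
cert_cleanup_cmds() {
    while IFS= read -r fqdn; do
        # Skip blank lines in the input.
        [ -n "$fqdn" ] || continue
        printf 'puppetmaster: puppet cert clean %s\n' "$fqdn"
        printf '%s: rm -f /var/lib/puppet/client/ssl/certs/%s.pem\n' "$fqdn" "$fqdn"
    done
}
```

Feeding it a file with one hostname per line (for example `cert_cleanup_cmds < hosts.txt`, where `hosts.txt` is an assumed file name) gives a checklist that can be reviewed before running each step.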

2016-09-08

  • 17:11 bd808: Added BryanDavis (self) to project as admin

2016-08-29

  • 19:20 yuvipanda: reboot toolsbeta-master, seems, uh, stuck
  • 19:18 yuvipanda: reboot toolsbeta-mail, seems, uh, stuck
  • 18:48 yuvipanda: reboot toolsbeta-puppetmaster3, puppet run process became Zommmmbiiiieeee, ate all my brains

2016-07-03

  • 15:02 yuvipanda: migrating toolsbeta-valhallasw-puppet-compiler to labvirt1011 to ease pressure on labvirt1010

2016-05-27

  • 18:57 valhallasw`cloud: sudo qconf -Ae /var/lib/gridengine/etc/exechosts/toolsbeta-exec-1209.toolsbeta.eqiad.wmflabs

2016-05-26

  • 15:08 valhallasw`cloud: toolsbeta-mail has high load (1.0) without clear origin, so rebooting the host

2015-10-13

  • 19:21 valhallasw`cloud: started building toolsbeta-bastion.

2015-09-07

  • 18:50 valhallasw`cloud: role::bastion is now applied on -exec-101. Now for the package_builder manifest...
  • 18:30 valhallasw`cloud: applied role::toollabs::bastion on toolsbeta-exec-101 (spinning up a whole new instance will take ages)

July 4

  • 12:57 valhallasw`cloud: restarting toolsbeta-webproxy, no response on port 22

July 2

  • 14:55 valhallasw`cloud: toolsbeta-webproxy does not respond at all to SSH; rebooting

July 1

  • 19:47 valhallasw`cloud: still can't login :/ not sure if this is a remainder of the NFS failure or something else; maybe a puppet run will solve it?
  • 19:44 valhallasw`cloud: restarting toolsbeta-exec-01 and toolsbeta-mail as I can't login

June 7

  • 14:44 valhallasw: updated /var/lib/git/operations/puppet to make sure the other hosts get the memo
  • 14:42 YuviPanda: run sudo sed -i 's/GlobalSign_CA.pem/ca-certificates.crt/' /etc/ldap/ldap.conf on toolsbeta-puppetmaster3 to fix broken LDAP TLS config
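
The 14:42 `sed -i` edit rewrites the file in place. A way to preview the same substitution first might look like this sketch; `preview_ca_swap` is a hypothetical helper name, and the only change to the expression itself is escaping the dot in `GlobalSign_CA.pem` so that it matches literally.

```shell
#!/bin/sh
# Print the file with the CA bundle name swapped, leaving the file untouched.
# preview_ca_swap is a hypothetical helper name.
preview_ca_swap() {
    # Without -i, sed writes the result to stdout instead of editing the file.
    sed 's/GlobalSign_CA\.pem/ca-certificates.crt/' "$1"
}
```

Running `preview_ca_swap /etc/ldap/ldap.conf | diff /etc/ldap/ldap.conf -` would show exactly which lines the in-place edit is going to change.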

May 11

  • 18:14 valhallasw: building toolsbeta-pbuilder to experiment with pbuilder for building packages

May 2

  • 11:11 valhallasw`cloud: commenting out include ::elasticsearch::ganglia in role::logstash seems to work. I think we have to write our own tools logstash roles anyway in the end, as the role::logstash code contains e.g. mediawiki specific code
  • 10:37 valhallasw`cloud: that doesn't seem to be applied... setting has_ganglia: false manually in wikitech hiera
  • 10:30 valhallasw`cloud: pulled new changes into puppetmaster to get https://github.com/wikimedia/operations-puppet/commit/4afd23d8e2905a84ef211ad92e8314173eb743ba in
  • 10:25 valhallasw`cloud: set Hiera variable "elasticsearch::cluster_name": toolsbeta-logstash-eqiad
  • 10:09 valhallasw`cloud: created toolsbeta-logstash to play around with logstash and figure out what we need for tools (phab:T97861)

April 26

March 31

  • 00:27 andrewbogott: shut down toolsbeta-webgrid-03 to conserve resources. It can be restarted when needed.

September 20

  • 20:09 andrewbogott_afk: moved toolsbeta-exec-01 and toolsbeta-scfc-icinga-test off of virt1006

July 22

  • 11:36 scfc_de: Removed andrewbogott_afk, Coren, petan, YuviPanda from service group admin to prevent further spamming :-)

August 19

  • 12:44 petan: rebooting apache it seems to be frozen

August 4

  • 23:50 scfc_de: Added scfc_de to local-admin so I don't log myself out again :-)

July 6

  • 19:42 petan: rebooting login

June 26

  • 08:03 wm-bot: petrb: updating logsplitter

June 24

  • 14:47 wm-bot: petrb: rebooting exec-01 to fix the grid weird info
  • 13:43 scfc_de: Made scfc root.
  • 13:42 scfc_de: Created toolsbeta-puppetmaster.
  • 11:09 YuviPanda: Granted yuvipanda root on toolsbeta

June 21

  • 13:46 wm-bot: petrb: rebooting all servers

June 17

  • 08:31 petan: switching all instances to nfs

June 16

  • 15:37 petan: importing sudo policies of tools
  • 15:36 petan: importing security groups of tools
  • 15:36 petan: blah