You are browsing a read-only backup copy of Wikitech. The primary site can be found at wikitech.wikimedia.org

Nova Resource:Admin/SAL: Revision history

3 August 2021

  • 17:40, 3 August 2021 imported>Stashbot 135,915 bytes +85 bstorm: rerunning the glance backup script after failure

31 July 2021

  • 00:10, 31 July 2021 imported>Stashbot 135,830 bytes +233 andrewbogott: "systemctl reset-failed cloud-init.service" on all VMs for T287309

27 July 2021

  • 21:32, 27 July 2021 imported>Stashbot 135,597 bytes +313 andrewbogott: putting cloudvirt1012 back into service T286748

23 July 2021

  • 15:22, 23 July 2021 imported>Stashbot 135,284 bytes +88 bstorm: update wikireplicas-dns for s7 fix for web replicas

20 July 2021

  • 17:07, 20 July 2021 imported>Stashbot 135,196 bytes +215 andrewbogott: reloading haproxy on dbproxy1018 for T286598
  • 00:10, 20 July 2021 imported>Stashbot 134,981 bytes +465 bstorm: restarting nova-api on cloudcontrol1003 to try and recover whatever it's doing with designate_floating_ip_ptr_records_updater

16 July 2021

  • 09:55, 16 July 2021 imported>Stashbot 134,516 bytes +103 dcaro: checking HP raid issues on cloudvirt1012 (T286766)

14 July 2021

  • 21:08, 14 July 2021 imported>Stashbot 134,413 bytes +316 andrewbogott: restarting lots of openstack services while trying to resolve T286675

2 July 2021

  • 10:12, 2 July 2021 imported>Stashbot 134,097 bytes +1,731 wm-bot: The cluster is not rebalance after adding the new OSDs ['cloudcephosd1019.eqiad.wmnet', 'cloudcephosd1020.eqiad.wmnet'] (T285858) - cookbook ran by dcaro@vulcanus

1 July 2021

  • 16:27, 1 July 2021 imported>Stashbot 132,366 bytes +2,402 bstorm: failed over cloudstore1009 to cloudstore1008 T224747

30 June 2021

  • 21:48, 30 June 2021 imported>Stashbot 129,964 bytes +115 bstorm: downtimed space alerts for scratch on cloudstore1008 until after the migration

25 June 2021

  • 15:28, 25 June 2021 imported>Stashbot 129,849 bytes +238 andrewbogott: restarting openstack services on cloudcontrol1005

21 June 2021

  • 13:54, 21 June 2021 imported>Stashbot 129,611 bytes +228 dcaro: puppet fix merged and deployed, servers are back to normal

20 June 2021

  • 22:21, 20 June 2021 imported>Stashbot 129,383 bytes +144 andrewbogott: clearing admin-monitoring VMs; puppet has been failing lately due to a full drive on the puppetmaster

15 June 2021

  • 01:18, 15 June 2021 imported>Stashbot 129,239 bytes +130 bstorm: running a modified version of the prometheus dir size cron in screen T284964

14 June 2021

  • 10:13, 14 June 2021 imported>Stashbot 129,109 bytes +110 dcaro: setting ssd to debug mode on tools-sgeexec-0917 (T284130)

10 June 2021

  • 10:58, 10 June 2021 imported>Stashbot 128,999 bytes +3,910 wm-bot: Finished rebooting the nodes ['cloudcephmon2002-dev', 'cloudcephmon2003-dev', 'cloudcephmon2004-dev'] (T281248) - cookbook ran by dcaro@vulcanus

9 June 2021

  • 17:33, 9 June 2021 imported>Stashbot 125,089 bytes +1,815 arturo: removed icinga downtime for cloudmetrics1002 -- to see if hardware is healthy (T281881)

8 June 2021

  • 23:19, 8 June 2021 imported>Stashbot 123,274 bytes +2,253 bd808: Downtimed cloudmetrics1002 in icinga until 2021-06-30 23:59:01 (T281881)

7 June 2021

  • 14:27, 7 June 2021 imported>Stashbot 121,021 bytes +138 andrewbogott: moving cloudvirt1040 from 'maintenance' aggregate to 'ceph' aggregate T281399

1 June 2021

  • 13:12, 1 June 2021 imported>Stashbot 120,883 bytes +293 dcaro: Changed the ceph osd_memory_target on eqiad pool to 6Gi (we were reaching the limit, swapping at some points)

27 May 2021

  • 14:58, 27 May 2021 imported>Stashbot 120,590 bytes +77 wm-bot: Testing - cookbook ran by dcaro@vulcanus

26 May 2021

  • 19:10, 26 May 2021 imported>Stashbot 120,513 bytes +688 andrewbogott: reimaging cloudvirt1018 to support local VM storage

25 May 2021

  • 16:14, 25 May 2021 imported>Stashbot 119,825 bytes +412 bd808: Closed #wikimedia-cloud-admin on f***node

24 May 2021

  • 22:32, 24 May 2021 imported>Stashbot 119,413 bytes +302 andrewbogott: changing the default ttl for eqiad1.wikimedia.cloud. from 3600 to 60; this should help us avoid madness when re-using hostnames.

22 May 2021

  • 02:14, 22 May 2021 imported>Stashbot 119,111 bytes +159 bstorm: downtiming SMART alerts on dumps server labstore1007 for the weekend because it has been flapping T281045

13 May 2021

  • 21:25, 13 May 2021 imported>Stashbot 118,952 bytes +245 bstorm: converted the maps and scratch volumes on cloudstore1008 (standby) to drbd T224747

12 May 2021

  • 14:23, 12 May 2021 imported>Stashbot 118,707 bytes +189 arturo: [codfw1dev] cleanup old unused agents (bgp, ovs)

11 May 2021

  • 18:00, 11 May 2021 imported>Stashbot 118,518 bytes +198 andrewbogott: adding 'trove' service project in advance of deploying trove in eqiad1

9 May 2021

  • 10:53, 9 May 2021 imported>Stashbot 118,320 bytes +109 arturo: icinga-downtime cloudmetrics1002 for 3 months (T275605)

7 May 2021

  • 13:51, 7 May 2021 imported>Stashbot 118,211 bytes +252 andrewbogott: add inherited 'admin' right to novaadmin user throughout eqiad1. I was trying to narrow down the rights here but lack of admin breaks some workflows, e.g. T281894 and T282235

6 May 2021

  • 15:31, 6 May 2021 imported>Stashbot 117,959 bytes +249 arturo: about to migrate the CloudVPS network to the cloudgw architecture T270704

5 May 2021

  • 16:07, 5 May 2021 imported>Stashbot 117,710 bytes +4,552 dcaro: disallowing insecure global ids on the eqiad ceph cluster (T280641)

4 May 2021

  • 16:05, 4 May 2021 imported>Stashbot 113,158 bytes +1,656 wm-bot: Safe reboot of 'cloudvirt1028.eqiad.wmnet' finished successfully. (T280641) - cookbook ran by dcaro@vulcanus

3 May 2021

  • 23:53, 3 May 2021 imported>Stashbot 111,502 bytes +1,153 bstorm: running `maintain-dbusers harvest-replicas` on labstore1004 T281287

30 April 2021

  • 11:16, 30 April 2021 imported>Stashbot 110,349 bytes +267 dcaro: draining and rebooting cloudvirt1017, last one today (T280641)

29 April 2021

  • 15:11, 29 April 2021 imported>Stashbot 110,082 bytes +404 dcaro: hard rebooting cloudmetrics1002, got hung again (T275605)

28 April 2021

  • 21:11, 28 April 2021 imported>Stashbot 109,678 bytes +2,619 andrewbogott: cleaning up more references to deleted hypervisors with `delete from services where topic='compute' and version != 53;`
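
The cleanup query in the entry above can be tried out on a toy table. This is only a sketch: it assumes a simplified, hypothetical `services` table with three columns and uses an in-memory SQLite database, whereas the real nova table has more columns and lives in a different database engine. The host names are made up for illustration.

```python
import sqlite3

# Hypothetical, simplified stand-in for the nova "services" table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE services (host TEXT, topic TEXT, version INTEGER)")
conn.executemany(
    "INSERT INTO services VALUES (?, ?, ?)",
    [
        ("cloudvirt-live", "compute", 53),    # current compute service: kept
        ("cloudvirt-gone", "compute", 40),    # stale compute record: deleted
        ("cloudcontrol-x", "scheduler", 40),  # not a compute service: kept
    ],
)

# Same predicate as the logged statement: only compute rows whose
# version differs from 53 are removed.
conn.execute("DELETE FROM services WHERE topic = 'compute' AND version != 53")

remaining = [row[0] for row in conn.execute("SELECT host FROM services ORDER BY host")]
print(remaining)
```

Both conditions must hold for a row to be deleted, so services with other topics survive regardless of their version.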

27 April 2021

  • 14:10, 27 April 2021 imported>Stashbot 107,059 bytes +1,057 dcaro: codfw.openstack upgraded ceph libraries to 15.2.11 (T280641)

26 April 2021

  • 20:56, 26 April 2021 imported>Stashbot 106,002 bytes +265 andrewbogott: deleting spurious 'codfw1dev' and 'codw1dev-4' regions in the dallas deployment; regions without endpoints break a bunch of things

23 April 2021

  • 13:49, 23 April 2021 imported>Stashbot 105,737 bytes +569 dcaro: testing the drain_cloudvirt cookbook on codfw1 openstack cluster, draining cloudvirt2001 (T280641)

21 April 2021

  • 17:59, 21 April 2021 imported>Stashbot 105,168 bytes +439 dcaro: all monitors upgraded on codfw1 with one cookbook `cookbook --verbose -c ~/.config/spicerack/cookbook.yaml wmcs.ceph.upgrade_mons --monitor-node-fqdn cloudcephmon2002-dev.codfw.wmnet` (T280641)

20 April 2021

19 April 2021

  • 08:40, 19 April 2021 imported>Stashbot 104,615 bytes +218 dcaro: enabling puppet on labstore1004 after mysql restart (T279657)

14 April 2021

  • 10:48, 14 April 2021 imported>Stashbot 104,397 bytes +588 dcaro: Upgrade of codfw ceph to octopus 15.2.10 done, will run some performance tests now (T274566)

13 April 2021

  • 16:42, 13 April 2021 imported>Stashbot 103,809 bytes +989 dcaro: Ceph balancer got the cluster to eval 0.014916, that is 88-77% usage for compute pool, and 28-19% usage for the cinder one \o/ (T274573)

7 April 2021

  • 21:33, 7 April 2021 imported>Stashbot 102,820 bytes +84 andrewbogott: upgrading codfw1dev designate to Victoria

4 April 2021

  • 17:36, 4 April 2021 imported>Stashbot 102,736 bytes +79 andrewbogott: upgrading eqiad1 designate to Ussuri

2 April 2021

  • 14:12, 2 April 2021 imported>Stashbot 102,657 bytes +90 andrewbogott: upgrading codfw1dev to OpenStack version Ussuri

1 April 2021

  • 12:15, 1 April 2021 imported>Stashbot 102,567 bytes +431 dcaro: Restoring the 4.9 kernel on cloudcephosd2003-dev and upgrading (T274565)