You are browsing a read-only backup copy of Wikitech. The live site can be found at wikitech.wikimedia.org

Revision history of "Nova Resource:Admin/SAL"

  • curprev 18:56, 3 December 2021imported>Stashbot 151,474 bytes +176 andrewbogott: maintain-views and maintain-meta-p on clouddb1013-1020
  • curprev 01:17, 2 December 2021imported>Stashbot 151,298 bytes +1,832 wm-bot: Drained 'cloudvirt1028.eqiad.wmnet'. (T296790) - cookbook ran by andrew@buster
  • curprev 17:48, 28 November 2021imported>Stashbot 149,466 bytes +209 andrewbogott: moved cloudvirt1018 out of the 'localstorage' aggregate and into 'maintenance' for T296592. It will need to be moved back after the raid is rebuilt.
  • curprev 07:19, 21 November 2021imported>Stashbot 149,257 bytes +120 dcaro_away: restarting designate-sink with some extra logs in it (T296144)
  • curprev 15:48, 17 November 2021imported>Stashbot 149,137 bytes +280 andrewbogott: upgrading mariadb packages on eqiad1 cloudcontrols
  • curprev 13:31, 12 November 2021imported>Stashbot 148,857 bytes +142 arturo: restarting glance-api services to make sure they work with new ceph auth creds (T293752)
  • curprev 21:50, 8 November 2021imported>Stashbot 148,715 bytes +620 andrewbogott: returned clouddb pools back to normal after maintain_views run: https://gerrit.wikimedia.org/r/c/operations/puppet/+/737505 T216481
  • curprev 11:18, 5 November 2021imported>Stashbot 148,095 bytes +742 wm-bot: Added 1 new OSDs ['cloudcephosd1024.eqiad.wmnet'] (T295012) - cookbook ran by arturo@endurance
  • curprev 16:39, 4 November 2021imported>Stashbot 147,353 bytes +2,597 wm-bot: Added 1 new OSDs ['cloudcephosd1023.eqiad.wmnet'] (T295012) - cookbook ran by arturo@endurance
  • curprev 17:22, 3 November 2021imported>Stashbot 144,756 bytes +279 arturo: [codfw1dev] installing keepalived 2.1.5 from buster-backports on cloudgw2001-dev/2002-dev (T294956)
  • curprev 10:54, 2 November 2021imported>Stashbot 144,477 bytes +179 arturo: rebooting cloudnet1004/1003 for T291813
  • curprev 00:47, 24 October 2021imported>Stashbot 144,298 bytes +166 andrewbogott: deploying a change so that openstack clients use tls endpoints: https://gerrit.wikimedia.org/r/c/operations/puppet/+/732738
  • curprev 10:19, 21 October 2021imported>Stashbot 144,132 bytes +227 arturo: drop firewall exception on core routers for wiki replicas legacy setup (T293897)
  • curprev 21:06, 20 October 2021imported>Stashbot 143,905 bytes +99 andrewbogott: creating cloudinfra-nfs project T293936
  • curprev 19:21, 18 October 2021imported>Stashbot 143,806 bytes +252 andrewbogott: also ticked the 'admin' box on wikitech for majavah T292827
  • curprev 12:28, 14 October 2021imported>Stashbot 143,554 bytes +149 arturo: [codfw1dev] add DB grants for cloudbackup2002.codfw.wmnet IP address to the cinder DB (T292546)
  • curprev 10:46, 13 October 2021imported>Stashbot 143,405 bytes +105 arturo: updating python3-neutron across the fleet (T292936)
  • curprev 09:06, 12 October 2021imported>Stashbot 143,300 bytes +200 dcaro: upgrading eqiad cloudnet hosts neutron packages (T292936)
  • curprev 09:39, 5 October 2021imported>Stashbot 143,100 bytes +152 arturo: [codfw1dev] cleaning up manila stuff from openstack (db, endpoints, tenant, VMs, and such) T291257
  • curprev 14:50, 30 September 2021imported>Stashbot 142,948 bytes +391 andrewbogott: sudo cumin "cloud*" "ps -ef | grep nslcd && service nslcd restart" and sudo cumin "lab*" "ps -ef | grep nslcd && service nslcd restart" T292202
  • curprev 09:41, 29 September 2021imported>Stashbot 142,557 bytes +196 arturo: [codfw1dev] cleanup manila shares definitions for a clean start now that the manila-sharecontroller VM is apparently well configured (T291257)
  • curprev 16:23, 28 September 2021imported>Stashbot 142,361 bytes +531 bstorm: downtime for clouddb1020 to reduce re-pages in case this goes badly T291963
  • curprev 10:07, 27 September 2021imported>Stashbot 141,830 bytes +169 arturo: cloudcontrol1004 apparently healthy T291446
  • curprev 13:02, 24 September 2021imported>Stashbot 141,661 bytes +211 arturo: [codfw1dev] create VM manila-share-controller-01 on cloudinfra-codfw1dev
  • curprev 12:13, 21 September 2021imported>Stashbot 141,450 bytes +677 arturo: [codfw1dev] trying to create a manila service image (T291257)
  • curprev 23:08, 20 September 2021imported>Stashbot 140,773 bytes +408 bstorm: ran `echo check > /sys/block/md0/md/sync_action` on cloudcontrol1004 to check raid
  • curprev 11:35, 17 September 2021imported>Stashbot 140,365 bytes +114 arturo: [codfw1dev] install manila on cloudcontrol2001-dev (T291257)
  • curprev 15:56, 16 September 2021imported>Stashbot 140,251 bytes +134 bstorm: removing downtime for labstore1005 so we'll know if it has another issue T290318
  • curprev 22:03, 9 September 2021imported>Stashbot 140,117 bytes +315 bstorm: restarted the prometheus-mysqld-exporter@s1 service as it was not working T290630
  • curprev 15:34, 3 September 2021imported>Stashbot 139,802 bytes +365 bstorm: rebooting labstore1005 to disconnect the drives from labstore1004 T290318
  • curprev 16:16, 30 August 2021imported>Stashbot 139,437 bytes +825 wm-bot: Added 1 new OSDs ['cloudcephosd1018.eqiad.wmnet'] - cookbook ran by andrew@buster
  • curprev 18:57, 27 August 2021imported>Stashbot 138,612 bytes +126 andrewbogott: raising toolsbeta ram/core/instances quotas so majavah can experiment with bullseye
  • curprev 14:45, 25 August 2021imported>Stashbot 138,486 bytes +534 wm-bot: Finished rebooting node cloudcephosd1018.eqiad.wmnet - cookbook ran by andrew@buster
  • curprev 17:39, 19 August 2021imported>Stashbot 137,952 bytes +93 bstorm: restarting glance image backup to try and clear the page
  • curprev 16:21, 18 August 2021imported>Stashbot 137,859 bytes +899 wm-bot: Rebooting node cloudcephosd1018.eqiad.wmnet - cookbook ran by andrew@buster
  • curprev 15:11, 17 August 2021imported>Stashbot 136,960 bytes +119 andrewbogott: rebooting cloudcephosd1008 to force raid rebuild -- T287838
  • curprev 13:51, 11 August 2021imported>Stashbot 136,841 bytes +480 wm-bot: Finished rebooting node cloudcephosd1018.eqiad.wmnet - cookbook ran by dcaro@vulcanus
  • curprev 15:15, 10 August 2021imported>Stashbot 136,361 bytes +214 andrewbogott: restarting all designate services in eqiad1
  • curprev 09:37, 5 August 2021imported>Stashbot 136,147 bytes +106 dcaro: Taking one osd daemon down on codfw cluster (T288203)
  • curprev 19:20, 4 August 2021imported>Stashbot 136,041 bytes +126 bd808: Running deleteBatch.php on cloudweb2001-dev to remove legacy Hiera: pages from labtestwiki
  • curprev 17:40, 3 August 2021imported>Stashbot 135,915 bytes +85 bstorm: rerunning the glance backup script after failure
  • curprev 00:10, 31 July 2021imported>Stashbot 135,830 bytes +233 andrewbogott: "systemctl reset-failed cloud-init.service" on all VMs for T287309
  • curprev 21:32, 27 July 2021imported>Stashbot 135,597 bytes +313 andrewbogott: putting cloudvirt1012 back into service T286748
  • curprev 15:22, 23 July 2021imported>Stashbot 135,284 bytes +88 bstorm: update wikireplicas-dns for s7 fix for web replicas
  • curprev 17:07, 20 July 2021imported>Stashbot 135,196 bytes +215 andrewbogott: reloading haproxy on dbproxy1018 for T286598
  • curprev 00:10, 20 July 2021imported>Stashbot 134,981 bytes +465 bstorm: restarting nova-api on cloudcontrol1003 to try and recover whatever it's doing with designate_floating_ip_ptr_records_updater
  • curprev 09:55, 16 July 2021imported>Stashbot 134,516 bytes +103 dcaro: checking HP raid issues on cloudvirt1012 (T286766)
  • curprev 21:08, 14 July 2021imported>Stashbot 134,413 bytes +316 andrewbogott: restarting lots of openstack services while trying to resolve T286675
  • curprev 10:12, 2 July 2021imported>Stashbot 134,097 bytes +1,731 wm-bot: The cluster is now rebalanced after adding the new OSDs ['cloudcephosd1019.eqiad.wmnet', 'cloudcephosd1020.eqiad.wmnet'] (T285858) - cookbook ran by dcaro@vulcanus
  • curprev 16:27, 1 July 2021imported>Stashbot 132,366 bytes +2,402 bstorm: failed over cloudstore1009 to cloudstore1008 T224747
  • curprev 21:48, 30 June 2021imported>Stashbot 129,964 bytes +115 bstorm: downtimed space alerts for scratch on cloudstore1008 until after the migration
  • curprev 15:28, 25 June 2021imported>Stashbot 129,849 bytes +238 andrewbogott: restarting openstack services on cloudcontrol1005
  • curprev 13:54, 21 June 2021imported>Stashbot 129,611 bytes +228 dcaro: puppet fix merged and deployed, servers are back to normal
  • curprev 22:21, 20 June 2021imported>Stashbot 129,383 bytes +144 andrewbogott: clearing admin-monitoring VMs; puppet has been failing lately due to a full drive on the puppetmaster
  • curprev 01:18, 15 June 2021imported>Stashbot 129,239 bytes +130 bstorm: running a modified version of the prometheus dir size cron in screen T284964
  • curprev 10:13, 14 June 2021imported>Stashbot 129,109 bytes +110 dcaro: setting ssd to debug mode on tools-sgeexec-0917 (T284130)
  • curprev 10:58, 10 June 2021imported>Stashbot 128,999 bytes +3,910 wm-bot: Finished rebooting the nodes ['cloudcephmon2002-dev', 'cloudcephmon2003-dev', 'cloudcephmon2004-dev'] (T281248) - cookbook ran by dcaro@vulcanus
  • curprev 17:33, 9 June 2021imported>Stashbot 125,089 bytes +1,815 arturo: removed icinga downtime for cloudmetrics1002 -- to see if hardware is healthy (T281881)
  • curprev 23:19, 8 June 2021imported>Stashbot 123,274 bytes +2,253 bd808: Downtimed cloudmetrics1002 in icinga until 2021-06-30 23:59:01 (T281881)
  • curprev 14:27, 7 June 2021imported>Stashbot 121,021 bytes +138 andrewbogott: moving cloudvirt1040 from 'maintenance' aggregate to 'ceph' aggregate T281399
  • curprev 13:12, 1 June 2021imported>Stashbot 120,883 bytes +293 dcaro: Changed the ceph osd_memory_target on eqiad pool to 6Gi (we were reaching the limit, swapping at some points)
  • curprev 14:58, 27 May 2021imported>Stashbot 120,590 bytes +77 wm-bot: Testing - cookbook ran by dcaro@vulcanus
  • curprev 19:10, 26 May 2021imported>Stashbot 120,513 bytes +688 andrewbogott: reimaging cloudvirt1018 to support local VM storage
  • curprev 16:14, 25 May 2021imported>Stashbot 119,825 bytes +412 bd808: Closed #wikimedia-cloud-admin on f***node
  • curprev 22:32, 24 May 2021imported>Stashbot 119,413 bytes +302 andrewbogott: changing the default ttl for eqiad1.wikimedia.cloud. from 3600 to 60; this should help us avoid madness when re-using hostnames.
  • curprev 02:14, 22 May 2021imported>Stashbot 119,111 bytes +159 bstorm: downtiming SMART alerts on dumps server labstore1007 for the weekend because it has been flapping T281045
  • curprev 21:25, 13 May 2021imported>Stashbot 118,952 bytes +245 bstorm: converted the maps and scratch volumes on cloudstore1008 (standby) to drbd T224747
  • curprev 14:23, 12 May 2021imported>Stashbot 118,707 bytes +189 arturo: [codfw1dev] cleanup old unused agents (bgp, ovs)
  • curprev 18:00, 11 May 2021imported>Stashbot 118,518 bytes +198 andrewbogott: adding 'trove' service project in advance of deploying trove in eqiad1
  • curprev 10:53, 9 May 2021imported>Stashbot 118,320 bytes +109 arturo: icinga-downtime cloudmetrics1002 for 3 months (T275605)
  • curprev 13:51, 7 May 2021imported>Stashbot 118,211 bytes +252 andrewbogott: add inherited 'admin' right to novaadmin user throughout eqiad1. I was trying to narrow down the rights here but lack of admin breaks some workflows, e.g. T281894 and T282235
  • curprev 15:31, 6 May 2021imported>Stashbot 117,959 bytes +249 arturo: about to migrate the CloudVPS network to the cloudgw architecture T270704
  • curprev 16:07, 5 May 2021imported>Stashbot 117,710 bytes +4,552 dcaro: disallowing insecure global ids on the eqiad ceph cluster (T280641)
  • curprev 16:05, 4 May 2021imported>Stashbot 113,158 bytes +1,656 wm-bot: Safe reboot of 'cloudvirt1028.eqiad.wmnet' finished successfully. (T280641) - cookbook ran by dcaro@vulcanus
  • curprev 23:53, 3 May 2021imported>Stashbot 111,502 bytes +1,153 bstorm: running `maintain-dbusers harvest-replicas` on labstore1004 T281287
  • curprev 11:16, 30 April 2021imported>Stashbot 110,349 bytes +267 dcaro: draining and rebooting cloudvirt1017, last one today (T280641)
  • curprev 15:11, 29 April 2021imported>Stashbot 110,082 bytes +404 dcaro: hard rebooting cloudmetrics1002, got hung again (T275605)
  • curprev 21:11, 28 April 2021imported>Stashbot 109,678 bytes +2,619 andrewbogott: cleaning up more references to deleted hypervisors with delete from services where topic='compute' and version != 53;
  • curprev 14:10, 27 April 2021imported>Stashbot 107,059 bytes +1,057 dcaro: codfw.openstack upgraded ceph libraries to 15.2.11 (T280641)
  • curprev 20:56, 26 April 2021imported>Stashbot 106,002 bytes +265 andrewbogott: deleting spurious 'codfw1dev' and 'codw1dev-4' regions in the dallas deployment; regions without endpoints break a bunch of things
  • curprev 13:49, 23 April 2021imported>Stashbot 105,737 bytes +569 dcaro: testing the drain_cloudvirt cookbook on codfw1 openstack cluster, draining cloudvirt2001 (T280641)
  • curprev 17:59, 21 April 2021imported>Stashbot 105,168 bytes +439 dcaro: all monitors upgraded on codfw1 with one cookbook `cookbook --verbose -c ~/.config/spicerack/cookbook.yaml wmcs.ceph.upgrade_mons --monitor-node-fqdn cloudcephmon2002-dev.codfw.wmnet` (T280641)
  • curprev 20:21, 20 April 2021imported>Stashbot 104,729 bytes +114 andrewbogott: reboot cloudservices1003
  • curprev 08:40, 19 April 2021imported>Stashbot 104,615 bytes +218 dcaro: enabling puppet on labstore1004 after mysql restart (T279657)
  • curprev 10:48, 14 April 2021imported>Stashbot 104,397 bytes +588 dcaro: Upgrade of codfw ceph to octopus 15.2.20 done, will run some performance tests now (T274566)
  • curprev 16:42, 13 April 2021imported>Stashbot 103,809 bytes +989 dcaro: Ceph balancer got the cluster to eval 0.014916, that is 88-77% usage for compute pool, and 28-19% usage for the cinder one \o/ (T274573)
  • curprev 21:33, 7 April 2021imported>Stashbot 102,820 bytes +84 andrewbogott: upgrading codfw1dev designate to Victoria
  • curprev 17:36, 4 April 2021imported>Stashbot 102,736 bytes +79 andrewbogott: upgrading eqiad1 designate to Ussuri
  • curprev 14:12, 2 April 2021imported>Stashbot 102,657 bytes +90 andrewbogott: upgrading codfw1dev to OpenStack version Ussuri
  • curprev 12:15, 1 April 2021imported>Stashbot 102,567 bytes +431 dcaro: Restoring the 4.9 kernel on cloudcephosd2003-dev and upgrading (T274565)
  • curprev 08:47, 31 March 2021imported>Stashbot 102,136 bytes +109 dcaro: upgrading cinder on codfw cloudcontrol2* nodes (T278845)
  • curprev 09:53, 30 March 2021imported>Stashbot 102,027 bytes +119 arturo: rebooting cloudnet1003 to cleanup conntrack table, it wouldn't cleanup by hand ...
  • curprev 15:42, 28 March 2021imported>Stashbot 101,908 bytes +80 andrewbogott: updated debian-10.0-buster base image
  • curprev 09:54, 27 March 2021imported>Stashbot 101,828 bytes +102 arturo: cleanup conntrack table in qrouter netns in cloudnet1003 (backup)
  • curprev 19:03, 25 March 2021imported>Stashbot 101,726 bytes +576 andrewbogott: deleting all unused (per wmcs-imageusage) Jessie base images from Glance
  • curprev 09:19, 24 March 2021imported>Stashbot 101,150 bytes +158 dcaro: restarted wmcs-backup on cloudvirt1024 as it failed due to an image being removed while running (T276892)
  • curprev 11:33, 23 March 2021imported>Stashbot 100,992 bytes +94 arturo: root@cloudcontrol1005:~# wmcs-novastats-dnsleaks --delete
  • curprev 10:10, 22 March 2021imported>Stashbot 100,898 bytes +191 arturo: cleanup conntrack table in standby node: aborrero@cloudnet1003:~ $ sudo ip netns exec qrouter-d93771ba-2711-4f88-804a-8df6fd03978a conntrack -F
  • curprev 17:18, 19 March 2021imported>Stashbot 100,707 bytes +293 bstorm: running `ALTER TABLE account MODIFY COLUMN type ENUM('user','tool','paws');` against the labsdbaccounts database on m5 T276284
  • curprev 00:30, 19 March 2021imported>Stashbot 100,414 bytes +94 bstorm: downtimed labstore1004 to check some things in debug mode