You are browsing a read-only backup copy of Wikitech. The live site can be found at wikitech.wikimedia.org

Revision history of "Nova Resource:Admin/SAL"

  • curprev 17:28, 17 March 2021imported>Stashbot 100,320 bytes +432 bstorm: restarted the backup-glance-images job to clear errors in systemd T271782
  • curprev 16:51, 10 March 2021imported>Stashbot 99,888 bytes +1,162 arturo: rebooting cloudvirt1030 for T275753
  • curprev 16:27, 9 March 2021imported>Stashbot 98,726 bytes +871 arturo: rebooting cloudvirt1027 (T275753)
  • curprev 21:40, 5 March 2021imported>Stashbot 97,855 bytes +748 andrewbogott: replacing 'observer' role with 'reader' role in eqiad1 T276018
  • curprev 18:36, 4 March 2021imported>Stashbot 97,107 bytes +1,044 andrewbogott: rebooting cloudmetrics1002; the console is hanging
  • curprev 17:16, 3 March 2021imported>Stashbot 96,063 bytes +1,804 andrewbogott: restarting rabbitmq-server on cloudcontrol1003,1004,1005; trying to explain amqp errors in scheduler logs
  • curprev 17:16, 2 March 2021imported>Stashbot 94,259 bytes +717 andrewbogott: rebooting cloudvirt1039 to see if I can trigger T276208
  • curprev 20:12, 1 March 2021imported>Stashbot 93,542 bytes +347 andrewbogott: removing novaadmin from all projects save 'admin' for T274385
  • curprev 04:54, 28 February 2021imported>Stashbot 93,195 bytes +162 andrewbogott: restarted redis-server on tools-redis-1003 and tools-redis-1004 in an attempt to reduce replag, no real change detected
  • curprev 00:33, 27 February 2021imported>Stashbot 93,033 bytes +4,713 andrewbogott: sudo cumin --timeout 500 "A:all and not O{project:clouddb-services}" 'lsb_release -c | grep -i buster && uname -r | grep -v 4.19.0-14-amd64 && reboot'
  • curprev 14:56, 25 February 2021imported>Stashbot 88,320 bytes +121 arturo: deployed wmcs-netns-events daemon to all cloudnet servers (T275483)
  • curprev 11:07, 24 February 2021imported>Stashbot 88,199 bytes +112 arturo: force-reboot cloudmetrics1002, add icinga downtime for 2 hours. Investigating some server issue
  • curprev 00:17, 24 February 2021imported>Stashbot 88,087 bytes +717 bstorm: set --property hw_scsi_model=virtio-scsi and --property hw_disk_bus=scsi on the main stretch image in glance on eqiad1 T275430
  • curprev 17:15, 22 February 2021imported>Stashbot 87,370 bytes +368 bstorm: restarting nova-compute on cloudvirt1016 and cloudvirt1036 in case it helps T275411
  • curprev 14:50, 18 February 2021imported>Stashbot 87,002 bytes +426 arturo: rebooting cloudnet1004 for T271058
  • curprev 15:58, 17 February 2021imported>Stashbot 86,576 bytes +153 arturo: deploying https://gerrit.wikimedia.org/r/c/operations/puppet/+/664845 to cloudnet servers (T268335)
  • curprev 16:25, 15 February 2021imported>Stashbot 86,423 bytes +395 arturo: [codfw1dev] rebooting all cloudgw200x-dev / cloudnet200x-dev servers (T272963)
  • curprev 12:01, 11 February 2021imported>Stashbot 86,028 bytes +692 arturo: [codfw1dev] drop instance `tools-codfw1dev-bastion-1` in `tools-codfw1dev` (was buster, cannot use it yet)
  • curprev 15:23, 9 February 2021imported>Stashbot 85,336 bytes +224 arturo: icinga-downtime for 2h everything *labs *cloud for openstack upgrades
  • curprev 18:50, 8 February 2021imported>Stashbot 85,112 bytes +253 bstorm: enabled puppet on cloudvirt1023 for now T274144
  • curprev 10:59, 5 February 2021imported>Stashbot 84,859 bytes +334 arturo: icinga-downtime labstore1004 tools share space check for 1 week (T272247)
  • curprev 10:12, 4 February 2021imported>Stashbot 84,525 bytes +147 dcaro: Increasing the memory limit of osds in eqiad from 8589934592(8G) to 12884901888(12G) (T273851)
  • curprev 09:59, 3 February 2021imported>Stashbot 84,378 bytes +203 dcaro: Doing a full vm backup on cloudvirt1024 with the new script (T260692)
  • curprev 17:14, 2 February 2021imported>Stashbot 84,175 bytes +346 dcaro: Changed osd memory limit from 4G to 8G (T273649)
  • curprev 15:36, 29 January 2021imported>Stashbot 83,829 bytes +155 andrewbogott: disabling puppet and some services on eqiad1 cloudcontrol nodes; replacing nova-placement-api with placement-api
  • curprev 19:44, 28 January 2021imported>Stashbot 83,674 bytes +158 andrewbogott: shutting down cloudcontrol2001-dev because it's in a partially upgraded state; will revive when it's time for Train
  • curprev 00:50, 27 January 2021imported>Stashbot 83,516 bytes +101 bstorm: icinga-downtime cloudnet1004 for a week T271058
  • curprev 16:44, 22 January 2021imported>Stashbot 83,415 bytes +191 andrewbogott: upgrading designate on cloudvirt1003/1004 to OpenStack 'train'
  • curprev 11:35, 21 January 2021imported>Stashbot 83,224 bytes +338 arturo: merging core router firewall changes https://gerrit.wikimedia.org/r/c/operations/homer/public/+/657439 (T209082)
  • curprev 10:49, 20 January 2021imported>Stashbot 82,886 bytes +1,118 arturo: merging core router firewall change https://gerrit.wikimedia.org/r/c/operations/homer/public/+/657302 (T209082)
  • curprev 10:17, 19 January 2021imported>Stashbot 81,768 bytes +103 arturo: icinga-downtime cloudnet1004 for 1 week (T271058)
  • curprev 16:00, 18 January 2021imported>Stashbot 81,665 bytes +865 dcaro: Codfw1 ceph cluster upgraded, will wait until tomorrow to see if there's any instability, but everything looks fine (T272303)
  • curprev 16:53, 17 January 2021imported>Stashbot 80,800 bytes +126 arturo: icinga downtime labstore1004 /srv/tools space check for 3 days (T272247)
  • curprev 13:41, 15 January 2021imported>Stashbot 80,674 bytes +405 arturo: icinga downtime labstore1004 maintain-dbuser alert until 2021-01-19 (T272125)
  • curprev 17:03, 13 January 2021imported>Stashbot 80,269 bytes +927 arturo: move cloudvirt1013 cloudvirt1032 cloudvirt1037 to the 'toobusy' host aggregate to prevent further CPU oversubscribing
  • curprev 10:33, 12 January 2021imported>Stashbot 79,342 bytes +175 arturo: reboot cloudnet1004
  • curprev 10:22, 11 January 2021imported>Stashbot 79,167 bytes +573 arturo: doubling size of conntrack table in cloudnet servers https://gerrit.wikimedia.org/r/c/operations/puppet/+/655407 (T271058)
  • curprev 16:02, 10 January 2021imported>Stashbot 78,594 bytes +198 andrewbogott: restarting rabbitmq-server on all eqiad1 cloudcontrols
  • curprev 11:25, 8 January 2021imported>Stashbot 78,396 bytes +1,559 arturo: rebooting both cloudnet2002-dev/cloudnet2003-dev to make sure interfaces are set up correctly (T271517)
  • curprev 15:19, 7 January 2021imported>Stashbot 76,837 bytes +447 dcaro: Finished speed tests on cloudcephosd2001-dev, reprovisioning the osd.0 sdc (T271417)
  • curprev 10:40, 5 January 2021imported>Stashbot 76,390 bytes +134 dcaro: removing dumps-[1..*] backups from cloudvirt1024 as they are not needed (T271094)
  • curprev 07:06, 3 January 2021imported>Stashbot 76,256 bytes +117 dcaro: Got a network hiccup on cloudnet1004, keeping track here T271058
  • curprev 12:32, 28 December 2020imported>Stashbot 76,139 bytes +567 arturo: stop doing backups for the dumps project https://gerrit.wikimedia.org/r/c/operations/puppet/+/652182 (T260692)
  • curprev 15:38, 23 December 2020imported>Stashbot 75,572 bytes +333 andrewbogott: restarting rabbitmq on cloudcontrol1004; suspected leaks
  • curprev 15:30, 22 December 2020imported>Stashbot 75,239 bytes +231 dcaro: cleaning up 6778 dangling snapshots for glance images in eqiad (T270478)
  • curprev 16:18, 19 December 2020imported>Stashbot 75,008 bytes +84 dcaro: gzipped a bunch of logs on cloudvirt1004 due to / being out of space
  • curprev 00:14, 19 December 2020imported>Stashbot 74,924 bytes +2,096 bstorm: truncated /var/log/debug.1 on cloudcontrol1003 which appears to be the exact same content as the user.log files anyway
  • curprev 22:17, 17 December 2020imported>Stashbot 72,828 bytes +570 andrewbogott: correction to above, set the pg and pgp to 1024 for eqiad1-glance-images
  • curprev 09:31, 16 December 2020imported>Stashbot 72,258 bytes +121 dcaro: removing invalid backups from cloudvirt1024 (196 in total) (T269419)
  • curprev 17:42, 14 December 2020imported>Stashbot 72,137 bytes +359 dcaro: The removal freed ~12GB (still 100% usage :S) (T269419)
  • curprev 09:11, 13 December 2020imported>Stashbot 71,778 bytes +108 _dcaro: running backup purge script on cloudvirt1024 (T269419)
  • curprev 23:36, 10 December 2020imported>Stashbot 71,670 bytes +334 bstorm: cleaned up the logs for haproxy on cloudcontrol1003 by deleting all the gzipped ones and truncating the .1 file
  • curprev 18:01, 8 December 2020imported>Stashbot 71,336 bytes +387 dcaro: Host cloudvirt1030 up and running (T216195)
  • curprev 18:33, 7 December 2020imported>Stashbot 70,949 bytes +249 andrewbogott: putting cloudvirt1023 back into service T269467
  • curprev 00:35, 5 December 2020imported>Stashbot 70,700 bytes +1,279 andrewbogott: moving cloudvirt1023 back into maintenance because T269467 continues to puzzle
  • curprev 23:21, 3 December 2020imported>Stashbot 69,421 bytes +765 andrewbogott: removing all osds on cloudcephosd1004 for rebuild, T268746
  • curprev 20:04, 2 December 2020imported>Stashbot 68,656 bytes +909 andrewbogott: removing all osds on cloudcephosd1010 for rebuild, T268746
  • curprev 20:06, 1 December 2020imported>Stashbot 67,747 bytes +322 andrewbogott: removing all osds on cloudcephosd1014 for rebuild, T268746
  • curprev 18:12, 30 November 2020imported>Stashbot 67,425 bytes +131 andrewbogott: removing all osds from cloudcephosd1015 in order to investigate T268746
  • curprev 17:18, 29 November 2020imported>Stashbot 67,294 bytes +106 andrewbogott: cleaning up some logfiles in tools-sgecron-01 — drive is full
  • curprev 22:58, 26 November 2020imported>Stashbot 67,188 bytes +257 andrewbogott: deleting /var/log/haproxy logs older than 7 days in cloudcontrol100x. We need log rotation here it seems.
  • curprev 19:35, 25 November 2020imported>Stashbot 66,931 bytes +1,218 bstorm: repairing ceph pg `instructing pg 6.91 on osd.117 to repair`
  • curprev 17:40, 22 November 2020imported>Stashbot 65,713 bytes +161 andrewbogott: apt-get upgrade on cloudservices1003/1004
  • curprev 12:44, 20 November 2020imported>Stashbot 65,552 bytes +237 arturo: [codfw1dev] install conntrackd in cloudnet2003-dev/cloudnet2002-dev to research l3 agent HA reliability
  • curprev 19:21, 17 November 2020imported>Stashbot 65,315 bytes +103 andrewbogott: draining cloudvirt1012 to experiment with libvirt/cpu things
  • curprev 11:21, 15 November 2020imported>Stashbot 65,212 bytes +103 arturo: icinga downtime cloudbackup2002 for 48h (T267865)
  • curprev 16:38, 10 November 2020imported>Stashbot 65,109 bytes +243 arturo: icinga downtime toolschecker for 2h because of toolsdb maintenance (T266587)
  • curprev 12:42, 9 November 2020imported>Stashbot 64,866 bytes +944 arturo: restarted neutron l3 agent in cloudnet1003 bc it still had the old default route (T265288)
  • curprev 13:36, 2 November 2020imported>Stashbot 63,922 bytes +127 arturo: (typo: dcaro)
  • curprev 16:57, 29 October 2020imported>Stashbot 63,795 bytes +167 bstorm: silenced deployment-prep project alerts for 60 days since the downtime expired
  • curprev 16:20, 25 October 2020imported>Stashbot 63,628 bytes +230 andrewbogott: adding cloudvirt1038 to the 'ceph' aggregate and removing from the 'spare' aggregate. We need this space while waiting on network upgrades for empty cloudvirts (T216195)
  • curprev 11:30, 23 October 2020imported>Stashbot 63,398 bytes +467 arturo: [codfw1dev] openstack --os-project-id cloudinfra-codfw1dev recordset create --type PTR --record nat.cloudgw.codfw1dev.wikimediacloud.org. --description "created by hand" 0-29.57.15.185.in-addr.arpa. 1.0-29.57.15.185.in-addr.arpa. (T261724)
  • curprev 10:46, 22 October 2020imported>Stashbot 62,931 bytes +285 arturo: [codfw1dev] rebooting cloudinfra-internal-puppetmaster-01.cloudinfra-codfw1dev.codfw1dev.wikimedia.cloud to try fixing some DNS weirdness
  • curprev 14:36, 21 October 2020imported>Stashbot 62,646 bytes +343 andrewbogott: running apt-get update && apt-get install -y facter on all cloud-vps instances
  • curprev 15:47, 20 October 2020imported>Stashbot 62,303 bytes +315 arturo: changing DNS recursor ACLs (https://gerrit.wikimedia.org/r/c/operations/puppet/+/635314) this can be reverted any time if it causes problems (T261724)
  • curprev 01:41, 19 October 2020imported>Stashbot 61,988 bytes +280 andrewbogott: deleting all Precise base images
  • curprev 09:29, 16 October 2020imported>Stashbot 61,708 bytes +678 arturo: [codfw1dev] still some DNS weirdness, investigating
  • curprev 15:17, 15 October 2020imported>Stashbot 61,030 bytes +258 arturo: [codfw1dev] try cleaning up anything related to address scopes in the neutron database (T261724)
  • curprev 17:54, 13 October 2020imported>Stashbot 60,772 bytes +373 andrewbogott: rebuilding cloudvirt1021 for backy support
  • curprev 10:15, 9 October 2020imported>Stashbot 60,399 bytes +1,133 arturo: [codfw1dev] root@cloudcontrol2001-dev:~# openstack router set --disable-snat cloudinstances2b-gw --external-gateway wan-transport-codfw (T261724)
  • curprev 16:17, 8 October 2020imported>Stashbot 59,266 bytes +546 arturo: [codfw1dev] `root@cloudcontrol2001-dev:~# openstack subnet create --network wan-transport-codfw --gateway 185.15.57.8 --no-dhcp --subnet-range 185.15.57.8/31 cloud-gw-transport-codfw` (with a hack -- see task) (T263622)
  • curprev 21:30, 6 October 2020imported>Stashbot 58,720 bytes +326 andrewbogott: moved cloudvirt1013 out of the 'ceph' aggregate and into the 'maintenance' aggregate for T243414
  • curprev 17:40, 5 October 2020imported>Stashbot 58,394 bytes +129 bd808: `service uwsgi-labspuppetbackend restart` on cloud-puppetmaster-03 (T264649)
  • curprev 11:05, 2 October 2020imported>Stashbot 58,265 bytes +234 arturo: [codfw1dev] restarting rabbitmq-server in all 3 control nodes, the l3 agent was misbehaving
  • curprev 16:06, 1 October 2020imported>Stashbot 58,031 bytes +166 arturo: rebooting cloudvirt1024 to validate changes to /etc/network/interfaces file
  • curprev 16:47, 30 September 2020imported>Stashbot 57,865 bytes +1,958 andrewbogott: rebooting cloudvirt1032, 1033, 1034 for T262979
  • curprev 14:55, 28 September 2020imported>Stashbot 55,907 bytes +256 arturo: [jbond42] upgraded facter to v3 across the VM fleet
  • curprev 15:47, 24 September 2020imported>Stashbot 55,651 bytes +294 arturo: stopping/restarting rabbitmq-server in all cloudcontrol servers
  • curprev 10:16, 18 September 2020imported>Stashbot 55,357 bytes +593 arturo: cloudvirt1039 libvirtd service issues were fixed with a reboot
  • curprev 20:32, 15 September 2020imported>Stashbot 54,764 bytes +181 andrewbogott: rebooting cloudvirt1038 to see if it resolves T262979
  • curprev 14:21, 14 September 2020imported>Stashbot 54,583 bytes +421 andrewbogott: draining cloudvirt1001, migrating all VMs with wmcs-ceph-migrate
  • curprev 18:13, 9 September 2020imported>Stashbot 54,162 bytes +433 andrewbogott: restarting ceph-mon@cloudcephmon1003 in hopes that the slow ops reported are phantoms
  • curprev 00:05, 9 September 2020imported>Stashbot 53,729 bytes +517 bd808: Running wmcs-novastats-dnsleaks (T262359)
  • curprev 09:32, 3 September 2020imported>Stashbot 53,212 bytes +106 arturo: icinga downtime cloud* servers for 30 mins (T261866)
  • curprev 08:46, 2 September 2020imported>Stashbot 53,106 bytes +131 arturo: [codfw1dev] reimaging spare server labtestvirt2003 as debian buster (T261724)
  • curprev 18:18, 1 September 2020imported>Stashbot 52,975 bytes +640 andrewbogott: adding drives on cloudcephosd100[3-5] to ceph osd pool
  • curprev 23:26, 31 August 2020imported>Stashbot 52,335 bytes +303 bd808: Removed stale lockfile at cloud-puppetmaster-03.cloudinfra.eqiad.wmflabs:/var/lib/puppet/volatile/GeoIP/.geoipupdate.lock
  • curprev 20:12, 28 August 2020imported>Stashbot 52,032 bytes +100 bd808: Running `wmcs-novastats-dnsleaks --delete` from cloudcontrol1003
  • curprev 17:12, 26 August 2020imported>Stashbot 51,932 bytes +198 bstorm: Running 'ionice -c 3 nice -19 find /srv/tools -type f -size +100M -printf "%k KB %p\n" > tools_large_files_20200826.txt' on labstore1004 T261336
  • curprev 21:34, 21 August 2020imported>Stashbot 51,734 bytes +99 andrewbogott: restarting nova-compute on cloudvirt1033; it seems stuck