You are browsing a read-only backup copy of Wikitech. The live site can be found at wikitech.wikimedia.org

Revision history of "Nova Resource:Tools/SAL"

Legend: (cur) = difference with latest revision, (prev) = difference with preceding revision, m = minor edit.

  • curprev 21:08, 15 March 2019imported>Stashbot 70,800 bytes +373 bstorm_: cleared error state on several queues T217280
  • curprev 23:52, 14 March 2019imported>Stashbot 70,427 bytes +2,052 bd808: Disabled job queues and rescheduled continuous jobs away from tools-exec-14{21,22,23,24,25,26,27,28,29,30,31,32} (T217152)
  • curprev 23:30, 13 March 2019imported>Stashbot 68,375 bytes +775 bd808: Rebuilding stretch Kubernetes images
  • curprev 00:22, 13 March 2019imported>Stashbot 67,600 bytes +113 bd808: Raise web-memlimit for isbn tool to 6G for tomcat8 (T217406)
  • curprev 15:53, 11 March 2019imported>Stashbot 67,487 bytes +344 bd808: Manually started `service gridengine-master` on tools-sgegrid-master after reboot (T218038)
  • curprev 00:53, 11 March 2019imported>Stashbot 67,143 bytes +562 bd808: Re-enabled 13 queue instances that had been disabled by LDAP failures during job initialization (T217280)
  • curprev 00:30, 8 March 2019imported>Stashbot 66,581 bytes +418 bd808: DNS record created for trusty-dev.tools.wmflabs.org (Trusty secondary bastion)
  • curprev 00:49, 7 March 2019imported>Stashbot 66,163 bytes +380 zhuyifei1999_: clushed misctools 1.37 upgrade on @bastion,@cron,@bastion-stretch T217406
  • curprev 19:07, 4 March 2019imported>Stashbot 65,783 bytes +276 bstorm_: umounted /mnt/nfs/dumps-labstore1006.wikimedia.org for T217473
  • curprev 20:54, 3 March 2019imported>Stashbot 65,507 bytes +79 andrewbogott: cleaning out /tmp on tools-exec-1412
  • curprev 19:36, 28 February 2019imported>Stashbot 65,428 bytes +234 zhuyifei1999_: built with debuild instead T217297
  • curprev 20:41, 27 February 2019imported>Stashbot 65,194 bytes +734 andrewbogott: restarting nginx on tools-checker-01
  • curprev 20:51, 26 February 2019imported>Stashbot 64,460 bytes +223 gtirloni: reboot tools-package-builder-02 (unresponsive)
  • curprev 23:20, 25 February 2019imported>Stashbot 64,237 bytes +1,248 bstorm_: Depooled tools-sgeexec-0914 and tools-sgeexec-0915 for T217066
  • curprev 16:29, 22 February 2019imported>Stashbot 62,989 bytes +213 gtirloni: upgraded and rebooted tools-puppetmaster-01 (new kernel)
  • curprev 09:59, 21 February 2019imported>Stashbot 62,776 bytes +61 gtirloni: upgraded all packages in all stretch nodes
  • curprev 00:12, 21 February 2019imported>Stashbot 62,715 bytes +1,098 zhuyifei1999_: forcing puppet run on tools-k8s-master-01
  • curprev 01:49, 19 February 2019imported>Stashbot 61,617 bytes +118 bd808: Revoked Toolforge project membership for user DannyS712 (T215092)
  • curprev 20:45, 18 February 2019imported>Stashbot 61,499 bytes +370 gtirloni: upgraded and rebooted tools-sgebastion-07 (login-stretch)
  • curprev 22:23, 17 February 2019imported>Stashbot 61,129 bytes +321 zhuyifei1999_: uncordon tools-worker-1010.tools.eqiad.wmflabs
  • curprev 05:00, 16 February 2019imported>Stashbot 60,808 bytes +1,745 zhuyifei1999_: fixed by restarting flannel. another puppet run simply started kubelet
  • curprev 21:57, 14 February 2019imported>Stashbot 59,063 bytes +1,078 bd808: Deleted old tools-proxy-02 instance
  • curprev 19:16, 13 February 2019imported>Stashbot 57,985 bytes +680 andrewbogott: deleting tools-sgewebgrid-generic-0901, tools-sgewebgrid-lighttpd-0901, tools-sgebastion-06
  • curprev 01:24, 12 February 2019imported>Stashbot 57,305 bytes +153 bd808: Stopped maintain-kubeusers, edited /etc/kubernetes/tokenauth, restarted maintain-kubeusers (T215704)
  • curprev 22:57, 11 February 2019imported>Stashbot 57,152 bytes +1,621 bd808: Shutoff tools-webgrid-lighttpd-14{01,13,24,26,27,28} via Horizon UI
  • curprev 19:17, 8 February 2019imported>Stashbot 55,531 bytes +434 hauskatze: Stopped webservice of `tools.sulinfo` which redirects to `tools.quentinv57-tools` which is also unavailable
  • curprev 01:07, 8 February 2019imported>Stashbot 55,097 bytes +351 bd808: Creating tools-sgebastion-07
  • curprev 13:20, 4 February 2019imported>Stashbot 54,746 bytes +395 arturo: T215154 another reboot for tools-sgebastion-06
  • curprev 23:54, 30 January 2019imported>Stashbot 54,351 bytes +70 gtirloni: cleared apt cache on sge* hosts
  • curprev 20:50, 25 January 2019imported>Stashbot 54,281 bytes +336 bd808: Deployed new tcl/web Kubernetes image based on Debian Stretch (T214668)
  • curprev 11:09, 24 January 2019imported>Stashbot 53,945 bytes +341 arturo: T213421 delete tools-services-01/02
  • curprev 22:18, 23 January 2019imported>Stashbot 53,604 bytes +679 bd808: Building new tools-sgewebgrid-lighttpd-0904 instance using Stretch base image (T214519)
  • curprev 20:21, 22 January 2019imported>Stashbot 52,925 bytes +326 gtirloni: published new docker images (all)
  • curprev 21:22, 18 January 2019imported>Stashbot 52,599 bytes +102 bd808: Forcing php-igbinary update via clush for T213666
  • curprev 23:37, 17 January 2019imported>Stashbot 52,497 bytes +574 bd808: Shutdown tools-package-builder-01. Use tools-package-builder-02 instead!
  • curprev 17:29, 16 January 2019imported>Stashbot 51,923 bytes +476 andrewbogott: depooling and moving tools-sgeexec-0904 tools-sgeexec-0906 tools-sgewebgrid-lighttpd-0904
  • curprev 21:02, 15 January 2019imported>Stashbot 51,447 bytes −178,393 bstorm_: restarting webservicemonitor on tools-services-02 -- acting funny
  • curprev 11:55, 11 January 2019imported>Stashbot 229,840 bytes +296 arturo: T213418 shutdown tools-docker-builder-05, will give a grace period before deleting the VM
  • curprev 22:45, 10 January 2019imported>Stashbot 229,544 bytes +292 bstorm_: T213357 - Added 24 lighttpd nodes to the new grid
  • curprev 00:12, 10 January 2019imported>Stashbot 229,252 bytes +432 bstorm_: T213353 Added 36 exec nodes to the new grid
  • curprev 17:21, 7 January 2019imported>Stashbot 228,820 bytes +325 bstorm_: T67777 - set the max_u_jobs global grid config setting to 50 in the new grid
  • curprev 22:06, 6 January 2019imported>Stashbot 228,495 bytes +103 bd808: Added floating ip to tools-sgebastion-06 (T212360)
  • curprev 23:54, 5 January 2019imported>Stashbot 228,392 bytes +173 bd808: Manually installed php-mbstring on tools-sgebastion-06. Gerrit patch submitted to install it on the rest of the Son of Grid Engine nodes.
  • curprev 21:37, 4 January 2019imported>Stashbot 228,219 bytes +114 bd808: Truncated /data/project/.system/accounting after archiving ~30 days of history
  • curprev 21:03, 3 January 2019imported>Stashbot 228,105 bytes +214 bd808: Enabled Puppet on tools-proxy-02
  • curprev 16:29, 21 December 2018imported>Stashbot 227,891 bytes +126 andrewbogott: migrating tools-exec-1416 to labvirt1004
  • curprev 00:35, 21 December 2018imported>Stashbot 227,765 bytes +615 bd808: Installed tools-manifest 0.14 for T212390
  • curprev 22:16, 17 December 2018imported>Stashbot 227,150 bytes +478 bstorm_: Adding a bunch of hiera values and prefixes for the new grid - T212153
  • curprev 13:19, 11 December 2018imported>Stashbot 226,672 bytes +84 gtirloni: Removed BigBrother (T208357)
  • curprev 12:17, 5 December 2018imported>Stashbot 226,588 bytes +129 gtirloni: removed node tools-worker-1029.tools.eqiad.wmflabs from cluster (T196973)
  • curprev 22:47, 4 December 2018imported>Stashbot 226,459 bytes +262 bstorm_: gtirloni added back main floating IP for tools-k8s-master-01 and removed unnecessary ones to stop k8s outage T164123
  • curprev 02:44, 1 December 2018imported>Stashbot 226,197 bytes +88 gtirloni: deleted instance tools-exec-gift-trusty-01 (T194615)
  • curprev 00:10, 1 December 2018imported>Stashbot 226,109 bytes +402 andrewbogott: moving tools-worker-1020 and tools-worker-1022 to different labvirts
  • curprev 17:49, 27 November 2018imported>Stashbot 225,707 bytes +121 bstorm_: restarted maintain-kubeusers just in case it had any issues reconnecting to toolsdb
  • curprev 17:39, 26 November 2018imported>Stashbot 225,586 bytes +348 gtirloni: updated tools-manifest package on tools-services-01/02 to version 0.12 (10->60 seconds sleep time) (T210190)
  • curprev 23:05, 20 November 2018imported>Stashbot 225,238 bytes +451 gtirloni: Published stretch-tools and stretch-toolsbeta aptly repositories individually on tools-services-01
  • curprev 21:16, 16 November 2018imported>Stashbot 224,787 bytes +435 bd808: Ran grid engine orphan process kill script from T153281. Only 3 orphan php-cgi processes belonging to iluvatarbot found.
  • curprev 17:29, 14 November 2018imported>Stashbot 224,352 bytes +214 andrewbogott: moving tools-worker-1027 to labvirt1008
  • curprev 17:40, 13 November 2018imported>Stashbot 224,138 bytes +717 arturo: remove misctools 1.31 and jobutils 1.30 from the stretch-tools repo (T207970)
  • curprev 18:12, 8 November 2018imported>Stashbot 223,421 bytes +861 gtirloni: cleaned up old tmp files on tools-bastion-02
  • curprev 10:37, 7 November 2018imported>Stashbot 222,560 bytes +112 gtirloni: removed invalid apt.conf.d file from all hosts (T110055)
  • curprev 18:11, 2 November 2018imported>Stashbot 222,448 bytes +174 arturo: T206223 some disturbances due to the certificate renewal
  • curprev 18:02, 31 October 2018imported>Stashbot 222,274 bytes +163 gtirloni: truncated big .err and error.log files
  • curprev 17:00, 29 October 2018imported>Stashbot 222,111 bytes +108 bd808: Ran grid engine orphan process kill script from T153281
  • curprev 10:34, 26 October 2018imported>Stashbot 222,003 bytes +236 arturo: T207970 added misctools 1.31 and jobutils 1.30 to stretch-tools aptly repo
  • curprev 14:17, 19 October 2018imported>Stashbot 221,767 bytes +65 andrewbogott: moving tools-clushmaster-01 to labvirt1004
  • curprev 00:29, 19 October 2018imported>Stashbot 221,702 bytes +321 andrewbogott: migrating tools-exec-1411 and tools-exec-1410 off of cloudvirt1017
  • curprev 15:13, 16 October 2018imported>Stashbot 221,381 bytes +205 bd808: (repost for gtirloni) T186571 removed legofan4000 user from project-tools group (leftover from T165624 legofan4000->macfan4000 rename)
  • curprev 21:57, 7 October 2018imported>Stashbot 221,176 bytes +380 zhuyifei1999_: restarted maintain-kubeusers on tools-k8s-master-01 T194859
  • curprev 12:35, 21 September 2018imported>Stashbot 220,796 bytes +431 arturo: cleanup stalled apt preference files (pinning) in tools-clushmaster-01
  • curprev 09:13, 17 September 2018imported>Stashbot 220,365 bytes +128 arturo: T204481 aborrero@tools-mail:~$ sudo exiqgrep -i | xargs sudo exim -Mrm
  • curprev 11:22, 14 September 2018imported>Stashbot 220,237 bytes +246 arturo: T204267 stop the corhist tool (k8s) because it is hammering the wikidata API
  • curprev 10:35, 8 September 2018imported>Stashbot 219,991 bytes +118 gtirloni: restarted cron and truncated /var/log/exim4/paniclog (T196137)
  • curprev 05:07, 7 September 2018imported>Stashbot 219,873 bytes +88 legoktm: uploaded/imported toollabs-webservice_0.42_all.deb
  • curprev 23:40, 27 August 2018imported>Stashbot 219,785 bytes +320 bd808: `# exec-manage repool tools-webgrid-generic-1402.eqiad.wmflabs` T202932
  • curprev 13:02, 22 August 2018imported>Stashbot 219,465 bytes +236 arturo: I used this command: `sudo exim -bp | sudo exiqgrep -i | xargs sudo exim -Mrm`
  • curprev 09:12, 19 August 2018imported>Stashbot 219,229 bytes +140 legoktm: rebuilding python/base k8s images for https://gerrit.wikimedia.org/r/453665 (T202218)
  • curprev 21:02, 14 August 2018imported>Stashbot 219,089 bytes +182 legoktm: rebuilt php7.2 docker images for https://gerrit.wikimedia.org/r/452755
  • curprev 23:31, 13 August 2018imported>Stashbot 218,907 bytes +234 legoktm: rebuilding docker images for webservice upgrade
  • curprev 10:40, 9 August 2018imported>Stashbot 218,673 bytes +293 arturo: T201602 upgrade packages from jessie-backports (excluding python-designateclient)
  • curprev 10:01, 8 August 2018imported>Stashbot 218,380 bytes +192 zhuyifei1999_: building & publishing toollabs-webservice 0.40 deb, and all Docker images T156626 T148872 T158244
  • curprev 12:33, 6 August 2018imported>Stashbot 218,188 bytes +98 arturo: T197176 installing texlive-full in toolforge
  • curprev 14:31, 1 August 2018imported>Stashbot 218,090 bytes +145 andrewbogott: temporarily depooling tools-exec-1409, 1410, 1414, 1419, 1427, 1428 to try to give labvirt1009 a break
  • curprev 20:33, 30 July 2018imported>Stashbot 217,945 bytes +186 bd808: Started rebuilding all Kubernetes Docker images to pick up latest apt updates
  • curprev 04:52, 27 July 2018imported>Stashbot 217,759 bytes +108 zhuyifei1999_: rebuilding python/base docker container T190274
  • curprev 19:02, 25 July 2018imported>Stashbot 217,651 bytes +175 chasemp: tools-worker-1004 reboot
  • curprev 13:24, 18 July 2018imported>Stashbot 217,476 bytes +506 arturo: upgrading packages from `stretch-wikimedia` T199905
  • curprev 18:15, 30 June 2018imported>Stashbot 216,970 bytes +486 chicocvenancio: pushed new config to PAWS to fix dumps nfs mountpoint
  • curprev 17:41, 29 June 2018imported>Stashbot 216,484 bytes +431 bd808: Rescheduling continuous jobs away from tools-exec-1408 where load is high
  • curprev 19:50, 28 June 2018imported>Stashbot 216,053 bytes +640 chasemp: tools-clushmaster-01:~$ clush -w @all 'sudo umount -fl /mnt/nfs/dumps-labstore1006.wikimedia.org'
  • curprev 13:18, 21 June 2018imported>Stashbot 215,413 bytes +109 chasemp: tools-bastion-03:~# bash -x /data/project/paws/paws-userhomes-hack.bash
  • curprev 15:09, 20 June 2018imported>Stashbot 215,304 bytes +138 bd808: Killed orphan processes on webgrid nodes (T182070); most owned by jembot and croptool
  • curprev 14:20, 14 June 2018imported>Stashbot 215,166 bytes +102 chasemp: timeout 180s bash -x /data/project/paws/paws-userhomes-hack.bash
  • curprev 10:11, 11 June 2018imported>Stashbot 215,064 bytes +279 arturo: T196137 `aborrero@tools-clushmaster-01:~$ clush -w@all 'sudo wc -l /var/log/exim4/paniclog 2>/dev/null | grep -v ^0 && sudo rm -rf /var/log/exim4/paniclog && sudo service prometheus-node-exporter restart || true'`
  • curprev 07:46, 8 June 2018imported>Stashbot 214,785 bytes +172 arturo: T196137 more rootspam today, restarting again `prometheus-node-exporter` and force rotating exim4 paniclog in 12 nodes
  • curprev 11:01, 7 June 2018imported>Stashbot 214,613 bytes +218 arturo: T196137 force rotate all exim paniclog files to avoid rootspam `aborrero@tools-clushmaster-01:~$ clush -w@all 'sudo logrotate /etc/logrotate.d/exim4-paniclog -f -v'`
  • curprev 22:00, 6 June 2018imported>Stashbot 214,395 bytes +620 bd808: Scripting a restart of webservice for tools that are still in CrashLoopBackOff state after 2nd attempt (T196589)
  • curprev 18:02, 5 June 2018imported>Stashbot 213,775 bytes +369 bd808: Forced puppet run on tools-bastion-03 to re-enable logins by dubenben (T196486)
  • curprev 10:28, 4 June 2018imported>Stashbot 213,406 bytes +102 arturo: T196006 installing sqlite3 package in exec nodes
  • curprev 10:19, 3 June 2018imported>Stashbot 213,304 bytes +211 zhuyifei1999_: Grid is full. qdel'ed all jobs belonging to tools.dibot except lighttpd, and tools.mbh that has a job name starting 'comm_delin', 'delfilexcl' T195834
  • curprev 11:31, 31 May 2018imported>Stashbot 213,093 bytes +237 zhuyifei1999_: building & pushing python/web docker image T174769
  • curprev 10:52, 30 May 2018imported>Stashbot 212,856 bytes +425 zhuyifei1999_: undid both changes to tools-bastion-05
  • curprev 12:09, 28 May 2018imported>Stashbot 212,431 bytes +250 arturo: T194665 adding mono packages to apt.wikimedia.org for jessie-wikimedia and stretch-wikimedia
  • curprev 05:31, 25 May 2018imported>Stashbot 212,181 bytes +181 zhuyifei1999_: Edit /data/project/.system/gridengine/default/common/sge_request, h_vmem 256M -> 512M, release precise -> trusty T195558
  • curprev 11:53, 22 May 2018imported>Stashbot 212,000 bytes +157 arturo: running puppet to deploy https://gerrit.wikimedia.org/r/#/c/433996/ for T194665 (mono framework update)
  • curprev 16:36, 18 May 2018imported>Stashbot 211,843 bytes +77 bd808: Restarted bigbrother on tools-services-02
  • curprev 21:17, 16 May 2018imported>Stashbot 211,766 bytes +104 zhuyifei1999_: maintain-kubeusers on stuck in infinite sleeps of 10 seconds
  • curprev 04:28, 15 May 2018imported>Stashbot 211,662 bytes +371 andrewbogott: depooling, rebooting, re-pooling tools-exec-1414. It's hanging for unknown reasons.
  • curprev 10:09, 12 May 2018imported>Stashbot 211,291 bytes +129 Hauskatze: tools.quentinv57-tools@tools-bastion-02:~$ webservice stop | T194343
  • curprev 14:34, 11 May 2018imported>Stashbot 211,162 bytes +406 andrewbogott: repooling labvirt1001 tools instances
  • curprev 18:55, 10 May 2018imported>Stashbot 210,756 bytes +114 andrewbogott: depooling, rebooting, repooling tools-exec-1401 to test a kernel update
  • curprev 21:11, 9 May 2018imported>Stashbot 210,642 bytes +70 Reedy: Added Tim Starling as member/admin
  • curprev 21:02, 7 May 2018imported>Stashbot 210,572 bytes +185 zhuyifei1999_: re-building all docker images T190893
  • curprev 00:25, 7 May 2018imported>Stashbot 210,387 bytes +160 zhuyifei1999_: `renice -n 15 -p 28865` (`tar cvzf` of `tools.giftbot`) on tools-bastion-02, been hogging the NFS IO for a few hours
  • curprev 23:37, 5 May 2018imported>Stashbot 210,227 bytes +126 zhuyifei1999_: regenerate k8s creds for tools.zhuyifei1999-test because I messed up while testing
  • curprev 14:48, 3 May 2018imported>Stashbot 210,101 bytes +146 arturo: uploaded a new ruby docker image to the registry with the libmysqlclient-dev package T192566
  • curprev 14:05, 1 May 2018imported>Stashbot 209,955 bytes +114 andrewbogott: moving tools-webgrid-lighttpd-1406 to labvirt1016 (routine rebalancing)
  • curprev 18:26, 27 April 2018imported>Stashbot 209,841 bytes +267 zhuyifei1999_: `$ write` doesn't seem to be able to write to their tmux tty, so echoed into their pts directly: `# echo -e '\n\n[...]\n' > /dev/pts/81`
  • curprev 14:41, 23 April 2018imported>Stashbot 209,574 bytes +170 zhuyifei1999_: `chown tools.pywikibot:tools.pywikibot /shared/pywikipedia/` Prior owner: tools.russbot:project-tools T192732
  • curprev 13:07, 22 April 2018imported>Stashbot 209,404 bytes +219 bd808: Kill orphan php-cgi processes across the job grid via `clush -w @exec -w @webgrid -b 'ps axwo user:20,ppid,pid,cmd | grep -E " 1 " | grep php-cgi | xargs sudo kill -9'`
  • curprev 17:51, 15 April 2018imported>Stashbot 209,185 bytes +215 zhuyifei1999_: forced puppet runs across tools-elastic-0[1-3] T192224
  • curprev 13:25, 11 April 2018imported>Stashbot 208,970 bytes +103 chasemp: cleanup exim frozen messages in an effort to alleviate queue pressure
  • curprev 16:30, 6 April 2018imported>Stashbot 208,867 bytes +334 chicocvenancio: killed job in bastion, tools.gpy affected
  • curprev 18:46, 5 April 2018imported>Stashbot 208,533 bytes +76 chicocvenancio: killed wget that was hogging io
  • curprev 20:09, 29 March 2018imported>Stashbot 208,457 bytes +230 chicocvenancio: killed interactive processes in tools-bastion-03
  • curprev 13:06, 28 March 2018imported>Stashbot 208,227 bytes +133 zhuyifei1999_: SIGTERM PID 30633 on tools-bastion-03 (tool 3d2commons's celery). Please run this on grid
  • curprev 21:35, 26 March 2018imported>Stashbot 208,094 bytes +108 bd808: clush -w @exec -w @webgrid -b 'sudo find /tmp -type f -atime +1 -delete'
  • curprev 23:26, 23 March 2018imported>Stashbot 207,986 bytes +207 bd808: clush -w @exec -w @webgrid -b 'sudo find /tmp -type f -atime +1 -delete'
  • curprev 22:04, 22 March 2018imported>Stashbot 207,779 bytes +371 bd808: Forced puppet run on tools-proxy-02 for T130748
  • curprev 17:50, 21 March 2018imported>Stashbot 207,408 bytes +230 bd808: Cleaned up stale /project/.system/bigbrother.scoreboard.* files from labstore1004
  • curprev 08:28, 20 March 2018imported>Stashbot 207,178 bytes +163 zhuyifei1999_: unmount dumps & remount on tools-bastion-02 (can someone clush this?) T189018 T190126
  • curprev 11:02, 19 March 2018imported>Stashbot 207,015 bytes +131 arturo: reboot tools-exec-1408, to balance load. Server is unresponsive due to high load by some tools
  • curprev 22:44, 16 March 2018imported>Stashbot 206,884 bytes +277 zhuyifei1999_: suspended process 22825 (BotOrderOfChapters.exe) on tools-bastion-03. Threads continuously going to D-state & R-state. Also sent message via $ write on pts/10
  • curprev 16:56, 15 March 2018imported>Stashbot 206,607 bytes +122 zhuyifei1999_: granted elasticsearch credentials to tools.denkmalbot T185624
  • curprev 20:57, 14 March 2018imported>Stashbot 206,485 bytes +503 bd808: Upgrading elasticsearch on tools-elastic-01 (T181531)
  • curprev 20:09, 12 March 2018imported>Stashbot 205,982 bytes +976 madhuvishy: Run clush -w @all -b 'sudo umount /mnt/nfs/labstore1003-scratch && sudo mount -a' to remount scratch across all of tools
  • curprev 16:05, 8 March 2018imported>Stashbot 205,006 bytes +257 chasemp: tools-clushmaster-01:~$ clush -g all 'sudo puppet agent --test'
  • curprev 20:42, 7 March 2018imported>Stashbot 204,749 bytes +360 chicocvenancio: killed io intensive recursive zip of huge folder
  • curprev 16:15, 6 March 2018imported>Stashbot 204,389 bytes +1,691 madhuvishy: Reboot tools-docker-registry-02 T189018
  • curprev 18:56, 5 March 2018imported>Stashbot 202,698 bytes +695 zhuyifei1999_: also published jobutils_1.30_all.deb
  • curprev 13:41, 2 March 2018imported>Stashbot 202,003 bytes +115 arturo: doing some testing with puppet classes in tools-package-builder-01 via horizon
  • curprev 13:27, 1 March 2018imported>Stashbot 201,888 bytes +86 arturo: deploy https://gerrit.wikimedia.org/r/#/c/415057/
  • curprev 17:37, 27 February 2018imported>Stashbot 201,802 bytes +194 chasemp: add chico as admin to toolsbeta
  • curprev 19:18, 26 February 2018imported>Stashbot 201,608 bytes +221 chasemp: tools-clushmaster-01:~$ clush -w @all "sudo puppet agent --test"
  • curprev 19:04, 25 February 2018imported>Stashbot 201,387 bytes +117 chicocvenancio: killed jobs in tools-bastion-03, wrote notice to tools owners' terminals
  • curprev 19:11, 23 February 2018imported>Stashbot 201,270 bytes +340 arturo: enable puppet in tools-proxy-01
  • curprev 16:31, 22 February 2018imported>Stashbot 200,930 bytes +90 bstorm_: Enabled puppet on tools-static-12 as the test server
  • curprev 19:02, 21 February 2018imported>Stashbot 200,840 bytes +1,143 bstorm_: disabled puppet on tools-static-* pending change 413197
  • curprev 12:42, 20 February 2018imported>Stashbot 199,697 bytes +112 arturo: upgrading tools-flannel-etcd-01
  • curprev 19:13, 19 February 2018imported>Stashbot 199,585 bytes +437 arturo: upgrade all packages of tools-services-01
  • curprev 18:21, 16 February 2018imported>Stashbot 199,148 bytes +937 arturo: upgrading tools-proxy-01 and tools-paws-master-01, same as others
  • curprev 13:54, 15 February 2018imported>Stashbot 198,211 bytes +672 arturo: cleanup ferm (deinstall) in tools-services-01 for T187435
  • curprev 13:09, 14 February 2018imported>Stashbot 197,539 bytes +236 arturo: the reboot was OK, the server seems working and kubectl sees all the pods running in the deployment (T187315)
  • curprev 01:28, 11 February 2018imported>Stashbot 197,303 bytes +367 zhuyifei1999_: `# find /home/ -maxdepth 1 -perm -o+w \! -uid 0 -exec chmod -v o-w {} \;` Affected: only /home/tr8dr, mode 0777 -> 0775
  • curprev 10:35, 9 February 2018imported>Stashbot 196,936 bytes +989 arturo: deploy https://gerrit.wikimedia.org/r/#/c/409226/ T179343 T182562 T186846
  • curprev 18:38, 8 February 2018imported>Stashbot 195,947 bytes +1,291 arturo: aborrero@tools-k8s-master-01:~$ sudo kubectl uncordon tools-worker-1002.tools.eqiad.wmflabs
  • curprev 13:15, 6 February 2018imported>Stashbot 194,656 bytes +302 arturo: deploy https://gerrit.wikimedia.org/r/#/c/408529/ to tools-services-01
  • curprev 17:58, 5 February 2018imported>Stashbot 194,354 bytes +325 arturo: publishing/unpublishing trusty-tools repo in tools-services-01 to address T186539
  • curprev 01:04, 3 February 2018imported>Stashbot 194,029 bytes +129 chicocvenancio: killed io intensive process in bastion-03 "vltools python3 ./broken_ref_anchors.py"
  • curprev 22:54, 31 January 2018imported>Stashbot 193,900 bytes +67 chasemp: add bstorm to sudoers as root
  • curprev 20:02, 29 January 2018imported>Stashbot 193,833 bytes +168 chasemp: add zhuyifei1999_ tools root for T185577
  • curprev 22:49, 28 January 2018imported>Stashbot 193,665 bytes +165 chicocvenancio: killed compromised session generating miner processes
  • curprev 00:55, 27 January 2018imported>Stashbot 193,500 bytes +209 arturo: at tools-static-11 the kernel OOM killer stopped git gc at about 20% :-(
  • curprev 23:47, 25 January 2018imported>Stashbot 193,291 bytes +422 arturo: fix last deprecation warnings in tools-elastic-03, tools-elastic-02, tools-proxy-01 and tools-proxy-02 by replacing by hand configtimeout with http_configtimeout in /etc/puppet/puppet.conf
  • curprev 19:41, 23 January 2018imported>Stashbot 192,869 bytes +207 madhuvishy: Add bstorm to project admins
  • curprev 18:32, 22 January 2018imported>Stashbot 192,662 bytes +538 arturo: T181948 T185314 deploying jobutils and misctools v1.28 in the cluster
  • curprev 17:32, 19 January 2018imported>Stashbot 192,124 bytes +356 arturo: T185314 deploying new version of jobutils 1.27
  • curprev 16:11, 18 January 2018imported>Stashbot 191,768 bytes +877 arturo: aborrero@tools-clushmaster-01:~$ sudo aptitude purge vblade vblade-persist runit (for something similar to T182781)
  • curprev 18:48, 17 January 2018imported>Stashbot 190,891 bytes +692 arturo: aborrero@tools-clushmaster-01:~$ clush -w @all 'apt-show-versions | grep upgradeable | grep trusty-wikimedia' | tee pending-upgrades-report-trusty-wikimedia.txt
  • curprev 22:01, 16 January 2018imported>Stashbot 190,199 bytes +3,082 chasemp: qstat -explain E -xml | grep 'name' | sed 's/<name>//' | sed 's/<\/name>//' | xargs qmod -cq
  • curprev 20:33, 11 January 2018imported>Stashbot 187,117 bytes +848 andrewbogott: repooling tools-exec-1411, tools-exec-1440, tools-webgrid-lighttpd-1419, tools-webgrid-lighttpd-1420, tools-webgrid-lighttpd-1421
  • curprev 15:14, 10 January 2018imported>Stashbot 186,269 bytes +1,549 chasemp: tools-clushmaster-01:~$ clush -f 1 -w @k8s-worker "sudo puppet agent --enable && sudo puppet agent --test"
  • curprev 23:21, 9 January 2018imported>Stashbot 184,720 bytes +2,117 yuvipanda: paws new cluster master is up, re-adding nodes by executing same sequence of commands for upgrading
  • curprev 20:34, 8 January 2018imported>Stashbot 182,603 bytes +219 madhuvishy: Restart kube services and uncordon tools-worker-1001
  • curprev 00:35, 6 January 2018imported>Stashbot 182,384 bytes +1,399 madhuvishy: Run `clush -w @paws-worker -b 'sudo iptables -L FORWARD'`
  • curprev 17:24, 4 January 2018imported>Stashbot 180,985 bytes +120 andrewbogott: rebooting tools-paws-worker-1019 to verify repair of T184018
  • curprev 15:38, 3 January 2018imported>Stashbot 180,865 bytes +194 bd808: Forced Puppet run on tools-services-01
  • curprev 02:01, 31 December 2017imported>Stashbot 180,671 bytes +102 bd808: Killed some pwb.py and qacct processes running on tools-bastion-03
  • curprev 17:57, 21 December 2017imported>Stashbot 180,569 bytes +199 bd808: PAWS: deleted hub-deployment pod stuck in crashloopbackoff
  • curprev 21:27, 19 December 2017imported>Stashbot 180,370 bytes +197 chasemp: reboot tools-paws-master-01
  • curprev 12:04, 18 December 2017imported>Stashbot 180,173 bytes +621 arturo: it seems jupyterhub tries to use a database which doesn't exist: [E 2017-12-18 11:59:49.896 JupyterHub app:904] Failed to connect to db: sqlite:///jupyterhub.sqlite
  • curprev 13:55, 15 December 2017imported>Stashbot 179,552 bytes +289 arturo: same in tools-checker-02.tools.eqiad.wmflabs
  • curprev 16:58, 14 December 2017imported>Stashbot 179,263 bytes +188 arturo: running clush -w @all 'sudo puppet agent --test' from tools-clushmaster-01.eqiad.wmflabs due to https://gerrit.wikimedia.org/r/#/c/394572/ being merged
  • curprev 17:37, 13 December 2017imported>Stashbot 179,075 bytes +59 andrewbogott: upgrading puppet packages on all VMs
  • curprev 00:59, 13 December 2017imported>Stashbot 179,016 bytes +524 madhuvishy: Cordon and Drain tools-worker-1016
  • curprev 19:32, 11 December 2017imported>Stashbot 178,492 bytes +261 bd808: git gc on tools-static-11; --aggressive was killed by system (T182604)
  • curprev 15:33, 1 December 2017imported>Stashbot 178,231 bytes +218 chasemp: put the weird mess of untracked files on tools puppetmaster into stash to see what breaks as they should not be there?
  • curprev 23:23, 30 November 2017imported>Stashbot 178,013 bytes +146 bd808: Hard reboot of tools-bastion-03 via Horizon
  • curprev 20:34, 20 November 2017imported>Stashbot 177,867 bytes +94 chasemp: backup crons tools-cron-01:/var/spool/cron# cp -Rp crontabs/ /root/20112017/
  • curprev 00:52, 20 November 2017imported>Stashbot 177,773 bytes +128 andrewbogott: cherry-picking https://gerrit.wikimedia.org/r/#/c/392172/ onto the tools puppetmaster
  • curprev 21:33, 17 November 2017imported>Stashbot 177,645 bytes +232 valhallasw`cloud: also g-w'ed those files, and sent emails to all the affected users
  • curprev 17:40, 16 November 2017imported>Stashbot 177,413 bytes +291 chasemp: tools-clushmaster-01:~$ clush -w @all 'sudo puppet agent --enable && sudo puppet agent --test && sudo unattended-upgrades -d'
  • curprev 22:48, 15 November 2017imported>Stashbot 177,122 bytes +200 madhuvishy: Rebooted tools-paws-worker-1017
  • curprev 01:21, 7 November 2017imported>Stashbot 176,922 bytes +290 bd808: Removed all non-directory files from /home (via labstore1004 direct access)
  • curprev 23:48, 5 November 2017imported>Stashbot 176,632 bytes +391 bd808: Cleaned up 2 huge /tmp files left by tools.croptool (~6.5G)
  • curprev 21:19, 3 November 2017imported>Stashbot 176,241 bytes +86 bd808: Deployed misctools 1.26 (T156174)
  • curprev 16:15, 2 November 2017imported>Stashbot 176,155 bytes +71 bd808: Restarted nslcd on tools-bastion-03
  • curprev 07:11, 1 November 2017imported>Stashbot 176,084 bytes +213 madhuvishy: Clear nscd cache across all projects post labsdb dns switchover T179464
  • curprev 16:50, 31 October 2017imported>Stashbot 175,871 bytes +93 bd808: tools-bastion-03 (tools-login, login.tools) is overloaded
  • curprev 17:35, 30 October 2017imported>Stashbot 175,778 bytes +485 madhuvishy: Clear dns caches across tools hosts `sudo nscd -i hosts`
  • curprev 18:09, 24 October 2017imported>Stashbot 175,293 bytes +201 madhuvishy: Disable puppet on tools-package-builder-01 temporarily (T178920)
  • curprev 14:49, 23 October 2017imported>Stashbot 175,092 bytes +92 chasemp: wall message and scheduled reboot in 5m for bastion-03
  • curprev 21:36, 18 October 2017imported>Stashbot 175,000 bytes +275 chasemp: stop basebot -- it is going crazy and spamming email w/ failing to log to error.log. Need to figure out how to notify but it's clearly in a failure loop.
  • curprev 16:57, 12 October 2017imported>Stashbot 174,725 bytes +163 bd808: Rebuilding all Kubernetes Docker images to include toollabs-webservice 0.38
  • curprev 15:33, 6 October 2017imported>Stashbot 174,562 bytes +67 bd808: Upgrade jobutils to 1.25 (T177614)
  • curprev 00:27, 6 October 2017imported>Stashbot 174,495 bytes +142 bd808: Updated misctools to 1.24
  • curprev 15:46, 5 October 2017imported>Stashbot 174,353 bytes +169 chasemp: tools-bastion-03 has tons of local tools running long lived NFS intensive processes. I'm rebooting rather than playing whackamole.
  • curprev 19:30, 3 October 2017imported>Stashbot 174,184 bytes +103 bd808: `kubectl --namespace=prod delete pod --all` on tools-paws-master-01
  • curprev 21:46, 1 October 2017imported>Stashbot 174,081 bytes +108 madhuvishy: Cold migrating tools-clushmaster-01 from labvirt1015 to labvirt1017
  • curprev 19:49, 29 September 2017imported>Stashbot 173,973 bytes +88 andrewbogott: migrating tools-clushmaster-01 to labvirt1015
  • curprev 15:14, 25 September 2017imported>Stashbot 173,885 bytes +224 andrewbogott: rebooting tools-paws-worker-1006 since I can't access it
  • curprev 16:52, 20 September 2017imported>Stashbot 173,661 bytes +922 madhuvishy: apt-get install --only-upgrade apache2; service apache2 restart on tools-puppetmaster-01
  • curprev 01:09, 13 September 2017imported>Stashbot 172,739 bytes +397 bd808: Removed user WiktCAPT from project
  • curprev 20:33, 31 August 2017imported>Stashbot 172,342 bytes +503 madhuvishy: Updated certs and ran puppet, restarted nginx on tools-proxy-* and tools-static-* (T174611)
  • curprev 19:59, 24 August 2017imported>Stashbot 171,839 bytes +140 bd808: restarted nslcd and nscd on tools-bastion-03
  • curprev 19:20, 22 August 2017imported>Stashbot 171,699 bytes +108 andrewbogott: deleted tools-puppetmaster-02, it was replaced a month ago by -01
  • curprev 18:38, 12 August 2017imported>Stashbot 171,591 bytes +61 chasemp: restart admin webservice
  • curprev 16:09, 11 August 2017imported>Stashbot 171,530 bytes +67 chasemp: qdel -f -j 7441503
  • curprev 14:59, 10 August 2017imported>Stashbot 171,463 bytes +117 chasemp: 'become stimmberechtigung && restart' && 'become intersect-contribs && restart'
  • curprev 17:28, 9 August 2017imported>Stashbot 171,346 bytes +74 chasemp: webservices restart tools.orphantalk
  • curprev 00:47, 3 August 2017imported>Stashbot 171,272 bytes +244 bd808: tools-bastion-03 not usably responsive to interactive commands; will reboot
  • curprev 15:28, 31 July 2017imported>Stashbot 171,028 bytes +82 chasemp: remove python-keystoneclient from bastion-03
  • curprev 23:27, 27 July 2017imported>Stashbot 170,946 bytes +353 bd808: Killed python procs owned by sdesabbata on tools-login that were stealing all cpu/io
  • curprev 22:33, 26 July 2017imported>Stashbot 170,593 bytes +95 chasemp: hotpatching a hiera value on tools master to see effects
  • curprev 19:48, 20 July 2017imported>Stashbot 170,498 bytes +694 bd808: Clearing all Eqw state jobs in all queues with: qstat -u '*' | grep Eqw | awk '{print $1;}' | xargs -L1 qmod -cj
  • curprev 23:52, 19 July 2017imported>Stashbot 169,804 bytes +302 bd808: Restarted cron on tools-cron-01; toolschecker job showing user not found errors
  • curprev 19:51, 18 July 2017imported>Stashbot 169,502 bytes +112 andrewbogott: enabling puppet on tools-proxy-02. I don't know why it was disabled.
  • curprev 01:43, 17 July 2017imported>Stashbot 169,390 bytes +182 bd808: Uncordoned tools-worker-1020 after it deleted pods with local storage that were filling the entire disk
  • curprev 21:59, 13 July 2017imported>Stashbot 169,208 bytes +376 bd808: Elasticsearch cluster upgraded to 5.3.2
  • curprev 15:46, 12 July 2017imported>Stashbot 168,832 bytes +122 chasemp: push out puppet run across tools
  • curprev 18:26, 7 July 2017imported>Stashbot 168,710 bytes +88 bd808: Forced puppet runs on tools-redis-* for security fix
  • curprev 04:26, 3 July 2017imported>Stashbot 168,622 bytes +224 bd808: cdnjs on tools-static-10 is up to date
  • curprev 19:40, 1 July 2017imported>Stashbot 168,398 bytes +175 bd808: Disabled puppet on tools-k8s-master-01 to try and fix maintain-kubeusers
  • curprev 01:33, 30 June 2017imported>Stashbot 168,223 bytes +144 chasemp: time for i in `cat tools-hosts`; do ssh -i ~/.ssh/labs_root_id_rsa root@$i.eqiad.wmflabs 'hostname -f; uptime; tc-setup'; done
  • curprev 01:29, 30 June 2017imported>Stashbot 168,079 bytes +1,063 andrewbogott: rebooting tools-cron-01
  • curprev 21:32, 27 June 2017imported>Stashbot 167,016 bytes +128 andrewbogott: moving all tools nodes to new puppetmaster, tools-puppetmaster-01.tools.eqiad.wmflabs
  • curprev 15:13, 25 June 2017imported>Stashbot 166,888 bytes +78 madhuvishy: Restarted webservice on tools.fatameh
  • curprev 16:01, 24 June 2017imported>Stashbot 166,810 bytes +133 bd808: Created and provisioned elasticsearch password for tools.wmde-uca-test (T167971)
  • curprev 20:20, 23 June 2017imported>Stashbot 166,677 bytes +175 bd808: Reindexing various elasticsearch indexes created before we upgraded to v2.x
  • curprev 17:03, 22 June 2017imported>Stashbot 166,502 bytes +287 bd808: Rolled back attempt at Elasticsearch upgrade. Indices need to be rebuilt with 2.x before 5.x can be installed. T164842
  • curprev 00:12, 22 June 2017imported>Stashbot 166,215 bytes +3,075 bd808: Set ownership and permissions on $HOME/.kube for all tools (T165875)
  • curprev 22:09, 14 June 2017imported>Stashbot 163,140 bytes +83 bd808: Restarted apache2 proc on tools-puppetmaster-02
  • curprev 18:14, 8 June 2017imported>Stashbot 163,057 bytes +369 madhuvishy: Also delete from /tmp on tools-webgrid-lighttpd-1411 xvfb-run.*, calibre_* and ws-*.epub
  • curprev 19:05, 7 June 2017imported>Stashbot 162,688 bytes +94 madhuvishy: Killed scp job run by user torin8 on tools-bastion-02
  • curprev 20:30, 6 June 2017imported>Stashbot 162,594 bytes +142 chasemp: rebooting tools-bastion-02 as unresponsive (up 76 days and lots of seemingly left behind things running)
  • curprev 23:44, 5 June 2017imported>Stashbot 162,452 bytes +390 bd808: Deleted tools.iabot crontab that somehow got locally installed on tools-exec-1412 on 2017-05-24T20:55Z
  • curprev 15:15, 1 June 2017imported>Stashbot 162,062 bytes +124 andrewbogott: depooling/rebooting/repooling tools-exec-1403 as part of old kernel-purge testing
  • curprev 19:29, 31 May 2017imported>Stashbot 161,938 bytes +668 bd808: Rebuilding all Docker images to pick up toollabs-webservice v0.37 (T163355)
  • curprev 22:32, 30 May 2017imported>Stashbot 161,270 bytes +1,004 andrewbogott: migrating tools-webgrid-lighttpd-1406, tools-exec-1410 from labvirt1006 to labvirt1009 to balance cpu usage
  • curprev 20:32, 26 May 2017imported>Stashbot 160,266 bytes +160,266 bd808: Added tools-webgrid-lighttpd-14{19,2[0-8]} as submit hosts