
Release Engineering/SAL: Difference between revisions

imported>Stashbot
(thcipriani: repooling integration-slave-jessie-1003 after cleaning mvn and gradle cache)
== 2019-06-05 ==
* 19:34 andrewbogott: moving deployment-imagescaler03 to cloudvirt1029
* 17:35 James_F: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/514547
* 16:01 James_F: Reloading Zuul to deploy {{Gerrit|172a4e7886b285adb48a85dce7287dce5c3778e2}}
* 09:07 hashar: Pooled in integration-slave-docker-1058 and integration-slave-docker-1059
== 2019-06-04 ==
* 20:39 James_F: Updating jjb PHP code coverage jobs to quibble-stretch-php70:0.0.31-5 [[phab:T220917|T220917]]
* 20:31 James_F: Updating docker-pkg files on contint1001 for quibble-stretch-php70 [[phab:T220917|T220917]]
* 17:44 James_F: Updating jjb PHP code coverage jobs to node10 quibble [[phab:T224983|T224983]]
* 17:31 James_F: Reloading Zuul to make mwext-MobileFrontend-npm-run-lint-modules-docker non-voting [[phab:T224997|T224997]]
* 16:10 hashar: Deleting integration-slave-docker-1021 and integration-slave-docker-1049: disk too small (20G partition) and not enough RAM (2G) # [[phab:T221872|T221872]]
* 13:24 hashar: Update all selenium-daily* jobs to use NodeJS 10 instead of NodeJS 6. [[phab:T217545|T217545]]
* 13:01 hashar: Building docker-registry.discovery.wmnet/releng/node10-test-browser:0.6.0 # [[phab:T217545|T217545]]
* 11:32 hashar: Upgrading Jenkins Pipeline plugins
* 11:31 hashar: Upgrading Jenkins Warnings Next Generation Plugin # [[phab:T224745|T224745]]
* 11:30 hashar: Upgrading Jenkins BlueOcean and all its dependencies
== 2019-06-03 ==
* 19:32 James_F: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/514089
* 19:29 James_F: Generated mwgate-node10-docker and deployed via jjb.
* 18:34 hashar: switch most Quibble jobs to node 10 [[phab:T222406|T222406]] - https://gerrit.wikimedia.org/r/#/c/integration/config/+/514034/
* 17:40 James_F: Reloading Zuul to switch most Quibble jobs to node 10 [[phab:T222406|T222406]]
* 15:58 hashar: Deleting integration-slave-docker-1055 and integration-slave-docker-1056 . CPU is way too slow [[phab:T223971|T223971]]
* 15:46 hashar: reduce number of executors on all integration-slave-docker instances; they are somehow starving on CPU and/or IO when a lot of MediaWiki builds are running in parallel
* 14:38 James_F: hashar and I are temporarily disabling running selenium tests in CI. See [[phab:T211784|T211784]] [[phab:T222406|T222406]] for more details.
== 2019-06-01 ==
* 22:54 James_F: Reloading Zuul to re-add Kartographer dependency for JsonConfig [[phab:T224785|T224785]]
* 21:14 James_F: Reloading Zuul to add phan for CentralNotice and add DannyS712 to whitelist
== 2019-05-31 ==
* 23:34 James_F: Reloading Zuul to remove Kartographer dependency for JsonConfig [[phab:T224785|T224785]]
* 22:12 James_F: Reloading Zuul to add phan for LdapAuthentication
* 21:18 James_F: Reloading Zuul to add dependencies for TimedMediaHandler, ReadingLists, JsonConfig, and FlaggedRevs.
* 21:13 James_F: Reloading Zuul to add phan for Collection, ContentTranslation, and Jade
* 19:43 James_F: Reloading Zuul to add phan for Math(s)
* 19:31 James_F: Reloading Zuul to add phan for MobileFrontend
* 19:20 James_F: Reloading Zuul to add phan for Sentry
* 18:35 James_F: Reloading Zuul to deploy CentralNotice dependency to EventBus
* 18:17 James_F: Reloading Zuul to deploy extra dependencies for MobileFrontend
* 18:10 James_F: Reloading Zuul to deploy phan for cldr and GlobalPreferences
* 18:02 James_F: Reloading Zuul to deploy {{Gerrit|I0ee1c8166}}
* 17:56 James_F: Reloading Zuul to deploy {{Gerrit|I9c115b1f6}} and {{Gerrit|I0d98f01d}}
== 2019-05-30 ==
* 23:19 James_F: Pruned releng/quibble-fresnel:0.0.31-3 {{Gerrit|8ca03484a2c3}} from contint1001 to save space
* 23:02 James_F: Updating docker-pkg files on contint1001 for +quibble-fresnel (0.0.31-4)
* 21:33 Krinkle: Jenkins admin says "Your Jenkins data directory /var/lib/jenkins (AKA JENKINS_HOME) is almost full. You should act on it before it gets completely full."
* 21:27 greg-g: back in business (ugh)
* 21:21 greg-g: doing the zuul deadlock dance again
* 20:51 legoktm: deploying https://gerrit.wikimedia.org/r/513168
* 02:48 legoktm: deploying https://gerrit.wikimedia.org/r/512474 https://gerrit.wikimedia.org/r/512507 https://gerrit.wikimedia.org/r/513153 https://gerrit.wikimedia.org/r/513157 https://gerrit.wikimedia.org/r/512729 https://gerrit.wikimedia.org/r/513206 https://gerrit.wikimedia.org/r/513214 https://gerrit.wikimedia.org/r/513217 https://gerrit.wikimedia.org/r/513218 https://gerrit.wikimedia.org/r/513211
* 02:35 legoktm: deployed https://gerrit.wikimedia.org/r/513129 https://gerrit.wikimedia.org/r/513071
== 2019-05-29 ==
* 21:46 greg-g: doing the executor deadlock dance: https://www.mediawiki.org/wiki/Continuous_integration/Zuul#Jenkins_execution_lock
== 2019-05-28 ==
* 12:43 paladox: delete extensions/3D repo in gerrit - [[phab:T224463|T224463]]
* 01:35 legoktm: deployed https://gerrit.wikimedia.org/r/512794
== 2019-05-26 ==
* 18:50 Reedy: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/512544
== 2019-05-22 ==
* 07:21 hashar: Updating Jenkins job to have castor use rsync --delay-updates # [[phab:T203506|T203506]] {{!}} https://gerrit.wikimedia.org/r/#/c/integration/config/+/511672/
== 2019-05-21 ==
* 22:05 marxarelli: generating new blubber-pipeline-* and trigger-blubber-pipeline-* jenkins jobs defined by https://gerrit.wikimedia.org/r/c/integration/config/+/510602
* 11:23 hashar: Depooling integration-slave-docker-1055 and integration-slave-docker-1056 : CPU is too slow # [[phab:T223971|T223971]]
* 08:24 hashar: Updated phan jobs to no longer install MediaWiki development dependencies (potentially conflicting with the extensions code) https://gerrit.wikimedia.org/r/#/c/integration/config/+/511447/ # [[phab:T223397|T223397]]
* 07:29 legoktm: deployed https://gerrit.wikimedia.org/r/511627
* 07:10 legoktm: manually rebuilding mediawiki-core-code-coverage-docker with 5 hour timeout
== 2019-05-20 ==
* 13:11 hashar: updating phan jobs to use docker-registry.wikimedia.org/releng/mediawiki-phan:0.1.15 # [[phab:T219114|T219114]]
* 01:43 legoktm: deploying https://gerrit.wikimedia.org/r/511361
== 2019-05-19 ==
* 15:43 hashar: Purging webperformance.integration.eqiad.wmflabs  /tmp directory (inodes full)
* 15:43 hashar: Purging all images on Docker CI slaves
* 08:03 legoktm: deployed https://gerrit.wikimedia.org/r/511122
== 2019-05-18 ==
* 18:14 Amir1: cherry-picking 511078 on puppetmaster
* 12:07 Reedy: reload zuul to deploy https://gerrit.wikimedia.org/r/510967
== 2019-05-17 ==
* 22:55 paladox: created mediawiki/extensions/Scribe gerrit repo - [[phab:T223662|T223662]]
* 00:35 brennen: Updating dev-images docker-pkg files on contint1001
== 2019-05-16 ==
* 09:36 hashar: Successfully tagged docker-registry.discovery.wmnet/releng/composer-test-php72:0.1.0 # [[phab:T223428|T223428]]
* 07:35 hashar: integration-slave-jessie-1002: purging all php Debian packages and rerunning puppet. There is some package conflict somewhere :-(
== 2019-05-15 ==
* 11:55 hashar: Bring back https://integration.wikimedia.org/ci/computer/integration-castor03/ to restore the central cache behavior. In turn unblocking a wide range of builds
* 11:32 hashar: bringing back castor, integration-castor03 is back
* 09:35 hashar: Regenerating all CI jobs to disable castor saving entirely
* 02:35 Krenair: Logged into deployment-sca0[12] as root and given them the correct nameservers to try to unbreak things. [[phab:T221654|T221654]]
* 01:00 thcipriani: /usr/local/sbin/keyholder arm
== 2019-05-14 ==
* 09:43 hashar: Upgraded all tox jobs to tox 3.10.0
* 09:19 hashar: Building docker containers releng/tox-*:0.4.0
== 2019-05-13 ==
* 20:21 thcipriani: reloading zuul to deploy  https://gerrit.wikimedia.org/r/509930
* 19:47 thcipriani: reloading zuul to deploy  https://gerrit.wikimedia.org/r/502606
* afk: updating docker-pkg images on contint1001 for https://gerrit.wikimedia.org/r/508019
* 14:48 hashar: if you build Docker containers, there is a long delay between an image being built/published and it actually being available. Known issue: https://phabricator.wikimedia.org/T222210#5176863
* 13:49 hashar: Building docker image releng/tox:0.4.0
* 10:37 hashar: Rolling CI config change https://gerrit.wikimedia.org/r/508512 which caused some patches to not be processed last week # https://wikitech.wikimedia.org/wiki/Incident_documentation/20190506-zuul / [[phab:T105474|T105474]]
== 2019-05-10 ==
* 18:28 Reedy: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/509483
* 18:08 Reedy: Reloading Zuul to deploy dependencies, new tests, phan
* 17:53 Reedy: Reloading Zuul to deploy 5 more phan patches
* 17:37 Reedy: Reloading Zuul to deploy 5 phan enabling patches
* 17:23 Reedy: Reloading Zuul to deploy various phan additions and one dependency
* 17:08 Reedy: Reloading Zuul to deploy patches adding phan
* 16:46 Reedy: Reloading Zuul to deploy various dependency patches
== 2019-05-09 ==
* 08:17 elukey: remove mediawiki memcached nutcracker config from deployment-prep (should be unused) - [[phab:T214275|T214275]]
* 00:19 thcipriani: updating docker images on contint1001 for https://gerrit.wikimedia.org/r/508929
* 00:19 thcipriani: clean docker images on contint1001
== 2019-05-07 ==
* 22:23 thcipriani: reloading zuul to deploy https://gerrit.wikimedia.org/r/507871
* 10:57 hashar: Upgraded Zuul to 2.5.1-wmf8 # [[phab:T105474|T105474]]  [[phab:T140297|T140297]]
== 2019-05-06 ==
* 21:57 thcipriani: update docker-pkg on contint1001 for https://gerrit.wikimedia.org/r/508086
* 19:07 Krinkle: Reloading Zuul to deploy  https://gerrit.wikimedia.org/r/508012
* 18:37 marxarelli: Updating docker-pkg files on contint1001 for https://gerrit.wikimedia.org/r/c/integration/config/+/508370
* 17:44 marxarelli: Updating docker-pkg files on contint1001 for https://gerrit.wikimedia.org/r/c/integration/config/+/508036
* 13:31 hashar: Jenkins: installed Least Load plugin {{!}} [[phab:T218458|T218458]]
== 2019-05-03 ==
* 20:58 thcipriani: updating docker-pkg on contint1001 for https://gerrit.wikimedia.org/r/508006
* 20:58 thcipriani: clean docker images from contint1001
* 17:02 thcipriani: reloading zuul to deploy  https://gerrit.wikimedia.org/r/507992
* 09:22 hashar: removed zuul debian package from integration-castor03
* 08:30 hashar: Building Docker image releng/gradle:0.1.0  {{!}} https://gerrit.wikimedia.org/r/#/c/integration/config/+/507872/ {{!}} [[phab:T222199|T222199]]
== 2019-05-01 ==
* 17:53 halfak: deploying ores:52e9759
== 2019-04-30 ==
* 18:00 thcipriani: reloading zuul to deploy https://gerrit.wikimedia.org/r/494778
* 17:01 bearND: (beta): Update mobileapps to {{Gerrit|142ba30}}
* 02:01 hashar: Pooled in integration-slave-docker-1055 and integration-slave-docker-1056
* 01:35 hashar: Deleting integration-slave-docker-1037 (bigram)  it is too slow for some reason # [[phab:T222023|T222023]]
== 2019-04-29 ==
* 14:04 godog: add dsharpe user
== 2019-04-26 ==
* 13:33 Krenair: shut off deployment-conf03 after discussion with otto.mata and elu.key - it seems ancient, broken, unused. [[phab:T218729|T218729]]
== 2019-04-25 ==
* 16:35 Krenair: shutting down deployment-ms-fe02 and deployment-poolcounter04 [[phab:T218729|T218729]]
== 2019-04-23 ==
* 15:33 paladox: merging the 2.15.13 release into stable-2.15 following https://wikitech.wikimedia.org/wiki/Gerrit#Update_our_repository
* 14:41 Krenair: Shut down deployment-ms-be03 and deployment-ms-be04 [[phab:T218729|T218729]]
* 10:22 Amir1: ores:060fc37 going beta
== 2019-04-19 ==
* 19:56 mutante: phab1003 - editing /srv/deployment/phabricator/deployment-cache/.config manually to replace tin.eqiad.wmnet with deploy1001.eqiad.wmnet to fix git cloning issue on first puppet run on new host where somehow tin.eqiad still shows up. fixes puppet run on [[phab:T221389|T221389]]
== 2019-04-18 ==
* 19:25 Reedy: reloading zuul to deploy https://gerrit.wikimedia.org/r/504824
== 2019-04-17 ==
* 17:17 andrewbogott: cherry-picking https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/504580/ to move off of soon-to-be-shutdown dns recursors
== 2019-04-16 ==
* 17:36 Lucas_WMDE: lucaswerkmeister-wmde@deployment-deploy01:~$ mwscript extensions/WikibaseQualityConstraints/maintenance/ImportConstraintEntities.php --wiki=wikidatawiki --config-format=wgConf {{!}} tee [[phab:T221107|T221107]].php
== 2019-04-15 ==
* 10:16 Amir1: ores:8f01d40 going beta
* 08:51 hashar: castor: nuked /srv/jenkins-workspace/caches/castor-mw-ext-and-skins/master/mwselenium-quibble-docker  # [[phab:T220948|T220948]]
== 2019-04-13 ==
* 21:05 Krinkle: Deleting a bunch of job config+history from Jenkins for jobs that no longer exist in JJB/Zuul. [[phab:T91410|T91410]]
* 21:00 Krinkle: Deleting a bunch of job config+history from Jenkins for jobs that no longer exist in JJB/Zuul.
* 21:00 Krinkle: Updating docker-pkg files on contint1001 for https://gerrit.wikimedia.org/r/503669
* 18:52 Krinkle: "Your JENKINS_HOME (/var/lib/jenkins) is almost full. "
* 18:16 Krinkle: Updating docker-pkg files on contint1001 for  https://gerrit.wikimedia.org/r/503664
* 17:44 Krinkle: Reloading Zuul to deploy  https://gerrit.wikimedia.org/r/502807 (postgres php72)
* 00:06 Krenair: transferred /home from deployment-cache-upload04 to deployment-cache-upload05 and shut down old one
* 00:06 Krenair: transferred /home from deployment-cumin to deployment-cumin02 and shut down old one
== 2019-04-12 ==
* 15:09 Krenair: upload traffic now through cache-upload05
== 2019-04-11 ==
* 18:29 Reedy: reloading zuul to deploy https://gerrit.wikimedia.org/r/480463
* 18:23 Reedy: reloading zuul to deploy https://gerrit.wikimedia.org/r/499334
* 17:50 Reedy: deployed 'quibble-donationinterface-REL1_31-php70-docker' to jenkins
* 16:31 Krenair: deleting deployment-db04, unused and shut down since 28th March. [[phab:T219087|T219087]]
* 16:18 mateusbs17: deployment-chromium0[1-2] Update Proton on to {{Gerrit|8988283}} ([[phab:T213362|T213362]], [[phab:T216191|T216191]], [[phab:T212322|T212322]])
== 2019-04-10 ==
* 23:32 James_F: Manually created REL1_33 branches for the core, vendor, and tarball extensions and skins. Eurgh. [[phab:T220653|T220653]]
* 22:36 James_F: Deleted faulty REL1_33 branches for the Timeless, Vector and Monobook skins; they were duplicates of the REL1_32 branches.
* 20:27 paladox: create operations/software/gerrit/plugins/MassBranchCreation repository
* 14:16 hashar: contint1001: sudo -u zuul git -C /srv/zuul/git/mediawiki/core remote prune origin  # [[phab:T220606|T220606]]
* 14:13 hashar: contint2001: sudo -u zuul git -C /srv/zuul/git/mediawiki/core remote prune origin  # [[phab:T220606|T220606]]
* 13:00 Krinkle: Updating docker-pkg files on contint1001 for https://gerrit.wikimedia.org/r/502790
* 12:46 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/502786
* 12:40 hashar: contint2001: stopped puppet and zuul-merger for debugging
== 2019-04-09 ==
* 23:08 Krinkle: Reloading Zuul to deploy https://phabricator.wikimedia.org/T220561
* 17:59 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/502560
* 16:38 Krenair: deleting deployment-db03, unused since [[phab:T216635|T216635]] and shut down since 24th March. [[phab:T219087|T219087]]
* 16:21 bearND: (beta): Update mobileapps to {{Gerrit|3edfcad}} ([[phab:T220045|T220045]] [[phab:T219411|T219411]] [[phab:T219667|T219667]]) - 3rd time is the charm
== 2019-04-08 ==
* 22:46 hashar: cleaned docker images on  integration-slave-docker-1030 and  integration-slave-docker-1043 :)
* 20:06 bearND: (beta): Update mobileapps to {{Gerrit|cdb9928}} ([[phab:T220045|T220045]] [[phab:T219411|T219411]] [[phab:T219667|T219667]])
* 14:47 hashar: hard rebooting integration-slave-docker-1053 OOM / deadlocked
* 13:39 hashar: Deleting integration-slave-docker-1045 again. It uses Stretch instead of Jessie
* 13:21 hashar: integration: fix cumin profiles that got renamed by {{Gerrit|9e0aa8264659799e74c3e815ea39c640e5f05393}}
* 10:15 hashar: Building Docker image  releng/quibble-stretch-php73:0.0.31-2 # [[phab:T220237|T220237]]
== 2019-04-07 ==
* 00:30 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/501793
== 2019-04-06 ==
* 02:17 legoktm: rebuilding npm-php image again https://gerrit.wikimedia.org/r/501848
* 01:52 legoktm: rebuilding npm-php image https://gerrit.wikimedia.org/r/501847
* 01:10 legoktm: deploying https://gerrit.wikimedia.org/r/501786 https://gerrit.wikimedia.org/r/501714 https://gerrit.wikimedia.org/r/501707 https://gerrit.wikimedia.org/r/501782 https://gerrit.wikimedia.org/r/501709 https://gerrit.wikimedia.org/r/500111 https://gerrit.wikimedia.org/r/500106 https://gerrit.wikimedia.org/r/500127 https://gerrit.wikimedia.org/r/500119
* 00:53 Krinkle: Reloading Zuul to deploy  https://gerrit.wikimedia.org/r/501813
== 2019-04-05 ==
* 23:53 Krinkle: Beta cluster puppetmaster is stalled behind origin/production as of 24 hours ago (57 patches behind) due to a local merge conflict
* 22:24 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/501789
* 22:02 legoktm: rebuilding mediawiki-phan docker image https://gerrit.wikimedia.org/r/501794
* 20:11 thcipriani: updating docker-pkg files on contint1001 for  https://gerrit.wikimedia.org/r/501465
* 08:56 hashar: Reloaded Zuul for operations/software/gerrit/plugins/barricade https://gerrit.wikimedia.org/r/#/c/integration/config/+/501507/
* 08:21 legoktm: deploying https://gerrit.wikimedia.org/r/501416
== 2019-04-04 ==
* 23:37 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/501442
* 22:27 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/500360
* 21:48 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/501428
* 20:47 Krinkle: Updating docker-pkg files on contint1001 for https://gerrit.wikimedia.org/r/501407
* 17:17 Krinkle: Updating docker-pkg files on contint1001 for https://gerrit.wikimedia.org/r/500719 (
== 2019-04-03 ==
* 06:13 hashar: gerrit: renamed group "scholarships" to "wikimedia-wikimania-scholarships". Made it owned by "Gerrit Managers" # [[phab:T218864|T218864]]
== 2019-04-02 ==
* 22:24 hauskatze: maurelio@deployment-deploy01:~$ mwscript extensions/PageAssessments/maintenance/purgeUnusedProjects.php --wiki=enwikivoyage {{!}} [[phab:T219935|T219935]]
* 09:46 hashar: Upgrading CI Quibble jobs to 0.0.31
== 2019-04-01 ==
* 21:42 hauskatze: Imported tool-ldap from Diffusion to Gerrit with full history {{!}} [[phab:T219703|T219703]]
* 21:33 hauskatze: Created https://gerrit.wikimedia.org/r/#/admin/projects/labs/tools/ldap {{!}} [[phab:T219703|T219703]]
* 20:53 hashar: ssh contint1001.wikimedia.org sudo rm /tmp/docker-pkg-build.log
* 20:44 hashar: Building Quibble 0.0.31 containers again # [[phab:T219647|T219647]]  [[phab:T219786|T219786]]
* 20:10 hashar: gerrit: flush-caches --cache git_tags  # some tag got stalled when querying over https -  [[phab:T219786|T219786]]
* 18:50 hauskatze: Created mediawiki/extensions/ContributionCredits.git per request on mediawiki.org
* 18:00 Krinkle: Updating docker-pkg files on contint1001 for  https://gerrit.wikimedia.org/r/#/c/integration/config/+/500504/
* 17:25 Krinkle: fresnel-node10-browser-docker failing with ENOMEM. Depooled integration-slave-docker-1049 as precaution.
* 17:04 hashar: Building CI docker images for Quibble 0.0.31 (yes it is a long day...)
* 13:50 hashar: Reverted CI Jenkins jobs to Quibble 0.0.28 # [[phab:T219647|T219647]]
* 13:11 hashar: Upgraded CI Jenkins jobs to Quibble 0.0.30 # [[phab:T219647|T219647]]
* 10:50 hashar: Manually triggering postmerge step of citoid due to [[phab:T219017|T219017]] for mvolz. On contint1001: zuul enqueue --trigger gerrit --pipeline postmerge --project mediawiki/services/citoid --change 497315,1
* 08:08 hashar: Rebuilding Quibble Jessie containers that failed to build last week due to wikimedia-jessie container. # [[phab:T219647|T219647]]
* 08:07 hashar: Rebuilding container docker-registry.wikimedia.org/wikimedia-jessie # [[phab:T219683|T219683]]
== 2019-03-29 ==
* 19:12 thcipriani: reloading zuul to deploy https://gerrit.wikimedia.org/r/497975/ + https://gerrit.wikimedia.org/r/498887/
* 17:30 hashar: some quibble 0.0.30 images fail to build for an unknown reason. Left to figure out after the weekend.
* 17:22 hashar: Building CI docker images for quibble 0.0.30 [[phab:T219647|T219647]]  [[phab:T219645|T219645]]
* 12:03 Krenair: added alaasarhan to deployment-prep [[phab:T219621|T219621]]
== 2019-03-28 ==
* 18:49 Krenair: shut off deployment-db04 instance per [[phab:T219087|T219087]]
* 18:20 thcipriani: reload zuul to deploy https://gerrit.wikimedia.org/r/#/c/integration/config/+/496387/
* 17:36 paladox: forking plugins/quota from upstream
* 17:24 Krenair: deployment-prep [[phab:T219087|T219087]] beginning master switch
* 10:45 hashar: Tagged Quibble 0.0.30 {{Gerrit|6ddc6d508cb554e6443ff72648da3ea8a3253fff}}
* 08:08 legoktm: deployed https://gerrit.wikimedia.org/r/c/integration/config/+/499539 (no-op) and https://gerrit.wikimedia.org/r/499717 (SecurePoll phan)
== 2019-03-27 ==
* 22:27 James_F: Altered Wikimedia GitHub settings to require 2FA; see [[phab:T198810|T198810]]
* 14:16 hashar: contint: refreshed all git caches manually from cumin:  cumin --force 'name:docker' 'find /srv/git -name '*.git' -type d -print -exec git -C {} fetch --prune \;'
* 14:16 hashar: contint: added repositories to the git caches https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/499482/
== 2019-03-26 ==
* 23:04 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/499344 (part 1, 2 and 3)
* 20:36 paladox: create gerrit repo operations/software/gerrit/plugins/WikimediaBlocks [[phab:T219300|T219300]]
* 20:00 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/499312
* 17:43 Krinkle: Reloading Zuul to deploy  https://gerrit.wikimedia.org/r/499262
* 17:42 Krinkle: Reloading Zuul to deploy merged-but-not-deployed patches https://gerrit.wikimedia.org/r/#/c/integration/config/+/497800/ and https://gerrit.wikimedia.org/r/#/c/498684/
== 2019-03-25 ==
* 14:46 mateusbs17: sunset deployment-maps03
== 2019-03-24 ==
* 16:06 Krenair: shut off old deployment-db03 instance per [[phab:T219087|T219087]]
* 05:39 Krenair: cleaned up old puppet certs/nodes -certcentral-testclient03 -certcentral-testdns -certcentral03 -zotero01 -eventgate-analytics -t153468-test -rd3-cptest-master01 -maps05
* 04:12 Krenair: removed php7.0-fpm package (conflicting with php7.2-fpm) and removed /etc/nginx/sites-enabled/default (conflicting with apache, puppet will remove the available copy too) from -deploy02, -jobrunner03, -mwmaint01, and -mediawiki-07 hosts to try to get puppet there happy again
== 2019-03-23 ==
* 23:44 legoktm: deploying https://gerrit.wikimedia.org/r/498681 https://gerrit.wikimedia.org/r/498682 https://gerrit.wikimedia.org/r/498683
* 23:33 legoktm: deploying https://gerrit.wikimedia.org/r/498247 https://gerrit.wikimedia.org/r/498249 https://gerrit.wikimedia.org/r/498680
* 20:52 Krinkle: Updating docker-pkg files on contint1001 for https://gerrit.wikimedia.org/r/498664 / [[phab:T215562|T215562]]
== 2019-03-22 ==
* 19:59 hashar: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/#/c/integration/config/+/498466/ # [[phab:T219017|T219017]]
* 19:34 thcipriani: updating docker-pkg files on contint1001 for https://gerrit.wikimedia.org/r/#/c/integration/config/+/494548/
* 19:24 thcipriani: clean old docker-pkg image from contint1001
* 03:08 legoktm: rebuilding mediawiki-phan for https://gerrit.wikimedia.org/r/498297
* 00:36 Krinkle: Reloading Zuul to deploy  https://gerrit.wikimedia.org/r/498276  / [[phab:T218963|T218963]]
* 00:36 Krinkle: Reloading Zuul to deploy  https://gerrit.wikimedia.org/r/498276  / [[phab:T215562|T215562]]
== 2019-03-21 ==
* 23:34 legoktm: rebuilding mediawiki-phan docker image for https://gerrit.wikimedia.org/r/498266
* 22:34 legoktm: onlined integration-slave-jessie-1002
* 22:33 legoktm: legoktm@integration-slave-jessie-1002:/srv/jenkins-workspace/workspace$ sudo rm -rf *
* 21:03 legoktm: deploying https://gerrit.wikimedia.org/r/498178 https://gerrit.wikimedia.org/r/498185 https://gerrit.wikimedia.org/r/498187 https://gerrit.wikimedia.org/r/498189 https://gerrit.wikimedia.org/r/498182
* 15:22 hashar: pruning images/containers on integration-slave-docker-1021
* 10:13 hashar: deployment-deploy01: sudo rm -fR /tmp/mw-cache-master {{!}} files were from Mar 15 10:21
* 00:33 legoktm: deploying https://gerrit.wikimedia.org/r/497600
== 2019-03-20 ==
* 13:20 hashar: Scheduled update of Diffusion repository wikibase-termbox via https://phabricator.wikimedia.org/source/wikibase-termbox/manage/ on request of Pablo_WMDE
== 2019-03-19 ==
* 22:47 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/497322 / [[phab:T218553|T218553]]
* 21:00 paladox: hiding operations/software/gerrit/plugins/WikimediaWebSessions from normal users
* 20:59 paladox: create operations/software/gerrit/plugins/WikimediaWebSessions project [[phab:T218739|T218739]]
* 20:23 hashar: integration: sudo cumin --force '*' 'rm /etc/apt/preferences.d/jessie_mitaka_pinning_*' # [[phab:T218559|T218559]]
* 19:56 hashar: integration: sudo cumin --force '*' 'rm /etc/apt/sources.list.d/openstack-mitaka-jessie.list' # [[phab:T218559|T218559]]
* 16:56 Reedy: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/494802
* 15:48 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/497437
* 14:01 hashar: Removed Zuul "check" pipeline {{!}} https://gerrit.wikimedia.org/r/#/c/integration/config/+/493188/ {{!}} [[phab:T192217|T192217]]
* 02:21 Krenair: sudo aptly publish --architectures="all,amd64" --skip-signing repo buster-deployment-prep
* 02:09 Krenair: sudo aptly repo create -component="main" -distribution="buster-deployment-prep" buster-deployment-prep
== 2019-03-18 ==
* 16:38 Krenair: created deployment-acme-chief01 and a client instance for further acme-chief testing + dev. used stretch, would be buster like prod but not sure that's easily available outside testlabs yet
* 10:09 hashar: contint1001: rm -fR /srv/doc1001.eqiad.wmnet
* 10:06 hashar: github: deleting https://github.com/wikimedia/wikidata-gremlin # archived [[phab:T155829|T155829]]
* 09:58 hashar: arming keyholder on integration-cumin
* 09:55 hashar: deleting shut-down instance integration-publisher02, we do not use it anymore since doc publishing got overhauled ( [[phab:T137890|T137890]] )  # [[phab:T218146|T218146]]
* 09:12 hashar: deployment-deploy01: cleaning disk: rm /var/cache/hhvm/cli.hhbc.sq3
* 07:37 legoktm: deployed https://gerrit.wikimedia.org/r/496610
== 2019-03-17 ==
* 16:29 Krenair: deactivated and cleaned certs for deployment-redis3-changeprop02 and deployment-prometheus01 (which no longer appear to exist but were causing cumin to be upset)
* 16:27 Krenair: armed deployment-cumin keyholder
* 00:08 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/496887
== 2019-03-16 ==
* 21:56 Krinkle: krinkle@contint1001:~$ zuul enqueue --trigger gerrit --pipeline postmerge --project mediawiki/extensions/OOJsUIAjaxLogin --change 490979,1
* 21:49 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/497077
* 21:41 legoktm: deploying https://gerrit.wikimedia.org/r/496507 https://gerrit.wikimedia.org/r/496524 https://gerrit.wikimedia.org/r/496520 https://gerrit.wikimedia.org/r/496516 https://gerrit.wikimedia.org/r/496510
* 21:21 Krinkle: krinkle@contint1001$ zuul enqueue --trigger gerrit --pipeline postmerge --project mediawiki/extensions/EventLogging --change 264494,4
* 21:20 Krinkle: Reloading Zuul to deploy  https://gerrit.wikimedia.org/r/497076
* 21:20 Krinkle: Removing doc1001:/srv/docroot/org/wikimedia/doc/mediawiki-extensions-EventLogging (created by accident)
== 2019-03-15 ==
* 21:58 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/496978
* 21:52 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/496882
* 20:10 Krinkle: Reloading Zuul to deploy  https://gerrit.wikimedia.org/r/496688
* 18:21 thcipriani: clean old docker images from contint1001
* 17:40 thcipriani: rearm beta keyholder
* 16:54 thcipriani: reenable beta-scap-eqiad
* 16:08 thcipriani: disable beta-scap-eqiad to test new php, back shortly
== 2019-03-14 ==
* 23:48 Krinkle: Abort job quibble-vendor-mysql-hhvm-docker/39874/ for mwext-CentralAuth (stuck after 59 minutes)
* 21:25 hashar: Manually triggered tests for 12 ContentTranslation changes that had label:verified=-1 # [[phab:T216689|T216689]]
* 21:21 hashar: Updated quibble-vendor-mysql-hhvm-docker with latest libc6 hopefully fixing HHVM segfault within libpthread # [[phab:T216689|T216689]]
* 19:14 ebernhardson: restart logstash on deployment-logstash2 to re-read and re-create apifeatureusage template
* 16:34 hashar: rollback quibble-vendor-mysql-hhvm-docker job to no longer capture core files; we have enough and a good lead ( reverting https://gerrit.wikimedia.org/r/#/c/integration/config/+/496392/ ) # [[phab:T216689|T216689]]
* 12:31 hashar: Updated quibble-vendor-mysql-hhvm-docker to hopefully allow core dumps and capture them {{!}} https://gerrit.wikimedia.org/r/#/c/integration/config/+/496392/4 # [[phab:T216689|T216689]]
* 12:12 hashar: Updated quibble-vendor-mysql-hhvm-docker to hopefully allow core dumps and capture them {{!}} https://gerrit.wikimedia.org/r/#/c/integration/config/+/496392/3 # [[phab:T216689|T216689]]
* 10:49 hashar: triggering tests for all ContentTranslation pending changes # [[phab:T216689|T216689]]
* 09:54 hashar: ci: live hacked job https://integration.wikimedia.org/ci/job/quibble-vendor-mysql-hhvm-docker/ in attempt to capture 'core' files from hhvm {{!}} https://gerrit.wikimedia.org/r/#/c/integration/config/+/496392/ {{!}} [[phab:T216689|T216689]]
== 2019-03-13 ==
* 21:26 thcipriani: pool new bigram CI instance integration-slave-docker-1054
* 20:58 thcipriani: deleting bigram CI instance integration-slave-docker-1046 due to corrupt disk cf: [[phab:T218245|T218245]]
* 20:44 thcipriani: marking integration-slave-docker-1046 offline
* 20:38 hashar: Added integration-slave-docker-1045 to Jenkins. The instance existed in WMCS but was not in Jenkins
* 20:13 bearND: (beta): Update mobileapps to {{Gerrit|5865552}} ({{Gerrit|7074964}} {{Gerrit|d6dc3cd}} {{Gerrit|fbc6262}})
* 19:13 hashar: integration-slave-docker-1046 is back online # [[phab:T218245|T218245]]
* 19:04 hashar: hard rebooting integration-slave-docker-1046 , not reachable over ssh # [[phab:T218245|T218245]]
* 18:48 hashar: Building containers releng/quibble-jessie-hhvm and releng/quibble-stretch-hhvm with HHVM core_dump_report enabled # [[phab:T216689|T216689]]
* 18:09 ebernhardson: restart elasticsearch on deployment-elastic* to deploy apifeature usage fix ([[phab:T183156|T183156]])
* 10:39 hashar: Triggered tests for ContentTranslation changes that had label:verified=-1 # [[phab:T217654|T217654]]
* 10:03 hashar: Bump Quibble tmpfs disk space used to hold the database from 256 MBytes to 320 MBytes. l10n_cache causes an overflow # [[phab:T217654|T217654]]
== 2019-03-12 ==
* 18:38 Krinkle: Updating docker-pkg files on contint1001 for https://gerrit.wikimedia.org/r/479558  / [[phab:T203506|T203506]]
* 18:23 thcipriani: bring integration-slave-jessie-1001 back online, /srv disk space now at 20% (not sure if someone cleared disk and forgot to repool)
* 16:45 Reedy: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/495931
* 13:50 thcipriani: reloading zuul to deploy https://gerrit.wikimedia.org/r/#/c/integration/config/+/493139/
== 2019-03-11 ==
* 15:59 Krenair: previous !log was for deployment-prep
* 15:56 Krenair: added MSantos to projectadmin, per chat in -infrastructure he'll use it to work on maps. user is a new foundation software engineer
* 01:20 legoktm: deploying https://gerrit.wikimedia.org/r/495593
== 2019-03-10 ==
* 02:50 legoktm: deploying https://gerrit.wikimedia.org/r/495443 https://gerrit.wikimedia.org/r/493741
* 02:45 legoktm: deploying https://gerrit.wikimedia.org/r/495332 https://gerrit.wikimedia.org/r/495470
== 2019-03-08 ==
* 22:18 thcipriani: Reloading zuul to deploy https://gerrit.wikimedia.org/r/#/c/integration/config/+/490950/
* 20:20 legoktm: deploying https://gerrit.wikimedia.org/r/495302
* 20:16 legoktm: deploying https://gerrit.wikimedia.org/r/495300 https://gerrit.wikimedia.org/r/495298 https://gerrit.wikimedia.org/r/493502
* 11:36 hasharLunch: integration: deleting old java8 docker images:  sudo cumin --force 'name:docker' 'docker images{{!}}grep java8{{!}}awk "{ print $3 }"{{!}}xargs docker rmi'
* 10:04 hashar: Updating maven based jobs to latest java8 containers {{!}} https://gerrit.wikimedia.org/r/#/c/integration/config/+/495022/ {{!}} [[phab:T208938|T208938]]
== 2019-03-06 ==
* 18:25 hasharAway: contint1001: restart Jenkins for plugins upgrade
* 18:24 hasharAway: reloading zuul for {{Gerrit|I21d7fa2939f507441f7d130f8207541e28d762ad}}
* 18:02 hashar: Upgrading plugins on https://releases-jenkins.wikimedia.org
* 17:37 hashar: Reloading Zuul for  mediawiki/libs/Zest https://gerrit.wikimedia.org/r/#/c/integration/config/+/494609/2/zuul/layout.yaml
* 16:27 Krinkle: doc1001 had bad permissions set on /org/wikimedia/doc (chmod 755 instead of 775), making it impossible to git pull in the way that the post-merge Jenkins job on integration/docroot recommends. Fixed with `sudo -u doc-uploader chmod 775 /srv/wikimedia/org/wikimedia/doc`.
* 15:58 andrewbogott: deleting deployment-prometheus01 on Filippo's advice
* 13:31 gehel: upgrading logstash to 5.6.14 on deployment-logstash2
* 13:15 gehel: upgrading elasticsearch to 5.6.14 on deployment-logstash2
* 11:33 Lucas_WMDE: lucaswerkmeister-wmde@deployment-mediawiki-09:~$ sudo systemctl restart php7.2-fpm # [[phab:T217323|T217323]]
== 2019-03-05 ==
* 20:10 thcipriani: reenable beta-scap-eqiad
* 19:17 thcipriani: disable beta-scap-eqiad due to [[phab:T217587|T217587]]
* 10:13 hashar: integration: fixed erroneous ssh key restriction for cumin {{!}} [[phab:T217642|T217642]]
== 2019-03-04 ==
* 17:36 hashar: Build Docker containers for chromium=v71 pin ( https://gerrit.wikimedia.org/r/#/c/integration/config/+/494243/ )
* 15:14 herron: cherry picking https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/492390/ to deployment-puppetmaster03
* 13:49 hauskatze: GitHub: deleted wikimedia/mediawiki-extensions-UploadLocal {{!}} [[phab:T213011|T213011]]
* 09:59 hashar: cleaned docker images on integration-slave-docker-1037
* 07:38 legoktm: deploying https://gerrit.wikimedia.org/r/494167
== 2019-03-03 ==
* 02:26 Krinkle: tried rebooting or shutting down integration-slave-docker-1021, no response on horizon. Did pause/resume instead, which did work, after which shutdown/start worked. Jenkins agent has been relaunched and seems online again.
* 02:20 Krinkle: integration-slave-docker-1021 (ci1.medium) has jobs failing on it due to ENOMEM. Horizon shows in log: integration-slave-docker-1021 login: [4961938.696837] Out of memory: Kill process 21770 (chromium) score 841 or sacrifice child; [4961938.699176] Killed process 21770 (chromium) total-vm:3171496kB, anon-rss:1379288kB, file-rss:0kB, shmem-rss:1636kB
== 2019-03-02 ==
* 22:12 Krinkle: Updating docker-pkg files on contint1001 for https://gerrit.wikimedia.org/r/493959
* 20:44 hauskatze: Renamed https://github.com/wikimedia/wikimedia-github-community-health-defaults to https://github.com/wikimedia/.github
* 20:42 hauskatze: ssh -p 29418 gerrit.wikimedia.org replication start wikimedia/github-community-health-defaults --wait
* 20:40 hauskatze: github created https://github.com/wikimedia/wikimedia-github-community-health-defaults
* 20:31 Reedy: reloading zuul to deploy https://gerrit.wikimedia.org/r/493881
* 20:30 Krinkle: Failure on integration-slave-docker-1021 (ENOMEM) https://integration.wikimedia.org/ci/job/fresnel-node10-browser-docker/61/console
* 19:51 legoktm: deploying https://gerrit.wikimedia.org/r/493872
* 19:37 legoktm: deploying https://gerrit.wikimedia.org/r/493862
* 18:26 Reedy: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/493808
* 18:21 Reedy: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/493837
== 2019-03-01 ==
* 19:17 thcipriani: integration-slave-docker-1021:/# docker rmi $(docker images {{!}} grep " months " {{!}}grep -v " [1-2] months " {{!}} awk '{print $3}')
* 17:02 thcipriani: integration-slave-jessie-1004 back online
* 16:58 thcipriani: integration-slave-jessie-1002 back online (disk space looked fine); rebooting integration-slave-jessie-1004 -- can't ssh to machine
* 16:11 Lucas_WMDE: delete refs/master and refs/gerrit/master on WikibaseQualityConstraints repository [[phab:T217408|T217408]]
* 15:49 hashar: wikidata/query/blazegraph  change Gerrit config to require a change-id # [[phab:T216855|T216855]]
* 14:28 hashar: Upgrading integration/jenkins-job-builder to version 2.0.2  + one custom hack 11aa5de4...a06d173e  # [[phab:T143731|T143731]]
* 14:18 hashar: integration/jenkins-job-builder : importing upstream code to new branch "upstream".  Push all upstream tags to our repository
== 2019-02-28 ==
* 18:14 Reedy: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/493261
* 03:42 duh: deploying https://gerrit.wikimedia.org/r/493357
* 03:42 duh: deploying
* 03:30 duh: deploying https://gerrit.wikimedia.org/r/c/integration/config/+/493355
== 2019-02-27 ==
* 22:02 thcipriani: reloading zuul to deploy https://gerrit.wikimedia.org/r/#/c/integration/config/+/490678/
* 19:20 thcipriani: reloading zuul to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/492758
* 19:08 thcipriani: updating docker-pkg files on contint1001 for https://gerrit.wikimedia.org/r/c/integration/config/+/492758 (take II)
* 18:27 thcipriani: updating docker-pkg files on contint1001 for https://gerrit.wikimedia.org/r/c/integration/config/+/492758
== 2019-02-26 ==
* 22:52 ebernhardson: delete logstash logs in /var/log/logstash generated prior to 2019
* 22:51 ebernhardson: restart logstash on deployment-logstash2 while hacking around to see why apifeatureusage doesn't work
* 18:01 dcausse: deployment-prep: installing elastic 6.5.4 to deployment-elastic* machines
* 16:46 addshore: added Cparle to deployment-prep
* 16:11 hashar: Generating 1.33.0-wmf.19 deploy notes https://integration.wikimedia.org/ci/job/train-deploy-notes/9/console {{!}} [[phab:T206673|T206673]]
* 09:32 godog: remove now-merged node-exporter timer disable, cherry pick https://gerrit.wikimedia.org/r/c/operations/puppet/+/492632
== 2019-02-25 ==
* 23:32 twentyafterfour: root@deployment-db05# mariabackup --innobackupex --apply-log --use-memory=10G /srv/sqldata # [[phab:T216067|T216067]]
* 22:14 thcipriani: docker rmi images without "latest" tag on contint1001 to free space -- should have kept all current docker-pkg images as well as images with children -- [[phab:T217094|T217094]]
* 13:39 hashar: Rebuilding some CI Docker images using PHP sury.org to switch the sury.org component from jessie to stretch ( https://gerrit.wikimedia.org/r/#/c/integration/config/+/492666/ )
== 2019-02-24 ==
* 21:09 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/492561, [[phab:T216964|T216964]]
* 04:54 legoktm: rebuilding docker image for https://gerrit.wikimedia.org/r/485241
* 04:41 legoktm: legoktm@contint1001:/srv/zuul/git/mediawiki/tools$ sudo -u zuul rm -rf phan
== 2019-02-23 ==
* 22:25 legoktm: deploying https://gerrit.wikimedia.org/r/492497
* 02:32 Krinkle: Reloading Zuul to deploy  https://gerrit.wikimedia.org/r/492427
* 01:59 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/492425
== 2019-02-22 ==
* 20:59 Krinkle: Updating docker-pkg files on contint1001 for https://gerrit.wikimedia.org/r/492377
* 16:48 paladox: created "block-users" group for https://bugs.chromium.org/p/gerrit/issues/detail?id=10507 (reported by a wmf user)
* 13:38 Reedy: restarting zuul because nothing is being run
== 2019-02-21 ==
* 21:02 paladox: branch deploy/wmf/stable-2.16 from {{Gerrit|0e0ea0ff735da0a494347884917fc48881d7e545}} in operations/software/gerrit
* 19:49 Amir1: ores:5d937b1 is going beta
* 18:22 Amir1: ores:2d84709 going beta
* 17:45 thcipriani: reloading zuul to deploy https://gerrit.wikimedia.org/r/#/c/integration/config/+/483225/
* 16:35 hashar: Adjusting IP of https://integration.wikimedia.org/ci/computer/compiler1002.puppet-diffs.eqiad.wmflabs/ and adding it back # [[phab:T216513|T216513]]
* 14:10 Amir1: deploying ores:5d50713 to beta
== 2019-02-20 ==
* 19:45 hashar: deployment-db03: restored some old /var/lib/dpkg/status file : sudo zcat /var/backups/dpkg.status.2.gz {{!}} sudo tee /var/lib/dpkg/status  # [[phab:T216635|T216635]]
* 17:26 hashar: For beta cluster the MySQL master database has some innodb issue [[phab:T216635|T216635]] , the MySQL slave has an issue as well [[phab:T216067|T216067]]
* 17:09 hashar: reloading zuul for {{Gerrit|Id1e3afd0afba9b388778066b9b6e8e564a25826b}}
* 17:09 hashar: contint1001: fix broken root ownership on zuul git deploy repo: sudo find /etc/zuul/wikimedia/.git -not -user zuul -exec chown zuul:zuul {} +
* 17:05 thcipriani: reloading zuul to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/490606/
* 16:57 thcipriani: reloading zuul to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/490559
* 16:36 hauskatze: Ran replication start mediawiki/extensions/PageViewInfo --wait on gerrit.wikimedia to populate GitHub mirror (success messages afterwards) {{!}} [[phab:T180864|T180864]]
* 00:25 greg-g: disabled beta-update-databases-eqiad in the jenkins UI - [[phab:T216067|T216067]]
== 2019-02-19 ==
* 21:23 Krinkle: Reloading Zuul to deploy  https://gerrit.wikimedia.org/r/491577, also deploys  https://gerrit.wikimedia.org/r/490492  / [[phab:T198495|T198495]]
* 20:28 hashar: jenkins: disable Android Emulator plugin {{!}} [[phab:T198495|T198495]]
* 20:27 hashar: gerrit: archived repository integration/jenkinsci/android-emulator-plugin {{!}} [[phab:T198495|T198495]]
* 20:23 hashar: Deleting Jenkins job apps-android-wikipedia-periodic-test {{!}} [[phab:T198495|T198495]]
* 15:55 Krinkle: Updating docker-pkg files on contint1001 for https://gerrit.wikimedia.org/r/491513
* 15:46 hashar: Creating integration-slave-jessie-android m1.large it got deleted last week {{!}} [[phab:T216517|T216517]]
* 08:55 hashar: Cleaning contint1001 / partition
* 04:55 bd808: Removed stale cherry-pick of {{Gerrit|Ic7e726768701fefdee68622b08e3f2995779fe5a}} from deployment-puppetmaster03 that was blocking rebase on origin/production
* 04:50 Krinkle: Updating docker-pkg files on contint1001 for https://gerrit.wikimedia.org/r/491405
* 04:14 Krinkle: Disabled spam account on Phab - https://phabricator.wikimedia.org/people/manage/18915/
* 03:55 Krinkle: Updating docker-pkg files on contint1001 for https://gerrit.wikimedia.org/r/491403
* 00:54 Krinkle: Updating docker-pkg files on contint1001 for https://gerrit.wikimedia.org/r/491381  / {{Gerrit|bc8e4198961cb73}} /  [[phab:T133646|T133646]]
== 2019-02-18 ==
* 22:32 Krinkle: Updating docker-pkg files on contint1001 for https://gerrit.wikimedia.org/r/491379  / [[phab:T211784|T211784]]
* 21:51 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/491378 / [[phab:T133646|T133646]]
* 21:37 Krinkle: Updating docker-pkg files on contint1001 for https://gerrit.wikimedia.org/r/459268 / [[phab:T133646|T133646]]
* 15:38 elukey: kill/spawn deployment-aqs0[2,3] in deployment-prep with Debian Stretch
* 14:12 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/490590
* 12:00 Krenair: [[phab:T216067|T216067]] Stopping mysql on -db04 to begin copy to -db05. Note crashed tables centralauth.globaluser and centralauth.localuser
* 11:57 elukey: kill/spawn deployment-aqs01 with Debian Stretch in deployment-prep
* 11:45 arturo: manually start deployment-db03 per Krenair request
* 11:29 hasharAway: beta: tried to start instance deployment-db03 172.16.5.23 --> ERROR  {{!}} [[phab:T216067|T216067]]
== 2019-02-17 ==
* 07:21 legoktm: deploying https://gerrit.wikimedia.org/r/491029
* 07:10 legoktm: Building image docker-registry.discovery.wmnet/releng/tox-acme-chief:0.3.4
* 06:28 legoktm: building new tox-acme-chief docker image https://gerrit.wikimedia.org/r/489725
* 01:12 Krinkle: beta-scap-eqiad (cron) failing with "sudo: a password is required"
== 2019-02-16 ==
* 19:44 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/490937 / [[phab:T216275|T216275]]
* 17:23 thcipriani: installed php7.0-curl on deployment-deploy01 (why was that suddenly necessary?)
== 2019-02-15 ==
* 17:28 thcipriani: integration-slave-jessie-1002:/srv/jenkins-workspace/workspace$ `sudo rm -rf *` due to full disk
== 2019-02-14 ==
* 20:50 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/490528  / [[phab:T216102|T216102]]
* 20:50 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/490528  / [[phab:T133646|T133646]]
* 13:24 thcipriani: rearm keyholder on deployment-deploy01: sudo keyholder arm, passwords on https://wikitech.wikimedia.org/wiki/Keyholder
== 2019-02-13 ==
* 21:32 marxarelli: dduvall@integration-slave-jessie-1001:/mnt/home/jenkins-deploy$ `rm -rf .gradle/ .m2/` due to full disk
* 21:21 marxarelli: bringing integration-slave-docker-1046 and integration-slave-jessie-1001 back online
* 21:20 marxarelli: dduvall@integration-slave-jessie-1001:/srv/jenkins-workspace/workspace$ `sudo rm -rf *` due to full disk
* 21:15 marxarelli: removing old docker images on integration-slave-docker-1046
* 21:10 marxarelli: starting migrated integration-slave-docker-1046 instance
* 21:01 marxarelli: pooling new jenkins node for integration-slave-docker-1052
* 20:46 marxarelli: pooling jenkins node for integration-slave-docker-1051
* 20:45 marxarelli: launching replacement instance integration-slave-docker-1052
* 20:35 marxarelli: launching replacement instance integration-slave-docker-1051
* 20:32 marxarelli: pooling jenkins node for integration-slave-docker-1050
* 20:15 marxarelli: integration-slave-docker-{1044,1046,1047} unresponsiveness due to cloudvirt failure. 1046 is being moved already by CS. deleting 1044 and 1047
* 19:57 marxarelli: seeing jenkins agent connection failures for integration-slave-docker-{1044,1046,1047}
* 19:48 marxarelli: pooling replacement jenkins node integration-slave-docker-1049
* 19:34 marxarelli: deleting integration-slave-jessie-android jenkins node and instance
* 19:33 marxarelli: deleting integration-slave-jessie-1003 jenkins node and instance
* 19:32 marxarelli: deleting integration-slave-docker-1033 jenkins node and instance
* 19:25 marxarelli: deleting integration-slave-docker-1017 jenkins node and instance
* 18:45 Krinkle: integration-slave-jessie-1003 seems to be consistently unable to start jobs, marking as offline manually
* 18:32 thcipriani: bringing up new integration-castor03, re-enabling castor-save* jobs
* 18:15 marxarelli: adding new jenkins node integration-slave-docker-1048
* 18:02 marxarelli: launching new integration-slave-docker-1048 instance
* 17:59 marxarelli: deleting integration-slave-docker-1038 node and deleting instance
* 17:50 marxarelli: bringing integration-slave-docker-1033 back online after clearing out old docker images
* 17:33 thcipriani: rebuilding integration-castor03
* 17:21 thcipriani: stopping rsync server on castor03
* 17:21 twentyafterfour: stopped rsync on castor03
* 17:16 twentyafterfour: disconnected castor03 from jenkins
* 16:48 thcipriani: reloading zuul to deploy https://gerrit.wikimedia.org/r/#/c/integration/config/+/487880/
* 14:34 thcipriani: modified castor-save-workspace-cache to exit 0 and run on blubber nodes while integration-castor03 is down
* 14:26 dcausse: deployment-prep: upgrading to elastic 5.6.14
== 2019-02-12 ==
* 23:15 thcipriani: reloading zuul to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/490130/1
== 2019-02-11 ==
* 21:05 thcipriani: reloading zuul to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/489689
* 20:39 thcipriani: reloading zuul to deploy https://gerrit.wikimedia.org/r/#/c/integration/config/+/489688/
* 19:46 marxarelli: installing/enabling HTTP Request jenkins plugin on integration.wikimedia.org/ci to support https://gerrit.wikimedia.org/r/c/integration/pipelinelib/+/480689 changes
* 16:04 addshore: bring integration-slave-docker-1040 back online
* 16:04 addshore: addshore@integration-slave-docker-1040:~$ sudo docker image prune -a --force --filter "until=2191h" // (3 months?) Total reclaimed space: 17.52GB
* 10:47 godog: shut deployment-prometheus01, unused now
* 03:06 Reedy: graceful restart of zuul as no jobs were running
== 2019-02-10 ==
* 03:22 Krinkle: Updating docker-pkg files on contint1001 for  https://gerrit.wikimedia.org/r/489434  (Create quibble-stretch-hhvm, replacing jessie)
* 02:06 Krinkle: Updating docker-pkg files on contint1001 for  https://gerrit.wikimedia.org/r/489430
== 2019-02-08 ==
* 20:20 Krinkle: Delete various jobs on Jenkins that no longer exist in JJB config, ref [[phab:T91410|T91410]]
* 15:59 addshore: this reload also included "Switch npm-audit job to node10"? [[phab:T211784|T211784]], which did touch the zuul file
* 15:58 addshore: reloaded zuul for https://gerrit.wikimedia.org/r/#/c/integration/config/+/489241/
* 03:10 Krinkle: Delete various jobs on Jenkins that no longer exist in JJB config
* 00:28 Krinkle: krinkle@doc1001: sudo -u doc-uploader chmod 775 /srv/docroot/org/wikimedia/doc/
* 00:12 marxarelli: removed old docker images on contint1001 to free up space
== 2019-02-07 ==
* 23:17 thcipriani: integration-slave-jessie-1003:sudo rm -rf /srv/jenkins-workspace/workspace/*
* 23:15 thcipriani: integration-slave-docker-1033:sudo docker image prune and bring back online
* 22:28 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/467550
* 19:09 paladox: created integration/zuul/build gerrit repo for [[phab:T215458|T215458]]
* 19:05 paladox: created integration/zuul/wheels gerrit repo for [[phab:T215458|T215458]]
* 15:48 addshore: brought integration-slave-docker-1043 back online
* 15:48 addshore: addshore@integration-slave-docker-1043:~$ sudo docker image prune -a --force --filter "until=2191h" // (3 months?) Total reclaimed space: 14.86GB
* 08:49 hashar: cleaning docker images on integration-slave-docker-1021
== 2019-02-06 ==
* 22:34 shdubsh: Deploy node-exporter 0.17 [[phab:T213708|T213708]]
* 14:12 godog: shut off deployment-prometheus01 - [[phab:T215272|T215272]]
* 14:00 godog: switch beta-prometheus to deployment-prometheus02 - [[phab:T215272|T215272]]
== 2019-02-05 ==
* 20:07 ebernhardson: jobrunner port 9006 is firewalled, revert to 9005 and created [[phab:T215339|T215339]] to fix job queue in beta cluster
* 19:36 ebernhardson: Update profile::cpjobqueue::{jobrunner,videoscaler}_host in horizon hiera from port 9005 to 9006 to match new restrictions in gerrit.wikimedia.org/r/481866
* 16:29 addshore: [[phab:T215288|T215288]] added mirrys to deployment-prep as a user
* 15:32 addshore: [[phab:T215278|T215278]] addshore@integration-slave-docker-1037:~$ sudo docker image prune -a --force --filter "until=2191h" // (3 months?) Total reclaimed space: 16.59GB
== 2019-02-04 ==
* 23:13 thcipriani: integration-slave-docker-1040:sudo docker image prune and bring back online
* 23:12 thcipriani: integration-slave-docker-1038:sudo docker image prune and bring back online
* 21:48 ebernhardson: restart logstash on deployment-logstash2
* 15:25 hashar: removed Jenkins user "nodepoolmanager" as well as related authorizations {{!}} [[phab:T209361|T209361]]
== 2019-02-03 ==
* 06:15 legoktm: deployed https://gerrit.wikimedia.org/r/487627
* 05:44 legoktm: deployed https://gerrit.wikimedia.org/r/485967
* 04:36 legoktm: deploying https://gerrit.wikimedia.org/r/487534
== 2019-02-02 ==
* 22:17 legoktm: legoktm@integration-slave-jessie-1004:/srv/jenkins-workspace/workspace$ sudo rm -rf *
== 2019-01-31 ==
* 15:03 thcipriani: rearm keyholder on deployment-deploy01
* 12:05 arturo: VM instances deployment-deploy01,deployment-deploy02,deployment-fluorine02,deployment-kafka-jumbo-2,deployment-kafka-main-1,deployment-maps04,deployment-mcs01,deployment-mediawiki-09,deployment-memc04,deployment-ms-be03,deployment-ms-fe02,deployment-parsoid09,deployment-sca04,deployment-webperf12, were stopped briefly due to issue in hypervisor ([[phab:T215012|T215012]])
== 2019-01-30 ==
* 08:35 legoktm: deploying https://gerrit.wikimedia.org/r/486439 https://gerrit.wikimedia.org/r/481570 https://gerrit.wikimedia.org/r/481571
== 2019-01-29 ==
* 07:41 legoktm: legoktm@integration-slave-jessie-1001:/srv/jenkins-workspace/workspace$ sudo rm -rf * b/c full disk
== 2019-01-28 ==
* 16:33 hashar: contint1001: cleaning up disk space on /
* 13:07 addshore: bringing integration-slave-docker-1041 back online
* 13:07 addshore: addshore@integration-slave-docker-1041:~$ sudo docker image prune -a --force --filter "until=2191h" // (3 months?) Total reclaimed space: 16.12GB
* 09:37 Amir1: ores:ad160b0 is going beta
== 2019-01-27 ==
* 19:57 addshore: bringing integration-slave-docker-1034 back online
* 19:50 addshore: addshore@integration-slave-docker-1034:~$ sudo docker image prune -a --force --filter "until=2191h" // (3 months?) Total reclaimed space: 17.12GB
== 2019-01-26 ==
* 22:48 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/486791
* 21:21 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/486737
== 2019-01-25 ==
* 19:25 thcipriani: reloading zuul to deploy https://gerrit.wikimedia.org/r/486503/
* 19:07 thcipriani: reloading zuul to deploy https://gerrit.wikimedia.org/r/486501/
* 14:56 hashar: contint1001: systemctl stop zuul-merger && find /srv/zuul/git -name .git -type d -print -execdir git gc --prune=now \;
* 13:35 hashar: flake8 broken under python2.7 due to configparser==3.5.2  https://github.com/jaraco/configparser/issues/27
* 00:58 thcipriani: integration-slave-jessie-1002:sudo rm -rf /srv/jenkins-workspace/workspace/* clean gradle cache, bring back online
== 2019-01-23 ==
* 22:15 thcipriani: reloading zuul to deploy https://gerrit.wikimedia.org/r/#/c/integration/config/+/486170/
== 2019-01-22 ==
* 23:46 thcipriani: Updating (more) tox docker image on contint1001 for https://gerrit.wikimedia.org/r/#/c/integration/config/+/485971/
* 23:24 thcipriani: integration-slave-docker-1017 clean docker images and repool
* 21:06 thcipriani: Updating docker-pkg files on contint1001 for https://gerrit.wikimedia.org/r/#/c/integration/config/+/485895/
* 20:18 legoktm: rebuilding php, composer, phan images for apt security update https://gerrit.wikimedia.org/r/485883
* 19:59 thcipriani: Updating docker-pkg files on contint1001 for https://gerrit.wikimedia.org/r/#/c/integration/config/+/485882/
* 19:28 legoktm: rebuilding ci support images for apt security upgrade https://gerrit.wikimedia.org/r/485874
* 19:09 legoktm: rebuilding ci-stretch/ci-jessie images https://gerrit.wikimedia.org/r/485873
* 19:08 legoktm: manually docker pulled docker-registry.wikimedia.org/wikimedia-jessie and docker-registry.wikimedia.org/wikimedia-stretch on contint1001
* 18:56 bearND: (beta) Update mobileapps to {{Gerrit|0aac268}} (fix pronunciation detection in mobile-sections [[phab:T214338|T214338]])
* 18:46 thcipriani: reloading zuul to deploy https://gerrit.wikimedia.org/r/#/c/integration/config/+/485870/ and https://gerrit.wikimedia.org/r/#/c/integration/config/+/485331/
* 14:31 addshore: reloaded zuul for REVERT: Add ArticlePlaceholder to gated extensions [integration/config] - https://gerrit.wikimedia.org/r/485754
* 13:26 addshore: reloaded zuul for Add ArticlePlaceholder to gated extensions [integration/config] - https://gerrit.wikimedia.org/r/485754
* 13:03 addshore: reload zuul for Add ArticlePlaceholder to Wikibase Tests [integration/config] - https://gerrit.wikimedia.org/r/485753
== 2019-01-21 ==
* 19:49 hashar: integration: update sudo rule for debian-glue to keep env variable EXTRAPACKAGES. Would let us get eatmydata included {{!}} [[phab:T214328|T214328]]
* 15:40 hashar: contint1001: removing all generated doc/cover from /srv/org/wikimedia/doc {{!}} [[phab:T137890|T137890]]
== 2019-01-18 ==
* 23:22 hashar: contint1001: sudo docker image prune  # Total reclaimed space: 3.592GB
* 23:00 Krinkle: Some docker builds on integration-slave-docker-1021 failing with ENOMEM
* 23:00 mutante: contint1001 - gzipping more files in /var/log/zuul/
* 22:57 mutante: contint1001 - moved zuul logs from 2018 and gzipped zuul logs from /var/log/zuul to /srv/logs/zuul to free disk space on /
* 22:39 mutante: contint1001 - apt-get clean - disk space low
* 22:31 Krinkle: Updating docker-pkg files on contint1001 for https://gerrit.wikimedia.org/r/482527  / [[phab:T212602|T212602]]
== 2019-01-17 ==
* 19:34 thcipriani: integration-slave-jessie-1002:sudo rm -rf /srv/jenkins-workspace/workspace/* and bring back online
* 08:39 legoktm: deploying composer docker image - https://gerrit.wikimedia.org/r/484853
== 2019-01-16 ==
* 21:11 bearND: (beta): Update mobileapps to {{Gerrit|258d76b}} page summary changes
== 2019-01-15 ==
* 09:00 hashar: Deleting Docker images on integration-slave-docker-1021
== 2019-01-14 ==
* 22:02 bearND: (beta): Update mobileapps to {{Gerrit|f2658de}}
* 21:47 mutante: deployment-mcs01 - sudo su deploy-service; cd /srv/deployment/mobileapps/deploy-cache/revs/1182b3b8f288df0221257b929ca43fb86862c2f8/scap ; touch log  (for debugging permission problem reported by bearND)
* 14:31 hashar: Nuked Castor cache for all *tox* jobs. Some might have cached binary wheels compiled against a lib that no longer exists (eg libmysqlclient.so.18 for mysql-python). Follow up to the jessie -> stretch upgrade # [[phab:T191764|T191764]]
* 14:28 hashar: Deleted Castor cache for wikimedia-cz/tracker: mysql-python got cached as a wheel but compiled against libmysqlclient.so.18. That fails with the new tox...:0.3.0 containers, which use mariadb / libmysqlclient.so compat symlink
== 2019-01-11 ==
* 20:48 thcipriani: repooling integration-slave-jessie-1003 after cleaning mvn and gradle cache

Revision as of 19:34, 5 June 2019

2019-06-05

  • 19:34 andrewbogott: moving deployment-imagescaler03 to cloudvirt1029
  • 17:35 James_F: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/514547
  • 16:01 James_F: Reloading Zuul to deploy 172a4e7
  • 09:07 hashar: Pooled in integration-slave-docker-1058 and integration-slave-docker-1059

2019-06-04

  • 20:39 James_F: Updating jjb PHP code coverage jobs to quibble-stretch-php70:0.0.31-5 T220917
  • 20:31 James_F: Updating docker-pkg files on contint1001 for quibble-stretch-php70 T220917
  • 17:44 James_F: Updating jjb PHP code coverage jobs to node10 quibble T224983
  • 17:31 James_F: Reloading Zuul to make mwext-MobileFrontend-npm-run-lint-modules-docker non-voting T224997
  • 16:10 hashar: Deleting integration-slave-docker-1021 and integration-slave-docker-1049 / too small disk (20G partition) and not enough ram (2G) # T221872
  • 13:24 hashar: Update all selenium-daily* jobs to use NodeJS 10 instead of NodeJS 6. T217545
  • 13:01 hashar: Building docker-registry.discovery.wmnet/releng/node10-test-browser:0.6.0 # T217545
  • 11:32 hashar: Upgrading Jenkins Pipeline plugins
  • 11:31 hashar: Upgrading Jenkins Warnings Next Generation Plugin # T224745
  • 11:30 hashar: Upgrading Jenkins BlueOcean and all its dependencies

2019-06-03

  • 19:32 James_F: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/514089
  • 19:29 James_F: Generated mwgate-node10-docker and deployed via jjb.
  • 18:34 hashar: switch most Quibble jobs to node 10 T222406 - https://gerrit.wikimedia.org/r/#/c/integration/config/+/514034/ T222406
  • 17:40 James_F: Reloading Zuul to switch most Quibble jobs to node 10 T222406
  • 15:58 hashar: Deleting integration-slave-docker-1055 and integration-slave-docker-1056 . CPU is way too slow T223971
  • 15:46 hashar: reduce number of executors on all integration slave docker, they are somehow starving on CPU and/or IO when a lot of mediawiki builds are running in parallel
  • 14:38 James_F: hashar and I are temporarily disabling running selenium tests in CI. See T211784 T222406 for more details.

2019-06-01

  • 22:54 James_F: Reloading Zuul to re-add Kartographer dependency for JsonConfig T224785
  • 21:14 James_F: Reloading Zuul to add phan for CentralNotice and add DannyS712 to whitelist

2019-05-31

  • 23:34 James_F: Reloading Zuul to remove Kartographer dependency for JsonConfig T224785
  • 22:12 James_F: Reloading Zuul to add phan for LdapAuthentication
  • 21:18 James_F: Reloading Zuul to add dependencies for TimedMediaHandler, ReadingLists, JsonConfig, and FlaggedRevs.
  • 21:13 James_F: Reloading Zuul to add phan for Collection, ContentTranslation, and Jade
  • 19:43 James_F: Reloading Zuul to add phan for Math(s)
  • 19:31 James_F: Reloading Zuul to add phan for MobileFrontend
  • 19:20 James_F: Reloading Zuul to add phan for Sentry
  • 18:35 James_F: Reloading Zuul to deploy CentralNotice dependency to EventBus
  • 18:17 James_F: Reloading Zuul to deploy extra dependencies for MobileFrontend
  • 18:10 James_F: Reloading Zuul to deploy phan for cldr and GlobalPreferences
  • 18:02 James_F: Reloading Zuul to deploy I0ee1c8166
  • 17:56 James_F: Reloading Zuul to deploy I9c115b1f6 and I0d98f01d

2019-05-30

2019-05-29

2019-05-28

2019-05-26

2019-05-22

2019-05-21

2019-05-20

2019-05-19

  • 15:43 hashar: Purging webperformance.integration.eqiad.wmflabs /tmp directory (inodes full)
  • 15:43 hashar: Purging all images on Docker CI slaves
  • 08:03 legoktm: deployed https://gerrit.wikimedia.org/r/511122

2019-05-18

2019-05-17

  • 22:55 paladox: created mediawiki/extensions/Scribe gerrit repo - T223662
  • 00:35 brennen: Updating dev-images docker-pkg files on contint1001

2019-05-16

  • 09:36 hashar: Successfully tagged docker-registry.discovery.wmnet/releng/composer-test-php72:0.1.0 # T223428
  • 07:35 hashar: integration-slave-jessie-1002: purging all php Debian packages and rerunning puppet. There is some package conflict somewhere :-(

2019-05-15

  • 11:55 hashar: Bring back https://integration.wikimedia.org/ci/computer/integration-castor03/ to restore the central cache behavior. In turn unblocking a wide range of builds
  • 11:32 hashar: bringing back castor, integration-castor03 is back
  • 09:35 hashar: Regenerating all CI jobs to disable castor saving entirely
  • 02:35 Krenair: Logged into deployment-sca0[12] as root and given them the correct nameservers to try to unbreak things. T221654
  • 01:00 thcipriani: /usr/local/sbin/keyholder arm

2019-05-14

  • 09:43 hashar: Upgraded all tox jobs to tox 3.10.0
  • 09:19 hashar: Building docker containers releng/tox-*:0.4.0

2019-05-13

2019-05-10

  • 18:28 Reedy: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/509483
  • 18:08 Reedy: Reloading Zuul to deploy dependencies, new tests, phan
  • 17:53 Reedy: Reloading Zuul to deploy 5 more phan patches
  • 17:37 Reedy: Reloading Zuul to deploy 5 phan enabling patches
  • 17:23 Reedy: Reloading Zuul to deploy various phan additions and one dependency
  • 17:08 Reedy: Reloading Zuul to deploy patches adding phab
  • 16:46 Reedy: Reloading Zuul to deploy various dependency patches

2019-05-09

  • 08:17 elukey: remove mediawiki memcached nutcracker config from deployment-prep (should be unused) - T214275
  • 00:19 thcipriani: updating docker images on contint1001 for https://gerrit.wikimedia.org/r/508929
  • 00:19 thcipriani: clean docker images on contint1001

2019-05-07

2019-05-06

2019-05-03

2019-05-01

  • 17:53 halfak: deploying ores:52e9759

2019-04-30

  • 18:00 thcipriani: reloading zuul to deploy https://gerrit.wikimedia.org/r/494778
  • 17:01 bearND: (beta): Update mobileapps to 142ba30
  • 02:01 hashar: Pooled in integration-slave-docker-1055 and integration-slave-docker-1056
  • 01:35 hashar: Deleting integration-slave-docker-1037 (bigram) it is too slow for some reason # T222023

2019-04-29

  • 14:04 godog: add dsharpe user

2019-04-26

  • 13:33 Krenair: shut off deployment-conf03 after discussion with otto.mata and elu.key - it seems ancient, broken, unused. T218729

2019-04-25

  • 16:35 Krenair: shutting down deployment-ms-fe02 and deployment-poolcounter04 T218729

2019-04-23

2019-04-19

  • 19:56 mutante: phab1003 - editing /srv/deployment/phabricator/deployment-cache/.config manually to replace tin.eqiad.wmnet with deploy1001.eqiad.wmnet to fix git cloning issue on first puppet run on new host where somehow tin.eqiad still shows up. fixes puppet run on T221389

2019-04-18

2019-04-17

2019-04-16

  • 17:36 Lucas_WMDE: lucaswerkmeister-wmde@deployment-deploy01:~$ mwscript extensions/WikibaseQualityConstraints/maintenance/ImportConstraintEntities.php --wiki=wikidatawiki --config-format=wgConf | tee T221107.php

2019-04-15

  • 10:16 Amir1: ores:8f01d40 going beta
  • 08:51 hashar: castor: nuked /srv/jenkins-workspace/caches/castor-mw-ext-and-skins/master/mwselenium-quibble-docker # T220948

2019-04-13

  • 21:05 Krinkle: Deleting a bunch of job config+history from Jenkins for jobs that no longer exist in JJB/Zuul. T91410
  • 21:00 Krinkle: Deleting a bunch of job config+history from Jenkins for jobs that no longer exist in JJB/Zuul.
  • 21:00 Krinkle: Updating docker-pkg files on contint1001 for https://gerrit.wikimedia.org/r/503669
  • 18:52 Krinkle: "Your JENKINS_HOME (/var/lib/jenkins) is almost full. "
  • 18:16 Krinkle: Updating docker-pkg files on contint1001 for https://gerrit.wikimedia.org/r/503664
  • 17:44 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/502807 (postgres php72)
  • 00:06 Krenair: transferred /home from deployment-cache-upload04 to deployment-cache-upload05 and shut down old one
  • 00:06 Krenair: transferred /home from deployment-cumin to deployment-cumin02 and shut down old one

2019-04-12

  • 15:09 Krenair: upload traffic now through cache-upload05

2019-04-11

2019-04-10

  • 23:32 James_F: Manually created REL1_33 branches for the core, vendor, and tarball extensions and skins. Eurgh. T220653
  • 22:36 James_F: Deleted faulty REL1_33 branches for the Timeless, Vector and Monobook skins; they were duplicates of the REL1_32 branches.
  • 20:27 paladox: create operations/software/gerrit/plugins/MassBranchCreation repository
  • 14:16 hashar: contint1001: sudo -u zuul git -C /srv/zuul/git/mediawiki/core remote prune origin # T220606
  • 14:13 hashar: contint2001: sudo -u zuul git -C /srv/zuul/git/mediawiki/core remote prune origin # T220606
  • 13:00 Krinkle: Updating docker-pkg files on contint1001 for https://gerrit.wikimedia.org/r/502790
  • 12:46 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/502786
  • 12:40 hashar: contint2001: stopped puppet and zuul-merger for debugging

2019-04-09

2019-04-08

  • 22:46 hashar: cleaned docker images on integration-slave-docker-1030 and integration-slave-docker-1043 :)
  • 20:06 bearND: (beta): Update mobileapps to cdb9928 (T220045 T219411 T219667)
  • 14:47 hashar: hard rebooting integration-slave-docker-1053 OOM / deadlocked
  • 13:39 hashar: Deleting integration-slave-docker-1045 again. It uses Stretch instead of Jessie
  • 13:21 hashar: integration: fix cumin profiles that got renamed by 9e0aa82
  • 10:15 hashar: Building Docker image releng/quibble-stretch-php73:0.0.31-2 # T220237

2019-04-07

2019-04-06

2019-04-05

2019-04-04

2019-04-03

  • 06:13 hashar: gerrit: renamed group "scholarships" to "wikimedia-wikimania-scholarships". Made it owned by "Gerrit Managers" # T218864

2019-04-02

  • 22:24 hauskatze: maurelio@deployment-deploy01:~$ mwscript extensions/PageAssessments/maintenance/purgeUnusedProjects.php --wiki=enwikivoyage | T219935
  • 09:46 hashar: Upgrading CI Quibble jobs to 0.0.31

2019-04-01

  • 21:42 hauskatze: Imported tool-ldap from Diffusion to Gerrit with full history | T219703
  • 21:33 hauskatze: Created https://gerrit.wikimedia.org/r/#/admin/projects/labs/tools/ldap | T219703
  • 20:53 hashar: ssh contint1001.wikimedia.org sudo rm /tmp/docker-pkg-build.log
  • 20:44 hashar: Building Quibble 0.0.31 containers again # T219647 T219786
  • 20:10 hashar: gerrit: flush-caches --cache git_tags # some tag got stalled when querying over https - T219786
  • 18:50 hauskatze: Created mediawiki/extensions/ContributionCredits.git per request on mediawiki.org
  • 18:00 Krinkle: Updating docker-pkg files on contint1001 for https://gerrit.wikimedia.org/r/#/c/integration/config/+/500504/
  • 17:25 Krinkle: fresnel-node10-browser-docker failing with ENOMEM. Depooled integration-slave-docker-1049 as precaution.
  • 17:04 hashar: Building CI docker images for Quibble 0.0.31 (yes it is a long day...)
  • 13:50 hashar: Reverted CI Jenkins jobs to Quibble 0.0.28 # T219647
  • 13:11 hashar: Upgraded CI Jenkins jobs to Quibble 0.0.30 # T219647
  • 10:50 hashar: Manually triggering postmerge step of citoid due to T219017 for mvolz. On contint1001: zuul enqueue --trigger gerrit --pipeline postmerge --project mediawiki/services/citoid --change 497315,1
  • 08:08 hashar: Rebuilding Quibble Jessie containers that failed to build last week due to wikimedia-jessie container. # T219647
  • 08:07 hashar: Rebuilding container docker-registry.wikimedia.org/wikimedia-jessie # T219683

2019-03-29

2019-03-28

2019-03-27

  • 22:27 James_F: Altered Wikimedia GitHub settings to require 2FA; see T198810
  • 14:16 hashar: contint: refreshed all git caches manually from cumin: cumin --force 'name:docker' 'find /srv/git -name '*.git' -type d -print -exec git -C {} fetch --prune \;'
  • 14:16 hashar: contint: added repositories to the git caches https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/499482/

2019-03-26

2019-03-25

  • 14:46 mateusbs17: sunset deployment-maps03

2019-03-24

  • 16:06 Krenair: shut off old deployment-db03 instance per T219087
  • 05:39 Krenair: cleaned up old puppet certs/nodes -certcentral-testclient03 -certcentral-testdns -certcentral03 -zotero01 -eventgate-analytics -t153468-test -rd3-cptest-master01 -maps05
  • 04:12 Krenair: removed php7.0-fpm package (conflicting with php7.2-fpm) and removed /etc/nginx/sites-enabled/default (conflicting with apache, puppet will remove the available copy too) from -deploy02, -jobrunner03, -mwmaint01, and -mediawiki-07 hosts to try to get puppet there happy again

2019-03-23

2019-03-22

2019-03-21

2019-03-20

2019-03-19

  • 22:47 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/497322 / T218553
  • 21:00 paladox: hiding operations/software/gerrit/plugins/WikimediaWebSessions from normal users
  • 20:59 paladox: create operations/software/gerrit/plugins/WikimediaWebSessions project T218739
  • 20:23 hashar: integration: sudo cumin --force '*' 'rm /etc/apt/preferences.d/jessie_mitaka_pinning_*' # T218559
  • 19:56 hashar: integration: sudo cumin --force '*' 'rm /etc/apt/sources.list.d/openstack-mitaka-jessie.list' # T218559
  • 16:56 Reedy: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/494802
  • 15:48 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/497437
  • 14:01 hashar: Removed Zuul "check" pipeline | https://gerrit.wikimedia.org/r/#/c/integration/config/+/493188/ | T192217
  • 02:21 Krenair: sudo aptly publish --architectures="all,amd64" --skip-signing repo buster-deployment-prep
  • 02:09 Krenair: sudo aptly repo create -component="main" -distribution="buster-deployment-prep" buster-deployment-prep

2019-03-18

  • 16:38 Krenair: created deployment-acme-chief01 and a client instance for further acme-chief testing + dev. used stretch, would be buster like prod but not sure that's easily available outside testlabs yet
  • 10:09 hashar: contint1001: rm -fR /srv/doc1001.eqiad.wmnet
  • 10:06 hashar: github: deleting https://github.com/wikimedia/wikidata-gremlin # archived T155829
  • 09:58 hashar: arming keyholder on integration-cumin
  • 09:55 hashar: deleting shut-down instance integration-publisher02, we do not use it anymore since doc publishing got overhauled (T137890) # T218146
  • 09:12 hashar: deployment-deploy01: cleaning disk: rm /var/cache/hhvm/cli.hhbc.sq3
  • 07:37 legoktm: deployed https://gerrit.wikimedia.org/r/496610

2019-03-17

  • 16:29 Krenair: deactivated and cleaned certs for deployment-redis3-changeprop02 and deployment-prometheus01 (which no longer appear to exist but were causing cumin to be upset)
  • 16:27 Krenair: armed deployment-cumin keyholder
  • 00:08 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/496887

2019-03-16

2019-03-15

2019-03-14

2019-03-13

  • 21:26 thcipriani: pool new bigram CI instance integration-slave-docker-1054
  • 20:58 thcipriani: deleting bigram CI instance integration-slave-docker-1046 due to corrupt disk cf: T218245
  • 20:44 thcipriani: marking integration-slave-docker-1046 offline
  • 20:38 hashar: Added integration-slave-docker-1045 to Jenkins. The instance existed in WMCS but was not in Jenkins
  • 20:13 bearND: (beta): Update mobileapps to 5865552 (7074964 d6dc3cd fbc6262)
  • 19:13 hashar: integration-slave-docker-1046 is back online # T218245
  • 19:04 hashar: hard rebooting integration-slave-docker-1046 , not reachable over ssh # T218245
  • 18:48 hashar: Building containers releng/quibble-jessie-hhvm and releng/quibble-stretch-hhvm with HHVM core_dump_report enabled # T216689
  • 18:09 ebernhardson: restart elasticsearch on deployment-elastic* to deploy apifeature usage fix (T183156)
  • 10:39 hashar: Triggered tests for ContentTranslation changes that had label:verified=-1 # T217654
  • 10:03 hashar: Bump Quibble tmpfs disk space used to hold the database from 256 MBytes to 320 MBytes. l10n_cache causes an overflow # T217654

2019-03-12

2019-03-11

  • 15:59 Krenair: previous !log was for deployment-prep
  • 15:56 Krenair: added MSantos to projectadmin, per chat in -infrastructure he'll use it to work on maps. user is a new foundation software engineer
  • 01:20 legoktm: deploying https://gerrit.wikimedia.org/r/495593

2019-03-10

2019-03-08

2019-03-06

  • 18:25 hasharAway: contint1001: restart Jenkins for plugins upgrade
  • 18:24 hasharAway: reloading zuul for I21d7fa
  • 18:02 hashar: Upgrading plugins on https://releases-jenkins.wikimedia.org
  • 17:37 hashar: Reloading Zuul for mediawiki/libs/Zest https://gerrit.wikimedia.org/r/#/c/integration/config/+/494609/2/zuul/layout.yaml
  • 16:27 Krinkle: doc1001 had bad permissions set on /org/wikimedia/doc (chmod 755 instead of 775), making it impossible to git pull in the way that the post-merge Jenkins job on integration/docroot recommends. Fixed with `sudo -u doc-uploader chmod 775 /srv/wikimedia/org/wikimedia/doc`.
  • 15:58 andrewbogott: deleting deployment-prometheus01 on Filippo's advice
  • 13:31 gehel: upgrading logstash to 5.6.14 on deployment-logstash2
  • 13:15 gehel: upgrading elasticsearch to 5.6.14 on deployment-logstash2
  • 11:33 Lucas_WMDE: lucaswerkmeister-wmde@deployment-mediawiki-09:~$ sudo systemctl restart php7.2-fpm # T217323

2019-03-05

  • 20:10 thcipriani: reenable beta-scap-eqiad
  • 19:17 thcipriani: disable beta-scap-eqiad due to T217587
  • 10:13 hashar: integration: fixed erroneous ssh key restriction for cumin | T217642

2019-03-04

2019-03-03

  • 02:26 Krinkle: tried rebooting or shutting down integration-slave-docker-1021, no response on horizon. Did pause/resume instead, which did work, after which shutdown/start worked. Jenkins agent has been relaunched and seems online again.
  • 02:20 Krinkle: integration-slave-docker-1021 (ci1.medium) has jobs failing on it due to ENOMEM. Horizon shows in log: integration-slave-docker-1021 login: [4961938.696837] Out of memory: Kill process 21770 (chromium) score 841 or sacrifice child; [4961938.699176] Killed process 21770 (chromium) total-vm:3171496kB, anon-rss:1379288kB, file-rss:0kB, shmem-rss:1636kB

2019-03-02

2019-03-01

  • 19:17 thcipriani: integration-slave-docker-1021:/# docker rmi $(docker images | grep " months " |grep -v " [1-2] months " | awk '{print $3}')
  • 17:02 thcipriani: integration-slave-jessie-1004 back online
  • 16:58 thcipriani: integration-slave-jessie-1002 back online (disk space looked fine); rebooting integration-slave-jessie-1004 -- can't ssh to machine
  • 16:11 Lucas_WMDE: delete refs/master and refs/gerrit/master on WikibaseQualityConstraints repository T217408
  • 15:49 hashar: wikidata/query/blazegraph change Gerrit config to require a change-id # T216855
  • 14:28 hashar: Upgrading integration/jenkins-job-builder to version 2.0.2 + one custom hack 11aa5de4...a06d173e # T143731
  • 14:18 hashar: integration/jenkins-job-builder : importing upstream code to new branch "upstream". Push all upstream tags to our repository
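
The 19:17 entry above selects images by piping `docker images` through two `grep` filters and `awk '{print $3}'`. A minimal Python sketch of the same selection logic follows. The sample table is hypothetical (it is not output from integration-slave-docker-1021), and the function name `old_image_ids` is made up for illustration.

```python
import re


def old_image_ids(docker_images_output):
    """Return image IDs whose age column reads three or more months."""
    ids = []
    for line in docker_images_output.splitlines():
        # grep " months ": keep only rows whose age is expressed in months
        if " months " not in line:
            continue
        # grep -v " [1-2] months ": drop rows that are 1 or 2 months old
        if re.search(r" [1-2] months ", line):
            continue
        # awk '{print $3}': the third whitespace-separated column is IMAGE ID
        ids.append(line.split()[2])
    return ids


# Hypothetical sample shaped like `docker images` output.
SAMPLE = """\
REPOSITORY          TAG     IMAGE ID       CREATED         SIZE
example/image-a     0.1.0   aaaaaaaaaaaa   5 months ago    1.2GB
example/image-b     0.2.0   bbbbbbbbbbbb   2 months ago    1.1GB
example/image-c     0.3.0   cccccccccccc   3 weeks ago     900MB
example/image-d     0.0.1   dddddddddddd   12 months ago   800MB
"""

print(old_image_ids(SAMPLE))
```

Because the second pattern requires a space directly before the digit, an image listed as "12 months ago" is not excluded and is still selected for removal.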

2019-02-28

2019-02-27

2019-02-26

2019-02-25

  • 23:32 twentyafterfour: root@deployment-db05# mariabackup --innobackupex --apply-log --use-memory=10G /srv/sqldata # T216067
  • 22:14 thcipriani: docker rmi images without "latest" tag on contint1001 to free space -- should have kept all current docker-pkg images as well as images with children -- T217094
  • 13:39 hashar: Rebuilding some CI Docker images using PHP sury.org to switch the sury.org component from jessie to stretch ( https://gerrit.wikimedia.org/r/#/c/integration/config/+/492666/ )

2019-02-24

2019-02-23

2019-02-22

2019-02-21

2019-02-20

  • 19:45 hashar: deployment-db03: restored some old /var/lib/dpkg/status file : sudo zcat /var/backups/dpkg.status.2.gz | sudo tee /var/lib/dpkg/status # T216635
  • 17:26 hashar: For beta cluster the MySQL master database has some innodb issue T216635 , the MySQL slave has an issue as well T216067
  • 17:09 hashar: reloading zuul for Id1e3af
  • 17:09 hashar: contint1001: fix broken root ownership on zuul git deploy repo: sudo find /etc/zuul/wikimedia/.git -not -user zuul -exec chown zuul:zuul {} +
  • 17:05 thcipriani: reloading zuul to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/490606/
  • 16:57 thcipriani: reloading zuul to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/490559
  • 16:36 hauskatze: Ran replication start mediawiki/extensions/PageViewInfo --wait on gerrit.wikimedia to populate GitHub mirror (success messages afterwards) | T180864
  • 00:25 greg-g: disabled beta-update-databases-eqiad in the jenkins UI - T216067
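
The 17:09 ownership fix above relies on `find ... -not -user zuul` to locate paths with the wrong owner before `chown` is applied. As a rough sketch of that lookup only (it changes nothing on disk), the following Python walks a tree and reports paths whose owner differs from a given numeric uid. The function name `paths_not_owned_by` is hypothetical, and the sketch assumes a POSIX system.

```python
import os


def paths_not_owned_by(root, uid):
    """List root and everything below it whose owner uid differs from uid."""
    candidates = [root]
    for dirpath, dirnames, filenames in os.walk(root):
        candidates.extend(
            os.path.join(dirpath, name) for name in dirnames + filenames
        )
    # lstat() inspects symlinks themselves rather than their targets,
    # which matches find's default behaviour (no -L).
    return [path for path in candidates if os.lstat(path).st_uid != uid]
```

A caller could pass the resulting list to `os.chown` after resolving the target user and group, which is the step the logged command performs with `-exec chown zuul:zuul {} +`.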

2019-02-19

2019-02-18

2019-02-17

2019-02-16

2019-02-15

  • 17:28 thcipriani: integration-slave-jessie-1002:/srv/jenkins-workspace/workspace$ `sudo rm -rf *` due to full disk

2019-02-14

2019-02-13

  • 21:32 marxarelli: dduvall@integration-slave-jessie-1001:/mnt/home/jenkins-deploy$ `rm -rf .gradle/ .m2/` due to full disk
  • 21:21 marxarelli: bringing integration-slave-docker-1046 and integration-slave-jessie-1001 back online
  • 21:20 marxarelli: dduvall@integration-slave-jessie-1001:/srv/jenkins-workspace/workspace$ `sudo rm -rf *` due to full disk
  • 21:15 marxarelli: removing old docker images on integration-slave-docker-1046
  • 21:10 marxarelli: starting migrated integration-slave-docker-1046 instance
  • 21:01 marxarelli: pooling new jenkins node for integration-slave-docker-1052
  • 20:46 marxarelli: pooling jenkins node for integration-slave-docker-1051
  • 20:45 marxarelli: launching replacement instance integration-slave-docker-1052
  • 20:35 marxarelli: launching replacement instance integration-slave-docker-1051
  • 20:32 marxarelli: pooling jenkins node for integration-slave-docker-1050
  • 20:15 marxarelli: integration-slave-docker-{1044,1046,1047} unresponsiveness due to cloudvirt failure. 1046 is being moved already by CS. deleting 1044 and 1047
  • 19:57 marxarelli: seeing jenkins agent connection failures for integration-slave-docker-{1044,1046,1047}
  • 19:48 marxarelli: pooling replacement jenkins node integration-slave-docker-1049
  • 19:34 marxarelli: deleting integration-slave-jessie-android jenkins node and instance
  • 19:33 marxarelli: deleting integration-slave-jessie-1003 jenkins node and instance
  • 19:32 marxarelli: deleting integration-slave-docker-1033 jenkins node and instance
  • 19:25 marxarelli: deleting integration-slave-docker-1017 jenkins node and instance
  • 18:45 Krinkle: integration-slave-jessie-1003 seems to be consistently unable to start jobs, marking as offline manually
  • 18:32 thcipriani: bringing up new integration-castor03, re-enabling castor-save* jobs
  • 18:15 marxarelli: adding new jenkins node integration-slave-docker-1048
  • 18:02 marxarelli: launching new integration-slave-docker-1048 instance
  • 17:59 marxarelli: deleting integration-slave-docker-1038 node and deleting instance
  • 17:50 marxarelli: bringing integration-slave-docker-1033 back online after clearing out old docker images
  • 17:33 thcipriani: rebuilding integration-castor03
  • 17:21 thcipriani: stopping rsync server on castor03
  • 17:21 twentyafterfour: stopped rsync on castor03
  • 17:16 twentyafterfour: disconnected castor03 from jenkins
  • 16:48 thcipriani: reloading zuul to deploy https://gerrit.wikimedia.org/r/#/c/integration/config/+/487880/
  • 14:34 thcipriani: modified castor-save-workspace-cache to exit 0 and run on blubber nodes while integration-castor03 is down
  • 14:26 dcausse: deployment-prep: upgrading to elastic 5.6.14

2019-02-12

2019-02-11

2019-02-10

2019-02-08

  • 20:20 Krinkle: Delete various jobs on Jenkins that no longer exist in JJB config, ref T91410
  • 15:59 addshore: this reload also included "Switch npm-audit job to node10"? T211784, which did touch the zuul file
  • 15:58 addshore: reloaded zuul for https://gerrit.wikimedia.org/r/#/c/integration/config/+/489241/
  • 03:10 Krinkle: Delete various jobs on Jenkins that no longer exist in JJB config
  • 00:28 Krinkle: krinkle@doc1001: sudo -u doc-uploader chmod 775 /srv/docroot/org/wikimedia/doc/
  • 00:12 marxarelli: removed old docker images on contint1001 to free up space

2019-02-07

  • 23:17 thcipriani: integration-slave-jessie-1003:sudo rm -rf /srv/jenkins-workspace/workspace/*
  • 23:15 thcipriani: integration-slave-docker-1033:sudo docker image prune and bring back online
  • 22:28 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/467550 (
  • 19:09 paladox: created integration/zuul/build gerrit repo for T215458
  • 19:05 paladox: created integration/zuul/wheels gerrit repo for T215458
  • 15:48 addshore: brought integration-slave-docker-1043 back online
  • 15:48 addshore: addshore@integration-slave-docker-1043:~$ sudo docker image prune -a --force --filter "until=2191h" // (3 months?) Total reclaimed space: 14.86GB
  • 08:49 hashar: cleaning docker images on integration-slave-docker-1021

2019-02-06

  • 22:34 shdubsh: Deploy node-exporter 0.17 T213708
  • 14:12 godog: shut off deployment-prometheus01 - T215272
  • 14:00 godog: switch beta-prometheus to deployment-prometheus02 - T215272

2019-02-05

  • 20:07 ebernhardson: jobrunner port 9006 is firewalled, revert to 9005 and created T215339 to fix job queue in beta cluster
  • 19:36 ebernhardson: Update profile::cpjobqueue::{jobrunner,videoscaler}_host in horizon hiera from port 9005 to 9006 to match new restrictions in gerrit.wikimedia.org/r/481866
  • 16:29 addshore: T215288 added mirrys to deployment-prep as a user
  • 15:32 addshore: T215278 addshore@integration-slave-docker-1037:~$ sudo docker image prune -a --force --filter "until=2191h" // (3 months?) Total reclaimed space: 16.59GB

2019-02-04

  • 23:13 thcipriani: integration-slave-docker-1040:sudo docker image prune and bring back online
  • 23:12 thcipriani: integration-slave-docker-1038:sudo docker image prune and bring back online
  • 21:48 ebernhardson: restart logstash on deployment-logstash2
  • 15:25 hashar: removed Jenkins user "nodepoolmanager" as well as related authorizations | T209361

2019-02-03

2019-02-02

  • 22:17 legoktm: legoktm@integration-slave-jessie-1004:/srv/jenkins-workspace/workspace$ sudo rm -rf *

2019-01-31

  • 15:03 thcipriani: rearm keyholder on deployment-deploy01
  • 12:05 arturo: VM instances deployment-deploy01,deployment-deploy02,deployment-fluorine02,deployment-kafka-jumbo-2,deployment-kafka-main-1,deployment-maps04,deployment-mcs01,deployment-mediawiki-09,deployment-memc04,deployment-ms-be03,deployment-ms-fe02,deployment-parsoid09,deployment-sca04,deployment-webperf12 were stopped briefly due to an issue in the hypervisor (T215012)

2019-01-30

2019-01-29

  • 07:41 legoktm: legoktm@integration-slave-jessie-1001:/srv/jenkins-workspace/workspace$ sudo rm -rf * b/c full disk

2019-01-28

  • 16:33 hashar: contint1001: cleaning up disk space on /
  • 13:07 addshore: bringing integration-slave-docker-1041 back online
  • 13:07 addshore: addshore@integration-slave-docker-1041:~$ sudo docker image prune -a --force --filter "until=2191h" // (3 months?) Total reclaimed space: 16.12GB
  • 09:37 Amir1: ores:ad160b0 is going beta

2019-01-27

  • 19:57 addshore: bringing integration-slave-docker-1034 back online
  • 19:50 addshore: addshore@integration-slave-docker-1034:~$ sudo docker image prune -a --force --filter "until=2191h" // (3 months?) Total reclaimed space: 17.12GB
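
The `--filter "until=2191h"` value in the 19:50 entry above (and in the similar later entries) carries a "(3 months?)" question mark. A quick arithmetic check in Python suggests the guess is about right, assuming a "month" is taken as one twelfth of a 365.25-day year. The constant names are illustrative only.

```python
HOURS_PER_DAY = 24
DAYS_PER_YEAR = 365.25  # assumption: average year length including leap years

until_hours = 2191  # value passed to docker image prune in the log entries

# Convert the filter value to days.
until_days = until_hours / HOURS_PER_DAY

# Length of a quarter year (three average months) in hours.
quarter_year_hours = DAYS_PER_YEAR * HOURS_PER_DAY / 4

print(f"{until_hours}h is about {until_days:.1f} days")
print(f"a quarter year is {quarter_year_hours} hours")
```

Under that assumption a quarter year is 2191.5 hours, so 2191h is three months to within half an hour.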

2019-01-26

2019-01-25

2019-01-23

2019-01-22

2019-01-21

  • 19:49 hashar: integration: update sudo rule for debian-glue to keep env variable EXTRAPACKAGES. Would let us get eatmydata included | T214328
  • 15:40 hashar: contint1001: removing all generated doc/cover from /srv/org/wikimedia/doc | T137890

2019-01-18

  • 23:22 hashar: contint1001: sudo docker image prune # Total reclaimed space: 3.592GB
  • 23:00 Krinkle: Some docker builds on integration-slave-docker-1021 failing with ENOMEM
  • 23:00 mutante: contint1001 - gzipping more files in /var/log/zuul/
  • 22:57 mutante: contint1001 - moved zuul logs from 2018 and gzipped zuul logs from /var/log/zuul to /srv/logs/zuul to free disk space on /
  • 22:39 mutante: contint1001 - apt-get clean - disk space low
  • 22:31 Krinkle: Updating docker-pkg files on contint1001 for https://gerrit.wikimedia.org/r/482527 / T212602

2019-01-17

  • 19:34 thcipriani: integration-slave-jessie-1002:sudo rm -rf /srv/jenkins-workspace/workspace/* and bring back online
  • 08:39 legoktm: deploying composer docker image - https://gerrit.wikimedia.org/r/484853

2019-01-16

  • 21:11 bearND: (beta): Update mobileapps to 258d76b page summary changes

2019-01-15

  • 09:00 hashar: Deleting Docker images on integration-slave-docker-1021

2019-01-14

  • 22:02 bearND: (beta): Update mobileapps to f2658de
  • 21:47 mutante: deployment-mcs01 - sudo su deploy-service; cd /srv/deployment/mobileapps/deploy-cache/revs/1182b3b8f288df0221257b929ca43fb86862c2f8/scap ; touch log (for debugging permission problem reported by bearND)
  • 14:31 hashar: Nuked Castor cache for all *tox* jobs. Some might have cached binary wheels compiled against a library that no longer exists (eg libmysqlclient.so.18 for mysql-python). Follow-up to the jessie -> stretch upgrade # T191764
  • 14:28 hashar: Deleted Castor cache for wikimedia-cz/tracker: mysql-python got cached as a wheel but compiled against libmysqlclient.so.18. That fails with the new tox...:0.3.0 containers, which use the mariadb / libmysqlclient.so compat symlink

2019-01-11

2019-01-09

2019-01-08

  • 21:52 thcipriani: reloading zuul to deploy https://gerrit.wikimedia.org/r/#/c/integration/config/+/476600/
  • 21:31 Hauskatze: github: @niedzielski updated @jdlrobson permission on Wikimedia from `read` to `admin`
  • 20:46 thcipriani: reloading zuul to deploy https://gerrit.wikimedia.org/r/#/c/integration/config/+/482855/
  • 19:53 mutante: deployment-prep adjusting puppet config on deployment-mwmaint01. remove "mediawiki_maintenance" role from "other classes" section and apply "mediawiki::maintenance" instead after role rename in gerrit:479131 for consistency with other mediawiki:: roles
  • 14:25 hashar: Upgrading plugins on https://releases-jenkins.wikimedia.org/
  • 09:19 hashar: gerrit: resaved configuration for All-Projects by changing "Max Reviewers" from 3 to 4. Might enable adding reviewers automatically based on git blame. See task for config diff # T101131
  • 05:37 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/482752
  • 02:45 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/482751

2019-01-07

2019-01-06

2019-01-03

2019-01-02

  • 10:19 hashar: updating all debian-glue jobs and creating new ones with hardcoded distributions (trusty, jessie, stretch, unstable) T210780

2019-01-01

  • 15:33 hashar: contint1001: deleting some extensions documentation for wmf branches: rm -fR /srv/org/wikimedia/doc/{Kartographer,MinervaNeue,MobileFrontend,Wikibase}/wmf # T118599
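
The `rm -fR` command in the 15:33 entry above depends on shell brace expansion to name four directories at once. The sketch below shows, in Python, which paths such a pattern expands to. It is a simplified illustration rather than a reimplementation of the shell's rules (for example, it also expands a single-item group, which bash leaves alone), and `expand_braces` is a hypothetical helper name.

```python
import re


def expand_braces(pattern):
    """Expand comma-separated {a,b,c} groups into a list of strings."""
    match = re.search(r"\{([^{}]*)\}", pattern)
    if match is None:
        return [pattern]
    prefix = pattern[:match.start()]
    suffix = pattern[match.end():]
    expanded = []
    for alternative in match.group(1).split(","):
        # Recurse so that any further groups in the pattern are expanded too.
        expanded.extend(expand_braces(prefix + alternative + suffix))
    return expanded


for path in expand_braces(
    "/srv/org/wikimedia/doc/{Kartographer,MinervaNeue,MobileFrontend,Wikibase}/wmf"
):
    print(path)
```

Printing the expansion before running a destructive command is a cheap way to confirm that only the intended directories are targeted.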

Archives