Release Engineering/SAL: Difference between revisions

imported>Stashbot
(hashar: Log refresh Nodepool instances to deploy slave script update to be able to merge mediawiki/composer.json into vendor/composer.json 6527f49..a7728a5 https://gerrit.wikimedia.org/r/#/c/339202/ T158674)
imported>Stashbot
(James_F: Zuul: [mediawiki/extensions/PagePermissions] Add basic tests)
 
== 2017-02-24 ==
== 2022-12-02 ==
* 13:56 hashar: Log refresh Nodepool instances to deploy slave script update to be able to merge mediawiki/composer.json into vendor/composer.json  6527f49..a7728a5  https://gerrit.wikimedia.org/r/#/c/339202/ [[phab:T158674|T158674]]
* 16:07 James_F: Zuul: [mediawiki/extensions/PagePermissions] Add basic tests
* 13:52 hashar: deployed slave script update to be able to merge mediawiki/composer.json into vendor/composer.json  6527f49..a7728a5  https://gerrit.wikimedia.org/r/#/c/339202/ [[phab:T158674|T158674]]
* 10:59 slyngs: Updating development images bullseye-openldap and puppet-debugger for merge request 22 and 24
* 01:51 James_F: Zuul: [mediawiki/extensions/WatchAnalytics] Fix name
* 01:04 James_F: Zuul: [mediawiki/extensions/ImageSuggestions] Add phan CirrusSearch and Elastica dependencies for [[phab:T302711|T302711]]
* 01:02 James_F: Zuul: [mediawiki/libs/Bcp47Code] Add basic test


== 2017-02-23 ==
== 2022-12-01 ==
* 18:35 greg-g: 18:29 <  chasemp> !log labnodepool1001:~# service nodepool restart
* 17:25 James_F: Zuul: [mediawiki/extensions/Wikisource] Add ProofreadPage dependency
* 09:27 hashar: Clearing skins from testextension jobs [[phab:T117710|T117710]] salt -v '*slave*' cmd.run 'rm -fR /srv/jenkins-workspace/workspace/mwext-testextension*/src/skins/*'
* 17:22 James_F: Zuul: [operations/software/bitu-ldap] Add basic tox CI
* 14:27 James_F: Zuul: [mediawiki/extensions/WatchAnalyics] Add basic CI
* 14:20 James_F: Zuul: [css-sanitizer] Re-disable PHP 8.1 job, still broken [[phab:T311451|T311451]]


== 2017-02-22 ==
== 2022-11-30 ==
* 20:58 hashar: Deleted jenkins job pplint-HEAD. Fully replaced by rake / puppet-syntax gem - [[phab:T154894|T154894]]
* 20:42 James_F: Zuul: Drop all CI for the REL1_37 branch, now EOL for [[phab:T324133|T324133]]
* 20:54 hashar: Deleted jenkins job erblint-HEAD. Fully replaced by rake / puppet-syntax gem - [[phab:T154894|T154894]]


== 2017-02-20 ==
== 2022-11-29 ==
* 14:53 hashar: integration: applying role::ci::slave::saucelabs to saucelabs-01
* 21:35 brennen: gitlab repos/releng/scap: added direct membership for some non-releng maintainers who show up frequently/recently in commit log
* 12:50 hashar: integration-slave-jessie-1001  downgraded cowbuilder to 0.73 from jessie to match integration-slave-jessie-1002
* 20:33 James_F: Zuul: [wikimedia/wikimania-scholarships] Set as archived for [[phab:T243037|T243037]]


== 2017-02-17 ==
== 2022-11-27 ==
* 14:07 hashar: integration: deleting "repository" instance. No time to figure out how to ship Sonatype Nexus to it. [[phab:T147635|T147635]]
* 21:27 James_F: Docker: Publishing new php82 images with rc.7 for [[phab:T314093|T314093]]


== 2017-02-16 ==
== 2022-11-25 ==
* 18:34 greg-g: chase restarted nodepool, the daemon crashed
* 12:45 hashar: Reloaded Zuul for {{Gerrit|I717ad1fe4ef7b151808b242cdf16f0268c58fbd7}} "add pipelinename to autogenerated:ci tags" # [[phab:T214068|T214068]]
* 18:32 greg-g: no active nodepool instances listed in Jenkins' view: https://integration.wikimedia.org/ci/ but zuul has plenty to do https://integration.wikimedia.org/zuul/
* 16:56 hashar: integration: provisioned browsertests-1001 with role::ci::slaves::browsertests . Added it to Jenkins with label  BrowserTests
* 16:33 halfak: deploying ores:e9bbda3
* 16:30 hashar: integration: created browsertests-1001  intended to run the daily browser tests later on


== 2017-02-15 ==
== 2022-11-23 ==
* 15:47 hashar: Zuul reducing gate-and-submit minimum amount of changes to process from the wrong 12 down to 2.  In case of repeating failures it would  end up running jobs for only two jobs which would prevent cancelling jobs for up to 11 changes!
* 22:41 urandom: accidentally deleted deployment-sessionstore04
* 15:07 James_F: Zuul: configure CI for operations/debs/varnish-modules for [[phab:T321309|T321309]]


== 2017-02-14 ==
== 2022-11-22 ==
* 14:38 hashar: Updating castor-save publish job to properly capture composer cache on Jessie ( it is in ~/.composer/cache for some reason)  [[phab:T156359|T156359]]
* 21:51 Reedy: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/859615
* 21:06 TheresNoTime: samtar@deployment-mwmaint02:~$ mwscript extensions/WikimediaMaintenance/createExtensionTables.php zhwiki pagetriage [[phab:T323378|T323378]]
* 14:11 TheresNoTime: [samtar@deployment-deploy03 ~]$ sudo keyholder arm


== 2017-02-13 ==
== 2022-11-21 ==
* 21:25 bearND: Update mobileapps to {{Gerrit|3af473f}}
* 14:54 James_F: Zuul: [mediawiki/extensions/WikiLambda] Disable selenium tests for [[phab:T294388|T294388]]
* 20:15 hashar: Image snapshot-ci-jessie-1487016035 in wmflabs-eqiad is ready
* 14:41 vgutierrez: move deployment-cache-(text{{!}}upload)07 from role::cache::(text{{!}}upload)_haproxy to role::cache::(text{{!}}upload) - [[phab:T323365|T323365]]
* 20:01 hashar: Updating Nodepool Jessie snapshot to update the Parsoid zuul-cloner map ( https://gerrit.wikimedia.org/r/#/c/337430/ )
* 09:25 hashar: Changing Jenkins slave contint1001 working dir  from /srv/ssd/jenkins-slave to /srv/jenkins-slave ( https://gerrit.wikimedia.org/r/#/c/337286/ )


== 2017-02-10 ==
== 2022-11-18 ==
* 22:11 halfak: deployed ores:a15ec90
* 10:05 hashar: gerrit: change HEAD branch to point to `deploy/wmf/stable-3.5` # [[phab:T307334|T307334]]
* 21:25 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/337079
* 16:18 thcipriani: deployment-puppetmaster02:/var/lib/git/operations/puppet removed untracked file "how", updated submodules
* 14:49 hashar: rebase beta puppet master. Fixed conflicts with https://gerrit.wikimedia.org/r/#/c/321096/ and https://gerrit.wikimedia.org/r/#/c/312523/
* 11:36 hashar: Pruning some old caches from castor.integration.eqiad.wmflabs  (eg node-4  jobs are gone)


== 2017-02-09 ==
== 2022-11-17 ==
* 22:22 greg-g: manually kicked off a bunch of selenium tests after tgr and Reedy fixed [[phab:T157636|T157636]]
* 17:44 taavi: reloading zuul to deploy https://gerrit.wikimedia.org/r/858391
* 18:36 Reedy: someone should kill the ee_prototypewiki db from beta
* 16:40 Amir1: deploying {{Gerrit|030c269}} ores to sca03
* 11:11 legoktm[NE]: deploying https://gerrit.wikimedia.org/r/336615 and https://gerrit.wikimedia.org/r/336779


== 2017-02-08 ==
== 2022-11-16 ==
* 22:26 mdholloway: mobileapps deployed {{Gerrit|0efa7b8}} in the beta cluster
* 20:53 thcipriani: restarting jenkins for update
* 14:14 hashar: integration-slave-jessie-1001 upgrading cowbuilder
* 08:46 hashar: gerrit: reindexed accounts `ssh -p 29418 gerrit.wikimedia.org -- gerrit index start accounts --force` # [[phab:T323135|T323135]]
* 09:20 hashar: deployment-fluorine02  upgraded packages, deleted old files from /srv/mw-log/archive
* 08:45 hashar: gerrit: deleted 192 LDAP accounts (scheme `gerrit:`) containing upper case characters which had an exact equivalent in an all lower case form. `All-Users.git` commit is {{Gerrit|5e5800ecc8fd5da591567e616898dd6df988c0c8}} # [[phab:T323135|T323135]]
* 08:45 hashar: gerrit: deleted 192 LDAP accounts (scheme `gerrit:`) containing upper case characters which had an exact equivalent in an all lower case form #


== 2017-02-07 ==
== 2022-11-15 ==
* 17:49 halfak: deploying ores {{Gerrit|7c80636}}
* 20:21 hashar: gerrit: removed legacy mixed case accounts and moved the extra secondary email to a mailto id for `gerrit:krinkle`, `gerrit:revi`, `gerrit:daniel kinzler`, `gerrit:harej` and `gerrit:samanthanguyen` [[phab:T323135|T323135]]#8397539
* 09:02 hashar: Hard rebooting integration-slave-jessie-1001 . I messed up with the DHCP client :(
* 20:20 hashar: gerrit: removed legacy mixed case accounts for `gerrit:Fomafix` and `gerrit:Ricordisamoa` [[phab:T323135|T323135]]#8397539
* 16:25 James_F: Zuul: [mediawiki/services/parsoid] Make MW jobs voting in test
* 15:57 James_F: Zuul: [mediawiki/extensions/CampaignEvents] Add Echo as phan dependency for [[phab:T317231|T317231]]
* 15:24 hashar: gerrit: converted, to all lower case, the Gerrit accounts `username:Kaldari`, `username:Fran McCrory` and `username:SamanthaNguyen` # [[phab:T323097|T323097]]


== 2017-02-06 ==
== 2022-11-14 ==
* 21:31 bearND: Update mobileapps to {{Gerrit|034a391}}
* 17:36 hashar: Nuking unused Castor cached files in `/srv/jenkins-workspace/caches` # [[phab:T323051|T323051]]
* 17:35 hashar: Changing Castor cache saving from `/srv/jenkins-workspace/caches/` to `/srv/cache/caches/` which is the one served by rsync [[phab:T323051|T323051]]
* 17:34 hashar: Changing Castor cache saving from `/srv/jenkins-workspace/caches/` to `/srv/cache/caches/` which is the one served by rsync.
* 14:19 James_F: Zuul: [mediawiki/services/function-schemata] Move from node 12 to 16


== 2017-02-04 ==
== 2022-11-10 ==
* 21:37 halfak: deploying ores {{Gerrit|7c80636}}
* 21:33 James_F: Docker: Upgrading quibble-buster-php74-coverage with a new version of phpunit-patch-coverage for [[phab:T322864|T322864]]
* 21:24 halfak: deploying ores {{Gerrit|691b340}}
* 08:37 hashar: Rebuilding https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/ , it probably failed due to Gerrit being restarted
* 01:09 James_F: Zuul: Make PHP 8.1 voting for all quibble items for [[phab:T316078|T316078]]
* 01:05 James_F: Zuul: Drop mwext-php74-phan-docker from experimental for gate


== 2017-02-03 ==
== 2022-11-09 ==
* 11:09 hashar: beta: removed old kernels from deployment-redis02  to free up disk space
* 23:02 James_F: Zuul: [mediawiki/core] Add PHP 8.1 phan job for [[phab:T322278|T322278]]
* 10:42 hashar: Image ci-jessie-wikimedia-1486115643 in wmflabs-eqiad is ready  [[phab:T156923|T156923]]
* 14:56 andrewbogott: fixed puppet breakage on several instances
* 10:12 hashar: Image ci-jessie-wikimedia-1486115643 in wmflabs-eqiad is ready  [[phab:T156923|T156923]]
* 09:54 hashar: Regenerate Nodepool Jessie snapshot.  Would get a new HHVM version [[phab:T156923|T156923]]


== 2017-02-02 ==
== 2022-11-08 ==
* 21:56 hashar: integration-slave-jessie-1001 wiping /srv/pbuilder/base-trusty-amd64.cow  it was not properly provisioned causing build to fail (eg lack of /etc/hosts) Running puppet to reprovision it (poke [[phab:T156651|T156651]])
* 20:17 dduvall: puppet re-enabled on gitlab-runner hosts ([[phab:T322453|T322453]]) normal log level will be restored on next puppet run
* 16:26 Amir1: deploying {{Gerrit|9fd75a1}} ores in beta
* 20:01 dduvall: temporarily enabling buildkitd debug logging on gitlab-runner hosts ([[phab:T322453|T322453]])
* 16:17 hashar: integration-slave-jessie-1001 wiping /srv/pbuilder/base-trusty-i386.cow/  it was not properly provisioned causing build to fail (eg lack of /etc/hosts)  Running puppet to reprovision it (poke [[phab:T156651|T156651]])
* 15:58 hashar: Reloaded Zuul for https://gerrit.wikimedia.org/r/c/integration/config/+/854535
* 14:15 hashar: Nodepool: delete the image building of Jessie (image id 1322) to prevent a faulty HHVM version from being added. [[phab:T156923|T156923]]
* 15:26 vgutierrez: delete deployment-ms-be06 - [[phab:T322231|T322231]]
* 00:52 tgr: added mhurd as member
* 15:21 vgutierrez: shutdown deployment-ms-be06 - [[phab:T322231|T322231]]
* 06:39 vgutierrez: delete deployment-ms-be05 - [[phab:T322231|T322231]]
* 06:36 vgutierrez: delete deployment-ms-fe03 - [[phab:T322554|T322554]]
* 06:30 vgutierrez: downgrade to firejail 0.9.44.8-2 on deployment-imagescaler03
* 05:51 vgutierrez: shutdown deployment-ms-fe03 - [[phab:T322554|T322554]]


== 2017-02-01 ==
== 2022-11-07 ==
* 21:43 bearND: Update mobileapps to {{Gerrit|e48a88c}}
* 18:19 vgutierrez: let deployment-cache-upload07 use deployment-ms-fe04 - [[phab:T322554|T322554]]
* 18:51 thcipriani: nodepool delete-image 1320 per [[phab:T156923|T156923]]
* 15:57 vgutierrez: shutting down deployment-ms-be05 - [[phab:T322231|T322231]]
* 14:53 gehel: deployment-elastic* fully migrated to Jessie and /srv as data partition - [[phab:T151326|T151326]]
* 14:52 gehel: killing test node deployment-elastic08 - [[phab:T151326|T151326]]
* 14:32 gehel: shutting down and reimaging deployment-elastic07 - [[phab:T151326|T151326]]
* 14:06 gehel: shutting down and reimaging deployment-elastic06 - [[phab:T151326|T151326]]
* 13:34 gehel: shutting down and reimaging deployment-elastic05 - [[phab:T151326|T151326]]
* 13:29 gehel: starting deployment-elastic* migration to jessie and moving data partition to /srv ([[phab:T151326|T151326]] / [[phab:T151328|T151328]])
* 13:18 moritzm: upgraded deployment-prep to hhvm 3.12.12


== 2017-01-31 ==
== 2022-11-03 ==
* 22:12 thcipriani: started mysql on all integration precise instances via salt -- was stopped for some reason
* 20:31 hashar: Reloaded Zuul for {{Gerrit|Ic473bd57059d4eccad0f52c1d11d61f6ba1a4ad1}}
* 01:59 bd808: nodepool is full of instances stuck in "delete"
* 19:19 brennen: attempting initial phab1004 phabricator deploy
* 01:53 bd808: https://integration.wikimedia.org/zuul/ showing huge backlogs but https://integration.wikimedia.org/ci/ looks mostly idle
* 17:45 James_F: Zuul: Add CI for CategoryExplorer and EmailDeletedPages extensions and Cavendish and Pivot skins
* 17:15 James_F: Zuul: Add experimental PHP 8.2 jobs for PHP extensions for [[phab:T314093|T314093]]
* 16:53 James_F: Docker: Publishing initial PHP 8.2 CI test images for [[phab:T314093|T314093]]
* 13:44 TheresNoTime: add `cxserver-beta` (port 8080) proxy for deployment-prep, [[phab:T322323|T322323]]


== 2017-01-26 ==
== 2022-11-02 ==
* 14:25 hashar: Created Github repo for Gerrit replication https://github.com/wikimedia/mediawiki-libs-phpstorm-stubs  [[phab:T153252|T153252]]
* 22:44 James_F: Zuul: [mediawiki/tools/scap] Mark as archived for [[phab:T322269|T322269]]
* 13:49 hashar: Gerrit creating  mediawiki/libs/phpstorm-stubs to fork https://github.com/JetBrains/phpstorm-stubs for [[phab:T153252|T153252]]
* 09:56 vgutierrez: update to HAProxy 2.6.6 in deployment-cache-(text{{!}}upload)07 - [[phab:T321775|T321775]]


== 2017-01-24 ==
== 2022-10-31 ==
* 11:04 hashar: Deleting integration-publisher (Precise) replaced by integration-publishing (Jessie). [[phab:T156064|T156064]] [[phab:T143349|T143349]]
* 15:56 andrewbogott: shutting down  deployment-echostore01, deployment-ms-be0[56], deployment-mdb01, deployment-prometheus02, deployment-wikifeeds01 as per  https://phabricator.wikimedia.org/T306068
* 15:50 James_F: Zuul: [mediawiki/libs/RemexHtml] Re-enable PHP 8.1 CI for [[phab:T311450|T311450]]


== 2017-01-23 ==
== 2022-10-28 ==
* 23:41 bearND: Update mobileapps to {{Gerrit|66ef3c2}}
* 14:14 zabe: delete deployment-db07 and deployment-db08
* 21:05 hashar: Created integration-publishing Jessie instance 10.68.23.254 with puppet class role::ci::publisher::labs . Meant to replace Precise instance integration-publisher [[phab:T156064|T156064]]
* 06:24 hashar: devtools: phabricator-prod-1001: `rmdir /etc/envoy/clusters.d /etc/envoy/listeners.d`
* 12:45 hashar: Image ci-jessie-wikimedia-1485174573 in wmflabs-eqiad is ready  {{!}} should no more spawn varnish on boot
* 06:24 hashar: devtools: `rmdir /etc/envoy/clusters.d /etc/envoy/listeners.d`
* 09:02 hashar: Archiving Gerrit project wikidata/gremlin marking it read-only [[phab:T155829|T155829]]
* 06:23 hashar: devtools: set `profile::phabricator::main::dumps_rsync_clients: []` project wide to fix up Puppet. Settings got moved to a `role` ( https://gerrit.wikimedia.org/r/c/operations/puppet/+/842875 {{!}} [[phab:T313360|T313360]] )
* 07:15 _joe_: cherry-picking the move of base to profile::base


== 2017-01-21 ==
== 2022-10-27 ==
* 21:20 hashar: integration: updating slave scripts for https://gerrit.wikimedia.org/r/#/c/333389/
* 21:38 James_F: Zuul: [mediawiki/core] Run standalone jobs [[phab:T203694|T203694]]
* 21:08 bd808: Puppet failures on deployment-restbase0[12] seem to be some sort of hang of the Puppet process itself. Run prints "Finished catalog run in 2n.nn seconds" but Puppet doesn't terminate for about a minute longer. The only state change logged is cassandra-metrics-collector service start.
* 20:58 hashar: Reloaded Zuul for https://gerrit.wikimedia.org/r/850198 # [[phab:T321769|T321769]]


== 2017-01-20 ==
== 2022-10-26 ==
* 10:14 hashar: puppet fails on "integration" labs instances due to an attempt to unmount the non existing NFS /home.  Filed [[phab:T155820|T155820]]
* 23:12 dancy: Restarted Zuul CI server due to stalled ssh connections which went against the max per user connection limit in Gerrit #[[phab:T308943|T308943]]
* 09:18 hashar: beta: reset workspace of /srv/mediawiki-staging/php-master/extensions/reCaptcha  it had a .gitignore local hack for some reason
* 18:28 hashar: Reloaded Zuul for https://gerrit.wikimedia.org/r/849638 # [[phab:T321668|T321668]]
* 09:05 hashar: integration restarted mysql on trusty permanent slaves [[phab:T141450|T141450]] [[phab:T155815|T155815]] salt -v '*trusty*' cmd.run 'service mysql start'
* 08:58 hashar: Reloaded Zuul for https://gerrit.wikimedia.org/r/c/integration/config/+/849480 # [[phab:T321594|T321594]]


== 2017-01-19 ==
== 2022-10-25 ==
* 22:11 Krenair: added bunch of others to the same group per request. we should figure out how to make this process sane somehow
* 16:33 hashar: Updating Jenkins jobs for Quibble 1.4.7 # [[phab:T320935|T320935]] [[phab:T318029|T318029]]
* 22:06 Krenair: added nuria to deploy-service group on deployment-tin
* 15:36 hashar: Tag Quibble 1.4.7 @ {{Gerrit|f838a24cc2}} # [[phab:T320935|T320935]] [[phab:T318029|T318029]]
* 16:56 hashar: rebased puppet master on integration and deployment-prep Trivial conflict between https://gerrit.wikimedia.org/r/#/c/312523/ and a lint change
* 14:30 hashar: Manually cleaned /srv/jenkins/workspace on integration-agent-docker-1024
* 09:36 hashar: Nuking workspaces of all mwext-testextension-hhvm-composer* jobs. Lame attempt for [[phab:T155600|T155600]].  salt -v '*slave*' cmd.run 'rm -fR /srv/jenkins-workspace/workspace/mwext-testextension-hhvm-composer*'
* 07:24 hashar: Reloaded Zuul for https://gerrit.wikimedia.org/r/c/integration/config/+/848514 # [[phab:T317378|T317378]]


== 2017-01-18 ==
== 2022-10-24 ==
* 10:49 hashar: Disconnected/connected Jenkins Gearman client.  The beta cluster builds had a deadlock.
* 17:42 James_F: Zuul: Add new e-mail for Hoo man to allow list
* 10:39 hashar: Image ci-jessie-wikimedia-1484735445 in wmflabs-eqiad is ready (add python-conftool to hopefully have puppet rspec pass on https://gerrit.wikimedia.org/r/#/c/332475/ )


== 2017-01-17 ==
== 2022-10-21 ==
* 21:47 urandom: deployment-prep restarting Cassandra on deployment-restbase02
* 08:46 hashar: Created https://gerrit.wikimedia.org/r/admin/repos/phabricator/translations # [[phab:T321350|T321350]]
* 21:46 urandom: deployment-prep restarting Cassandra on deployment-restbase01
* 19:02 thcipriani: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/#/c/332534/
* 18:25 thcipriani: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/#/c/332521/
* 18:07 urandom: deployment-prep restarting Cassandra on deployment-restbase01
* 17:50 urandom: re-enabling puppet on deployment-restbase02
* 17:47 urandom: re-enabling puppet on deployment-restbase01
* 10:32 hashar: Refreshing all jobs in Jenkins 'jenkins-jobs --conf jenkins_jobs.ini update config/jjb'


== 2017-01-16 ==
== 2022-10-20 ==
* 09:33 hashar: integration  nuked the Zuul merger path for SelectTag mw extension ( on scandium /srv/ssd/zuul/git/mediawiki/extensions/SelectTag )  Failed to merge https://gerrit.wikimedia.org/r/#/c/331974/
* 13:04 hashar: Updating Jenkins jobs to add `AllowEncodedSlashes On` to Apache config https://gerrit.wikimedia.org/r/c/integration/config/+/844974 [[phab:T321278|T321278]]
* 12:40 hashar: Building Quibble Docker images to add `AllowEncodedSlashes On` to Apache configuration {{!}} https://gerrit.wikimedia.org/r/c/integration/config/+/844937 {{!}} [[phab:T321278|T321278]]
* 07:23 hashar: Reloaded Zuul for https://gerrit.wikimedia.org/r/844540


== 2017-01-12 ==
== 2022-10-19 ==
* 00:33 legoktm: deploying https://gerrit.wikimedia.org/r/331796 and https://gerrit.wikimedia.org/r/331795
* 22:27 dduvall: deleted 'trigger-blubber-pipeline-*' 'blubber-pipeline-*' jobs to deploy https://gerrit.wikimedia.org/r/844529
* 22:22 dduvall: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/844528
* 09:56 hashar: Reloaded Zuul for noop change https://gerrit.wikimedia.org/r/c/integration/config/+/830584 Zuul: [mediawiki/extensions/SearchVue] Mark as in production


== 2017-01-11 ==
== 2022-10-18 ==
* 18:07 urandom: restarting restbase cassandra nodes
* 19:13 hashar: devtools: unbreak puppet on `deploy-1004.devtools.eqiad1.wikimedia.cloud` by applying `profile::mediawiki::scap_client::is_master: true` # [[phab:T319681|T319681]]
* 18:01 urandom: disabling puppet on restbase cassandra nodes to experiment with prometheus exporter
* 17:51 James_F: Zuul: [wikimedia/fundraising/SmashPig] Use composer-test-php74-only template
* 08:03 vgutierrez: wipe deployment-cache-(text{{!}}upload)06 - [[phab:T320930|T320930]]


== 2017-01-10 ==
== 2022-10-17 ==
* 23:07 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/331099
* 16:21 hashar: Reloaded Zuul for https://gerrit.wikimedia.org/r/c/integration/config/+/843499 [[phab:T309600|T309600]]
* 14:01 vgutierrez: shutdown deployment-cache-(text{{!}}upload)06 - [[phab:T320930|T320930]]
* 13:56 vgutierrez: switch 185.15.56.36 from deployment-cache-text06 to deployment-cache-text07 - [[phab:T320930|T320930]]
* 13:54 vgutierrez: switch 185.15.56.35 from deployment-cache-upload06 to deployment-cache-upload07 - [[phab:T320930|T320930]]
* 11:02 urbanecm: deployment-prep: wikiadmin@172.16.0.238(wikishared)> source /srv/mediawiki-staging/php-master/extensions/ContentTranslation/sql/significant-edits.sql; # cswiki beta was failing with cx_significant_edits table not found
* 09:41 wm-bot2: Increased quotas by 4 cores ([[phab:T320932|T320932]]) - cookbook ran by arturo@nostromo


== 2017-01-08 ==
== 2022-10-14 ==
* 05:20 Krenair: deployment-stream: live hacked /usr/lib/python2.7/dist-packages/socketio/handler.py a bit (added apostrophes) to try to make rcstream work
* 20:57 James_F: Zuul: Fix dependencies for BlueSpice extensions that depend on VisualEditor
* 20:49 James_F: Docker: Publishing helm-linter without deprecated kubeyaml for [[phab:T316348|T316348]]
* 20:06 James_F: Docker: Publish images with php-ast upgraded from v1.0.14 to v1.1.0
* 18:22 dduvall: upgrade of docker on contint hosts aborted due to missing buster package. agents are back online
* 18:01 dduvall: upgrading docker on contint servers. agents will be unavailable for a short time
* 16:07 James_F: Zuul: [mediawiki/libs/Zest] Re-enable PHP 8.1 tests for [[phab:T311463|T311463]]
* 15:54 James_F: Zuul: [mediawiki/vendor] Add experimental job to check composer.lock for [[phab:T74952|T74952]]
* 13:48 James_F: Zuul: [css-sanitizer] Re-enable PHP 8.1 jobs for [[phab:T311451|T311451]]


== 2017-01-07 ==
== 2022-10-13 ==
* 10:17 Amir1: ladsgroup@deployment-tin:~$ mwscript updateCollation.php --wiki=fawiki ([[phab:T139110|T139110]])
* 21:16 dduvall: all integration-agent-docker-* hosts have been upgraded to docker 20.10.18
* 20:37 dduvall: starting rolling upgrade of docker on integration-agent-docker-* hosts to deploy https://gerrit.wikimedia.org/r/c/operations/puppet/+/834399


== 2017-01-06 ==
== 2022-10-12 ==
* 16:31 hashar: Nodepool Image ci-jessie-wikimedia-1483719758 in wmflabs-eqiad is ready
* 20:09 dduvall: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/841996
* 16:24 hashar: Nodepool Image ci-trusty-wikimedia-1483719370 in wmflabs-eqiad is ready
* 17:07 dduvall: deployed blubberoid using docker-registry.discovery.wmnet/wikimedia/blubber:2022-10-12-162839-production
* 04:56 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/330843


== 2017-01-05 ==
== 2022-10-11 ==
* 17:20 hashar: Dropping puppet source from https://doc.wikimedia.org/puppetsource/ . contint1001: sudo rm -fR /srv/org/wikimedia/doc/puppetsource  ([[phab:T143233|T143233]])
* 15:49 dduvall: manually (re-)re-running `sudo -u mwpresync /usr/bin/scap stage-train --yes auto` after patch cleanup
* 15:29 dduvall: correction ^ full command is `sudo -u mwpresync /usr/bin/scap stage-train --yes auto`
* 15:28 dduvall: manually (re)running `stage-train --yes auto` following cron job failure
* 10:53 TheresNoTime: add MVernon to deployment-prep, [[phab:T316845|T316845]]#8307183


== 2017-01-04 ==
== 2022-10-10 ==
* 21:29 mutante: deployment-cache-text-04 - running acme-setup command to debug  .. Creating CSR /etc/acme/csr/beta_wmflabs_org.pem
* 12:04 TheresNoTime: cherry picking 836953 for [[phab:T316845|T316845]] to deployment-prep/Swift
* 21:26 Krenair: trying to troubleshoot puppet by stopping nginx then letting puppet start it
* 21:05 mutante: deployment-cache-text04 stopping nginx service, running puppet to debug dependency issue
* 09:41 hashar: integration: pruning /srv/pbuilder/aptcache/ on Jessie perm slaves


== 2017-01-02 ==
== 2022-10-08 ==
* 11:22 hashar: Nodepool Image ci-jessie-wikimedia-1483355768 in wmflabs-eqiad is ready
* 21:00 Reedy: Updating docker-pkg files on contint primary for https://gerrit.wikimedia.org/r/840337
* 11:17 hashar: Jessie images have the wrong python-pbr version ( [[phab:T153877|T153877]] ) causing zuul-cloner to fail. Refreshing image
* 10:02 hashar: Nodepool Image ci-jessie-wikimedia-1483350885 in wmflabs-eqiad is ready
* 09:57 hashar: Nodepool Image ci-trusty-wikimedia-1483350368 in wmflabs-eqiad is ready


== 2016-12-27 ==
== 2022-10-07 ==
* 05:00 Amir1: deploying {{Gerrit|5230e7d}} in ores beta node ([[phab:T154168|T154168]])
* 13:27 James_F: Zuul: Add two former contractors to the CI allowlist


== 2016-12-26 ==
== 2022-10-06 ==
* 12:09 hashar: beta: restarted varnish.service and varnish-frontend.service on deployment-cache-text04
* 13:17 hashar: Mass updating Jenkins jobs for https://gerrit.wikimedia.org/r/c/integration/config/+/839520
* 13:12 taavi: reloading zuul for https://gerrit.wikimedia.org/r/839472
* 13:02 jelto: update gitlab-settings to enable admin_mode on gitlab production instances - [[phab:T316419|T316419]]
* 13:00 James_F: Docker: Building and publishing php74:0.3.2 and cascade for [[phab:T318918|T318918]]
* 12:59 jelto: update gitlab-settings to enable admin_mode on gitlab replica instances - [[phab:T316419|T316419]]
* 12:55 jelto: update gitlab-settings to enable admin_mode on gitlab test instance - [[phab:T316419|T316419]]


== 2016-12-24 ==
== 2022-10-05 ==
* 09:02 Krinkle: Reloading Zuul to deploy  https://gerrit.wikimedia.org/r/329038
* 22:03 James_F: layout: [mediawiki/tools/phan/SecurityCheckPlugin] Publish PHP coverage for [[phab:T279423|T279423]]
* 17:29 hashar: Building docker images for https://gerrit.wikimedia.org/r/814154


== 2016-12-23 ==
== 2022-10-04 ==
* 12:18 legoktm: deploying https://gerrit.wikimedia.org/r/328886
* 19:46 dduvall: Updating docker-pkg files on contint primary for https://gerrit.wikimedia.org/r/838249


== 2016-12-22 ==
== 2022-10-03 ==
* 22:11 thcipriani: disable production l10nupdate for deployment freeze
* 14:22 TheresNoTime: set `ring_manager` host to `deployment-ms-fe03` in deployment-prep's _.yaml. [[phab:T316845|T316845]]
* 13:22 hashar: Triggering CI for design/codex@v0.2.1 using `zuul enqueue-ref --trigger gerrit --pipeline publish --project design/codex --ref refs/tags/v0.2.1 --newrev 4abb7677b3ea076bbd6778977d9a9374cf45015c`  # [[phab:T313767|T313767]]
* 13:15 hashar: Reloaded Zuul for https://gerrit.wikimedia.org/r/c/integration/config/+/822144 # [[phab:T313767|T313767]]
* 12:47 hashar: Reloaded Zuul for https://gerrit.wikimedia.org/r/c/integration/config/+/833061
* 09:02 hashar: Reloaded Zuul for https://gerrit.wikimedia.org/r/c/integration/config/+/837494/


== 2016-12-21 ==
== 2022-09-30 ==
* 05:57 Krinkle: Jenkins "Collapsing Console Sections" for PHPUnit was broken since "-d zend.enable_gc=0" was added to phpunit.php invocation. Updated pattern in Jenkins system configuration.
* 18:43 James_F: Triggering graceful restart of zuul to see if that fixes on-going merge/gerrit connection issues.
* 17:07 James_F: Zuul: Make PHP 8.1 non-voting for all skins and extensions [[phab:T316078|T316078]]
* 16:46 James_F: Zuul: Make PHP 8.0 and PHP 8.1 voting for all skins and extensions in master for [[phab:T300463|T300463]] and [[phab:T316078|T316078]]
* 15:12 James_F: Docker: Building and publishing PHP 8.0.24 images for [[phab:T315167|T315167]]
* 02:33 James_F: Zuul: [mediawiki/core] Clean up REL1_35 and REL1_37 PHP 8 jobs
* 02:30 James_F: Zuul: [mediawiki/core] Upgrade PHP 8.0 and 8.1 jobs to full vendor jobs for [[phab:T300463|T300463]] and [[phab:T316078|T316078]]
* 02:27 James_F: Zuul: Drop FIXME messages for [[phab:T318093|T318093]], being Declined


== 2016-12-19 ==
== 2022-09-29 ==
* 21:21 andrewbogott: and also python-functools32_3.2.3.2-3~bpo8+1_all.deb
* 23:54 TheresNoTime: samtar@deployment-jobrunner04:~$ sudo systemctl stop php7.2-fpm.service && sudo systemctl start php7.4-fpm.service
* 21:20 andrewbogott: upgrading to python-jsonschema_2.5.1-5~bpo8+1_all.deb on deployment-eventlogging03
* 23:47 TheresNoTime: cherry pick 836953 to deployment-prep
* 20:51 andrewbogott: upgrading to python-requests_2.12.3-1_all.deb ./python-urllib3_1.19.1-1_all.deb on deployment-mediawiki04 and deployment-tin
* 23:09 TheresNoTime: [samtar@deployment-deploy03 ~]$ sudo puppet agent -tv
* 09:35 legoktm: deploying https://gerrit.wikimedia.org/r/328145
* 23:08 TheresNoTime: deployment-deploy03, `sudo systemctl stop php7.2-fpm.service`, `sudo systemctl start php7.4-fpm.service`
* 08:00 legoktm: deploying https://gerrit.wikimedia.org/r/288819 https://gerrit.wikimedia.org/r/276065 https://gerrit.wikimedia.org/r/328136
* 23:03 TheresNoTime: ran `sudo puppet agent -tv` on deployment -deploy03, -mediawiki11, -mediawiki12
* 02:25 legoktm: deploying https://gerrit.wikimedia.org/r/327692
* 13:56 James_F: Zuul: Drop PHP72 jobs everywhere, and PHP73 everywhere except old branches
* 13:41 James_F: Zuul: [mediawiki/core] Drop PHP 7.2 and PHP 7.3 testing for master and wmf for [[phab:T261872|T261872]]
* 13:34 James_F: Zuul: [mediawiki/vendor] Drop PHP72 jobs, use only PHP74 ones
* 12:30 hashar: Reloaded Zuul for https://gerrit.wikimedia.org/r/c/integration/config/+/836696


== 2016-12-16 ==
== 2022-09-28 ==
* 22:34 legoktm: deploying https://gerrit.wikimedia.org/r/327202
* 17:32 brennen: trying a re-publish of dev-images in case https://gitlab.wikimedia.org/repos/releng/dev-images/-/commit/eb82162b4bf443df20998a53bfb06460bfc6a365 didn't get picked up
* 14:33 hashar: Nodepool Image ci-jessie-wikimedia-1481897950 in wmflabs-eqiad is ready
* 00:30 James_F: Zuul: [mediawiki/services/function-orchestrator] Use direct coverage job here too
* 14:25 hashar: Nodepool Image ci-trusty-wikimedia-1481897961 in wmflabs-eqiad is ready
* 00:20 James_F: Zuul: [mediawiki/services/function-evaluator] Use direct coverage job, for [[phab:T302608|T302608]]
* 14:19 hashar: Refreshing Nodepool images. The snapshots were broken due to mariadb-client failing to upgrade
* 13:45 hashar: integration / contintcloud : remove security rules of labs projects that allowed gallium (phased out) [[phab:T95757|T95757]]
* 13:44 hashar: integration / contintcloud : update security rules of labs projects to allow contint2001
* 13:15 hashar: integration: update sudo policy for debian-glue to keep the env variable SHELL_ON_FAILURE (for https://gerrit.wikimedia.org/r/#/c/327720/ )
* 10:15 hashar: integration: apt-get upgrade on all permanent slaves
* 10:13 hashar: integration-slave-docker-1000  changed docker::version from the no longer existing '1.12.3-0~jessie' to simply 'present'. Will have to manually upgrade it from now on.  [[phab:T153419|T153419]]
* 10:04 hashar: deployment-puppetmaster02  updated puppet repo. Was stalled due to a bump of the mariadb submodule


== 2016-12-15 ==
== 2022-09-27 ==
* 21:00 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/324368
* 08:11 hashar: Reloaded Zuul for https://gerrit.wikimedia.org/r/835182
* 19:23 marxarelli: Manually rebasing and re-applying cherry picks for operations/puppet on integration-puppetmaster01.eqiad.wmflabs
* 16:08 hashar: deployment-phab02 : apt-get upgrade  [[phab:T147818|T147818]]
* 14:48 Amir1: ladsgroup@deployment-tin:~$ mwscript updateCollation.php --wiki=fawiki ([[phab:T139110|T139110]])
* 11:41 zeljkof: Reloading Zuul to deploy 327473


== 2016-12-14 ==
== 2022-09-26 ==
* 12:38 elukey: created deployment-copper on deployment-prep as temporary test
* 21:46 Daimona: Applying schema changes to the wikishared DB on beta for the CampaignEvents extension # [[phab:T318379|T318379]] [[phab:T318120|T318120]]
* 21:31 Daimona: Applying schema change to the wikishared DB on beta for the CampaignEvents extension (1/2) # [[phab:T318120|T318120]]
* 20:00 dduvall: regenerating 314 jobs for deployment of https://gerrit.wikimedia.org/r/835262
* 11:40 James_F: Docker: Building and publishing quibble-buster-php74-bundle
* 10:52 hashar: Rolling quibble/ruby jobs from php 7.4 to 7.2: `mediawiki-selenium-integration-docker` `legacy-quibble-rubyselenium-docker` # [[phab:T318525|T318525]]
* 09:35 hashar: Reloaded Zuul for https://gerrit.wikimedia.org/r/c/integration/config/+/834717/


== 2016-12-13 ==
== 2022-09-23 ==
* 22:52 thcipriani: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/327119
* 18:09 James_F: Zuul: [wikimedia-cz/web-*] Migrate tests from php73+ to php74+
* 21:15 thcipriani: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/327048
* 18:06 James_F: Zuul: [labs/tools/guc] Migrate tests from php73+ to php74+
* 09:42 hashar: Updating MediaWiki Jenkins jobs to support injecting skin dependencies [[phab:T151593|T151593]]
* 18:04 James_F: Zuul: [labs/tools/coverme] Migrate tests from php73+ to php74+
* 02:17 legoktm: deploying https://gerrit.wikimedia.org/r/326880
* 15:55 James_F: Docker: Building and publishing php74 versions of composer-security-check, mediawiki-phan, mediawiki-phan-testrun, and phpmetrics
* 02:10 legoktm: deploying https://gerrit.wikimedia.org/r/326877
* 13:26 James_F: Zuul: Run php 7.4 phan for extensions and skins


== 2016-12-09 ==
== 2022-09-22 ==
* 04:01 legoktm: deploying https://gerrit.wikimedia.org/r/326070
* 20:40 zabe: shutoff deployment-db07 # [[phab:T318126|T318126]]
* 03:45 legoktm: deploying https://gerrit.wikimedia.org/r/326069
* 20:36 zabe: take deployment-prep out of read-only # [[phab:T318126|T318126]]
* 20:32 zabe: failover deployment-prep master from deployment-db07 to deployment-db09 # [[phab:T318126|T318126]]
* 20:25 zabe: set deployment-prep as read-only # [[phab:T318126|T318126]]
* 16:26 dancy: Upgrading scap to latest code revision in beta cluster
* 10:38 zabe: deployment-db10: start replication # [[phab:T318126|T318126]]


== 2016-12-08 ==
== 2022-09-21 ==
* 23:35 legoktm: deploying https://gerrit.wikimedia.org/r/326048 https://gerrit.wikimedia.org/r/326050
* 23:34 zabe: shutoff deployment-db08 # [[phab:T318126|T318126]]
* 22:32 legoktm: deploying https://gerrit.wikimedia.org/r/325930
* 23:00 jeena: restarting zuul to try and fix CI issues
* 21:14 legoktm: deploying https://gerrit.wikimedia.org/r/326032
* 20:46 zabe: clone deployment-db10 from deployment-db08 # [[phab:T318126|T318126]]
* 21:08 legoktm: deploying https://gerrit.wikimedia.org/r/326020
* 18:49 TheresNoTime: cherry-picked [[gerrit:833839]] to deployment-puppetmaster04, testing [[phab:T317417|T317417]]
* 20:27 legoktm: deploying https://gerrit.wikimedia.org/r/325974
* 18:19 zabe: install mariadb 10.6 via role::mariadb::beta on deployment-db10 # [[phab:T318126|T318126]]
* 20:19 legoktm: deploying https://gerrit.wikimedia.org/r/326016
* 17:55 zabe: create volume db10 and attach to deployment-db10 # [[phab:T318126|T318126]]
* 20:11 legoktm: deploying https://gerrit.wikimedia.org/r/326015
* 17:54 zabe: create deployment-db10 as g3.cores8.ram16.disk20 # [[phab:T318126|T318126]]
* 19:51 legoktm: deploying https://gerrit.wikimedia.org/r/326009
* 14:21 zabe: deployment-db09: restart mariadb # [[phab:T318126|T318126]]
* 19:44 legoktm: deploying https://gerrit.wikimedia.org/r/325912 https://gerrit.wikimedia.org/r/326006
* 13:55 TheresNoTime: modified deployment-prep "prometheus" security group - port 80, [[phab:T315699|T315699]]
* 15:33 hashar: Image ci-jessie-wikimedia-1481210905 in wmflabs-eqiad is ready : Notice: /Stage[main]/Main/Package[netcat-openbsd]/ensure: ensure changed 'purged' to 'present'
* 13:18 James_F: Jenkins: Dropped 16 more old node jobs left on the server.
* 15:28 hashar: Updating Nodepool Jessie image to ship `netcat`  [[phab:T151469|T151469]] [[phab:T152684|T152684]]
* 13:11 James_F: Jenkins: Dropped four old node10 jobs left on the server (oojs-core-node10-browser-docker, ooui-special-node10-plus-php80-composer-docker, wikipeg-special-node10-plus-php72-composer-docker, wikipeg-special-node10-plus-php80-composer-docker)
* 10:31 hashar: Image ci-trusty-wikimedia-1481192772 in wmflabs-eqiad is ready
* 13:05 James_F: Jenkins: Dropped scap-pipeline-stretch and trigger-scap-pipeline-stretch following {{Gerrit|26c74a1}}
* 10:21 hashar: Refreshing Nodepool base image for Trusty. Was blocked on a mariadb upgrade, should also acquire network faster [[phab:T113342|T113342]]
* 12:36 hashar: Reloaded Zuul for Remove Stretch from mediawiki/tools/scap - https://gerrit.wikimedia.org/r/833705
* 09:45 legoktm: deploying https://gerrit.wikimedia.org/r/325903
* 09:46 andrewbogott: removed some stray whitespace in /var/lib/git/operations/puppet that was preventing rebase on deployment-puppetmaster04.deployment-prep.eqiad.wmflabs
* 08:48 hashar: Image ci-jessie-wikimedia-1481186016 in wmflabs-eqiad is ready  [[phab:T113342|T113342]]
* 05:31 legoktm: legoktm@integration-saltmaster:~$ sudo salt '*jessie*' cmd.run 'puppet agent -tv'
* 05:26 legoktm: cherry-picked https://gerrit.wikimedia.org/r/#/c/325877/ onto integration-puppetmaster01
* 03:26 legoktm: deploying https://gerrit.wikimedia.org/r/325873


== 2016-12-07 ==
== 2022-09-20 ==
* 15:04 hashar: Image ci-trusty-wikimedia-1481122712 in wmflabs-eqiad is ready  [[phab:T117418|T117418]]
* 22:00 zabe: deployment-db09: start replication # [[phab:T318126|T318126]]
* 02:29 matt_flaschen: foreachwikiindblist FlowFixInconsistentBoards complete
* 20:06 zabe: deployment-db09: import dump into mariadb # [[phab:T318126|T318126]]
* 02:27 matt_flaschen: Started (foreachwikiindblist flow.dblist extensions/Flow/maintenance/FlowFixInconsistentBoards.php) 2>&1 {{!}} tee FlowFixInconsistentBoards_2016-12-06.txt on deployment-tin
* 20:04 zabe: rsynced dump from deployment-db08 to deployment-db09 # [[phab:T318126|T318126]]
* 08:08 hashar: Upgrading CI and releases Jenkins plugins notably to update the git client [[phab:T315897|T315897]]
* 02:06 zabe: created backup of all databases on deployment-db08 # [[phab:T318126|T318126]]


== 2016-12-06 ==
== 2022-09-19 ==
* 21:20 hashar: Image ci-jessie-wikimedia-1481058839 in wmflabs-eqiad is ready [[phab:T113342|T113342]]
* 23:58 zabe: install mariadb 10.6 via role::mariadb::beta on deployment-db09 # [[phab:T318126|T318126]]
* 21:13 hashar: Refresh Nodepool Jessie snapshot which boots 3 times faster. Will help get nodes available faster [[phab:T113342|T113342]]
* 23:57 zabe: create volume db09 and attach to deployment-db09 # [[phab:T318126|T318126]]
* 16:33 hashar: Nodepool imported a new Jessie image 'jessie-T113342' with some network configuration hotfix. Will use for debugging. [[phab:T113342|T113342]]
* 23:57 zabe: create deployment-db09 as g3.cores8.ram16.disk20 # [[phab:T318126|T318126]]
* 09:08 Reedy: running foreachwiki update.php on beta
* 20:24 dduvall: Updating docker-pkg files on contint primary for https://gerrit.wikimedia.org/r/833066
* 16:54 James_F: Zuul: [operations/mediawiki-config] Switch to PHP 7.4 jobs
* 16:24 James_F: Zuul: [mediawiki/core] Add php80 and php81 to `check php` command
* 15:36 James_F: Zuul: [mediawiki/core] run phan on PHP 7.4 for [[phab:T316518|T316518]]
* 13:50 James_F: Zuul: [mediawiki/core] Add a non-vendor php81 job for main branch for [[phab:T316078|T316078]]
* 12:06 Daimona: Applying schema change to the wikishared DB on beta for the CampaignEvents extension (2/2) # [[phab:T316128|T316128]]
* 11:57 Daimona: Applying schema change to the wikishared DB on beta for the CampaignEvents extension (1/2) # [[phab:T316128|T316128]]


== 2016-12-05 ==
== 2022-09-16 ==
* 20:43 hashar: Image ci-jessie-wikimedia-1480969940 in wmflabs-eqiad is ready (includes trendingedits::packages, which explicitly defines the installation of librdkafka-dev)
* 15:47 dancy: Upgrading scap to latest code revision in beta cluster
* 09:52 elukey: add https://gerrit.wikimedia.org/r/#/c/324642/ to the deployment-prep's puppet master to test nutcracker
* 09:39 hashar: beta-update-databases-eqiad fails due to CONTENT_MODEL_FLOW_BOARD not registered on the wiki. [[phab:T152379|T152379]]
* 08:44 hashar: Image ci-jessie-wikimedia-1480926961 in wmflabs-eqiad is ready  [[phab:T113342|T113342]]
* 08:35 hashar: Pushing new Jessie image to Nodepool that supposedly boots 3x faster [[phab:T113342|T113342]]


== 2016-12-04 ==
== 2022-09-15 ==
* 15:25 Krenair: Found a git-sync-upstream cron on deployment-mx for some reason... commented for now, but wtf was this doing on a MX server?
* 19:56 thcipriani: Updating development images on contint primary
* 17:24 dduvall: Updating docker-pkg files on contint primary for https://gerrit.wikimedia.org/r/832374
* 11:03 TheresNoTime: soft reboot deployment-parsoid12, unresponsive


== 2016-12-03 ==
== 2022-09-13 ==
* 23:07 legoktm: deploying https://gerrit.wikimedia.org/r/325132
* 22:14 zabe: delete deployment-urldownloader02
* 10:48 legoktm: deploying https://gerrit.wikimedia.org/r/325093 and https://gerrit.wikimedia.org/r/325094


== 2016-12-02 ==
== 2022-09-12 ==
* 14:40 hashar: added Tobias Gritschacher to Gerrit "integration" group so he can +2 patches on integration/* repositories \O/
* 22:15 brennen: Updating development images on contint primary for https://gitlab.wikimedia.org/repos/releng/dev-images/-/merge_requests/19
* 20:38 RhinosF1: (for lack of a better place) added Cyberpower678 to acl*userdisable. has enough clue, fairly active and trusted.
* 08:14 James_F: Zuul: [mediawiki/extensions/Realnames] Enable quibble composer jobs


== 2016-12-01 ==
== 2022-09-09 ==
* 18:20 elukey: removing https://gerrit.wikimedia.org/r/#/c/305536 from the puppet master via rebase -i (no-op for beta)
* 17:46 Daimona: Applying schema change to the wikishared DB on beta for the CampaignEvents extension (2/2) # [[phab:T311126|T311126]]
* 18:11 elukey: adding https://gerrit.wikimedia.org/r/#/c/305536/3 to the puppet master
* 17:25 Daimona: Applying schema change to the wikishared DB on beta for the CampaignEvents extension (1/2) # [[phab:T311126|T311126]]
* 14:16 hashar: Image ci-jessie-wikimedia-1480601060 in wmflabs-eqiad is ready  {{!}} [[phab:T152096|T152096]]
* 17:08 Daimona: Applying schema change to the wikishared DB on beta for the CampaignEvents extension (2/2) # [[phab:T316409|T316409]]
* 16:36 Daimona: Applying schema change to the wikishared DB on beta for the CampaignEvents extension (1/2) # [[phab:T316409|T316409]]
* 10:49 hashar: devtools: fixed fqdn of instances puppetmaster-1001 and gerrit-prod-1001 by manually editing `/etc/hosts` # [[phab:T317404|T317404]]


== 2016-11-30 ==
== 2022-09-08 ==
* 17:22 gehel: restart of logstash on deployment-logstash2 - upgrade to Java 8 - [[phab:T151325|T151325]]
* 20:50 dduvall: running `./fab deploy_docker` to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/830919
* 17:11 gehel: rolling restart of deployment-elastic0* - upgrade to Java 8 - [[phab:T151325|T151325]]
* 15:47 dancy: Upgrading scap to latest code revision in beta cluster
* 11:22 hashar: Gerrit hide mediawiki/extensions/JsonData/JsonSchema Empty since 2013
* 11:20 hashar: Gerrit made mediawiki/extensions/GuidedTour/guiders read-only (per README.md, no longer used)
* 11:18 hashar: Gerrit  mediawiki/extensions/CentralNotice/BannerProxy.git  Empty since 2014


== 2016-11-29 ==
== 2022-09-07 ==
* 15:23 hashar: Image ci-jessie-wikimedia-1480432368 in wmflabs-eqiad is ready
* 15:05 TheresNoTime: making hack changes to beta to test [[phab:T317195|T317195]] resolution
* 14:30 hashar: Image ci-trusty-wikimedia-1480429423 in wmflabs-eqiad is ready  [[phab:T151879|T151879]]
* 14:24 hashar: Refreshing Nodepool Trusty snapshot to get php5-xsl installed [[phab:T151879|T151879]]


== 2016-11-28 ==
== 2022-09-06 ==
* 09:48 hashar: Image ci-trusty-wikimedia-1480326016 in wmflabs-eqiad is ready
* 15:42 bd808: Promoted user 'StrikerBot' to admin on gitlab.wikimedia.org so that Striker can use the account to attach Developer accounts to gitlab via API.
* 09:39 hashar: Regenerated Nodepool image for Trusty. It no longer includes apache::mod::php5 which broke the build and is not needed on Trusty ( https://gerrit.wikimedia.org/r/323803  )
* 02:05 James_F: Running REL1_39 branch commands for [[phab:T313920|T313920]]
* 09:15 elukey: cherry-pick of https://gerrit.wikimedia.org/r/#/c/323517 to deployment-puppetmaster02 to test
* 00:20 Krinkle: Prune various old mediawiki/core wmf branches for Gerrit usability, ref [[phab:T303828|T303828]]


== 2016-11-26 ==
== 2022-09-02 ==
* 16:15 Reedy: killed /srv/jenkins-workspace/workspace/mediawiki-core-*/src and /srv/jenkins-workspace/workspace/mwext-*/src from integration slaves to get rid of borked MW dirs
* 15:59 zabe: added vwalters as member of the deployment-prep project [[phab:T316943|T316943]]
* 15:51 Reedy: deleted /srv/jenkins-workspace/workspace/mediawiki-core-code-coverage/src on integration-slave-trusty-1006 to force a reclone
* 13:40 Krinkle: " ENOENT: no such file or directory, lstat " failing quibble jobs on integration-agent-docker-1024
* 14:14 Reedy: moved old /srv/mediawiki-staging/php-master to /tmp/php-master, recloned MW Core, copied in LocalSettings, skins, vendor and extensions. [[phab:T151676|T151676]]. scap sync-dir running
* 13:05 Reedy: marked deployment-tin as offline due to [[phab:T151670|T151670]]


== 2016-11-24 ==
== 2022-09-01 ==
* 20:49 hashar: make contint1001 Jenkins slave only build jobs with a label matching the node  https://integration.wikimedia.org/ci/computer/contint1001/configure [[phab:T86659|T86659]]
* 09:36 zabe: shutoff deployment-urldownloader02
* 15:46 elukey: removing https://gerrit.wikimedia.org/r/#/c/322268/ from the list of cherry picks on puppet master since it is not the right way to go
* 07:43 hashar: Updating Jenkins jobs for Quibble 1.4.5 > 1.4.6  + php 7.4 update {{!}} [[phab:T305525|T305525]] {{!}} [[phab:T314586|T314586]] {{!}} [[phab:T316601|T316601]] {{!}} https://gerrit.wikimedia.org/r/c/integration/config/+/828611
* 08:58 elukey: rebased puppet operations git repo on  deployment-puppetmaster to refresh https://gerrit.wikimedia.org/r/#/c/322268/


== 2016-11-23 ==
== 2022-08-31 ==
* 15:04 Krenair: fixed puppet on deployment-cache-text04 by manually enabling experimental apt repo, see [[phab:T150660|T150660]]
* 23:41 zabe: deleted shutoff deployment-restbase03
* 10:57 hashar: Terminating deployment-apertium01 again [[phab:T147210|T147210]]
* 16:39 Daimona: Applying schema change to the wikishared DB on beta for the CampaignEvents extension # [[phab:T308738|T308738]]
* 16:37 Daimona: Applying schema change to the wikishared DB on beta for the CampaignEvents extension (2/2) # [[phab:T312870|T312870]]
* 16:21 Daimona: Applying schema change to the wikishared DB on beta for the CampaignEvents extension (1/2) # [[phab:T312870|T312870]]
* 16:18 hashar: Tag Quibble 1.4.6 @ {{Gerrit|8828487d0}} # [[phab:T305525|T305525]] [[phab:T314586|T314586]]
* 15:27 James_F: Docker: Building and publishing quibble-fresnel image based on php74 for [[phab:T316525|T316525]]


== 2016-11-22 ==
== 2022-08-30 ==
* 19:31 hashar: beta: rebased puppet master
* 16:00 James_F: Zuul: [labs/tools/heritage] Switch postmerge job to tox-py37-coverage-publish for [[phab:T316627|T316627]]
* 19:30 hashar: beta: dropping cherry pick for the PDF render by mobrovac ( https://gerrit.wikimedia.org/r/#/c/305256/ ). Got merged
* 09:32 hashar: doc: on doc1002: `sudo -u doc-uploader rm -fR /srv/doc/mw-tools-scap/` That got moved to `/srv/doc` and a redirect has been set. # [[phab:T315541|T315541]]
* {{SAL entry|1=08:29 hashar: Deleting shut off instances: integration-puppetmaster , deployment-puppetmaster , deployment-pdf02 , deployment-conftool  - T150339}}


== 2016-11-21 ==
== 2022-08-29 ==
* {{SAL entry|1=12:46 hashar: beta: Cherry picked puppet fix for udp2log https://gerrit.wikimedia.org/r/#/c/322639/ T151169}}
* 21:25 inflatador: ES6->7 upgrade in beta-cluster [[phab:T315604|T315604]]
* 13:39 hashar: Reloaded Zuul for https://gerrit.wikimedia.org/r/826985
* 12:34 James_F: Zuul: [mediawiki/core] Don't run phan on PHP 7.4, it doesn't pass; for [[phab:T316518|T316518]]
* 12:08 James_F: Zuul: Enable PHP74 jobs on gate-and-submit-wmf pipeline [Re-re-try] for [[phab:T293924|T293924]]
* 11:28 James_F: Zuul: [mediawiki/core] Make PHP 8.1 voting on REL1_38 and REL1_39 for [[phab:T316080|T316080]]
* 09:00 hashar: Updated Jenkins job mediawiki-quibble-composer-mysql-php80-docker to capture core dumps using https://gerrit.wikimedia.org/r/c/integration/config/+/496392  # [[phab:T315167|T315167]]
* 00:28 Reedy: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/827015


== 2016-11-19 ==
== 2022-08-25 ==
* {{SAL entry|1=00:10 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/322370}}
* 15:53 dancy: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/826593
* 07:37 James_F: Zuul: [mediawiki/extensions/UnlinkedWikibase] Add quibble for [[phab:T316183|T316183]]


== 2016-11-18 ==
== 2022-08-23 ==
* {{SAL entry|1=15:42 elukey: cherry picked https://gerrit.wikimedia.org/r/#/c/322268 on puppet master}}
* 17:40 hashar: Stopping Gerrit
* 11:54 hashar: Manually applied a `docker-pkg` fix on contint2001 to prevent it from downloading unrelated images [[phab:T310458|T310458]]


== 2016-11-17 ==
== 2022-08-22 ==
* {{SAL entry|1=22:07 mutante: re-enabled puppet on contint1001 after live Apache fix}}
* 07:17 taavi: trying to disconnect jenkins from gearman and then re-connect to see if it helps with [[phab:T315818|T315818]]
* {{SAL entry|1=11:34 hasharLunch: Deleted instance deployment-apertium01 . Was Trusty and lacked packages, replaced by a Jessie one ages ago. T147210}}
* 07:12 taavi: restart zuul-merger on contint2001 [[phab:T315818|T315818]]


== 2016-11-16 ==
== 2022-08-21 ==
* {{SAL entry|1=20:53 elukey: restored apache2 config on deployment-mediawiki06}}
* 13:07 Reedy: looks like various CI jobs (coverage etc) have been stuck for about 8.5 hours
* {{SAL entry|1=20:28 elukey: temporary increasing verbosity of mod_rewrite on deployment-mediawiki06 as test}}
* 13:00 Reedy: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/824862
* {{SAL entry|1=20:02 Krenair: mysql master back up, root identity is now unix socket based rather than password}}
* {{SAL entry|1=19:57 Krenair: taking mysql master down to fix perms}}
* {{SAL entry|1=13:02 hashar: Restarted HHVM on deployment-mediawiki05 was not honoring requests T150849}}
* {{SAL entry|1=12:24 hashar: beta: created dewiktionary table on the Database slave. Restarted replication with START SLAVE;    T150834  T150764}}
* {{SAL entry|1=10:39 hashar: Removing revert b47ce21cec3a4340dd37c773210a514350f10297 from deployment-tin and reenabling jenkins job.  https://gerrit.wikimedia.org/r/321857 will get it fixed}}
* {{SAL entry|1=10:26 hashar: Reverting mediawiki/core b47ce21cec3a4340dd37c773210a514350f10297 on beta cluster T150833}}
* {{SAL entry|1=09:51 hashar: marking deployment-tin offline so I can live hack mediawiki code / scap for T150833 and T15034}}
* {{SAL entry|1=09:12 hashar: deployment-mediawiki04 stopping hhvm}}
* {{SAL entry|1=08:59 hashar: beta database update broken with: MediaWiki 1.29.0-alpha Updater\n\nYour composer.lock file is up to date with current dependencies!}}
* {{SAL entry|1=07:52 Krenair: the new mysql root password for -db04 is at /tmp/newmysqlpass as well as in a new file in the puppetmaster's labs/private.git}}
* {{SAL entry|1=06:34 twentyafterfour: restarting hhvm on deployment-mediawiki04}}
* {{SAL entry|1=06:33 Amir1: ladsgroup@deployment-mediawiki05:~$ sudo service hhvm restart}}
* {{SAL entry|1=06:30 mutante: restarting hhvm on deployment-mediawiki06}}


== 2016-11-15 ==
== 2022-08-19 ==
* {{SAL entry|1=16:03 hasharAway: adding thcipriani to the labs "git" project maintained by paladox}}
* 23:04 TheresNoTime: resized deployment-mwlog01's /srv volume, restarted
* 22:57 TheresNoTime: shutting down deployment-mwlog01 for [[phab:T315707|T315707]]
* 15:50 James_F: Docker: Build stalled out for 30 minutes; terminated and re-started.
* 15:15 dancy: Upgrading scap to latest code revision in beta cluster
* 15:11 James_F: Docker: Building and publishing images with PHP 8.0.22 for [[phab:T315167|T315167]]


== 2016-11-14 ==
== 2022-08-18 ==
* {{SAL entry|1=08:16 Amir1: cherry-picking 321096/3 in beta puppetmaster}}
* 17:01 hashar: Restarted zuul-merger on contint1001 # [[phab:T315586|T315586]]
* 16:42 hashar: Reloaded Zuul for {{Gerrit|Ie83b19699a8526bf67f5610a0aa89dcedc0e3979}}
* 13:14 awight: [beta] Deploying new kartotherian version


== 2016-11-12 ==
== 2022-08-17 ==
* 14:02 Amir1: cherry-picked gerrit change 321096/2 in puppetmaster
* 14:18 zabe: fix merge conflicts in deployment-prep private repo # [[phab:T315394|T315394]]
* 10:27 hashar: Built image docker-registry.discovery.wmnet/releng/commit-message-validator:1.0.0  # [[phab:T315159|T315159]]


== 2016-11-11 ==
== 2022-08-16 ==
* 23:48 bd808: Updated _template/logstash on deployment-logstash2 to include change from https://gerrit.wikimedia.org/r/#/c/320441/
* 20:51 RhinosF1: beta: is down see wikitech-l and https://phabricator.wikimedia.org/T315350
* 23:44 bd808: Cherry-picked https://gerrit.wikimedia.org/r/#/c/320441/ for testing on deployment-logstash2
* 20:30 hashar: Repooled integration-agent-docker-1028 , it was mysteriously unreachable [[phab:T315372|T315372]]
* 21:27 hashar: deployment-tin  deleted /var/lock/scap . Was left over after beta-scap-eqiad job got abruptly aborted
* 19:18 Krinkle: mediawiki/extensions/EventLogging$ git remote-wildcard-br-d 'wmf/1.35*' 'wmf/1.36*'  'wmf/1.37*' 'wmf/1.38*'
* 19:17 Krinkle: mediawiki/extensions/Scribunto$ git remote-wildcard-br-d 'wmf/1.35*' # ref [[phab:T303828|T303828]]
* 19:16 TheresNoTime: manually running `/usr/local/bin/wmf-beta-update-databases.py` on `deployment-deploy03`
* 17:16 TheresNoTime: soft-rebooting deployment-mediawiki12


== 2016-11-10 ==
== 2022-08-12 ==
* 09:33 hashar: Image ci-jessie-wikimedia-1478770026 in wmflabs-eqiad is ready
* 17:47 dancy: Restarting zuul
* 09:26 hashar: Regenerate Nodepool base image for Jessie and refreshing snapshot image
* 17:42 dancy: Restarting Jenkins in an attempt to get CI jobs running again
* 00:54 ori: On deployment-cache-<nowiki>{</nowiki>text,upload<nowiki>}</nowiki>06, ran: touch /srv/trafficserver/tls/etc/ssl_multicert.config && systemctl reload trafficserver-tls.service . Certificate was close to expiry


== 2016-11-09 ==
== 2022-08-11 ==
* 20:27 Krenair: removed default SSH access from production host 208.80.154.135, the old gallium IP
* 21:11 mutante: restarted phd service on phab2001
* 16:34 Reedy: deployment-tin no longer offline, jenkins running jobs now
* 19:12 brennen: Updating development images on contint primary for https://gitlab.wikimedia.org/repos/releng/dev-images/-/merge_requests/16
* 16:11 Reedy: marking deployment-tin.eqiad  as offline to test -labs -> beta config rename
* 12:26 jnuche: Reenabled CI beta sync jobs after cluster incident
* 11:48 jnuche: Temporarily disabled CI beta sync jobs until issue in cluster is resolved
* 10:25 zabe: take deployment-prep out of read-only mode


== 2016-11-08 ==
== 2022-08-10 ==
* 10:23 hashar: refreshing all jenkins jobs to clear out potential live hack I made but can't remember on which jobs I did
* 11:36 jnuche: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/822052


== 2016-11-07 ==
== 2022-08-09 ==
* 14:01 gilles: Pointing deployment-imagescaler01.eqiad.wmflabs' puppet to puppetmaster.thumbor.eqiad.wmflabs
* 22:11 James_F: Docker: Building and publishing quibble-buster-php74-coverage for PHP7.4+ coverage
* 21:56 James_F: Two failures in devimage build: releng/eventlogging and releng/buster-swift53 – nothing new from me, looks like they've been broken for a bit?
* 21:17 James_F: Updating development images on contint primary for https://gitlab.wikimedia.org/repos/releng/dev-images/-/merge_requests/17
* 21:07 James_F: Zuul: Enable PHP74 jobs on gate-and-submit-wmf pipeline [Re-try] for [[phab:T293924|T293924]]
* 19:42 James_F: Docker: Re-build and publish quibble-buster-php74 based on Wikimedia PHP not sury-php for [[phab:T293851|T293851]]


== 2016-11-04 ==
== 2022-08-08 ==
* 13:20 hashar: gerrit: created mediawiki/extensions/PageViewInfo.git  and renamed user group extension-WikimediaPageViewInfo to extension-PageViewInfo T148775
* 15:56 taavi: gerrit: used `ssh gerrit.wikimedia.org -p 29418 gerrit close-connection` to disconnect four of sgimeno's stuck sessions
* 12:57 hashar: Image ci-jessie-wikimedia-1478263647 in wmflabs-eqiad is ready (bring in java for maven projects)
* 14:43 James_F: jforrester@doc1002:~$ sudo -u doc-uploader rm -rf /srv/doc/wikibase-vuejs-components/ for [[phab:T309872|T309872]]
* 12:49 dcausse: deployment-prep reloading nginx on deployment-elastic0[5-7] to fix ssl cert issue
* 13:23 James_F: Zuul: [mediawiki/libs/metrics-platform] Run Java jobs on maven file paths for [[phab:T314630|T314630]]
* 09:28 hashar: Delete integration-slave-jessie-1003 , only have a few jobs running on permanent Jessie slaves - T148183
* 10:28 jnuche: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/821166
* 09:26 hashar: Delete zuul-dev-jessie.integration.eqiad.wmflabs  was for testing Zuul on Jessie and it works just fine on contint1001 :] T148183
* 09:25 hashar: Delete integration-slave-trusty-1012 one less permanent slave since some load has been moved to Nodepool  T148183
* 09:24 hashar: Delete integration-slave-trusty-1016 not pooled in Jenkins anymore T148183


== 2016-11-03 ==
== 2022-08-05 ==
* 15:05 Amir1: deploy 0caa589 in ores to deployment-sca03
* 16:02 James_F: Docker: Building and publishing composer-security-check:1.1.1 for [[phab:T296967|T296967]]
* 14:52 Amir1: deploying ores 0caa589 in deployment-sca03
* 15:40 James_F: Zuul: [mediawiki/services/function-*] Switch coverage to node16
* 11:32 hashar: deployment-apertium01 manually cleared puppet.conf
* 15:33 James_F: Zuul: [mediawiki/libs/metrics-platform] Add experimental regular java jobs for [[phab:T314630|T314630]]
* 11:29 hashar: deployment-apertium01 fails puppet due to wrong certificate bah
* 14:48 James_F: Zuul: Add WelpThatWorked to allow list
* 07:22 Krenair: fiddled with jenkins jobs in mediawiki-core-doxygen-publish to try to get stuff moving in the postmerge queue again
* 14:48 James_F: Zuul: [mediawiki/extensions/MenuEditor] BlueSpiceDiscovery dependency is a skin
* 05:04 Krenair: beginning to move the rest of beta to the new puppetmaster
* 01:53 mutante: followed instructions at https://www.mediawiki.org/wiki/Continuous_integration/Zuul#Gearman_deadlock
* 01:53 mutante: disabling and re-enabling gearman, zuul is not working and could be gearman deadlock


== 2016-11-02 ==
== 2022-08-04 ==
* 22:06 hashar: hello stashbot
* 15:21 dancy: Deleting beta-mediawiki-config-update-eqiad job
* 18:51 Krenair: armed keyholder on -tin and -mira
* 15:16 dancy: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/820405
* 18:50 Krenair: started mysql on -db boxes to bring beta back online
* 10:01 TheresNoTime: clearing out stuck beta deployment jobs [[phab:T314378|T314378]] [[phab:T72597|T72597]]
* 10:54 hashar: Image ci-jessie-wikimedia-1478083637 in wmflabs-eqiad is ready
* 10:47 hashar: Force refresh Nodepool snapshot for Jessie  so it gets doxygen included T119140


== 2016-11-01 ==
== 2022-08-03 ==
* 22:22 Krenair: started mysql on -db03 to hopefully pull us out of read-only mode
* 21:05 James_F: Zuul: Doing a graceful restart to see if this clears the fork-bombed CI jobs.
* 22:21 Krenair: started mysql on -db04
* 20:13 taavi: reloading zuul for https://gerrit.wikimedia.org/r/820212
* 22:19 Krenair: stopped and started udp2log-mw on -fluorine02
* 17:44 Reedy: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/820171
* 22:10 hashar: Armed keyholder on deployment-tin . Instance had 20 minutes uptime and apparently keyholder does not self arm
* 14:57 brennen: gitlab: flipping admin bit for bd808 for API testing purposes
* 22:00 Krenair: started moving nodes back to the new puppetmaster
* 14:11 James_F: Zuul: [wikimedia/vuejs-components] Mark as archived for [[phab:T309872|T309872]]
* 02:55 Krenair: Managed to mess up the deployment-puppetmaster02 cert, had to move those nodes back
* 12:00 James_F: Ran `zuul-test-repo design/codex postmerge` on contint2001 to finally run coverage for Codex
* 11:58 James_F: Zuul: Run publish jobs on branches called 'main' too


== 2016-10-31 ==
== 2022-08-02 ==
* 20:57 Krenair: moving some nodes to deployment-puppetmaster02
* 19:26 James_F: Zuul: [design/codex] Switch coverage job back to -direct
* 16:57 bd808: Added Niharika29 as project member
* 15:23 dancy: Deleted beta-build-scap-deb and beta-publish-deb Jenkins jobs. (https://gerrit.wikimedia.org/r/c/integration/config/+/819028)
* 15:22 dancy: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/819028
* 07:55 TheresNoTime: cleared stuck beta deployment jobs [[phab:T72597|T72597]]


== 2016-10-27 ==
== 2022-08-01 ==
* 20:51 hashar: reboot integration-puppetmaster01
* 23:16 James_F: Zuul: [design/codex] Switch to node16
* 18:50 bd808: stashbot has replaced qa-morebots in this channel as the sole bot handling !log messages
* 23:16 James_F: 16:15:59 <+wikibugs> (Merged) jenkins-bot: Zuul: [design/codex] Switch to node16 [integration/config] - https://gerrit.wikimedia.org/r/819185 (owner: Jforrester)
* 18:46 bd808: Testing dual page wiki logging by stashbot. (check #3)
* 22:53 TheresNoTime: remove stuck beta deployment jobs
* 18:36 bd808: !log deployment-prep Testing dual page wiki logging by stashbot. (second attempt)
* 22:51 dduvall: re-armed keyholder on deploy-1004.devtools following reboot
* 18:14 bd808: !log deployment-prep Testing dual page wiki logging by stashbot.
* 22:50 James_F: Zuul: Don't use browser-direct-coverage where browser-coverage will do
* 10:30 hashar: integration: on Trusty slaves, remove jenkins-deploy from KVM which is only needed for Android testing for T149294: salt -v '*slave-trusty*' cmd.run 'deluser jenkins-deploy kvm'
* 22:49 dduvall: modified `deployment_hosts` puppet config for devtools project to allow deployments from `deploy-1004`
* 10:29 hashar: integration: on Trusty slaves, remove jenkins-deploy from KVM which is only needed for Android testing:  salt -v '*slave-trusty*' cmd.run 'groupdeluser jenkins-deploy kvm'
* 22:24 dduvall: armed keyholder with phabricator key on deploy-1004.devtools
* 10:25 hashar: integration: purge Android packages from Trusty slaves for T149294 : salt -v '*slave-trusty*' cmd.run 'apt-get --yes remove --purge gcc-multilib lib32z1 lib32stdc++6 qemu'
* 22:11 dduvall: setting puppetmaster to project standalone for deploy-1004.devtools
* 21:01 James_F: Zuul: [mediawiki/extensions/Phonos] Add comment about deployment timing for [[phab:T314306|T314306]]
* 21:00 James_F: Zuul: [mediawiki/extensions/BlueSpiceCustomMenu] Add MenuEditor dependency
* 15:53 taavi: reloading zuul for https://gerrit.wikimedia.org/r/819097
* 09:14 TheresNoTime: clearing stuck beta CI jobs


== 2016-10-25 ==
== 2022-07-29 ==
* 19:21 hasharAway: Python PyPi mirror has some issue. Impacts all CI jobs relying on tox  https://status.python.org/
* 22:16 James_F: Zuul: Configure CI for the forthcoming REL1_39 branches for [[phab:T313919|T313919]]
* 10:39 elukey: cherry picked https://gerrit.wikimedia.org/r/#/c/314519/ and https://gerrit.wikimedia.org/r/#/c/306943/ to deployment-puppetmaster
* 18:00 brennen: using standalone puppetmaster in devtools to test phabricator scap3 changes


== 2016-10-24 ==
== 2022-07-28 ==
* 16:19 andrewbogott: upgrading deployment-puppetmaster to puppet 3.8.5 packages
* 17:54 brennen: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/818189/
* 09:14 hashar: rebasing integration puppet master


== 2016-10-21 ==
== 2022-07-27 ==
* 09:42 gehel: decommission of deployment-elastic08 - T147777
* 13:55 James_F: Zuul: [mediawiki/core] Add a non-vendor php80 job for main branch [[phab:T300463|T300463]]
* 13:08 James_F: Zuul: [mediawiki/core] Make php80 voting on REL1_38 for [[phab:T274965|T274965]]
* 13:04 James_F: Zuul: Add php81 experimental job everywhere we have php80
* 12:39 James_F: Zuul: [mediawiki/extensions/WikibaseLexeme] Add WikibaseLexemeCirrusSearch dep
* 03:48 Krinkle: Click "Disable publishing" for a dozen repos created recently, including OAuthRateLimiter, ref [[phab:T143162|T143162]], [[phab:T193565|T193565]]


== 2016-10-20 ==
== 2022-07-25 ==
* 23:37 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/317083
* 22:16 dduvall: re-enabled puppet on untrusted runners following testing of https://gerrit.wikimedia.org/r/c/operations/puppet/+/815769
* 20:53 legoktm: deploying https://gerrit.wikimedia.org/r/317022
* 21:25 dduvall: disabling puppet on untrusted gitlab-runners to test deployment of https://gerrit.wikimedia.org/r/c/operations/puppet/+/815769


== 2016-10-14 ==
== 2022-07-23 ==
* 21:13 matt_flaschen: Ran START SLAVE to restart replication after columns created directly on replica were deleted.
* 17:43 Reedy: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/816251
* 20:53 bd808: Dropped lu_local_id, lu_global_id from replica db which were added improperly
* 20:37 matt_flaschen: Applied CentralAuth's patch-lu_local_id.sql migration for T148111, to sql --write
* 20:09 bd808: Applied CentralAuth's patch-lu_local_id.sql migration for T148111
* 11:30 dcausse: deployment-prep running sudo update-ca-certificates --fresh on deployment-ton to fix curl error code 60 in cirrus maint script (T145609)


== 2016-10-13 ==
== 2022-07-21 ==
* 21:21 hashar: Deleted CI slaves integration-slave-jessie-1004 integration-slave-jessie-1005 integration-slave-trusty-1013 integration-slave-trusty-1014 integration-slave-trusty-1017 integration-slave-trusty-1018
* 21:55 dancy: Upgrading scap to 4.11.2-1+0~20220720160115.349~1.gbpd4a6cb in beta cluster
* 20:12 hashar: Switching composer-hhvm / composer-php55 to Nodepool  https://gerrit.wikimedia.org/r/#/c/306727/  T143938
* 16:23 gilles: Resetting to 61a9cd1f47c5aec8ded92f2486ce43309b9e3e03 on deployment-puppetmaster
* 16:06 godog: add settings to duplicate traffic to thumbor in beta and restart swift-proxy
* 16:03 gilles: Cherry-picking https://gerrit.wikimedia.org/r/#/c/315648/ on deployment-puppetmaster
* 15:35 gilles: Resetting to 61a9cd1f47c5aec8ded92f2486ce43309b9e3e03 on deployment-puppetmaster
* 14:38 gilles: Cherry-picking https://gerrit.wikimedia.org/r/#/c/315234/5 on deployment-puppetmaster
* 14:34 gilles: Resetting to 61a9cd1f47c5aec8ded92f2486ce43309b9e3e03 on deployment-puppetmaster
* 14:32 gilles: Cherry-picking https://gerrit.wikimedia.org/r/#/c/315234/4 on deployment-puppetmaster
* 14:32 gilles: Resetting to 61a9cd1f47c5aec8ded92f2486ce43309b9e3e03 on deployment-puppetmaster
* 14:27 gilles: Cherry-picking https://gerrit.wikimedia.org/r/#/c/315234/ on deployment-puppetmaster
* 14:22 gilles: Resetting to 61a9cd1f47c5aec8ded92f2486ce43309b9e3e03 on deployment-puppetmaster
* 13:42 gilles: Cherry picking https://gerrit.wikimedia.org/r/#/c/315248/ on deployment-puppetmaster


== 2016-10-12 ==
== 2022-07-20 ==
* 13:37 elukey: upgraded memcached on deployment-memc04 to 1.4.28-1.1+wmf1 as part of a perf experiment (T129963) - rollback: wipe https://wikitech.wikimedia.org/wiki/Hiera:Deployment-prep/host/deployment-memc04, apt-get remove memcached on deployment-memc04, puppet run
* 15:43 dancy: Upgrading scap to 4.11.1-1+0~20220720154238.348~1.gbp94de82 in beta cluster
* 13:19 James_F: Zuul: [mediawiki/extensions/VueTest] Add extension-codehealth pipeline


== 2016-10-11 ==
== 2022-07-19 ==
* 21:35 hasharAway: Force pushed Zuul patchqueue  5628f95...fc6a118 HEAD -> patch-queue/debian/precise-wikimedia
* 17:40 dancy: Upgrading scap to 4.11.0-1+0~20220719173732.346~1.gbpe07bc9 in beta cluster
* 14:37 hashar: Mysql was down on Precise slaves. Apparently rebooted 17 days ago and I guess mysql does not spawn on boot. Restarted mysql on all Precise via: salt -v '*slave-precise*' cmd.run 'start mysql'
* 17:00 urbanecm: deployment-prep: urbanecm@deployment-mwmaint02:~$ mwscript extensions/GrowthExperiments/maintenance/migrateWikitextMentorList.php --wiki=arwiki # [[phab:T310905|T310905]]
* 09:35 godog: reboot deployment-imagescaler01 to enable memory cgroup
* 08:29 hashar: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/#/c/313387/ Filter out refs/meta/config from all pipelines  T52389


== 2016-10-10 ==
== 2022-07-18 ==
* 15:45 dcausse: deployment-prep deployment-elastic0[5-8]: reduce the number of replicas to 1 max for all indices
* 19:43 dancy: Upgrading scap to 4.10.0-1+0~20220718175214.344~1.gbpe518a1 in beta cluster
* 13:40 Lucas_WMDE: lucaswerkmeister-wmde@deployment-deploy03:~$ sql wikishared --write < /srv/mediawiki-staging/php-master/extensions/CampaignEvents/db_patches/mysql/tables-generated.sql # [[phab:T311752|T311752]]
* 10:40 hashar: Refreshing Jenkins jobs for https://gerrit.wikimedia.org/r/814745
* 09:58 hashar: Refreshing Jenkins jobs for https://gerrit.wikimedia.org/r/c/integration/config/+/814730 jjb: update php jobs to have php-pcov included
* 09:46 hashar: Reloaded Zuul for https://gerrit.wikimedia.org/r/814728


== 2016-10-07 ==
== 2022-07-17 ==
* 20:10 hashar: Created repository.integration.eqiad.wmflabs to play/Test Sonatype Nexus
* 13:00 taavi: reloading zuul for https://gerrit.wikimedia.org/r/814356
* 20:10 hashar: rebooting integration-puppetmaster01
* 07:55 hashar: Upgrading Nodepool image for Jessie


== 2016-10-06 ==
== 2022-07-16 ==
* 14:45 hashar: deployment-mira disarmed/rearmed keyholder in an attempt to clear a Shinken alarm
* 00:10 mutante: doc1002 - sudo systemctl start rsync-doc-doc2001.codfw.wmnet - Icinga alerted after an 'rsync warning: some files vanished before they could be transferred (code 24)' - but all is ok on next attempt
* 12:16 hashar: Jenkins slave deployment-tin.eqiad , removing label "deployment-tin.eqiad"  it has "BetaClusterBastion" and all jobs are bound to it already


== 2016-10-05 ==
== 2022-07-15 ==
* 19:33 andrewbogott: removing mediawiki::conftool from deployment-mediawiki04, deployment-mediawiki06, deployment-mediawiki05
* 15:59 hashar: Built pcov php docker images [[phab:T280170|T280170]]
* 15:46 hashar: contint2001: `docker-system-prune-dangling.service`: it failed overnight because Docker was not running. That should clear Icinga state # [[phab:T313119|T313119]]
* 14:05 James_F: Zuul: [mediawiki/tools/wikilambda-cli] Switch to node16 jobs
* 13:05 James_F: Docker: Building node16 images for CI for [[phab:T313075|T313075]], this time actually.
* 12:30 hashar: Starting docker on contint2001.wikimedia.org # [[phab:T313119|T313119]]
* 12:20 hashar: rebuilding `php??` images for pcov https://gerrit.wikimedia.org/r/c/integration/config/+/694621 # [[phab:T280170|T280170]]
* 10:55 hashar: Reloaded Zuul for https://gerrit.wikimedia.org/r/813967
* 10:49 hashar: Reloaded Zuul for https://gerrit.wikimedia.org/r/813932


== 2016-10-04 ==
== 2022-07-14 ==
* 19:43 andrewbogott: removed contint::slave_scripts and associated files from deployment-sca01 and  deployment-sca02
* 18:50 James_F: Docker: Building node16 images for CI for [[phab:T313075|T313075]]
* 16:22 bd808: Restarted puppetmaster process on deployment-puppetmaster
* 14:52 James_F: Zuul: [mediawiki/skins/BlueSpiceSkin] Archive for [[phab:T203215|T203215]]
* 16:20 bd808: deployment-puppetmaster: removing cherry-pick of https://gerrit.wikimedia.org/r/#/c/305256/; conflicts with upstream changes
* 14:48 James_F: Zuul: [mediawiki/extensions/BlueSpiceExtensions] Archive
* 15:01 godog: shutdown deployment-poolcounter02, replaced by deployment-poolcounter04 - T123734
* 14:42 James_F: Zuul: [mediawiki/extensions/BlueSpiceBookshelfUI] Archive for [[phab:T268085|T268085]]
* 09:03 hashar: Regenerating configuration of all Jenkins job due to https://gerrit.wikimedia.org/r/#/c/313306/
* 14:38 James_F: Zuul: [mediawiki/tools/wikilambda-cli] Install node14 CI
* 01:14 twentyafterfour: New scap command line autocompletions are now installed on deployment-tin and deployment-mira refs T142880


== 2016-10-03 ==
== 2022-07-13 ==
* 22:40 thcipriani: manual rebase on deployment-puppetmaster:/var/lib/git/operations/puppet
* 23:23 Reedy: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/813720
* 22:05 thcipriani: reapplied beta::deployaccess to mediawiki servers
* 20:31 Reedy: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/813707
* 21:42 cscott: updated OCG to version 0bf27e3452dfdc770317f15793e93e6e89c7865a
* 21:36 cscott: starting OCG deploy
* 13:43 hashar: Added integration-slave-trusty-1014  back in the pool
* 13:41 hashar: Tip of the day: to reboot an instance and bypass molly-guard: /sbin/reboot
* 13:39 hashar: integration-slave-trusty-1014  upgrading packages, clean up and rebooting it
* 13:37 hashar: marked integration-slave-trusty-1014 offline. Can't run jobs / gets stuck somehow
* 10:21 godog: add role::prometheus::node_exporter to classes in hiera:deployment-prep T144502


== 2016-10-01 ==
== 2022-07-12 ==
* 09:41 hashar: beta: shutdown deployment-db1 and deployment-db2 . Databases have been migrated to other hosts T138778
* 17:29 Amir1: dropping tl_namespace and tl_title from templatelinks in fawiki ([[phab:T312865|T312865]])


== 2016-09-29 ==
== 2022-07-11 ==
* 15:43 hashar: logstash-beta: refreshed the field list via https://logstash-beta.wmflabs.org/app/kibana#/settings/indices/logstash
* 22:55 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/812934
* 13:52 hashar: Restarted jobrunner / jobchron on deployment-jobrunner02. They were no longer logging to /var/log/mediawiki/ somehow
* 19:46 hashar: Reloaded Zuul for https://gerrit.wikimedia.org/r/c/integration/config/+/812467
* 13:51 hashar: Restarted udp2log on deployment-fluorine02
* 10:50 legoktm: deploying https://gerrit.wikimedia.org/r/313384
* 10:37 hashar: Jenkins upgrade AnsiColor plugin from 0.3.1 to 0.4.2
* 10:28 hashar: Upgrading Jenkins plugins with zeljkof :]
* 08:59 hashar: Hopefully going to get beta fixed via mw/core revert patch https://gerrit.wikimedia.org/r/313373


== 2016-09-28 ==
== 2022-07-10 ==
* 23:56 MaxSem: Deleted varnish cache files on deployment-cache-upload04 to free up space, disk full
* 00:07 Krinkle: krinkle@mediawiki12$ sudo enable-puppet
* 21:48 hasharAway: deployment-tin:  service nscd  restart
* 21:43 hasharAway: beta cluster update database is broken :/  Filed T146947 about it
* 21:25 hasharAway: deployment-tin: sudo -H -u www-data php5 /srv/mediawiki-staging/multiversion/MWScript.php update.php --wiki=commonswiki --quick
* 21:18 hasharAway: https://integration.wikimedia.org/ci/view/Beta/job/beta-update-databases-eqiad/ is broken for unknown reason :(
* 20:48 hasharAway: Deleted deployment-tin02 via Horizon. Replaced by deployment-tin
* 20:19 hasharAway: restarted keyholder on deployment-tin
* 20:11 hasharAway: Switch Jenkins slave deployment-mira.eqiad to deployment-tin.eqiad
* 20:09 hasharAway: deployment-tin: keyholder arm
* 20:08 hasharAway: deployment-tin for instance in `grep deployment /etc/dsh/group/mediawiki-installation`; do ssh-keyscan `dig +short $instance` >> /etc/ssh/ssh_known_hosts; done;
* 19:49 hasharAway: Dropping deployment-tin02 , replacing it with deployment-tin which has been rebuilt on Jessie T144006
* 12:44 hashar: Can't finish up the switch to deployment-tin,  puppet still does not pass due to weird clone issues ...
* 11:48 hashar: Deleting deployment-tin Trusty instance and recreate one with same hostname as Jessie; Meant to replace deployment-tin02  T144006
* 10:44 hashar: CI updating all mwext-Wikibase* jenkins jobs for https://gerrit.wikimedia.org/r/#/c/313056/  T142158
* 10:43 hashar: Updating slave scripts for "Disable garbage collection for mw-phpunit.sh"  https://gerrit.wikimedia.org/r/313051  T142158
* 08:31 hashar: Reloading Zuul to deploy dc2ada37


== 2016-09-27 ==
== 2022-07-09 ==
* 20:11 hashar: Reloading Zuul to deploy 3c3289aa1a  for T143938 and T146783
* 20:39 ori: ori@deployment-mediawiki12:~$ sudo apt install php-tideways-xhprof-dbgsym
* 16:29 anomie: Cherry-picked https://gerrit.wikimedia.org/r/#/c/313035/ on deployment-puppetmaster
* 17:25 ori: Cherry-picked {{Gerrit|Ief73cc553}} (varnish: use libvmod-querysort on Beta Cluster) on deployment-prep Puppetmaster. Can be reverted if there are any issues.
* 06:16 Krinkle: krinkle@mediawiki12$ sudo disable-puppet
* 06:08 ori: ori@deployment-mediawiki12: userdel systemd-coredump, followed by apt install systemd-coredump
* 05:50 Krinkle: krinkle@deployment-mediawiki-12$ sudo apt-get install systemd-coredump  # ref [[phab:T312689|T312689]]


== 2016-09-26 ==
== 2022-07-07 ==
* 23:58 bd808: Started udp2log-mw on deployment-fluorine02 for T146723
* 22:42 TheresNoTime: clear stuck beta deployment jobs (again), [[phab:T72597|T72597]]
* 11:35 hashar: deployment-salt02 : autoremoving a bunch of java related packages
* 21:10 TheresNoTime: clear stuck beta deployment jobs, [[phab:T72597|T72597]]
* 11:31 hashar: rebooting deployment-salt02  has a kernel soft lock while hitting the disk
* 16:47 urbanecm: deployment-prep: wikiadmin@172.16.3.206(enwiki)> delete from growthexperiments_mentor_mentee where gemm_mentor_id=93651; # testing a specific workflow in Special:MentorDashboard
* 11:24 hashar: beta: mass upgrading all debian packages on all instances
* 12:22 hashar: integration: rebooting `integration-agent-docker-1039` [[phab:T312534|T312534]]
* 10:32 hashar: beta: on deployment-pdf01 rm -fR /home/cscott/tmp/npm*
* 10:29 hashar:  deployment-pdf01 apt-get upgrade / cleaning files left over etc
* 10:28 hashar: beta: on deployment-pdf01 rm -fR /home/cscott/.npm/ T145343


== 2016-09-24 ==
== 2022-07-05 ==
* 20:08 hashar: deployment-tin is shutdown. Replaced by Jessie deployment-tin02
* 14:17 dwalden: restarted mathoid service on deployment-docker-mathoid01
* 20:02 hashar: deployment-mira: ssh-keyscan deployment-tin02.deployment-prep.eqiad.wmflabs >> /etc/ssh/ssh_known_hosts
* 11:39 hashar: Reloaded Zuul for `skip selenium for Wikibase repo/rest-api` https://gerrit.wikimedia.org/r/c/integration/config/+/811258
* 20:00 hashar: beta: dropping deployment-tin (ubuntu) replaced by deployment-tin02 (jessie). Primary is still deployment-mira (https://gerrit.wikimedia.org/r/#/c/312654/ T144578 )
* 08:49 hauskatze: Diffusion rORES repository. Changed URI settings: enabled SSH push for mirroring; disabled HTTP {{!}} [[phab:T311390|T311390]]


== 2016-09-23 ==
== 2022-06-30 ==
* 20:21 hashar: integration:  salt -v '*trusty*' cmd.run 'service mysql start'
* 22:02 TheresNoTime: unstuck beta-mediawiki-config-update-eqiad jobs, will comment at [[phab:T72597|T72597]]
* 20:00 hashar: rebooting all CI permanent slaves.  Making sure nothing is left on /mnt (which is no longer mounted)
* 21:05 TheresNoTime: cancelled beta-code-update-eqiad#398138 to make way for pending beta-scap-sync-world#57641, queued another beta-code-update-eqiad
* 19:53 hashar: added a 30 minutes build timeout to https://integration.wikimedia.org/ci/job/phabricator-jessie-diffs/
* 16:47 taavi: reloading zuul to deploy https://gerrit.wikimedia.org/r/810053
* 15:02 hashar: rebooting integration-slave-jessie-1001
* 14:04 hashar: remove the /mnt based tmpfs for T146381 /  https://gerrit.wikimedia.org/r/#/c/312518/ via: salt -v '*' cmd.run 'umount /mnt/home/jenkins-deploy/tmpfs'
* 13:41 hashar: Switching tmpfs from /mnt to /srv https://gerrit.wikimedia.org/r/#/c/312330/  and running fab deploy_slave_scripts


== 2016-09-22 ==
== 2022-06-29 ==
* 19:29 hasharAway: switching Jenkins slaves workspace from /mnt/jenkins-workspace to /srv/jenkins-workspace  (actually the same dir/inode on the filesystem)
* 14:48 ori: Clearing data from incomplete migration on Wikifunctionswiki via sql.php
* 01:52 legoktm: deploying https://gerrit.wikimedia.org/r/312158
* 13:39 TheresNoTime: clearing stuck beta deployment jobs, watching to ensure they catch up :')


== 2016-09-21 ==
== 2022-06-28 ==
* 18:22 yuvipanda: shutting down integration-puppetmaster
* 14:45 TheresNoTime: clear stuck beta deployment jobs, now running & will keep an eye
* 17:26 yuvipanda: cherry-pick https://gerrit.wikimedia.org/r/#/c/312044/ on deployment-puppetmaster
* 13:39 hashar: gerrit: added `Cindy-the-browser-test-bot` to the `Service Users` group https://gerrit.wikimedia.org/r/admin/groups/d39fe9cefd40ca1a07e372c0d7bd7e72ce2e4a2f,members {{!}} [[phab:T311370|T311370]]
* 16:41 hashar: deployment-tin02 initial provisioning is complete. Gotta add it as a deployment server via a puppet.git patch
* 09:37 hashar: phabricator: changed username of rORES Phab>Gerrit replication from `phab` to `phabricator` # [[phab:T311390|T311390]]
* 16:01 hashar: deployment-tin02 applied puppet classes beta::autoupdater, beta::deployaccess, role::deployment::server, role::labs::lvm::srv
* 15:32 hashar: spawned deployment-tin02
* 14:55 hashar: removed the CI puppet class from deployment-sca01 and deployment-sca02 .  Stopped services using /srv  , unmounted /srv, removed it from /etc/fstab
* 14:27 hashar: deployment-sca01 and deployment-sca02 are now broken.  The CI puppet class mounts /srv which ends up being only 500 MBytes
* 14:08 hashar: deployment-mira adding puppet class beta::autoupdater
* 14:06 hashar: Enabling Jenkins slave deployment-mira
* 14:05 hashar: deployment-mira seems ready for action and is the primary deployment server.  Enabling jenkins to it
* 11:25 hashar: removing Jenkins slave deployment-tin , deployment-mira is the new deployment master  T144578
* 10:58 hashar: Changing Jenkins slaves home dir for deployment-sca01 and deployment-sca02  from /mnt/home/jenkins-deploy to /srv/jenkins/home/jenkins-deploy
* 10:57 hashar: Changing Jenkins slaves home dir for deployment-tin and deployment-mira from /mnt/home/jenkins-deploy to /srv/jenkins/home/jenkins-deploy
* 10:10 hashar: deployment-mira removing "role::labs::lvm::srv"  duplicate with role::ci::slave::labs::common
* 10:07 hashar: Making deployment-mira a Jenkins slave by applying puppet class role::ci::slave::labs::common  T144578
* 10:05 hashar: Arming keyholder on deployment-mira
* 09:43 hashar: beta: switching master deployment server from deployment-tin to deployment-mira
* 09:34 hashar: From [[Hiera:deployment-prep]] remove bit already in puppet:  "scap::deployment_server": deployment-tin.deployment-prep.eqiad.wmflabs
* 08:55 moritzm: remove mira from deployment-prep (replaced by deployment-mira)
* 08:37 hashar: beta: manually rebased puppetmaster
* 08:11 elukey: terminated jobrunner01 and removed from deployment-prep's scap dsh list
* 07:19 legoktm: deploying https://gerrit.wikimedia.org/r/311927


== 2016-09-20 ==
== 2022-06-27 ==
* 21:49 hashar: Deleting deployment-mira02 /srv was too small. Replaced by deployment-mira
* 21:19 Reedy: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/809022
* 20:54 hashar: from deployment-tin for T144578, accept ssh host key of deployment-mira :  sudo -u jenkins-deploy -H SSH_AUTH_SOCK=/run/keyholder/proxy.sock ssh mwdeploy@deployment-mira.deployment-prep.eqiad.wmflabs
* 19:28 Reedy: Reloading Zuul to deploy https://phabricator.wikimedia.org/T308406
* 20:47 hashar: Creating deployment-mira instance with flavor c8.m8.s60 (8 cpu, 8G RAM and 60G disk) T144578
* 19:00 thcipriani: cherry-picked https://gerrit.wikimedia.org/r/#/c/311760/ to deployment-puppetmaster to fix failing beta-scap-eqiad job, had to manually start rsync, puppet failed to start
* 18:38 hashar: on tin: `sudo -u jenkins-deploy -H SSH_AUTH_SOCK=/run/keyholder/proxy.sock ssh mwdeploy@deployment-mira02.deployment-prep.eqiad.wmflabs`  - T144006
* 18:33 hashar: on deployment-mira02  ran `sudo -u jenkins-deploy -H SSH_AUTH_SOCK=/run/keyholder/proxy.sock ssh mwdeploy@deployment-mediawiki04.deployment-prep.eqiad.wmflabs` per T144006
* 18:01 marxarelli: deployed mediawiki-config changes on beta cluster. back in read/write mode using new database instances
* 17:37 marxarelli: deployment-db04 restored from backup and replication started
* 16:54 marxarelli: upgraded package and data to mariadb 10 on deployment-db03
* 16:31 marxarelli: cherry picking operations/puppet patches (T138778) to deployment-puppetmaster
* 16:30 moritzm: rebooting deployment-mira02
* 16:23 marxarelli: applied innodb transaction logs to deployment-db1 backup and successfully restored on deployment-db03
* 15:47 marxarelli: completed innobackupex on deployment-db1. copying backup to deployment-db03 for restoration
* 14:54 hashar: beta: cherry picking fix up for the jobrunner logging https://gerrit.wikimedia.org/r/#/c/311702/ and  https://gerrit.wikimedia.org/r/311719 T146040
* 14:44 marxarelli: entering read-only mode on beta cluster
* 14:27 elukey: stopped puppet, jobrunner and jobchron on deployment-jobrunner01
* 14:20 marxarelli: disabling beta cluster jenkins jobs in preparation for data migration (T138778)
* 13:07 godog: add deployment-prometheus01 instance T53497
* 11:20 elukey: applied beta::deployaccess, role::labs::lvm::srv, role::mediawiki::jobrunner to jobrunner02
* 10:45 elukey: created deployment-jobrunner02 in deployment-prep


== 2016-09-19 ==
== 2022-06-24 ==
* 22:01 legoktm: shutdown integration-puppetmaster
* 20:52 taavi: added `denisse` as a member
* 21:29 yuvipanda: regenerated client certs only on integration-puppetmaster01, seems ok now
* 20:46 yuvipanda: re-enable puppet everywhere
* 20:43 yuvipanda: enable puppet and run on integration-slave-trusty-1003.eqiad.wmflabs
* 20:41 yuvipanda: accidentally deleted /var/lib/puppet/ssl on integration-puppetmaster01 as well, causing it to lose keys. Reprovision by pointing to labs puppetmaster
* 20:34 yuvipanda: rm -rf /var/lib/puppet/ssl on all integration nodes
* 20:34 yuvipanda: copied /etc/puppet/puppet.conf from integration-trusty-slave-1001 to all integration
* 20:25 yuvipanda: delete /etc/puppet/puppet.conf.d/10-self.conf and /var/lib/puppet/ssl on integration-slave-trusty-1001
* 20:20 yuvipanda: re-enabled puppet on integration-slave-trusty-1001
* 20:08 yuvipanda: reset puppetmaster of integration-puppetmaster01 to be labs puppetmaster
* 20:03 yuvipanda: disable puppet across integration project, moving puppetmasters
* 19:49 legoktm: creating T144951 enabled role::puppetmaster::standalone role on integration-puppetmaster01
* 19:33 legoktm: creating T144951 integration-puppetmaster01 instance using m1.small and debian jessie
* 15:11 hashar: beta: updating jobrunner service 0dc341f..a0e8216


== 2016-09-17 ==
== 2022-06-23 ==
* 07:11 legoktm: deploying https://gerrit.wikimedia.org/r/311024
* 15:59 taavi: reload zuul for https://gerrit.wikimedia.org/r/808021


== 2016-09-16 ==
== 2022-06-22 ==
* 21:03 hashar: deployment-tin  did a git gc on /srv/deployment/ores  That freed up disk space and cleared an alarm on co master mira02
* 17:36 taavi: gerrit: add tfellows to the extension-OpenBadges group per request in [[phab:T308278|T308278]]
* 21:00 hashar: deleted deployment-parsoid05
* 17:35 taavi: gerrit: create group extension-JsonData with robla in it, make it an owner of mediawiki/extensions/JsonData per request in [[phab:T303147|T303147]]
* 20:52 hashar: fixed puppet on deployment-parsoid05 . Temporary instance will delete it later to clear out shinken.wmflabs.org
* 16:19 hashar: Reloaded Zuul for https://gerrit.wikimedia.org/r/807586
* 20:27 hashar: beta: force running puppet in batches of 4 instances:  salt --batch 4 -v 'deployment-*' cmd.run 'puppet agent -tv'
* 09:35 hashar: Switched `gitlab-prod-1001.devtools.eqiad1.wikimedia.cloud` instance to use the project Puppet master `puppetmaster-1001.devtools.eqiad1.wikimedia.cloud`
* 20:13 hashar: beta: restarted puppetmaster
* 09:08 hashar: contint1001 , contint2002: deleting `.git/logs` from all zuul-merger repositories. We do not need the reflog `sudo -u zuul find /srv/zuul/git -type d -name .git -print -execdir rm -fR .git/logs \;` # [[phab:T307620|T307620]]
* 20:07 hashar: beta: salt -v '*' cmd.run 'rm -fR /var/lib/puppet/client/ssl/'
* 09:00 hashar: contint1001 , contint2002: setting `core.logallrefupdates=false` on all Zuul merger git repositories: `sudo -u zuul find /srv/zuul/git -type d -name .git -print -execdir git config core.logallrefupdates false \;` # [[phab:T307620|T307620]]
* 20:07 hashar: beta: stopping puppetmaster,  rm -f /var/lib/puppet/server/ssl/ca/signed/*
* 07:46 hashar: Building operations-puppet docker image for https://gerrit.wikimedia.org/r/c/integration/config/+/807180
* 19:53 hashar: beta created instance "deployment-parsoid05" Should be deleted later, that is merely to purge the hostname from Shinken ( http://shinken.wmflabs.org/host/deployment-parsoid05 )
* 11:42 hashar: beta: apt-get upgrade on deployment-jobrunner01
* 11:36 hashar: apt-get upgrade on deployment-tin , bring in a new hhvm version and others


== 2016-09-15 ==
== 2022-06-21 ==
* 22:29 legoktm: sudo salt '*precise*' cmd.run 'service mysql start', all mysql's are down
* 22:01 brennen: gitlab-runners: re-registering all shared runners
* 16:45 godog: install xenial kernel on deployment-zotero01 and reboot T145793
* 17:55 dancy: Upgrading scap to 4.9.4-1+0~20220621174226.320~1.gbp56e4d4 in beta cluster
* 16:18 hashar: prometheus enabled on all beta cluster instances.  Does not support Precise hence puppet will fail on the last two Precise instances deployment-db1 and deployment-db2  until they are migrated to Jessie  T138778
* 15:53 godog: add role::prometheus::node_exporter to classes in hiera:deployment-prep T144502
* 15:10 hashar: beta: Applying puppet class role::prometheus::node_exporter to mira02 just like mira.  That is for godog
* 15:08 hashar: T144006 Disabled Jenkins job  beta-scap-eqiad.  On mira02  rm -fR /srv/*  .  Applying puppet for role::labs::lvm::srv
* 15:05 hashar: T144006  Applying class role::labs::lvm::srv to mira02  (it is out of disk space :D )
* 14:45 hashar: T144006 sudo -u jenkins-deploy -H SSH_AUTH_SOCK=/run/keyholder/proxy.sock ssh mwdeploy@mira02.deployment-prep.eqiad.wmflabs
* 14:44 hashar: T144006 sudo -u jenkins-deploy -H SSH_AUTH_SOCK=/run/keyholder/proxy.sock ssh mwdeploy@deployment-mediawiki05.deployment-prep.eqiad.wmflabs
* 12:33 elukey: added base::firewall, beta::deployaccess, mediawiki::conftool, role::mediawiki::appserver to mediawiki05
* 12:20 elukey: terminate mediawiki02 to create mediawiki05
* 10:48 hashar: beta: cherry picking moritzm patch https://gerrit.wikimedia.org/r/#/c/310793/ "Also handle systemd in keyholder script" T144578
* 09:33 hashar: T144006 sudo -u jenkins-deploy -H SSH_AUTH_SOCK=/run/keyholder/proxy.sock ssh mwdeploy@deployment-mediawiki06.deployment-prep.eqiad.wmflabs
* 09:10 elukey: executed git pull and then git rebase -i on deployment puppet master
* 08:52 elukey: terminated mediawiki03 and created mediawiki06
* 08:45 elukey: removed mediawiki03 from puppet with https://gerrit.wikimedia.org/r/#/c/310749/
* 02:36 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/310701


== 2016-09-14 ==
== 2022-06-20 ==
* 21:37 hashar: integration: setting "ulimit -c 2097152" on all slaves due to Zend PHP segfaulting  T142158
* 16:30 urbanecm: add sgimeno as a project member (Growth engineer with need for access)
* 14:31 hashar: Added otto to integration labs project
* 15:50 ori: On deployment-cache-<nowiki>{</nowiki>text,upload<nowiki>}</nowiki>06, ran: touch /srv/trafficserver/tls/etc/ssl_multicert.config && systemctl reload trafficserver-tls.service ([[phab:T310957|T310957]])
* 13:28 gehel: upgrading deployment-logstash2 to elasticsearch 2.3.5 - T145404
* 14:07 ori: restarted acme-chief on deployment-acme-chief03
* 09:27 hashar: Deleting deployment-mediawiki01 , replaced by deployment-mediawiki04  T144006
* 07:19 legoktm: sudo salt '*trusty*' cmd.run 'service mysql start', it was down on all trusty slaves
* 07:17 legoktm: mysql just died on a bunch of slaves (trusty-1013, 1012, 1001)


== 2016-09-13 ==
== 2022-06-17 ==
* 17:02 marxarelli: re-enabling beta cluster jenkins jobs following maintenance window
* 17:15 ori: provisioned deployment-cache-text07 in deployment-prep to test query normalization via VCL
* 16:59 marxarelli: aborting beta cluster db migration due to time constraints and ops outage. will reschedule
* 01:08 TimStarling: on deployment-docker-cpjobqueue01 and deployment-docker-changeprop01 I redeployed the changeprop configuration, reverting the PHP 7.4 hack
* 15:34 marxarelli: disabled beta jenkins builds while in maintenance mode
* 15:18 marxarelli: starting 2-hour read-only maintenance window for beta cluster migration
* 10:06 hashar: beta: manually updated  jobrunner install on deployment-jobrunner01 and deployment-tmh01 then reloaded the services with:  service jobchron reload
* 10:02 hashar: Trebuchet is broken for /srv/deployment/jobrunner/jobrunner  can't reach the deploy minions somehow.  Did the update manually
* 10:00 hashar: Upgrading beta cluster jobrunner to catch up with upstream b952a7c..0dc341f  merely picking up a trivial log change ( https://gerrit.wikimedia.org/r/#/c/297935/ )
* 09:40 hashar: Unpooled deployment-mediawiki01 from scap and varnish. Shutting down instance.  T144006
* 09:02 hashar: on deployment-tin, accepted mediawiki04 host key for jenkins-deploy user : sudo -u jenkins-deploy -H SSH_AUTH_SOCK=/run/keyholder/proxy.sock ssh mwdeploy@deployment-mediawiki04.deployment-prep.eqiad.wmflabs  T144006
* 08:26 hashar: mwdeploy@deployment-mediawiki04  manually accepted ssh host key of deployment-tin  T144006
* 08:17 hashar: beta: manually accepted ssh host key for deployment-mediawiki04 as user mwdeploy on deployment-tin and mira T144006
* 07:46 gehel: upgrading elasticsearch to 2.3.5 on deployment-elastic0? - T145404


== 2016-09-12 ==
== 2022-06-16 ==
* 14:41 elukey: applied base::firewall, beta::deployaccess, mediawiki::conftool, role::mediawiki::appserver to deployment-mediawiki04.deployment-prep.eqiad.wmflabs (Debian jessie instance) - T144006
* 12:24 hashar: gitlab: runner-1030: `docker volume prune -f`
* 12:50 gehel: rolling back upgrading elasticsearch to 2.4.0 on deployment-elastic05 - T145058
* 12:24 hashar: gitlab: runner-1026: `docker volume prune -f`
* 12:03 gehel: upgrading elasticsearch to 2.4.0 on deployment-elastic0? - T145058
* 10:02 elukey: ran `scap install-world --batch` to allow scap/puppet to work on ml-cache100[2,3]
* 12:01 hashar: Gerrit: made analytics-wmde group to be owned by themselves
* 11:57 hashar: Gerrit: added ldap/wmde as an included group of the 'wikidata' group. Asked by and demoed to addshore


== 2016-09-11 ==
== 2022-06-15 ==
* 18:45 legoktm: deploying https://gerrit.wikimedia.org/r/309829
* 22:39 brennen: phabricator: tagged release/2022-06-15/1 ([[phab:T310742|T310742]])
* 16:31 hashar: integration-agent-docker-1035: docker image prune
* 15:26 dancy: Upgrading scap to 4.9.4-1+0~20220615151557.315~1.gbped3b8d in beta cluster


== 2016-09-09 ==
== 2022-06-14 ==
* 20:53 thcipriani: testing scap 3.2.5-1 on beta cluster
* 21:30 TheresNoTime: clear out stuck `beta-scap-sync-world` jobs (repeatedly per each queued `beta-mediawiki-config-update-eqiad` job), queued jobs now running. monitored until each job had run successfully. jobs up to date
* 11:08 hashar: Added git tag for latest versions of mediawiki/selenium and mediawiki/ruby/api
* 17:18 brennen: starting 1.39.0-wmf.16 ([[phab:T308069|T308069]]) transcript in deploy1002:~brennen/1.39.0-wmf.16.log
* 09:30 legoktm: Image ci-jessie-wikimedia-1473412532 in wmflabs-eqiad is ready
* 13:35 TheresNoTime: clear stuck `beta-scap-sync-world` job, other queued jobs now running. Cancel running `beta-update-databases-eqiad` job, will ensure it runs on the next timer
* 08:53 legoktm: added phpflavor-php70 label to integration-slave-jessie-100[1-5]
* 00:42 TimStarling: on deployment-deploy03 removed helm2, as was done in production
* 08:49 legoktm: deploying https://gerrit.wikimedia.org/r/309048


== 2016-09-08 ==
== 2022-06-13 ==
* 21:33 hashar: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/309413  " Inject PHP_BIN=php5 for php53 jobs"
* 22:04 TheresNoTime: cleared out stalled Jenkins beta jobs on `deployment-deploy03`, manually started `beta-code-update-eqiad` job & watched to completion. all caught up
* 20:00 hashar: nova delete ci-jessie-wikimedia-369422  (was stuck in deleting state)
* 04:33 hashar: Restarting Docker on contint1001.wikimedia.org , apparently can't build images anymore
* 19:49 hashar: Nodepool, deleting instances that Nodepool lost track of (from nodepool alien-list)
* 19:47 hashar: nodepool can't delete: ci-jessie-wikimedia-369422 [ delete | 2.24  hours . Stuck in task_state=deleting  :(
* 19:46 hashar: Nodepool looping over some tasks since 17:45  ( https://grafana.wikimedia.org/dashboard/db/nodepool?panelId=21&fullscreen  )
* 19:26 legoktm: repooled integration-slave-jessie-1005 now that php7 testing is done
* 19:19 hashar: integration: salt -v '*' cmd.run 'cd /srv/deployment/integration/slave-scripts; git pull' | https://gerrit.wikimedia.org/r/308931
* 19:12 hashar: integration:  salt -v '*' cmd.run 'cd /srv/deployment/integration/slave-scripts; git pull'  | https://gerrit.wikimedia.org/r/309272
* 17:08 legoktm: deleted integration-jessie-lego-test01
* 16:50 legoktm: deleted integration-aptly01
* 10:03 hashar: Delete Jenkins job https://integration.wikimedia.org/ci/job/mwext-VisualEditor-sync-gerrit/ that has been left behind. It is no longer needed. T51846 T86659
* 10:02 hashar: Delete mwext-VisualEditor-sync-gerrit job, already got removed by ostriches in 139d17c8f1c4bcf2bb761e13a6501e4d85684066 . The issue in Gerrit (T51846) has been fixed. Poke T86659 , one less job on slaves.


== 2016-09-07 ==
== 2022-06-12 ==
* 20:44 matt_flaschen: Re-enabled beta-code-update-eqiad .
* 21:13 Reedy: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/804777
* 20:35 hashar: Updated security group for deployment-prep labs project.  Allow ssh port 22 from contint1001.wikimedia.org (matching rules for gallium). T137323
* 20:30 hashar: Updated security group for contintcloud and integration labs project.  Allow ssh port 22 from contint1001.wikimedia.org (matching rules for gallium). T137323
* 20:14 matt_flaschen: Temporarily disabled https://integration.wikimedia.org/ci/view/Beta/job/beta-scap-eqiad/ to test live revert of aa0f6ea
* 16:09 hashar: Nodepool back in action. Had to manually delete some instances in labs
* 15:58 hashar: Restarting Nodepool . Lost state when labnet got moved T144945
* 13:13 hashar: Image ci-jessie-wikimedia-1473253681 in wmflabs-eqiad is ready  , has php7 packages. T144872
* 11:53 hashar: Force refreshing Nodepool jessie snapshot to get PHP7 included T144872
* 11:03 hashar: integration: cherry pick https://gerrit.wikimedia.org/r/#/c/308955/ "contint: prefer our bin/php alternative"  T144872
* 10:55 hashar: integration: dropped PHP7 cherry pick from puppet master. https://gerrit.wikimedia.org/r/#/c/308918/ has been merged.  Pushing it to the fleet of permanent Jessie slaves. T144872
* 10:37 hashar: beta: cleaning up salt-keys on deployment-salt02 . Bunch of instances got deleted
* 09:41 hashar: Moving rake jobs back to Nodepool ( T143938 ) with https://gerrit.wikimedia.org/r/#/c/306723/ and https://gerrit.wikimedia.org/r/#/c/306724/
* 05:57 legoktm: deploying https://gerrit.wikimedia.org/r/308932 https://gerrit.wikimedia.org/r/299697
* 05:26 legoktm: cherry-picked https://gerrit.wikimedia.org/r/#/c/308918/ onto integration-puppetmaster with a hack that has it only apply to integration-slave-jessie-1005
* 04:59 legoktm: added Krenair to integration project to help debug puppet stuff
* 04:35 legoktm: depooled integration-slave-jessie-1005 in jenkins so I can test puppet stuff on it
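
Entries reference Phabricator tasks either bare (`T144872`) or as wiki links (`[[phab:T310742|T310742]]`). Below is a small sketch, under the assumption that a task ID is always a capital `T` followed by digits, for collecting the distinct task IDs mentioned in one message. The `task_ids` helper is hypothetical.

```python
import re
from typing import List

# Assumption: a task ID is "T" followed by one or more digits, delimited by
# word boundaries. Nicks such as "TheresNoTime" do not match because no
# digit follows the "T".
TASK_RE = re.compile(r"\bT\d+\b")


def task_ids(message: str) -> List[str]:
    """Return the distinct task IDs in order of first appearance."""
    # dict.fromkeys drops duplicates while preserving insertion order, which
    # collapses the repeated ID inside a [[phab:T1|T1]] link.
    return list(dict.fromkeys(TASK_RE.findall(message)))


if __name__ == "__main__":
    print(task_ids("phabricator: tagged release/2022-06-15/1 ([[phab:T310742|T310742]])"))
    print(task_ids("It is no more needed. T51846 T86659"))
```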


== 2016-09-06 ==
== 2022-06-10 ==
* 13:58 hashar: Qunit jobs should be all fine again now.  T144802
* 15:20 James_F: Zuul: [mediawiki/extensions/SearchVue] Add initial CI jobs for [[phab:T309932|T309932]]
* 13:46 hashar: nodepool.SnapshotImageUpdater: Image ci-jessie-wikimedia-1473169259 in wmflabs-eqiad is ready  T144802
* 08:28 hashar: Reloaded Zuul to remove mediawiki/services/parsoid from CI dependencies # https://gerrit.wikimedia.org/r/c/integration/config/+/803990
* 13:20 hashar: Rebuilding Nodepool Jessie image to hopefully include libapache-mod-php5 and restore qunit jobs behavior  T144802
* 04:27 TimStarling: on deployment-deploy03 running scap sync-world -v with PHP 7.4 for [[phab:T295578|T295578]]
* 10:37 hashar: gerrit: mark apps/android/commons hidden since it is now community maintained on GitHub. Will avoid confusion. T127678
* 04:03 TimStarling: on deployment-deploy03 running scap sync-world -v with PHP 7.2 for [[phab:T295578|T295578]] sanity check
* 09:11 hashar: nodepool.SnapshotImageUpdater: Image ci-trusty-wikimedia-1473152801 in wmflabs-eqiad is ready
* 09:06 hashar: nodepool.SnapshotImageUpdater: Image ci-jessie-wikimedia-1473152393 in wmflabs-eqiad is ready
* 09:00 hashar: Trying to refresh Nodepool Jessie image .  Image properties have been dropped, should fix it


== 2016-09-05 ==
== 2022-06-09 ==
* 14:08 hashar: Refreshing Nodepool base images for Trusty and Jessie. Managed to build new ones after T143769
* 22:49 dancy: Upgrading scap to 4.9.1-1+0~20220609211227.304~1.gbpe48c42 in beta cluster
* 16:39 brennen: gitlab shared runners: re-registering to apply image allowlist configuration


== 2016-09-02 ==
== 2022-06-08 ==
* 20:36 legoktm: deploying https://gerrit.wikimedia.org/r/308227
* 17:14 hashar: Reloaded Zuul for {{Gerrit|I39342265033e82ae13998f53defe6612dc6819b4}}
* 15:17 hashar: Bringing tox jobs to Nodepool with https://gerrit.wikimedia.org/r/#/c/306725/
* 15:57 dancy: Set `profile::mediawiki::php::restarts::ensure: present` in deployment-prep hiera config for [[phab:T237033|T237033]]
* 09:28 hashar: Reloaded Zuul for "Add doc publish for Translate" https://gerrit.wikimedia.org/r/792134


== 2016-09-01 ==
== 2022-06-06 ==
* 19:00 urandom: T130861: Restarting Cassandra on deployment-restbase0[1-2]
* 14:37 James_F: Zuul: [mediawiki/extensions/ImageSuggestions] Mark as in production for [[phab:T302711|T302711]]
* 18:58 urandom: T130861: De-cherry-picking https://gerrit.wikimedia.org/r/#/c/282466/
* 18:34 urandom: T130861: Restarting Cassandra on deployment-restbase0[1-2]
* 18:32 urandom: T130861: Cherry picking https://gerrit.wikimedia.org/r/#/c/282466/ to deployment-puppetmaster
* 16:38 legoktm: deploying https://gerrit.wikimedia.org/r/307794
* 12:22 hashar: migrating deployment-tin keyholder to use base::service_unit for moritm https://gerrit.wikimedia.org/r/#/c/307510/ + reboot + keyholder arm
* 03:09 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/307909


== 2016-08-31 ==
== 2022-06-02 ==
* 23:40 bd808: forced puppet run on deployment-salt02. Had not run automatically for 8 hours
* 15:33 dancy: Upgrading scap to 4.8.1-1+0~20220602153109.295~1.gbp318d9c in beta cluster
* 23:36 bd808: Deleted /data/scratch on integration-slave-trusty-1016 to fix puppet
* 11:26 hashar: Restarting Jenkins on contint2001
* 23:32 bd808: Deleted /data/scratch on integration-slave-trusty-1013 to fix puppet
* 11:19 hashar: Restarting Jenkins on releases1002
* 23:22 bd808: Deleted /data/scratch on integration-slave-trusty-1012 to fix puppet
* 23:19 bd808: Deleted /data/scratch on integration-slave-trusty-1011 to fix puppet
* 23:15 bd808: Deleted /data/scratch on integration-slave-precise-1012 to fix puppet
* 23:11 bd808: Deleted /data on integration-slave-precise-1011 to fix puppet
* 23:08 bd808: Deleted /data on integration-slave-jessie-1001 to fix puppet
* 23:04 bd808: Deleted empty /data, /data/project, and /data/scratch on integration-puppetmaster to fix puppet
* 22:59 bd808: Deleted empty /data, /data/project, and /data/scratch on integration-publisher to fix puppet
* 01:44 Krinkle: Reloading Zuul to deploy  https://gerrit.wikimedia.org/r/307670


== 2016-08-30 ==
== 2022-05-31 ==
* 23:31 yuvipanda: cherry-picking https://gerrit.wikimedia.org/r/#/c/307656/ fixed puppet on the elasticsearch machines!
* 21:16 dancy: Upgrading scap to 4.8.0-1+0~20220531211114.292~1.gbp8dbbcf in beta cluster
* 22:29 yuvipanda: in lieu of blood sacrifice, restart puppetmaster on deployment-puppetmaster
* 17:40 dancy: Upgrading scap to 4.8.0-1+0~20220531173912.291~1.gbp21a7ef in beta cluster
* 21:44 yuvipanda: use clush to fix puppet.conf of all clients, realize also accidentally set a client's puppet.conf for the server, recover server's old conf file from a cat in shell history, restore, breathe sigh of relief
* 17:33 dancy: Reverted to scap 4.8.0-1+0~20220524160924.288~1.gbp794a08 in beta cluster
* 21:37 yuvipanda: sudo takes like 15s each time, is there no god?
* 17:07 dancy: Upgrading scap to 4.8.0-1+0~20220531170512.289~1.gbp143729 in beta cluster
* 21:36 yuvipanda: managed to get vim into a state where I can not quit it, probably recording a macro. I hate computers
* 21:16 yuvipanda: deployment-pdf01 fixed manually
* 21:15 yuvipanda: deployment-pdf02 has proper ssl certs mysteriously without me doing anything
* 21:06 yuvipanda: moved deployment-db[12], deployment-stream to not use role::puppet::self, attempting to semi-automate rest
* 20:52 yuvipanda: cherry-picked appropriate patch on deployment-puppetmaster for T120159, did https://wikitech.wikimedia.org/w/index.php?title=Hiera:Deployment-prep/host/deployment-puppetmaster&oldid=818847 to make sure the puppetmaster allows connections from elsewhere
* 19:48 legoktm: deploying https://gerrit.wikimedia.org/r/306710
* 19:13 bd808: Fixed puppet runs on deployment-sca0[12] with cherry-pick of https://gerrit.wikimedia.org/r/#/c/307561
* 18:57 bd808: Duplicate declaration: File[/srv/deployment] is already declared in file /etc/puppet/modules/contint/manifests/deployment_dir.pp:14; cannot redeclare at /etc/puppet/modules/service/manifests/deploy/common.pp:12 on node deployment-sca01.deployment-prep.eqiad.wmflabs
* 18:40 bd808: Puppet busted on deployment-aqs01 -- Could not find data item analytics_hadoop_hosts in any Hiera data file and no default supplied at /etc/puppet/manifests/role/aqs.pp:46
* 12:59 hashar: beta: revert master branch to origin.  Ran scap and enabled again beta-code-update-eqiad job.
* 12:55 hashar: Running scap on beta cluster via https://integration.wikimedia.org/ci/view/Beta/job/beta-scap-eqiad/117786/console  T143889
* 12:53 hashar: Cherry picking https://gerrit.wikimedia.org/r/#/c/307501/ on beta cluster for T143889
* 12:51 hashar: disabling https://integration.wikimedia.org/ci/view/Beta/job/beta-code-update-eqiad/  to cherry pick a revert patch


== 2016-08-29 ==
== 2022-05-30 ==
* 07:56 hashar: hard rebooting integration-slave-trusty-1012 via horizon and restarting puppet manually
* 11:47 jelto: apply gitlab-settings to gitlab1004 - [[phab:T307142|T307142]]
* 07:50 hashar: integration-slave-trusty-1013  puppet.conf  certname was set to 'undef' breaking puppet
* 11:46 jelto: apply gitlab-settings to gitlab1003 - [[phab:T307142|T307142]]


== 2016-08-27 ==
== 2022-05-28 ==
* 20:51 hashar: integration: tweak sudo policy for jenkins-deploy running cowbuilder: env_keep+=DEB_BUILD_OPTIONS
* 19:09 TheresNoTime: deployment-deploy04 live, not referenced by anything [[phab:T309437|T309437]]
* 20:24 hashar: Manually installing jenkins-debian-glue 0.17.0 on integration-slave-jessie-1004 and integration-slave-jessie-1005 ( T142891 ) .  That is to support PBUILDER_USENETWORK T141114
* 20:05 hashar: Jenkins added global env variable BUILD_TIMEOUT set to 30  for T144094


== 2016-08-26 ==
== 2022-05-27 ==
* 22:29 legoktm: deploying https://gerrit.wikimedia.org/r/307025
* 22:55 zabe: zabe@deployment-mwmaint02:~$ mwscript extensions/WikiLambda/maintenance/updateTypedLists.php --wiki=wikifunctionswiki --db # started ~20 min ago
* 08:15 Amir1: restart uwsgi-ores and celery-ores-worker in deployment-sca03 (T143567)
* 22:49 TheresNoTime: manually running database update script: samtar@deployment-deploy03:~$ /usr/local/bin/wmf-beta-update-databases.py
* 08:11 hashar: beta-scap-eqiad job is back in operation.  Was blocked on logstash not being reachable. T143982
* 22:09 TheresNoTime: samtar@deployment-deploy03:~$ sudo keyholder arm
* 08:10 hashar: deployment-logstash2 is back after a hard reboot. T143982
* 21:44 TheresNoTime: hard rebooted deployment-deploy03 as soft reboot unresponsive
* 08:07 hashar: rebooting deployment-logstash02 via Horizon. Kernel hang apparently T143982
* 21:44 bd808: `sudo wmcs-openstack role add --user zabe --project deployment-prep projectadmin` ([[phab:T309419|T309419]])
* 08:00 hashar: beta-scap-eqiad failing, investigating
* 21:10 zabe: zabe@deployment-deploy03:~$ sudo keyholder arm
* 07:54 Amir1: cherry-picked 306839/1 into deployment-puppetmaster
* 20:53 bd808: `sudo wmcs-openstack role add --user samtar --project deployment-prep projectadmin` ([[phab:T309415|T309415]])
* 00:28 twentyafterfour: restarted puppetmaster service on deployment-puppetmaster
* 20:49 dancy: Initiated hard reboot of deployment-deploy03.deployment-prep


== 2016-08-25 ==
== 2022-05-26 ==
* 23:15 Amir1: cherry-picked 306839/1 into puppetmaster
* 18:33 dancy: Updated Jenkins beta-* job configs
* 20:10 hashar: Delete  integration-slave-trusty-1023 with label AndroidEmulator.  The Android job has been migrated to a new Jessie based instance via  T138506
* 16:51 TheresNoTime: manually triggered beta-update-databases-eqiad post-merge of {{Gerrit|2c7b5825}}
* 19:05 hashar: hard rebooting integration-raita via Horizon
* 16:51 brennen: puppetmaster-1001.devtools: resetting ops/puppet checkout to production branch
* 16:04 hashar: fixing puppet.conf on integration-slave-trusty-1013  it mysteriously considered itself as the puppetmaster
* 16:02 hashar: integration restarted puppetmaster service
* 08:28 hashar: beta update database fixed
* 08:28 hashar: beta cluster update database failed due to: "Your composer.lock file is up to date with current dependencies!"  Probably a race condition with ongoing scap.


== 2016-08-24 ==
== 2022-05-25 ==
* 15:14 halfak: deploying ores d00171
* 18:38 TheresNoTime: (@ ~18:20UTC) samtar@deployment-mwmaint02:~$ mwscript resetUserEmail.php --wiki=wikidatawiki Mahir256 [snip] [[phab:T309230{{!}}T309230]]
* 09:50 hashar: deployment-redis02 fixed AOF file /srv/redis/deployment-redis02-6379.aof and restarted the redis instance; should fix T143655  and might help T142600
* 15:46 dancy: Restarted apache2 on gerrit1001
* 09:43 hashar: T143655 stopping redis 6379 on deployment-redis02 : initctl stop redis-instance-tcp_6379
* 09:38 hashar: deployment-redis02 initctl stop redis-instance-tcp_6379 && initctl start redis-instance-tcp_6379 | That did not fix it magically though  T143655


== 2016-08-23 ==
== 2022-05-24 ==
* 18:21 legoktm: deploying https://gerrit.wikimedia.org/r/306257
* 15:15 dancy: Upgrading scap to 4.7.1-1+0~20220524151055.286~1.gbpe809e8 in beta cluster
* 16:38 bd808: Fixed ops/puppet sync by removing stale cherry-pick of https://gerrit.wikimedia.org/r/#/c/305996/
* 13:35 James_F: Zuul: [mediawiki/tools/code-utils] Add composer test CI for [[phab:T309099|T309099]]
* 08:22 hashar: running puppet on integration-slave-trusty-1014
* 11:36 TheresNoTime: cleared stuck beta deployment jobs per https://www.mediawiki.org/wiki/Continuous_integration/Jenkins#Hung_beta_code/db_update
* 08:18 hashar: reboot integration-slave-trusty-1014
* 08:16 hashar: disabled/enabled Jenkins Gearman client to remove deadlock with Throttle plugin


== 2016-08-22 ==
== 2022-05-23 ==
* 23:40 legoktm: updating slave_scripts on all slaves
* 19:21 inflatador: Deleted deployment-elastic0[5-7] in favor of newer bullseye hosts [[phab:T299797|T299797]]
* 18:37 dancy: Reverted to scap 4.7.1-1+0~20220505181519.270~1.gbpeb47ae in beta cluster
* 18:35 dancy: Upgrading beta cluster scap to 4.7.1-1+0~20220523183110.280~1.gbpaa0826
* 14:49 James_F: Zuul: Enforce Postgres and SQLite support via in-mediawiki-tarball
* 08:37 elukey: move kafka jumbo in deployment-prep to fixed uid/gid - [[phab:T296982|T296982]]
* 08:29 elukey: move kafka main in deployment-prep to fixed uid/gid - [[phab:T296982|T296982]]
* 08:06 elukey: move kafka logging in deployment-prep to fixed uid/gid - [[phab:T296982|T296982]]
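
Several entries name groups of hosts with bracket shorthand such as `deployment-elastic0[5-7]` or `ml-cache100[2,3]`. The sketch below expands that shorthand into individual host names. It assumes brackets hold comma-separated numbers or `lo-hi` numeric ranges; the `expand_hosts` helper is hypothetical, and the glob-style character-class form seen in `deployment-db[12]` is not handled.

```python
import re
from typing import List

# First bracket group in the pattern, e.g. "[5-7]" or "[2,3]".
BRACKET_RE = re.compile(r"\[([^\]]+)\]")


def expand_hosts(pattern: str) -> List[str]:
    """Expand numeric bracket shorthand into a list of host names."""
    match = BRACKET_RE.search(pattern)
    if match is None:
        return [pattern]

    prefix = pattern[:match.start()]
    suffix = pattern[match.end():]

    values: List[str] = []
    for part in match.group(1).split(","):
        part = part.strip()
        if "-" in part:
            low, high = part.split("-", 1)
            # Keep the zero padding of the lower bound, if any.
            width = len(low)
            values.extend(str(n).zfill(width) for n in range(int(low), int(high) + 1))
        else:
            values.append(part)

    hosts: List[str] = []
    for value in values:
        # Recurse so that any later bracket group in the suffix is expanded too.
        hosts.extend(expand_hosts(prefix + value + suffix))
    return hosts


if __name__ == "__main__":
    print(expand_hosts("deployment-elastic0[5-7]"))
    print(expand_hosts("ml-cache100[2,3]"))
```

Non-numeric range bounds raise `ValueError` from `int()`, which seems preferable to silently guessing what the shorthand meant.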


== 2016-08-18 ==
== 2022-05-22 ==
* 22:03 bd808: deployment-fluorine02: Hack 'datasets:x:10003:997::/home/datasets:/bin/bash' into /etc/passwd for T117028
* 18:39 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/795818/
* 20:30 MaxSem: Restarted hhvm on appservers for wikidiff2 upgrades
* 19:03 MaxSem: Upgrading hhvm-wikidiff2 in beta cluster
* 16:53 legoktm: deploying https://gerrit.wikimedia.org/r/#/c/305532/


== 2016-08-17 ==
== 2022-05-21 ==
* 22:28 legoktm: deploying https://gerrit.wikimedia.org/r/305408
* 23:05 legoktm: deployed https://gerrit.wikimedia.org/r/c/integration/config/+/794756/
* 21:33 cscott: updated OCG to version e3e0fd015ad8fdbf9da1838c830fe4b075c59a29
* 14:11 hashar: Icinga reports `Gerrit Health Check SSL Expiry` errors filed as [[phab:T308908|T308908]]
* 21:28 bd808: restarted salt-minion on deployment-pdf02
* 21:26 bd808: restarted salt-minion on deployment-pdf01
* 21:15 cscott: starting OCG deploy to beta
* 14:10 gehel: upgrading elasticsearch to 2.3.4 on deployment-logstash2.deployment-prep.eqiad.wmflabs
* 13:28 gehel: upgrading elasticsearch to 2.3.4 on deployment-elastic*.deployment-prep + JVM upgrade


== 2016-08-16 ==
== 2022-05-20 ==
* 23:10 thcipriani: max_servers at 6, seeing 6 allocated instances, still seeing 403 already used 10 of 10 instances :((
* 16:21 hashar: Reloaded Zuul for https://gerrit.wikimedia.org/r/c/integration/config/+/793809
* 22:37 thcipriani: restarting nodepool, bumping max_servers to match up with what openstack seems willing to allocate (6)
* 09:06 Amir1: removing ores-related-cherry-picked commits from deployment-puppetmaster


== 2016-08-15 ==
== 2022-05-19 ==
* 21:30 thcipriani: update scap on beta to 3.2.3-1 bugfix release
* 19:34 hashar: Reloaded Zuul for https://gerrit.wikimedia.org/r/793527
* 02:30 bd808: Forced a zuul restart -- https://www.mediawiki.org/wiki/Continuous_integration/Zuul#Restart
* 14:31 hashar: Reloaded zuul for https://gerrit.wikimedia.org/r/c/integration/config/+/793458 {{!}} Don't re-trigger the test pipeline on patches with C+2 already
* 02:23 bd808: Lots and lots of "AttributeError: 'NoneType' object has no attribute 'name'" errors in /var/log/zuul/zuul.log
* 02:21 bd808: nodepool delete 301068
* 02:20 bd808: nodepool delete 301291
* 02:20 bd808: nodepool delete 301282
* 02:19 bd808: nodepool delete 301144
* 02:11 bd808: nodepool delete 299641
* 02:11 bd808: nodepool delete 278848
* 02:08 bd808: Aug 15 02:07:48 labnodepool1001 nodepoold[24796]: Forbidden: Quota exceeded for instances: Requested 1, but already used 10 of 10 instances (HTTP 403)


== 2016-08-13 ==
== 2022-05-18 ==
* 23:16 Amir1: cherry-picking 304678/1 into the puppetmaster
* 19:31 hashar: Reloaded Zuul for https://gerrit.wikimedia.org/r/c/integration/config/+/793028
* 00:08 legoktm: deploying https://gerrit.wikimedia.org/r/304588
* 18:45 brennen: gitlab: created placeholder /repos/mediawiki group for squatting purposes
* 00:06 legoktm: deploying https://gerrit.wikimedia.org/r/304068
* 08:29 hashar: Updating SSH Build agent from 1.31.5 to 1.32.0 on CI Jenkins to prevent an issue when uploading `remoting.jar`  # [[phab:T307339|T307339]]#7937268
* 07:32 hashar: Deleting Jenkins agent configuration for `integration-castor03` # [[phab:T252071|T252071]]


== 2016-08-12 ==
== 2022-05-17 ==
* 23:57 legoktm: p
* 23:26 James_F: Zuul: [mediawiki/extensions/Phonos] Install basic quibble CI for [[phab:T308558|T308558]]
* 23:57 legoktm: deploying https://gerrit.wikimedia.org/r/304587, no-o
* 18:19 Amir1: deploying 2ef24f2 to ores-beta in sca03


== 2016-08-10 ==
== 2022-05-16 ==
* 23:56 legoktm: deploying https://gerrit.wikimedia.org/r/304149
* 19:31 inflatador: bking@deployment-elastic07 halted deployment-elastic07 in beta ES cluster; will decom on Friday [[phab:T299797|T299797]]
* 23:47 thcipriani: stopping nodepool to clean up
* 19:02 inflatador: bking@deployment-elastic06 halted deployment-elastic06 in beta ES cluster; will decom on Friday [[phab:T299797|T299797]]
* 23:41 legoktm: deploying https://gerrit.wikimedia.org/r/304131
* 08:33 hashar: Reloading Zuul for https://gerrit.wikimedia.org/r/c/integration/config/+/791809
* 21:59 thcipriani: restarted nodepool, no trusty instances were being used by jobs
* 01:58 legoktm: deploying https://gerrit.wikimedia.org/r/303218


== 2016-08-09 ==
== 2022-05-14 ==
* 23:21 Amir1: ladsgroup@deployment-sca03:~$ sudo service celery-ores-worker restart
* 23:19 James_F: Zuul: Add Dreamy_Jazz to CI allow list
* 15:24 thcipriani: due to https://www.mediawiki.org/wiki/Continuous_integration/Zuul#Jenkins_execution_lock
* 23:17 James_F: Zuul: [mediawiki/extensions/LocalisationUpdate] Move out of production section
* 15:20 thcipriani: beta site updates stuck for 15 hours :(
* 20:25 urbanecm: add TheresNoTime (samtar) as a project member per request
* 02:17 legoktm: deploying https://gerrit.wikimedia.org/r/303741
* 02:16 legoktm: manually updated slave-scripts on all slaves via `fab deploy_slave_scripts`
* 00:56 legoktm: deploying https://gerrit.wikimedia.org/r/303726


== 2016-08-08 ==
== 2022-05-13 ==
* 23:33 Tim: deleted instance deployment-depurate01
* 22:59 James_F: Zuul: [mediawiki/extensions/SocialProfile] Add WikiEditor as a CI dependency
* 16:19 bd808: Manually cleaned up root@logstash02 cronjobs related to logstash03
* 22:52 James_F: Zuul: Add Tranve to CI allow list
* 14:39 Amir1: deploying d00159c for ores in sca03
* 22:01 hashar: reloaded zuul for https://gerrit.wikimedia.org/r/791688
* 10:14 Amir1: deploying 616707c into sca03 (for ores)
* 18:58 inflatador: bking@deployment-elastic05 halted deployment-elastic05 in beta ES cluster; will decom in 1 wk [[phab:T299797|T299797]]
* 17:18 hashar: Reloading Zuul for https://gerrit.wikimedia.org/r/c/integration/config/+/791644/
* 13:16 taavi: added user Zoranzoki21 to extension-HidePrefix gerrit group [[phab:T305317|T305317]]


== 2016-08-07 ==
== 2022-05-12 ==
* 12:01 hashar: Nodepool: can't spawn instances due to: Forbidden: Quota exceeded for instances: Requested 1, but already used 10 of 10 instances (HTTP 403)
* 22:09 inflatador: bking@deployment-elastic05 banned deployment-elastic05 from beta ES cluster in preparation for decom [[phab:T299797|T299797]]
* 12:01 hashar: nodepool: deleted servers stuck in "used" states for roughly 4 hours (using: nodepool list , then nodepool delete <id>)
* 19:53 hashar: gerrit: triggering full replication to gerrit2001 to test [[phab:T307137|T307137]]
* 11:54 hashar: Nodepool: can't spawn instances due to: Forbidden: Quota exceeded for instances: Requested 1, but already used 10 of 10 instances (HTTP 403)
* 16:00 hashar: contint2001 and contint1001 now automatically run `docker system prune --force` every day  and `docker system prune --force` on Sunday {{!}} https://gerrit.wikimedia.org/r/c/operations/puppet/+/773784/
* 11:54 hashar: nodepool: deleted servers stuck in "used" states for roughly 4 hours (using: nodepool list , then nodepool delete <id>)
* 15:05 brennen: gitlab-prod-1001.devtools: soft reboot
* 00:46 brennen: gitlab: disabling container registries on all existing projects ([[phab:T307537|T307537]])


== 2016-08-06 ==
== 2022-05-11 ==
* 12:31 Amir1: restarting uwsgi-ores and celery-ores-worker in deployment-sca03
* 23:20 brennen: gitlab-prod-1001.devtools: container registry currently enabled
* 12:28 Amir1: cherry-picked 303356/1 into the puppetmaster
* 18:58 brennen: gitlab-prod-1001.devtools: setting to use devtools standalone puppetmaster
* 12:00 Amir1: restarting uwsgi-ores and celery-ores-worker in deployment-sca03


== 2016-08-05 ==
== 2022-05-10 ==
* 17:54 bd808: Cherry-picked https://gerrit.wikimedia.org/r/#/c/299825/3 for testing
* 12:06 hashar: Updating Quibble jobs to image 1.4.5 with Memcached enabled {{!}} https://gerrit.wikimedia.org/r/c/integration/config/+/790641 {{!}} [[phab:T300340|T300340]]
* 17:50 bd808: Removed stale cherry-picks for https://gerrit.wikimedia.org/r/#/c/302303/ and https://gerrit.wikimedia.org/r/#/c/300458/ that were blocking git rebase
* 10:55 hashar: Updating `wmf-quibble-*` jobs to Quibble 1.4.5 # https://gerrit.wikimedia.org/r/c/integration/config/+/790638/
* 00:41 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/303113
* 08:36 hashar: Updating wikibase-client-docker and wikibase-repo-docker to Quibble 1.4.5 + supervisord https://gerrit.wikimedia.org/r/c/integration/config/+/790621
* 00:31 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/300068
* 08:30 hashar: Updating MediaWiki coverage jobs to Quibble image 1.4.5 + supervisord https://gerrit.wikimedia.org/r/c/integration/config/+/790381
* 08:24 hashar: Updating codehealth jobs to Quibble 1.4.5 + supervisord https://gerrit.wikimedia.org/r/c/integration/config/+/790380/
* 08:23 hashar: Updating MediaWiki Phan jobs to Quibble 1.4.5 https://gerrit.wikimedia.org/r/c/integration/config/+/790377


== 2016-08-04 ==
== 2022-05-09 ==
* 20:07 marxarelli: Running jenkins-jobs update config/ 'selenium-*' to deploy https://gerrit.wikimedia.org/r/#/c/302775/
* 21:43 James_F: Beta Cluster: Shutting down old deployment-restbase03 instance for [[phab:T295375|T295375]]
* 17:03 legoktm: jstart -N qamorebots /usr/lib/adminbot/adminlogbot.py --config ./confs/qa-logbot.py
* 20:33 hashar: Manually cancelling deadlock build jobs for beta https://integration.wikimedia.org/ci/view/Beta/ # [[phab:T307963|T307963]]


== 2016-08-01 ==
== 2022-05-08 ==
* 20:28 thcipriani: restarting deployment-ms-be01, not responding to ssh, mw-fe01 requests timing out
* 12:33 urbanecm: deployment-prep: urbanecm@deployment-mwmaint02:~$ foreachwikiindblist growthexperiments extensions/GrowthExperiments/maintenance/migrateMenteeOverviewFiltersToPresets.php --update # [[phab:T304057|T304057]]
* 08:28 Amir1: deploying fedd675 to ores in sca03


== 2016-07-29 ==
== 2022-05-06 ==
* 23:27 bd808: Rebooting deployment-logstash2; Console showed hung task timeouts (P3606)
* 12:55 hashar: Migrated Castor service from integration-castor03 to integration-castor05 # [[phab:T252071|T252071]]
* 15:55 hasharAway: pooled Jenkins slave integration-slave-jessie-1003 [10.68.21.145]
* 14:02 hashar: deployment-prep / beta : added addshore to the project
* 13:24 hashar: created integration-slave-jessie-1003 m1.medium to help processing debian-glue jobs
* 13:01 hashar: Upgrading Zuul on jessie slaves using https://people.wikimedia.org/~hashar/debs/zuul_2.1.0-391-gbc58ea3-jessie/zuul_2.1.0-391-gbc58ea3-wmf2jessie1_amd64.deb
* 12:53 hashar: Upgrading Zuul on precise slaves using https://people.wikimedia.org/~hashar/debs/zuul_2.1.0-391-gbc58ea3/zuul_2.1.0-391-gbc58ea3-wmf2precise1_amd64.deb
* 09:38 hashar: Upgrading Zuul to get rid of a forced sleep(300) whenever a patch is merged T93812. zuul_2.1.0-391-gbc58ea3-wmf2precise1


== 2016-07-28 ==
== 2022-05-05 ==
* 21:46 hashar_: integration: change sudo policy for jenkins-deploy to help on T141538 : env_keep+=WORKSPACE
* 22:57 dduvall: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/789723
* 12:18 hashar: installed 2.1.0-391-gbc58ea3-wmf1jessie1 on zuul-dev-jessie.integration.eqiad.wmflabs T140894
* 22:31 dduvall: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/789721
* 12:18 hashar: installed 2.1.0-391-gbc58ea3-wmf1jessie1 on zuul-dev-jessie.integration.eqiad.wmflabs
* 22:28 dduvall: created 2 new jobs to deploy https://gerrit.wikimedia.org/r/789720
* 09:46 hashar: Nodepool: Image ci-trusty-wikimedia-1469698821 in wmflabs-eqiad is ready
* 22:24 dduvall: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/789718
* 09:35 hashar: Regenerated Nodepool image for Trusty. The snapshot failed while upgrading grub-pc for some reason. Noticed with thcipriani yesterday
* 22:21 dduvall: created 4 new jobs to deploy https://gerrit.wikimedia.org/r/789717
* 22:15 dduvall: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/789714
* 22:13 dduvall: created 2 new jobs to deploy https://gerrit.wikimedia.org/r/789713
* 22:09 dduvall: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/789711
* 22:07 dduvall: created 2 new jobs to deploy https://gerrit.wikimedia.org/r/789710
* 21:57 dduvall: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/789707/1
* 21:51 dduvall: created 4 new jobs to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/789706
* 21:48 dduvall: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/789704
* 21:44 dduvall: created 4 new jobs to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/789703
* 21:38 dduvall: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/789698
* 21:35 dduvall: created 4 jobs to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/789697
* 21:26 dduvall: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/789694
* 21:22 dduvall: creating 4 new jobs to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/789693
* 18:27 dduvall: reenabled puppet on integration-agent-docker-1023.integration.eqiad1.wikimedia.cloud
* 18:25 dancy: Update to scap 4.7.1-1+0~20220505181519.270~1.gbpeb47ae in beta cluster
* 18:16 dduvall: disabled puppet on integration-agent-docker-1023.integration.eqiad1.wikimedia.cloud for deployment of https://gerrit.wikimedia.org/r/c/operations/puppet/+/768774
* 16:29 dduvall: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/789650
* 16:26 dduvall: created 4 new jobs to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/789649
* 14:25 hashar: Created integration-castor05
* 12:28 hashar: Reloading Zuul for https://gerrit.wikimedia.org/r/789179 and https://gerrit.wikimedia.org/r/789232
* 07:45 hashar: deployment-prep: removed a few queued Jenkins builds from https://integration.wikimedia.org/ci/view/Beta/


== 2016-07-27 ==
== 2022-05-04 ==
* 16:13 hashar: salt -v '*slave-trusty*' cmd.run 'service mysql start'    ( was missing on integration-slave-trusty-1011.integration.eqiad.wmflabs )
* 21:29 dduvall: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/789285
* 14:03 hashar: upgraded zuul on gallium via dpkg -i /root/zuul_2.1.0-391-gbc58ea3-wmf1precise1_amd64.deb    (revert is zuul_2.1.0-151-g30a433b-wmf4precise1_amd64.deb )
* 21:16 dduvall: created 1 new job to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/789284
* 12:43 hashar: restarted Jenkins for some trivial plugins updates
* 21:07 dduvall: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/789278
* 12:35 hashar: hard rebooting integration-slave-trusty-1011 from Horizon. ssh lost, no log in Horizon.
* 21:00 dduvall: created 2 jobs to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/789277
* 09:46 hashar: manually triggered debian-glue on all operations/debs repo that had no jenkins-bot vote. Via zuul enqueue on gallium and list fetched from "gerrit query --current-patch-set 'is:open NOT label:verified=2,jenkins-bot project:^operations/debs/.*'|egrep '(ref|project):'"
* 20:48 dduvall: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/789274
* 06:21 Tim: created instance deployment-depurate01 for testing of role::html5depurate
* 20:44 dduvall: creating 4 new jobs to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/789273
* 20:31 dduvall: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/789265
* 20:25 dduvall: created 4 new jobs to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/789264
* 20:22 urbanecm: urbanecm@deployment-mwmaint02:~$ mwscript extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=commonswiki --logwiki=metawiki "There'sNoTime" "TheresNoTime" # [[phab:T307590|T307590]]
* 20:14 dduvall: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/789259/1
* 20:11 dduvall: created 4 new jobs to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/789258
* 18:54 dduvall: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/789245
* 18:47 dduvall: creating 4 new jobs to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/789244
* 18:31 dduvall: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/789238
* 18:24 dduvall: created 4 new jobs to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/789237
* 17:51 dduvall: created 4 new jobs to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/789225
* 17:22 dduvall: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/789218
* 17:12 dduvall: created 4 new jobs to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/789217
* 16:11 dduvall: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/789204
* 16:01 dduvall: created 2 new jobs to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/789203
* 16:01 dduvall: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/789195
* 15:42 dduvall: created 2 new jobs to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/789194
* 13:44 James_F: Zuul: [mediawiki/services/function-evaluator] Use bespoke pipeline jobs only [[phab:T307507|T307507]]


== 2016-07-26 ==
== 2022-05-03 ==
* 20:13 hashar: Zuul deployed https://gerrit.wikimedia.org/r/301093 which adds 'debian-glue' job on all of operations/debs/ repos
* 23:35 Reedy: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/788871
* 18:10 ostriches: zuul: reloading to pick up config change
* 23:23 Reedy: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/788868
* 12:49 godog: cherry-pick https://gerrit.wikimedia.org/r/#/c/300827/ on deployment-puppetmaster
* 22:03 dduvall: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/788806
* 11:59 legoktm: also pulled in I73f01f87b06b995bdd855628006225879a17fee5
* 22:01 dduvall: created 4 new jobs to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/788806
* 11:59 legoktm: deploying https://gerrit.wikimedia.org/r/301109
* 21:40 Reedy: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/788798
* 11:37 hashar: rebased integration puppetmaster git repo
* 21:27 dduvall: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/788799
* 11:31 hashar: enable puppet agent on integration-puppetmaster . Had it disabled while hacking on https://gerrit.wikimedia.org/r/#/c/300830/
* 21:25 dduvall: created trigger-pipelinelib-pipeline-test and pipelinelib-pipeline-test jobs for https://gerrit.wikimedia.org/r/c/integration/config/+/788799
* 08:42 hashar: T141269 On integration-slave-trusty-1018 , deleting workspace that has a corrupt git: rm -fR /mnt/jenkins-workspace/workspace/mediawiki-extensions-hhvm*
* 11:50 hashar: Reloading Zuul for https://gerrit.wikimedia.org/r/788682
* 01:08 Amir1: deployed ores a291da1 in sca03, ores-beta.wmflabs.org works as expected


== 2016-07-25 ==
== 2022-05-02 ==
* 22:45 legoktm: restarting zuul due to depends-on lockup
* 15:09 dancy: Updating beta cluster scap to 4.7.1-1+0~20220502085300.264~1.gbp367de7?
* 14:24 godog: bounce puppetmaster on deployment-puppetmaster
* 10:06 hashar: Reloading Zuul for https://gerrit.wikimedia.org/r/786934 # [[phab:T301766|T301766]]
* 13:17 godog: cherry-pick https://gerrit.wikimedia.org/r/#/c/300827/ on deployment-puppetmaster


== 2016-07-23 ==
== 2022-04-29 ==
* 20:06 bd808: Cleanup jobrunner01 logs via -- sudo logrotate --force /etc/logrotate.d/mediawiki_jobrunner
* 21:49 brennen: created https://gitlab.wikimedia.org/toolforge-repos and https://gitlab.wikimedia.org/cloudvps-repos for cloud tenants ([[phab:T305301|T305301]])
* 20:03 bd808: Deleted jobqueues in redis with no matching wikis: ptwikibooks, labswiki
* 18:37 James_F: Zuul: Add SimilarEditors dependency on QuickSurveys extension for [[phab:T297687|T297687]]
* 19:20 bd808: jobrunner01 spamming /var/log/mediawiki with attempts to process jobs for wiki=labswiki


== 2016-07-22 ==
== 2022-04-28 ==
* 20:26 hashar: T141114 upgraded jenkins-debian-glue from v0.13.0 to v0.17.0  on integration-slave-jessie-1001 and integration-slave-jessie-1002
* 20:31 James_F: Zuul: Add PHP81 as voting for libraries, PHP extensions etc. for [[phab:T293509|T293509]]
* 19:07 thcipriani: beta-cluster has successfully used a canary for mediawiki deployments
* 18:57 brennen: finished editing mediawiki-new-errors
* 16:53 thcipriani: bumping scap to v.3.2.1 on deployment-tin to test canary deploys, again
* 18:50 brennen: adding some filters to mediawiki-new-errors, including one based on https://wikitech.wikimedia.org/wiki/Performance/Runbook/Kibana_monitoring#Filtering_by_query_string
* 16:46 thcipriani: rolling back scap version to v.3.2.0
* 09:03 hashar: Gerrit upgraded to 3.4.4  at roughly 8:00 UTC
* 16:38 thcipriani: bumping scap to v.3.2.1 on deployment-tin to test canary deploys
* 13:02 hashar: zuul rebased patch queue on tip of upstream branch and force pushed branch. c3d2810...4ddad4e HEAD -> patch-queue/debian/precise-wikimedia (forced update)
* 10:32 hashar: Jenkins restarted and it pooled both integration-slave-jessie-1002  and  integration-slave-trusty-1018
* 10:23 hashar: Jenkins has some random deadlock. Will probably reboot it
* 10:17 hashar: Jenkins can't ssh / add slaves integration-slave-jessie-1002 or  integration-slave-trusty-1018 . Apparently due to some Jenkins deadlock in the ssh slave plugin :-/   Lame way to solve it: restart Jenkins
* 10:10 hashar: rebooting integration-slave-jessie-1002 and integration-slave-trusty-1018 . Hung somehow
* 10:06 hashar: T141083 salt -v '*slave-trusty*' cmd.run 'service mysql start'
* 09:55 hashar: integration-slave-trusty-1001 service mysql start


== 2016-07-21 ==
== 2022-04-27 ==
* 16:11 hashar: Updated our JJB fork cherry picking f74501e781f by madhuvishy.  Was made to support the maven release plugin. Branch bump is 10f2bcd..6fcaf39
* 19:06 hashar: Updating operations/software/gerrit branches and tags from upstream # [[phab:T292759|T292759]]
* 16:04 hashar: integration/zuul.git .Updated upstream branch:bc58ea34125f11eb353abc3e5b96ac1efad06141  finally caught up with upstream \O/
* 15:20 hashar: Updating non-quibble jobs to composer 2.3.3 {{!}} [[phab:T303867|T303867]] {{!}} https://gerrit.wikimedia.org/r/c/integration/config/+/777029
* 15:13 hashar: integration/zuul.git .Updated upstream branch:  06770a85fcff810fc3e1673120710100fc7b0601:upstream
* 14:03 hashar: integration/zuul.git bumping upstream branch:  git push d34e0b4:upstream
* 03:18 greg-g: had to do https://www.mediawiki.org/wiki/Continuous_integration/Jenkins#Hung_beta_code.2Fdb_update twice, seems to be back
* 00:13 bd808: Cherry-picked https://gerrit.wikimedia.org/r/#/c/299825/ to deployment-puppetmaster so wdqs nginx log parsing can be tested


== 2016-07-20 ==
== 2022-04-26 ==
* 13:55 hashar: beta: switching job beta-scap-eqiad to use 'scap sync' per https://gerrit.wikimedia.org/r/#/c/287951/  (poke thcipriani )
* 15:40 brennen: train 1.39.0-wmf.9 ([[phab:T305215|T305215]]): no current blockers - expect to start train ops after the toolhub deployment window wraps, so some time after 17:00 UTC; taking a pre-train stroll-around-the-block break before that.
* 12:47 hashar: integration: enabled unattended upgrade on all instances by adding contint::packages::apt to https://wikitech.wikimedia.org/wiki/Hiera:Integration
* 13:46 James_F: Deleting deployment-mx02.deployment-prep.eqiad1.wikimedia.cloud for [[phab:T306068|T306068]]
* 10:28 hashar: beta dropped salt-key on deployment-salt02 for the three instances: deployment-upload.deployment-prep.eqiad.wmflabs , deployment-logstash3.deployment-prep.eqiad.wmflabs and deployment-ores-web.deployment-prep.eqiad.wmflabs
* 13:38 James_F: Zuul: [mediawiki/extensions/SimilarEditors] Install basic prod CI for [[phab:T306897|T306897]]
* 10:26 hashar: beta: rebased puppetmaster git repo. "Parsoid: Move to service::node"  has weird conflict https://gerrit.wikimedia.org/r/#/c/298436/
* 12:33 hashar: Manually pruned dangling docker images on contint1001 and contint2001
* 10:15 hashar: beta: removing puppet cherry pick of https://gerrit.wikimedia.org/r/#/c/258979/ "mediawiki: add conftool-specifc credentials and scripts"  abandoned/superseded and caused a conflict
* 08:30 hashar: Reloading Zuul for https://gerrit.wikimedia.org/r/780824
* 08:17 hashar: deployment-fluorine : deleting a puppet lock file /var/lib/puppet/state/agent_catalog_run.lock  (created at 2016-07-18 19:58:46 UTC)
* 08:09 hashar: Reloading Zuul for https://gerrit.wikimedia.org/r/785204
* 01:53 legoktm: deploying https://gerrit.wikimedia.org/r/299930


== 2016-07-18 ==
== 2022-04-25 ==
* 20:56 thcipriani: Deleted deployment-fluorine:/srv/mw-log/archive/*-201605* freed 30 GB
* 17:29 dancy: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/779450
* 15:00 hashar: Upgraded Zuul on the Precise slaves to zuul_2.1.0-151-g30a433b-wmf4precise1
* 15:31 James_F: Zuul: [mediawiki/extensions/RegularTooltips] Add basic quibble CI
* 12:10 hashar: (restarted qa-morebots)
* 12:10 hashar: Enabling puppet again on integration-slave-precise-1002 , removing Zuul-server config and adding the slave back in Jenkins pool


== 2016-07-16 ==
== 2022-04-20 ==
* 23:19 paladox: testing morebots
* 16:25 zabe: root@deployment-cache-upload06:~# touch /srv/trafficserver/tls/etc/ssl_multicert.config && systemctl reload trafficserver-tls.service


== 2016-07-15 ==
== 2022-04-18 ==
* 08:34 hashar: Unpooling integration-slave-precise-1002  will use it as a zuul-server test instance temporarily
* 19:27 brennen: gitlab runners: deleting a number of stale runners with no contacts in > 2 months which are most likely no longer extant
* 16:49 brennen: phabricator: created phame blog https://phabricator.wikimedia.org/phame/blog/view/22/ for [[phab:T306329|T306329]]
* 16:48 brennen: phabricator: adding self to acl*blog-admins
* 15:33 James_F: Shutting off deployment-wdqs01 from the Beta Cluster project per [[phab:T306054|T306054]]; it's apparently unused, so this shouldn't break anything.


== 2016-07-14 ==
== 2022-04-14 ==
* 18:54 ebernhardson: deployment-prep manually edited elasticsearch.yml on deployment-elastic05 and restarted to get it listening on eth0. Still looking into why puppet wrote out wrong config file
* 22:30 Reedy: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/779969
* 09:05 Amir1: rebooting deployment-ores-redis
* 16:09 brennen: removed or renamed 4 filters from mediawiki-new-errors per check-new-error-tasks/check.sh
* 08:29 Amir1: deploying 0e9555f to ores-beta (sca03)


== 2016-07-13 ==
== 2022-04-12 ==
* 16:05 urandom: Installing Cassandra 2.2.6-wmf1 on deployment-restbase0[1-2].deployment-prep.eqiad.wmflabs : T126629
* 21:49 brennen: Updating dev-images docker-pkg files on primary contint for elastic 7.10.2
* 13:58 hashar: T137525 reverted Zuul back to zuul_2.1.0-95-g66c8e52-wmf1precise1_amd64.deb  . It could not connect to Gerrit reliably
* 21:46 brennen: Updating dev-images docker-pkg files on primary contint for elastic 6.8.23
* 13:46 hashar: T137525 Stopped zuul that ran in a terminal (with -d). Started it with the init script.
* 21:37 brennen: Updating dev-images docker-pkg files on primary contint for apache & elasticsearch changes ([[phab:T304290|T304290]], [[phab:T305143|T305143]])
* 11:37 hashar: apt-get upgrade on deployment-mediawiki02
* 16:05 Reedy: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/779500
* 08:33 hashar: removing deployment-parsoid05 from the Jenkins slaves T140218
* 15:55 Reedy: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/779498 https://gerrit.wikimedia.org/r/779141


== 2016-07-12 ==
== 2022-04-08 ==
* 20:29 hashar: integration: force running unattended upgrade on all instances:  salt --batch 4 -v '*' cmd.run 'unattended-upgrade'  . That upgrades diamond and hhvm among others.  imagemagick-common has a prompt though
* 11:08 hashar: Reloading Zuul for https://gerrit.wikimedia.org/r/778287
* 20:22 hashar: CI force running puppet on all instances:  salt --batch 5 -v '*' puppet.run
* 20:04 hashar: Maybe fix unattended upgrade on the CI slaves via https://gerrit.wikimedia.org/r/298568
* 16:43 Amir1: deploying f472f65 to ores-beta
* 10:11 hashar: Github created repos operations-debs-contenttranslation-apertium-mk-en and operations-docker-images-toollabs-images        for Gerrit replication


== 2016-07-11 ==
== 2022-04-07 ==
* 14:24 hashar: Removing ZeroMQ config from the Jenkins jobs. It is now enabled globally. T139923
* 06:07 urbanecm: deployment-prep: foreachwiki extensions/GrowthExperiments/maintenance/T304461.php --delete # [[phab:T304461|T304461]], output is at P24204
* 10:16 hashar: T136188: on Trusty slaves, upgrading Chromium from v49 to v51:  salt -v '*slave-trusty-*' cmd.run 'apt-get -y install chromium-browser chromium-chromedriver chromium-codecs-ffmpeg-extra'
* 05:54 urbanecm: deployment-prep: mwscript extensions/GrowthExperiments/maintenance/T304461.php --wiki=<nowiki>{</nowiki>enwiki,cswiki<nowiki>}</nowiki> --delete # [[phab:T304461|T304461]]
* 10:13 hashar: T136188: salt -v '*slave-trusty*' cmd.run 'rm /etc/apt/preferences.d/chromium-*'
* 10:09 hashar: Unpinning Chromium v49 from the Trusty slaves and upgrading to v51 for T136188
* 09:34 zeljkof: Enabled ZMQ Event Publisher on all Jobs in Jenkins


== 2016-07-09 ==
== 2022-04-06 ==
* 18:57 legoktm: deploying https://gerrit.wikimedia.org/r/297731 and https://gerrit.wikimedia.org/r/298142
* 20:03 thcipriani: rebooting phabricator
* 14:07 bd808: Testing logstash change https://gerrit.wikimedia.org/r/#/c/298115/ via cherry-pick
* 11:44 James_F: Zuul: [mediawiki/extensions/WikiEditor] Add BetaFeatures to phan deps for [[phab:T304596|T304596]]


== 2016-07-08 ==
== 2022-04-04 ==
* 16:08 hashar: scandium: git -C /srv/ssd/zuul/git/mediawiki/services/graphoid remote set-head origin --auto
* 22:43 James_F: dockerfiles: [composer-scratch] Upgrade composer to 2.3.3 and cascade for [[phab:T294260|T294260]]
* 16:06 hashar: scandium: git -C /srv/ssd/zuul/git/mediawiki/services/graphoid init &&  git -C /srv/ssd/zuul/git/mediawiki/services/graphoid remote add origin ssh://jenkins-bot@ytterbium.wikimedia.org:29418/mediawiki/services/graphoid
* 18:49 hashar: Reloading Zuul to revert https://gerrit.wikimedia.org/r/776179
* 14:59 hashar: nodepool: rebuild Trusty image from scratch Image ci-trusty-wikimedia-1467989709 in wmflabs-eqiad is ready
* 18:23 hashar: Reloading Zuul for https://gerrit.wikimedia.org/r/776179
* 12:35 hashar: beta: find /data/project/upload7/*/*/thumb -type f -atime +30 -delete
* 17:50 dancy: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/775796
* 10:31 hashar: beta: mass delete http://commons.wikimedia.beta.wmflabs.org/wiki/Category:GWToolset_Batch_Upload files T64835
* 12:12 hashar: Reloading Zuul for https://gerrit.wikimedia.org/r/776723
* 10:26 hashar: beta: mass delete http://commons.wikimedia.beta.wmflabs.org/wiki/Category:GWToolset_Batch_Upload files
* 10:28 James_F: Zuul: [mediawiki/extensions/WikiLambda] Publish PHP and JS documentation
* 08:54 jnuche: redeploying Zuul


== 2016-07-07 ==
== 2022-04-02 ==
* 21:41 MaxSem: Chowned php-master/vendor back to jenkins-deploy
* 12:00 zabe: apply https://gerrit.wikimedia.org/r/c/mediawiki/extensions/CentralAuth/+/773903 on deployment-prep centralauth databases
* 13:10 hashar: deleting integration-slave-trusty-1024 and integration-slave-trusty-1025  to free up some RAM. We have enough permanent Trusty slaves. T139535
* 02:43 MaxSem: started redis-server on deployment-stream
* 01:14 bd808: Restarted logstash on deployment-logstash2
* 01:13 MaxSem: Leaving my hacks for the night to collect data, if needed revert with cd /srv/mediawiki-staging/php-master/vendor && sudo git reset --hard HEAD && sudo chown -hR jenkins-deploy:wikidev .
* 00:50 bd808: Rebooting deployment-logstash3.eqiad.wmflabs; console full of hung process messages from kernel
* 00:27 MaxSem: Initialized ORES on all wikis where it's enabled, was causing job failures
* 00:13 MaxSem: Debugging a fatal in betalabs, might cause syncs to fail


== 2016-07-06 ==
== 2022-03-31 ==
* 20:30 hashar: beta: restarted mysql on both db1 and db2 so it takes into account the --syslog setting  T119370
* 20:58 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/775957
* 20:08 hashar: beta:  on db1 and db2  move the MariaDB 'syslog' setting under [mysqld_safe] section. Cherry picked https://gerrit.wikimedia.org/r/#/c/296713/3 and reloaded mysql on both instances. T119370
* 14:54 hashar: Image ci-jessie-wikimedia-1467816381 in wmflabs-eqiad is ready  T133779
* 14:47 hashar_: attempting to refresh ci-jessie-wikimedia image to get librdkafka-dev included for T133779


== 2016-07-05 ==
== 2022-03-29 ==
* 21:54 hasharAway: CI has drained the gate-and-submit queue
* 14:20 James_F: Zuul: [mediawiki/extensions/IPInfo] Add EventLogging phan dependency for [[phab:T304948|T304948]]
* 21:37 hasharAway: Nodepool: nodepool delete  a few instances that would never spawn / have been stuck for ~ 40 minutes
* 12:32 hashar: integration-agent-docker-1039: clearing leftover pipelinelib builds: `sudo rm -fR /srv/jenkins/workspace/workspace/*`  [[phab:T304932|T304932]] [[phab:T302477|T302477]]
* 05:35 hashar: Relocate castor directory on integration-castor03 from `/srv/jenkins-workspace/caches` to `/srv/castor` https://gerrit.wikimedia.org/r/c/operations/puppet/+/774771


== 2016-07-04 ==
== 2022-03-28 ==
* 18:58 hashar: Upgrading arcanist on permanent CI slaves since xhpast was broken T137770 
* 16:55 hashar: integration: created instance integration-castor04 with flavor `g3.cores8.ram32.disk20` (twice the RAM of integration-castor03) # [[phab:T252071|T252071]]
* 12:50 yuvipanda: migrating deployment-tin to labvirt1011
* 16:49 hashar: integration: created 320G volume https://horizon.wikimedia.org/project/volumes/3f90c3f2-158d-4e45-a919-0f048f47c3b6/ . Intended to migrate integration-castor03 [[phab:T252071|T252071]]
* 10:34 hashar: contint2001 and contint1001: pruning obsolete branches from the zuul-merger: `sudo -H -u zuul find /srv/zuul/git -type d -name .git -print -execdir git -c url."https://gerrit.wikimedia.org/r/".insteadOf="ssh://jenkins-bot@gerrit.wikimedia.org:29418/" remote prune origin \;` [[phab:T220606|T220606]]
* 10:25 hashar: Changed `Trainsperiment Survey Questions` surveys permissions to be open outside of WMF and limited to 1 answer (forcing signin) https://docs.google.com/forms/u/0/d/e/1FAIpQLSd0Nc2jGkAGW-5rTiKN2EHWzfw2HeHm13N-ZCw1xUdE3z6woQ/formrestricted
* 10:18 hashar: contint2001 and contint1001: pruning all git reflog entries from the zuul-merger: `sudo -u zuul find /srv/zuul/git -name .git -type d -execdir git reflog expire --expire=all --all`.  They are useless and no longer generated since https://gerrit.wikimedia.org/r/c/operations/puppet/+/757943
* 09:53 hashar: Tag Quibble 1.4.5 @ {{Gerrit|abe16d574}} {{!}} [[phab:T291549|T291549]]


== 2016-07-03 ==
== 2022-03-27 ==
* 13:10 paladox: phabricator Update phab-01 and phab-05 (phab-02) and phab-03 to fix a security bug in phabricator (Did the update last night but forgot to log it)
* 13:23 James_F: Zuul: [releng/phatality] Make the node14 CI job voting [[phab:T304736|T304736]]
* 12:04 jzerebecki: reloading zuul for 7e6a2e2..13ea50f


== 2016-07-02 ==
== 2022-03-26 ==
* 13:38 jzerebecki: reloading zuul for 15127b2..7e6a2e2
* 02:37 Reedy: beta-update-databases-eqiad is back to @hourly


== 2016-06-30 ==
== 2022-03-25 ==
* 10:31 hashar: Deleting integration-slave-trusty-1015 . Can not bring up mysql T138074  and the ssh slave connection would not hold anyway. Must be broken somehow
* 23:51 Reedy: temporarily turning off periodic building of beta-update-databases-eqiad until it's run to completion
* 10:04 hashar: Attempting to refresh Nodepool image for Jessie ( ci-jessie-wikimedia ). Been stalled for 284 hours (12 days)
* 23:21 Reedy: running /usr/local/bin/wmf-beta-update-databases.py manually
* 09:36 hashar: Trusty is missing the package arcanist ... :(
* 20:22 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/773866
* 09:35 hashar: Attempting to refresh Nodepool image for Trusty ( ci-trusty-wikimedia ). Been stalled for 283 hours (12 days)
* 20:02 brennen: mediawiki-new-errors: ran check-new-error-tasks/check.sh and cleared "resolved" filters
* 09:43 hashar: Building Quibble Docker images to rename quibble-with-apache to quibble-with-supervisord


== 2016-06-28 ==
== 2022-03-24 ==
* 21:33 halfak: deploying ores beec291
* 20:00 hashar: reloading Zuul for {{Gerrit|Id844e1723a38eed627af03397cf0ad90c7b09a32}} # [[phab:T299320|T299320]]
* 21:15 halfak: deploying ores 6979a98
* 20:00 James_F: Clearing integration-castor03:/srv/jenkins-workspace/caches/castor-mw-ext-and-skins/master/mwgate-node14-docker/_cacache/content-v2/sha512/22/ for [[phab:T304652|T304652]]
* 15:00 James_F: Zuul: [design/codex] Publish code coverage reports for [[phab:T303899|T303899]]
* 09:37 Lucas_WMDE: killed a beta-scap-sync-world job manually, let’s see if that helps getting beta updates unstuck


== 2016-06-27 ==
== 2022-03-23 ==
* 22:32 eberhardson: deployment-prep deployed gerrit.wikimedia.org/r/296279 to puppetmaster to test kibana4 role
* 17:35 brennen: restarting phabricator for [[phab:T304540|T304540]], brief downtime expected
* 19:41 bd808: Rebooting deployment-logstash3.eqiad.wmflabs via wikitech. Console log full of blocked kworker messages, ssh non-responsive, and blocking logstash records being recorded.
* 14:56 dancy: Updating scap to 4.5.0-1+0~20220321191814.216~1.gbp24bc64 in beta cluster
* 18:20 thcipriani: deployment-puppetmaster.deployment-prep:/var/lib/git/labs/private modules/secret/secrets/keyholder keys conflicts resolved
* 18:09 bd808: Git repo at deployment-puppetmaster.deployment-prep:/var/lib/git/labs/private is behind upstream due to multiple modules/secret/secrets/keyholder local files that would be overwritten by upstream changes.


== 2016-06-24 ==
== 2022-03-22 ==
* 15:04 hashar: switch apps-android-wikimedia-* jobs to Jessie T138506
* 14:44 hashar: gerrit: `./deploy_artifacts.py --version=3.3.10 gerrit.war` [[phab:T304226|T304226]]
* 14:07 James_F: Killed https://integration.wikimedia.org/ci/job/pywikibot-core-tox-nose-jessie/556/console (stuck for 90 minutes)
* 13:50 hashar: Reloading Zuul for https://gerrit.wikimedia.org/r/c/integration/config/+/771945
* 09:54 hashar: T138506 Adding a JDK installation "Debian - OpenJdk 8" in Jenkins global configuration with JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64


== 2016-06-23 ==
== 2022-03-21 ==
* 13:58 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/295691
* 08:35 hashar: The castor cache for mediawiki/core wmf/1.39-wmf.1 is actually empty!
* 12:13 hashar: Deleting integration-saltmaster and recreating it with Jessie T136410
* 08:32 hashar: Nuking npm castor cache /srv/jenkins-workspace/caches/castor-mw-ext-and-skins/master/wmf-quibble-selenium-php72-docker/npm/ # [[phab:T300203|T300203]]
* 10:14 hashar: T137807 Upgrading Jenkins TAP Plugin
* 08:55 hashar: integration: rebased puppet master by dropping a conflicting/obsolete patch
* 08:28 hashar: fixing puppet cert on deployment-cache-text04


== 2016-06-17 ==
== 2022-03-18 ==
* 10:35 jzerebecki: offlined integration-slave-trusty-1015 T138074
* 14:18 elukey: restart testing of kafka logging TLS certificates (may affect logstash in beta, ping me in case it is a problem)
* 10:06 hashar: Refreshed Nodepool Trusty image
* 13:22 hashar: Rolling back Quibble jobs from 1.4.4 [[phab:T304147|T304147]]
* 10:02 hashar: Refreshed Nodepool Jessie image
* 07:41 elukey: experimenting with PKI and kafka logging on deployment-prep, logstash dashboard/traffic may be down (please ping me in case it is a problem)


== 2016-06-14 ==
== 2022-03-17 ==
* 14:22 hashar: T136971 on tin MediaWiki 1.28.0-wmf.6, from 1.28.0-wmf.6, successfully checked out. Applying security patches
* 19:11 hashar: Building Docker images for Quibble 1.4.4
* 11:21 hashar: T137797 Created Gerrit repository operations/debs/geckodriver  to package https://github.com/mozilla/geckodriver
* 19:06 hashar: Tag Quibble 1.4.4 @ {{Gerrit|56b2c9ba52c}} # [[phab:T300340|T300340]]
* 16:25 hashar: Switching Quibble jobs to use memcached rather than APCu {{!}} https://gerrit.wikimedia.org/r/c/integration/config/+/770468 {{!}} [[phab:T300340|T300340]]
* 14:11 hashar: Update all jobs to support `CASTOR_HOST` env variable {{!}} https://gerrit.wikimedia.org/r/770921 {{!}} [[phab:T216244|T216244]] {{!}} [[phab:T252071|T252071]]
* 14:07 hashar: Building Docker image to support `CASTOR_HOST` {{!}} https://gerrit.wikimedia.org/r/770921 {{!}} [[phab:T216244|T216244]]


== 2016-06-13 ==
== 2022-03-16 ==
* 21:11 hashar: https://integration.wikimedia.org/ci/computer/integration-slave-trusty-1015/ put offline. Jenkins can't ssh / pool it for some reason
* 22:00 James_F: Docker: Publishing sonar-scanner:4.6.0.2311-3 for [[phab:T303958|T303958]]
* 20:07 hashar: beta: update.php / database update finally pass!
* 20:13 James_F: Zuul: [mediawiki/services/function-evaluator and …/function-orchestrator] Switch to npm coverage job for [[phab:T302607|T302607]] and [[phab:T302608|T302608]]
* 19:55 hashar: T137615 deployment-db2, **eswiki** > CREATE INDEX echo_notification_event ON echo_notification (notification_event);
* 19:48 zabe: apply https://gerrit.wikimedia.org/r/c/mediawiki/extensions/CentralAuth/+/769424/ on deployment-prep
* 19:22 hashar: T137615 deployment-db2, enwiki > CREATE INDEX echo_notification_event ON echo_notification (notification_event);
* 19:43 taavi: apply https://gerrit.wikimedia.org/r/c/mediawiki/extensions/CentralAuth/+/771347/ on deployment-prep
* 10:37 hashar: Restarted puppetmaster on integration-puppetmaster (memory leak / can not fork: no memory)
* 10:35 hashar: T137561  salt -v '*trusty*' cmd.run "cd /root/ && dpkg -i firefox_46.0.1+build1-0ubuntu0.14.04.3_amd64.deb"
* 10:23 hashar: Hard reboot integration-slave-trusty-1015
* 08:30 hashar: Beta: `mwscript extensions/Echo/maintenance/removeInvalidTargetPage.php --wiki=enwiki` for T137615


== 2016-06-10 ==
== 2022-03-15 ==
* 15:49 jzerebecki: reloading zuul for 8c048fb..272d1ec
* 18:26 brennen: gitlab: removed most existing /people groups
* 15:29 jzerebecki: T137561 integration-puppetmaster:/var/lib/git/operations/puppet# git reset --hard 1e1ff12b13b73b5c5e2015a72f51561f10b305d0
* 18:10 brennen: gitlab: finished migrating access for all existing people groups to direct project membership ([[phab:T274461|T274461]], [[phab:T300935|T300935]])
* 15:19 jzerebecki: T137561 integration-saltmaster:~# salt -v '*trusty*' cmd.run "cd /root/ && dpkg -i firefox_46.0.1+build1-0ubuntu0.14.04.3_amd64.deb"
* 16:49 dancy: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/770963
* 15:18 jzerebecki: T137561 integration-saltmaster:~# salt -v '*trusty*' cmd.run "cd /root/ && wget 'https://ubuntu.wikimedia.org/ubuntu/pool/main/f/firefox/firefox_46.0.1%2bbuild1-0ubuntu0.14.04.3_amd64.deb'"
* 14:30 hashar: CI Jenkins: globally defined CASTOR_HOST=integration-castor03.integration.eqiad.wmflabs via https://integration.wikimedia.org/ci/configure # [[phab:T216244|T216244]]
* 15:15 jzerebecki: T137561 integration-puppetmaster:/var/lib/git/operations/puppet# git fetch https://gerrit.wikimedia.org/r/operations/puppet refs/changes/39/293739/1 && git cherry-pick FETCH_HEAD
* 14:17 hashar: Apply label `castor` to node https://integration.wikimedia.org/ci/computer/integration-castor03/ # [[phab:T216244|T216244]]
* 01:37 James_F: Zuul: Switch services/function* publish job from node12 to node14
* 01:14 James_F: Zuul: [wikidata/query-builder] Switch branchdeploy from node12 to node14
* 00:08 James_F: Zuul: [wikipeg] Switch from node12 to node14 special job


== 2016-06-09 ==
== 2022-03-14 ==
* 18:49 hashar: restarting nutcracker on deployment-mediawiki02
* 23:57 James_F: Zuul: [ooui] Switch from node12 to node14
* 16:53 hashar: rebuild Nodepool trusty image ci-trusty-wikimedia-1465490962
* 23:46 James_F: Docker: Publishing node14-test-browser-php80-composer:0.1.0
* 16:37 hashar: Manually deleting old zuul references on scandium.eqiad.wmnet . Running in a screen
* 23:27 James_F: Zuul: Drop legacy node12 templates except the one for Services
* 16:32 hashar: rebuild Nodepool jessie image ci-jessie-wikimedia-1465489579
* 23:10 James_F: Zuul: [oojs/router] Drop custom job and just use the generic node14 one
* 16:03 hashar: Restarting Nodepool
* 23:08 James_F: Zuul: [oojs/core] Switch from node12 to node14 jobs
* 22:46 James_F: Zuul: [unicodejs] Switch from node12 to node14
* 22:25 James_F: Zuul: [VisualEditor/VisualEditor] Switch from node12 to node14
* 19:51 James_F: Zuul: Migrate almost all libraries and tools from node12 to node14 for [[phab:T267890|T267890]]
* 15:36 James_F: Zuul: Switch extension-javascript-documentation from node12 to node14 for [[phab:T267890|T267890]]
* 15:21 James_F: Zuul: Switch all mwgate jobs from node12 to node14 for [[phab:T267890|T267890]]
* 09:52 hashar: Building Quibble Docker images for https://gerrit.wikimedia.org/r/757867 {{!}} [[phab:T300340|T300340]]
* 08:54 hashar: Reloading Zuul for https://gerrit.wikimedia.org/r/770079


== 2016-06-08 ==
== 2022-03-11 ==
* 02:56 legoktm: / on gallium is read-only
* 04:02 zabe: zabe@deployment-mwmaint02:~$ mwscript extensions/CentralAuth/maintenance/populateGlobalEditCount.php --wiki=metawiki
* 02:47 legoktm: disabling/enabling gearman in jenkins because everything is stuck


== 2016-06-07 ==
== 2022-03-10 ==
* 19:28 hashar: Nodepool has trouble spawning instances, probably due to ongoing (?) labs maintenance
* 20:45 zabe: apply https://gerrit.wikimedia.org/r/c/mediawiki/extensions/CentralAuth/+/769416 on deployment-prep centralauth databases
* 14:56 hashar: Restarting Jenkins to upgrade Rebuilder plugin with https://github.com/jenkinsci/rebuild-plugin/pull/34  (sort out parameters not being reinjected)
* 20:25 James_F: Zuul: [mediawiki/extensions/VueTest] Add basic quibble CI
* 09:02 hashar: Upgrading Jenkins IRC plugin 2.25..2.27 and instant messaging plugin 1.34..1.35  . The former should fix a deadlock when shutting down Jenkins | T96183
* 20:03 Krinkle: Updating docker-pkg files on contint primary for  https://gerrit.wikimedia.org/r/768843
* 15:12 hashar: updating Quibble jenkins jobs
* 14:26 James_F: Docker: Publishing new versions of quibble-buster and cascade adding unzip for [[phab:T250496|T250496]] / [[phab:T303417|T303417]].
* 11:43 Amir1: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/769668
* 09:59 dwalden: restarted apache on deployment-mediawiki11 # [[phab:T302699|T302699]]


== 2016-06-06 ==
== 2022-03-09 ==
* 19:26 hasharAway: Regenerating Nodepool snapshots for Trusty and Jessie
* 17:08 hashar: Updating Gerrit Comment.soy to get rid of a literal `null` string being inserted in notification emails {{!}} https://gerrit.wikimedia.org/r/c/operations/puppet/+/768005 {{!}} https://phabricator.wikimedia.org/T288312
* 13:04 hashar: Migrated all qunit jobs to Nodepool. T136301 has the related Gerrit changes
* 10:05 hashar: migrating mediawiki-core-qunit job to Nodepool instances https://gerrit.wikimedia.org/r/#/c/291322/ T136301


== 2016-06-04 ==
== 2022-03-08 ==
* 00:09 Krinkle: krinkle@integration-slave-trusty-1017:~$ sudo rm -rf /mnt/jenkins-workspace/workspace/mediawiki-extensions-hhvm/src/extensions/Babel (T86730)
* 20:31 brennen: requiring 2fa for all users under /repos


== 2016-06-03 ==
== 2022-03-07 ==
* 19:18 hashar: Image ci-jessie-wikimedia-1464981111 in wmflabs-eqiad is ready  Zend 5.x for qunit | T136301
* 10:53 zabe: restarted apache on deployment-mediawiki11 # [[phab:T302699|T302699]]
* 15:17 hashar: refreshed Nodepool Trusty image due to some imagemagick upgrade issue.  Image ci-trusty-wikimedia-1464966671 in wmflabs-eqiad is ready
* 10:40 hashar: scandium (zuul merger):  rm -fR /srv/ssd/zuul/git/mediawiki/extensions/Collection  T136930


== 2016-06-02 ==
== 2022-03-04 ==
* 12:10 hashar: Upgraded Zuul. Upstream code is 66c8e52..30a433b, package is 2.1.0-151-g30a433b-wmf1precise1
* 20:29 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/768146
* 19:13 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/768068


== 2016-06-01 ==
== 2022-03-03 ==
* 17:49 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/292186
* 19:13 dancy: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/767864
* 16:45 tgr: enabling AuthManager on beta cluster
* 15:37 James_F: Docker: Publishing sury-php images based on bullseye not stretch and cascade for [[phab:T278203|T278203]]
* 15:20 legoktm: deploying https://gerrit.wikimedia.org/r/292153
* 14:43 hashar: Reloading Zuul for {{Gerrit|Iae45cae8ec209a3e795fe4fd7dd92290565277db}}
* 14:44 twentyafterfour: jenkins restart completed
* 12:47 hashar: Upgrading Quibble on CI Jenkins jobs from 1.3.0 to 1.4.3 https://gerrit.wikimedia.org/r/c/integration/config/+/767749/
* 14:36 twentyafterfour: restarting jenkins to install "single use slave" plugin (jenkins will restart when all builds are finished)
* 10:30 hashar: Building Docker images for Quibble 1.4.3
* 13:49 hashar: Beta : clearing temporary files under /data/project/upload7  (mainly wikimedia/commons/temp )
* 10:22 hashar: Tagged Quibble 1.4.3 @ {{Gerrit|cf5cd1a0a07}}
* 10:29 hashar: Upgraded Linux kernel on deployment-salt02  T136411
* 09:24 hashar: Building Docker images for Quibble 1.4.2
* 10:14 hashar: beta: salt-key -d deployment-salt.deployment-prep.eqiad.wmflabs  T136411
* 09:20 hashar: Tag Quibble 1.4.2 @ {{Gerrit|63d2855a1e}} # [[phab:T302226|T302226]] [[phab:T302707|T302707]]
* 09:16 hashar: Enabling puppet again on Trusty slaves. Chromium is now properly pinned to version 49 ( https://gerrit.wikimedia.org/r/#/c/291116/3 | T136188 )
* 08:55 hashar: integration slaves : salt -v '*' pkg.upgrade


== 2016-05-31 ==
== 2022-03-02 ==
* 20:24 bd808: Reloading zuul to pick up I58f878f3fd19dfa21a46a52464575cb06aacbb22
* 19:53 James_F: Zuul: Configure CI for the forthcoming REL1_38 branches for [[phab:T302908|T302908]]
* 15:56 dancy: Updating scap to 4.4.1-1+0~20220302155149.192~1.gbpe351d6 in beta
* 15:27 Krinkle: Reloading Zuul to deploy  https://gerrit.wikimedia.org/r/767493
* 15:04 taavi: resolve merge conflicts on deployment-puppetmaster04


== 2016-05-30 ==
== 2022-02-28 ==
* 18:39 hashar: Upgraded our Jenkins Job Builder fork to 1.5.0 + a couple of cherry picks: cd63874...10f2bcd
* 19:29 brennen: removing mutante (dzahn) as application-level gitlab admin; adding as owner of /repos for the time being to facilitate some migrations
* 12:53 hashar: Upgrading Zuul 1cc37f7..66c8e52 T128569
* 19:22 dancy: Update scap to 4.4.0-1+0~20220228192031.189~1.gbp0a8436 in beta
* 08:04 ori: zuul is back up but jobs which were enqueued are gone
* 19:17 brennen: adding mutante (dzahn) as application-level gitlab admin
* 07:50 ori: restarting jenkins on gallium, too
* 07:49 ori: restarted zuul-merger service on gallium
* 07:44 ori: Disconnecting and then reconnecting Gearman from Jenkins did not appear to do anything; going to depool / repool nodes.
* 07:42 ori: Temporarily disconnecting Gearman from Jenkins, per <https://www.mediawiki.org/wiki/Continuous_integration/Zuul#Known_issues>


== 2016-05-28 ==
== 2022-02-26 ==
* 04:43 ori: depooling integration-slave-trusty-1015 to profile phpunit runs
* 20:05 zabe: apply [[phab:T302658|T302658]] on deployment-prep centralauth databases
* 13:24 zabe: apply [[phab:T302660|T302660]] on deployment-prep centralauth databases
* 13:19 zabe: apply [[phab:T302659|T302659]] on deployment-prep centralauth databases


== 2016-05-27 ==
== 2022-02-24 ==
* 19:29 hasharAway: Refreshed Nodepool images
* 16:02 dancy: Updating beta cluster scap to 4.4.0-1+0~20220224155429.187~1.gbp66c5c2
* 18:13 thcipriani: restarting zuul for deadlock
* 13:44 hashar: integration/config now fully enforces shellcheck https://gerrit.wikimedia.org/r/756088
* 18:00 thcipriani: Reloading Zuul to deploy I0c3aeacf92d430ad1272f5f00e7fb7182b8a05bf
* 13:13 hashar: Built image docker-registry.discovery.wmnet/releng/castor:0.2.5
* 02:55 bd808: Deleted deployment-fluorine:/srv/mw-log/archive/*-20160[34]* logs; freed 26G
* 13:10 hashar: Updating castor-save-workspace-cache job https://gerrit.wikimedia.org/r/764817
* 11:54 hashar: Built image docker-registry.discovery.wmnet/releng/shellcheck:0.1.1
* 11:41 hashar: Built image docker-registry.discovery.wmnet/releng/sonar-scanner:4.6.0.2311-2
* 11:04 hashar: Built image docker-registry.discovery.wmnet/releng/operations-puppet:0.8.6
* 08:58 hashar: Built image docker-registry.discovery.wmnet/releng/mediawiki-phan-testrun:0.2.1


== 2016-05-26 ==
== 2022-02-23 ==
* 22:23 hashar: salt -v '*trusty*' cmd.run 'puppet agent --disable "Chromium needs to be v49. See T136188"'
* 23:21 dancy: Update beta cluster scap to 4.3.1-1+0~20220223231645.183~1.gbp8ddb60
* 21:47 hashar: integration-slave-trusty-1015 still on Chromium 50 .. T136188
* 20:10 dancy: Updating scap in beta
* 21:42 hashar: downgrading chromium-browser on integration-slave-1015  T136188
* 19:23 hashar: Built docker-registry.discovery.wmnet/releng/logstash-filter-verifier:0.0.3
* 09:24 jzerebecki: reloading zuul for d38ad0a..6798539
* 12:41 hashar: Depooling integration-agent-puppet-docker-1002 , pooling integration-agent-puppet-docker-1003 # [[phab:T252071|T252071]]
* 07:48 gehel: deployment-prep upgrading elasticsearch to 2.3.3 and restarting (T133124)
* 10:21 hashar: Created Bullseye instance integration-agent-puppet-docker-1003 https://horizon.wikimedia.org/project/instances/96cf9ddc-daa3-4c9f-8c21-cdd58e95973e/  # [[phab:T252071|T252071]]
* 07:36 dcausse: deployment-prep elastic: updating cirrussearch warmers (T133124)
* 08:37 hashar: Removing Stretch based integration-agent-qemu-1001 # [[phab:T284774|T284774]]
* 07:31 gehel: deployment-prep deploying new elasticsearch plugins (T133124)


== 2016-05-25 ==
== 2022-02-22 ==
* 22:38 Amir1: running puppet agent manually on sca01
* 16:41 zabe: zabe@deployment-mwmaint02:~$ foreachwiki migrateUserGroup.php oversight suppress # [[phab:T112147|T112147]]
* 16:26 hashar: 2016-05-25 16:24:35,491 INFO nodepool.image.build.wmflabs-eqiad.ci-trusty-wikimedia: Notice: /Stage[main]/Main/Package[ruby-jsduck]/ensure: ensure changed 'purged' to 'present'  T109005
* 13:28 urbanecm: deployment-prep: Create database for incubatorwiki ([[phab:T210492|T210492]])
* 15:07 hashar: g++ added to Jessie and Trusty Nodepool instances | T119143
* 14:12 hashar: Regenerating Nodepool snapshot to include g++ which is required by some NodeJS native modules T119143
* 10:58 hashar: Updating Nodepool ci-jessie-wikimedia snapshot image to get netpbm package installed into it. T126992  https://gerrit.wikimedia.org/r/290651
* 09:30 hashar: Clearing git-sync-upstream script on integration-slave-trusty1013 and integration-slave-trusty-1017. That is only supposed to be on the puppetmaster
* 09:15 hashar: Fixed resolv.conf on integration-slave-trusty-1013 and force running puppet to catch up with change since May 16 19:52
* 09:11 hashar: restarting puppetmaster on integration-puppetmaster  ( memory leak / can not fork)


== 2016-05-24 ==
== 2022-02-21 ==
* 07:03 mobrovac: rebooting deployment-tin, can't log in
* 14:58 hashar: Reverting Quibble jobs from 1.4.0 to 1.3.0 # [[phab:T302226|T302226]]
* 07:31 hashar: Switching Quibble jobs from Quibble 1.3.0 to 1.4.0 # [[phab:T300340|T300340]] [[phab:T291549|T291549]] [[phab:T225730|T225730]]
* 07:27 hashar: Refreshing all Jenkins jobs


== 2016-05-23 ==
== 2022-02-20 ==
* 19:35 hashar: killed all mysqld process on Trusty CI slaves
* 10:32 qchris: Manually triggering replication run of Gerrit's analytics/datahub to populate newly created analytics-datahub GitHub repo
* 15:49 thcipriani: beta code update not running, disconnect-reconnect dance resulted in: [05/23/16 15:48:39] [SSH] Authentication failed.
* 14:32 jzerebecki: offlined integration-slave-trusty-1004 because it can't connect to mysql T135997
* 13:32 hashar: Upgrading Jenkins git plugins and restarting Jenkins
* 11:01 hashar: Upgrading hhvm on Trusty slaves. Brings in hhvm compiled against libicu52 instead of libicu48
* 09:12 _joe_: deployment-prep: all hhvm hosts in beta upgraded to run on the newer libicu; now running updateCollation.php (T86096)
* 09:11 hashar: Image ci-jessie-wikimedia-1463994307 in wmflabs-eqiad is ready
* 09:01 hashar: Image ci-trusty-wikimedia-1463993508 in wmflabs-eqiad is ready
* 08:56 _joe_: deployment-prep: starting upgrade of HHVM to a version linked to libicu52, T86096
* 08:54 hashar: Regenerating Nodepool image manually. Broke over the week-end due to a hhvm/libicu transition.  Should get pip 8.1.x now


== 2016-05-20 ==
== 2022-02-19 ==
* 20:30 bd808: Killing https://integration.wikimedia.org/ci/job/mediawiki-extensions-qunit/43608/ which has been running for 5 hours
* 12:19 taavi: restart trafficserver-tls on deployment-cache-text06
* 02:15 James_F: Zuul: [design/codex] Publish the Netlify preview on every patch for [[phab:T293705|T293705]]
* 00:35 James_F: Manually re-triggered a build of the docs of Codex (via `zuul-test-repo design/codex postmerge`) now that we actually set the environment vars for [[phab:T293705|T293705]]


== 2016-05-19 ==
== 2022-02-18 ==
* 16:47 thcipriani: deployment-tin jenkins worker seems to be back online after [https://www.mediawiki.org/wiki/Continuous_integration/Jenkins#Hung_beta_code.2Fdb_update some prodding]
* 22:54 James_F: Zuul: [branchdeploy-codex-node14-npm-docker] Create as experimental for [[phab:T293705|T293705]]
* 16:41 thcipriani: beta-code-update eqiad hung for past few hours
* 22:14 James_F: Jenkins: Defined BRANCHDEPLOY_AUTH_TOKEN_codex and BRANCHDEPLOY_SITE_ID_codex secrets for [[phab:T293705|T293705]]
* 15:16 hashar: Restarted zuul-merger daemons on both gallium and scandium : file descriptors leaked
* 13:44 hashar: Reloading Zuul for https://gerrit.wikimedia.org/r/c/integration/config/+/763724 [[phab:T301453|T301453]]
* 11:59 hashar: CI: salt -v '*' cmd.run 'pip install --upgrade pip==8.1.2'
* 09:21 hashar: Reloading Zuul for {{Gerrit|I1494abb5e9e28da951ffb72154a074a16a0f8381}}
* 11:54 hashar: Upgrading pip on CI slaves from 7.0.1 to 8.1.2  https://gerrit.wikimedia.org/r/#/c/289639/
* 10:15 hashar: puppet broken on deployment-tin: Error: Could not retrieve catalog from remote server: Error 400 on SERVER: Invalid parameter trusted_group on node deployment-tin.deployment-prep.eqiad.wmflabs


== 2016-05-18 ==
== 2022-02-17 ==
* 13:16 Amir1: deploying a05e830 to ores nodes (sca01 and ores-web)
* 21:48 brennen: added Dzahn (mutante) to acl*repository-admins on phabricator
* 12:46 urandom: (re)cherry-picking c/284078 to deployment-prep
* 15:58 zabe: root@deployment-cache-upload06:~# touch /srv/trafficserver/tls/etc/ssl_multicert.config && systemctl reload trafficserver-tls.service # [[phab:T301995|T301995]]
* 11:36 hashar: Restarted qa-morebots
* 13:35 hashar: Reloading Zuul for https://gerrit.wikimedia.org/r/c/integration/config/+/763207
* 11:36 hashar: Marked mediawiki/core/vendor repository as hidden in Gerrit. It got moved to mediawiki/vendor, including the whole history. Settings page: https://gerrit.wikimedia.org/r/#/admin/projects/mediawiki/core/vendor
* 13:20 Reedy: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/763458
* 11:12 hashar: Bringing deployment-deploy03 back
* 11:07 hashar: Disabled deployment-deploy03 Jenkins agent in order to revert some mediawiki/core patch and test the outcome


== 2016-05-13 ==
== 2022-02-16 ==
* 14:39 thcipriani: remove shadow l10nupdate user from deployment-tin and mira in beta
* 18:20 hashar: Tag Quibble 1.4.1 @ {{Gerrit|d4bd2801de}} # [[phab:T300301|T300301]]
* 10:20 hashar: Put integration-slave-trusty-1004 offline. Ssh/passwd is borked  T135217
* 16:42 dancy: Updating to scap 4.3.1-1+0~20220216163646.173~1.gbp823710 in beta
* 09:59 hashar: Deleting non nodepool mediawiki PHPUnit jobs for T135001 (mediawiki-phpunit-hhvm mediawiki-phpunit-parsertests-hhvm mediawiki-phpunit-parsertests-php55 mediawiki-phpunit-php55)
* 12:55 jelto: apply gitlab-settings to gitlab-prod-1001.devtools.eqiad1.wikimedia.cloud
* 04:06 thcipriani|afk: changed ownership of mwdeploy public keys post shadow mwdeploy user removal is important
* 10:09 hashar: Reloading Zuul for {{Gerrit|I997fee0f160ca3049b8085879831bfe175096ced}}
* 03:47 thcipriani|afk: ldap failure has created a shadow mwdeploy user on beta, deleted using vipw
* 09:59 hashar: Reloading Zuul for {{Gerrit|I2ffa016563ad37f1e7c13dcce81deb8ab411c9e2}}


== 2016-05-12 ==
== 2022-02-15 ==
* 22:53 bd808: Started dead mysql on integration-slave-precise-1011
* 21:12 dancy: rebooting deployment-mediawiki12.deployment-prep.eqiad1.wikimedia.cloud to try to revive beta wikis
* 20:59 dancy: Killed runaway puppet agent on deployment-mediawiki11.deployment-prep.eqiad1.wikimedia.cloud
* 16:24 hashar: Restarting CI Jenkins for plugins updates
* 16:21 hashar: Upgrading Jenkins plugins on releases Jenkins
* 16:06 hashar: Rollback fresh-test Jenkins job to the version intended to run on integration-agent-qemu-1001
* 15:26 hashar: Reloading Zuul for {{Gerrit|If80b4b4cfa5c1a869ceb220f5b11c272b384a721}}


== 2016-05-11 ==
== 2022-02-14 ==
* 21:05 hashar: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/288128  #T134946
* 16:28 dancy: Updating scap in beta cluster to 4.3.1-1+0~20220211225318.167~1.gbp315b2c
* 20:26 hashar: rebooting integration-slave-trusty-1016  is back up
* 16:16 Amir1: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/762471
* 20:15 hashar: rebooting integration-slave-trusty-1016  unreachable somehow
* 15:41 hashar: Messing around with fresh-test Jenkins job to polish up Qemu / qcow2 integration
* 16:43 hashar: Reduced number of executors on Trusty instances from 3 to 2. Memory gets exhausted, causing the tmpfs to drop files and thus MW jobs to fail randomly.
* 14:26 jnuche: Jenkins upgrade complete [[phab:T301361|T301361]]
* 13:33 hashar: Added contint::packages::php to Nodepool images T119139
* 13:54 jnuche: Jenkins contint instances are going to be restarted soon
* 12:59 hashar: Dropping texlive and its dependencies from gallium.
* 12:52 hashar: deleted integration-dev
* 12:51 hashar: creating  integration-dev instance to hopefully have Shinken clean itself
* 11:42 hashar: rebooting deployment-aqs01 via wikitech  T134981
* 10:46 hashar: beta/ci puppetmaster : deleting old tags in /var/lib/git/operations/puppet  and repacking the repos
* 08:49 hashar: Deleting instances deployment-memc02 and deployment-memc03 (Precise instances, migrated to Jessie)  #T134974
* 08:43 hashar: Beta: switching memcached to new Jessie servers by cherry picking https://gerrit.wikimedia.org/r/#/c/288156/ and running puppet on mw app servers  #T134974
* 08:20 hashar: Creating deployment-memc04 and deployment-memc05 to switch beta cluster memcached to Jessie.  m1.medium with security policy "cache" T13497
* 01:44 matt_flaschen: Created Flow-specific External Store tables (blobs_flow1) on all wiki databases on Beta Cluster: T128417


== 2016-05-10 ==
== 2022-02-12 ==
* 19:17 hashar: beta / CI  purging old Linux kernels:  salt -v '*' cmd.run 'dpkg -l|grep ^rc|awk "{ print \$2 }"|grep linux-image|xargs dpkg --purge'
* 18:22 urbanecm: deployment-prep: reboot deployment-eventgate-3 ([[phab:T289029|T289029]])
* 17:34 cscott: updated OCG to version b0c57a1c6890e9fa1f2c3743fc14cb6a7f244fc3
* 16:44 bd808: Cleaned up 8.5G of pbuilder tmp output on integration-slave-jessie-1001 with `sudo find /mnt/pbuilder/build -maxdepth 1 -type d -mtime +1 -exec rm -r {} \+`
* 16:35 bd808: https://integration.wikimedia.org/ci/job/debian-glue failure on integration-slave-jessie-1001 due to /mnt being 100% full
* 14:20 hashar: deployment-puppetmaster mass cleaned packages/service/users etc  T134881
* 13:54 moritzm: restarted zuul-merger on scandium for openssl update
* 13:52 moritzm: restarting zuul on gallium for openssl update
* 13:51 moritzm: restarted apache and zuul-merger on gallium for openssl update
* 13:48 hashar: deployment-puppetmaster : dropping role::ci::jenkins_access role::ci::slave::labs and role::ci::slave::labs::common  T134881
* 13:46 hashar: Deleting Jenkins slave deployment-puppetmaster T134881
* 13:45 hashar: Change https://integration.wikimedia.org/ci/job/beta-build-deb/ job to use label selector "DebianGlue && DebianJessie" instead of "BetaDebianRepo"  T134881
* 13:33 hashar: Migrating all debian glue jobs to Jessie permanent slaves T95545
* 13:30 hashar: Adding  integration-slave-jessie-1002 in Jenkins.  it is all puppet compliant
* 12:59 thcipriani|afk: triggering puppet run on scap targets in beta for https://gerrit.wikimedia.org/r/#/c/287918/ cherry pick
* 09:07 hashar: fixed puppet.conf on deployment-cache-text04


== 2016-05-09 ==
== 2022-02-10 ==
* 20:58 hashar: Unbroke puppet on integration-raita.integration.eqiad.wmflabs . Puppet was blocked because role::ci::raita was no more. Fixed by rebasing https://gerrit.wikimedia.org/r/#/c/208024 T115330 
* 17:29 jeena: reloading Zuul to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/761602
* 20:13 hashar: beta: salt -v '*' cmd.run 'dpkg --purge libganglia1 ganglia-monitor; rm -fR /etc/ganglia'  # T134808
* 20:06 hashar: CI, removing ganglia configuration entirely via:  salt -v '*' cmd.run 'rm -fRv /etc/ganglia'  # T134808
* 20:04 hashar: CI, removing ganglia configuration entirely via:  salt -v '*' cmd.run 'dpkg --purge ganglia-monitor'  # T134808
* 16:32 jzerebecki: reloading zuul for 3e2ab56..d663fd0
* 15:39 andrewbogott: migrating deployment-fluorine to labvirt1009
* 15:39 hashar: Adding label contintLabsSlave  to integration-slave-jessie1001 and  integration-slave-jessie1002
* 15:26 hashar: Creating integration-slave-jessie-1001 T95545


== 2016-05-06 ==
== 2022-02-09 ==
* 19:45 urandom: Restart cassandra-metrics-collector on deployment-restbase0[1-2]
* 15:22 taavi: deleted shutoff deployment-mx02
* 19:41 urandom: Rebasing 02ae1757 on deployment-puppetmaster : T126629


== 2016-05-05 ==
== 2022-02-08 ==
* 22:09 MaxSem: Promoted Yurik and Jgirault to sysops on beta enwiki. Through shell because logging in is broken for me.
* 17:34 taavi: remove scap from deployment-kafka-main/jumbo
* 16:23 taavi: hard reboot misbehaving deployment-echostore01
* 13:39 taavi: delete /srv/mediawiki-staging.save on deployment-deploy03


== 2016-05-04 ==
== 2022-02-07 ==
* 21:28 cscott: deployed puppet FQDN domain patch for OCG: https://gerrit.wikimedia.org/r/286068 and restarted ocg on deployment-pdf0[12]
* 20:55 taavi: added Zabe as member of the deployment-prep project [[phab:T301179|T301179]]
* 15:03 hashar: beta-scap: deployment-tin.deployment-prep.eqiad.wmflabs Name or service not known
* 18:19 Reedy: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/760550
* 15:03 hashar: beta-scap: deployment-tin.deployment-prep.eqiad.wmflabs
* 12:24 hashar: deleting Jenkins job mediawiki-core-phpcs  , replaced by Nodepool version mediawiki-core-phpcs-trusty  T133976
* 12:11 hashar: beta: restarted nginx on varnish caches ( systemctl restart nginx.service ) since they were not listening on port 443 #T134362
* 11:07 hashar: restarted CI puppetmaster  (out of memory leak)
* 10:57 hashar: CI: mass upgrading deb packages
* 10:53 hashar: beta: clearing out leftover apt conf that points to unreachable web proxy : salt -v '*' cmd.run "find /etc/apt -name '*-proxy' -delete"
* 10:48 hashar: Manually fixing nginx upgrade on deployment-cache-text04 and deployment-cache-upload04  see T134362 for details
* 09:27 hashar: deployment-cache-text04 systemctl stop varnish-frontend.service  . To clear out all the stuck CLOSE_WAIT connections  T134346
* 08:33 hashar: fixed puppet on deployment-cache-text04 (race condition generating puppet.conf )


== 2016-05-03 ==
== 2022-02-04 ==
* 23:21 bd808: Changed "Maximum Number of Retries" for ssh agent launch in jenkins for deployment-tin from "0" to "10"
* 00:21 Krinkle: Updating docker-pkg files on contint primary for https://gerrit.wikimedia.org/r/759622
* 23:01 twentyafterfour: rebooting deployment-tin
* 23:00 bd808: Jenkins agent on deployment-tin not spawning; investigating
* 20:02 hashar: Restarting Jenkins
* 16:49 hashar: Notice: /Stage[main]/Contint::Packages::Python/Package[pypy]/ensure: ensure changed 'purged' to 'present'  | T134235
* 16:46 hashar: Refreshing Nodepool Jessie image to have it include pypy | T134235  poke @jayvdb
* 14:49 mobrovac: deployment-tin rebooting it
* 14:25 hashar: beta  salt -v '*' pkg.upgrade
* 14:19 hashar: beta: added unattended upgrade to Hiera::deployment-prep
* 13:30 hashar: Restarted nslcd on deployment-tin ,  pam was refusing authentication for some reason
* 13:29 hashar: beta: got rid of a leftover Wikidata/Wikibase patch that broke scap  salt -v 'deployment-tin*' cmd.run 'sudo -u jenkins-deploy git -C /srv/mediawiki-staging/php-master/extensions/Wikidata/ checkout -- extensions/Wikibase/lib/maintenance/populateSitesTable.php'
* 13:23 hashar: deployment-tin force upgraded HHVM from 3.6 to 3.12
* 09:42 hashar: adding puppet class contint::slave_scripts to deployment-sca01 and deployment-sca02 . Ships multigit.sh  T134239
* 09:31 hashar: Deleting CI slave deployment-cxserver03 , added deployment-sca01 and deployment-sca02 in Jenkins. T134239
* 09:28 hashar: deployment-sca01 removing puppet lock /var/lib/puppet/state/agent_catalog_run.lock  and running puppet again
* 09:26 hashar: Applying puppet class role::ci::slave::labs::common  on deployment-sca01 and deployment-sca02 (cxserver and parsoid being migrated T134239 )
* 03:33 kart_: Deleted deployment-cxserver03, replaced by deployment-sca0x


== 2016-05-02 ==
== 2022-02-03 ==
* 21:27 cscott: updated OCG to version b775e612520f9cd4acaea42226bcf34df07439f7
* 18:41 taavi: deployment-prep: route /w/api.php to deployment-mediawiki11, trying to reduce load on a single server
* 21:26 hashar: Nodepool is acting just fine: Demand from gearman: ci-trusty-wikimedia: 457  | <AllocationRequest for 455.0 of ci-trusty-wikimedia>
* 14:53 hashar: Building Docker images for Quibble 1.4.(prepared by kostajh)
* 21:25 hashar: restarted qa-morebots "2016-05-02 21:22:23,599 ERROR: Died in main event loop"
* 13:51 kostajh: Tag Quibble 1.4.0 @ {{Gerrit|4231bc2832395d94e29a332fe8d863301a0cd441}} # [[phab:T300340|T300340]] [[phab:T291549|T291549]] [[phab:T225730|T225730]]
* 21:23 hashar: gallium: enqueued 488 jobs directly in Gearman. That is to test https://gerrit.wikimedia.org/r/#/c/286462/ ( mediawiki/extensions to hhvm/zend5.5 on Nodepool). Progress /home/hashar/gerrit-286462.log
* 20:14 hashar: MediaWiki phpunit jobs to run on Nodepool instances \O/
* 16:41 urandom: Forcing puppet run and restarting Cassandra on deployment-restbase0[1-2] : T126629
* 16:40 urandom: Cherry-picking https://gerrit.wikimedia.org/r/operations/puppet refs/changes/78/284078/12 to deployment-puppetmaster : T126629
* 16:24 urandom: Restart Cassandra on deployment-restbase0[1-2] : T126629
* 16:21 urandom: forcing puppet run on deployment-restbase0[1-2] : T126629
* 16:21 urandom: cherry-picking latest refs/changes/78/284078/11 onto deployment-puppetmaster : T126629
* 09:44 hashar: On zuul-merger instances (gallium / scandium), cleared out pywikibot/core working copy ( rm -fR /srv/ssd/zuul/git/pywikibot/core/ ) T134062


== 2016-04-30 ==
== 2022-02-02 ==
* 18:31 Amir1: deploying d4f63a3 from github.com/wiki-ai/ores-wikimedia-config into targets in beta cluster via scap3
* 16:50 dancy: Upgrading scap to 4.2.2-1+0~20220202164708.157~1.gbp376a16 in beta.
* 16:12 dancy: Upgrading scap to 4.2.2-1+0~20220201161808.156~1.gbp1c1c64 in beta


== 2016-04-29 ==
== 2022-02-01 ==
* 16:37 jzerebecki: restarting zuul for 4e9d180..ebb191f
* 17:27 addshore: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/734654
* 15:45 hashar: integration: deleting integration-trusty-1026 and cache-rsync . Maybe that will clear them up from Shinken
* 00:34 tgr: deployment-prep un-cherry-picked gerrit 758584 from beta puppetmaster, patch is now merged [[phab:T300591|T300591]]
* 15:14 hashar: integration: created 'cache-rsync' and 'integration-trusty-1026' , attempting to have Shinken deprovision them
* 00:12 tgr: deployment-prep cherry-picked gerrit 758584 to beta puppetmaster [[phab:T300591|T300591]]


== 2016-04-28 ==
== 2022-01-31 ==
* 22:03 urandom: deployment-restbase01 upgrade to 2.2.6 complete : T126629
* 19:01 James_F: Re-configured Jenkins job mediawiki-i18n-check-docker to {{Gerrit|9e3ea96c548d7a84be763d38c2d118bc861cf189}} for [[phab:T222216|T222216]]
* 21:56 urandom: Stopping Cassandra on deployment-restbase01, upgrading package to 2.2.6, and forcing puppet run : T126629
* 10:49 hashar: Added integration-agent-qemu-1003 with label `Qemu` # [[phab:T284774|T284774]]
* 21:55 urandom: Snapshotting Cassandra tables on deployment-restbase01 (name = 1461880519833) : T126629
* 21:55 urandom: Snapshotting Cassandra tables on deployment-restbase01 : T126629
* 21:52 urandom: Forcing puppet run on deployment-restbase02 : T126629
* 21:51 urandom: Cherry picking operations/puppet refs/changes/78/284078/10 to puppetmaster : T126629
* 20:46 urandom: Starting Cassandra on deployment-restbase02 (now v2.2.6) : T126629
* 20:41 urandom: Re-enable puppet and force run on deployment-restbase02 : T126629
* 20:38 urandom: Halting Cassandra on deployment-restbase02, masking systemd unit, and upgrading package(s) to 2.2.6 : T126629
* 20:37 urandom: Snapshotting Cassandra tables on deployment-restbase02 (snapshot name = 1461875833996) : T126629
* 20:37 urandom: Snapshotting Cassandra tables on deployment-restbase02 : T126629
* 20:33 urandom: Cassandra on deployment-restbase01.deployment-prep started : T126629
* 20:25 urandom: Restarting Cassandra on deployment-restbase01.deployment-prep : T126629
* 20:14 urandom: Re-enable puppet on deployment-restbase01.deployment-prep, and force a run : T126629
* 20:12 urandom: cherry-picking https://gerrit.wikimedia.org/r/#/c/284078/ to deployment-puppetmaster : T126629
* 20:06 urandom: Disabling puppet on deployment-restbase0[1-2].deployment-prep : T126629
* 14:43 hashar: Rebuild Nodepool Jessie image. Comes with hhvm
* 12:52 hashar: Puppet is happy on deployment-changeprop
* 12:47 hashar: apt-get upgrade deployment-changeprop  (outdated exim package)
* 12:42 hashar: Rebuild Nodepool Trusty instance to include the PHP wrapper script T126211


== 2016-04-27 ==
== 2022-01-28 ==
* 23:57 thcipriani: nodepool instances running again after an openstack rabbitmq restart by andrewbogott
* 21:45 taavi: running recountCategories.php on all beta wikis per [[phab:T299823|T299823]]#7652496
* 22:51 duploktm: also ran openstack server delete ci-jessie-wikimedia-85342
* 14:27 hashar: taking heapdump of CI Jenkins `sudo -u jenkins /usr/lib/jvm/java-11-openjdk-amd64/bin/jmap -dump:live,format=b,file=/var/lib/jenkins/202201281527.hprof xxxx`
* 22:42 legoktm: nodepool delete 85342
* 22:41 matt_flaschen: Deployed https://gerrit.wikimedia.org/r/#/c/285765/ to enable External Store everywhere on Beta Cluster
* 22:38 legoktm: stop/started nodepool
* 22:36 thcipriani: I don't have permission to restart nodepool
* 22:35 thcipriani: restarting nodepool
* 22:18 matt_flaschen: Deployed https://gerrit.wikimedia.org/r/#/c/282440/ to switch Beta Cluster to use External Store for new testwiki writes
* 21:00 hashar: thcipriani downgraded git plugins successfully (we wanted to rule out their upgrade  for some weird issue)
* 20:13 cscott: updated OCG to version e39e06570083877d5498da577758cf8d162c1af4
* 14:10 hashar: restarting Jenkins
* 14:09 hashar: Jenkins upgrading credential plugin 1.24 > 1.27 And Credentials binding plugin 1.6 > 1.7
* 14:07 hashar: Jenkins upgrading git plugin 2.4.1 > 2.4.4
* 14:01 hashar: Jenkins upgrading git client plugin 1.19.1 > 1.19.6
* 13:13 jzerebecki: reloading zuul for 81a1f1a..0993349
* 11:43 hashar: fixed puppet on deployment-cache-text04  T132689
* 10:38 hashar: Rebuild Image ci-trusty-wikimedia-1461753210 in wmflabs-eqiad is ready
* 09:43 hashar: tmh01.deployment-prep.eqiad.wmflabs denies mwdeploy user breaking https://integration.wikimedia.org/ci/job/beta-scap-eqiad/


== 2016-04-26 ==
== 2022-01-27 ==
* 20:45 hashar: Regenerating Nodepool Jessie snapshot to include composer and HHVM | T128092
* 20:26 hashar: Successfully published image docker-registry.discovery.wmnet/releng/logstash-filter-verifier:0.0.2  # [[phab:T299431|T299431]]
* 20:23 jzerebecki: reloading zuul for eb480d8..81a1f1a
* 19:34 Amir1: Reloading Zuul to deploy 757464
* 19:25 jzerebecki: reload zuul for 4675213..eb480d8
* 16:00 hashar: Pooling back agents 1035 1036 1037 1038 , they could not connect due to ssh host mismatch since yesterday they all got attached to instance 1033 and accepted that host key # [[phab:T300214|T300214]]
* 19:25 jzerebecki: 4675213..eb480d8
* 09:16 hashar: integration: cumin --force 'name:docker' 'apt install rsync'  # [[phab:T300236|T300236]]
* 14:18 hashar: Applied security patches to 1.27.0-wmf.22 | T131556
* 09:05 hashar: integration: cumin --force 'name:docker' 'apt install rsync'  # [[phab:T300214|T300214]]
* 12:39 hashar: starting cut of 1.27.0-wmf.22 branch ( poke ostriches )
* 00:24 thcipriani: restarting jenkins
* 10:29 hashar: restored integration/phpunit on CI slaves due to https://integration.wikimedia.org/ci/job/operations-mw-config-phpunit/ failing
* 09:11 hashar: CI is back up!
* 08:20 hashar: shutoff instance castor, does not seem to be able to start again :| T133652
* 08:12 hashar: hard rebooting castor instance | T133652
* 08:10 hashar: soft rebooting castor instance | T133652
* 08:06 hashar: CI jobs deadlocked due to castor being unavailable | https://phabricator.wikimedia.org/T133652
* 00:46 thcipriani: temporary keyholder fix in place in beta
* 00:18 thcipriani: beta-scap-eqiad failure due to bad keyholder-auth.d fingerprints


== 2016-04-25 ==
== 2022-01-26 ==
* 20:58 cscott: updated OCG to version 58a720508deb368abfb7652e6a8c7225f95402d2
* 20:29 hashar: Completed migration of integration-agent-docker-XXXX instances from Stretch to Bullseye - [[phab:T252071|T252071]]
* 19:46 hashar: Nodepool now has a couple trusty instances intended to experiment with Zend 5.5 / HHVM migration . https://phabricator.wikimedia.org/T133203#2236625
* 19:55 hashar: deleting integration-agent-docker-1014 which only has the `codehealth` label. A short-lived experiment no longer used since October 2nd 2019 - https://gerrit.wikimedia.org/r/c/integration/config/+/540362 - [[phab:T234259|T234259]]
* 13:34 hashar: Nodepool is attempting to create a Trusty snapshot with name ci-trusty-wikimedia-1461591203 | T133203
* 18:56 hashar: integration: pooled in Jenkins a few more Bullseye docker agents for [[phab:T252071|T252071]]
* 13:15 hashar: openstack image create --file /home/hashar/image-trusty-20160425T124552Z.qcow2 ci-trusty-wikimedia --disk-format qcow2 --property show=true  # T133203
* 18:17 hashar: integration: pooled in Jenkins a few Bullseye docker agent for [[phab:T252071|T252071]]
* 10:38 hashar: Refreshing Nodepool Jessie snapshot based on new image
* 16:45 hashar: integration: creating  integration-agent-docker-1023  based on buster with new flavor `g3.cores8.ram24.disk20.ephemeral60.4xiops` # [[phab:T290783|T290783]]
* 10:35 hashar: Refreshed Nodepool Jessie image ( image-jessie-20160425T100035Z )
* 09:24 hashar: beta / scap failure filed as T133521
* 09:20 hashar: Keyholder / mwdeploy ssh keys have been messed up on beta cluster somehow :-(
* 08:47 hashar: mwdeploy@deployment-tin has lost ssh host keys file :(
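The Nodepool snapshot names in these entries (for example ci-trusty-wikimedia-1461591203) end in a number that looks like a Unix timestamp. A minimal sketch for decoding it, assuming the text after the last hyphen is an epoch value in seconds; the helper name `snapshot_created_at` is hypothetical and not part of Nodepool:

```python
from datetime import datetime, timezone


def snapshot_created_at(name: str) -> datetime:
    """Return the UTC time encoded in the trailing number of a snapshot name."""
    # Assumption: everything after the last "-" is a Unix timestamp in seconds.
    epoch = int(name.rsplit("-", 1)[1])
    return datetime.fromtimestamp(epoch, tz=timezone.utc)


# Snapshot name taken from the 13:34 entry on 2016-04-25.
print(snapshot_created_at("ci-trusty-wikimedia-1461591203").isoformat())
# prints 2016-04-25T13:33:23+00:00
```

The decoded time falls within a minute of the 13:34 log entry, which is consistent with the assumption but does not prove how Nodepool generates the name.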


== 2016-04-24 ==
== 2022-01-25 ==
* 17:14 jzerebecki: reloading e06f1fe..672fc84
* 20:17 James_F: Zuul: [mediawiki/extensions/CentralAuth] Drop UserMerge dependency
* 16:39 James_F: Zuul: Mark Math extension as now tarballed in parameter_functions for [[phab:T232948|T232948]]
* 15:57 James_F: Zuul: [mediawiki/extensions/Math] Add Math to the main gate for [[phab:T232948|T232948]]
* 13:44 hashar: Jenkins CI: added Logger https://integration.wikimedia.org/ci/log/ProcessTree%20-%20T299995/ to watch `hudson.util.ProcessTree` for [[phab:T299995|T299995]]
* 10:02 hashar: integration: removing usage of `role::ci::slave::labs::docker::docker_lvm_volume` in Horizon following https://gerrit.wikimedia.org/r/c/operations/puppet/+/755948  . Docker role instances now always have a 24G partition for Docker
* 09:59 hashar: integration-agent-qemu-1001: resized /srv to 100% disk free: `lvextend -r -l +100%FREE /dev/mapper/vd-second--local--disk` # [[phab:T299996|T299996]]
* 09:59 hashar: integration-agent-qemu-1001: resizing /dev/mapper/vd-second--local--disk (/srv) to 20G : `resize2fs -p /dev/mapper/vd-second--local--disk 20G` # [[phab:T299996|T299996]]
* 09:51 hashar: integration-agent-qemu-1001: resizing /dev/mapper/vd-second--local--disk (/srv) to 20G : `resize2fs -p /dev/mapper/vd-second--local--disk 20G`
* 09:51 hashar: integration-agent-qemu-1003: nuked /dev/vd/second-local-disk and /srv to make room for a docker logical volume. That has fixed puppet  [[phab:T299996|T299996]]
* 09:22 Reedy: unblocked beta again
* 07:32 Krinkle: integration-castor03:/srv/jenkins-workspace/caches$ sudo rm -rf castor-mw-ext-and-skins/


== 2016-04-22 ==
== 2022-01-24 ==
* 18:13 legoktm: deploying https://gerrit.wikimedia.org/r/284841
* 21:44 Reedy: unstick beta ci jobs
* 08:13 legoktm: deploying https://gerrit.wikimedia.org/r/284860
* 21:19 jeena: reloading Zuul to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/756523
* 20:36 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/756139
* 17:28 hashar: Nuke castor caches on integration-castor03 : sudo rm -fR /srv/jenkins-workspace/caches/castor-mw-ext-and-skins/master/<nowiki>{</nowiki>quibble-vendor-mysql-php72-selenium-docker,wmf-quibble-selenium-php72-docker<nowiki>}</nowiki>  # [[phab:T299933|T299933]]
* 17:28 hashar: Nuke castor caches on integration-castor03 : sudo rm -fR /srv/jenkins-workspace/caches/castor-mw-ext-and-skins/master/<nowiki>{</nowiki>quibble-vendor-mysql-php72-selenium-docker,wmf-quibble-selenium-php72-docker<nowiki>}</nowiki>


== 2016-04-21 ==
== 2022-01-22 ==
* 19:07 thcipriani: scap version testing should be done, puppet should no longer be disabled on hosts
* 13:40 taavi: apply [[phab:T299827|T299827]] on deployment-prep centralauth database
* 18:02 thcipriani: disabling puppet on scap targets to test scap_3.1.0-1+0~20160421173204.70~1.gbp6706e0_all.deb
* 11:44 taavi: restart varnish-frontend.service on deployment-cache-upload06 to clear puppet agent failure alerts


== 2016-04-20 ==
== 2022-01-21 ==
* 22:28 thcipriani: rolling back scap version in beta, legit failure :(
* 18:12 taavi: resolved merge conflicts on deployment-puppetmaster04
* 21:52 thcipriani: testing new scap version in beta on deployment-tin
* 15:50 hashar: integration-puppetmaster-02: deleted 2021 snapshot tags in puppet repo and ran `git gc --prune=now`
* 17:54 thcipriani: Reloading Zuul to deploy [[gerrit:284494]]
* 13:58 hashar: Stopping HHVM on CI slaves by cherry picking a couple puppet patches | T126594
* 13:33 hashar: salt -v '*trusty*' cmd.run 'rm /usr/lib/x86_64-linux-gnu/hhvm/extensions/current'  # Cleanup on CI slaves for T126658
* 13:27 hashar: Restarted integration puppet master service (out of memory / mem leak)


== 2016-04-17 ==
== 2022-01-20 ==
* 01:01 legoktm: deploying https://gerrit.wikimedia.org/r/283837
* 20:24 James_F: Zuul: [Kartographer] Add parsoid as dependency for CI jobs
* 20:22 James_F: Zuul: [DiscussionTools] Add Gadgets as dependency for Phan jobs
* 20:04 dancy: Jenkins beta jobs are back online, using scap prep auto now.
* 19:19 dancy: Pausing beta Jenkins jobs to make a copy of /srv/mediawiki-staging in preparation for testing
* 19:10 dancy: Unpacking scap (4.1.1-1+0~20220120175448.144~1.gbp517f9d) over (4.1.1-1+0~20220113154148.133~1.gbp6e3a17) on deploy03
* 18:07 hashar: Updating Quibble jobs to have MediaWiki files written on the hosts /srv partition (38G) instead of inside the container which ends in /var/lib/docker (24G) https://gerrit.wikimedia.org/r/755743  # [[phab:T292729|T292729]]
* 16:31 hashar: Rebalancing /var/lib/docker and /srv partitions on CI agents {{!}} https://gerrit.wikimedia.org/r/755713
* 12:12 hashar: contint2001 deleting all the Docker images (they will be pulled as needed)
* 12:10 hashar: contint2001 : docker container prune && docker image prune
* 12:07 hashar: contint1001 deleting all the Docker images (they will be pulled as needed)
* 12:04 hashar: contint1001 `docker image prune`
* 11:51 hashar: Cleaning very old Docker images on contint1001.wikimedia.Org


== 2016-04-16 ==
== 2022-01-19 ==
* 14:21 Krenair: restarted qa-morebots per request
* 18:20 hashar: Adding  https://integration.wikimedia.org/ci/computer/contint1001/ back to the pool again
* 14:18 Krenair: <jzerebecki> !log reloading zuul for 3f64dbd..c6411a1
* 17:31 hashar: Adding  https://integration.wikimedia.org/ci/computer/contint1001/ back to the pool after the machine got powercycled # [[phab:T299542|T299542]]
* 10:38 Reedy: kill some stuck jobs [[phab:T299485|T299485]]


== 2016-04-13 ==
== 2022-01-18 ==
* 01:48 legoktm: deploying https://gerrit.wikimedia.org/r/282952
* 19:56 hashar: building Docker images for https://gerrit.wikimedia.org/r/754951
* 18:01 taavi: added ryankemper as a member of the deployment-prep project
* 15:00 hashar: Updating Jenkins jobs for Quibble 1.3.0  with proper PHP version in the images # [[phab:T299389|T299389]]
* 11:39 hashar: Rolling back Quibble 1.3.0 jobs due to php configuration files with at least releng/quibble-buster73:1.3.0  # [[phab:T299389|T299389]]
* 08:07 hashar: Updating Jenkins jobs for Quibble to pass `--parallel-npm-install` https://gerrit.wikimedia.org/r/c/integration/config/+/754569
* 08:02 hashar: Updating Jenkins jobs for Quibble 1.3.0


== 2016-04-12 ==
== 2022-01-17 ==
* 19:47 bd808: Cleaned up large hhbc cache file on deployment-mediawiki03 via `sudo service hhvm stop; sudo rm /var/cache/hhvm/fcgi.hhbc.sq3; sudo service hhvm start`
* 16:28 hashar: Building Quibble 1.3.0 Docker images
* 19:47 bd808: Cleaned up large hhbc cache file on deployment-mediawiki02 via `sudo service hhvm stop; sudo rm /var/cache/hhvm/fcgi.hhbc.sq3; sudo service hhvm start`
* 16:16 hashar: Tagged Quibble 1.3.0 @ {{Gerrit|2b2c7f9a45}} # [[phab:T297480|T297480]] [[phab:T226869|T226869]] [[phab:T294931|T294931]]
* 19:46 bd808: Cleaned up large hhbc cache file on deployment-mediawiki01 via `sudo service hhvm stop; sudo rm /var/cache/hhvm/fcgi.hhbc.sq3; sudo service hhvm start`
* 08:32 hashar: Refreshing all Jenkins jobs with jjb to take in account recent changes related to the Jinja2 docker macro
* 19:10 Amir1: manually rebooted deployment-ores-web
* 19:08 Amir1: manually cherry-picked 282992/2 into puppetmaster
* 17:05 Amir1: ran puppet agent in sca01 manually in /srv directory
* 11:34 hashar: Jenkins upgrading "Script Security Plugin" from 1.17 to 1.18.1 https://wiki.jenkins-ci.org/display/SECURITY/Jenkins+Security+Advisory+2016-04-11


== 2016-04-11 ==
== 2022-01-14 ==
* 21:23 csteipp: deployed and reverted oath
* 15:56 dancy: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/c/integration/config/+/753981
* 20:30 thcipriani: relaunched slave-agent on integration-slave-trusty-1025, back online
* 14:59 hashar: Starting VM integration-agent-docker-1022 which was in shutdown state since December and is Bullseye based # [[phab:T290783|T290783]]
* 20:19 thcipriani: integration-slave-trusty-1025 horizon console filled with INFO: task jbd2/vda1-8:170 blocked for more than 120 seconds. rebooting
* 13:49 hashar: Restarting all CI Docker agents via Horizon to apply new flavor settings [[phab:T265615|T265615]] [[phab:T299211|T299211]]
* 20:13 thcipriani: killing stuck jobs, marking integration-slave-trusty-1025 as offline temporarily
* 01:47 dancy: revert to scap 4.1.1-1+0~20220113154148.133~1.gbp6e3a17 in beta
* 14:42 thcipriani: deployment-mediawiki01 disk full :(


== 2016-04-08 ==
== 2022-01-13 ==
* 22:46 matt_flaschen: Created blobs1 table for all wiki DBs on Beta Cluster
* 18:02 dancy: Updating scap to 4.1.1-1+0~20220113154506.135~1.gbp523480 on all beta hosts
* 14:34 hashar: Image ci-jessie-wikimedia-1460125717 in wmflabs-eqiad is ready  adds package 'unzip' | T132144
* 17:54 dancy: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/753792
* 12:49 hashar: Image ci-jessie-wikimedia-1460119481 in wmflabs-eqiad is ready , adds package 'zip' | T132144
* 16:27 dancy: testing scap prep auto on deployment-deploy03
* 09:30 hashar: Removed label hasAndroidSdk from gallium. That prevents that slave from sometimes running the job apps-android-commons-build
* 15:52 dancy: Update scap to 4.1.1-1+0~20220113154506.135~1.gbp523480 on deployment-deploy03
* 08:42 hashar: Rebased puppet master and fixed conflict with https://gerrit.wikimedia.org/r/#/c/249490/
* 11:27 hashar: Updating Jenkins job to normalize usage of `docker run --workdir` https://gerrit.wikimedia.org/r/c/integration/config/+/753457
* 10:52 hashar: Restarting Jenkins CI for plugins update
* 10:42 hashar: Applied Jenkins built-in node migration to CI Jenkins (`master` > `built-in` renaming) # [[phab:T298691|T298691]]
* 10:14 taavi: cancelled stuck deployment-prep jobs on jenkins


== 2016-04-07 ==
== 2022-01-12 ==
* 20:16 hashar: deployment-mediawiki02.deployment-prep.eqiad.wmflabs , cleared up random left over stuff / big logs etc
* 18:58 hashar: Applied plugins update to https://releases-jenkins.wikimedia.org/
* 20:08 hashar: deployment-mediawiki02.deployment-prep.eqiad.wmflabs / is full


== 2016-04-05 ==
== 2022-01-11 ==
* 23:56 marxarelli: Removed cherry-pick and rebased /var/lib/git/operations/puppet on integration-puppetmaster after merge of https://gerrit.wikimedia.org/r/#/c/281706/
* 09:18 hashar: Updating all Jenkins jobs following recent "noop" refactorings
* 21:58 marxarelli: Restarting puppetmaster on integration-puppetmaster
* 21:53 marxarelli: Cherry picked https://gerrit.wikimedia.org/r/#/c/281706/ on integration-puppetmaster and applying on integration-slave-trusty-1014
* 10:32 hashar: gallium removing texlive
* 10:29 hashar: gallium removing libav / ffmpeg. No longer needed since jobs are no longer running on that server


== 2016-04-04 ==
== 2022-01-10 ==
* 17:30 greg-g: Phabricator going down in about 10 minutes to hopefully address the overheating issue: T131742
* 17:13 dancy: Update beta scap to 4.1.0-1+0~20220107203309.130~1.gbpcd0ace
* 10:06 hashar: integration: salt -v '*-slave*' cmd.run 'rm /usr/local/bin/grunt; rm -fR /usr/local/lib/node_modules/grunt-cli'  | T124474
* 14:01 James_F: Zuul: Add gate-and-submit-l10n to Isa for [[phab:T222291|T222291]]
* 10:04 hashar: integration: salt -v '*-slave*' cmd.run 'npm -g uninstall  grunt-cli' | T124474
* 03:15 greg-g: Phabricator is down


== 2016-04-03 ==
== 2022-01-05 ==
* 07:02 legoktm: deploying https://gerrit.wikimedia.org/r/281079
* 19:15 taavi: run `sudo chown -R jenkins-deploy:wikidev public/dists/bullseye-deployment-prep/` on deployment-deploy03
* 03:16 Amir1: manually rebooted deployment-ores-web and deployment-sca01
* 17:31 hashar: Deploying Zuul change https://gerrit.wikimedia.org/r/c/integration/config/+/751697  to get rid of the wmf-quibble-apache jobs # [[phab:T285649|T285649]]
* 10:48 hashar: CI: switching MediaWiki selenium from php built-in server to Apache # https://gerrit.wikimedia.org/r/751697
* 09:24 hashar: Updating Quibble jobs to use latest image (provides `quibble-with-apache` entrypoint) https://gerrit.wikimedia.org/r/c/integration/config/+/751685/


== 2016-04-02 ==
== 2022-01-04 ==
* 22:58 Amir1: added local hack to puppetmaster to make scap3 provider more verbose
* 12:49 hashar: Reloading Zuul for "api-testing: rename jobs to shorter forms"  https://gerrit.wikimedia.org/r/751422
* 19:46 hashar: Upgrading Jenkins Gearman plugin to v2.0 , bring in diff registration for faster updates of Gearman server
* 09:48 hashar: Building Quibble Docker images with Apache included https://gerrit.wikimedia.org/r/c/integration/config/+/748104
* 14:39 Amir1: manually added 281170/5 to beta puppetmaster
* 09:47 hashar: Reloading Zuul for "Add CentralAuth to phan dependency list for GrowthExperiments" https://gerrit.wikimedia.org/r/751383
* 14:22 Amir1: manually added 281161/1 to beta puppetmaster
* 11:31 Reedy: deleted archived logs older than 30 days from deployment-fluorine


== 2016-04-01 ==
== 2022-01-03 ==
* 22:16 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/281046
* 14:37 hashar: Upgraded Java 11 on contint2001 && contint1001. Restarted CI Jenkins.
* 21:13 hashar: Image ci-jessie-wikimedia-1459544873 in wmflabs-eqiad is ready
* 14:35 hashar: Upgraded Java 11 on releases1002 && releases2002
* 20:57 hashar: Refreshing Nodepool snapshot to hopefully get npm 2.x installed T124474
* 20:37 hashar: Added Luke081515 as a member of deployment-prep (beta cluster) labs project
* 20:31 hashar: Dropping grunt-cli from the permanent slaves. People can have it installed by listing it in their package.json devDependencies https://gerrit.wikimedia.org/r/#/c/280974/
* 14:06 hashar: integration: removed sudo policy permitting sudo as any member of the project for any member of the project, which included jenkins-deploy user
* 14:05 hashar: integration: removed sudo policy permitting sudo as root for any member of the project, which included jenkins-deploy user
* 11:23 bd808: Freed 4.5G on deployment-fluorine:/srv/mw-log by deleting wfDebug.log
* 04:00 Amir1: manually rebooted deployment-sca01
* 00:16 csteipp: created oathauth_users table on centralauth db in beta


== 2016-03-31 ==
* 21:19 legoktm: deploying https://gerrit.wikimedia.org/r/280756
* 13:52 hashar: rebasing integration puppetmaster (it had some merge commit )
* 01:40 Krinkle: Purge npm cache in integration-slave-trusty-1015:/mnt/home/jenkins-deploy/.npm was corrupted around March 23 19:00 for unknown reasons (T130895)


== 2016-03-30 ==
{{SAL-archives/Release Engineering}}
* 19:32 twentyafterfour: deleted some nutcracker and hhvm log files on deployment-mediawiki01 to free space
* 15:37 hashar: Gerrit has trouble sending emails T131189
* 13:48 Reedy: deployment-prep Make that deployment-tmh01
* 13:48 Reedy: deployment-prep upgrade hhvm on deployment-mediawiki01 and reboot
* 13:35 Reedy: deployment-prep upgrade hhvm on deployment-mediawiki03 and reboot
* 12:16 gehel: deployment-prep restarting varnish on deployment-cache-text04
* 11:04 Amir1: cherry-picked 280413/1 in beta puppetmaster, manually running puppet agent in deployment-ores-web
* 10:22 Amir1: cherry-picking 280403 to beta puppetmaster and manually running puppet agent in deployment-ores-web


== 2016-03-29 ==
* 23:22 marxarelli: running jenkins-jobs update config/ 'mwext-donationinterfacecore125-testextension-zend53' to deploy https://gerrit.wikimedia.org/r/#/c/280261/
* 19:52 Amir1: manually updated puppetmaster, deleted SSL cert key in deployment-ores-web in VM, running puppet agent manually
* 02:20 jzerebecki: reloading zuul for 46923c8..c0937ee
== 2016-03-26 ==
* 22:38 jzerebecki: reloading zuul for 2d7e050..46923c8
== 2016-03-25 ==
* 23:55 marxarelli: deleting instances integration-slave-trusty-1002 and integration-slave-trusty-1005
* 23:54 marxarelli: deleting jenkins nodes integration-slave-trusty-1002 and integration-slave-trusty-1005
* 23:41 marxarelli: completed rolling manual deploy of https://gerrit.wikimedia.org/r/#/c/279640/ to trusty slaves
* 23:27 marxarelli: starting rolling offline/remount/online of trusty slaves to increase tmpfs size
* 23:22 marxarelli: pooled new trusty slaves integration-slave-trusty-1024 and integration-slave-trusty-1025
* 23:13 jzerebecki: reloading zuul for 0aec21d..2d7e050
* 22:14 marxarelli: creating new jenkins node for integration-slave-trusty-1024
* 22:11 marxarelli: rebooting integration-slave-trusty-{1024,1025} before pooling as replacements for trusty-1002 and trusty-1005
* 21:06 marxarelli: repooling integration-slave-trusty-{1005,1002} to help with load while replacement instances are provisioning
* 16:59 marxarelli: depooling integration-slave-trusty-1002 until DNS resolution can be resolved. still investigating disk space issue
== 2016-03-24 ==
* 16:39 thcipriani: restarted rsync service on deployment-tin
* 13:45 thcipriani|afk: rearmed keyholder on deployment-tin
* 04:41 Krinkle: beta-update-databases-eqiad and beta-scap-eqiad stuck for over 8 hours (IRC notifier plugin deadlock)
* 03:28 Krinkle: beta-mediawiki-config-update-eqiad has been queued and stuck for over 5 hours.
== 2016-03-23 ==
* 23:00 Krinkle: rm-rf integration-slave-trusty-1013:/mnt/home/jenkins-deploy/tmpfs/jenkins-2/karma-54925082/ (bad permissions, caused Karma issues)
* 19:02 legoktm: restarted zuul
== 2016-03-22 ==
* 17:40 legoktm: deploying https://gerrit.wikimedia.org/r/278926
== 2016-03-21 ==
* 21:55 hashar: zuul: almost all MediaWiki extensions migrated to run the npm job on Nodepool (with Node.js 4.3)  T119143 . All tested. Will monitor the build results that ran overnight tomorrow
* 20:28 hashar: Mass running npm-node-4.3 jobs against MediaWiki extensions to make sure they all pass ( https://gerrit.wikimedia.org/r/#/c/278004/  |  T119143 )
* 17:40 elukey: executed git rebase --interactive on deployment-puppetmaster.deployment-prep.eqiad.wmflabs to remove https://gerrit.wikimedia.org/r/#/c/278713/
* 15:46 elukey: hacked manually the cdh puppet submodule on deployment-puppetmaster.deployment-prep.eqiad.wmflabs - please let me know if it interferes with anybody's tests
* 14:24 elukey: executed git submodule update --init on deployment-puppetmaster.deployment-prep.eqiad.wmflabs
* 11:25 elukey: beta: cherry picked https://gerrit.wikimedia.org/r/#/c/278713/ to test an update to the cdh module (analytics)
* 11:13 hashar: beta: rebased puppet master which had a conflict on https://gerrit.wikimedia.org/r/#/c/274711/  which got merged meanwhile (saves Elukey )
* 11:02 hashar: beta: added Elukey (wikimedia ops) to the project as member and admin
== 2016-03-19 ==
* 13:04 hashar: Jenkins: added ldap-labs-codfw.wikimedia.org as a fallback LDAP server  T130446
== 2016-03-18 ==
* 17:16 jzerebecki: reloading zuul for e33494f..89a9659
== 2016-03-17 ==
* 21:10 thcipriani: updating scap on deployment-tin to test D133
* 18:31 cscott: updated OCG to version c1a8232594fe846bd2374efd8f7c20d7e97ac449
* 09:34 hashar: deployment-jobrunner01 deleted /var/log/apache/*.gz  T130179
* 09:04 hashar: Upgrading hhvm and related extensions on jobrunner01  T130179
== 2016-03-16 ==
* 14:28 hashar: Updated jobs having the package manager cache system (castor) via https://gerrit.wikimedia.org/r/#/c/277774/
== 2016-03-15 ==
* 15:17 jzerebecki: added wikidata.beta.wmflabs.org in https://wikitech.wikimedia.org/wiki/Special:NovaAddress to deployment-cache-text04.deployment-prep.eqiad.wmflabs
* 14:19 hashar: Image ci-jessie-wikimedia-1458051246 in wmflabs-eqiad is ready  T124447
* 14:14 hashar: Refreshing Nodepool snapshot images so they get a fresh copy of slave-scripts  T124447
* 14:08 hashar: Deploying slave script change https://gerrit.wikimedia.org/r/#/c/277508/ "npm-install-dev.py: Use config.dev.yaml instead of config.yaml" for T124447
== 2016-03-14 ==
* 22:18 greg-g: new jobs weren't processing in Zuul, lego fixed it and blamed Reedy
* 20:13 hashar: Updating Jenkins jobs mwext-Wikibase-* so they no more rely on --with-phpunit ( ping @hoo https://gerrit.wikimedia.org/r/#/c/277330/ )
* 17:03 Krinkle: Doing full Zuul restart due to deadlock (T128569)
* 10:18 moritzm: re-enabled systemd unit for logstash on deployment-logstash2
== 2016-03-11 ==
* 22:42 legoktm: deploying https://gerrit.wikimedia.org/r/276901
* 19:41 legoktm: legoktm@integration-slave-trusty-1001:/mnt/jenkins-workspace/workspace$ sudo rm -rf mwext-Echo-testextension-* # because it was broken
== 2016-03-10 ==
* 20:22 hashar: Nodepool Image ci-jessie-wikimedia-1457641052 in wmflabs-eqiad is ready
* 20:19 hashar: Refreshing Nodepool to include the 'varnish' package T128188 
* 20:05 hashar: apt-get upgrade integration-slave-jessie1001  (bring in ffmpeg update and nodejs among other things)
* 12:22 hashar: Nodepool Image ci-jessie-wikimedia-1457612269 in wmflabs-eqiad is ready
* 12:18 hashar: Nodepool: rebuilding image to get mathoid/graphoid packages included (hopefully) T119693 T128280
== 2016-03-09 ==
* 17:56 bd808: Cleaned up git clone state in deployment-tin.deployment-prep:/srv/mediawiki-staging/php-master and queued beta-code-update-eqiad to try again (T129371)
* 17:48 bd808: Git clone at deployment-tin.deployment-prep:/srv/mediawiki-staging/php-master in completely horrible state. Investigating
* 17:22 bd808: Fixed https://integration.wikimedia.org/ci/job/beta-mediawiki-config-update-eqiad/4452/
* 17:19 bd808: Manually cleaning up broken rebase in deployment-tin.deployment-prep:/srv/mediawiki-staging
* 16:27 bd808: Removed cherry-pick of https://gerrit.wikimedia.org/r/#/c/274696 ; manually cleaned up systemd unit and restarted logstash on deployment-logstash2
* 14:59 hashar: Image ci-jessie-wikimedia-1457535250 in wmflabs-eqiad is ready T129345
* 14:57 hashar: Rebuilding snapshot image to get Xvfb enabled at boot time T129345
* 13:04 moritzm: cherrypicked patch to deployment-prep which provides a systemd unit for logstash
* 10:52 hashar: Image ci-jessie-wikimedia-1457520493 in wmflabs-eqiad is ready
* 10:29 hashar: Nodepool: created new image and refreshing snapshot in attempt to get Xvfb running T129320 T128090
== 2016-03-08 ==
* 23:42 legoktm: running CentralAuth's checkLocalUser.php --verbose=1 --delete=1 on deployment-tin for T115198
* 21:33 hashar: Nodepool  Image ci-jessie-wikimedia-1457472606 in wmflabs-eqiad is ready
* 19:23 hashar: Zuul inject DISPLAY https://gerrit.wikimedia.org/r/#/c/273269/
* 16:03 hashar: Image ci-jessie-wikimedia-1457452766 is ready T128090
* 15:59 hashar: Nodepool: refreshing snapshot image to ship browsers+Xvfb for T128090
* 14:27 hashar: Mass refreshed CI slave-scripts 1d2c60d..e27c292
* 13:38 hashar: Rebased integration puppet master. Dropped a make-wmf-branch patch and the one for raita role
* 11:26 hashar: Nodepool: created new snapshot to set puppet $::labsproject : ci-jessie-wikimedia-1457436175 hoping to fix hiera lookup T129092
* 02:51 ori: deployment-prep Updating HHVM on deployment-mediawiki01
* 02:27 ori: deployment-prep Updating HHVM on deployment-mediawiki02
* 01:50 Krinkle: integration-saltmaster: salt -v '*slave-trusty*' cmd.run 'rm -rf /mnt/jenkins-workspace/workspace/mwext-testextension-hhvm/src/skins/BlueSky' (T117710)
* 01:50 Krinkle: integration-saltmaster: salt -v '*slave-trusty*' cmd.run 'rm -rf /mnt/jenkins-workspace/workspace/mwext-testextension-hhvm-composer/src/skins/BlueSky'
== 2016-03-07 ==
* 21:03 hashar: Nodepool upgraded to 0.1.1-wmf.4, it no longer waits 1 minute before deleting a used node | T118573
* 20:05 hashar: Upgrading Nodepool from 0.1.1-wmf3 to 0.1.1-wmf.4 with andrewbogott | T118573
== 2016-03-06 ==
* 10:20 legoktm: deploying https://gerrit.wikimedia.org/r/274911
== 2016-03-04 ==
* 19:31 hashar: Nodepool Image ci-jessie-wikimedia-1457119603 in wmflabs-eqiad is ready - T128846
* 13:29 hashar: Nodepool Image ci-jessie-wikimedia-1457097785 in wmflabs-eqiad is ready
* 08:42 hashar: CI deleting integration-slave-precise-1001 (2 executors). It is not in labs DNS which causes bunch of issues, no need for the capacity anymore. T128802
* 02:49 Krinkle: Reloading Zuul to deploy https://gerrit.wikimedia.org/r/274889
* 00:11 Krinkle: salt -v --show-timeout '*slave*' cmd.run "bash -c 'cd /srv/deployment/integration/slave-scripts; git pull'"
== 2016-03-03 ==
* 23:37 legoktm: salt -v --show-timeout '*slave*' cmd.run "bash -c 'cd /srv/deployment/integration/slave-scripts; git pull'"
* 22:34 legoktm: mysql not running on integration-slave-precise-1002, manually starting (T109704)
* 22:30 legoktm: mysql not running on integration-slave-precise-1011, manually starting (T109704)
* 22:19 legoktm: mysql not running on integration-slave-precise-1012, manually starting (T109704)
* 22:07 legoktm: deploying https://gerrit.wikimedia.org/r/274821
* 21:58 Krinkle: Reloading Zuul to deploy (EventLogging and AdminLinks)  https://gerrit.wikimedia.org/r/274821  /
* 18:49 thcipriani: killing deployment-bastion since it is no longer used
* 14:23 hashar: https://integration.wikimedia.org/ci/computer/integration-slave-trusty-1011/ is out of disk space
== 2016-03-02 ==
* 16:22 jzerebecki: reloading zuul for 9398fa1..943f17b
* 10:38 hashar: Zuul should no longer be caught in a death loop due to Depends-On on an event-schemas change. Hole filled with https://gerrit.wikimedia.org/r/#/c/274356/ T128569
* 08:53 hashar: gerrit set-account Jsahleen --inactive    T108854
* 01:19 thcipriani: force restarting zuul because the queue is very stuck https://www.mediawiki.org/wiki/Continuous_integration/Zuul#Restart
* 01:13 thcipriani: following steps for gearman deadlock: https://www.mediawiki.org/wiki/Continuous_integration/Zuul#Known_issues
== 2016-03-01 ==
* 23:10 Krinkle: Updated Jenkins configuration to also support php5 and hhvm for Console Sections detection of "PHPUnit"
* 17:05 hashar: gerrit: set accounts inactive for Eloquence and Mgrover. Former employees of wmf and mail bounceback
* 16:41 hashar: Restarted Jenkins
* 16:32 hashar: A bunch of Jenkins jobs got stalled because I killed threads in Jenkins to unblock integration-slave-trusty-1003 :-(
* 12:14 hashar:  integration-slave-trusty-1003 is back online
* 12:13 hashar: Might have killed the proper Jenkins thread to unlock integration-slave-trusty-1003
* 12:03 hashar: Jenkins can not pool back integration-slave-trusty-1003. Jenkins master has a bunch of blocking threads piling up with hudson.plugins.sshslaves.SSHLauncher.afterDisconnect() locked somehow
* 11:41 hashar: Rebooting integration-slave-trusty-1003 (does not reply to salt / ssh)
* 10:34 hashar: Image ci-jessie-wikimedia-1456827861 in wmflabs-eqiad is ready
* 10:24 hashar: Refreshing Nodepool snapshot instances
* 10:22 hashar: Refreshing Nodepool base image to speed instances boot time (dropping open-iscsi package https://gerrit.wikimedia.org/r/#/c/273973/ )
== 2016-02-29 ==
* 16:23 hashar: salt -v '*slave*' cmd.run 'rm -fR /mnt/jenkins-workspace/workspace/mwext*jslint' T127362
* 16:17 hashar: Deleting all mwext-.*-jslint jobs from Jenkins. Paladox has migrated all of them to jshint/jsonlint generic jobs T127362
* 16:16 hashar: Deleting all mwext-.*-jslint jobs from Jenkins. Paladox has migrated all of them to jshint/jsonlint generic jobs
* 09:46 hashar: Jenkins installing Yaml Axis Plugin 0.2.0
== 2016-02-28 ==
* 01:30 Krinkle: Rebooting integration-slave-precise-1012 – Might help T109704 (MySQL not running)
== 2016-02-26 ==
* 15:14 jzerebecki: salt -v --show-timeout '*slave*' cmd.run "bash -c 'cd /srv/deployment/integration/slave-scripts; git pull'" T128191
* 15:14 jzerebecki: salt -v --show-timeout '*slave*' cmd.run "bash -c 'cd /srv/deployment/integration/slave-scripts; git pull'"
* 14:44 hashar: (since it started, dont be that scared!)
* 14:44 hashar: Nodepool has triggered 40 000 instances
* 11:53 hashar: Restarted memcached on deployment-memc02  T128177
* 11:53 hashar: memcached process on deployment-memc02 seems to have a nice leak of socket usages (from lost) and plainly refuse connections (bunch of CLOSE_WAIT)  T128177
* 11:53 hashar: memcached process on deployment-memc02 seems to have a nice leak of socket usages (from lost) and plainly refuse connections (bunch of CLOSE_WAIT)
* 11:40 hashar: deployment-memc04 find /etc/apt -name '*proxy' -delete  (prevented apt-get update)
* 11:26 hashar: beta: salt -v '*' cmd.run 'apt-get -y install ruby-msgpack'  . I am tired of seeing puppet debug messages: "Debug: Failed to load library 'msgpack' for feature 'msgpack'"
* 11:24 hashar: puppet keeps restarting nutcracker apparently T128177
* 11:20 hashar: Memcached error for key "enwiki:flow_workflow%3Av2%3Apk:63dc3cf6a7184c32477496d63c173f9c:4.8" on server "127.0.0.1:11212": SERVER HAS FAILED AND IS DISABLED UNTIL TIMED RETRY
== 2016-02-25 ==
* 22:38 hashar: beta:  maybe deployment-jobrunner01 is processing jobs a bit faster now.  Seems like hhvm went wild
* 22:23 hashar: beta: jobrunner01  had apache/hhvm killed somehow .... Blame me
* 21:56 hashar: beta: stopped jobchron / jobrunner on deployment-jobrunner01  and restarting them by running puppet
* 21:49 hashar: beta did a git-deploy of jobrunner/jobrunner hoping to fix puppet run on deployment-jobrunner01 and apparently it did! T126846
* 11:21 hashar: deleting workspace /mnt/jenkins-workspace/workspace/browsertests-Wikidata-WikidataTests-linux-firefox-sauce on slave-trusty-1015
* 10:08 hashar: Jenkins upgraded T128006
* 01:44 legoktm: deploying https://gerrit.wikimedia.org/r/273170
* 01:39 legoktm: deploying https://gerrit.wikimedia.org/r/272955 (undeployed) and https://gerrit.wikimedia.org/r/273136
* 01:37 legoktm: deploying https://gerrit.wikimedia.org/r/273136
* 00:31 thcipriani: running puppet on beta to update scap to latest packaged version: sudo salt -b '10%' -G 'deployment_target:scap/scap' cmd.run 'puppet agent -t'
* 00:20 thcipriani: deployment-tin not accepting jobs for some time, ran through https://www.mediawiki.org/wiki/Continuous_integration/Jenkins#Hung_beta_code.2Fdb_update, is back now
== 2016-02-24 ==
* 19:55 legoktm: legoktm@deployment-tin:~$ mwscript extensions/ORES/maintenance/PopulateDatabase.php --wiki=enwiki
* 18:30 bd808: "configuration file '/etc/nutcracker/nutcracker.yml' syntax is invalid"
* 18:27 bd808: nutcracker dead on mediawiki01; investigating
* 17:20 hashar: Deleted Nodepool instances so new ones get to use the new snapshot ci-jessie-wikimedia-1456333979
* 17:12 hashar: Refreshing nodepool snapshot. Been stalled since Feb 15th T127755
* 17:01 bd808: https://wmflabs.org/sal/releng missing SAL data since 2016-02-20T20:19 due to bot crash; needs to be backfilled from wikitech data (T127981)
* 16:43 hashar: sal on elastic search is stalled https://phabricator.wikimedia.org/T127981
* 15:07 hasharAW: beta app servers have lost access to memcached due to bad nutcracker conf | T127966
* 14:41 hashar: beta: we lost a memcached server at 11:51am UTC
== 2016-02-23 ==
* 22:45 thcipriani: deployment-puppetmaster is in a weird rebase state
* 22:25 legoktm: running sync-common manually on deployment-mediawiki02
* 09:59 hashar: Deleted a bunch of mwext-.*-jslint jobs that are no longer in use (migrated to either 'npm' or  'jshint' / 'jsonlint' )
== 2016-02-22 ==
* 22:06 bd808: Restarted puppetmaster service on deployment-puppetmaster to "fix" error "invalid byte sequence in US-ASCII"
* 17:46 jzerebecki: ssh integration-slave-trusty-1017.eqiad.wmflabs 'sudo -u jenkins-deploy rm -rf /mnt/jenkins-workspace/workspace/mwext-testextension-hhvm/src/.git/config.lock'
* 16:47 gehel: deployment-prep upgrading deployment-logstash2 to elasticsearch 1.7.5
* 10:26 gehel: deployment-prep upgrading elastic-search to 1.7.5 on deployment-elastic0[5-8]
== 2016-02-20 ==
* 20:19 Krinkle: beta-code-update-eqiad job repeatedly stuck at "IRC notifier plugin"
* 19:29 Krinkle: beta-code-update-eqiad broken because deployment-tin:/srv/mediawiki-staging/php-master/extensions/MobileFrontend/includes/MobileFrontend.hooks.php was modified on the server without commit
* 19:22 Krinkle: Various beta-mediawiki-config-update-eqiad jobs have been stuck 'queued' for > 24 hours
== 2016-02-19 ==
* 12:09 hashar: killed https://integration.wikimedia.org/ci/job/beta-code-update-eqiad/ which had been running for 13 hours. It was blocked because the slave went offline due to labs reboots yesterday
* 10:15 hashar: Creating a bunch of repositories on GitHub to fix Gerrit replication errors
== 2016-02-18 ==
* 19:20 legoktm: deploying https://gerrit.wikimedia.org/r/271583 and https://gerrit.wikimedia.org/r/271581, both no-ops
* 18:14 legoktm: deploying https://gerrit.wikimedia.org/r/271012
* 17:36 legoktm: deploying https://gerrit.wikimedia.org/r/271555
* 16:01 hashar: deleting instance integration-slave-precise-1003; I think we have enough precise slaves
* 10:44 hashar: Nodepool: JenkinsException: Could not parse JSON info for server[https://integration.wikimedia.org/ci/]
== 2016-02-17 ==
* 07:36 legoktm: deploying https://gerrit.wikimedia.org/r/271201
* 01:01 yuvipanda: attempting to turn off NFS on 52 instances on deployment-prep project
== 2016-02-16 ==
* 23:22 yuvipanda: new instances on deployment-prep no longer get NFS because of https://wikitech.wikimedia.org/w/index.php?title=Hiera%3ADeployment-prep&type=revision&diff=311783&oldid=311781
* 23:18 hashar: jenkins@gallium find /var/lib/jenkins/config-history/nodes -maxdepth 1 -type d -name 'ci-jessie*' -exec rm -vfR {} \;
* 23:17 hashar: Jenkins accepting slave creations again. Root cause: /var/lib/jenkins/config-history/nodes/ had reached the 32k inode limit.
* 23:14 hashar: Jenkins: Could not create rootDir /var/lib/jenkins/config-history/nodes/ci-jessie-wikimedia-34969/2016-02-16_22-40-23
* 23:02 hashar: Nodepool can not authenticate with Jenkins anymore. Thus it can not add slaves it spawned.
* 22:56 hashar: contint: Nodepool instances pool exhausted
* 21:14 andrewbogott: deployment-logstash2 migration finished
* 20:49 jzerebecki: reloading zuul for 3bf7584..67fec7b
* 19:58 andrewbogott: migrating deployment-logstash2 to labvirt1010
* 19:00 hashar: tin: checking out mw 1.27.0-wmf.14
* 15:23 hashar: integration-make-wmfbranch : /mnt/make-wmf-branch  mount now has gid=wikidev and group setuid (i.e. mode 2775)
* 15:20 hashar: integration-make-wmfbranch : change tmpfs to /mnt/make-wmf-branch  (from /var/make-wmf-branch )
* 11:30 jzerebecki: T117710 integration-saltmaster:~# salt -v '*slave-trusty*' cmd.run 'rm -rf /mnt/jenkins-workspace/workspace/mwext-testextension-hhvm-composer/src/skins/BlueSky'
* 09:52 hashar: will cut the wmf branches this afternoon starting around 14:00 CET
== 2016-02-15 ==
* 16:28 jzerebecki: reloading zuul for 2d16ad3..3bb0afa
* 16:10 hashar: Image ci-jessie-wikimedia-1455552377 in wmflabs-eqiad is ready
* 15:25 jzerebecki: reloading zuul for e174335..2d16ad3
* 15:23 hashar: Image ci-jessie-wikimedia-1455549539 in wmflabs-eqiad is ready
* 15:19 hashar: Regenerating Nodepool snapshot. Slave scripts have 0 bytes...
* 15:04 hashar: Slave scripts added to Nodepool instances! Image ci-jessie-wikimedia-1455548346 in wmflabs-eqiad is ready
* 11:05 hashar: Image ci-jessie-wikimedia-1455534001 in wmflabs-eqiad is ready
* 07:52 legoktm: deploying https://gerrit.wikimedia.org/r/270686
* 06:52 legoktm: legoktm@gallium:/srv/org/wikimedia/doc$ sudo -u jenkins-slave rm -rf EventLogging/ GuidedTour/ MultimediaViewer/ TemplateData/
* 06:22 legoktm: deploying https://gerrit.wikimedia.org/r/270677
* 06:12 legoktm: deploying https://gerrit.wikimedia.org/r/270675
* 06:02 legoktm: deploying https://gerrit.wikimedia.org/r/270674
* 05:56 legoktm: deploying https://gerrit.wikimedia.org/r/270673
* 05:32 legoktm: deploying https://gerrit.wikimedia.org/r/270670
* 04:05 legoktm: deploying https://gerrit.wikimedia.org/r/270667
* 03:26 legoktm: deploying https://gerrit.wikimedia.org/r/270665
* 02:56 legoktm: deploying https://gerrit.wikimedia.org/r/270657
== 2016-02-14 ==
* 23:54 legoktm: deploying https://gerrit.wikimedia.org/r/270656
* 23:25 legoktm: deploying https://gerrit.wikimedia.org/r/270654
* 23:13 legoktm: also deploying https://gerrit.wikimedia.org/r/#/c/265098/
* 23:11 legoktm: deploying https://gerrit.wikimedia.org/r/270651
* 05:18 bd808: tools.stashbot Testing after restart (T126419)
== 2016-02-13 ==
* 06:42 bd808: restarted nutcracker on deployment-mediawiki01
* 06:32 bd808: jobrunner on deployment-jobrunner01 enabled after reverting changes from T87928 that caused T126830
* 05:51 bd808: disabled jobrunner process on jobrunner01; queue full of jobs broken by T126830
* 05:31 bd808: trebuchet clone of /srv/jobrunner/jobrunner broken on jobrunner01; failing puppet runs
* 05:25 bd808: jobrunner process on deployment-jobrunner01 badly broken; investigating
* 05:20 bd808: Ran https://phabricator.wikimedia.org/P2273 on deployment-jobrunner01.deployment-prep.eqiad.wmflabs; freed ~500M; disk utilization still at 94%
== 2016-02-12 ==
* 23:54 hashar: beta cluster broken since 20:30 UTC https://logstash-beta.wmflabs.org/#/dashboard/elasticsearch/fatalmonitor (haven't looked)
* 17:36 hashar: salt -v '*slave-trusty*' cmd.run 'apt-get -y install texlive-generic-extra'    # T126422
* 17:32 hashar: adding texlive-generic-extra on CI slaves by cherry picking https://gerrit.wikimedia.org/r/#/c/270322/ - T126422
* 17:19 hashar: getting rid of integration-dev; it is broken somehow
* 17:10 hashar: Nodepool is back to spawning instances. contintcloud has been migrated in wmflabs
* 16:51 thcipriani: running  sudo salt '*' -b '10%' deploy.fixurl to fix deployment-prep trebuchet urls
* 16:31 hashar: bd808 added support for saltbot to update tasks automagically!!!! T108720
* 03:10 yurik: attempted to sync graphoid from gerrit 270166 from deployment-tin, but it wouldn't sync.  Tried to git pull sca02, submodules wouldn't pull
== 2016-02-11 ==
* 22:53 thcipriani: shutting down deployment-bastion
* 21:28 hashar: pooling back slaves 1001 to 1006
* 21:18 hashar: re-enabling hhvm service on slaves ( https://phabricator.wikimedia.org/T126594 ). Some symlink is missing and only provided by the upstart script grrrrrrr https://phabricator.wikimedia.org/T126658
* 20:52 legoktm: deploying https://gerrit.wikimedia.org/r/270098
* 20:35 hashar: depooling the six recent slaves: /usr/lib/x86_64-linux-gnu/hhvm/extensions/current/luasandbox.so cannot open shared object file
* 20:29 hashar: pooling integration-slave-trusty-1004 integration-slave-trusty-1005 integration-slave-trusty-1006
* 20:14 hashar: pooling integration-slave-trusty-1001 integration-slave-trusty-1002 integration-slave-trusty-1003
* 19:35 marxarelli: modifying deployment server node in jenkins to point to deployment-tin
* 19:27 thcipriani: running sudo salt -b '10%' '*' cmd.run 'puppet agent -t' from deployment-salt
* 19:27 twentyafterfour: Keeping notes on the ticket: https://phabricator.wikimedia.org/T126537
* 19:24 thcipriani: moving deployment-bastion to deployment-tin
* 17:59 hashar: recreated instances with proper names:  integration-slave-trusty-{1001-1006}
* 17:52 hashar: Created integration-slave-trusty-{1019-1026} as m1.large (note 1023 is an exception, it is for Android). Applied role::ci::slave, let's wait for puppet to finish
* 17:42 Krinkle: Currently testing https://gerrit.wikimedia.org/r/#/c/268802/ in Beta Labs
* 17:27 hashar: Depooling all the ci.medium slaves and deleting them.
* 17:27 hashar: I tried.  The ci.medium instances are too small and MediaWiki tests really need 1.5GBytes of memory :-(
* 16:00 hashar: rebuilding integration-dev https://phabricator.wikimedia.org/T126613
* 15:27 Krinkle: Deploy Zuul config change https://gerrit.wikimedia.org/r/269976
* 11:46 hashar: salt -v '*' cmd.run '/etc/init.d/apache2 restart'  might help for Wikidata browser tests failing
* 11:32 hashar: disabling hhvm service on CI slaves ( https://phabricator.wikimedia.org/T126594 , cherry picked both patches )
* 10:50 hashar: reenabled puppet on CI. All transitioned to a 128MB tmpfs (was 512MB)
* 10:16 hashar: pooling back integration-slave-trusty-1009 and integration-slave-trusty-1010  (tmpfs shrunken)
* 10:06 hashar: disabling puppet on all CI slaves. Trying to lower tmpfs 512MB to 128MB  ( https://gerrit.wikimedia.org/r/#/c/269880/ )
* 02:45 legoktm: deploying https://gerrit.wikimedia.org/r/269853 https://gerrit.wikimedia.org/r/269893
== 2016-02-10 ==
* 23:54 hashar_: depooling Trusty slaves that only have 2GB of RAM, which is not enough. https://phabricator.wikimedia.org/T126545
* 22:55 hashar_: gallium: find /var/lib/jenkins/config-history/config -type f -wholename '*/2015*' -delete  (  https://phabricator.wikimedia.org/T126552 )
* 22:34 Krinkle: Zuul is back up and processing Gerrit events, but jobs are still queued indefinitely. Jenkins is not accepting new jobs
* 22:31 Krinkle: Full restart of Zuul. Seems Gearman/Zuul got stuck. All executors were idling. No new Gerrit events processed either.
* 21:22 legoktm: cherry-picking https://gerrit.wikimedia.org/r/#/c/269370/ on integration-puppetmaster again
* 21:17 hashar: CI dust has settled. Krinkle and I have pooled a lot more Trusty slaves to accommodate the overload caused by switching to php55 (jobs run on Trusty)
* 21:08 hashar: pooling trusty slaves 1009, 1010, 1021, 1022  with 2 executors  (they are ci.medium)
* 20:38 hashar: cancelling mediawiki-core-jsduck-publish  and mediawiki-core-doxygen-publish jobs manually.  They will catch up on next merge
* 20:34 Krinkle: Pooled integration-slave-trusty-1019 (new)
* 20:28 Krinkle: Pooled integration-slave-trusty-1020 (new)
* 20:24 Krinkle: created integration-slave-trusty-1019 and integration-slave-trusty-1020 (ci1.medium)
* 20:18 hashar: created integration-slave-trusty-1009 and 1010 (trusty ci.medium)
* 20:06 hashar: creating integration-slave-trusty-1021 and integration-slave-trusty-1022 (ci.medium)