News/Toolforge Stretch deprecation: Difference between revisions
imported>Majavah No edit summary |
imported>Komla Sapaty (→Move a continuous job: added a complete a example of starting a continuous job on a buster grid) |
Revision as of 23:30, 22 January 2022
This page is currently a draft. More information and discussion about changes to this draft on the talk page. |
This page details information about deprecating and removing hosts running Debian Stretch (9.x) as an operating system from the Toolforge infrastructure. The login bastions and Grid execution hosts are still running Stretch and must be replaced with new instances.
What is changing?
- New job grid running Son of Grid Engine on Debian Buster instances
- New bastion hosts running Debian Buster with connectivity to the new job grid.
- New versions of PHP, Python2, Python3, and other language runtimes
- New versions of various support libraries
We are introducing a configuration option to let you select which operating system (Debian Stretch or Debian Buster) you want for your grid-based tool. That way, you can try out the migration from Stretch to Buster at your convenience.
Timeline
- 2021-xx-xx: Availability of Debian Buster grid announced to community
- 2021-xx-xx: Reminders via email to tool maintainers for tools still running on Stretch
- Week of 2021-xx-xx:
- Daily reminders via email to tool maintainers for tools still running on Stretch
- Switch login.toolforge.org to point to Buster bastion
- 2021-xx-xx: Shutdown Stretch grid
What should I do?
SSH to the bastions
During the compatibility period, there are two sets of bastions available:
- login.toolforge.org: points to the old Debian Stretch bastion
- dev.toolforge.org: points to the old Debian Stretch development bastion
- login-buster.toolforge.org: points to the new Debian Buster bastion
- dev-buster.toolforge.org: points to the new Debian Buster development bastion
When the time arrives, the old Stretch bastions will stop working, and both login.toolforge.org and dev.toolforge.org will point to Buster bastions.
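As a minimal sketch, the mapping between release, role, and hostname during the compatibility period could be captured in a small shell function. The function name `bastion_host` and its arguments are hypothetical and only illustrate the naming scheme listed above; once login.toolforge.org and dev.toolforge.org are switched to Buster, the mapping no longer applies.

```shell
# Hypothetical helper (not part of Toolforge): print the bastion hostname
# for a given release ("stretch" or "buster") and role ("login" or "dev").
# Only valid during the compatibility period described above.
bastion_host() {
    release="$1"
    role="${2:-login}"   # default to the login bastion

    case "$role" in
        login|dev) ;;
        *) echo "unknown role: $role" >&2; return 1 ;;
    esac

    case "$release" in
        stretch) echo "${role}.toolforge.org" ;;
        buster)  echo "${role}-buster.toolforge.org" ;;
        *) echo "unknown release: $release" >&2; return 1 ;;
    esac
}

# Example use:
#   ssh "<your-shell-name>@$(bastion_host buster login)"
```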
Move a grid engine webservice
We strongly encourage you to migrate web services to Kubernetes instead of using the grid.
If you have strong reasons to keep using the grid for webservices, then try the --release {buster|stretch} parameter:
:# Connect to the Buster bastion
$ ssh <your-shell-name>@login-buster.toolforge.org
:# Become your tool account
$ become YOUR_TOOL
:# Start the webservice as a Kubernetes container rather than a grid job
:# <type> is one of: php7.2, php5.6, python, python2, nodejs, golang, jdk8, ruby2, tcl
$ webservice --backend=kubernetes --release buster <type> start
:# -- OR --
:# Start the webservice as a Buster grid job
:# <type> is one of: lighttpd, uwsgi-python, tomcat, generic, lighttpd-plain, nodejs, uwsgi-plain
$ webservice --backend=gridengine --release buster <type> start
See Help:Toolforge/Web#Backends for more information on migrating from grid engine to Kubernetes.
Note: Python2 and Python3 webservices will need to rebuild their virtualenv environments on the new target runtime (Buster grid or Kubernetes).
Note: NodeJS webservices will need to rebuild their $HOME/www/js/node_modules on the new target runtime (Buster grid or Kubernetes).
Move a continuous job
:# Connect to the Buster bastion
$ ssh <your-shell-name>@login-buster.toolforge.org
:# Become your tool account
$ become YOUR_TOOL
:# Start your job on the Buster job grid (note: this is a specific example for a PHP job that checks a quota).
$ jstart -release buster -mem 350m php check_my_quota.php
The exact commands needed to start each continuous job vary greatly from tool to tool. This would be a great time to make a page of reference material for yourself and other maintainers here on Wikitech in the Tool namespace and using the Tool template if you haven't already.
Move a cron job
TODO. The grid server is the same; adjust the -release argument passed to jsub in your crontab.
If your workload permits, please avoid scheduling cronjobs from midnight to 3am so you're not competing with other cronjobs for system resources. That time window is currently very crowded.
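A rough sketch of that adjustment, assuming your crontab entries already pass an explicit `-release stretch` to jsub (as `-release buster` is passed to jstart in the continuous job example): the hypothetical helper `use_buster_release` below rewrites that argument in crontab text read from standard input. Lines without an explicit `-release` argument are left untouched and would need to be edited by hand.

```shell
# Hypothetical helper (not part of Toolforge): read crontab text on stdin
# and print it with every "-release stretch" changed to "-release buster".
# Lines that do not mention "-release stretch" pass through unchanged.
use_buster_release() {
    sed -E 's/(-release[[:space:]]+)stretch/\1buster/g'
}

# Preview the rewritten crontab without installing anything:
#   crontab -l | use_buster_release
# Then apply the same change yourself with:
#   crontab -e
```

For example, a line such as `17 4 * * * jsub -release stretch ./my_job.sh` (a made-up job scheduled outside the crowded midnight to 3am window) would be printed as `17 4 * * * jsub -release buster ./my_job.sh`.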
What are the primary changes with moving to Buster?
Language runtime and library versions
The vast majority of the language runtimes and libraries installed on the grid nodes are upgraded in Buster.
Runtime | Stretch Version | Buster Version |
---|---|---|
Python3 | 3.5.3 | 3.7.3 |
PHP | 7.2 | 7.3 |
Python2 | 2.7.13 | 2.7.16 |
NodeJS | 8.11.1 | 10.24.0 |
Perl | 5.24 | 5.28 |
Java | 11.0.6 | 11.0.9 |
Ruby | 2.3.3 | 2.5.5 |
Mono | 5.12.0 | 5.18.0 |
TCL | 8.6.0 | 8.6.9 |
R | 3.3.3 | 3.5.2 |
Solutions to common problems
Having trouble with the new grid? If the answer to your problem isn't here, ask for help in the #wikimedia-cloud IRC channel or file a task in Phabricator.
Rebuild virtualenv for python users
Since the Python executables and libraries are updated in Debian Buster, local virtualenvs will need to be deleted and re-created on the new bastion for anything that runs from those virtualenvs to work. Several errors are likely to be caused by old virtualenvs, with one obvious one being an unexpected ImportError.
Using a requirements file may make this simpler in many cases, if your project doesn't already use one. You can create one in your local directory by running pip freeze > requirements.txt in your tool folder with your virtualenv activated. Later on, you can simply use pip install -r requirements.txt to install the new environment after you have deleted the old virtualenv and created a new one. For more information on this option, see pip's documentation on requirements files.
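The requirements-file workflow for a grid tool could be wrapped roughly as follows. This is only a sketch: `rebuild_venv`, `_run`, and the `DRY_RUN` variable are hypothetical names, and the default paths `venv` and `requirements.txt` are assumptions about your tool's layout. It is meant to be run on the Buster bastion as your tool account, after the requirements file has been created from the old virtualenv.

```shell
# Hypothetical helper: run a command, or just print it when DRY_RUN=1.
_run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: delete and re-create a virtualenv, then reinstall
# everything listed in a requirements file.
# Usage: rebuild_venv [venv_dir] [requirements_file]
rebuild_venv() {
    venv_dir="${1:-venv}"
    req_file="${2:-requirements.txt}"

    # Refuse to destroy the old virtualenv without a requirements file.
    if [ ! -f "$req_file" ]; then
        echo "missing $req_file; create it with pip freeze first" >&2
        return 1
    fi

    _run rm -rf "$venv_dir" &&
    _run virtualenv "$venv_dir" &&
    _run "$venv_dir/bin/pip" install --upgrade pip &&
    _run "$venv_dir/bin/pip" install -r "$req_file"
}

# Preview the commands without changing anything:
#   DRY_RUN=1 rebuild_venv venv requirements.txt
```

Calling the virtualenv's own `pip` by path avoids having to activate the environment inside the function; activating it and running plain `pip` is equivalent.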
Example 1: Upgrading a Stretch grid engine based tool to the Buster grid
Follow these steps if you manually submit jobs using jsub, or if you submit jobs using a crontab.
$ ssh <your-shell-name>@login-buster.toolforge.org
$ become YOUR_TOOL
$ rm -rf venv # This will destroy the virtualenv and all libraries, so make sure you know what you will need to install later!
$ virtualenv venv
$ source venv/bin/activate
$ pip install --upgrade pip # upgrade pip itself to avoid problems with older versions
$ pip install ... # Here you'd use the requirements file syntax if you have one, or you'd manually install each needed library.
Example 2: Upgrading a uWSGI webservice into a Kubernetes container
If you are currently running your uWSGI webservice under the Grid Engine backend (i.e., the webservice uwsgi-python command), and you want to upgrade to a uWSGI webservice running under Kubernetes (i.e., the webservice --backend=kubernetes python command), you should rebuild your virtualenv as follows:
$ ssh <your-shell-name>@login-buster.toolforge.org
$ become YOUR_TOOL
$ webservice --backend=kubernetes python stop
$ webservice --backend=kubernetes python shell # do not skip this step – setting up the venv directly from the bastion may result in serious performance issues, compare T214086
$ rm -rf www/python/venv/ # this will destroy the virtualenv and all libraries, so make sure you know what you will need to install later!
$ python3 -m venv www/python/venv/
$ source www/python/venv/bin/activate
$ pip install --upgrade pip # upgrade pip itself to avoid problems with older versions
$ pip install -r www/python/src/requirements.txt # assuming your tool has a requirements.txt file
$ webservice --backend=kubernetes python start
Example 3: Upgrading a Kubernetes uWSGI webservice
If you are already using the Kubernetes backend, there is nothing you need to do -- the container will use the same image as before.
Delete a tool
Some tools were experiments that are done, others were made obsolete by other tools, some are just things that the original maintainer is tired of caring for. There is no UI with a big red "Delete this tool" button, so how can you responsibly request that a tool be deleted?
You can't delete a tool account yourself, though you can delete the content of your directories and make an existing web tool inaccessible by shutting down the web service (webservice stop). If you really want a tool account to be deleted, please follow the steps described at Toolforge (Tools to be deleted).
SSH to login-buster.toolforge.org fails with 'Permission denied (publickey)'
This is typically caused by the newer version of sshd provided by Debian Buster refusing, on the server side, to authenticate an insecure or deprecated public key type. Specifically, support for DSA (ssh-dss) keys was deprecated in OpenSSH 7.0. If your ssh public key starts with the string "ssh-dss", you will be impacted by this. RSA keys smaller than 1024 bits are also deprecated.
First make sure that you are passing a valid key by attempting to ssh to login.toolforge.org using the same public key and username. If this also fails, the problem is likely something other than the ssh key type. Join us in the #wikimedia-cloud IRC channel for interactive debugging help.
If you can ssh to login.toolforge.org with no errors, your key is probably of an unsupported type. Generate a new ssh key pair and upload the public key using the form at https://toolsadmin.wikimedia.org/profile/settings/ssh-keys or Special:Preferences#mw-prefsection-openstack. We currently recommend using either ed25519 or 4096-bit RSA keys. See Production shell access#Generating your SSH key for more information.
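A quick local check of the key type might look like the sketch below. The function name `check_pubkey_type` is hypothetical; it only inspects the first field of a public key file, which holds the key type, and does not contact any server. The `ssh-keygen` commands in the comments are standard OpenSSH invocations for the two recommended key types.

```shell
# Hypothetical helper: report whether a public key file uses a key type
# that the newer sshd is expected to reject.
# Usage: check_pubkey_type ~/.ssh/id_rsa.pub
check_pubkey_type() {
    pubkey="$1"
    # The key type is the first whitespace-separated field of the key line.
    key_type="$(awk 'NR == 1 { print $1 }' "$pubkey")"

    case "$key_type" in
        ssh-dss)
            echo "unsupported: DSA key, generate a new key pair" ;;
        ssh-ed25519)
            echo "ok: ed25519 key" ;;
        ssh-rsa)
            echo "check size: run ssh-keygen -l -f $pubkey to see the bit length" ;;
        *)
            echo "unknown key type: $key_type" ;;
    esac
}

# Generating a replacement key pair of a recommended type:
#   ssh-keygen -t ed25519
#   ssh-keygen -t rsa -b 4096
```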
SSH to login-buster.toolforge.org fails with 'Permission denied (publickey,hostbased)'
If you face this problem, make sure you are using the correct shell name. It is listed in your user preferences as **Instance shell account name**, and it is the username to use when logging in to any Toolforge bastion, whether Stretch or Buster.
Monitoring tools
- Tools running jobs on Stretch grid engine in last 7 days
- This report updates once per hour and will not report jobs that have been seen running on the Buster grid in the same 7 day period.
- Report has drill down pages for each maintainer and tool. Examples: bd808's tools, sge-status tool
- Webservices that move from the Stretch grid directly to the Kubernetes cluster will not be removed from the report automatically.
Why are we doing this?
This is an implementation of our Operating System Upgrade Policy. In a nutshell, we use Debian and deprecate versions three years after release and remove them completely from our infrastructure by four years after their release. See Operating System Upgrade Policy for more information.
See also
- [Cloud-announce] Toolforge: Trusty deprecation and grid engine migration
- [Cloud-announce] Toolforge: Trusty job grid deprecation reminders start today
- [Cloud-announce] Toolforge: Bastion changes and Trusty deprecation final steps
Communication and support
Support and administration of the WMCS resources is provided by the Wikimedia Foundation Cloud Services team and Wikimedia Movement volunteers. Please reach out with questions and join the conversation:
- Chat in real time in the IRC channel #wikimedia-cloud, the bridged Telegram group, or the bridged Mattermost channel
- Discuss via email after you have subscribed to the cloud@ mailing list