You are browsing a read-only backup copy of Wikitech. The live site can be found at wikitech.wikimedia.org

Help:Toolforge/Python: Difference between revisions

From Wikitech-static
imported>JPxG
imported>Stang
(→‎Kubernetes python jobs: corr yaml package)
(6 intermediate revisions by 5 users not shown)
Line 11: Line 11:
== Virtual environments ==
A '''[https://docs.python.org/3/tutorial/venv.html virtual environment]''' is useful when you need a specific environment for each Python task but don't want to create a separate tool for each one. The advantages don't end there: besides packages, each environment can also hold its own ''git/svn repositories'', ''command line profiles'', and so on. In effect, you create a small self-contained unit in your directory tree, set it up to suit your needs, and enter it whenever you need it.
The python3 venv ''must'' be created in the same execution environment that the venv will be used from. This means:
* if the tool uses a Kubernetes backend (recommended), the venv should be bootstrapped inside a container.
* if the tool uses a Grid Engine backend, the venv should be bootstrapped directly on a Toolforge bastion such as <code>dev.toolforge.org</code> or <code>login.toolforge.org</code>.
Read below for more concrete information.
=== For Kubernetes backend ===
This is for tools that use the Toolforge Kubernetes backend (recommended).
==== Kubernetes python webservices ====
See [[Help:Toolforge/Web/Python | Toolforge webservices with python]].
==== Kubernetes python jobs ====
Follow these instructions if you are using the [[Help:Toolforge/Jobs_framework | Toolforge Jobs framework]].
You need to bootstrap your python venv from inside a job itself (similar to what happens in kubernetes webservices).
Create a script similar to this:
{{Codesample|name=bootstrap_venv.sh|lang=bash|scheme=light|code=
#!/bin/bash
# use bash strict mode
set -euo pipefail
# create the venv
python3 -m venv pyvenv
# activate it
source pyvenv/bin/activate
# upgrade pip inside the venv and add support for the wheel package format
pip install -U pip wheel
# install some concrete packages
pip install requests
pip install pyyaml
# or, install all packages from src/requirements.txt
# pip install -r src/requirements.txt
}}
Then run it in the desired Python container, selecting the Python version you prefer. For example:
<syntaxhighlight lang="shell-session">
tools.mytool@tools-sgebastion-11:~$ ls bootstrap_venv.sh
bootstrap_venv.sh
tools.mytool@tools-sgebastion-11:~$ chmod ug+x bootstrap_venv.sh
tools.mytool@tools-sgebastion-11:~$ toolforge-jobs run bootstrap-venv --command "cd $PWD && ./bootstrap_venv.sh" --image tf-python39 --wait
tools.mytool@tools-sgebastion-11:~$ ls pyvenv
pyvenv
</syntaxhighlight>
Now you can run your Python tool using this venv. For example:
<syntaxhighlight lang="shell-session">
tools.mytool@tools-sgebastion-11:~$ cat src/mytool.py
import requests
r = requests.get('https://www.wikidata.org/wiki/Q1')
print(r.status_code)
tools.mytool@tools-sgebastion-11:~$ toolforge-jobs run mytool --command "pyvenv/bin/python src/mytool.py" --image tf-python39
tools.mytool@tools-sgebastion-11:~$ cat mytool.out
200
</syntaxhighlight>
=== For Grid Engine backend ===
{{Note | Please consider migrating your tool into the Kubernetes backend.}}
This is for tools that use the Toolforge Grid Engine backend (not recommended).
==== Grid Engine python webservice ====
See [[Help:Toolforge/Web/Python | Toolforge webservices with python]].
==== Grid Engine python jobs ====
{{Note | Please consider migrating your python jobs to [[Help:Toolforge/Jobs_framework | Toolforge Jobs framework]].}}
Follow these instructions if you are using the Grid Engine backend to run your jobs.


You can ''create'' your first virtual environment using:
Line 17: Line 100:
$ python3 -m venv my_venv
</syntaxhighlight>
</syntaxhighlight>
{{Note|content=The python3 command above ''must'' be run from the same execution environment that the venv will be used from. For Grid Engine jobs, run the command on a Toolforge bastion like login.toolforge.org. For a Kubernetes job, run the command from inside a <code>webservice --backend=kubernetes python${VERSION} shell</code> interactive shell.}}


This installs a package manager along with some basic tools, commands, and prerequisites into your new environment. Once you have created one, you can activate it and start using it:
Line 40: Line 120:
You can enter the environment again the same way at any time, which is handy when you want to update it, for example. For scheduled tasks, however, you would have to write a batch file with multiple commands to enter it, use it, and leave it.


=== Use venv with scheduled tasks ===
===== Use venv with scheduled tasks =====
Since the venv is stored in a directory in your Toolforge space, you can always use it from the outside, just like any other directory. You cannot alter it this way, but for scheduled tasks you usually don't need to:


Line 47: Line 127:
</syntaxhighlight>
</syntaxhighlight>


=== Use your venv everywhere ===
===== Use your venv everywhere =====
You can also use your virtual environment everywhere by default by activating it from <code>.profile</code>, like this:


Line 54: Line 134:
</syntaxhighlight>
</syntaxhighlight>


{{Further|further = help|[[Help:Toolforge/Pywikibot#virtualenv|Set up venv for Pywikibot]] and [[Help:Toolforge/Web#Using virtualenv with webservice shell|Set up venv for webservice]]}}


== See also ==
* [[Help:Toolforge/Web/Python]]
* [[Help:Toolforge/Pywikibot#virtualenv]] --- python virtual envs for pywikibot
* [[Help:Troubleshooting Toolforge]]


{{:Help:Cloud Services communication}}


[[Category:Toolforge|Python]]

Revision as of 12:10, 5 June 2022

Several Python runtimes are available for use inside Toolforge:

  • python3: Python 3.5.3 (3.7.3 on kubernetes)
  • python: Python 2.7.13 (2.7.9 on kubernetes; deprecated)

For guidance on writing tools in Python, see, for example, Help:Toolforge/My first Flask OAuth tool and Help:Toolforge/My first Django OAuth tool.

Deprecating Python 2

Many legacy tools and code examples use Python 2.x as a runtime. Python 3.x is encouraged for new code, because Python 2.7 stopped being maintained in 2020. Toolforge will provide some support for Python 2.x through 2022, because Debian will be supporting Python 2.7 in Debian 10 (buster). However, this support will extend only to critical security patches.

Virtual environments

A virtual environment is useful when you need a specific environment for each Python task but don't want to create a separate tool for each one. The advantages don't end there: besides packages, each environment can also hold its own git/svn repositories, command line profiles, and so on. In effect, you create a small self-contained unit in your directory tree, set it up to suit your needs, and enter it whenever you need it.

The python3 venv must be created in the same execution environment that the venv will be used from. This means:

  • if the tool uses a Kubernetes backend (recommended), the venv should be bootstrapped inside a container.
  • if the tool uses a Grid Engine backend, the venv should be bootstrapped directly on a Toolforge bastion such as dev.toolforge.org or login.toolforge.org.

Read below for more concrete information.
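
If you are unsure which environment a script is actually running in, one way to check is to compare sys.prefix with sys.base_prefix. The following is a minimal sketch, assuming the venv was created with python3 -m venv; the function names in_virtualenv and describe_environment are made up for this illustration.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter is running from a venv."""
    # Inside a venv, sys.prefix points at the venv directory, while
    # sys.base_prefix still points at the Python installation that created it.
    return sys.prefix != sys.base_prefix


def describe_environment():
    """Collect a few facts about the running interpreter (hypothetical helper)."""
    return {
        "in_venv": in_virtualenv(),
        "prefix": sys.prefix,
        "base_prefix": sys.base_prefix,
        "version": "{}.{}.{}".format(*sys.version_info[:3]),
    }


if __name__ == "__main__":
    for key, value in sorted(describe_environment().items()):
        print("{}: {}".format(key, value))
```

Running this once on a bastion and once inside a container should make it visible when the two environments provide different Python versions.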

For Kubernetes backend

This is for tools that use the Toolforge Kubernetes backend (recommended).

Kubernetes python webservices

See Toolforge webservices with python.

Kubernetes python jobs

Follow these instructions if you are using the Toolforge Jobs framework.

You need to bootstrap your python venv from inside a job itself (similar to what happens in kubernetes webservices).

Create a script similar to this:

bootstrap_venv.sh
#!/bin/bash

# use bash strict mode
set -euo pipefail

# create the venv
python3 -m venv pyvenv

# activate it
source pyvenv/bin/activate

# upgrade pip inside the venv and add support for the wheel package format
pip install -U pip wheel

# install some concrete packages
pip install requests
pip install pyyaml

# or, install all packages from src/requirements.txt
# pip install -r src/requirements.txt

Then run it in the desired Python container, selecting the Python version you prefer. For example:

tools.mytool@tools-sgebastion-11:~$ ls bootstrap_venv.sh
bootstrap_venv.sh
tools.mytool@tools-sgebastion-11:~$ chmod ug+x bootstrap_venv.sh
tools.mytool@tools-sgebastion-11:~$ toolforge-jobs run bootstrap-venv --command "cd $PWD && ./bootstrap_venv.sh" --image tf-python39 --wait
tools.mytool@tools-sgebastion-11:~$ ls pyvenv
pyvenv
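
To confirm which Python version the bootstrap job used, you can read the pyvenv.cfg file that python3 -m venv writes into the venv directory. The sketch below assumes the file contains a "version = ..." line, which is what the standard library venv module writes (other tools may use different keys); parse_pyvenv_cfg and venv_python_version are names made up for this illustration.

```python
from pathlib import Path


def parse_pyvenv_cfg(text):
    """Parse the 'key = value' lines of a pyvenv.cfg file into a dict."""
    settings = {}
    for line in text.splitlines():
        if "=" not in line:
            continue  # skip blank or malformed lines
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def venv_python_version(venv_dir):
    """Return the version recorded in <venv_dir>/pyvenv.cfg, or None."""
    cfg_path = Path(venv_dir) / "pyvenv.cfg"
    if not cfg_path.is_file():
        return None
    return parse_pyvenv_cfg(cfg_path.read_text()).get("version")


if __name__ == "__main__":
    # "pyvenv" is the directory created by bootstrap_venv.sh.
    print(venv_python_version("pyvenv"))
```

If the reported version does not match the image you pass to later jobs, re-run the bootstrap job with the image you intend to use.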

Now you can run your Python tool using this venv. For example:

tools.mytool@tools-sgebastion-11:~$ cat src/mytool.py
import requests
r = requests.get('https://www.wikidata.org/wiki/Q1')
print(r.status_code)
tools.mytool@tools-sgebastion-11:~$ toolforge-jobs run mytool --command "pyvenv/bin/python src/mytool.py" --image tf-python39
tools.mytool@tools-sgebastion-11:~$ cat mytool.out
200

For Grid Engine backend

This is for tools that use the Toolforge Grid Engine backend (not recommended).

Grid Engine python webservice

See Toolforge webservices with python.

Grid Engine python jobs

Follow these instructions if you are using the Grid Engine backend to run your jobs.

You can create your first virtual environment using:

$ python3 -m venv my_venv

This installs a package manager along with some basic tools, commands, and prerequisites into your new environment. Once you have created one, you can activate it and start using it:

$ source my_venv/bin/activate

(my_venv) $ pip install my_dream_package==7.0.3
...

Once you are happy with it, you can always leave using:

(my_venv) $ deactivate

This way you can create as many separated Python environments as you wish.

You can enter the environment again the same way at any time, which is handy when you want to update it, for example. For scheduled tasks, however, you would have to write a batch file with multiple commands to enter it, use it, and leave it.

Use venv with scheduled tasks

Since the venv is stored in a directory in your Toolforge space, you can always use it from the outside, just like any other directory. You cannot alter it this way, but for scheduled tasks you usually don't need to:

$ jsub -N my_task -once -quiet my_venv/bin/python3 my_script
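
No activation step is needed here because the interpreter inside the venv locates the venv's packages on its own when it is started directly. As a rough sketch of the same idea, the hypothetical helper venv_command below builds the argument list for running a script with a venv's interpreter; my_venv and my_script are the placeholder names used in the jsub example.

```python
from pathlib import PurePosixPath


def venv_command(venv_dir, script, *args):
    """Build an argument list that runs script with the venv's own python3."""
    python = PurePosixPath(venv_dir) / "bin" / "python3"
    return [str(python), str(script)] + [str(arg) for arg in args]


if __name__ == "__main__":
    # Prints the command part of the jsub example.
    print(" ".join(venv_command("my_venv", "my_script")))
```

A list built this way can be passed to subprocess.run or joined into a command string for a job scheduler.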

Use your venv everywhere

You can also use your virtual environment everywhere by default. You can activate it using .profile like:

$ echo "source my_venv/bin/activate" >> .profile
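
Note that the echo command appends the line every time it is run, so running it twice leaves two copies in .profile. A minimal sketch of an append that skips the write when the line is already present is shown here; ensure_line is a hypothetical helper, not an existing Toolforge command.

```python
from pathlib import Path


def ensure_line(path, line):
    """Append line to the file at path unless it is already present.

    Returns True if the file was changed, False otherwise.
    """
    path = Path(path)
    content = path.read_text() if path.exists() else ""
    if line in content.splitlines():
        return False
    with path.open("a") as handle:
        # Start on a fresh line if the file does not end with a newline.
        if content and not content.endswith("\n"):
            handle.write("\n")
        handle.write(line + "\n")
    return True
```

For example, calling ensure_line(Path.home() / ".profile", "source my_venv/bin/activate") would add the activation line on the first call and leave the file untouched on later calls.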


See also

  • Help:Toolforge/Web/Python

Communication and support

Support and administration of the WMCS resources is provided by the Wikimedia Foundation Cloud Services team and Wikimedia movement volunteers. Please reach out with questions and join the conversation:

Discuss and receive general support

Receive mail announcements about critical changes
  • Subscribe to the cloud-announce@ mailing list (all messages are also mirrored to the cloud@ list)

Track work tasks and report bugs
  • Use the Phabricator workboard #Cloud-Services for bug reports and feature requests about the Cloud VPS infrastructure itself

Learn about major near-term plans
  • Read the News wiki page

Read news and stories about Wikimedia Cloud Services
  • Read the Cloud Services Blog (for the broader Wikimedia movement, see the Wikimedia Technical Blog)