Help:Toolforge/Pywikibot: Difference between revisions
imported>BryanDavis (→Using the shared Pywikibot files (recommended setup): Update paths which changed with the pywikibot 7.0.0 release) |
imported>Rubin |
Revision as of 12:45, 11 April 2022
Caution: This page may contain inaccuracies. It is currently being edited and redesigned for better readability. For further information, please see T134495.
This page is related to https://wikitech.wikimedia.org/wiki/Help:Toolforge/My_first_Pywikibot_tool. These pages will be combined for simpler documentation.
The Pywikibot Framework is a collection of Python tools that automate work on MediaWiki sites. Please review mw:Manual:Pywikibot/Installation first.
The stable version of the Pywikibot 'core' branch (formerly 'rewrite') is accessible at /shared/pywikibot/stable. If you are a developer and/or would like to use the current master branch, this is accessible at /shared/pywikibot/core, but be aware that this might not be a stable release. To control when the code is updated, you may also choose to install 'core' locally in your tool directory.
Note that the shared 'core' code consists only of the source files; each bot operator will need to create their own configuration files (such as 'user-config.py') and set up a PYTHONPATH and other environment variables. Please see Using the shared Pywikibot files for more information.
For most purposes, using the centralized 'core' files is recommended. The shared files are available at /data/project/shared/pywikibot/stable, and steps for configuring your tool account are provided below. The configuration files themselves are stored in your tool account in the $HOME/.pywikibot directory, or another directory, where they can be used via the -dir option (all of this is described in more detail in the instructions).
If you are a developer and/or would like to control when the code is updated, please see Installing Pywikibot locally for instructions.
To set up your Tools account to use the shared 'core' framework:
1. Become your tool-account
maintainer@tools-login:~$ become toolname
2. In your home directory, create (or edit, if it exists already) a '.bash_profile' file:
nano .bash_profile
and include the following line:
export PYTHONPATH=/data/project/shared/pywikibot/stable:/data/project/shared/pywikibot/stable/scripts
The path should be on one line, though it may appear to be on multiple lines depending on your screen width. When you save the .bash_profile file (CTRL+X), your settings will be updated for all future shell sessions.
3. Import the path settings into your current session:
tools.tool@tools-login$ source .bash_profile
4. In your home directory, create a subdirectory named '.pywikibot' (the '.' is important!) for bot-related files:
tools.tool@tools-login$ mkdir $HOME/.pywikibot
5. Configure Pywikibot.
To create configuration files, use the following command and then follow the instructions. You may also use an existing configuration file (e.g., 'user-config.py') that works on another system by copying it into your .pywikibot directory:
tools.tool@tools-login$ python3 /data/project/shared/pywikibot/stable/pywikibot/scripts/generate_user_files.py
6. Test out your setup. In general, all jobs should be run on the grid, but it's fine to test your setup on the command line. You should see the following terminal output (or something similar):
tools.tool@tools-login$ python3 /data/project/shared/pywikibot/stable/pywikibot/scripts/version.py
Pywikibot: [https] r-pywikibot-core.git (1db1f28, g15095, 2021/05/31, 14:35:28, stable)
Release version: 6.3.0
requests version: 2.12.4
cacerts: /etc/ssl/certs/ca-certificates.crt
certificate test: ok
Python: 3.5.3 (default, Sep 27 2018, 17:25:39)
[GCC 6.3.0 20170516]
Note that you do not need pwb.py to run scripts; you can run them directly, e.g., python3 /data/project/shared/pywikibot/stable/pywikibot/scripts/version.py. Setting PYTHONPATH means that you no longer need the pwb.py helper script to make, say, import pywikibot work. However, the pwb.py helper script has additional advantages, such as tolerating typing mistakes in script names, script path redirection, and dependency checks; see the pwb script documentation.
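Step 2 of the setup can also be scripted. The following is a minimal sketch rather than part of the official setup: the helper name ensure_pythonpath_line is hypothetical, and the path is the shared 'stable' location used in the steps above. It appends the export line only when an identical line is not already present, so running it repeatedly does not duplicate the entry.

```shell
#!/bin/bash
# Shared Pywikibot path, taken from the setup steps above.
PWB_SHARED=/data/project/shared/pywikibot/stable
PWB_EXPORT="export PYTHONPATH=$PWB_SHARED:$PWB_SHARED/scripts"

# ensure_pythonpath_line FILE  (hypothetical helper, not a Toolforge command)
# Appends the export line to FILE unless an identical line already exists.
ensure_pythonpath_line() {
    local profile="$1"
    touch "$profile"
    # -x matches the whole line, -F treats the pattern as a fixed string.
    if grep -qxF -- "$PWB_EXPORT" "$profile"; then
        return 0
    fi
    # If the file does not end with a newline, add one first so the
    # export line is not glued onto the previous line.
    if [ -s "$profile" ] && [ -n "$(tail -c1 "$profile")" ]; then
        printf '\n' >> "$profile"
    fi
    printf '%s\n' "$PWB_EXPORT" >> "$profile"
}

# Usage: ensure_pythonpath_line "$HOME/.bash_profile"
```

As in step 3, the new setting only reaches the current session after running source .bash_profile.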
If you need to use multiple user-config.py files, you can do so by adding -dir:<path where you want your user-config.py> to every python command. To use the local directory, use -dir:. (colon dot).
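As an illustration of the -dir option, the sketch below prints (without executing) the command line for the same script with different configuration directories. The helper name config_cmd and the directory $HOME/configs/testwiki are hypothetical; only the -dir:<path> syntax and the version.py path come from this page.

```shell
#!/bin/bash
# Location of the shared utility scripts, as used in the steps above.
PWB_SCRIPTS=/data/project/shared/pywikibot/stable/pywikibot/scripts

# config_cmd CONFIG_DIR SCRIPT [ARGS...]  (hypothetical helper)
# Prints the python3 command that runs SCRIPT with the user-config.py
# found in CONFIG_DIR. It is a dry run: nothing is executed.
config_cmd() {
    local config_dir="$1" script="$2"
    shift 2
    printf 'python3 %s' "$script"
    if [ "$#" -gt 0 ]; then
        printf ' %s' "$@"
    fi
    printf ' -dir:%s\n' "$config_dir"
}

# One configuration directory per account (directory names are examples).
config_cmd "$HOME/.pywikibot" "$PWB_SCRIPTS/version.py"
config_cmd "$HOME/configs/testwiki" "$PWB_SCRIPTS/version.py"
# "-dir:." selects the current directory.
config_cmd . "$PWB_SCRIPTS/version.py"
```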
For more information about Pywikibot, please see the Pywikibot documentation. The pywikibot mailing list (pywikibot@lists.wikimedia.org) and IRC channel (#pywikibot) are good places to go for additional help. Other useful information about using the centralized 'core' files is available at User:Russell Blau/Using pywikibot on Labs.
Caution: With release 7.0.0, the path of the Pywikibot framework utility scripts (generate_family_file.py, generate_user_files.py, shell.py, version.py) changed in the core (master) branch. They are now located at
/data/project/shared/pywikibot/core/pywikibot/scripts/<script_name>
and can also be invoked through the pwb.py wrapper script. See also: https://doc.wikimedia.org/pywikibot/master/utilities/index.html
Setup pywikibot on Toolforge (locally)
Installing pywikibot local to your tool allows you to upgrade whenever it suits you, instead of always running the latest version.
Clone pywikibot git repo
Clone the 'core' git repository:
$ git clone --recursive --branch stable "https://gerrit.wikimedia.org/r/pywikibot/core" $HOME/pywikibot-core
Setup a Python virtual environment for library dependencies
When using a local pywikibot install, use a Python virtual environment (venv) to manage Python library dependencies. The Toolforge environment does provide system packages for many Python libraries, but these are installed using Debian packages which means that they are often older versions and not likely to be upgraded often.
Create a venv. You can give this venv any name you would like. We will use 'pwb' in this example.
$ python3 -m venv $HOME/pwb
Once you have created the venv, you can "activate" it to set up your shell's $PATH so that the python3 and pip3 binaries in the virtual environment are used by default.
$ source $HOME/pwb/bin/activate
(pwb) $
Now that the venv is created and active for your current shell session, we can install the pywikibot code from the git clone we made earlier into this venv. This basically installs the pywikibot core code as a symlink in the venv. This way, if you modify the directory, you don't need to install it again.
(pwb) $ pip3 install --upgrade pip "setuptools>=49.4.0, !=50.0.0, <50.2.0" wheel
...
Successfully installed pip-21.2.4 setuptools-58.1.0 wheel-0.37.0
(pwb) $ cd $HOME/pywikibot-core
(pwb) $ pip3 install -e .[mwparserfromhell,mwoauth,mysql] # adjust extra dependencies as needed for your tool
...
Finished processing dependencies for pywikibot==6.6.1
Note: the setuptools!=50.0.0 install constraint is for T261748 and the upstream issue in setuptools related to relative imports.
Using the virtual environment without activating it
To use the code from outside the virtual environment (for example to submit jobs to the grid engine), use the full path to the python3 binary inside your venv directory and the full path to the script you want to run:
$ $HOME/pwb/bin/python3 $HOME/path/to/script.py
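The full-path approach can be captured in a small generated wrapper, which is convenient when the same script is submitted repeatedly. This is a sketch under assumptions: the helper name write_wrapper and the PWB_VENV variable are hypothetical, and the venv is assumed to live at $HOME/pwb as in the example above.

```shell
#!/bin/bash
# write_wrapper OUTFILE SCRIPT  (hypothetical helper)
# Writes an executable wrapper that runs SCRIPT with the venv's python3,
# using full paths so the venv never has to be activated.
# PWB_VENV (hypothetical variable) overrides the venv location.
write_wrapper() {
    local outfile="$1" script="$2"
    local venv_python="${PWB_VENV:-$HOME/pwb}/bin/python3"
    # Unquoted heredoc: the two paths are expanded now, while \$@ stays
    # literal so the wrapper forwards its own arguments at run time.
    cat > "$outfile" <<EOF
#!/bin/bash
exec "$venv_python" "$script" "\$@"
EOF
    chmod 0755 "$outfile"
}

# Usage: write_wrapper "$HOME/run_script.sh" "$HOME/path/to/script.py"
```

The generated file can then be submitted to the grid like any other wrapper script.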
Using the virtual environment on Kubernetes
On Kubernetes, the virtual environment is launched and customised differently.
The virtual environment should be set up within the toolforge-job itself. Create a script similar to this (saved as pwb_venv.sh in this example):
#!/bin/bash
# create the venv
python3 -m venv pwbvenv
# activate it
source pwbvenv/bin/activate
# install some packages
pip3 install --upgrade pip "setuptools>=49.4.0, !=50.0.0, <50.2.0" wheel
cd $HOME/pywikibot-core
pip3 install -e .[mwparserfromhell,mwoauth,mysql]
tools.mytool@tools-sgebastion-11:~$ chmod ug+x pwb_venv.sh
tools.mytool@tools-sgebastion-11:~$ toolforge-jobs run pwb-venv --command "./pwb_venv.sh" --image tf-python39 --wait
INFO: job 'pwb-venv' completed
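After the job completes, it can be useful to confirm which Python the venv was built with, since the venv is created inside the tf-python39 image rather than on the bastion. The following is a minimal sketch, assuming the standard pyvenv.cfg file that python3 -m venv writes into the venv directory; the helper name venv_python_version is hypothetical.

```shell
#!/bin/bash
# venv_python_version VENV_DIR  (hypothetical helper)
# Prints the Python version recorded in VENV_DIR/pyvenv.cfg, or returns
# a non-zero status if the directory does not look like a venv.
venv_python_version() {
    local cfg="$1/pyvenv.cfg"
    if [ ! -f "$cfg" ]; then
        echo "no pyvenv.cfg found in $1" >&2
        return 1
    fi
    # pyvenv.cfg holds "key = value" lines; print the value of "version".
    sed -n 's/^version[[:space:]]*=[[:space:]]*//p' "$cfg"
}

# Usage: venv_python_version "$HOME/pwbvenv"
```

For a venv built by the job above, the reported version is expected to be a Python 3.9 release; a different version would suggest the venv was created outside the job's image.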
Setup job submission
After installing, you can run your bot directly via a shell command, though this is highly discouraged. You should use the grid to run jobs instead.
Before setting up job submission on the grid engine, read Help:Toolforge/Grid.
To run a bot on the grid from within the pywikibot directory (this is optional), write a small wrapper script. The following example script (versiontest.sh) runs version.py:
$ cat versiontest.sh
#!/bin/bash
cd /data/project/shared/pywikibot/stable
python3 pywikibot/scripts/version.py
To submit a job, set the permissions for the script and then use the 'jsub' command to send the job to the grid:
$ chmod 0755 versiontest.sh
$ jsub versiontest.sh
Job output will be written to output and error files in your home directory called YOURJOBNAME.out and YOURJOBNAME.err, respectively (versiontest.out and versiontest.err in this example):
$ cat ~/versiontest.out
pywikibot [https] r/pywikibot/compat (r10211, 8fe6bdc, 2013/08/18, 14:00:57, ok)
Python 2.7.3 (default, Aug 1 2012, 05:14:39)
[GCC 4.6.3]
config-settings:
use_api = True
use_api_login = True
unicode test: ok
Example
An infinitely running job such as an irc-bot can be started like this:
$ jsub -once -continuous -l h_vmem=256M -N script_wui python3 $HOME/pywikibot-core/pwb.py script_wui.py -log
or, more briefly:
$ jstart -l h_vmem=256M -N script_wui python3 $HOME/pywikibot-core/pwb.py script_wui.py -log
If you experience problems with your jobs, such as
Fatal Python error: Couldn't create autoTLSkey mapping
you can try increasing the memory value:
$ jstart -l h_vmem=512M -N script_wui python3 $HOME/pywikibot-core/pwb.py script_wui.py -log
To run a job on a schedule, follow the instructions for scheduling jobs at regular intervals with cron and edit your crontab file:
$ crontab -e
and enter
PATH=/usr/local/bin:/usr/bin:/bin
# Run script_wui.py at 00:17 UTC each day
17 0 * * * jstart -l h_vmem=512M -N script_wui python3 $HOME/pywikibot-core/pwb.py script_wui.py -log
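A crontab entry consists of five time fields (minute, hour, day of month, month, day of week) followed by the command. The sketch below assembles a once-a-day entry from a minute and an hour; the helper name daily_cron_line is hypothetical and only prints text, it does not modify your crontab.

```shell
#!/bin/bash
# daily_cron_line MINUTE HOUR COMMAND...  (hypothetical helper)
# Prints a crontab entry that runs COMMAND once a day at HOUR:MINUTE.
daily_cron_line() {
    local minute="$1" hour="$2"
    shift 2
    if [ "$minute" -lt 0 ] || [ "$minute" -gt 59 ] ||
       [ "$hour" -lt 0 ] || [ "$hour" -gt 23 ]; then
        echo "minute must be 0-59 and hour must be 0-23" >&2
        return 1
    fi
    # Fields: minute hour day-of-month month day-of-week command
    printf '%s %s * * * %s\n' "$minute" "$hour" "$*"
}

# Single quotes keep $HOME literal so that cron expands it at run time.
# This prints the same entry as the crontab example above.
daily_cron_line 17 0 jstart -l h_vmem=512M -N script_wui \
    python3 '$HOME/pywikibot-core/pwb.py' script_wui.py -log
```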
Kubernetes
Jobs are created differently on Kubernetes. First, the virtual environment needs to be customised.
After that, the job can be launched:
$ toolforge-jobs run script_name --command "$HOME/pwbvenv/bin/python3 $HOME/pywikibot-core/pwb.py script_name -start:!" --image tf-python39
Additional parameters for the job are described in Help:Toolforge/Jobs framework and include, for example, additional memory allocation (--mem MEM) and restarting the job after it finishes (--continuous).
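To show how these options combine, the sketch below prints a toolforge-jobs command line without running it. The helper name jobs_cmd is hypothetical, and the memory value 1Gi is an assumed example; check Help:Toolforge/Jobs framework for the values the --mem option actually accepts.

```shell
#!/bin/bash
# jobs_cmd NAME COMMAND [MEM] [continuous]  (hypothetical helper)
# Prints a toolforge-jobs command line; nothing is submitted.
jobs_cmd() {
    local name="$1" job_command="$2" mem="${3:-}" mode="${4:-}"
    local cmd="toolforge-jobs run $name --command \"$job_command\" --image tf-python39"
    if [ -n "$mem" ]; then
        cmd="$cmd --mem $mem"
    fi
    if [ "$mode" = "continuous" ]; then
        cmd="$cmd --continuous"
    fi
    printf '%s\n' "$cmd"
}

# Single quotes keep $HOME literal in the printed command.
jobs_cmd script_name \
    '$HOME/pwbvenv/bin/python3 $HOME/pywikibot-core/pwb.py script_name -start:!' \
    1Gi continuous
```

Printing the command first makes it easy to review the options before submitting the job by hand.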
Using pip
The pip package manager is not installed for global use on the Toolforge servers, but it can be used through the use of virtual environments. The first step is to create a virtual environment, and get the latest version of pip installed in it:
$ python3 -m venv venv
$ source venv/bin/activate
$ pip3 install --upgrade pip
Installing specific packages from pip3 is as simple as loading the environment and then running the pip3 install command, for example:
$ source venv/bin/activate
$ pip3 install PACKAGENAME
Lastly, running a pywikibot script that depends on a pip package will also require loading the environment first, for instance:
$ source venv/bin/activate
$ python3 foo/bar/pwb.py SCRIPTNAME -page:"SOMEPAGE"
The venv is not activated automatically in grid job submissions. Two common workarounds are to write a wrapper shell script that activates the venv, or to call the binaries inside the venv by their path:
$ jstart -N jobname venv/bin/python3 foo/bar/pwb.py SCRIPTNAME -page:"SOMEPAGE"
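The other workaround, a wrapper that activates the venv, could look like the following sketch. The helper name run_in_venv is hypothetical; it assumes a venv laid out as created above, with an activate script under bin/.

```shell
#!/bin/bash
# run_in_venv VENV_DIR COMMAND [ARGS...]  (hypothetical helper)
# Activates the venv in a subshell and runs COMMAND inside it.
run_in_venv() {
    local venv_dir="$1"
    shift
    if [ ! -f "$venv_dir/bin/activate" ]; then
        echo "no venv found at $venv_dir" >&2
        return 1
    fi
    # The subshell keeps the activation from leaking into the caller.
    (
        # "." is the portable spelling of "source".
        . "$venv_dir/bin/activate"
        "$@"
    )
}

# Usage inside a wrapper script submitted to the grid:
#   run_in_venv "$HOME/venv" python3 foo/bar/pwb.py SCRIPTNAME -page:"SOMEPAGE"
```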