Help talk:Toolforge/Pywikibot

Shared PWB with venv

Pinging the last three people to touch the relevant part of this page.

Is there any particular reason you have to use a git clone of PWB inside a venv? Couldn't you equally well do:

$ python3 -m venv $HOME/pwb
$ source $HOME/pwb/bin/activate
(pwb) $ cd $HOME/pywikibot-core
(pwb) $ pip3 install -e .[mwparserfromhell,mwoauth,mysql]
Finished processing dependencies for pywikibot==6.6.1

The reason I ask is that I've inherited a tool that's 1) big and complex, with lots of interconnected sub-tools, and 2) very old, reliant on Python 2 (and thus the old PWB), and based on the relevant practices of ca. 2015. I'm trying to tease the sub-tools apart, bring them up to Python 3 and the shared core-stable PWB, modernise them somewhat, and take advantage of new features in third-party modules (e.g. prettier graphs using the latest matplotlib). What I'd like to do is set up a venv for each sub-tool (so one sub-tool's matplotlib 3.x doesn't break another sub-tool's dependency on matplotlib 2.x), but still use the shared PWB (which I have no need to control at the level of my own git clone, and so would rather not have to).

Similarly, shouldn't this page encourage use of the shared PWB instead of a private git clone? As it's currently written it seems to put the exceptions in primary position, instead of explaining the best practice and shunting advanced / exceptional uses to a subpage. Or I'm just hopelessly confused. Probably the latter. :) --Xover (talk) 15:18, 24 October 2021 (UTC)

Shared pywikibot file usage instructions need to be updated for v7.0.0 changes (was: does not work)

The "Using the shared Pywikibot files (recommended setup)" section does not work. I followed the steps carefully, and they failed. Then I installed Pywikibot locally on Toolforge, and it worked. —usernamekiran (talk) 17:58, 4 March 2022 (UTC)

@Usernamekiran, do you have any specific errors to share? Anecdotally I can say that I used the shared setup instructions in officewikibot just a couple of weeks ago and was able to successfully configure the framework and run scripts on the grid engine via both manual submission and cron. -- BryanDavis (talk) 00:44, 5 March 2022 (UTC)
@BryanDavis: After following the steps carefully and correctly, when I ran the command python3 /data/project/shared/pywikibot/stable/generate_user_files.py I got "file pywikibot not found" or something to that effect. I apologise, I can't recall the exact message. —usernamekiran (talk) 15:59, 5 March 2022 (UTC)
@BryanDavis: Hi. I tried it again. After following the previous commands properly, when I entered python3 /data/project/shared/pywikibot/stable/generate_user_files.py it gave this error: python3: can't open file '/data/project/shared/pywikibot/stable/generate_user_files.py': [Errno 2] No such file or directory. I think that in my last attempt I had created the user files manually (in the .pywikibot directory). I will try that again and see if it works, and I will update here with what happens. —usernamekiran (talk) 16:43, 7 March 2022 (UTC)
@Usernamekiran you are correct that /data/project/shared/pywikibot/stable/generate_user_files.py is not available on disk today. I double checked the $HOME/.bash_history on the officewikibot tool I mentioned and found that the file was in that location 4 weeks ago. The new location is /data/project/shared/pywikibot/stable/pywikibot/scripts/generate_user_files.py. This move appears to be part of the changes in pywikibot 7.0.0 which was released on 2022-02-26. I will try to update the page to reflect these changes, but everyone's help is welcome in updating the documentation. -- BryanDavis (talk) 17:06, 7 March 2022 (UTC)
Special:Diff/1954677 updates the paths for generate_user_files.py and version.py in our local docs. -- BryanDavis (talk) 17:16, 7 March 2022 (UTC)
@BryanDavis: Sorry, I did not see your reply earlier. I manually added user-config, user-password (with 600 permissions), and user-fixes.py. All I could do was successfully run /data/project/shared/pywikibot/core/pywikibot/scripts/version.py; I could not do anything else. When I had Pywikibot installed locally on Toolforge, I could use everything (I have removed it now, for the second time). —usernamekiran (talk) 17:25, 7 March 2022 (UTC)
Pinging Klein Muçi, who has been helpful with the issue and might want to stay updated. —usernamekiran (talk) 17:42, 7 March 2022 (UTC)
Bryan, Klein: Does this also mean that the contents of .bash_profile should be updated? Currently it is: export PYTHONPATH=/data/project/shared/pywikibot/stable:/data/project/shared/pywikibot/stable/scripts —usernamekiran (talk) 18:12, 7 March 2022 (UTC)
I'm in no way near being an expert in this matter but I doubt it will have to be changed. I tried to run my existing scripts directly and through jsub, without doing any change, and they run normally. Strangely... - Klein Muçi (talk) 19:15, 7 March 2022 (UTC)
@Klein When running directly, how and at which command prompt do you run the scripts? I am not sure what I am doing wrong. —usernamekiran (talk) 20:02, 7 March 2022 (UTC)
I believe those are still the correct paths to add to PYTHONPATH. The release notes only mention the generate_family_file.py, generate_user_files.py, shell.py and version.py scripts being moved to pywikibot/scripts. -- BryanDavis (talk) 21:17, 7 March 2022 (UTC)
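The path change discussed in this thread can be handled defensively in a wrapper. The following is only a minimal sketch, not part of Pywikibot: it checks the two locations named above (the pywikibot/scripts directory used since 7.0.0, then the checkout root used before it) and returns whichever exists. The function name find_helper_script is hypothetical; the shared path is the one quoted in the discussion.

```python
from pathlib import Path
from typing import Optional

# Shared checkout path as quoted in the discussion above.
SHARED_STABLE = Path("/data/project/shared/pywikibot/stable")


def find_helper_script(name: str, root: Path = SHARED_STABLE) -> Optional[Path]:
    """Return the first existing location of a helper script, or None.

    Hypothetical helper: checks the pywikibot 7.0.0+ location
    (pywikibot/scripts/) first, then the pre-7.0.0 location (checkout root).
    """
    candidates = [
        root / "pywikibot" / "scripts" / name,  # 7.0.0 and later
        root / name,                            # before 7.0.0
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None


script = find_helper_script("generate_user_files.py")
if script is None:
    print("generate_user_files.py not found under", SHARED_STABLE)
else:
    print("Run with: python3", script)
```

Per the release notes mentioned above, the same lookup would apply to generate_family_file.py, shell.py and version.py.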

multiple user-fixes.py

@Bryan Hi. I hope you are doing well. I have been meaning to thank you and Klein Muçi, and to update you on the progress discussed in the section above, but I keep getting caught up with one thing or another. I will do it shortly, though. What I wanted to ask is this: currently my user-fixes.py contains around 200 entries for a find-and-replace task. It contains nothing except find-and-replace entries. Around 400 more words need to be added to the source code (currently pending approval/vetting by the mrwiki community). What is the maximum number of entries the task can reasonably have without becoming a burden on the server? I think this list will keep growing with time, but at a slower pace after this. Currently, the cron job is set to run once every 24 hours (4 PM UTC).

Two solutions come to mind. First: making a few different user-fixes files, if that's possible. Second: creating different "fixes" within the same file, limiting the entries in each fix to 100. Then I can set the cron job to run each fix at a different time, e.g. fix1 at 1 PM, fix2 at 2 PM, and so on. Kindly let me know what you think. Thanks a lot in advance, —usernamekiran (talk) 17:35, 5 April 2022 (UTC)

Your tool is limited to a maximum amount of CPU and RAM that it can use at any given time to protect the Toolforge servers. Keep an eye on how your single job is doing, and as long as it is not being stopped for exceeding its resource quota or taking longer to process things than you can allow things should be fine. You can start working on splitting things up into smaller pieces if you start to hit limits. -- BryanDavis (talk) 17:47, 5 April 2022 (UTC)
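If splitting does become necessary, the second idea (several named fixes in one file, each capped at 100 entries) could look roughly like the sketch below. It assumes the usual user-fixes.py layout, in which each entry of the fixes dictionary has 'regex', 'msg' and 'replacements' keys and Pywikibot supplies the fixes dictionary when it loads the file. The helper chunk_fixes, the fix names, the edit summary and the placeholder word pairs are all hypothetical.

```python
from typing import Dict, List, Tuple

Replacement = Tuple[str, str]


def chunk_fixes(
    replacements: List[Replacement],
    size: int = 100,
    prefix: str = "fix",
    summary: str = "Bot: spelling corrections",  # hypothetical edit summary
) -> Dict[str, dict]:
    """Split one long replacement list into named fixes of at most `size` pairs."""
    named = {}
    for start in range(0, len(replacements), size):
        name = f"{prefix}{start // size + 1}"  # fix1, fix2, ...
        named[name] = {
            "regex": False,  # plain-text find and replace
            "msg": {"_default": summary},
            "replacements": replacements[start:start + size],
        }
    return named


# Placeholder data standing in for the real word list (hypothetical).
all_replacements = [(f"old{i}", f"new{i}") for i in range(250)]

# Assumption: in a real user-fixes.py the `fixes` dict already exists, so the
# next line is only here to let this sketch run standalone and should be dropped.
fixes = {}
fixes.update(chunk_fixes(all_replacements))

for name, fix in fixes.items():
    print(name, len(fix["replacements"]))
```

Each named fix could then be given its own cron entry and selected with the -fix: option of replace.py (for example -fix:fix1), so that the chunks run at different times.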
Thanks Bryan. Also, I recently got an email with the subject "[REMINDER] - Your KIRANBOT4 Project Is Still Running On Strech Grid Engine". My command prompt is tools.kiranbot4@tools-sgebastion-08:~$. The email contains a few links, but I couldn't find any page with clear steps for migrating to the Buster grid or Kubernetes. There are some steps given at en:User:Novem Linguae/Essays/Toolforge bot tutorial, but they are about setting up from scratch, not about migrating. Also, my tool does not have a webservice, only pywikibot. How should I migrate? And which do you recommend: the Buster grid or Kubernetes? I apologise for bothering you so much. —usernamekiran (talk) 08:18, 6 April 2022 (UTC)
@Usernamekiran, see News/Toolforge Stretch deprecation#Move_a_cron_job for specific instructions on moving a job from the legacy Stretch grid to the new Buster grid. Today I would recommend the Buster grid for most pywikibot users. There has been discussion of making it easier to run pywikibot jobs on the Kubernetes cluster, but today it is an under-documented and under-supported use case. -- BryanDavis (talk) 16:45, 6 April 2022 (UTC)
Hello Bryan. I think I did it, but I am not sure how successful I was. Is there any way to check whether I migrated successfully? —usernamekiran (talk) 14:33, 8 April 2022 (UTC)
@Usernamekiran, the report at https://grid-deprecation.toolforge.org/u/usernamekiran will eventually be empty if you succeeded in migrating off of stretch or disabling all jobs for tools you maintain. That report is updated once per hour. A more manual check can be done by looking at the qstat -xml output for each tool. The hostname component of the <queue_name> that the job is running under tells you if the queue is running on stretch (contains -09nn where nn is any two digits) or buster (contains -10- ). At the time I am writing this, your KiranBOT job is running on continuous@tools-sgeexec-0920.tools.eqiad.wmflabs which indicates it is still on the stretch grid. -- BryanDavis (talk) 16:01, 8 April 2022 (UTC)
@Bryan I stopped all the jobs. —usernamekiran (talk) 17:55, 8 April 2022 (UTC)
@Bryan Hi. Currently, I have only two active jobs, and they are no longer visible on https://grid-deprecation.toolforge.org/t/kiranbot4 because they were run on the Buster grid. The remaining two jobs will drop off the list after a week, I think. Thanks again for all your help. Just one more question: just as we can see the jobs running on the old grid, is there a way to see the jobs running on the Buster grid? Maybe a command I can run to see which grid I am using? Thanks a lot again. —usernamekiran (talk) 17:36, 9 April 2022 (UTC)

A more manual check can be done by looking at the qstat -xml output for each tool. The hostname component of the <queue_name> that the job is running under tells you if the queue is running on stretch (contains -09nn where nn is any two digits) or buster (contains -10- ).

-- BryanDavis (talk) 22:36, 10 April 2022 (UTC)
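The manual check quoted above can be scripted. This is only a sketch under stated assumptions: it assumes the qstat -xml output contains <queue_name> elements as described, and it applies exactly the two hostname rules given in the reply (-09nn for stretch, -10- for buster). The function names and the sample XML wrapper are hypothetical; the sample queue name is the one quoted earlier in this thread.

```python
import re
import subprocess
import xml.etree.ElementTree as ET

# Hostname rules taken from the reply above.
STRETCH_HOST = re.compile(r"-09\d{2}")  # "-09nn", nn = any two digits
BUSTER_MARKER = "-10-"


def classify_queue(queue_name: str) -> str:
    """Return 'stretch', 'buster' or 'unknown' for a queue name like queue@host."""
    host = queue_name.split("@", 1)[-1]
    if STRETCH_HOST.search(host):
        return "stretch"
    if BUSTER_MARKER in host:
        return "buster"
    return "unknown"


def grids_from_qstat_xml(xml_text: str) -> dict:
    """Map every non-empty <queue_name> in qstat -xml output to its grid."""
    root = ET.fromstring(xml_text)
    result = {}
    for element in root.iter("queue_name"):
        name = (element.text or "").strip()
        if name:  # jobs that are not running yet may have an empty queue name
            result[name] = classify_queue(name)
    return result


def current_grids() -> dict:
    """Run qstat -xml (only available on a Toolforge bastion) and classify the jobs."""
    output = subprocess.run(
        ["qstat", "-xml"], capture_output=True, text=True, check=True
    ).stdout
    return grids_from_qstat_xml(output)


# Hypothetical, trimmed-down sample of qstat -xml output.
SAMPLE = """<job_info>
  <queue_info>
    <job_list state="running">
      <queue_name>continuous@tools-sgeexec-0920.tools.eqiad.wmflabs</queue_name>
    </job_list>
  </queue_info>
</job_info>"""

print(grids_from_qstat_xml(SAMPLE))
```

On a bastion, calling current_grids() while logged in as the tool would report the grid for each of that tool's running jobs.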