
Wikilabels: Difference between revisions

From Wikitech-static
imported>Nintendofan885
(Use Wikimedia.cloud)
imported>Elukey

Revision as of 10:23, 14 September 2022

Wikilabels is a stand-alone service used to gather data from users to build AI models for ORES. It is maintained by the Wikimedia Scoring Platform team and is currently hosted on Cloud VPS (Nova_Resource:wikilabels).

Technical details

  • There are several instances:
    • wikilabels-03.wikilabels.eqiad1.wikimedia.cloud: The main node. It uses PostgreSQL (db1004.eqiad.wmnet) and is accessible from labels.wmflabs.org
    • wikilabels-staging-01.eqiad1.wikimedia.cloud: The staging node. It uses a similar setup and is accessible from labels-staging.wmflabs.org
    • wikilabels-experiment.eqiad1.wikimedia.cloud: The node for tests and experiments. Accessible from labels-experiment.wmflabs.org
    • wikilabels-backups.eqiad1.wikimedia.cloud: The node that keeps daily database backups of the main node. Accessible from wikilabels-dumps.wmflabs.org
  • The deploy repo: http://github.com/wiki-ai/wikilabels-wmflabs-deploy/
  • Puppet: https://github.com/wikimedia/puppet/tree/production/modules/wikilabels/manifests

Initialize a VM

From your local laptop or workstation, check out the deploy repository and make sure that you can SSH to the target Cloud VPS instance. Then create a Python venv and install fabric3, which lets you run the following:

fab initialize_server:hosts="wikilabels-03.wikilabels.eqiad1.wikimedia.cloud"
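The preparation described above (SSH check, venv, fabric3) could be scripted roughly as follows. This is only a sketch: the function name `setup_deploy_env` and the default venv directory `.venv` are assumptions for illustration, and it is meant to be run from inside the deploy repository checkout.

```shell
# Hypothetical helper: prepares a local environment for running fab.
# Usage: setup_deploy_env <target-host> [venv-dir]
setup_deploy_env() (
  target_host="$1"
  venv_dir="${2:-.venv}"   # venv location is an assumption

  # Fail early if key-based SSH access to the instance is not working.
  if ! ssh -o BatchMode=yes "$target_host" true; then
    echo "cannot ssh to $target_host" >&2
    exit 1
  fi

  # Create the venv and install fabric3 into it.
  python3 -m venv "$venv_dir" &&
    "$venv_dir/bin/pip" install fabric3
)
```

After `setup_deploy_env wikilabels-03.wikilabels.eqiad1.wikimedia.cloud` succeeds, the `fab` command should be available as `.venv/bin/fab` (or on the PATH once the venv is activated).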

You also need to place OAuth keys in a specific file (a random key is fine):

elukey@wikilabels-03:~$ cat /srv/wikilabels/config/config/99-oauth.yaml
# These credentials are intended to be used when testing the local, development
# version of Wiki Labels.  Do not use these credentials in a production
# environment.  They will redirect users to localhost:8080 expecting to find
# Wiki Labels there.
oauth:
  key: xxx
  secret: xxxx

You'll also need to create a file named 98-database.yaml with the following content:

# These credentials are intended to be used on labels.wmflabs.org.  They are
# sensitive and should never be committed to a public repository.
database:
  user: u_wikilabels
  dbname: u_wikilabels
  password: REDACTED
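One way to create this file without typing the password on a command line is sketched below. The helper name `write_db_config` is hypothetical, and placing the file next to 99-oauth.yaml is an assumption. The mode 600 is also an assumption: the file must stay readable by whichever user the service runs as, so adjust ownership or mode accordingly.

```shell
# Hypothetical helper: reads the database password from stdin and writes
# 98-database.yaml into the given directory.
# Usage: write_db_config <config-dir> < password-file
write_db_config() (
  config_dir="$1"
  target="$config_dir/98-database.yaml"

  # Read the password from stdin so it does not end up in shell history.
  IFS= read -r db_password || [ -n "$db_password" ] || exit 1

  # Single-quote the value for YAML, doubling any embedded single quotes.
  escaped=$(printf '%s' "$db_password" | sed "s/'/''/g")

  umask 077   # create the file readable by its owner only (assumption)
  printf "database:\n  user: u_wikilabels\n  dbname: u_wikilabels\n  password: '%s'\n" \
    "$escaped" > "$target" &&
    chmod 600 "$target"
)
```

For example, `write_db_config /srv/wikilabels/config/config < ~/db-password.txt`, where the directory and the password file name are illustrative.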

Deployment guide

After changes are merged in the main repo, you need to update the deploy repo:

cd wikilabels-wmflabs-deploy/
git pull
cd submodules/wikilabels
git pull
cd ../..
git add wikilabels
git commit

When prompted for a commit message, write something like "Bumping wikilabels to HEAD", then push and deploy to staging:

git push
fab stage

The change is now on the staging node. Log it (using !log wikilabels in the #wikimedia-cloud IRC channel), test it, and if it works, move it to production:

git checkout deploy
git rebase origin/master
git push -f origin deploy
fab deploy

And log it!
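The two command sequences above can be wrapped in small shell functions so that a failing step stops the rest. This is a sketch, not part of the deploy repo: `run`, `stage_wikilabels`, `promote_wikilabels` and the `DRY_RUN` variable are hypothetical names, and `git -C submodules/wikilabels pull` stands in for the `cd`/`git pull`/`cd ../..` steps.

```shell
# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Bump the submodule and deploy to staging; && stops at the first failure.
# Run from the root of the wikilabels-wmflabs-deploy checkout.
stage_wikilabels() {
  run git pull &&
    run git -C submodules/wikilabels pull &&
    run git add wikilabels &&
    run git commit -m "Bumping wikilabels to HEAD" &&
    run git push &&
    run fab stage
}

# Move the tested change to production.
promote_wikilabels() {
  run git checkout deploy &&
    run git rebase origin/master &&
    run git push -f origin deploy &&
    run fab deploy
}
```

Setting `DRY_RUN=1` before calling `stage_wikilabels` only prints the commands, which is a cheap way to review them first. Logging in IRC and testing on staging remain manual steps between the two functions.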

A new labeling campaign

First, create the new campaign:

$ ssh wikilabels-03.wikilabels.eqiad1.wikimedia.cloud
ladsgroup@wikilabels-03$ cd /srv/wikilabels/config
ladsgroup@wikilabels-03:/srv/wikilabels/config$ sudo -u www-data /srv/wikilabels/venv/bin/wikilabels new_campaign wikidatawiki "Edit quality (5k, 2018)" damaging_and_goodfaith DiffToPrevious 1 50
{'form': 'damaging_and_goodfaith', 'id': 38, 'view': 'DiffToPrevious', 'active': True, 'name': 'Edit quality (5k, 2018)', 'tasks_per_assignment': 50, 'labels_per_task': 1, 'wiki': 'wikidatawiki', 'info_url': None, 'created': datetime.datetime(2018, 7, 11, 13, 39, 54, 282569)}

Note the id (38 in this case). Next, load the data into the campaign. Download the file to your home directory, then run:

ladsgroup@wikilabels-03:/srv/wikilabels/config$ cat ~/wikidatawiki.autolabeled_revisions.125k_2018.review.json | sudo -u www-data ../venv/bin/wikilabels task_inserts 38
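Instead of copying the campaign id by hand, it can be extracted from the new_campaign output. The sketch below assumes the output keeps the dictionary format shown above (with an `'id': <number>` entry); `campaign_id_from_output` is a hypothetical helper name.

```shell
# Hypothetical helper: reads new_campaign output on stdin and prints the
# campaign id, or nothing if no "'id': <number>" entry is found.
campaign_id_from_output() {
  sed -n "s/.*'id': \([0-9][0-9]*\).*/\1/p"
}

# Example with a shortened version of the output shown above:
printf '%s\n' "{'form': 'damaging_and_goodfaith', 'id': 38, 'view': 'DiffToPrevious'}" |
  campaign_id_from_output   # prints 38
```

The new_campaign command can be piped into the helper and the result stored in a variable, which is then passed to task_inserts in place of the literal 38. Check that the variable is not empty before using it.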

Restarting the service

Any time the connection to PostgreSQL is broken, the wikilabels service needs to be restarted:

service uwsgi-wikilabels-web restart
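Before restarting, it may be worth checking that the database is reachable again, since a restart while PostgreSQL is still down is unlikely to help. The sketch below assumes the `pg_isready` client utility is installed on the node and that PostgreSQL listens on its default port; `restart_wikilabels_web` is a hypothetical helper name.

```shell
# Hypothetical helper: restart the web service only if PostgreSQL answers.
# Usage: restart_wikilabels_web [db-host]
restart_wikilabels_web() (
  db_host="${1:-db1004.eqiad.wmnet}"

  # pg_isready exits 0 when the server is accepting connections.
  if pg_isready -q -h "$db_host"; then
    service uwsgi-wikilabels-web restart
  else
    echo "PostgreSQL on $db_host is not accepting connections; skipping restart" >&2
    exit 1
  fi
)
```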

Incidents

See also