You are browsing a read-only backup copy of Wikitech. The live site can be found at wikitech.wikimedia.org

SecurePoll

From Wikitech-static
Revision as of 05:10, 9 August 2021 by imported>Tim Starling (→‎Voter eligibility: example script invocations)
Jump to navigation Jump to search

SecurePoll is a voting system with per-election configuration stored in the database. We use it for Board elections.

About XML files

Prior to 2014, there was no UI for creating or editing election configuration. However, there were CLI scripts to import and export the configuration as XML. Elections were created by manually writing XML files and then importing them. That's why cli/dump.php and cli/import.php are relatively complex, and why various old tools produce or act on XML.

WMF-specific CLI scripts

A lot of scripts were written under extreme time pressure to do a very specific job per the requirements of a given election. Some were archived under in a year-specific directory under cli/wm-scripts, and then copied forward for each new election with minor modifications. Some scripts were apparently lost.

dumpMetaTranslations.php
SecurePoll has its own translation UI. In the 2013 Board vote this was not used, instead messages were edited as pages on meta. This script generated compiled the translations on meta and generated XML for use with import.php --update-msgs.
doSpam.php
Collects voter names and email addresses, and outputs them as a tab-delimited text file. Used in 2013 and 2015.
buildSpamTranslations.php
Extracts bulk email translations from meta and dumps them as a series of text files.
sendEmails.php
Actually sends emails using files generated by doSpam.php and buildSpamTranslations.php.
dumpGlobalVoterList.php
Similar to doSpam.php, probably used in 2009 and 2011.
populateEditCount.php
This script, copied forward with minor variations from 2013 to 2021, counts local edits selected according to various criteria and stores them in a table on the local wiki: bv2013_edits, bv2015_edits, etc.
voterList.php
The 2013 version was run on each wiki to generate a local voter list. The 2015-2017 version is run once on a random wiki, and populates centralauth.securepoll_lists with a global voter list by adding together edit counts across all wikis.
makeGlobalVoterList.php
A generic version of voterList.php introduced in 2021.
makeArbcomList.php
Supposedly a script of this name was used for the 2009 ArbCom election.

How to run a board election

Voter eligibility

It is necessary to run scripts to pre-compute eligibility lists. The scripts take about 2 days to run.

The 2021 Board election voter eligibility rule for editors is:

You may vote from any one registered account you own on a Wikimedia wiki. You may only vote once, regardless of how many accounts you own. To qualify, this one account must:

  • blocked in no more than one project;
  • and not be a bot;
  • and have made at least 300 edits before 5 July 2021 across Wikimedia wikis;
  • and have made at least 20 edits between 5 January 2021 and 5 July 2021.

The global block count is the central-block-count property, which can be changed using the voter eligibility subpage in the UI "Must not be blocked on too many attached wikis".

The "not a bot" criterion is the not-bot property, which can be changed in the UI by checking "Must not be flagged as a bot". But note that this only filters by bot permissions on the jump wiki.

The edit count criteria don't have a UI. There is phab:T288183 to track that. For now:

  • Copy and adapt bv2021/populateEditCount.php
  • Create the bvYYYY_edits table (where YYYY is the year) on all wikis. For example:
for db in $(expanddblist all); do
    echo 'CREATE TABLE `bv2021_edits` (
  `bv_user` int(11) NOT NULL,
  `bv_long_edits` int(11) NOT NULL,
  `bv_short_edits` int(11) NOT NULL,
  PRIMARY KEY (`bv_user`)
)' | mwscript sql.php --wiki=$db
done
  • Run the new populateEditCount.php on all wikis. This takes about 2 days if you use one thread per section, i.e. foreachwikiindblist $section .../populateEditCount.php
  • Run makeGlobalVoterList.php on a random wiki, e.g.
mwscript extensions/SecurePoll/cli/wm-scripts/makeGlobalVoterList.php \
   --wiki=mediawikiwiki \
   --edit-count-table=bv2021_edits \
   --list-name=board-vote-2021 \
   --short-min-edits=20 \
   --long-min-edits=300
  • After election creation, manually insert into the database a property with name need-central-list and the value being the name of the list given to makeGlobalVoterList.php.

The 2021 rules require developers to automatically be granted access if they:

  • are Wikimedia server administrators with shell access
  • or have made at least one merged commit to any Wikimedia repos on Gerrit, between 5 January 2021 and 5 July 2021.

It is unclear how this can be implemented. The wording was introduced in June 2021 and was discussed at phab:T281977.

Exceptions

Exceptions that come from fixed lists, such as staff exceptions, should be put into the central list manually. For example, to allow Tim Starling to vote:

$ sql centralauth
mysql> INSERT INTO securepoll_lists (li_name,li_member) SELECT 'board-vote-2021', gu_id FROM globaluser WHERE gu_name='Tim Starling';

Creating the election

TO DO: presumably CreatePage can be used, but with what exact options?

Translations

A few gotchas, pitfalls and comments (from 2011-2013):

  • Don't let your translators translate the section titles! It breaks the script that generates the XML file.
  • When all the candidates have been submitted, get the translators to translate the "Candidate text" section. If the candidate names don't need to be transliterated (and can be displayed as in English), get them to remove that section entirely. If they do need to be transliterated (for languages such as Arabic), make sure that the candidates are in the right order.
  • Magic words do not (currently) work. They need removing

Do a test run

A few days before the election, you should run a test election for a few days, just to make sure that everything (particularly eligibility and the session transfer) is working properly. To make sure that there is absolutely no risk of confusion, change every message in this election to "THIS IS A TEST ELECTION, DO NOT VOTE", or similar.

Email spam

There are scripts that can handle the mass-mailing are in the SecurePoll git repository. It is probably easiest to do a copy-and-modify for the newest elections. buildSpamTranslations.php pulls translations out of the database on meta, doSpam.php generates a list of emails to send, and sendMails.php actually sends the email. Don't forget to change:

Run a trial run by manually inputting your own details into sendMails.php and see what gets spat out. I'd recommend redirecting output for all of the scripts so you have a paper trail of exactly what happened in each step.

Translations are pulled from a base page plus a language code. You might need to update buildSpamTranslations.php to strip more boilerplate from the pages. You should be prepared to deal with the following types of complaints:

  • Unflagged bots being invited to vote. Often the bot will not be flagged on all wikis that it is active — we need better handling for this, but you can find out which wiki a bot is being emailed from by grepping the output of doSpam.php.
  • Multiple accounts — usually people are understanding of the fact that the list is automatically generated, but some extra deduplication by email address would be helpful.
  • Gaffes in translation and execution (in the case of 2011, the subject line wasn't updated). The test run will fix most of these, but you should also check that the translations aren't trying to use magic words or anything fancy (unless you add support to the mailout script). Also check on the formatting of the translations themselves, translators have a habit of changing things to make them clearer and breaking the scripts.

A few lessons learned from 2011 that would be good to apply in future

  • We tended to get a lot of unflagged global bots receiving messages. Perhaps accounts bot-flagged on more than one wiki should be excluded altogether.
  • Some users with multiple accounts got multiple emails — deduplication by email address is an option worth considering.
  • We should send out the email reasonably early in the process — at most halfway through.
  • Bidirectional text tended to appear garbled in text-only emails. This was exacerbated by nonstandard formatting of the translation pages.
  • Translators need to be briefed about format — in particular, magic words don't work, and we need to try to keep boilerplate to a minimum so that we can extract the text reasonably seamlessly.
  • If we can get a machine readable list of accounts that have already voted, that might be useful for eliminating them from the mailout.

See also MetaWikipedia:Image filter referendum/Email/False positives.

Duplicate removal

SecurePoll takes the approach of trying to make ballot stuffing detectable, and allowing it to be rectified if it is detected, rather than discouraging it with error messages. This reduces the risk that a malicious user will refine their methods until they are able to evade all technical restrictions. The tradeoff is that it creates a lot of work for election administrators.

(However, allowing the same CentralAuth user to vote twice from different wikis is technical debt arising from the introduction of CentralAuth, it is not a deliberate design decision.)

Duplicate removal is a large task, and as many trusted election administrators as possible should be encouraged to help with it. SecurePoll allows people to alter their votes by voting more than once. But if a person votes more than once with different accounts, or with the same username on different wikis, then those votes are counted twice. Such votes need to be manually removed.

Election documentation should make it clear that each person is only entitled to vote once, and that having multiple qualified accounts is not a license to vote more than once. Policies should be put in place to strongly discourage deliberate ballot-stuffing.

For people who are named as election administrators on vote.wikimedia.org, SecurePoll extends the "list" page to provide a duplicate vote removal interface. As a first pass, sort it by username, and then scroll through the results looking for multiple votes with the same username, ignoring greyed-out entries. Then do the same sorting by IP address.

Votes with a "dup" marker should be investigated. The details page gives the username of the suspected sockpuppet, based on cookie evidence. Any votes deemed to be duplicates should be struck out, by clicking the "strike" link. Voters whose votes are struck should be contacted by email, if they have set an email address in their preferences.

Encryption

This section has not been updated for the new situation after 2013

Two separate keys need to be generated for each election: an encryption key and a signing key.

In the past, election committee members have been put under pressure to release running tallies, while voting is still in progress. On one occasion, such a tally was privately given to the interested party, and the results were used to influence voting in a last-ditch campaign. Having an outside party hold the encryption key exclusively until after the election is complete avoids this potential quandary.

After voting is complete, the encryption key should be given to several trusted election committee members. To prevent vote-buying, the encryption key should never be publicly released. If the encryption key is released, then a voter can use their encrypted record to prove that they voted in a particular way.

Instead, voters wishing to establish the integrity of their encrypted election records should do so by private communication with a private key holder. The election record can be decrypted and compared with who the voter claims they voted for.

The signing key allows voters to prove that they have a valid encrypted record. If a voter has a signed voting record which is not present in the database, then this is clear evidence that the server has been compromised.

To generate each key:

gpg --gen-key

Then follow the prompts. Use usernames like "Wikimedia Board Election 2011 signing key" and "Wikimedia Board Election 2011 encryption key" to make it easier to know which is which. Then export each key:

gpg --armor --export-secret-keys 'Wikimedia Board Election 2011 signing key'
gpg --armor --export 'Wikimedia Board Election 2011 encryption key'

Key generation should be done by the 3rd party keeping the decryption key.

<property name="gpg-sign-key">...exported secret key...</property>
<property name="gpg-encrypt-key">...exported public key...</property>

Tallying

Since 2021 it should be possible to do tallying from the UI. It's slow but uses the job queue.

Session transfers

SecurePoll supports having an election stubbed out on the wiki that the voters actually come from, but hosted somewhere else (votewiki in this case).

Since 2014 these stub elections are created automatically by the election creation UI when you select the "all wikis" option.

Prior to 2014, the stub elections were set up by generating an XML configuration file and importing it into all wikis.

Users do not actually need an account on the wiki where the vote is actually hosted. This was used prior to 2013 to host the vote on servers not controlled by WMF. Even though we now self-host our elections, votewiki does not have CentralAuth enabled and so users will typically not have a user account when they vote.

Session transfer works by having the jump wiki send a token to votewiki. Votewiki authenticates the token by doing an HTTP request to auth-api.php. The auth-api.php request returns user parameters which are saved by votewiki. This mechanism predates the action API and there is a task to convert this mechanism to finally use the action API which may soon be implemented.

A config hack adds "lang" and "site" parameters to the jump URL, which the remote site uses to reconstruct a URL back to the jump wiki.

Timeline

2004
The first Board election. Tim, then a volunteer, quickly wrote the BoardVote extension to run this election. It was one of the first MediaWiki extensions. Many of SecurePoll's quirks were already present, such as GPG-based encrypted vote receipts. The voting/tallying method was approval voting.
2005-2008
The BoardVote extension was edited for each election, changing table names etc. to make it continue to work for another year.
2007
The first election hosted by Software in the Public Interest (SPI). SPI would host the 2007, 2008, 2009 and 2011 elections.
2008
The tallying method was changed to the Schulze method (a Condorcet preferential method).
2009
Tim wrote the first version of SecurePoll, as a multi-election adaption of the BoardVote code. It was used for the 2009 Board election.
2013
Elections returned to being self-hosted. Encryption of voting records at rest effectively ended, with private (tallying) keys being stored on the server instead of held by an election administrator and tallied offline.
2013
The voting method was changed to a modified range vote formula: support/(support+oppose).
2014
Brad wrote an election creation UI. But the UI continues to be incomplete and was not used for voter qualification in 2015-2021.
2021
The voting method was changed to Single Transferable Vote.