
VRT System

From Wikitech-static
Revision as of 14:52, 5 November 2021 by imported>Alexandros Kosiaris (→‎Upgrading)

Znuny is installed on otrs1001.eqiad.wmnet. It is a community fork of OTRS, which we adopted after OTRS went closed source in version 7.x.


  • VRT stands for Volunteer Response Team.
    • They use the platform to read and answer emails to a variety of incoming destinations, e.g. info-<language>, permissions-<language>, etc.
    • Public details are at m:Volunteer Response Team
    • Their private wiki is at vrtwiki:
  • VRTS stands for Volunteer Response Team System. We currently use software by Znuny GmbH.
    • Znuny is the software released by Znuny GmbH. It is a community fork of OTRS Community Edition, which was discontinued in 2021 after OTRS had been closed source since version 7.x.
    • OTRS was how we previously referred to the software and team. Remnants remain. (T280392 - migration task)
  • VRTS admins are the volunteers who administer the queues and the agents (~6 people).
  • VRTS agents are the individual members of that team (400+ people).


  • URL is
  • The root user/pass is in the ops password repo
  • We use mod_perl with ModPerl::Registry, so whenever a file is changed, an apache2 reload is required.
  • The "News" messages on the main Znuny login screen can be edited by modifying /opt/otrs/Kernel/Output/HTML/Templates/Standard/

There is no need to update config files to add email addresses to the system; the inbound MX servers automatically see whether a queue exists or has disappeared. However, because of negative caching at the secondary mail exchangers, new addresses can take up to two hours to begin working.

[Figure: Typical email workflow in VRT System]


The source is unpacked into a version-dependent directory under /opt, e.g. /opt/znuny-6.0.37.

A symlink then points to that directory, e.g. /opt/otrs -> /opt/znuny-6.0.37. The symlink is updated at every upgrade.
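
The symlink switch can be sketched in Python as follows. This is a minimal illustration and not the procedure used on the host: it swaps the link with a rename so that the link never disappears, and the helper name switch_symlink and the temporary link name are hypothetical.

```python
import os
import tempfile


def switch_symlink(link_path: str, new_target: str) -> None:
    """Point link_path at new_target without a window where the link is missing."""
    link_dir = os.path.dirname(os.path.abspath(link_path))
    # Hypothetical temporary name, created next to the real link.
    tmp_link = os.path.join(link_dir, "." + os.path.basename(link_path) + ".tmp")
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(new_target, tmp_link)
    # Renaming over the old link replaces it in a single step.
    os.replace(tmp_link, link_path)


if __name__ == "__main__":
    # Demonstrate in a scratch directory rather than in /opt.
    with tempfile.TemporaryDirectory() as scratch:
        for version in ("znuny-6.0.36", "znuny-6.0.37"):
            os.mkdir(os.path.join(scratch, version))
        link = os.path.join(scratch, "otrs")
        os.symlink("znuny-6.0.36", link)
        switch_symlink(link, "znuny-6.0.37")
        print(os.readlink(link))
```

The relative target keeps the link valid if the parent directory is moved, which mirrors the /opt/otrs -> /opt/znuny-6.0.37 layout described above.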

The primary configuration file is /opt/otrs/Kernel/

Configuration can be done either via the Web interface or by changing Kernel/ and shipping the change via puppet. The file overrides whatever changes happen via the Web Interface. When you change the config, reload apache to clear the mod_perl cache.

Znuny configuration is meant to be primarily done via the Web interface. Stick to that for most use cases.

Local Customizations


The codebase has some local customizations, which are applied as Znuny packages. The packages are stored in the database in otrs.package_repository and can be reapplied after an update via the web UI or the command line. The packages can also be downloaded from within the web interface when logged in as an admin user:

Admin-->Package Manager-->{packages listed under Local Repository}-->Download

Via the command line, packages can be reapplied with:

sudo -u www-data /opt/otrs/bin/ Admin::Package::ReinstallAll

The packages are in gerrit in the operations/software/otrs repo. Their structure mirrors the structure of the VRTS source code, because OTRS/Znuny packages can override a "proper" file, e.g.

  • Custom / Kernel / Output / HTML / Templates / Standard /

will override

  • Kernel / Output / HTML / Templates / Standard /

That makes it a bit harder than maintaining patches, because we have to keep the entire files in sync with upstream and not just the differences (which are arguably very small). Maintenance consists of fetching, for each new version, the upstream files that the package overrides and manually merging the upstream changes into our copies.
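
The override rule and the merge chore can be illustrated with a short Python sketch. It assumes only what is stated above, namely that a file under Custom/ shadows the file at the same relative path. The function names and the file name Example.tt are hypothetical.

```python
import difflib
from pathlib import PurePosixPath


def upstream_path(custom_path: str) -> str:
    """Return the upstream path that a file under Custom/ overrides."""
    parts = PurePosixPath(custom_path).parts
    if not parts or parts[0] != "Custom":
        raise ValueError("not a Custom/ override: " + custom_path)
    return str(PurePosixPath(*parts[1:]))


def local_delta(upstream_text: str, custom_text: str) -> list:
    """Show the local changes as a unified diff against the upstream file."""
    return list(
        difflib.unified_diff(
            upstream_text.splitlines(),
            custom_text.splitlines(),
            "upstream",
            "custom",
            lineterm="",
        )
    )


if __name__ == "__main__":
    # Example.tt is a made-up file name used only for illustration.
    print(upstream_path("Custom/Kernel/Output/HTML/Templates/Standard/Example.tt"))
    print("\n".join(local_delta("header\nstock footer", "header\nlocal footer")))
```

Reviewing the delta against each new upstream release is one way to see how small the local change is before merging by hand.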


Installing the packages requires that they are first built. Building the packages happens on the command line using:

sudo -u otrs /opt/otrs/bin/ Dev::Package::Build --module-directory packages/WikimediaTemplates/ packages/WikimediaTemplates/WikimediaTemplates.sopm /tmp

In the command above, the module directory is just a checkout of the repo mentioned above, /tmp is the output directory, and the path to the .sopm file is given so that the code is bundled together with the package description. The .sopm file is an XML file whose format is largely self-explanatory.


Installing the package happens using:

sudo -u www-data /opt/otrs/bin/ Admin::Package::Install /tmp/WikimediaTemplates-1.0.18.opm

The version (1.0.18 in the example above) will need adjustment per package version and is defined in the .sopm file above.
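
As a rough aid, the expected .opm file name can be derived from the .sopm file. The sketch below assumes the .sopm has an otrs_package root element with Name and Version children, and that the built file is named Name-Version.opm as in the example above. The sample XML is a trimmed, hypothetical stand-in for the real file.

```python
import xml.etree.ElementTree as ET

# Trimmed, hypothetical .sopm content; the real file carries more elements.
SAMPLE_SOPM = """<otrs_package version="1.0">
    <Name>WikimediaTemplates</Name>
    <Version>1.0.18</Version>
</otrs_package>
"""


def opm_filename(sopm_xml: str) -> str:
    """Build the expected .opm file name from the Name and Version elements."""
    root = ET.fromstring(sopm_xml)
    name = root.findtext("Name")
    version = root.findtext("Version")
    if not name or not version:
        raise ValueError("Name or Version missing from .sopm")
    return name.strip() + "-" + version.strip() + ".opm"


if __name__ == "__main__":
    print(opm_filename(SAMPLE_SOPM))
```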

Database backend

The primary database is on the m2 shard, database named 'otrs'.


Mail delivery

  • When the user/group for the exim pipe were incorrect, permission errors were logged to /var/log/mail.log on attempts to write in /opt/otrs/var/tmp/CacheFileStorable. We fixed this by configuring the exim pipe to use group=www-data.
  • Mail hosts need mysql access to the otrs database. If mx IP addresses change or the database is inaccessible, mail defers on whichever mx is trying to do an address lookup. When we saw this happen, exim wasn't very informative about why. When in doubt double-check mysql access from the mx's command line.
  • Spamassassin runs locally, and logs in /var/log/mail.log.

Apache permissions errors

  • as user otrs run:
/opt/otrs/bin/ --otrs-user=otrs --otrs-group=otrs --web-user=www-data --web-group=www-data /opt/otrs

SpamAssassin stops reporting Bayes results

  • This happened 2014-04-24, 2016-08-06, 2016-12-21 and we discovered it was unhappy about the bayes database.
  • /var/log/syslog was full of this:
bayes db version 0 is not able to be used, aborting! at /usr/share/perl5/Mail/SpamAssassin/BayesStore/ line 203, <GEN88>
  • We tried to back up and restore the database (the verify failed); the database shrank from ~24M to ~14M and SA stopped complaining. But SA continued to pass mail through with no Bayes results (the BAYES_XX header, where XX is in [00,90], was missing from the message).
  • So we moved aside the old database, modified not to skip previously-seen messages, and created one-time GenericAgent [within OTRS] jobs to re-export a couple of days' worth of ham/spam. Then we ran train_spamassassin manually to train on all this data. Note otrs.TicketExport2Mbox now has a --rebuild mode to support this process.
  • On the 2016-08-06 incident the log statement below was found in the logs:
  Aug  6 09:56:59.752 [1619] dbg: bayes: not available for scanning, only 126 ham(s) in bayes DB < 200

after running a:

 sudo -u debian-spamd spamassassin -D bayes < /tmp/sample_email.eml

and a

 sudo -u debian-spamd sa-learn --dump magic

confirmed it.

The fix was to re-export messages and retrain SpamAssassin as mentioned above. Extra care should be taken to make sure that the exported spam and ham messages each number above 200.

  • In the 2016-12-21 incident both the hams and the spams in the database were below 200. Unlike the ham message above, this was not logged, which led the investigation off track for a while. Exporting quite a few messages and training SpamAssassin on them as above fixed the issue. In this case db_verify did NOT mark the database as corrupted, but it was manually truncated in the end for good measure.
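
Since low counts are not always logged, a small check on the output of sa-learn --dump magic can help. The Python sketch below assumes the usual column layout of that output (count in the third column, label at the end of the line) and takes the threshold of 200 from the log line quoted above. The sample text and function names are illustrative.

```python
MIN_MESSAGES = 200  # threshold taken from the "< 200" log message above

# Illustrative sample; real output contains more "non-token data" lines.
SAMPLE_DUMP = """\
0.000          0          3          0  non-token data: bayes db version
0.000          0        950          0  non-token data: nspam
0.000          0        126          0  non-token data: nham
"""


def bayes_counts(dump_magic_output: str) -> dict:
    """Extract the nspam and nham counters from 'sa-learn --dump magic' output."""
    counts = {}
    for line in dump_magic_output.splitlines():
        fields = line.split()
        if len(fields) >= 5 and fields[-1] in ("nspam", "nham"):
            counts[fields[-1]] = int(fields[2])
    return counts


def bayes_usable(counts: dict) -> bool:
    """Bayes scoring needs enough of both spam and ham in the database."""
    return (
        counts.get("nspam", 0) >= MIN_MESSAGES
        and counts.get("nham", 0) >= MIN_MESSAGES
    )


if __name__ == "__main__":
    counts = bayes_counts(SAMPLE_DUMP)
    print(counts, "usable" if bayes_usable(counts) else "needs more training data")
```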

Mail setup

E-mail is sent and received through a special Exim instance on the hosting server. Its configuration follows the lines of the setup described in Mail, but VRT-specific configuration is listed below.

Spam and Malware scanning

SpamAssassin and ClamAV are used for spam/malware scanning, in an Exim ACL that runs at the DATA phase of the SMTP connection. Should SpamAssassin fail for some reason, mail is let through.

    # skip spam-check for locally-submitted messages
    accept hosts = +relay_from_hosts
        set acl_m0 = trusted relay

    # skip if message is too large (>4M)
    accept condition = ${if >{$message_size}{4M}}
        set acl_m0 = n/a
        set acl_m1 = skipped, message too large

    # skip if whitelisted in exim
    accept condition = ${if eq{$acl_m2}{skip_spamd}}
        set acl_m0 = n/a
        set acl_m1 = skipped, exim whitelist

    # add spam headers...
    warn spam = nonexistent:true
        set acl_m0 = $spam_score ($spam_bar)
        set acl_m1 = $spam_report
        set acl_m3 = $spam_score_int

    # silently drop spam at high scores (> 12)
    discard log_message = spam detected ($spam_score)
        condition = ${if >{$spam_score_int}{120}{1}{0}}

    # silently discard messages with malware attached
    discard log_message = malware detected ($malware_name)
        demime = *
        malware = *
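
The order of checks in this ACL can be paraphrased in Python. This is only a reading aid under stated assumptions: Exim evaluates the statements top to bottom, 4M is read as 4 × 1024 × 1024 bytes, and $spam_score_int is the spam score multiplied by ten. The function name and the returned labels are hypothetical.

```python
from typing import Optional

SIZE_LIMIT = 4 * 1024 * 1024  # "4M" in the ACL
DISCARD_SCORE_INT = 120       # score 12.0, expressed as score * 10


def acl_data_decision(
    from_relay_host: bool,
    message_size: int,
    whitelisted: bool,
    spam_score_int: int,
    malware_name: Optional[str] = None,
) -> str:
    """Return a label for the first ACL statement that decides the message."""
    if from_relay_host:
        return "accept_unscanned"  # locally submitted, trusted relay
    if message_size > SIZE_LIMIT:
        return "accept_unscanned"  # too large to scan
    if whitelisted:
        return "accept_unscanned"  # exim whitelist
    if spam_score_int > DISCARD_SCORE_INT:
        return "discard_spam"
    if malware_name:
        return "discard_malware"
    return "continue"  # falls through to whatever follows in the ACL


if __name__ == "__main__":
    print(acl_data_decision(False, 20_000, False, 35))
    print(acl_data_decision(False, 20_000, False, 121))
```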


Message tagging

We use Exim filters to tag messages with headers that VRT can match for automatic queue routing. The Exim filters are in /etc/exim4/system_filter (see the inline comments):

# Exim filter

if first_delivery then
    # Remove headers that control OTRS - we don't want these
    headers remove X-OTRS-Priority:X-OTRS-Queue:X-OTRS-Lock:X-OTRS-Ignore:X-OTRS-State
    if $acl_m0 is not "trusted relay" then
        # Remove any SpamAssassin headers and add local ones
        headers remove X-Spam-Score:X-Spam-Report:X-Spam-Checker-Version:X-Spam-Status:X-Spam-Level:X-Spam-Flag
    endif
    if $acl_m0 is not "" and $acl_m0 is not "trusted relay" then
        headers add "X-Spam-Score: $acl_m0"
        headers add "X-Spam-Report: $acl_m1"
        # Add header for OTRS filters
        if $acl_m1 is not "" and $acl_m1 begins "yes" then
            headers add "X-Spam-Flag: YES"
        # overload X-Spam-Flag since OTRS doesn't do numeric comparison
        elif $acl_m3 is not "" and $acl_m3 is above 20 then
            headers add "X-Spam-Flag: MAYBE"
        else
            headers add "X-Spam-Flag: NO"
        endif
        # add a hook for OTRS to filter list mail
        if
            ($message_headers contains "\nList-Id:" or
            $message_headers contains "\nList-Help:" or
            $message_headers contains "\nList-Subscribe:" or
            $message_headers contains "\nList-Unsubscribe:" or
            $message_headers contains "\nList-Post:" or
            $message_headers contains "\nList-Owner:" or
            $message_headers contains "\nList-Archive:") and
            $header_precedence: does not match "^(bulk|junk|list)"
        then
            headers remove Precedence
            headers add "Precedence: bulk"
        endif
    endif
endif
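
The X-Spam-Flag logic can be restated as a small Python function. It reads the filter as an if/elif/else chain (an assumption about the intended structure), takes acl_m1 to be the SpamAssassin report and acl_m3 to be the score multiplied by ten, and treats the "begins" test as case-insensitive. The function name is hypothetical.

```python
from typing import Optional


def spam_flag(report: str, score_int: Optional[int]) -> str:
    """Map the SpamAssassin report and integer score to an X-Spam-Flag value."""
    # The report template starts with Yes/No, so a leading "yes" means spam.
    if report and report.lower().startswith("yes"):
        return "YES"
    # Scores above 2.0 that are still below the spam threshold are marked MAYBE.
    if score_int is not None and score_int > 20:
        return "MAYBE"
    return "NO"


if __name__ == "__main__":
    print(spam_flag("Yes, score=6.1", 61))
    print(spam_flag("No, score=2.5", 25))
    print(spam_flag("No, score=0.3", 3))
```

The three-valued flag exists because, as the filter comment notes, OTRS cannot compare the numeric score itself.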

VRT mail routing

Mail destined for VRT is served by a simple accept router otrs, which does a MySQL database query to determine the validity of the recipient address being routed, similar to the check done earlier by the main MX hosts.

# Mail destined for OTRS
otrs:
        driver = accept
        condition = ${lookup mysql{SELECT value0 FROM system_address WHERE value0='${quote_mysql:$local_part@$domain}'}{true}fail}
        transport = otrs
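
The recipient check performed by this router can be mimicked in Python. The sketch uses an in-memory SQLite database as a stand-in for the MySQL otrs database. Only the table and column names (system_address, value0) come from the router above. The address and the function name are hypothetical.

```python
import sqlite3


def is_vrts_address(conn: sqlite3.Connection, local_part: str, domain: str) -> bool:
    """Return True if local_part@domain is listed in system_address."""
    address = local_part + "@" + domain
    # A bound parameter plays the role of ${quote_mysql:...} in the Exim lookup.
    row = conn.execute(
        "SELECT value0 FROM system_address WHERE value0 = ?", (address,)
    ).fetchone()
    return row is not None


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE system_address (value0 TEXT)")
    # Made-up address used only for the demonstration.
    conn.execute("INSERT INTO system_address VALUES ('info-en@example.org')")
    print(is_vrts_address(conn, "info-en", "example.org"))
    print(is_vrts_address(conn, "nobody", "example.org"))
```

Running the equivalent SELECT by hand from a mail host is a quick way to confirm database access when mail is deferring.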

On success, the message is handed over to the otrs pipe transport:

# OTRS pipe transport
otrs:
        driver = pipe
        command = OTRS_POSTMASTER
        current_directory = OTRS_HOME
        home_directory = OTRS_HOME
        user = OTRS_USER
        group = OTRS_GROUP
        timeout = 1m

This transport pipes the full contents of the message to the command/path specified in the macro OTRS_POSTMASTER (defined at the top of the file). A current and home directory will be set as specified, and the command will be run as the otrs user and group. If the actual execution/invocation fails for some reason, the message will be frozen on the queue with a warning message sent to root. If the command invocation succeeds, but the return code is EX_TEMPFAIL (e.g. when OTRS cannot access the database), the message is deferred/queued, and will be retried later. Any output will be logged.

Outbound mail

Any mail destined for an address that is not an OTRS address, e.g. mail submitted by OTRS itself, will be forwarded to an outbound MX.


The server runs its own ClamAV instance, using the stock clamav-daemon package. The daemon runs as user clamav which has read access to the mail queue via membership in group Debian-exim. Per the stock config, the freshclam daemon is used to update virus definitions.

Exim accesses ClamAV via unix socket at /var/run/clamav/clamd.ctl and silently drops and logs messages containing an infected attachment.


The server runs its own SpamAssassin instance. The stock spamassassin package is used, with daily updates enabled. Stock rules/scores are kept and we make a few local modifications which are listed below.

Multiple user profiles are not used; SpamAssassin reads global configuration settings and runs as user otrs. Training databases are stored in that user's homedir.


# Change to one to enable spamd

# Options
# See man spamd for possible options. The -d option is automatically added.

# SpamAssassin uses a preforking model, so be careful! You need to
# make sure --max-children is not set to anything higher than 5,
# unless you know what you're doing.

OPTIONS="--max-children 8 --nouser-config --listen-ip= -u otrs -g otrs"

# Pid file
# Where should spamd write its PID to file? If you use the -u or
# --username option above, this needs to be writable by that user.
# Otherwise, the init script will not be able to shut spamd down.

# Set nice level of spamd
NICE="--nicelevel 10"

# Cronjob
# Set to anything but 0 to enable the cron job to automatically update
# spamassassin's rules on a nightly basis


Non-stock sections are shown here:

#   Set which networks or hosts are considered 'trusted' by your mail
#   server (i.e. not spammers)
trusted_networks 2620:0:860::/46

# short-format report template, starting with Yes/No, used for OTRS filters
report _YESNO_, score=_SCORE_ | host: _HOSTNAME_ | scores: _TESTSSCORES(,)_ | autolearn=_AUTOLEARN_

#   Set file-locking method (flock is not safe over NFS, but is faster)
lock_method flock

#   Set the threshold at which a message is considered spam (default: 5.0)
required_score 3.5
score RP_MATCHES_RCVD -0.500
score RCVD_IN_RP_SAFE 2.000
score SPF_SOFTFAIL 2.000

SpamAssassin Training

There are a few steps to spam training:

  1. user moves spammy messages to the Junk queue
  2. OTRS Generic Agent:"Export_Spam" runs nightly, filtering for tickets which are not in state "Closed successful" passing MessageIDs to
  3. writes the messages to /var/spool/spam/spam, and changes the ticket's state to "Closed successful"
  4. /usr/local/bin/train_spamassassin picks up /var/spool/spam/spam and feeds it to sa-learn as spam

Ham training is similar:

  1. OTRS Generic Agent:"Export_Ham" runs nightly, filtering for tickets in non-Junk queues which are in state "Open" or "Closed successful" and feeding those TicketIDs to
  2. writes the messages to /var/spool/spam/ham
  3. /usr/local/bin/train_spamassassin picks up /var/spool/spam/ham and feeds it to sa-learn as ham

The two scripts mentioned above are custom and are installed by puppet from operations/puppet/files/otrs/* in the git repository.
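
For orientation, the final training step can be pictured as one sa-learn call per spool file. The sketch below is not the content of train_spamassassin. It only builds the command line, and it assumes the exports are in mbox format (suggested by the name otrs.TicketExport2Mbox). The function name is hypothetical.

```python
def sa_learn_command(spool_file: str, kind: str) -> list:
    """Build an sa-learn command that trains on an mbox file as spam or ham."""
    if kind not in ("spam", "ham"):
        raise ValueError("kind must be 'spam' or 'ham'")
    # --spam/--ham pick the class; --mbox treats the input as an mbox file.
    return ["sa-learn", "--" + kind, "--mbox", spool_file]


if __name__ == "__main__":
    for kind in ("spam", "ham"):
        print(" ".join(sa_learn_command("/var/spool/spam/" + kind, kind)))
```

To actually run such a command, the list could be passed to subprocess.run under the account that owns the Bayes database.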


This section is general guidance for a patchlevel update only. Upgrades can be complicated by database schema changes and other issues. There's really no way around reading the upgrade documentation, and testing the updates on a real system based on our existing configuration and database.

  1. fetch new otrs, install to /opt/otrs-X.Y.Z
  2. stop puppet, apache, exim, and otrs cronjobs
    • :~# puppet agent --disable
    • :~# service apache2 stop
    • :~# service exim4 stop
    • :~# service cron stop
  3. switch symlink, copy config into new code tree, fix permissions
    • :/opt# rm otrs && ln -s otrs-VERSION otrs
    • :/opt# cp otrs-PREVIOUS_VERSION/Kernel/ otrs/Kernel/
    • :/opt# ./otrs/bin/ --otrs-user=otrs --otrs-group=otrs --web-user=www-data --web-group=www-data /opt/otrs
  4. check DB schema and upgrade as necessary
    • :/opt# ./otrs/bin/
    • follow database upgrade instructions in
  5. restart apache, log in as an admin to the web interface and reinstall all addon packages
    • :/opt# service apache2 start
    • ADMIN -> System Administration -> Package Manager
    • Reinstall will be under the ACTION column for each package
  6. test functionality
  7. restart exim, manually run puppet (which installs some scripts and reenables cron jobs)
    • :~# service exim4 start
    • :~# service cron start
    • :~# puppet agent --enable
    • :~# puppet agent -tv
    • send a mail to e.g. info-en and check that it shows up in OTRS


Our Znuny installation is almost fully database only. That is, all data is stored in a MySQL database (m2 shard), and only configuration and code are stored locally on the hosting server. Znuny is open source, so the code is almost impossible to lose, and most of the configuration is stored in puppet. A few configuration items are stored only on the server, but this is temporary and they are eventually transferred to puppet. Hence we only care about the database being safe and backed up.

The database is backed up once per week (currently on Wednesday). The infrastructure used is Bacula, and most documentation from that page applies. The code doing the pre dump is in, and bacula just backs up the resulting file.

Restoring to a previous point in time is quite easy: restore the dump from Bacula (covered in Bacula) and apply it to the db server via the mysql command. Restoring individual items (such as a deleted article) is possible but complicated, and requires a DBA to help isolate the specific transaction and avoid replaying it while replaying the logs between the last backup and the time of the incident. This has never been done, nor been required, up to now.