
Cron jobs

Cron redirects here. You might be looking for Help:Toolforge/Grid#Scheduling jobs at regular intervals with cron or Help:Toolforge/Kubernetes#Kubernetes cronjobs.

Note: the job queue runs continuously on many servers and is not a cron job.

Manual cron jobs

QueryPage update

hume:/etc/cron.d/mw-update-special-pages: Updates the special pages derived from QueryPage. Both jobs run under flock -n, so a scheduled run is silently skipped if the previous one still holds the lock.

PATH=/usr/local/bin:/bin:/usr/bin
00 4 */3 * * apache flock -n /var/lock/update-special-pages-small /usr/local/bin/update-special-pages-small > /home/wikipedia/logs/norotate/updateSpecialPages-small.log 2>&1
00 5 */3 * * apache flock -n /var/lock/update-special-pages /usr/local/bin/update-special-pages > /home/wikipedia/logs/norotate/updateSpecialPages.log 2>&1

update-special-pages-small

#!/bin/bash
# Run updateSpecialPages.php for every wiki listed in small.dblist.

cd /home/wikipedia/common/multiversion
for db in $(</home/wikipedia/common/small.dblist); do
	echo "$db"
	php MWScript.php updateSpecialPages.php "$db"
	echo
	echo
done

update-special-pages

#!/bin/bash
# Run updateSpecialPages.php for every wiki listed in all.dblist.

cd /home/wikipedia/common/multiversion
for db in $(</home/wikipedia/common/all.dblist); do
	echo "$db"
	php MWScript.php updateSpecialPages.php "$db"
	echo
	echo
done

Tor exit list update

hume:/etc/cron.d/mw-tor-list: Loads the Tor exit node list from check.torproject.org every 20 minutes and saves it into memcached for later use by the TorBlock extension.

PATH=/usr/local/bin:/bin:/usr/bin
*/20 * * * * apache php /home/wikipedia/common/multiversion/MWScript.php extensions/TorBlock/loadExitNodes.php aawiki 2>&1

FlaggedRevs stats update

hume:/etc/cron.d/mw-flagged-revs: Updates the flaggedrevs_stats table every two hours.

0 */2 * * * /home/wikipedia/common/php/extensions/FlaggedRevs/maintenance/wikimedia-periodic-update.sh 2>&1

wikimedia-periodic-update.sh

#!/bin/bash
# Update FlaggedRevs statistics for every wiki listed in flaggedrevs.dblist.
for db in $(</home/wikipedia/common/flaggedrevs.dblist); do
	echo "$db"
	php -n /home/wikipedia/common/php/extensions/FlaggedRevs/maintenance/updateStats.php "$db"
done

Ganglia RRD commit

zwinger:/etc/cron.hourly/save-gmetad-rrds: The live RRD files for Ganglia are kept in a tmpfs for performance reasons; this hourly script copies them back to disk so they survive a server restart.

#!/bin/sh
/usr/local/bin/save-gmetad-rrds >> /var/log/save-gmetad-rrds.log 2>&1

save-gmetad-rrds

#!/bin/bash
# Stop gmetad while copying so the RRDs on disk are consistent.
service gmetad_pmtpa stop
echo "Saving RRDs..."
time rsync -a /mnt/ganglia_tmp/rrds.pmtpa/ /var/lib/ganglia/rrds.pmtpa
echo "Done"
service gmetad_pmtpa start

LDAP server backups

nfs1/2:/usr/local/sbin/opendj-backup.sh: Runs OpenDJ backups and stores them in /var/opendj/backup for pickup by amanda; cleans up backups older than three days.

0 18 * * * /usr/local/sbin/opendj-backup.sh > /dev/null 2>&1
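
The backup script itself is not reproduced on this page. A minimal sketch of the behaviour described above, assuming OpenDJ's bundled backup utility is installed under /usr/opendj/bin (path and flags here are illustrative, not copied from the real script):

#!/bin/bash
# Illustrative sketch only -- not the actual opendj-backup.sh.
BACKUP_DIR=/var/opendj/backup

# Take a compressed backup of all backends into the pickup directory.
/usr/opendj/bin/backup --backUpAll --compress --backupDirectory "$BACKUP_DIR"

# Clean up backups older than three days so amanda only sees fresh ones.
find "$BACKUP_DIR" -type f -mtime +3 -delete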

SVN crons

?? (still runs? was formey):/usr/local/bin/svndump.php: Runs SVN dumps and stores them in /svnroot/bak for pickup by amanda; cleans up previous dump.

0 18 * * * /usr/local/bin/svndump.php > /dev/null 2>&1
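
The dump script is likewise not reproduced here. The logic described above, sketched in shell (the repository path and dump file name are assumptions, not taken from the real svndump.php):

#!/bin/bash
# Illustrative sketch only -- not the actual svndump.php.
REPO=/svnroot
BAK=/svnroot/bak

# Drop the previous dump, then write a fresh full dump for amanda to pick up.
rm -f "$BAK/svn.dump.gz"
svnadmin dump "$REPO" | gzip > "$BAK/svn.dump.gz"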

?? (still runs? was formey):(mwdocs)/home/mwdocs/phase3/maintenance/mwdocgen.php: Updates the doxygen documentation for svn.

0 0 * * * (cd /home/mwdocs/phase3 && svn up && php maintenance/mwdocgen.php --all) >> /var/log/mwdocs.log 2>&1

antimony:(www-data) svn up: Updates the userinfo file

0 0 * * * (cd /var/cache/svnusers && svn up) > /dev/null 2>&1

Puppetized cron jobs

Puppet configuration files can be found in the operations/puppet repo.

Apache

apaches::cron

class apaches::cron {
        cron {
                synclocalisation:
                        command =>"rsync -a --delete 10.0.5.8::common/php/cache/l10n/ /usr/local/apache/common/php/cache/l10n/",
                        user => root,
                        hour => 3,
                        minute => 0,
                        ensure => present;
                cleanupipc:
                        command => "ipcs -s | grep apache | cut -f 2 -d \\  | xargs -rn 1 ipcrm -s",
                        user => root,
                        minute => 26,
                        ensure => present;
                updategeoipdb:
                        environment => "http_proxy=http://brewster.wikimedia.org:8080",
                        command => "[ -d /usr/share/GeoIP ] && wget -qO - http://geolite.maxmind.com/download/geoip/database/GeoLiteCountry/GeoIP.dat.gz | gunzip > /usr/share/GeoIP/GeoIP.dat.new && mv /usr/share/GeoIP/GeoIP.dat.new /usr/share/GeoIP/GeoIP.dat",
                        user => root,
                        minute => 26,
                        ensure => absent;
                cleantmpphp:
                        command => "find /tmp -name 'php*'  -ctime +1 -exec rm -f {} \\;",
                        user => root,
                        hour => 5,
                        minute => 0,
                        ensure => present;
        }
}

Backup

backup::server

cron {
                amanda_daily:
                command =>      "/usr/sbin/amdump Wikimedia-Daily",
                require =>      Package["amanda-server"],
                user    =>      backup,
                hour    =>      2,
                minute  =>      0;

                amanda_weekly:
                command =>      "/usr/sbin/amdump Wikimedia-Weekly",
                require =>      Package["amanda-server"],
                user    =>      backup,
                hour    =>      6,
                minute  =>      0,
                weekday =>      Sunday;

                amanda_monthly:
                command =>      "/usr/sbin/amdump Wikimedia-Monthly",
                require =>      Package["amanda-server"],
                user    =>      backup,
                hour    =>      12,
                minute  =>      0,
                monthday =>     1;
        }

backup::mysql

cron {
                snaprotate:
                command =>      "/usr/local/sbin/snaprotate.pl -a swap -V tank -s data -L 20G",
                user    =>      root,
                hour    =>      1,
                minute  =>      0;
        }

Puppet

base::puppet

 # Keep puppet running
        cron {
                restartpuppet:
                        require => File[ [ "/etc/default/puppet" ] ],
                        command => "/etc/init.d/puppet restart > /dev/null",
                        user => root,
                        hour => 2,
                        minute => 37,
                        ensure => present;
                remove-old-lockfile:
                        require => Package[puppet],
                        command => "[ -f /var/lib/puppet/state/puppetdlock ] && find /var/lib/puppet/state/puppetdlock -ctime +1 -delete",
                        user => root,
                        minute => 43,
                        ensure => present;
        }

misc::puppetmaster

cron {
                updategeoipdb:
                        environment => "http_proxy=http://brewster.wikimedia.org:8080",
                        command => "wget -qO - http://geolite.maxmind.com/download/geoip/database/GeoLiteCountry/GeoIP.dat.gz | gunzip > /etc/puppet/files/misc/GeoIP.dat.new && mv /etc/puppet/files/misc/GeoIP.dat.new /etc/puppet/files/misc/GeoIP.dat; wget -qO - http://geolite.maxmind.com/download/geoip/database/GeoLiteCity.dat.gz | gunzip > /etc/puppet/files/misc/GeoIPcity.dat.new && mv /etc/puppet/files/misc/GeoIPcity.dat.new /etc/puppet/files/misc/GeoIPcity.dat",
                        user => root,
                        hour => 3,
                        minute => 26,
                        ensure => present;
        }

DNS

dns::auth-server

 # Update ip map file

        cron { "update ip map":
                command => "rsync -qt 'rsync://countries-ns.mdc.dk/zone/zz.countries.nerd.dk.rbldnsd' /etc/powerdns/ip-map/zz.countries.nerd.dk.rbldnsd && pdns_control rediscover > /dev/null",
                user => pdns,
                hour => 4,
                minute => 7,
                ensure => present;
        }

dns::recursor

cron { pdnsstats:
                        command => "cd /var/www/pdns && /usr/local/powerdnsstats/update && /usr/local/powerdnsstats/makegraphs >/dev/null",
                        user => root,
                        minute => '*/5';
                }

Image scaler

imagescaler::cron

cron { removetmpfiles:
                command => "for dir in /tmp /a/magick-tmp; do find \$dir -type f \\( -name 'gs_*' -o -name 'magick-*' \\) -cmin +60 -exec rm -f {} \\;; done",
                user => root,
                minute => '*/5',
                ensure => present
        }

LDAP

ldap::server

cron {
                "opendj-backup":
                        command =>      "/usr/local/sbin/opendj-backup.sh > /dev/null 2>&1",
                        require =>      File["/usr/local/sbin/opendj-backup.sh"],
                        user    =>      opendj,
                        hour    =>      18,
                        minute  =>      0;
        }

MediaWiki

mediawiki::maintenance

To run a MediaWiki maintenance script regularly in production, create a Puppet file in modules/mediawiki/manifests/maintenance/ that is a subclass of mediawiki::maintenance, then include it from modules/profile/manifests/mediawiki/maintenance.pp. See https://gerrit.wikimedia.org/r/#/c/operations/puppet/+/395694/ for an example.
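
A minimal sketch of that pattern (the class name, maintenance script, and schedule below are hypothetical, for illustration only):

# modules/mediawiki/manifests/maintenance/updateexample.pp
# Hypothetical class -- the name and script are made up for illustration.
class mediawiki::maintenance::updateexample( $ensure = present ) {
    cron { 'updateexample':
        ensure  => $ensure,
        user    => 'www-data', # assumption: match whatever user the other jobs run as
        command => '/usr/local/bin/mwscript updateExample.php --wiki=aawiki > /dev/null 2>&1',
        hour    => 4,
        minute  => 20,
    }
}

# modules/profile/manifests/mediawiki/maintenance.pp then includes it:
class profile::mediawiki::maintenance {
    # ...
    include ::mediawiki::maintenance::updateexample
}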

Misc

misc::extension-distributor

cron { extdist_updateall:
                command => "cd $extdist_working_dir/mw-snapshot; for branch in trunk branches/*; do /usr/bin/svn cleanup \$branch/extensions; /usr/bin/svn up \$branch/extensions > /dev/null; done",
                minute => 0,
                user => extdist,
                ensure => present;
        }

misc::nfs-server::home

cron { home-rsync:
                        require => File["/root/.ssh/home-rsync"],
                        command => '[ -d /home/wikipedia ] && rsync --rsh="ssh -c blowfish-cbc -i /root/.ssh/home-rsync" -azu /home/* db20@tridge.wikimedia.org:~/home/',
                        user => root,
                        hour => 2,
                        minute => 35,
                        weekday => 6,
                        ensure => present;
                }

ubuntu::mirror

# Mirror update cron entry
                cron { update-ubuntu-mirror:
                        require => [ Systemuser[mirror], File["update-ubuntu-mirror"] ],
                        command => "/usr/local/sbin/update-ubuntu-mirror > /dev/null",
                        user => mirror,
                        hour => '*/6',
                        minute => 43,
                        ensure => present;
                }

misc::kiwix-mirror

cron { kiwix-mirror-update:
                command => "rsync -vzrlptD  download.kiwix.org::download.kiwix.org/zim/0.9/ /data/kiwix/zim/0.9/ >/dev/null 2>&1",
                user => mirror,
                minute => '*/15',
                ensure => present;
        }