Obsolete:Media server

Plans for new architecture: Media server/2011 Media Storage plans

Deploying a new media server

Checklist of things that need to be done when setting up a media server (from scratch, no jumpstart):

  • Base Solaris 10 install
    fix timezone, get /tmp mounted, fix dump and swap sizes
  • Convert root filesystem to use ZFS
    fix grub menu
  • Set up raid
  • Install pkgtool from sourceforge
    root home dir /root, get ssh keys over
    add local nonroot user with path /export/home/username and install privs
  • Install from wmf spec files: pkgtool, pca
    set proxy environ var for this, do as nonroot user
  • Check all patches, install using pca
    set proxy environ var for this, need sun account with contract
  • Install from wmf spec files: screen, netcat
    set proxy environ var for this, do as nonroot user
  • Copy media data and unpack
    aggregate nge0 and nge1
  • Install Sun Java Webserver 7
    install 1,2,3,4, put in /opt/webserver7/... with /export/upload as path
    turn off web admin server and enable regular web server by editing xml files
  • Install 64-bit dtrace plugin
    64-bit copy from ...
  • Install from wmf spec files: php, ganglia, libogg, libvorbis
    copy /etc/gmond.conf from...
    enable gmetrics in crontab by...
  • Set up cron jobs for replication and snapshots if this is a master

Transitioning from one server to another

Basic procedure for switching to another media server, presuming it has been receiving data via replication:

  • Halt uploads and deletes/restores temporarily
  • Turn off replication and snapshots on live media server
  • Do replication manually to new host to copy over data since last push
  • Mount the new /export/upload filesystem everywhere with a new mount point
  • Change /home/wikipedia/bin/scap so that it uses the new mount point
  • Change the mount point everywhere in CommonSettings.php and InitialiseSettings.php
  • Change the squid settings in upload-settings.php to use new host
  • Check we can read media as things are now
  • Stop webserver on old media server, check we can read media
  • Unmount old /export/upload everywhere, check we can read media
  • Turn on uploads
  • Test uploading of images
  • Change puppet settings for ExtensionDistributor and nfs mount point checks to use the new filesystem (misc-servers.pp, nfs.pp)
  • Test ExtensionDistributor, make sure it works
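
The mount-point change in CommonSettings.php and InitialiseSettings.php can be sketched as a small shell helper. This is only an illustration: the function name, the .bak suffix and the new mount point /mnt/upload7 in the usage line are assumptions, not part of the original procedure or of any existing tooling.

```shell
#!/bin/sh
# Sketch only: rewrite an old mount point to a new one in a set of config files.
# switch_mount_point and the ".bak" suffix are illustrative names.

switch_mount_point() {
    old=$1
    new=$2
    shift 2
    for f in "$@"; do
        # Keep a copy so the change can be reverted by hand.
        cp -p "$f" "$f.bak" || return 1
        # "|" is the sed delimiter because the paths contain "/".
        # The old path is treated as a regex, so a "." in it matches any character.
        # Writing into the existing file keeps its permissions.
        sed "s|$old|$new|g" "$f.bak" > "$f" || return 1
    done
    # Print any file that still mentions the old mount point and fail if one does.
    if grep -l "$old" "$@"; then
        return 1
    fi
    return 0
}

# Example (hypothetical new mount point):
#   switch_mount_point /mnt/upload6 /mnt/upload7 CommonSettings.php InitialiseSettings.php
```

The final grep is a cheap check that no reference to the old path was missed. It reports a false failure if the new path happens to contain the old one as a prefix.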

Transitioning from one nfs mount to another (i.e. ms7 to nas1), Oct 2012

If we go with a new mountpoint:

  • Toss this line: Alias /centralnotice/ /mnt/upload6/centralnotice/ from /etc/apache2/wmf/wikimedia.conf (and make sure centralnotice still works)
  • Mount the new /export/upload filesystem everywhere with a new mount point, using Puppet
    Make sure the netapp exports are set up right for the volume
  • Halt uploads and deletes/restores temporarily
    on fenari, make a copy of /home/w/common/wmf-config/InitialiseSettings.php
    in that same file, set the values for wgEnableUploads for the wikis that have 'true' to 'false'
    sync-common-file to push that around
    wait 5-10 minutes for in-progress uploads to finish
  • Change the mount point everywhere in CommonSettings.php, InitialiseSettings.php (saved copy and new copy), filebackend.php, extdist/svn-invoke.conf
  • Change the settings for it in manifests/misc-servers.pp:misc::extension-distributor and push to fenari
  • Check that public and private media, ext-dist, math and timeline (did those move to swift yet?), captcha are working
  • Stop webserver on ms7, see if the above are still all good -- err, no, guess not (captcha plus some cached pages with math imgs that still get served by ms7)
  • Turn on uploads/deletes/restores
    copy back the saved copy of InitialiseSettings.php on fenari to /home/w/common/wmf-config/
    sync-common-file to push that around
  • Check that public and private media writes are working and that writes also go to the netapp
  • Unmount old /export/upload6 everywhere -- maybe not since we can't move captcha!!
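
The save and restore of InitialiseSettings.php around the read-only window could look roughly like the sketch below. The function names and the .saved suffix are made up for illustration. Editing wgEnableUploads and running sync-common-file remain manual steps, and the saved copy still needs the mount point change before it is restored.

```shell
#!/bin/sh
# Sketch only: keep a pristine copy of a settings file for the read-only window
# and put it back afterwards. Names and the ".saved" suffix are illustrative.

save_settings() {
    conf=$1
    # Refuse to overwrite a saved copy left over from an earlier, unfinished window.
    if [ -e "$conf.saved" ]; then
        echo "saved copy already exists: $conf.saved" >&2
        return 1
    fi
    cp -p "$conf" "$conf.saved"
}

restore_settings() {
    conf=$1
    if [ ! -e "$conf.saved" ]; then
        echo "no saved copy: $conf.saved" >&2
        return 1
    fi
    # Copy the saved version back, then remove it so the next window starts clean.
    cp -p "$conf.saved" "$conf" && rm "$conf.saved"
}

# Example, using the path named in the checklist:
#   save_settings /home/w/common/wmf-config/InitialiseSettings.php
#   ... set wgEnableUploads to false, sync, do the switch ...
#   restore_settings /home/w/common/wmf-config/InitialiseSettings.php
```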

(Please fill in with what's missing, etc)

Additional notes from Aaron's email:

  • I'd use the filebackend.php 'readOnly' option instead of wgEnableUploads (which is less robust). You could use both of course. 'readOnly' would need to be set for each multiwrite backend and is to be set to an explanatory English string.
  • Someone probably just forgot to add this to the list, but a final rsync (with --update and *without* --delete) is needed afterwards since the netapps are not fully up to date with ms7. This does not need to be during any read-only time. It needs --update so it won't nuke newer file changes made since the rsync. It can't have --delete since the newest uploads will be in the netapps but not ms7.
  • After the rsync, I'll need to run syncFileBackend.php to pick up updates since the first netapp rsync began. I've already generated a list of position files for all wikis in my home directory with the position from 10 days ago (the 25th, for good measure). This will basically clean up after the rsync above in terms of file deletions.
    • This should be fine since mark started the rsync on the 29th according to the server admin log.
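
The final rsync described in these notes might be wrapped as shown below. This is a sketch under assumptions: the function name is invented, and the source and destination paths in the usage line are placeholders, since the notes do not give the paths actually used.

```shell
#!/bin/sh
# Sketch only: one-way catch-up copy from the old server to the new storage.
# final_sync is an illustrative name; the paths passed to it are up to the operator.

final_sync() {
    src=$1
    dst=$2
    # -a        recurse and preserve times and permissions where possible
    # --update  skip any file whose copy on the destination is newer
    # no --delete, so files that exist only on the destination are left alone
    rsync -a --update "$src"/ "$dst"/
}

# Example (placeholder paths):
#   final_sync /export/upload /mnt/new-upload
```

With --update, a file uploaded to the new storage after the first rsync keeps its newer contents. Leaving out --delete means uploads that only exist on the new storage are not removed.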

Since reads basically come from swift for uploads, timelines, and math renderings, users should not notice the temporary inconsistencies between the netapps and swift. Writes will transparently sync the netapps on the fly to match swift for whatever files the write operation affects or needs.

Misc

To get live access logs on the media or thumb servers you can run the command

dtrace -qs access_log.d

which relies on the dtrace nsapi plugin and the file /opt/local/share/access_log.d

These are built from mediawiki/trunk/tools/nsapi-dtrace in case you need the sources.