Maps
This page describes the technical aspects of deploying the Maps service on Wikimedia Foundation infrastructure.
Intro
The maps service consists of Kartotherian, a Node.js service that serves map tiles; Tilerator, a non-public service that prepares vector tiles (data blobs) from the OSM database into Cassandra storage; and TileratorUI, an interface to manage Tilerator jobs. There are four servers in the maps group, maps-test200{1,2,3,4}.codfw.wmnet, that run Kartotherian (port 6533, NCPU instances), Tilerator (port 6534, NCPU/2 instances), and TileratorUI (port 6535, 1 instance). There are also two Varnish servers in the cache_maps group: cp104{3,4}.eqiad.wmnet.
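As a quick sanity check that each service is up on its port, you can query them from one of the maps hosts; this is a sketch, assuming the standard service-runner /_info endpoint is exposed:
curl http://localhost:6533/_info   # Kartotherian
curl http://localhost:6534/_info   # Tilerator
curl http://localhost:6535/_info   # TileratorUI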
Monitoring
- Usage dashboard
- Usage - varnish
- Kartotherian - Grafana
- Kartotherian - Logstash
- Maps Cassandra - Logstash
- Tilerator - Grafana
- Tilerator - Logstash
- Ganglia - Maps Varnish cluster (eqiad)
- Ganglia - Maps cluster (codfw)
- Free disk space
Importing database
maps2001 is actually not the best server for this; we should swap its role with maps2002, which has 12 cores and 96GB RAM.
- From http://planet.osm.org/pbf/, find the file with the latest available date, but do NOT use "latest", as that might change at any moment.
- curl -x webproxy.eqiad.wmnet:8080 -O http://planet.osm.org/pbf/planet-151214.osm.pbf.md5
- curl -x webproxy.eqiad.wmnet:8080 -O http://planet.osm.org/pbf/planet-151214.osm.pbf
- md5sum -c planet-151214.osm.pbf.md5
- PGPASSWORD="$(< ~/osmimporter_pass)" osm2pgsql --create --slim --flat-nodes nodes.bin -C 40000 --number-processes 8 --hstore planet-151214.osm.pbf -H maps-test2001 -U osmimporter -d gis
Notes
- Tables are created by osm2pgsql, no need for an initial DDL script.
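To spot-check the import, you can list the tables osm2pgsql created; a minimal sketch, assuming the default planet_osm_* table prefix:
sudo -u postgres psql gis -c '\dt planet_osm_*'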
Kartotherian
Kartotherian serves map tiles by getting vector data from Cassandra, applying the style to it, and returning raster images. It is also capable of serving a "static image" - a map with a given width/height/scaling/zoom - and can serve vector tiles directly for on-the-client rendering (WebGL maps).
To see the tiles without the Varnish cache, connect to Kartotherian using an ssh tunnel, e.g. ssh -L 6533:localhost:6533 maps-test2001.codfw.wmnet, and browse to http://localhost:6533.
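Individual tiles can then be fetched through the tunnel with curl; a minimal sketch, assuming a source named v2 and the usual {source}/{zoom}/{x}/{y} URL scheme:
curl -o tile.png http://localhost:6533/v2/0/0/0.png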
Tilerator
Tilerator is a backend vector tile pre-generation service that picks up jobs from a Redis job queue, converting data from a Postgres DB, using SQL queries, into vector tiles stored in Cassandra. Postgres DBs are set up on each of the maps hosts, one master and 3 slaves. Technically, Tilerator is not even a generator, but rather a "batch copying" service, which takes tiles from one configured source (e.g. a tile generator from SQL) and puts them into another source (e.g. the Cassandra tile store).
TileratorUI
TileratorUI is used to add jobs to the Tilerator job queue. Actually, TileratorUI is the same code as Tilerator, but started with a different configuration. Connect to TileratorUI using an ssh tunnel, e.g. ssh -L 6535:localhost:6535 maps-test2001.codfw.wmnet, and navigate to http://localhost:6535. There, you can view any style (use set style to change it), or schedule a job by setting all relevant fields and Control+Clicking the tile you want to schedule.
See full Tilerator documentation for all commands & parameters.
Dynamic Tile Sources
Cassandra
To create a new Cassandra data source, POST something like this to /sources as a text body. The default table name is tiles. If the table or keyspace is not there, you have to use the createIfMissing parameter.
v2a:
uri: cassandra://
params:
keyspace: v2
table: tiles2a
cp: [maps-test2001.codfw.wmnet, maps-test2002.codfw.wmnet, maps-test2003.codfw.wmnet, maps-test2004.codfw.wmnet]
username: {var: cassandra-user}
password: {var: cassandra-pswd}
# repfactor: 4
# durablewrite: 0
# createIfMissing: true
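A sketch of such a POST through the TileratorUI ssh tunnel, assuming the YAML above has been saved as newsource.yaml (/sources is the endpoint mentioned above):
curl -X POST --data-binary @newsource.yaml http://localhost:6535/sources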
Dynamic Layer Generator
To generate just a few layers from the database, create a layer filter and a layer mixer:
gentmp:
uri: bridge://
xml:
npm: ["osm-bright-source", "data.xml"]
xmlSetDataSource:
if:
dbname: gis
host: ""
type: postgis
set:
host: localhost
user: {var: osmdb-user}
password: {var: osmdb-pswd}
xmlLayers: [admin, road]
mixtmp:
uri: layermixer://
params:
sources: [{ref: v2}, {ref: gentmp}]
Once set, POST a job to copy mixtmp into the v2 storage, e.g.
src=mixtmp dst=v2 baseZoom=0 fromZoom=5 beforeZoom=6 parts=10
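The parameter string above is what you enter in the TileratorUI form. The equivalent direct HTTP call would look roughly like the sketch below; the /add endpoint path is an assumption, so check the Tilerator documentation for the actual route:
curl -X POST 'http://localhost:6535/add?src=mixtmp&dst=v2&baseZoom=0&fromZoom=5&beforeZoom=6&parts=10'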
Generating Tiles
Generate all tiles for zooms 0..7, using the gen generator, saving everything into v3, including the solid tiles, up to 4 jobs per zoom:
src=gen dst=v3 parts=4 baseZoom=0 fromZoom=0 beforeZoom=8 saveSolid=1
Generate tiles only if they already exist in the v2 source, and save them into v3, on zooms 8..15, 60 jobs per zoom:
src=gen dst=v3 parts=60 baseZoom=0 fromZoom=8 beforeZoom=16 sourceId=v2
Bulk Copying
The fastest way to copy a large number of tiles from one source to another is to use a large number of parts and specify saveSolid=true (this skips solid tile detection). E.g. to copy all z16 tiles from v2 to v3, use
src=v2 dst=v3 zoom=16 parts=60 saveSolid=true
Postgres
- Clear the Postgres data directory and init the database from a backup (replace maps2001.codfw.wmnet with the Postgres master):
rm -rf /srv/postgresql/9.4/main/* && sudo -u postgres pg_basebackup -X stream -D /srv/postgresql/9.4/main/ -h maps2001.codfw.wmnet -U replication -W
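- After the base backup completes, the slave needs a recovery.conf in the data directory before Postgres is started. A minimal sketch for Postgres 9.4 streaming replication, assuming this file is not already managed by puppet (REPLICATION_PASSWORD is a placeholder):
cat > /srv/postgresql/9.4/main/recovery.conf <<'EOF'
standby_mode = 'on'
primary_conninfo = 'host=maps2001.codfw.wmnet user=replication password=REPLICATION_PASSWORD'
EOF
chown postgres:postgres /srv/postgresql/9.4/main/recovery.conf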