
Maps

Revision as of 13:00, 3 April 2020

This page describes the technical aspects of deploying the Maps service on Wikimedia Foundation infrastructure.

Intro

Maps service component diagram
Maps service deployment diagram

The maps service consists of Kartotherian - a Node.js service that serves map tiles, Tilerator - a non-public service that prepares vector tiles (data blobs) from the OSM database and stores them in Cassandra, and TileratorUI - an interface to manage Tilerator jobs. There are four servers in the maps group: maps-test200{1,2,3,4}.codfw.wmnet, which run Kartotherian (port 6533, NCPU instances), Tilerator (port 6534, NCPU/2 instances), and TileratorUI (port 6535, 1 instance). There are also four Varnish servers per datacenter in the cache_maps group.

The infrastructure

  • OSM Database (PostgreSQL): https://wikitech.wikimedia.org/wiki/Maps/OSM_Database
  • Tile storage (Cassandra)
  • Kartotherian
  • Tilerator

Monitoring

  • Tilerator - Grafana: https://grafana.wikimedia.org/dashboard/db/service-tilerator
  • Tilerator - Logstash: https://logstash.wikimedia.org/#/dashboard/elasticsearch/tilerator

Importing database

Note: maps2001 is actually not the best server for this - we should switch it around with maps2002, as it has 12 cores and 96GB RAM.

  • From https://planet.openstreetmap.org/pbf/, find the file with the latest available date, but do NOT use "latest", as that might change at any moment.
  • curl -x webproxy.eqiad.wmnet:8080 -O https://planet.openstreetmap.org/pbf/planet-151214.osm.pbf.md5
  • curl -x webproxy.eqiad.wmnet:8080 -O https://planet.openstreetmap.org/pbf/planet-151214.osm.pbf
  • md5sum -c planet-151214.osm.pbf.md5
  • sudo -u osmupdater PGPASSWORD="$(< ~/osmimporter_pass)" osm2pgsql --create --slim --flat-nodes nodes.bin -C 40000 --number-processes 8 --hstore planet-151214.osm.pbf -H localhost -U osmupdater -d gis
  • Additional steps to import shapes and create some indexes / functions / etc. are documented in the Kartotherian sources: https://github.com/kartotherian/osm-bright.tm2source#install
    • Note: after importing the water polygons, permissions will need to be granted once again to allow users to read water_polygons: sudo -u postgres psql -d gis -c 'GRANT SELECT ON ALL TABLES IN SCHEMA public TO kartotherian, tilerator, tileratorui;'

Notes

  • Tables are created by osm2pgsql, no need for an initial DDL script.

Kartotherian

Kartotherian serves map tiles by getting vector data from Cassandra, applying the style to it, and returning raster images. It is also capable of serving a "static image" - a map with a given width/height/scaling/zoom - and can serve vector tiles directly for on-the-client rendering (WebGL maps).

To see the tiles without Varnish cache, connect to Kartotherian using an ssh tunnel, e.g. ssh -L 6533:localhost:6533 maps-test2001.codfw.wmnet and browse to http://localhost:6533
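For a quick smoke test through the tunnel, you can request a single tile with curl. The osm-intl style name and the z/x/y path below are assumptions based on the public maps.wikimedia.org tile URLs; adjust them to whatever styles the local configuration actually exposes.

 curl -o tile.png http://localhost:6533/osm-intl/0/0/0.png   # fetch the zoom-0 world tile through the ssh tunnel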

Tilerator

Tilerator is a backend vector tile pre-generation service that picks up jobs from a Redis job queue, rendering data from a Postgres DB via SQL queries into vector tiles stored in Cassandra. Postgres DBs are set up on each of the maps hosts, one master and 3 slaves. Technically, Tilerator is not even a generator but rather a "batch copying" service, which takes tiles from one configured source (e.g. a tile generator backed by SQL) and puts them into another source (e.g. the Cassandra tile store).

TileratorUI

TileratorUI is used to inspect maps, including internal data sources, and to add jobs to the Tilerator job queue. TileratorUI is actually the same code as Tilerator, but started with a different configuration. Connect to TileratorUI using an ssh tunnel, e.g. ssh -L 6535:localhost:6535 maps-test2001.codfw.wmnet, and navigate to http://localhost:6535. There, you can view any style (use set style to change it), or schedule a job by setting all relevant fields and Control+clicking the tile you want to schedule.


Quick cheat sheet

  • Style is specified in the upper left corner.
    • Set it to genview to view tiles generated on the fly. Caution: if you zoom out to low zoom levels, tiles can take more than 10 minutes to generate.
  • Alt+click (Option+click on Mac) on the map to enqueue regeneration jobs.
    • This requires src and dst to be set. For the most basic operation, on-demand regeneration of tiles, set src to gen and dst to whatever Cassandra keyspace is used for tile storage (currently v4).
    • By default, only the tile clicked on will be regenerated.
    • Set fromZ and beforeZ to regenerate a range of zoom levels under the clicked tile.
  • Click on source to view the currently active sources configuration.

See full Tilerator documentation for all commands & parameters.

Dynamic tile sources

Cassandra

To create a new Cassandra data source, post something like this to /sources as a text body. The default table name is tiles. If the table or keyspace does not exist, you have to use the createIfMissing parameter.

v2a:
  uri: cassandra://
  params:
    keyspace: v2
    table: tiles2a
    cp: [maps-test2001.codfw.wmnet, maps-test2002.codfw.wmnet, maps-test2003.codfw.wmnet, maps-test2004.codfw.wmnet]
    username: {var: cassandra-user}
    password: {var: cassandra-pswd}
#    repfactor: 4
#    durablewrite: 0
#    createIfMissing: true
#    copyInfoFrom: {ref: gen}
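The snippet above can be submitted with curl once it is saved to a file. This is a rough sketch: it assumes the TileratorUI ssh tunnel on port 6535 described earlier and a hypothetical local file new-source.yaml holding the configuration; the exact port and content handling may differ in your setup.

 curl -X POST --data-binary @new-source.yaml http://localhost:6535/sources   # register the new source with the running service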

Dynamic Layer Generator

To generate just a few layers from the database, create a layer filter and a layer mixer:

gentmp:
  uri: bridge://
  xml:
    npm: ["osm-bright-source", "data.xml"]
  xmlSetDataSource:
    if:
      dbname: gis
      host: ""
      type: postgis
    set:
      host: localhost
      user: {var: osmdb-user}
      password: {var: osmdb-pswd}
  xmlLayers: [admin, road]

mixtmp:
  uri: layermixer://
  params:
    sources: [{ref: v2}, {ref: gentmp}]

Once set, POST a job to copy mixtmp into the v2 storage, e.g.

src=mixtmp dst=v2 baseZoom=0 fromZoom=5 beforeZoom=6 parts=10

Generating Tiles

Generate all tiles for zooms 0..7, using the gen generator, saving everything into v4 (including the solid tiles), with up to 4 jobs per zoom.

src=gen dst=v4 parts=4 baseZoom=0 fromZoom=0 beforeZoom=8 saveSolid=1

Generate tiles only if they already exist in the v2 source, and save them into v4, for zooms 8..15, 60 jobs per zoom.

src=gen dst=v4 parts=60 baseZoom=0 fromZoom=8 beforeZoom=16 sourceId=v2
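The src/dst parameter lines above are normally entered through the TileratorUI form, but the same job fields can also be passed to the tileshell script shown later on this page via its -j. prefix. This is only a sketch: the generic -j.<field> mapping for parts, sourceId and similar fields is an assumption extrapolated from the flags documented below, so verify against the Tilerator documentation before relying on it.

 node /srv/deployment/tilerator/deploy/node_modules/tilerator/scripts/tileshell.js \
   --config /etc/tileratorui/config.yaml \
   -j.generatorId gen -j.storageId v4 \
   -j.fromZoom 8 -j.beforeZoom 16 -j.parts 60 -j.sourceId v2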

Bulk Copying

The fastest way to copy a large number of tiles from one source to another is to use a large number of parts and specify saveSolid=true (skips solid tile detection). E.g. to copy all z16 tiles from v3 to v4, use src=v3 dst=v4 zoom=16 parts=60 saveSolid=true

Postgres

  • Clear the Postgres data directory and initialize the database from a backup (replace maps2001.codfw.wmnet with the postgres master):

 rm -rf /srv/postgresql/9.4/main/* && sudo -u postgres pg_basebackup -X stream -D /srv/postgresql/9.4/main/ -h maps2001.codfw.wmnet -U replication -W

Puppetization and Automation

Prerequisites

  • passwords and postgres replication configuration are set in the Ops private repo (root@palladium:~/private/hieradata/role/(codfw|eqiad)/maps/server.yaml)
  • other configuration lives in puppet/hieradata/role/(codfw|common|eqiad)/maps/*.yaml
  • cassandra::rack is defined in puppet/hieradata/hosts/maps*.yaml
  • the role::maps::master / role::maps::slave roles are associated with the maps nodes (site.pp); a minimal sketch follows
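The prerequisites above are plain Hiera and site.pp entries. A minimal sketch of what they can look like is shown below; the host names, rack name and values are illustrative, not the actual production settings.

 # puppet/hieradata/hosts/maps2001.yaml (illustrative)
 cassandra::rack: "b2"

 # puppet/site.pp (illustrative)
 node 'maps2001.codfw.wmnet' {
     role(maps::master)
 }
 node /^maps200[234]\.codfw\.wmnet$/ {
     role(maps::slave)
 }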

Manual steps

  • To initialize the first Cassandra node, we need to add the local node to the list of seeds by manually editing /etc/cassandra/cassandra.yaml and restarting cassandra:
  seed_provider:
    # Addresses of hosts that are deemed contact points.
    # Cassandra nodes use this list of hosts to find each other and learn
    # the topology of the ring.  You must change this if you are running
    # multiple nodes!
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
        # seeds is actually a comma-delimited list of addresses.
        # Ex: "<ip1>,<ip2>,<ip3>"
        # Omit own host name / IP in multi-node clusters (see
        # https://phabricator.wikimedia.org/T91617).
        - seeds: 10.64.16.42,10.64.32.117,10.64.48.154  # add the local node here to initialize the first Cassandra node
  • change the cassandra super user password to match the one configured in private repo using cqlsh:
 cqlsh <maps1001.eqiad.wmnet> -u cassandra
 Password: cassandra
 ALTER USER cassandra WITH PASSWORD '<password>';
  • Setup of user access / rights for cassandra
 cat /usr/local/bin/maps-grants.cql | cqlsh <maps1001.eqiad.wmnet> -u cassandra
  • Setup replication of Cassandra system_auth according to documentation.
  • Very important: there is a catch when setting up the replication factor for the system_auth keyspace on Cassandra. We discovered that increasing the replication factor according to the docs causes an outage on Cassandra. See T214434 and T157354. Also this: Incident documentation/20190122-maps.
  • Scap environment needs to be updated when migrating to stretch.
  • Initial data load of OSM into postgresql is done by running /usr/local/bin/osm-initial-import on the postgresql master node. Cassandra should be shut down during the initial import to free memory for osm2pgsql.

 osm-initial-import \
     -d <date_of_import> \
     -p <password_file> \
     -s <state_file_url> \
     -x webproxy.eqiad.wmnet:8080

    • date_of_import: find the latest dump at https://planet.osm.org/pbf/. Example: 160530.
    • password_file: a file containing the postgresql password of the osmimporter user.
    • state_file_url: the URL of the state file corresponding to the dump; find the correct one at http://planet.openstreetmap.org/replication/ (the state file must be older than the dump). Example: http://planet.openstreetmap.org/replication/day/000/001/355.state.txt.
  • If the postgresql master already has data, the slave initialization will time out in puppet. It then needs to be run manually:

 service postgresql@9.4-main stop
 rm -rf /srv/postgresql/9.4/main
 sudo -u postgres /usr/bin/pg_basebackup -X stream -D /srv/postgresql/9.4/main -h <maps_master fqdn> -U replication -w
 # Run puppet to make sure recovery.conf file is created
 service postgresql@9.4-main start
  • Initial creation of the cassandra keyspace: to prevent accidental modification of the schema, the Tilerator source configuration does not allow creating the schema by default. The sources file used by tilerator / kartotherian is configured in /etc/(kartotherian|tilerator|tileratorui)/config.yaml; look for the sources: key. This is a reference to a sources file in the kartotherian / tilerator source directory, for example /srv/deployment/tilerator/deploy/src/sources.prod2.yaml.

The easiest way to create a new keyspace is to run Tilerator with a custom sources file, which instructs tilerator to create the missing keyspace. For example, create a temporary file, e.g. /home/yurik/my-source-file, with the following configuration (replace v4 with the keyspace declared in the sources configuration file):

v4:
  uri: cassandra://
  params:
    keyspace: v4
    cp: {var: cassandra-servers}
    username: {var: cassandra-user}
    password: {var: cassandra-pswd}
    # These parameters are only used if keyspace needs to be created:
    repfactor: 4
    durablewrite: 0
    createIfMissing: true
    copyInfoFrom: {ref: gen}

And run this bash script:

# Use the TileratorUI configuration (including password variables), but point
# --source at the custom sources file instead of the one referenced in the config.
node /srv/deployment/tilerator/deploy/node_modules/tilerator/scripts/tileshell.js \
  --config /etc/tileratorui/config.yaml \
  --source /home/yurik/my-source-file

Tileshell will not exit, so ^C it after it reports "done".

  • On an existing server, record all existing tiles as a list of tile indexes (path and generatorId need to be adapted):

# --config: use the TileratorUI configuration (including password variables)
# -j.generatorId v5: list all tiles in the "v5" source
# -j.zoom 14: which zoom level to enumerate
# --dumptiles: file to write the indexes to
# --dumprawidx: write one-number indexes (0..4^zoom-1) instead of the "zoom/x/y" format
# --dumpoverride: if the dumptiles file already exists, override it
node /srv/deployment/tilerator/deploy/node_modules/tilerator/scripts/tileshell.js \
  --config /etc/tileratorui/config.yaml \
  -j.generatorId v5 \
  -j.zoom 14 \
  --dumptiles /home/yurik/all-tiles-14.txt \
  --dumprawidx \
  --dumpoverride
  • Instead of generating an entire zoom level, you may want to generate just the tiles in a list (all parameters might need to be adapted):

# -j.filepath: list of tile indexes, unique and sorted, one per line (indexes can be 0..4^zoom-1)
# -j.fileZoomOverride 14: all tile indexes in the file belong to zoom 14;
#   without this parameter, the file must contain zoom/x/y triplets
# -j.fromZoom 10 -j.beforeZoom 16: generate zoom levels 10 <= zoom < 16
# -j.generatorId gen -j.storageId v4: copy tiles from the "gen" source to the "v4" source
# -j.deleteEmpty: if a tile already exists in "v4" but "gen" produces an empty tile, delete it in v4
node /srv/deployment/tilerator/deploy/node_modules/tilerator/scripts/tileshell.js \
  --config /etc/tileratorui/config.yaml \
  -j.filepath /home/yurik/all-tiles-14.txt \
  -j.fileZoomOverride 14 \
  -j.fromZoom 10 -j.beforeZoom 16 \
  -j.generatorId gen -j.storageId v4 \
  -j.deleteEmpty

Deploying maps services

Git repositories galore

Kartotherian is developed on GitHub; however, for deployment purposes we keep copies of both services' main repos in Gerrit, at maps/kartotherian and maps/tilerator. All development should happen on GitHub, while Gerrit can carry WMF-specific hacks (which should be avoided where possible).

Keeping repositories in sync

Assuming you have a checkout from Gerrit, create a remote called github:

kartotherian$ git remote add github https://github.com/kartotherian/kartotherian.git

Then when you need to sync the repositories, pull from GitHub and push to Gerrit:

git pull github master
git push

Building for deployment

Kartotherian and Tilerator are deployed according to the standard process for deploying Wikimedia Node.js services, with the important difference that deployments are built from purpose-specific "package" repos rather than directly from service code repos. This is to facilitate the bundling of additional maps-specific dependencies.

Kartotherian: https://gerrit.wikimedia.org/r/#/admin/projects/maps/kartotherian/package
Tilerator: https://gerrit.wikimedia.org/r/#/admin/projects/maps/tilerator/package

Deploying services

Refer to Services/Deployment for general instructions.

In most cases, Kartotherian and Tilerator should be deployed together, to ensure that all dependencies (in particular, styles) are in sync.
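Concretely, with the standard WMF scap3 flow, a deployment is usually run from the deployment host inside each service's deploy directory. The directory names and log messages below are illustrative; follow Services/Deployment for the authoritative steps.

 cd /srv/deployment/kartotherian/deploy && scap deploy 'Update kartotherian'
 cd /srv/deployment/tilerator/deploy && scap deploy 'Update tilerator'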
