You are browsing a read-only backup copy of Wikitech. The primary site can be found at wikitech.wikimedia.org

Maps/OSM Database: Difference between revisions

From Wikitech-static
Jump to navigation Jump to search
imported>Hnowlan
imported>Hnowlan
(→‎The initial-import-script: depooling, empty DB, note about import times)
Line 19: Line 19:
==== Checklist ====
==== Checklist ====
# Create a Phab task to track this work. [https://phabricator.wikimedia.org/T249086 Example].
# Create a Phab task to track this work. [https://phabricator.wikimedia.org/T249086 Example].
# Depool maps2004.  
# Depool the maps master.  
# Isolate maps2004 from the rest of the cluster. [https://gerrit.wikimedia.org/r/c/operations/puppet/+/585153 Example].
# Isolate the maps master from the rest of the cluster. [https://gerrit.wikimedia.org/r/c/operations/puppet/+/585153 Example].
# Disable Tilerator in puppet. [https://gerrit.wikimedia.org/r/c/operations/puppet/+/585141 Example].
# Disable Tilerator in puppet. [https://gerrit.wikimedia.org/r/c/operations/puppet/+/585141 Example].
# Delete postgres data, recreate the empty DB.
# Delete postgres data, recreate the empty DB.
# Run `osm-initial-import` script to reimport data from OSM, needs root access.
# Run `osm-initial-import` script to reimport data from OSM, needs root access.
# Re-enable replication
# Re-enable replication
# Repool maps2004
# Repool the maps master
# Re-init all slaves (cookbook)
# Re-init all slaves (cookbook)
# Re-enable tilerator
# Re-enable tilerator


==== Re-creating Postgres DB ====
==== Depooling the maps master ====
From the puppetmaster, depooling maps1004 in eqiad: 
 
<code>sudo confctl select "dc=eqiad,cluster=maps,service=kartotherian,name=maps1004.eqiad.wmnet" set/pooled=no</code>
 
<code>sudo confctl select "dc=eqiad,cluster=maps,service=kartotherian-ssl,name=maps1004.eqiad.wmnet" set/pooled=no</code>
 
==== Re-creating an empty Postgres DB ====
Simply remove the database and let puppet handle the rest:
 
<code>rm -rf /srv/postgresql/9.6/main/* && run-puppet-agent && service postgresql restart</code>
 
==== Re-creating Postgres DB from backup ====
* Clear the [[Postgres]] data directory and init the database from backup (replace <code>maps2001.codfw.wmnet</code> by the postgres master):
* Clear the [[Postgres]] data directory and init the database from backup (replace <code>maps2001.codfw.wmnet</code> by the postgres master):
<code>rm -rf /srv/postgresql/9.6/main/* && sudo -u postgres pg_basebackup -X stream -D /srv/postgresql/9.6/main/ -h maps2001.codfw.wmnet -U replication -W</code>
<code>rm -rf /srv/postgresql/9.6/main/* && sudo -u postgres pg_basebackup -X stream -D /srv/postgresql/9.6/main/ -h maps2001.codfw.wmnet -U replication -W</code>


==== The initial-import-script ====
==== The initial-import-script ====
Initial data load of OSM into postgresql is done by running <code>/usr/local/bin/osm-initial-import</code> on the postgresql master node. This command can take over an hour, so running in <code>screen</code> or <code>tmux</code> is advised.  
Initial data load of OSM into postgresql is done by running <code>/usr/local/bin/osm-initial-import</code> on the postgresql master node. This command can take over 12 hours or more, so running in <code>screen</code> or <code>tmux</code> is advised.  


<syntaxhighlight lang="bash">
<syntaxhighlight lang="bash">

Revision as of 11:26, 29 October 2020

OSM Database

The Maps OSM database is an important piece of the infrastructure, it's a PostgreSQL database that hosts the data source for tile generation and the Geoshapes service.

The OSM Database needs two processes that are fundamental in order to keep it available with the most updated data from OSM: initial import and replication.

Initial Importing of OSM data into PostgreSQL database

This process needs to be done in all clusters available, one at a time.

When you need to do initial import?

The following scenarios require a fresh install of the OSM database:

  • First setup of the cluster
  • Any upgrades or maintenance that require deleting the PostgreSQL data folder, for example:
    • OS migration T198622
    • Disk space in critical condition because of OSM replication
  • OSM replication is disabled for a long time range and the database is outdated
    • Depending on the server condition, is safer to re-import OSM data instead of roll replication, because it can overload Tilerator and increase significantly the CPU usage

Importing

Checklist

  1. Create a Phab task to track this work. Example.
  2. Depool the maps master.
  3. Isolate the maps master from the rest of the cluster. Example.
  4. Disable Tilerator in puppet. Example.
  5. Delete postgres data, recreate the empty DB.
  6. Run `osm-initial-import` script to reimport data from OSM, needs root access.
  7. Re-enable replication
  8. Repool the maps master
  9. Re-init all slaves (cookbook)
  10. Re-enable tilerator

Depooling the maps master

From the puppetmaster, depooling maps1004 in eqiad:

sudo confctl select "dc=eqiad,cluster=maps,service=kartotherian,name=maps1004.eqiad.wmnet" set/pooled=no

sudo confctl select "dc=eqiad,cluster=maps,service=kartotherian-ssl,name=maps1004.eqiad.wmnet" set/pooled=no

Re-creating an empty Postgres DB

Simply remove the database and let puppet handle the rest:

rm -rf /srv/postgresql/9.6/main/* && run-puppet-agent && service postgresql restart

Re-creating Postgres DB from backup

  • Clear the Postgres data directory and init the database from backup (replace maps2001.codfw.wmnet by the postgres master):

rm -rf /srv/postgresql/9.6/main/* && sudo -u postgres pg_basebackup -X stream -D /srv/postgresql/9.6/main/ -h maps2001.codfw.wmnet -U replication -W

The initial-import-script

Initial data load of OSM into postgresql is done by running /usr/local/bin/osm-initial-import on the postgresql master node. This command can take over 12 hours or more, so running in screen or tmux is advised.

osm-initial-import \
    -d <date_of_import> \
    -s  <state_file_url> \
    -x webproxy.eqiad.wmnet:8080

Troubleshouting

  • If the slave initialization timeouts in puppet, it needs to be run manually:
service postgresql@9.6-main stop

rm -rf /srv/postgresql/9.6/main

sudo -u postgres /usr/bin/pg_basebackup -X stream -D /srv/postgresql/9.6/main -h <maps_master fqdn> -U replication -w

# Run puppet to make sure recovery.conf file is created

service postgresql@9.6-main start

OSM replication

In order to keep the OSM Database updated, we use the following tools: osmosis and osm2pgsql. These tools fetch data from OSM servers and apply the requested changesets in the PostgreSQL database. When it's complete, the replication script will trigger a request to Tilerator, sending a list of tiles that needs to be regenerated.

Monitoring

You can monitor PostgreSQL activities of the Maps clusters in the following dashboards: