imported>Hnowlan |
imported>MSantos |
Line 1: |
Line 1: |
| = OSM Database =
| | #REDIRECT [[Maps/OSM Database Legacy]] |
| The Maps OSM database is an important piece of the infrastructure, it's a PostgreSQL database that hosts the data source for tile generation and the [[Geoshapes service]].
| |
| | |
| The OSM Database needs two processes that are fundamental in order to keep it available with the most updated data from OSM: initial import and replication.
| |
| | |
| == Initial Importing of OSM data into PostgreSQL database ==
| |
| This process needs to be done in all clusters available, one at a time.
| |
| | |
| === When you need to do initial import? ===
| |
| The following scenarios require a fresh install of the OSM database:
| |
| * First setup of the cluster
| |
| * Any upgrades or maintenance that require deleting the PostgreSQL data folder, for example:
| |
| ** OS migration [https://phabricator.wikimedia.org/T198622 T198622]
| |
| ** Disk space in critical condition because of OSM replication
| |
| * OSM replication is disabled for a long time range and the database is outdated
| |
| ** Depending on the server condition, is safer to re-import OSM data instead of roll replication, because it can overload Tilerator and increase significantly the CPU usage
| |
| | |
| === Importing ===
| |
| ==== Checklist ====
| |
| # Create a Phab task to track this work. [https://phabricator.wikimedia.org/T249086 Example].
| |
| # Depool the maps master. | |
| # Isolate the maps master from the rest of the cluster. [https://gerrit.wikimedia.org/r/c/operations/puppet/+/585153 Example].
| |
| # Disable Tilerator in puppet. [https://gerrit.wikimedia.org/r/c/operations/puppet/+/585141 Example].
| |
| # Delete postgres data, recreate the empty DB.
| |
| # Run `osm-initial-import` script to reimport data from OSM, needs root access.
| |
| # Re-enable replication
| |
| # Repool the maps master
| |
| # Re-init all slaves (cookbook)
| |
| # Re-enable tilerator
| |
| | |
| ==== Depooling the maps master ====
| |
| From the puppetmaster, depooling maps1004 in eqiad:
| |
| | |
| <code>sudo confctl select "dc=eqiad,cluster=maps,service=kartotherian,name=maps1004.eqiad.wmnet" set/pooled=no</code>
| |
| | |
| <code>sudo confctl select "dc=eqiad,cluster=maps,service=kartotherian-ssl,name=maps1004.eqiad.wmnet" set/pooled=no</code>
| |
| | |
| ==== Re-creating an empty Postgres DB ====
| |
| Simply remove the database and let puppet handle the rest:
| |
| | |
| <code>rm -rf /srv/postgresql/9.6/main/* && run-puppet-agent && service postgresql restart</code>
| |
| | |
| ==== Re-creating Postgres DB from backup ====
| |
| * Clear the [[Postgres]] data directory and init the database from backup (replace <code>maps2001.codfw.wmnet</code> by the postgres master):
| |
| <code>rm -rf /srv/postgresql/9.6/main/* && sudo -u postgres pg_basebackup -X stream -D /srv/postgresql/9.6/main/ -h maps2001.codfw.wmnet -U replication -W</code>
| |
| | |
| ==== The initial-import-script ====
| |
| Initial data load of OSM into postgresql is done by running <code>/usr/local/bin/osm-initial-import</code> on the postgresql master node. This command can take over 12 hours or more, so running in <code>screen</code> or <code>tmux</code> is advised.
| |
| | |
| <syntaxhighlight lang="bash">
| |
| osm-initial-import \
| |
| -d <date_of_import> \
| |
| -s <state_file_url> \
| |
| -x webproxy.eqiad.wmnet:8080
| |
| </syntaxhighlight>
| |
| | |
| * '''date_of_import:''' find the latest dump at https://planet.osm.org/pbf/. Example: <code>160530</code>. Do NOT use "latest", as that might change at any moment.
| |
| * '''state_file_url:''' The URL to the state file corresponding to the dump, find the correct one at http://planet.openstreetmap.org/replication/ (the state file must be older than the dump). Example: <code>http://planet.openstreetmap.org/replication/day/000/001/355.state.txt</code>.
| |
| | |
| ==== Troubleshouting ====
| |
| * If the slave initialization timeouts in puppet, it needs to be run manually:
| |
| | |
| <syntaxhighlight lang="bash">
| |
| service postgresql@9.6-main stop
| |
| | |
| rm -rf /srv/postgresql/9.6/main
| |
| | |
| sudo -u postgres /usr/bin/pg_basebackup -X stream -D /srv/postgresql/9.6/main -h <maps_master fqdn> -U replication -w
| |
| | |
| # Run puppet to make sure recovery.conf file is created
| |
| | |
| service postgresql@9.6-main start
| |
| </syntaxhighlight>
| |
| | |
| == OSM replication ==
| |
| In order to keep the OSM Database updated, we use the following tools: [https://wiki.openstreetmap.org/wiki/Osmosis osmosis] and [https://wiki.openstreetmap.org/wiki/Osm2pgsql osm2pgsql]. These tools fetch data from OSM servers and apply the requested changesets in the PostgreSQL database. When it's complete, the [https://gerrit.wikimedia.org/r/plugins/gitiles/operations/puppet/+/production/modules/osm/templates/replicate-osm.erb replication script] will trigger a request to Tilerator, sending a list of tiles that needs to be regenerated.
| |
| | |
| == Monitoring ==
| |
| You can monitor PostgreSQL activities of the Maps clusters in the following dashboards:
| |
| * [https://grafana.wikimedia.org/d/000000039/maps-osm-database-msantos?orgId=1&refresh=10s General PostgreSQL metrics] from [https://github.com/wrouesnel/postgres_exporter prometheus postgres exporter]
| |
| * OSM sync lag: [https://grafana.wikimedia.org/d/000000305/maps-performances?orgId=1&var-cluster=maps1&viewPanel=11 eqiad], [https://grafana.wikimedia.org/d/000000305/maps-performances?orgId=1&var-cluster=maps1&viewPanel=12 codfw]
| |