Revision as of 13:44, 7 April 2017 by imported>Milimetric (Milimetric moved page Analytics/Cluster/Geotagging to Analytics/Systems/Cluster/Geotagging: Reorganizing documentation)

Geotagging functions in Hadoop are provided by jars available at hdfs:///wmf/refinery/current/artifacts


refinery-core.jar exposes two functions:

Function Name Data Returned
getCountryCode(String ip) country code
getGeocodedData(String ip) <map> containing geocoding information:
  • continent
  • country_code
  • country
  • subdivision
  • city
  • postal_code
  • latitude
  • longitude
  • timezone
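
To make the shape of the returned map concrete, here is a minimal sketch. The class GeocodedDataShapeSketch and its stub method are hypothetical stand-ins, not the refinery-core implementation: the stub ignores the IP and fills each of the nine keys with a placeholder. It also assumes that every key is always present and that all values are strings, neither of which is confirmed above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in, not the refinery-core class: it only mimics the
// shape of the map described above so the keys can be shown in use.
final class GeocodedDataShapeSketch {

    // The nine keys listed in the table, in the same order.
    static final String[] KEYS = {
        "continent", "country_code", "country", "subdivision", "city",
        "postal_code", "latitude", "longitude", "timezone"
    };

    // Assumption: every key is always present and values are strings.
    // A real lookup would consult the MaxMind database; this stub ignores
    // the IP and fills each field with a placeholder.
    static Map<String, String> getGeocodedData(String ip) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String key : KEYS) {
            result.put(key, "<" + key + ">");
        }
        return result;
    }

    public static void main(String[] args) {
        // 192.0.2.1 is a documentation-only address (RFC 5737).
        Map<String, String> geo = getGeocodedData("192.0.2.1");
        for (Map.Entry<String, String> field : geo.entrySet()) {
            System.out.println(field.getKey() + " = " + field.getValue());
        }
    }
}
```

Code that consumes the real function would read fields by these key names, for example geo.get("country_code"), and should probably tolerate missing or unknown values until the actual behaviour is checked against the jar.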


This library provides wrapper functions usable as Hive UDFs

Hive UDF Wrapped Function


These functions use a copy of the MaxMind database that is updated weekly and downloaded to the folder /usr/share/GeoIP on every node of the cluster.