
Analytics/Cluster/Refinery

Revision as of 12:41, 10 January 2015 by imported>QChris (QChris moved page Analytics/Refinery to Analytics/Cluster/Refinery: Refinery is too tied to the cluster to live directly underneath Analytics)

Refinery is the software infrastructure that is used on the Analytics Cluster. The source code is in the analytics/source repository.

How to deploy

  1. SSH into Tin
  2. Run:
    cd /srv/deployment/analytics/refinery
    git deploy start
    git checkout master
    git pull
    git deploy sync

    (git deploy sync will complain that only “2/3 minions completed fetch”. You can answer “y” to continue.)

    Steps 1 and 2 bring the refinery code from Gerrit to stat1002.
  3. SSH into stat1002
  4. Run:
    sudo -u hdfs /srv/deployment/analytics/refinery/bin/refinery-deploy-to-hdfs --verbose --no-dry-run

    Steps 3 and 4 copy the refinery code to HDFS (but do not resubmit Oozie jobs).
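
The manual steps above could be wrapped in a small helper script. The following is only a sketch, not part of refinery: the function names, the run helper, and the REFINERY_DIR and PRINT_ONLY variables are hypothetical. By default it only prints the commands it would run, so the sequence can be reviewed before executing it on the respective host.

```shell
#!/usr/bin/env bash
# Sketch only: wraps the manual deployment steps in two hypothetical helpers.
# REFINERY_DIR and PRINT_ONLY are illustrative names, not part of refinery.
REFINERY_DIR="${REFINERY_DIR:-/srv/deployment/analytics/refinery}"
PRINT_ONLY="${PRINT_ONLY:-1}"   # 1 = only print the commands, 0 = execute them

# Print a command (PRINT_ONLY=1) or run it in the current shell (PRINT_ONLY=0).
run() {
  if [ "$PRINT_ONLY" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Steps 1-2: meant to be run on Tin. Stops at the first failing command.
deploy_refinery_on_tin() {
  run cd "$REFINERY_DIR" &&
  run git deploy start &&
  run git checkout master &&
  run git pull &&
  run git deploy sync   # may still ask for confirmation, as described above
}

# Steps 3-4: meant to be run on stat1002.
deploy_refinery_to_hdfs() {
  run sudo -u hdfs "$REFINERY_DIR/bin/refinery-deploy-to-hdfs" --verbose --no-dry-run
}
```

After sourcing the script, calling deploy_refinery_on_tin on Tin (or deploy_refinery_to_hdfs on stat1002) prints the commands; setting PRINT_ONLY=0 beforehand executes them instead.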

How to deploy Oozie jobs

Please see the Deployment section in the Oozie docs.