Analytics/Systems/Cluster/Deploy a fix to incorrect camus partitionning

Revision as of 13:44, 7 April 2017 by imported>Milimetric (Milimetric moved page Analytics/Cluster/Deploy a fix to incorrect camus partitionning to Analytics/Systems/Cluster/Deploy a fix to incorrect camus partitionning: Reorganizing documentation)

It's possible that you deployed and started to fetch data from a Kafka topic, but your Camus job created incorrect partitions.

This can happen if you've set an incorrect value for camus.message.timestamp.format or camus.message.timestamp.field.

In that case Camus may fall back to the import time instead of the actual log timestamp. This leads to incorrect partitioning such as the following, where the hour=2 partition contains records logged from 01:15:10 onwards (in this example, ts is the field that contains the log timestamp in unix seconds):

select from_unixtime(min(ts)), from_unixtime(max(ts)) from cirrussearchrequestset where year=2015 and month=11 and day=5 and hour=2 limit 1;
_c0	                _c1
2015-11-05 01:15:10	2015-11-05 02:15:10
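
To make the symptom above concrete, here is a minimal Python sketch of a timestamp lookup that falls back to the import time. It is an illustration only, not Camus's actual code: the helper `partition_for`, the misspelled field name `timestamp`, and the chosen import time are all assumptions made for this example.

```python
from datetime import datetime, timezone


def partition_for(record, field, import_time):
    """Return (year, month, day, hour) for a record.

    Hypothetical helper: uses the unix-seconds value found in `field`
    when it is present and numeric, otherwise falls back to import_time.
    """
    value = record.get(field)
    try:
        when = datetime.fromtimestamp(int(value), tz=timezone.utc)
    except (TypeError, ValueError):
        # Field missing or unparsable: fall back to the import time,
        # which is the behaviour that produces wrong partitions.
        when = import_time
    return (when.year, when.month, when.day, when.hour)


# A record logged at 2015-11-05 01:15:10 UTC, imported one hour later.
log_time = datetime(2015, 11, 5, 1, 15, 10, tzinfo=timezone.utc)
import_time = datetime(2015, 11, 5, 2, 15, 10, tzinfo=timezone.utc)
record = {"ts": int(log_time.timestamp())}

correct = partition_for(record, "ts", import_time)
wrong = partition_for(record, "timestamp", import_time)  # misconfigured field
print(correct, wrong)
```

With the correct field the record lands in hour 1; with the misconfigured field it lands in hour 2, the hour of the import, which matches the pattern in the query output.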

In this scenario you will have to deploy a fix (either in file or in the refinery-camus artifact). You will also have to follow these deployment steps:

  1. Deploy the fix
  2. Comment out your Camus job's crontab line
  3. Re-create your table or delete the existing partitions
  4. Archive the HDFS paths used by Camus (etl.destination.path, etl.execution.base.path and etl.execution.history.path)
  5. If you use the --check option to flag your partitions with _IMPORTED:
    1. Launch a first manual run with kafka.max.pull.minutes.per.task=1
    2. The job will fail on the check phase because it is unable to handle the initial run correctly
  6. Re-enable your Camus job in cron
  7. Wait for the first automatic run to finish
  8. If you used the --check option, you will have to manually flag the partitions created by the first run. The number of partitions to flag manually depends on the number of lines fetched by Camus in 1 minute.
  9. Backfill with Oozie
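
For the manual flagging in step 8, a small Python sketch like the one below can list the flag files to create. It rests on assumptions that are not confirmed by this page: that partitions live under `<destination>/<topic>/hourly/YYYY/MM/DD/HH`, and that the flag is an empty file named `_IMPORTED` in each partition directory. The helper `hourly_flag_paths`, the base path and the topic name are hypothetical.

```python
from datetime import datetime, timedelta


def hourly_flag_paths(base_path, topic, first, last, flag="_IMPORTED"):
    """List one flag path per hourly partition between first and last.

    Assumed layout (hypothetical): base_path/topic/hourly/YYYY/MM/DD/HH.
    Both bounds are truncated to the hour and are inclusive.
    """
    current = first.replace(minute=0, second=0, microsecond=0)
    end = last.replace(minute=0, second=0, microsecond=0)
    paths = []
    while current <= end:
        hour_dir = current.strftime("%Y/%m/%d/%H")
        paths.append(f"{base_path}/{topic}/hourly/{hour_dir}/{flag}")
        current += timedelta(hours=1)
    return paths


# Hypothetical base path and topic; the bounds would come from the
# min/max log timestamps written by the first run.
paths = hourly_flag_paths(
    "/tmp/example_destination",
    "example_topic",
    datetime(2015, 11, 5, 1, 15, 10),
    datetime(2015, 11, 5, 3, 0, 0),
)
for path in paths:
    print(path)
```

Check the printed paths against the directories that actually exist in HDFS before creating any flag file.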