You are browsing a read-only backup copy of Wikitech. The primary site can be found at wikitech.wikimedia.org

Category:Data pipelines


Documentation of data ingestion and processing pipelines.

  • Includes documentation describing how specific datasets are derived or computed, for example: MediaWiki history computation (ingestion from DB, history rebuilding, computation of metrics, extraction onto other systems, ad-hoc querying).
  • Does not include documentation for the data platform infrastructure or system components that implement a given data pipeline, for example: Airflow, Gobblin.