Data Platform/Systems
These subpages explain in technical detail the systems that process data for analytics at the Wikimedia Foundation.
They include information about setup, maintenance, architecture, and more.
All Subpages of Data Platform/Systems
- AQS
- AQS/OpenAPI spec style guide
- AQS/Scaling
- AQS/Scaling/2016/Hardware Refresh
- AQS/Scaling/2017/Cluster Expansion
- AQS/Scaling/2020/Cluster Expansion
- AQS/Scaling/LoadTesting
- Airflow
- Airflow/Developer guide
- Airflow/Developer guide/Normalize a DAG
- Airflow/Developer guide/Python Job Repos
- Airflow/Instances
- Airflow/Kubernetes
- Airflow/Kubernetes/Administration
- Airflow/Kubernetes/Operations
- Airflow/Kubernetes/Operations/K8s-Migration
- Airflow/Upgrading
- Analytics Meta
- Archiva
- Bigtop Packages
- Blunderbuss
- Ceph
- Ceph/Troubleshooting
- Ceph/Upgrading
- Cluster
- Cluster/Geotagging
- Cluster/Hadoop/Load
- Cluster/OpenSearch
- Cluster/Spark History
- Conda
- Coordinator
- DB Replica
- Dashiki
- Dashiki/Configuration
- DataHub
- DataHub/Administration
- DataHub/Data Catalog Documentation Guide
- DataHub/Upgrading
- Data Quality
- Data deletion and sanitization
- Dealing with data loss alarms
- Druid
- Druid/Alerts
- Druid/Load test
- Edit data loading
- Edit history administration
- Edit serving layer
- Event Data retention
- Event Data retention/AppInstallId
- Exporting from HDFS to Swift
- Geolocation
- Gobblin
- Hadoop
- Hadoop/Administration
- Hadoop/Alerts
- Hadoop/Test
- Hadoop Event Ingestion Lifecycle
- Hive
- Hive/Alerts
- Hive/Avro
- Hive/Compression
- Hive/Counting uniques
- Hive/Queries
- Hive/Queries/Wikidata
- Hive/Querying using UDFs
- Hive to Druid Ingestion Pipeline
- Iceberg
- Iceberg/Migration Dependencies
- Java
- Jupyter
- Jupyter/Administration
- Kerberos
- Kerberos/Administration
- Maintenance Schedule
- Managing systemd timers
- Manual maintenance
- Matomo
- MediaWiki replicas
- Mediawiki History Snapshot Check
- Mediawiki history reduced algorithm
- OpenSearch-on-K8s
- OpenSearch-on-K8s/Administration
- Page and user history reconstruction
- Page and user history reconstruction algorithm
- PostgreSQL
- PostgreSQL/Backup and Restore
- PostgreSQL/Clusters
- PostgreSQL/Operations
- Presto
- Presto/Administration
- Presto/Query Logger
- R
- Refine
- Refine/Deploy Refinery
- Refine/Deploy Refinery-source
- Reportupdater
- Revision augmentation and denormalization
- Siege
- Spark
- Spark/Administration
- Spark/Kubernetes
- Stat hosts
- Superset
- Superset/Administration
- Superset/Date functions
- System users
- Turnilo
- Varnishkafka
- Wikistats
- Wikistats/Deprecation of Wikistats 1
- Wikistats/Traffic
- Wikistats 2
- Wikistats 2/Map Component
- Wikistats 2/Metrics/FAQ
- Wikistats 2/Smoke Testing
- analytics.wikimedia.org
- ua-parser
- ua-parser/2019-09-18 Update