Analytics/Systems/Manual maintenance
Monthly
- MediaWiki history Druid data source switch
- Check for newly created wikis
- Add _REFINED flags for events that contribute to the wmf.wikidata_item_page_link dataset. (This is not documented anywhere outside of email.)
- Run the false positive checker for webrequest loss, which we do probably once a month or more. This could be partially automated: if the script finds that all instances of loss are false positives, the job could be rerun automatically. If we do this automatically, we could also update the webrequest_sequence_stats table with the results, allowing for trend tracking on top of that table. Currently, if you try to analyze data loss over time, you find lots of noise with high % loss due to host restarts, etc.
- Re-run sanitization. Updating the command is painful because you often have to change a property file nested in another property file nested in the command. Docs are here, and we should build a rerun command that just takes a list of schemas plus since and until parameters.
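
The automation idea for the webrequest loss checker could be sketched roughly as follows. This is only an illustration of the decision rule described in the list (rerun only when every reported instance of loss is a false positive), not the existing checker script. The `LossInstance` type, its fields, and both function names are hypothetical; the real checker output and the webrequest_sequence_stats schema may look quite different.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class LossInstance:
    """One reported instance of webrequest loss (hypothetical shape)."""
    hostname: str
    hour: str  # e.g. "2024-01-01T05"; the format is an assumption
    is_false_positive: bool


def should_rerun(instances: Iterable[LossInstance]) -> bool:
    """Return True only if there is reported loss and all of it is false positive."""
    items = list(instances)
    return bool(items) and all(i.is_false_positive for i in items)


def summarize(instances: Iterable[LossInstance]) -> Dict[str, int]:
    """Counts that could be stored per run to support trend tracking."""
    items = list(instances)
    false_positives = sum(1 for i in items if i.is_false_positive)
    return {
        "total": len(items),
        "false_positives": false_positives,
        "real_loss": len(items) - false_positives,
    }
```

A single instance of real loss keeps the rerun manual, which seems like the safer default; the summary counts are one possible way to separate host-restart noise from real loss when looking at trends.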
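
The sanitization rerun command proposed above might have an interface like the following sketch. It covers argument parsing only and does not launch any job. The program name `rerun-sanitization`, the flag names, and the ISO date format are all assumptions made for illustration, not an existing tool.

```python
import argparse
from datetime import datetime
from typing import List, Optional


def parse_schemas(value: str) -> List[str]:
    """Split a comma-separated list of schema names, ignoring blanks."""
    schemas = [s.strip() for s in value.split(",") if s.strip()]
    if not schemas:
        raise argparse.ArgumentTypeError("at least one schema is required")
    return schemas


def build_parser() -> argparse.ArgumentParser:
    # Program and flag names are hypothetical.
    parser = argparse.ArgumentParser(
        prog="rerun-sanitization",
        description="Re-run sanitization for some schemas over a time range.",
    )
    parser.add_argument("--schemas", required=True, type=parse_schemas,
                        help="comma-separated schema names")
    parser.add_argument("--since", required=True, type=datetime.fromisoformat,
                        help="start of the range, ISO format (inclusive)")
    parser.add_argument("--until", required=True, type=datetime.fromisoformat,
                        help="end of the range, ISO format (exclusive)")
    return parser


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse and validate arguments; exits with an error on a bad range."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.since >= args.until:
        parser.error("--since must be earlier than --until")
    return args
```

For example, `parse_args(["--schemas", "SchemaA,SchemaB", "--since", "2024-01-01", "--until", "2024-01-02"])` yields a namespace with a two-item schema list and two `datetime` values (the schema names here are placeholders). The nested property files would then be generated from these values instead of being edited by hand.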