(Redirected from Talk:Analytics/AQS/Wikistats 2/DataQuality/Vetting data lake metrics for project families)
- Let's add an abstract section that explains our findings concisely.
- Let's be specific with dates: on what date did the vetting happen, and with which snapshot was it calculated?
- Let's define what we mean by "project families" and possibly link to the announcement on wikitech.
- Let's explain why "mediawiki" might not be a project family
- Let's outline the metrics we are vetting and link to the wikistats pages on meta. If metrics are not available, let's have a section explicitly about that and possibly link to the tickets in question.
- Let's have a summary section that "sums up" findings so people do not need to read the whole report to get an idea of the expected differences.
- Edits: the early numbers from before 2002 distort the graph, so let's make one graph for "before 2002" and one for after. Dig a little to see why the differences are so big; I thought the data did not go back that far? Ditto for "Total article count": we need two graphs, one that explores the period when differences were big and a second one.
- Please label the Y axis with "difference as a percentage". It should be clear how the percentage is calculated: is wikistats1 always higher, or is wikistats2 higher? The Y label should be the same on all graphs; right now the labels differ per graph.
- Let's link to the wikistats metric pages so it is clear which metrics we are talking about.
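
On the Y axis comment above, here is a minimal sketch of one possible convention for the percentage difference. It assumes wikistats1 is taken as the baseline, so a positive value means wikistats2 is higher. The function name `percent_difference` and the constant `Y_AXIS_LABEL` are hypothetical and are not taken from the report; the report may calculate the percentage differently.

```python
# Assumption: wikistats1 is the baseline. The report itself does not state
# which of the two values is used as the denominator.
Y_AXIS_LABEL = "Difference as a percentage: (wikistats2 - wikistats1) / wikistats1 * 100"


def percent_difference(wikistats1: float, wikistats2: float) -> float:
    """Return the signed difference of wikistats2 relative to wikistats1, in percent.

    Positive: wikistats2 is higher. Negative: wikistats1 is higher.
    """
    if wikistats1 == 0:
        # A zero baseline makes the ratio undefined, so fail loudly
        # rather than plot a misleading point.
        raise ValueError("wikistats1 is zero; percentage difference is undefined")
    return (wikistats2 - wikistats1) / wikistats1 * 100.0


if __name__ == "__main__":
    # Made-up numbers for illustration only, not real metric values.
    print(percent_difference(100, 150))  # wikistats2 higher -> positive
    print(percent_difference(100, 75))   # wikistats1 higher -> negative
```

Using one signed formula and one label string for every graph would make it visible at a glance which of the two sources is higher in each period.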