Analytics/AQS/Mediarequests

Definition

The Mediarequest API endpoints provide data on the use of media files across all wikis. This data is available in three forms:

  • Mediarequests per referrer: daily and monthly aggregation of hits to media files, split by referrer. A referrer is either external, internal, unknown, or any Wikimedia wiki that the resource was requested from.
  • Mediarequests per file: daily and monthly counts of mediarequests for each media file stored on the wiki servers (as long as the count is higher than one).
  • Top files by mediarequests: the most requested media files per referrer and per media type.

Media type classification

File types are obtained by parsing the file extension, and then classified according to the following table:

Extensions                                 | Media type
svg, png, tiff, jpeg, gif, xcf, webp, bmp  | image
mp3, ogg, oga, flac, wav, midi             | audio
webm, ogv                                  | video
pdf, djvu, srt, txt                        | document
(all other extensions)                     | other
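
The classification above can be sketched as a small lookup. This is only an illustration that follows the table literally: the function name `classify_media_type` is hypothetical, and lowercasing the extension before the lookup is an assumption, since the page does not say how case is handled.

```python
import os

# Extension-to-media-type mapping copied from the table above.
MEDIA_TYPES = {
    "image": {"svg", "png", "tiff", "jpeg", "gif", "xcf", "webp", "bmp"},
    "audio": {"mp3", "ogg", "oga", "flac", "wav", "midi"},
    "video": {"webm", "ogv"},
    "document": {"pdf", "djvu", "srt", "txt"},
}


def classify_media_type(file_path: str) -> str:
    """Return the media type for a file path, or "other" if the extension is unlisted."""
    # os.path.splitext gives ("/a/b/name", ".svg"); strip the leading dot.
    # Lowercasing is an assumption, not something the table specifies.
    extension = os.path.splitext(file_path)[1].lstrip(".").lower()
    for media_type, extensions in MEDIA_TYPES.items():
        if extension in extensions:
            return media_type
    return "other"
```

For instance, `classify_media_type("/wikipedia/commons/1/1a/Flag_of_Argentina.svg")` returns `"image"`, and any extension missing from the table falls through to `"other"`.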

Limitations

  • Splitting and filtering by referrer is only available for data from May 2019 onward. Before that, referrer is only split into internal, external, and unknown.
  • Mediarequest data begins on 1 January 2015.
  • Splitting and filtering by agent type (user, spider) is only available for data from May 2019 onward.
  • About 0.7% of mediarequests are prefetches coming from Media Viewer (more details in Analytics/AQS/Media_metrics).

Quick guide to the endpoints

Definition of each type of parameter referenced in the endpoint templates below:

  • {referer}: where this request came from, one of (all-referers, internal, external, unknown, or the specific wiki where the media was loaded)
  • {media_type}: the type of media, one of (all-media-types, image, audio, video, document, other), as classified in the table above
  • {file_path}: the URI-encoded path of the file (for example /wikipedia/en/0f/6f/manolo.jpg would be %2Fwikipedia%2Fen%2F0f%2F6f%2Fmanolo.jpg)
  • {agent}: the agent type, one of (all-agents, spider, user)
  • {granularity}: one of (monthly, daily)
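  • {start}: start time, precise to the hour, in the format YYYYMMDDHH, e.g. 2018010100
  • {end}: end time, precise to the hour, in the format YYYYMMDDHH, e.g. 2019010100
  • {year}, {month}, {day}: the date for the top files endpoint, in the formats YYYY, MM and DD; {day} can also be all-days to cover the whole month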
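
The {file_path} encoding and the YYYYMMDDHH timestamp format described in the list above can be produced with the Python standard library. A minimal sketch; the helper names `encode_file_path` and `format_timestamp` are hypothetical and not part of the API.

```python
from datetime import datetime
from urllib.parse import quote


def encode_file_path(file_path: str) -> str:
    """Percent-encode a media file path for use as {file_path}."""
    # safe="" makes quote() encode "/" as %2F as well; by default it is left as-is.
    return quote(file_path, safe="")


def format_timestamp(moment: datetime) -> str:
    """Format a datetime as YYYYMMDDHH, as expected by {start} and {end}."""
    return moment.strftime("%Y%m%d%H")
```

With these, `encode_file_path("/wikipedia/en/0f/6f/manolo.jpg")` gives `%2Fwikipedia%2Fen%2F0f%2F6f%2Fmanolo.jpg`, matching the example in the list, and `format_timestamp(datetime(2018, 1, 1))` gives `2018010100`.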

Mediarequests per referrer

General endpoint template

https://wikimedia.org/api/rest_v1/metrics/mediarequests/aggregate/{referer}/{media_type}/{agent}/{granularity}/{start}/{end}

Sample return data

{
    items: [ // there are as many items as individual months/days requested
        {
            referer: en.wikipedia,
            media_type: all-media-types,
            agent: all-agents,
            granularity: daily,
            timestamp: 1970010100,
            requests: 0
        }
    ]
}

Examples

Obtaining a timeseries of all media files requested from all wikis in 2018
https://wikimedia.org/api/rest_v1/metrics/mediarequests/aggregate/all-referers/all-media-types/all-agents/monthly/2018010100/2019010100
Obtaining a timeseries of all videos requested from outside wikis in 2018
https://wikimedia.org/api/rest_v1/metrics/mediarequests/aggregate/external/video/all-agents/monthly/2018010100/2019010100
Obtaining a timeseries of all images requested from Galician Wikisource by humans in 2018
https://wikimedia.org/api/rest_v1/metrics/mediarequests/aggregate/gl.wikisource/image/user/monthly/2018010100/2019010100
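
Aggregate URLs like the ones above can be assembled from the endpoint template. The following is a sketch only: `aggregate_url` is a hypothetical helper, and the client-side checks on {agent} and {granularity} simply mirror the values listed in the quick guide rather than any validation the API is known to perform.

```python
BASE_URL = "https://wikimedia.org/api/rest_v1/metrics/mediarequests"

# Allowed values taken from the parameter list in the quick guide.
AGENTS = {"all-agents", "user", "spider"}
GRANULARITIES = {"daily", "monthly"}


def aggregate_url(referer, media_type, agent, granularity, start, end):
    """Build a 'mediarequests per referrer' URL from the endpoint template."""
    if agent not in AGENTS:
        raise ValueError(f"unknown agent: {agent!r}")
    if granularity not in GRANULARITIES:
        raise ValueError(f"unknown granularity: {granularity!r}")
    # start and end are expected as YYYYMMDDHH strings.
    return "/".join(
        [BASE_URL, "aggregate", referer, media_type, agent, granularity, start, end]
    )
```

Calling `aggregate_url("all-referers", "all-media-types", "all-agents", "monthly", "2018010100", "2019010100")` reproduces the first example URL in this section.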

Mediarequests per file

General endpoint template

Important: The file path must be URL-encoded.

https://wikimedia.org/api/rest_v1/metrics/mediarequests/per-file/{referer}/{agent}/{file_path}/{granularity}/{start}/{end}

Sample return data

{
    items: [ // there are as many items as individual months/days requested
        {
            referer: en.wikipedia,
            file_path: '/wikipedia/en/0f/6f/manolo.jpg',
            agent: all-agents,
            granularity: daily,
            timestamp: 1970010100,
            requests: 0
        }
    ]
}
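
Once decoded from JSON, a response shaped like the sample above can be reduced to a timeseries of (timestamp, requests) pairs. This is a sketch under assumptions: `to_timeseries` is a hypothetical helper, and the values in `SAMPLE_RESPONSE` are made up for illustration and are not real counts.

```python
from typing import List, Tuple

# Hypothetical decoded response; the field names follow the sample return data,
# the timestamps and request counts are invented for this example.
SAMPLE_RESPONSE = {
    "items": [
        {"referer": "en.wikipedia", "file_path": "/wikipedia/en/0f/6f/manolo.jpg",
         "agent": "all-agents", "granularity": "monthly",
         "timestamp": "2018020100", "requests": 12},
        {"referer": "en.wikipedia", "file_path": "/wikipedia/en/0f/6f/manolo.jpg",
         "agent": "all-agents", "granularity": "monthly",
         "timestamp": "2018010100", "requests": 10},
    ]
}


def to_timeseries(response: dict) -> List[Tuple[str, int]]:
    """Return (timestamp, requests) pairs sorted chronologically."""
    # YYYYMMDDHH strings sort chronologically when compared as text.
    pairs = [(str(item["timestamp"]), item["requests"]) for item in response.get("items", [])]
    return sorted(pairs)
```

Converting the timestamp with `str()` keeps the helper usable whether the decoded value arrives as a string or as a number.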

Examples

Obtaining a timeseries of all requests to '/wikipedia/commons/1/1a/Flag_of_Argentina.svg' from any referrer in 2018
https://wikimedia.org/api/rest_v1/metrics/mediarequests/per-file/all-referers/all-agents/%2Fwikipedia%2Fcommons%2F1%2F1a%2FFlag_of_Argentina.svg/monthly/2018010100/2019010100
Obtaining a timeseries of all requests to '/wikipedia/commons/1/1a/Flag_of_Argentina.svg' by spiders/bots from Armenian Wikipedia in late 2019
https://wikimedia.org/api/rest_v1/metrics/mediarequests/per-file/hy.wikipedia/spider/%2Fwikipedia%2Fcommons%2F1%2F1a%2FFlag_of_Argentina.svg/monthly/2019100100/2020010100
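
Because the file path has to be URL-encoded before it goes into the template, it can help to build per-file URLs programmatically. A minimal sketch, where `per_file_url` is a hypothetical helper and the path is passed in its plain, unencoded form:

```python
from urllib.parse import quote

BASE_URL = "https://wikimedia.org/api/rest_v1/metrics/mediarequests"


def per_file_url(referer, agent, file_path, granularity, start, end):
    """Build a 'mediarequests per file' URL from an unencoded file path."""
    # safe="" ensures every "/" in the path becomes %2F, so the path
    # occupies a single segment of the URL.
    encoded_path = quote(file_path, safe="")
    return "/".join(
        [BASE_URL, "per-file", referer, agent, encoded_path, granularity, start, end]
    )
```

Calling `per_file_url("all-referers", "all-agents", "/wikipedia/commons/1/1a/Flag_of_Argentina.svg", "monthly", "2018010100", "2019010100")` reproduces the first example URL in this section.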

Top files by mediarequests

General endpoint template

https://wikimedia.org/api/rest_v1/metrics/mediarequests/top/{referer}/{media_type}/{year}/{month}/{day}

Sample return data

{
    items: [
        {
            referer: en.wikipedia,
            media_type: all-media-types,
            year: '1970',
            month: '01',
            day: '01',
            files: [ // one entry per ranked file
                {
                    file_path: '/wikipedia/en/0f/6f/manolo.jpg',
                    requests: 20981,
                    rank: 1
                }
            ]
        }
    ]
}

Examples

Get the most requested files across all Wikimedia projects for September 2019
https://wikimedia.org/api/rest_v1/metrics/mediarequests/top/all-referers/all-media-types/2019/09/all-days
Get the most requested documents in English Wikipedia for September 2019
https://wikimedia.org/api/rest_v1/metrics/mediarequests/top/en.wikipedia/document/2019/09/all-days
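
The top files template takes zero-padded date parts, with all-days standing in for a whole month as in the examples above. A sketch of a builder, where `top_url` is a hypothetical helper and treating a missing day as all-days is a convenience of this example, not API behaviour:

```python
from typing import Optional

BASE_URL = "https://wikimedia.org/api/rest_v1/metrics/mediarequests"


def top_url(referer: str, media_type: str, year: int, month: int,
            day: Optional[int] = None) -> str:
    """Build a 'top files by mediarequests' URL for a day or a whole month."""
    # Month and day are zero-padded to two digits, year to four.
    day_part = "all-days" if day is None else f"{day:02d}"
    return "/".join(
        [BASE_URL, "top", referer, media_type, f"{year:04d}", f"{month:02d}", day_part]
    )
```

Calling `top_url("all-referers", "all-media-types", 2019, 9)` reproduces the first example URL in this section, and passing `day=5` would request the ranking for 5 September only.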