Analytics/Systems/Presto/Query Logger

Revision as of 19:51, 9 November 2021 by imported>Razzi

The query logger is currently a work in progress. It will be built as specified in https://phabricator.wikimedia.org/T269832.

The repository is at https://github.com/razzius/wmf-presto-query-logger

We want to extract information from the `tasks` table in Presto's `runtime` schema, which has the following columns:

presto:runtime> describe tasks;
         Column          |   Type    | Extra | Comment
-------------------------+-----------+-------+---------
 node_id                 | varchar   |       |
 task_id                 | varchar   |       |
 stage_execution_id      | varchar   |       |
 stage_id                | varchar   |       |
 query_id                | varchar   |       |
 state                   | varchar   |       |
 splits                  | bigint    |       |
 queued_splits           | bigint    |       |
 running_splits          | bigint    |       |
 completed_splits        | bigint    |       |
 split_scheduled_time_ms | bigint    |       |
 split_cpu_time_ms       | bigint    |       |
 split_blocked_time_ms   | bigint    |       |
 raw_input_bytes         | bigint    |       |
 raw_input_rows          | bigint    |       |
 processed_input_bytes   | bigint    |       |
 processed_input_rows    | bigint    |       |
 output_bytes            | bigint    |       |
 output_rows             | bigint    |       |
 physical_written_bytes  | bigint    |       |
 created                 | timestamp |       |
 start                   | timestamp |       |
 last_heartbeat          | timestamp |       |
 end                     | timestamp |       |
(24 rows)
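
Each row in `tasks` describes a single task, so a query logger would probably need to roll the per-task counters up to one record per `query_id`. The following is a minimal Python sketch of that roll-up, not the logger's actual implementation. The choice of columns to sum, the names `SUMMED_COLUMNS`, `TASKS_QUERY` and `summarize_tasks`, and the fully qualified table name `system.runtime.tasks` are all assumptions made for illustration. The sketch also assumes that rows arrive as dictionaries keyed by column name and that the summed columns are never NULL.

```python
from collections import defaultdict

# Hypothetical selection of bigint counters from `tasks` to sum per query.
SUMMED_COLUMNS = (
    "split_cpu_time_ms",
    "raw_input_bytes",
    "raw_input_rows",
    "output_bytes",
    "output_rows",
)

# Assumption: the table is reachable as system.runtime.tasks.
TASKS_QUERY = (
    "SELECT query_id, " + ", ".join(SUMMED_COLUMNS) + " FROM system.runtime.tasks"
)


def summarize_tasks(rows):
    """Group task rows by query_id, counting tasks and summing the counters."""
    totals = defaultdict(lambda: dict.fromkeys(SUMMED_COLUMNS, 0))
    task_counts = defaultdict(int)
    for row in rows:
        query_id = row["query_id"]
        task_counts[query_id] += 1
        for column in SUMMED_COLUMNS:
            totals[query_id][column] += row[column]
    # One summary dict per query: the task count plus the summed counters.
    return {
        query_id: {"tasks": task_counts[query_id], **sums}
        for query_id, sums in totals.items()
    }
```

The rows returned by running `TASKS_QUERY` through any Presto client can be passed to `summarize_tasks` once they are converted to dictionaries. The same aggregation could instead be pushed into Presto with a `GROUP BY query_id`. Doing it client-side just keeps the sketch independent of any particular client library.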