Fundraising/techops/docs/analytics stack
Analytics Stack Components
| System | Description | Host:Port | Log Location |
|---|---|---|---|
| Trino | Trino is a query engine used to process analytics data in FR-Tech's data lake. The current implementation consists of one coordinator node and three worker nodes, each on its own host. | Coordinator: fransc2001:8443; Workers: | <host>/var/lib/trino/trino-data/var/log; query history in Trino or MariaDB under |
| Dagster | Dagster is an orchestrator used to schedule jobs that load or manipulate data. | fran2001:3000 | syslog, tagged with 'dagster'. Note that in vb the logs print to stdout rather than syslog, since we aren't using systemd there. |
| dbt | dbt (data build tool) is an open source framework for modeling data. It is essentially an orchestrator for SQL commands that ensures they run in an order that respects each data model's upstream and downstream dependencies. | fran2001 | syslog for the Dagster materialization logs; general dbt logs are in /srv/dagster_data/dbt_log/dbt.log; individual run logs are in: |
| Hive Metastore | Hive Metastore holds the metadata that Iceberg and Trino use to map Trino tables to file locations in minIO. | fransc2001:9083, or the same host as the Trino coordinator | syslog |
| Metabase | Metabase is a business intelligence tool used to visualize and analyze data. FR Analytics is in the process of migrating from Apache Superset to Metabase. | fran2001:9081 | syslog |
| Superset | Superset is a business intelligence tool. We are migrating from Superset to Metabase. | fran2001:9080 | syslog |
| minIO | minIO is an object storage tool that holds our data lake; the physical data files are stored in minIO. | franio200[1-3]:9000 | not currently logging |
| MariaDB | MariaDB is the database CiviCRM uses to hold all its data. | frdb2003 | syslog and /srv/sqldata/{hostname}-slow.log |
| dlt | dlt (data load tool) is an open source framework for loading data to and from various APIs and databases. | fran2001 | syslog, but without a dlt-specific tag; dlt logs are probably tagged with 'dagster', since dlt scripts run through Dagster. |
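
The Host:Port column mixes plain hosts, host:port pairs, and a bracketed range (franio200[1-3]:9000). As a rough sketch of how those entries could be turned into something scriptable, the Python below expands one entry into (host, port) tuples and offers a basic TCP reachability probe. The function names `expand_endpoint` and `is_reachable` are hypothetical helpers invented for this illustration, not part of any existing FR-Tech tooling, and the range syntax is assumed to be a single-digit `[a-b]` pattern as it appears in the table.

```python
import re
import socket

# Matches one bracketed single-digit range such as "[1-3]" in "franio200[1-3]:9000".
# Assumption: this is the only range syntax used in the table.
_RANGE = re.compile(r"\[(\d)-(\d)\]")


def expand_endpoint(entry, default_port=None):
    """Expand a Host:Port table entry into a list of (host, port) tuples."""
    host, sep, port = entry.partition(":")
    port = int(port) if sep else default_port
    match = _RANGE.search(host)
    if match is None:
        return [(host, port)]
    first, last = int(match.group(1)), int(match.group(2))
    # Substitute each number in the range back into the host name.
    return [
        (host[:match.start()] + str(n) + host[match.end():], port)
        for n in range(first, last + 1)
    ]


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `expand_endpoint("franio200[1-3]:9000")` yields the three minIO hosts, each of which could then be passed to `is_reachable`. Entries listed without a port (such as fran2001 for dbt) come back with the `default_port` value, so a port has to be supplied before probing them. A successful TCP connection only shows that something is listening, not that the service is healthy.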
On every host, syslog is at /var/log/syslog.
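
Because several components share syslog and are told apart only by tag, it can help to filter the file by tag. The following is a minimal sketch, assuming the traditional syslog line layout (`Mon DD HH:MM:SS host tag[pid]: message`); hosts configured for a different timestamp format would need an adjusted pattern. The helper name `lines_with_tag` is hypothetical.

```python
import re

# Assumed traditional syslog layout: "Mon DD HH:MM:SS host tag[pid]: message".
_LINE = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d+\s[\d:]{8})\s"
    r"(?P<host>\S+)\s"
    r"(?P<tag>[^\s:\[]+)(?:\[(?P<pid>\d+)\])?:\s?"
    r"(?P<message>.*)$"
)


def lines_with_tag(lines, tag):
    """Yield (host, message) for each syslog line whose tag equals `tag`."""
    for line in lines:
        match = _LINE.match(line.rstrip("\n"))
        if match and match.group("tag") == tag:
            yield match.group("host"), match.group("message")
```

To use it against a real log, open /var/log/syslog and pass the file object together with a tag such as 'dagster'. Lines that do not match the assumed layout are skipped silently, which keeps the filter tolerant of multi-line messages but can also hide entries written in another format.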
Analytics How-to's
Fundraising/techops/docs/analytics stack/how to guides
Troubleshooting
Fundraising/techops/docs/analytics stack/troubleshooting