Data Catalog Application Evaluation/Rubric/Amundsen: Difference between revisions

imported>Neil P. Quinn-WMF
(Neil P. Quinn-WMF moved page Data Catalog Application Evaluation/Rubric/Amundsen to 2021 data catalog selection/Rubric/Amundsen: Clarify this is not a living document and use title case)
=== Core Service and Dependency Setup ===
#REDIRECT [[2021 data catalog selection/Rubric/Amundsen]]
=== Ingestion Configuration ===
=== Progress Status ===
=== Perceptions ===
=== Outcome ===
== Razzi's take on Amundsen ==
- Simple architecture: 3 Flask services, all in Python (as opposed to DataHub, which uses Java and Python)
- Ingestion architecture is simple: Python scripts or Airflow DAGs that make HTTP API requests
- "Social" UI features, like frequent users and owners
- Loose coupling means you can use a relational database as the data store rather than Neo4j
- The community seems to be losing steam: there was a flurry of events in 2019 and 2020 but nothing in 2021
- Only supports polling for data updates, unless we also deploy Atlas. A push ingest API is on their roadmap
- Documentation is somewhat lacking: there are few ingestion examples, and the docs contain broken links
- Some dependencies are getting out of date: Elasticsearch version 6 (v7 was released in 2019), Node.js version 12 (v13 was released in 2019)
[[File:Screenshot of Amundsen home page.png|center|thumb|700x700px|The Amundsen home page running in Docker, after loading their small sample dataset from example/scripts/]]
[[File:Amundsen README screenshot.png|Summary of Amundsen from the README on GitHub.|thumb|841x841px]]
Amundsen was created by Lyft and is now hosted by the Linux Foundation.
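
The ingestion and polling points in the list above can be illustrated with a rough sketch: a script that compares two polled snapshots of table metadata and builds an HTTP request for each table that changed. This is only an illustration of the general pattern, not Amundsen's actual API. The base URL, the <code>/table</code> path, the payload shape, and the helper names <code>build_ingest_request</code> and <code>changed_tables</code> are all assumptions made for this example.

```python
import json
from urllib import request

# Assumption: placeholder address for a metadata service running locally.
METADATA_BASE_URL = "http://localhost:5002"


def build_ingest_request(base_url: str, table: dict) -> request.Request:
    """Build (but do not send) an HTTP PUT carrying one table's metadata.

    The "/table" path and the JSON payload shape are hypothetical.
    """
    body = json.dumps(table).encode("utf-8")
    return request.Request(
        url=f"{base_url.rstrip('/')}/table",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def changed_tables(previous: dict, current: dict) -> list:
    """Polling step: names of tables that are new or whose metadata differs."""
    return sorted(
        name for name, meta in current.items() if previous.get(name) != meta
    )


if __name__ == "__main__":
    # Two snapshots, as a polling job might collect on consecutive runs.
    previous = {"edits": {"columns": ["id"]}}
    current = {
        "edits": {"columns": ["id", "user"]},
        "pageviews": {"columns": ["ts"]},
    }
    for name in changed_tables(previous, current):
        req = build_ingest_request(
            METADATA_BASE_URL, {"name": name, **current[name]}
        )
        # request.urlopen(req) would send the request; it is left out here
        # so the sketch runs without a live service.
        print(req.get_method(), req.full_url)
```

The same two functions could be called from an Airflow task instead of a standalone script. Because this approach polls, a change only reaches the catalog on the next scheduled run.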

Latest revision as of 18:04, 4 July 2022