Data Catalog Application Evaluation Rubric/data-catalog-evaluation server notes
Revision as of 18:36, 20 December 2021
All files are currently in /home/razzi on data-catalog-evaluation.analytics.eqiad1.wikimedia.cloud.
Docker setup
Install docker:
sudo apt-get install docker.io
Download docker-compose and copy it to /usr/libexec/docker/cli-plugins/ to install it as a Docker CLI plugin:
sudo mkdir -p /usr/libexec/docker/cli-plugins/
wget https://github.com/docker/compose/releases/download/v2.2.2/docker-compose-linux-x86_64
sudo cp docker-compose-linux-x86_64 /usr/libexec/docker/cli-plugins/docker-compose
sudo chmod +x /usr/libexec/docker/cli-plugins/docker-compose
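The install steps above could also be wrapped in a small script so the Compose version is set in one place. This is only a sketch: the helper names `compose_url` and `install_compose` are hypothetical, and the release asset naming pattern is inferred from the single URL above rather than confirmed for other versions or architectures.

```shell
# Hypothetical helper: build the release URL for a Compose version and CPU
# architecture. The naming pattern is an assumption based on the URL above.
compose_url() {
    version="$1"   # e.g. v2.2.2
    arch="$2"      # e.g. x86_64
    printf 'https://github.com/docker/compose/releases/download/%s/docker-compose-linux-%s\n' \
        "$version" "$arch"
}

# Hypothetical helper: download the binary and install it as an executable
# Docker CLI plugin. Nothing runs until the function is called.
install_compose() {
    url="$(compose_url "$1" "$(uname -m)")"
    plugin_dir=/usr/libexec/docker/cli-plugins
    sudo mkdir -p "$plugin_dir" &&
    wget -O docker-compose "$url" &&
    sudo install -m 0755 docker-compose "$plugin_dir/docker-compose"
}

# Usage:
#   install_compose v2.2.2 && sudo docker compose version
```

Running `sudo docker compose version` afterwards is a quick way to confirm that Docker has picked up the plugin.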
Nginx setup
Run nginx docker container in foreground (this will be served at data-catalog-evaluation.wmcloud.org):
sudo docker run -it -p 80:80 nginx
Run nginx docker container as daemon, serving the current directory as the web root:
sudo docker run -d -p 80:80 -v $(pwd):/usr/share/nginx/html nginx
Contents of /home/razzi/static/index.html:
<html>
<style>
body { font-family: arial; }
</style>
Welcome to <a href="/">data-catalog-evaluation.wmcloud.org</a>
<h2>Links:</h2>
<ul>
<li> <a href="#">Atlas (TODO)</a>
</ul>
<p><a href="https://wikitech.wikimedia.org/wiki/Data_Catalog_Application_Evaluation_Rubric">Read about the data catalog evaluation</a></p>
<p>
This file lives at data-catalog-evaluation.analytics.eqiad1.wikimedia.cloud:/home/razzi/static/index.html and the author (me, Razzi) gives you permission to edit it, wiki-style.
</p>
</html>
Data Catalogs
Atlas
Clone the Apache Atlas Docker repository (currently at /home/razzi/apache-atlas-docker):
git clone https://github.com/sonnyhcl/apache-atlas-docker
Run atlas:
sudo docker compose up
Atlas is accessible from localhost (curl -u admin:admin http://localhost:21000/api/atlas/admin/version works) but not via the web proxy at atlas-demo.wmcloud.org. It also works with ssh port forwarding:
ssh -NL 21000:data-catalog-evaluation.analytics.eqiad1.wikimedia.cloud:21000 data-catalog-evaluation.analytics.eqiad1.wikimedia.cloud
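but it does not work with the proxy, which maps atlas-demo.wmcloud.org to http://172.16.3.184:21000. The proxy looks to be configured properly but is not working.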
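To compare the access paths described above, a small check along these lines could report the HTTP status returned by each one. It is a sketch only: the helper names `atlas_version_url` and `probe_atlas` are hypothetical, and the https scheme for the proxy hostname is an assumption.

```shell
# Hypothetical helper: build the Atlas version-endpoint URL for a base URL,
# dropping one trailing slash if present.
atlas_version_url() {
    printf '%s/api/atlas/admin/version\n' "${1%/}"
}

# Hypothetical helper: print the HTTP status code returned for a base URL.
# curl reports 000 when it receives no HTTP response at all.
probe_atlas() {
    curl -s -o /dev/null --max-time 10 -u admin:admin \
        -w '%{http_code}\n' "$(atlas_version_url "$1")"
}

# Usage (on the VM, or locally with the ssh tunnel open):
#   probe_atlas http://localhost:21000
#   probe_atlas https://atlas-demo.wmcloud.org   # scheme is an assumption
```

A working status from localhost alongside an error or 000 from the proxy hostname would point to the proxy layer rather than to Atlas itself.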
For now, since it's the only data catalog being evaluated, it's running on port 80 and can be accessed at data-catalog-evaluation.wmcloud.org.
Workaround options:
- Have traffic go to data-catalog-evaluation.wmcloud.org and use nginx as a reverse proxy
  - can use subdomains or paths
- Run each application on a different virtual machine
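The nginx reverse proxy option could look roughly like the sketch below, which writes a server block that forwards one subdomain to Atlas. The subdomain `atlas.data-catalog-evaluation.wmcloud.org` and the file name `atlas-proxy.conf` are hypothetical, the upstream address 172.16.3.184:21000 is the one shown in the proxy settings screenshot, and none of this has been tested against the Atlas UI.

```shell
# Write a hypothetical nginx server block to a scratch directory.
conf_file="$(mktemp -d)/atlas-proxy.conf"

# The quoted heredoc delimiter keeps nginx variables such as $host literal.
cat > "$conf_file" <<'EOF'
server {
    listen 80;
    # Assumed subdomain; any name that resolves to this VM would do.
    server_name atlas.data-catalog-evaluation.wmcloud.org;

    location / {
        # Assumed upstream: the VM address and the Atlas port.
        proxy_pass http://172.16.3.184:21000;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
EOF
echo "wrote $conf_file"

# To try it, mount the file into the nginx container's conf.d directory:
#   sudo docker run -d -p 80:80 \
#     -v "$conf_file":/etc/nginx/conf.d/atlas-proxy.conf:ro nginx
```

A subdomain per application avoids rewriting paths inside each catalog's UI. Path-based routing would need only one hostname, but the application may have to be told about its URL prefix.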
Loading Atlas sample data
TODO
Amundsen
TODO
DataHub
TODO
Egeria
TODO (User:milimetric is working on this)
Marquez
TODO
Data Sources
- MySQL TBD
- Hadoop TBD https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SingleCluster.html
- Hive TBD