You are browsing a read-only backup copy of Wikitech. The primary site can be found at wikitech.wikimedia.org

Data Catalog Application Evaluation/Rubric/data-catalog-evaluation server notes


All files are currently in /home/razzi on data-catalog-evaluation.analytics.eqiad1.wikimedia.cloud.

Docker setup

Install docker:

sudo apt-get install docker.io

Download docker-compose and copy it to /usr/libexec/docker/cli-plugins to install it as a Docker CLI plugin:

sudo mkdir -p /usr/libexec/docker/cli-plugins/
wget https://github.com/docker/compose/releases/download/v2.2.2/docker-compose-linux-x86_64
sudo cp docker-compose-linux-x86_64 /usr/libexec/docker/cli-plugins/docker-compose
sudo chmod +x /usr/libexec/docker/cli-plugins/docker-compose
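
The download and copy steps above can be wrapped in a small helper so the Compose version is easy to change. This is only a sketch, not the procedure that was run on the server: the function names compose_url and install_compose are made up for illustration, and the release asset naming is assumed to follow the v2.2.2 URL shown above.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative, not from the original notes.

# Print the release URL for a given Compose version and CPU architecture.
# Assumes the asset naming matches the v2.2.2 URL used above.
compose_url() {
    version="$1"
    arch="${2:-x86_64}"
    echo "https://github.com/docker/compose/releases/download/${version}/docker-compose-linux-${arch}"
}

# Download the plugin and install it system-wide with the executable bit set.
install_compose() {
    plugin_dir=/usr/libexec/docker/cli-plugins
    tmp="$(mktemp)" || return 1
    wget -O "$tmp" "$(compose_url "$1" "$2")" || return 1
    sudo mkdir -p "$plugin_dir" || return 1
    sudo install -m 755 "$tmp" "$plugin_dir/docker-compose"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Usage (not run automatically):
#   install_compose v2.2.2
#   docker compose version
```

Running docker compose version afterwards is a quick way to confirm that Docker picked up the plugin.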

Nginx setup

Run nginx docker container in foreground (this will be served at data-catalog-evaluation.wmcloud.org):

sudo docker run -it -p 80:80 nginx

Run nginx docker container as daemon, serving the current directory as the web root:

sudo docker run -d -p 80:80 -v "$(pwd)":/usr/share/nginx/html nginx

Data Catalogs

Atlas

Clone the Apache Atlas Docker repository (currently at /home/razzi/apache-atlas-docker):

git clone https://github.com/sonnyhcl/apache-atlas-docker

Run Atlas from inside the cloned directory:

sudo docker compose up

Atlas is reachable from the server itself (curl -u admin:admin http://localhost:21000/api/atlas/admin/version works) but not via the web proxy at atlas-demo.wmcloud.org. SSH port forwarding works:

ssh -NL 21000:data-catalog-evaluation.analytics.eqiad1.wikimedia.cloud:21000 data-catalog-evaluation.analytics.eqiad1.wikimedia.cloud

The web proxy does not, even though it appears to be configured correctly (settings shown in the screenshot below).

File:Proxy settings for data-catalog-evaluation.analytics.eqiad1.wikimedia.cloud.png
Proxy looks to be configured properly but is not working.
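
Until the proxy works, the SSH forwarding approach can be extended to several ports at once, which may be useful once more than one catalog is running. The sketch below only prints the command; the function name tunnel_cmd is hypothetical, and 21000 is the only port taken from these notes.

```shell
#!/bin/sh
# Hypothetical helper: print an ssh command that forwards each given port
# from the local machine to the same port on the remote host.
tunnel_cmd() {
    host="$1"
    shift
    cmd="ssh -N"
    for port in "$@"; do
        cmd="$cmd -L ${port}:${host}:${port}"
    done
    echo "$cmd $host"
}

# Usage (not run automatically):
#   tunnel_cmd data-catalog-evaluation.analytics.eqiad1.wikimedia.cloud 21000
#   # run the printed command, then in another terminal:
#   curl -u admin:admin http://localhost:21000/api/atlas/admin/version
```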

For now, since Atlas is the only data catalog being evaluated, it is running on port 80 and can be accessed at data-catalog-evaluation.wmcloud.org.

Workaround options:

  • Send all traffic to data-catalog-evaluation.wmcloud.org and use nginx as a reverse proxy
    • catalogs can be distinguished by subdomain or by path
  • Run each application on a different virtual machine
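
The reverse-proxy option could look roughly like the sketch below, which prints one nginx server block per catalog using subdomains. Apart from port 21000 and the base hostname, everything here is an assumption: the atlas. subdomain, the proxy_server_block function name, and the upstream address 172.17.0.1 (commonly the Docker bridge gateway on Linux, but this should be checked on the host) are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: print an nginx server block that forwards one
# subdomain to a catalog listening on the given upstream host:port.
proxy_server_block() {
    server_name="$1"   # e.g. atlas.data-catalog-evaluation.wmcloud.org (assumed name)
    upstream="$2"      # e.g. 172.17.0.1:21000 (assumed address; Atlas port from above)
    cat <<EOF
server {
    listen 80;
    server_name ${server_name};

    location / {
        proxy_pass http://${upstream};
        proxy_set_header Host \$host;
        proxy_set_header X-Real-IP \$remote_addr;
    }
}
EOF
}

# Usage (not run automatically): write the config, then mount it into the container.
#   proxy_server_block atlas.data-catalog-evaluation.wmcloud.org 172.17.0.1:21000 > atlas.conf
#   sudo docker run -d -p 80:80 -v "$(pwd)/atlas.conf:/etc/nginx/conf.d/atlas.conf:ro" nginx
```

Subdomains are used here rather than paths because applications that generate absolute links tend to break when served under a path prefix; whether that applies to each catalog would need to be tested.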

Loading Atlas sample data

TODO

Amundsen

TODO

DataHub

TODO

Egeria

TODO User:milimetric is working on this

Marquez

TODO

Data Sources