Merge pull request #50 from arangoml/create_test_data_on_init

Test data generated on init.
arangoml · Oct 4, 2019 · cdc4073
2 parents 8cfe6c2 + cb17f6a
Showing 9 changed files with 30 additions and 131 deletions.
diff --git a/Changelog.md b/Changelog.md
@@ -6,5 +6,6 @@
 * Added Summary View feature to the UI
 * Migrated to the latest python-arango 5.2.0
 * Minor UI tweaks and fixes
+* Test data generated on init
 
 ## 0.1 (Initial Release)
diff --git a/README.md b/README.md
@@ -68,62 +68,45 @@ To facilitate an easy start, docker containers for *torch* and *tensorflow* are
 
     ` docker run -p 6529:8529 -p 8888:8888 -p 3000:3000 -it arangopipe/ap_torch`
 
-2. We will now setup the container with some test data. To do so:
-    * Execute a `docker ps` command to get the *CONTAINER ID* of the running container. 
-    * You can then get to a shell in the container using the command: `docker exec`*CONTAINER ID*. 
-    * Once you are in the container shell, you can generate test data to try **Arangopipe** using the `test_data_generator` utility provided with **Arangopipe**. Simply follow the steps below.
-    - `cd examples/test_data_generator/`
-     - `ipython`
-    - `from generate_model_data import generate_runs`
-     - `generate_runs()`
-    - `exit`
-
-
-
-3. Running an example in the *torch* container: The _pytorch_ example is a python script. To run it, you will have to use the `docker ps` command and get to the shell in the container using the `docker exec` command. These steps are similar to what you would have done in the previous step to generate test data. Change directory to the `examples/pytorch` directory. The *torch* container provides an example of a linear regression model that uses **Arangopipe** to log experiment metadata. The experiment meta data includes information about the dataset, featureset and optimization settings used to run the *pytorch* model. Once you are in the shell of the *torch* container, run the driver program that develops the torch model and logs the experiment meta-data to *arangopipe*. To run the driver program, launch an `ipython` shell. In the shell, execute the following:
+
+
+2. Running an example in the *torch* container: The _pytorch_ example is a python script. To run it:
+    * Run the `docker ps` command to get the `CONTAINER ID` of the _pytorch_ container.
+    * Run the command ` docker exec -it [ CONTAINER ID ] /bin/bash ` where  `CONTAINER ID` is obtained from the previous step.
+    * 
+Change directory to the `examples/pytorch` directory. The *torch* container provides an example of a linear regression model that uses **Arangopipe** to log experiment metadata. The experiment meta data includes information about the dataset, featureset and optimization settings used to run the *pytorch* model. To run the example, launch an `ipython` shell. In the shell, execute the following:
     1. `from ch_torch_linear_regression_driver import run_driver`
     2. `run_driver()`
 
     The details are shown in the figure below.
 
     <img src="run_torch_driver.png" height="400">
 
-4. Execute this step after the model development step above has completed. Point your browser http:localhost:3000. Login to the Arangopipe user interface with username  root and password  `open sesame`. Select `Models` in the `Search Metadata` content pane. You should see the model you developed in the previous step. The details are shown in the figure below.
+3. Execute this step after the model development step above has completed. Point your browser http:localhost:3000. Login to the Arangopipe user interface with username  root and password  `open sesame`. Select `Models` in the `Search Metadata` content pane. You should see the model you developed in the previous step. The details are shown in the figure below.
 
 
     <img src="pytorch_model_FE.png" height="400">
 
-5. Explore Arangopipe [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/arangoml/arangopipe/0.1?filepath=arangopipe%2Farangopipe_examples_torch.ipynb). Examples that show **Arangopipe** can be used with *hyperopt*, *sklearn* and *mlfow* are provided. To get the details of where these examples are located in the container, use the binder link above. To access the notebook examples provided with the docker container, point your browser to:  `http://localhost:8888` to get to a **Jupyter** notebook. The default notebook password is _root_
+4. Explore Arangopipe [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/arangoml/arangopipe/0.1?filepath=arangopipe%2Farangopipe_examples_torch.ipynb). Examples that show **Arangopipe** can be used with *hyperopt*, *sklearn* and *mlfow* are provided. To get the details of where these examples are located in the container, use the binder link above. To access the notebook examples provided with the docker container, point your browser to:  `http://localhost:8888` to get to a **Jupyter** notebook. The default notebook password is _root_
 
-6. Point your browser to: `http://localhost:6529` to get to the **ArangoDB** web user interface. The `root` password is `open sesame`.
+5. Point your browser to: `http://localhost:6529` to get to the **ArangoDB** web user interface. The `root` password is `open sesame`.
 
 ### Tensorflow
 
 1. Start the container:
 
     ` docker run -p 6529:8529 -p 8888:8888 -p 3000:3000 -it arangopipe/ap_tensor_flow`
 
-2. We will now setup the container with some test data. To do so:
-    * Execute a `docker ps` command to get the *CONTAINER ID* of the running container. 
-    * You can then get to a shell in the container using the command: `docker exec`*CONTAINER ID*. 
-    * Once you are in the container shell, you can generate test data to try **Arangopipe** using the `test_data_generator` utility provided with **Arangopipe**. Simply follow the steps below.
-    - `cd examples/test_data_generator/`
-     - `ipython`
-    - `from generate_model_data import generate_runs`
-     - `generate_runs()`
-    - `exit`
-
-
 
-3. Running an example in the *tensorflow* container: Run the tensorflow container. Point your browser to http://localhost:8888. You will be prompted for a password. Use `root` for the password. In the file browser that is presented in the Jupyter notebook, open the `examples` directory and then open the  `TFX` directory. Open the notebook `tfx_metadata_integration.ipynb`. Read the description of the notebook. This notebook provides an example of how **Arangopipe** can be used with *tensorflow*. The utility of the multi-model feature of **ArangoDB** is leveraged in this example. [Tensorflow Data Validation](https://www.tensorflow.org/tfx/data_validation/get_started) is used to generate the summary statistics for a dataset. This *tensorflow* artifact can be stored in **Arangopipe** and reused as needed. This capability is illustrated in this notebook.
+2. Running an example in the *tensorflow* container: Run the tensorflow container. Point your browser to http://localhost:8888. You will be prompted for a password. Use `root` for the password. In the file browser that is presented in the Jupyter notebook, open the `examples` directory and then open the  `TFX` directory. Open the notebook `tfx_metadata_integration.ipynb`. Read the description of the notebook. This notebook provides an example of how **Arangopipe** can be used with *tensorflow*. The utility of the multi-model feature of **ArangoDB** is leveraged in this example. [Tensorflow Data Validation](https://www.tensorflow.org/tfx/data_validation/get_started) is used to generate the summary statistics for a dataset. This *tensorflow* artifact can be stored in **Arangopipe** and reused as needed. This capability is illustrated in this notebook.
 
-4.  Execute this step after you have executed all the cells in the notebook discussed in the previous step. Point your browser to http://localhost:3000. Login to the Arangopipe user interface with username  root and password  `open sesame`. Select `Featursets` in the `Search Metadata` content pane. You should see the featureset logged with **Arangopipe** resulting from executing the notebook discussed in the previous step.
+3.  Execute this step after you have executed all the cells in the notebook discussed in the previous step. Point your browser to http://localhost:3000. Login to the Arangopipe user interface with username  root and password  `open sesame`. Select `Featursets` in the `Search Metadata` content pane. You should see the featureset logged with **Arangopipe** resulting from executing the notebook discussed in the previous step.
 
     <img src="tensorflow_example.png" height="400">
 
-5. Explore Arangopipe [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/arangoml/arangopipe/0.1?filepath=arangopipe%2Farangopipe_examples.ipynb). Examples that show **Arangopipe** can be used with *hyperopt*, *sklearn* and *mlfow* are provided. To get the details of where these examples are located in the container, use the binder link above. To access the notebook examples provided with the docker container, point your browser to:  `http://localhost:8888` to get to a **Jupyter** notebook. The default notebook password is _root_
+4. Explore Arangopipe [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/arangoml/arangopipe/0.1?filepath=arangopipe%2Farangopipe_examples.ipynb). Examples that show **Arangopipe** can be used with *hyperopt*, *sklearn* and *mlfow* are provided. To get the details of where these examples are located in the container, use the binder link above. To access the notebook examples provided with the docker container, point your browser to:  `http://localhost:8888` to get to a **Jupyter** notebook. The default notebook password is _root_
 
-6. Point your browser to: `http://localhost:6529` to get to the **ArangoDB** web user interface. The `root` password is `open sesame`.
+5. Point your browser to: `http://localhost:6529` to get to the **ArangoDB** web user interface. The `root` password is `open sesame`.
 
 
 

diff --git a/arangopipe/Dockerfile_TF b/arangopipe/Dockerfile_TF
diff --git a/arangopipe/Dockerfile_TFFE b/arangopipe/Dockerfile_TFFE
@@ -14,15 +14,9 @@ MAINTAINER Joerg Schad <info@arangodb.com>
 ENV GIT_PYTHON_REFRESH=quiet
 RUN apt-get update
 RUN apt-get install -y python-pip
-#RUN pip install mlflow hyperopt sklearn2 jsonpickle python-arango==4.4.0
-#RUN pip install -i https://test.pypi.org/simple/ arangopipe
-RUN pip install mlflow hyperopt sklearn2 jsonpickle python-arango
-RUN pip install arangopipe
+RUN pip install mlflow hyperopt sklearn2 jsonpickle python-arango arangopipe jupyter matplotlib tensorflow-data-validation PyYAML==5.1.1
 RUN mkdir -p /workspace
-RUN pip install jupyter
-RUN pip install matplotlib
-RUN pip install tensorflow-data-validation
-RUN pip install PyYAML==5.1.1
+
 WORKDIR /
 COPY --from=0 / .
 WORKDIR /workspace/experiments

diff --git a/arangopipe/Dockerfile_Torch b/arangopipe/Dockerfile_Torch
diff --git a/arangopipe/Dockerfile_Torch_FE b/arangopipe/Dockerfile_Torch_FE
@@ -12,17 +12,11 @@ FROM continuumio/miniconda3
 MAINTAINER Joerg Schad <info@arangodb.com>
 ENV GIT_PYTHON_REFRESH=quiet
 RUN apt-get update
-RUN apt-get install -y python-pip
-#RUN pip install mlflow hyperopt sklearn2 jsonpickle python-arango==4.4.0
-#RUN pip install -i https://test.pypi.org/simple/ arangopipe
-RUN pip install mlflow hyperopt sklearn2 jsonpickle python-arango
-RUN pip install arangopipe
+RUN apt-get install -y python-pip curl
+RUN pip install mlflow hyperopt sklearn2 jsonpickle python-arango arangopipe jupyter matplotlib PyYAML==5.1.1
 RUN mkdir -p /workspace
-RUN pip install jupyter
-RUN pip install matplotlib
-RUN pip install PyYAML==5.1.1
 RUN conda install pytorch cudatoolkit=10.0 -c pytorch 
-#RUN git clone git@github.com:arangoml/arangopipe.git /workspace
+
 WORKDIR /
 COPY --from=0 / .
 WORKDIR /workspace/experiments

diff --git a/arangopipe/makefile b/arangopipe/makefile
@@ -2,16 +2,16 @@ SRC = arangopipe/*.py
 DOCKER_FILE = Dockerfile
 DOCKER_PASSWORD = <the_password>
 DOCKER_REPO = arangopipe
-DOCKER_SI_FILE = Dockerfile_Torch_FE
-DOCKER_SI_IMG_NAME = ap_torch
+DOCKER_SI_FILE = Dockerfile_TFFE
+DOCKER_SI_IMG_NAME = ap_tensor_flow
 TEST_PYPI_PASSWORD = <the_password>
 
 python_arangopipe:$(SRC)
 	python3 setup.py sdist bdist_wheel
 upload_test_pypi:
 	twine upload --repository-url https://test.pypi.org/legacy/  -u rajiv.sambasivan -p $(TEST_PYPI_PASSWORD) dist/*
 docker_APSI_build:$(DOCKER_SI_FILE)
-	docker build --no-cache -t $(DOCKER_SI_IMG_NAME) -f $(DOCKER_SI_FILE) .
+	docker build --no-cache  -t $(DOCKER_SI_IMG_NAME) -f $(DOCKER_SI_FILE) .
 docker_publish_SI_latest:
 	@echo 'starting docker SI build...'
 	docker login --username arangopipe --password $(DOCKER_PASSWORD)

diff --git a/arangopipe/startup_commands.sh b/arangopipe/startup_commands.sh
@@ -1,6 +1,11 @@
 #!/bin/bash
 arangod --database.password="open sesame"&
-
 jupyter notebook --allow-root --notebookdir=/workspace/experiments  --ip=0.0.0.0 --port=8888 --no-browser&
-
+while [[ "$(curl -sL -w "%{http_code}\\n" "http://localhost:8529" -o /dev/null)" != "200" ]]; do
+echo "Waiting for arangod"
+sleep 5
+done
+echo "arangod is up!"
+export PYTHONPATH=$PYTHONPATH:/workspace/experiments/examples/test_data_generator
+python -c "from generate_model_data import generate_runs; generate_runs()"
 npm start
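
Note on the hunk above: the wait loop added to `startup_commands.sh` polls `http://localhost:8529` every 5 seconds until it returns HTTP 200, then runs `generate_runs()`. As written in the diff, the loop has no upper bound, so it would spin indefinitely if `arangod` never came up. Below is a minimal Python sketch of the same polling idea with an attempt limit added. It is not part of this commit: the names `http_status` and `wait_for_ok`, the `max_attempts` limit, and the injectable `probe` and `sleep` parameters are all assumptions made for illustration.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, Optional


def http_status(url: str, timeout: float = 3.0) -> Optional[int]:
    """Return the HTTP status code for url, or None if no response arrived."""
    try:
        # urlopen follows redirects, similar to curl's -L flag.
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status code.
        return err.code
    except OSError:
        # Connection refused, timeout, or similar: the server is not up yet.
        return None


def wait_for_ok(
    url: str,
    probe: Callable[[str], Optional[int]] = http_status,
    interval: float = 5.0,
    max_attempts: int = 60,  # hypothetical limit; the shell loop has none
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll url until probe reports 200; give up after max_attempts tries."""
    for attempt in range(1, max_attempts + 1):
        if probe(url) == 200:
            return True
        print(f"Waiting for {url} (attempt {attempt}/{max_attempts})")
        if attempt < max_attempts:
            # Pause only between attempts, not after the final one.
            sleep(interval)
    return False
```

A caller could check `wait_for_ok("http://localhost:8529")` and only import and run `generate_runs()` when it returns `True`, exiting with an error otherwise. Passing `probe` and `sleep` as parameters keeps the loop testable without a running database.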
diff --git a/arangopipe/tests/test_data_generator/generate_model_data.py b/arangopipe/tests/test_data_generator/generate_model_data.py
@@ -154,6 +154,4 @@ def generate_runs(clean = False):
         ap.log_serving_perf(ex_servingperf, deployment_tag, user_id)
 
     return
-
-