
pyqae

Python/PySpark image query analysis engine

Pyqae is a Python-based tool for processing and analyzing image stacks, both locally and on top of PySpark

install

The core pyqae package defines the basic data structures and read/write operations for image stacks

Conda

Create a new environment using the environment file in the binder folder

conda env create -f binder/environment.yml

Activate the new environment, then install the remaining packages and tools using pip from the root directory of the package

pip install .
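
To check that the installation succeeded, importing the package should work. A minimal smoke test:

# minimal smoke test after installation
import pyqae
print(pyqae.__file__)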

Binder/Docker

You can use repo2docker to make a self-contained docker image directly from this repository

pip install jupyter-repo2docker

Dry-run

You can see what will be built by performing a dry run in the local directory

repo2docker --debug --no-build .

or build and run the image using

repo2docker .

other notes

Pyqae is built on numpy, scipy, scikit-learn, and scikit-image, and is compatible with Python 2.7+ and 3.4+.

The official installation procedure is to first run

pip install -r requirements.txt

and then run

python setup.py install

related packages

There are a number of related tools that pyqae uses for analysis

  • keras: deep learning wrapper tools
  • elephas: distributed code for ML and Keras on Spark
  • tensorflow: core deep learning code
  • thunder: image and sequence analysis

You can install the ones you want with pip, for example

pip install thunder-python
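
Once installed, thunder can also be used on its own in local mode without any Spark setup. A minimal sketch (the random test data and its shape below are just for illustration):

import thunder as td

# generate a small random image stack in local mode (no Spark required)
data = td.images.fromrandom(shape=(10, 50, 50))

print(data.shape)         # (10, 50, 50): a stack of 10 images, each 50x50 pixels
print(data.mean().shape)  # the mean image across the stack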

using with spark

Thunder doesn't require Spark and can run locally without it, but Spark and Thunder work great together! To install and configure a Spark cluster, consult the official Spark documentation. Thunder supports Spark version 1.5+, and uses the Python API PySpark. If you have Spark installed, you can install Thunder just by calling pip install thunder-python on both the master node and all worker nodes of your cluster. Alternatively, you can clone this GitHub repository, and make sure it is on the PYTHONPATH of both the master and worker nodes.

Once you have a running cluster with a valid SparkContext — this is created automatically as the variable sc if you call the pyspark executable — you can pass it as the engine to any of Thunder's loading methods, and this will load your data in distributed 'spark' mode. In this mode, all operations will be parallelized, and chained operations will be lazily executed.
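
For example, with a live SparkContext you can pass it as the engine to one of the loaders. A short sketch, assuming sc exists and using a placeholder path for your own data:

import thunder as td

# `sc` is the SparkContext created by the pyspark executable
# the path below is a placeholder, not a real dataset
data = td.images.fromtif('/path/to/tif/stack', engine=sc)

# in 'spark' mode operations are parallelized and chained lazily
result = data.median_filter(size=2).mean()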

using notebooks with pyspark

PYSPARK_PYTHON=/Users/mader/anaconda/bin/python \
PYSPARK_DRIVER_PYTHON=jupyter \
PYSPARK_DRIVER_PYTHON_OPTS="notebook --ip 0.0.0.0" \
/Applications/spark-2.1.1-bin-hadoop2.7/bin/pyspark \
  --driver-memory 8g --master local[8]

using an environment

PYSPARK_PYTHON=/Users/mader/anaconda/envs/py27/bin/python \
PYSPARK_DRIVER_PYTHON=jupyter \
PYSPARK_DRIVER_PYTHON_OPTS="notebook --ip 0.0.0.0" \
/Applications/spark-2.1.1-bin-hadoop2.7/bin/pyspark \
  --driver-memory 8g --master local[8]

or the old version

PYSPARK_DRIVER_PYTHON=ipython PYSPARK_DRIVER_PYTHON_OPTS=notebook /Volumes/ExDisk/spark-2.0.0-bin-hadoop2.7/bin/pyspark
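
In a notebook started this way, the SparkContext is already available as the variable sc. A quick sanity check:

# run inside the launched notebook; `sc` is created by pyspark
print(sc.version)

# a trivial parallel job to confirm the local workers respond
rdd = sc.parallelize(range(100))
print(rdd.sum())  # 4950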
