Pandas Data Exploration Utility is an interactive, notebook-based library for quickly profiling a dataset and exploring both the shape of the data and the relationships within it. Built on existing APIs from ipywidgets, Plot.ly, and Pandas, it provides a flexible point-and-click widget for exploring and visualizing a dataset.
This is a work in progress, and I welcome any suggestions on features and/or enhancements.
pip install Pandas-Data-Exploration-Utility-Package
import pandas as pd
import pandas_exploration_util.viz.explore as pe
global_temp = pd.read_csv("./data/GlobalTemperatures.csv", parse_dates=[0], infer_datetime_format=True)
pe.generate_widget(global_temp)
See /test for sample data and a test Jupyter notebook:
https://github.com/yifeihuang/pandas_exploration_util/tree/master/test
Visualize the top values of any column, ranked by an aggregation of any other column. Supported aggregation functions are 'count', 'sum', 'mean', 'std', 'max', 'min', and 'uniques'.
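To make the ranking concrete, here is a minimal pandas sketch of the kind of computation this view implies. It is an illustration under stated assumptions, not the library's actual implementation: the helper name `top_values`, the sample data, and the mapping of 'uniques' to pandas' `nunique` are all hypothetical.

```python
import pandas as pd

# Hypothetical helper for illustration; not part of pandas_exploration_util.
def top_values(df, group_col, value_col, agg="sum", n=10):
    """Return the n largest groups of group_col, ranked by agg of value_col."""
    # Assumption: the widget's 'uniques' corresponds to pandas' 'nunique'.
    pandas_agg = "nunique" if agg == "uniques" else agg
    ranked = df.groupby(group_col)[value_col].agg(pandas_agg)
    return ranked.sort_values(ascending=False).head(n)

# Made-up sample data.
sales = pd.DataFrame({
    "city": ["Oslo", "Lima", "Oslo", "Pune", "Lima", "Oslo"],
    "amount": [10, 40, 5, 25, 15, 20],
})

# Top two cities by total amount.
top_cities = top_values(sales, "city", "amount", agg="sum", n=2)
print(top_cities)
```

Switching `agg` to 'count' or 'uniques' ranks the same groups by row count or by number of distinct values instead.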
Visualize the distribution of any numerical column. Binning is determined automatically by the plot.ly histogram method.
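Automatic binning means the caller never chooses bin edges by hand. As a rough stand-in for that idea, NumPy's `bins="auto"` picks a bin count from the data. This sketch does not reproduce plot.ly's own binning algorithm, and the sample values are made up.

```python
import numpy as np

# Hypothetical sample: 1,000 draws from a normal distribution.
rng = np.random.default_rng(seed=0)
values = rng.normal(loc=15.0, scale=5.0, size=1000)

# bins="auto" lets NumPy choose the number of bins from the data.
# This is NumPy's estimator, used here only as a stand-in for plot.ly's binning.
counts, edges = np.histogram(values, bins="auto")

print(len(counts), "bins covering", edges[0], "to", edges[-1])
```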
Visualize an X-Y scatter of any column against an aggregation of any other column. Supported aggregation functions are 'count', 'sum', 'mean', 'std', 'max', 'min', and 'uniques'.
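For the scatter view, the points are ordered by the X column rather than ranked by the aggregate. A minimal sketch of that shape of computation follows; the helper name `xy_series`, the sample data, and the 'uniques' to `nunique` mapping are assumptions, not the library's code.

```python
import pandas as pd

# Hypothetical helper for illustration; not part of pandas_exploration_util.
def xy_series(df, x_col, y_col, agg="mean"):
    """Aggregate y_col for each distinct x_col value, ordered by x."""
    # Assumption: the widget's 'uniques' corresponds to pandas' 'nunique'.
    pandas_agg = "nunique" if agg == "uniques" else agg
    return df.groupby(x_col)[y_col].agg(pandas_agg).sort_index().reset_index()

# Made-up sample data.
readings = pd.DataFrame({
    "year": [2001, 2000, 2001, 2000, 2002],
    "temp": [14.0, 13.0, 16.0, 15.0, 17.0],
})

# One (x, y) point per year: the mean temperature for that year.
points = xy_series(readings, "year", "temp", agg="mean")
print(points)
```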
- Set up virtualenv
- Create a virtual environment using
virtualenv /path/to/env/dir
- Activate virtual environment using
source /path/to/env/dir/bin/activate
- Clone the repo locally
- Navigate to the root directory of the repo, where setup.py lives
- Install the module in development mode using
python setup.py develop
- Run the Jupyter notebook from the virtual environment; it should have been installed there as one of the module's dependencies
- Dev away
- When done uninstall the package using
python setup.py develop --uninstall
- Deactivate the environment using
deactivate
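While the environment is still active, one quick way to confirm that development mode took effect is to check where Python resolves the package from. The helper below is a hypothetical sketch; the module name `pandas_exploration_util` is taken from the import shown earlier.

```python
import importlib.util

def package_location(name):
    """Return the file path Python would import `name` from, or None if not found."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None

# With a development install, this path should point into the cloned repo
# rather than into site-packages. It prints None if the package is not installed.
print(package_location("pandas_exploration_util"))
```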
https://packaging.python.org/tutorials/packaging-projects/
Assuming all relevant tools are installed and the relevant project files are properly defined:
- Build the distribution using
python3 setup.py sdist bdist_wheel
- Upload the distribution using
twine upload dist/*{version}*
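Before uploading, it can help to preview which files the `dist/*{version}*` pattern would pick up, assuming `{version}` stands for the release version string. The sketch below uses a hypothetical helper and made-up file names in a temporary directory.

```python
import tempfile
from pathlib import Path

def matching_artifacts(dist_dir, version):
    """List files in dist_dir whose names contain the version string."""
    return sorted(p.name for p in Path(dist_dir).glob(f"*{version}*"))

# Demo with hypothetical file names in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("pkg-0.1.0.tar.gz", "pkg-0.1.0-py3-none-any.whl", "pkg-0.0.9.tar.gz"):
        (Path(tmp) / name).touch()
    found = matching_artifacts(tmp, "0.1.0")

print(found)
```

Only the two 0.1.0 artifacts are listed; the older 0.0.9 archive is left out, which is the point of including the version in the pattern.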