paulmattheww/auto-explore

Auto Explore

This library is in the alpha stage. Contributions are welcome.

The goal of this Python library is to provide a reliable tool for performing a first-pass exploratory data analysis. The hope is that ML developers and data analysts will shorten their iteration cycles by using it.

The earliest stages of a machine learning project require exploratory analysis to uncover raw features and insights that can be exploited in the modeling process. Exploratory analysis typically follows a somewhat tree-like (and at times recursive) process in which the same task patterns recur across projects. If certain parameters about the data are specified a priori, a process that follows these task patterns can be built from open source tools to automate most of the "first pass" data analysis work, freeing up time for deep-dive analyses, modeling, and deployment.
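To make the idea of specifying parameters a priori concrete, here is a minimal sketch using only the Python standard library. It is not this library's API: the function name `first_pass_summary` and its `numeric` and `categorical` arguments are hypothetical, chosen only to show how declaring column types up front lets the same summary routine run on any dataset.

```python
from collections import Counter
from statistics import mean, median


def first_pass_summary(rows, numeric, categorical):
    """Summarize the declared columns of a list-of-dicts dataset.

    `numeric` and `categorical` are the a priori parameters: lists of
    column names the caller declares before any analysis runs.
    (Hypothetical helper for illustration, not part of auto-explore.)
    """
    summary = {}
    for col in numeric:
        # None (or an absent key) counts as a missing value.
        values = [r[col] for r in rows if r.get(col) is not None]
        summary[col] = {
            "kind": "numeric",
            "missing": len(rows) - len(values),
            "min": min(values) if values else None,
            "max": max(values) if values else None,
            "mean": mean(values) if values else None,
            "median": median(values) if values else None,
        }
    for col in categorical:
        values = [r[col] for r in rows if r.get(col) is not None]
        summary[col] = {
            "kind": "categorical",
            "missing": len(rows) - len(values),
            "unique": len(set(values)),
            # Up to three most frequent levels with their counts.
            "top": Counter(values).most_common(3),
        }
    return summary


# Made-up toy data, for illustration only.
rows = [
    {"age": 34, "plan": "basic"},
    {"age": 51, "plan": "pro"},
    {"age": None, "plan": "basic"},
    {"age": 29, "plan": None},
]
report = first_pass_summary(rows, numeric=["age"], categorical=["plan"])
for column, stats in report.items():
    print(column, stats)
```

Because the column roles are declared once, the same call can be repeated on a new dataset by changing only the two lists, which is the kind of repeatable task pattern described above.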

The open source projects that will be relied upon most for this project include:

While "automated" data analysis sounds difficult, the heavy lifting has already been done by these library authors. This project simply extends good work that already exists, so I will not need to spend considerable time re-inventing the wheel on established techniques.

True automation is still a long way off, but this library can already be very helpful when exploring a new dataset.
