NewGuy012/data-analysis-workflow

data-analysis-workflow

Python Packages Used

  • NumPy
  • SciPy
  • Pandas
  • Matplotlib
  • Scikit-learn
  • Jupyter

Data Workflow

This is how I generally try to approach any data analysis task.

  1. Define
    • Set clear objectives
  2. Import
    • Support various data types
    • Explore raw data
    • Check for any inconsistency or corruption
  3. Clean
    • Preprocess
    • Filter
    • Offset
    • Exclude outliers
  4. Analyze
    • Postprocess
    • Derive custom metrics
    • Aggregate multiple values into one
    • Calculate common statistics that may provide insight
  5. Export
    • Save a clean dataset
  6. Report
    • Iterate on the process if needed
    • Generate deliverables
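
The import, clean, analyze, and export steps above can be sketched with pandas. This is only a minimal illustration, not the code in this repository: the inline CSV, the column names (`make`, `model_year`, `electric_range`), the `max_range` threshold, and the helper function names are all assumptions made up for the example.

```python
import io

import pandas as pd

# Hypothetical raw data standing in for an imported file.
# The column names and values are assumptions for illustration only.
RAW_CSV = """make,model_year,electric_range
TESLA,2020,291
NISSAN,2018,151
TESLA,2021,
NISSAN,2018,151
BMW,2019,9999
"""


def import_data(source):
    """Import: read raw data from a path or file-like object."""
    return pd.read_csv(source)


def clean(df, max_range=1000):
    """Clean: drop duplicate rows, missing values, and implausible outliers."""
    df = df.drop_duplicates()
    df = df.dropna(subset=["electric_range"])
    df = df[df["electric_range"] <= max_range]  # assumed outlier threshold
    return df.reset_index(drop=True)


def analyze(df):
    """Analyze: aggregate multiple values into one metric per make."""
    return df.groupby("make")["electric_range"].mean()


def export(df):
    """Export: serialize the clean dataset as CSV text."""
    return df.to_csv(index=False)


raw = import_data(io.StringIO(RAW_CSV))
cleaned = clean(raw)
summary = analyze(cleaned)
clean_csv = export(cleaned)

print(raw.info())  # explore the raw data and check for inconsistencies
print(summary)
```

Keeping each step in its own function makes it easier to repeat a single stage when the process has to be iterated.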

Data Source

Workflow inspired by "Using Python for Data Analysis" by Ian Eyre from Real Python, retrieved in May 2024: https://realpython.com/python-for-data-analysis/

Data sourced from the Washington State Department of Licensing, via Kaggle: https://www.kaggle.com/datasets/sahirmaharajj/electric-vehicle-population

Common Analysis Requests

  • Descriptive analysis: describe the past
  • Diagnostic analysis: investigate the why
  • Predictive analysis: try to predict the future
  • Prescriptive analysis: plan a strategy
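
As a rough illustration of how two of these requests differ in practice, the sketch below contrasts a descriptive summary with a simple predictive fit using NumPy. The yearly counts are invented for the example, and a straight-line fit is just one assumed modelling choice; diagnostic and prescriptive analysis usually need domain context and are not shown.

```python
import numpy as np

# Hypothetical yearly registration counts (made-up numbers for illustration).
years = np.array([2019, 2020, 2021, 2022])
counts = np.array([100.0, 150.0, 200.0, 250.0])

# Descriptive analysis: summarize what already happened.
mean_count = counts.mean()
total_growth = counts[-1] - counts[0]

# Predictive analysis: fit a straight line and extrapolate one year ahead.
# Years are shifted to start at zero to keep the fit numerically well behaved.
x = years - years[0]
slope, intercept = np.polyfit(x, counts, deg=1)
next_year = years[-1] + 1
predicted = slope * (next_year - years[0]) + intercept

print(f"Mean count per year: {mean_count}")
print(f"Growth over the period: {total_growth}")
print(f"Predicted count for {next_year}: {predicted:.1f}")
```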

About

Common data analysis workflow using Python.
