Machine Learning ETL Pipeline

A data transformation job on data sourced from a MongoDB database.
The input was deeply nested JSON from the MongoDB source system.
During the transformation stage of the ETL, the data was normalised into a structured relational format
for subsequent feature engineering and analysis to build a prediction model.
The resulting dataset had over 56,000 features (i.e. columns).
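The flattening step can be sketched as follows. This is a minimal illustration of the kind of key-joining that the flatten_json library performs (nested keys concatenated with an underscore separator); the sample document and the `flatten` helper here are hypothetical, not the repository's actual code.

```python
from typing import Any, Dict

def flatten(record: Dict[str, Any], sep: str = "_") -> Dict[str, Any]:
    """Recursively flatten nested dicts and lists into a single-level
    dict, joining key paths with `sep` (the same convention flatten_json
    uses). Each flat key then becomes one column of the relational table."""
    out: Dict[str, Any] = {}

    def _walk(value: Any, prefix: str) -> None:
        if isinstance(value, dict):
            for key, val in value.items():
                _walk(val, f"{prefix}{sep}{key}" if prefix else key)
        elif isinstance(value, list):
            # List positions become numeric key segments, so variable-length
            # arrays fan out into many columns -- one source of the 56,000+.
            for i, val in enumerate(value):
                _walk(val, f"{prefix}{sep}{i}" if prefix else str(i))
        else:
            out[prefix] = value

    _walk(record, "")
    return out

# Hypothetical MongoDB document with nesting and an embedded array.
doc = {"_id": "a1", "user": {"name": "Ada", "tags": ["ml", "etl"]}}
flat = flatten(doc)
# → {"_id": "a1", "user_name": "Ada", "user_tags_0": "ml", "user_tags_1": "etl"}
```

Applying this to every document and collecting the flat dicts into a Pandas DataFrame yields the wide relational table described above, since the union of all flattened key paths across documents becomes the column set.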

Languages and Libraries

  • Python
  • Jupyter Notebook
  • Pandas
  • flatten_json

About

A Jupyter notebook documenting an ETL (extract -> transform -> load) data pipeline
