data-cleaning-pipeline

Uses machine learning models to predict whether patients have chronic kidney disease from a small set of features. The model results are also interpreted to make them more understandable to health practitioners.

data-science machine-learning machine-learning-algorithms data-transformation data-visualization feature-selection dimensionality-reduction diagnostics feature-engineering health-data-analysis machine-learning-algorithm model-interpretability data-cleaning-pipeline health-data-science preventative-medicine

Updated May 24, 2024
HTML

leo-padron / Exploratory-Data-Analysis-Peru-Covid-Casualties

A replicable exploratory data analysis of COVID-19-related cases in Peru's SINADEF database (death index).

eda data-analysis data-wrangling data-cleaning-pipeline

Updated Sep 6, 2021
Jupyter Notebook

liyongh1 / Multi-Class-Logistic-Regression-using-Kaggle-ML-DS-Survey-Data

kaggle-dataset multi-classify-with-sklearn data-cleaning-pipeline ordinal-logit

Updated Apr 13, 2021
Jupyter Notebook

vdechen / DataAnalysis_NGO

This data analysis and visualization project aims to present the work of the OBA-Floripa NGO to authorities and the general population. The goal is to make the case for continued funding, given the positive impact of the organization's activities on public health issues.

python dashboard data-visualization data-analysis tableau-public data-cleaning-pipeline

Updated Apr 6, 2023
Jupyter Notebook

getiria-onsongo / itallic

A tool that automatically detects and corrects errors in location data and imputes missing values for location-dependent data, such as region name.

conda data-cleaning-pipeline plant-breeding-data

Updated Apr 23, 2021
Jupyter Notebook

259mit / MAHA

MAHA is an in-progress ETL package that uses machine learning to clean your dataset with a one-line command.

data-cleaning etl-pipeline data-cleaning-pipeline

Updated Oct 30, 2020
HTML

liyongh1 / Sentiment-Analysis-with-2019-Canadian-Elections-Data

sentiment-analysis machine-learning-algorithms word-embeddings bag-of-words nlp-machine-learning nlp-keywords-extraction nltk-python tfidf-vectorizer data-cleaning-pipeline

Updated Apr 13, 2021
Jupyter Notebook

DeleLinus / WeRateDogs-Wrangle-Analyze-Data

The dataset I wrangled (and analysed and visualized) is the tweet archive of Twitter user @dog_rates, also known as WeRateDogs. WeRateDogs is a Twitter account that rates people's dogs with a humorous comment about the dog.

python data-science twitter data-analytics data-analysis data-wrangling data-exploration data-analyst-nanodegree data-analysis-python weratedogs data-cleaning-pipeline data-analyst-with-python data-interpretation data-wrangling-twitter

Updated Nov 25, 2021
HTML

CeliaMuriel / inconsistent-company-names-demo

Inconsistent company names demo

gcp google-cloud fuzzy-matching google-cloud-platform data-cleaning data-quality data-cleansing trifacta data-cleaning-pipeline cloud-dataprep data-cleanup data-cleaning-and-preprocessing

Updated Mar 5, 2022

125ryun / Espresso

Team "Espresso" from the Sogang University 2023-2 course "Understanding Big Data and Its Educational Applications (Capstone Design)".

big-data time-series data-analysis educational-technology big-data-analytics log-level data-analysis-python data-cleaning-pipeline user-behavioral-sequences data-cleaning-and-preprocessing log-data log-data-analysis

Updated Dec 24, 2023
Python

xyuebai / data-etl-for-ml

Dockerized data ETL for machine learning, including data crawling, data transformation and cleaning, and saving data to S3.

docker etl aws-s3 boto3 data-cleaning-pipeline

Updated Oct 19, 2022
Python

AnalystHub-Hub / IBM-Data-Science-Professional-Certificate

I learnt data science through hands-on practice in the IBM Cloud using real data science tools and real-world data sets.

python data-science machine-learning ibm-watson-services machine-learning-algorithms data-visualization data-extraction data-scraping data-cleaning-pipeline ibm-cognos-analytics

Updated Oct 20, 2022
Jupyter Notebook

ved93 / ml-express

A Python library for day-to-day data analysis and machine learning. It aims to make data building, cleaning, and machine learning much faster. It is a collection of extension and helper modules for Python's data analysis and machine learning libraries.

visualization data-science machine-learning eda data-preprocessing feature-engineering data-preparation pandas-profiling data-summarization data-cleaning-pipeline

Updated Jan 12, 2022
Python

ManarAlharbi / DSND-Term2-Disaster_Response_Pipeline

Creates a machine learning pipeline that categorizes disaster events.

python data-science machine-learning natural-language-processing sqlalchemy integrated-development-environment sqlite ide jupyter-notebook data-engineering nltk data-pipelines machine-learning-pipelines extract-transform-load gridsearchcv flask-webapp udacity-data-science-nanodegree disaster-response-pipeline categorizes-disaster-events data-cleaning-pipeline

Updated Jan 16, 2020
Jupyter Notebook

everks / dial-clean

Chinese dialogue data cleaning

dialog data-cleaning-pipeline

Updated Nov 8, 2022
Python

Elysian01 / Data-Purifier

A Python library for Automated Exploratory Data Analysis, Automated Data Cleaning, and Automated Data Preprocessing For Machine Learning and Natural Language Processing Applications in Python.

data-science jupyter exploratory-data-analysis python-library python-lib eda data-visualization python3 data-analysis data-preprocessing data-cleaning data-cleaning-pipeline datapurifier

Updated May 6, 2022
Jupyter Notebook

LaureBerti / Learn2Clean

Learn2Clean: Optimizing the Sequence of Tasks for Data Preparation and Cleaning

reinforcement-learning data-preprocessing automated data-cleaning data-curation data-cleaning-pipeline

Updated Dec 26, 2022
Python

Here are 21 public repositories matching this topic...

RashikaKarki / Auto-Wrangler

DesiSanou / data-scraping

JamesHanZhang / table-data-format-transform-app

Shuyib / chronic-kidney-disease-kaggle
