📚 Data Science and Machine Learning are among the fastest-growing fields in technology. This repository aims to build strong, professional analytics skills for organizing, storing, and manipulating large amounts of data. 🏁
- AWS Certified Data Analytics - Specialty (DAS): 824GVEPCQFQQ1DS5
- AWS Certified Machine Learning - Specialty (MLS-C01):
- AWS Certified Database - Specialty (DBS-C01):
- Document as Code

```shell
npx create-docusaurus@latest docs classic --typescript
yarn add @docusaurus/theme-search-algolia tailwindcss postcss autoprefixer
```
| 📆 | Lessons / Tasks Done ⏰ | Reference Links 🔗 |
|---|---|---|
| 01 | 🎓 AWS Certified Data Analytics - Specialty (DAS) (Collecting Streaming Data, Data Collection and Getting Data, Amazon Elastic MapReduce (EMR), Using Redshift & Redshift Maintenance & Operations, AWS Glue, Athena, and QuickSight, Elasticsearch, AWS Security Services) ✅ | A Cloud Guru - DAS & ACG Practice Exam & Udemy Practice Exam |
| 02 | 🎓 AWS Certified Machine Learning - Specialty (MLS-C01) (Data Preparation, Data Analysis and Visualization, Modeling, Algorithms, Evaluation and Optimization, Implementation and Operations) ☑️ | A Cloud Guru - MLS-C01 & ACG Practice Exam & Udemy Practice Exam |
| 03 | 🎓 AWS Certified Database - Specialty (DBS-C01) (Relational Database Service, Amazon Aurora / DynamoDB / DocumentDB / Redshift, Migrating Data to Databases, Monitoring & Optimization) ☑️ | A Cloud Guru - DBS-C01 & ACG Practice Exam & Udemy Practice Exam |
| 04 | 🛠 Reproducible Local Development for Data Science and Machine Learning projects | Data Science |
| 05 | 👨‍💻 Python Project: Spotify Data Analysis using Python | Project |
| 06 | 📚 Statistics (Descriptive statistics: Mean, Median, Mode, Variance, & Standard Deviation) | Statistics for Data Science with Python |
| 07 | Tableau Project: Sales Insights - Data Analysis using Tableau & SQL | Project |
| 08 | 🚀 Project: Data Analysis using Python | Project |
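As a quick illustration of the descriptive statistics named in lesson 06 (mean, median, mode, variance, and standard deviation), here is a minimal sketch using only Python's standard-library `statistics` module; the sample scores are made up:

```python
# Descriptive statistics on a small made-up sample, using the
# standard-library statistics module (no third-party packages).
import statistics

scores = [70, 85, 85, 90, 95]  # illustrative sample data

mean = statistics.mean(scores)         # arithmetic average -> 85
median = statistics.median(scores)     # middle value of the sorted data -> 85
mode = statistics.mode(scores)         # most frequent value -> 85
variance = statistics.variance(scores) # sample variance (n - 1 denominator)
stdev = statistics.stdev(scores)       # sample standard deviation

print(mean, median, mode, variance, round(stdev, 3))
```

Note that `statistics.variance` and `statistics.stdev` compute the *sample* statistics (dividing by n - 1); use `pvariance`/`pstdev` for the population versions.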
🛠 Production-grade project structure for successful data-science or machine-learning projects 🚀

```
├── Makefile             <- Makefile with convenience commands like `make data` or `make train`
├── README.md            🤝 Explain your project and its structure for better collaboration.
├── config/
│   └── logging.config.ini
├── data                 🔍 Where all your raw and processed data files are stored.
│   ├── external         <- Data from third-party sources.
│   ├── interim          <- Intermediate data that has been transformed.
│   ├── processed        <- The final, canonical data sets for modeling.
│   └── raw              <- The original, unprocessed, immutable data dump.
│
├── docs                 📓 A default docusaurus | mkdocs project; see docusaurus.io | mkdocs.org for details
│
├── models               🧠 Store your trained and serialized models for easy access and versioning.
│
├── notebooks            💻 Jupyter notebooks for exploration and visualization.
│   ├── data_exploration.ipynb
│   ├── data_preprocessing.ipynb
│   ├── model_training.ipynb
│   └── model_evaluation.ipynb
│
├── pyproject.toml       <- Project configuration file with package metadata for analytics
│                           and configuration for tools like black
│
├── references           <- Data dictionaries, manuals, and all other explanatory materials.
│
├── reports              📊 Generated analysis (reports, charts, and plots) as HTML, PDF, LaTeX.
│   └── figures          <- Generated graphics and figures to be used in reporting
│
├── requirements.txt     🛠 The requirements file for reproducing the analysis environment.
│
├── setup.cfg            <- Configuration file for flake8
│
├── src                  💾 Source code for data processing, feature engineering, and model training.
│   ├── data/
│   │   └── data_preprocessing.py
│   ├── features/
│   │   └── feature_engineering.py
│   ├── models/
│   │   └── model.py
│   └── utils/
│       └── helper_functions.py
├── tests/
│   ├── test_data_preprocessing.py
│   ├── test_feature_engineering.py
│   └── test_model.py
├── setup.py             🛠 A Python script to make the project installable.
├── Dockerfile
├── docker-compose.yml
├── .gitignore
└── analytics            🧩 Source code for use in this project.
    │
    ├── __init__.py      <- Makes analytics a Python module
    │
    ├── data             <- Scripts to download, preprocess, or generate data
    │   └── make_dataset.py
    │
    ├── features         <- Scripts to turn raw data into features for modeling
    │   └── build_features.py
    │
    ├── models           <- Scripts to train models and then use trained models to make predictions.
    │   ├── predict_model.py
    │   └── train_model.py
    │
    └── visualization    <- Scripts to create exploratory and results-oriented visualizations
        └── visualize.py
```
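To show how the pieces of the tree fit together, here is a minimal sketch of what a script like `analytics/data/make_dataset.py` could contain: read the immutable raw dump from `data/raw`, drop incomplete rows, and write the canonical result to `data/processed`. The paths, the cleaning rule, and the CSV format are all illustrative assumptions, not a prescribed implementation:

```python
# Hypothetical sketch of analytics/data/make_dataset.py: raw -> processed.
# Default paths mirror the project tree; adjust to your own data layout.
import csv
from pathlib import Path

RAW = Path("data/raw/dataset.csv")
PROCESSED = Path("data/processed/dataset.csv")

def make_dataset(raw_path: Path = RAW, out_path: Path = PROCESSED) -> int:
    """Drop rows with any missing field and write the cleaned file.

    Assumes the raw CSV has a header and at least one complete row.
    Returns the number of rows kept.
    """
    with raw_path.open(newline="") as f:
        # Keep only rows where every field is non-empty (illustrative rule).
        rows = [r for r in csv.DictReader(f)
                if all(v.strip() for v in r.values())]
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with out_path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

Keeping the script idempotent (raw in, processed out, no mutation of `data/raw`) is what makes `make data`-style Makefile targets reproducible.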
- Datasets: Amazon Datasets & Kaggle Datasets
- DataHub
- KDNuggets & Towards Data Science & Kaggle Winner’s Blog
- Statistics: Simply Statistics
- TensorFlow & Keras
- Artificial Intelligence: DeepMind Blog
- Top algorithms that every data scientist should have in their toolbox:
- Linear regression
- Logistic regression
- Principal component analysis (PCA)
- Decision trees
- Random forests
- CART algorithm
- Naive Bayes
- K-nearest neighbors (KNN)
- Support vector machines (SVM)
- K-means clustering
- Neural networks
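For a taste of the first algorithm on the list, here is a minimal from-scratch sketch of simple (one-variable) linear regression using the closed-form least-squares solution; the toy data is made up:

```python
# Simple linear regression fitted in closed form (ordinary least squares),
# in plain Python with no third-party dependencies.
def fit_line(xs, ys):
    """Return (slope, intercept) minimizing the sum of squared errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Toy data generated from y = 2x + 1, so the fit recovers slope 2, intercept 1.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
```

In practice you would reach for `scikit-learn`'s `LinearRegression` or `numpy.polyfit`, but the closed form above is the same math they apply in the one-feature case.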