
nnthanh101/Machine-Learning


🔥 Master of Analytics in Action 🦅

📚 Data Science and Machine Learning are among the fastest-growing fields in technology. This repository builds strong, professional analytics skills for organizing, storing, and manipulating large amounts of data. 🏁

Thanh Nguyen


1. Technology Stack ⚙️

2. Certifications 🎓

3. Projects 👨‍💻

- Document as Code
  - `npx create-docusaurus@latest docs classic --typescript`
  - `yarn add @docusaurus/theme-search-algolia tailwindcss postcss autoprefixer`
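The `yarn add` command above installs Tailwind CSS but does not wire it into the Docusaurus build; a minimal `tailwind.config.js` sketch is shown below. The `content` globs are assumptions based on a default `create-docusaurus` layout and may need adjusting to your actual folders.

```javascript
// tailwind.config.js — minimal sketch, not the repo's actual config.
// The content globs below assume a default create-docusaurus layout.
module.exports = {
  content: [
    './src/**/*.{js,jsx,ts,tsx}', // React pages and components
    './docs/**/*.{md,mdx}',       // Markdown / MDX docs pages
  ],
  theme: { extend: {} },
  plugins: [],
};
```

A PostCSS step (via `postcss.config.js` or a small Docusaurus plugin) is still needed to actually run Tailwind during the build.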

Deliverables 💎

| 📆 | Lessons / Tasks | Done ⏰ | Reference Links 🔗 |
|----|-----------------|---------|---------------------|
| 01 | 🎓 AWS Certified Data Analytics - Specialty (DAS): Collecting Streaming Data; Data Collection and Getting Data; Amazon Elastic MapReduce (EMR); Using Redshift, Redshift Maintenance & Operations; AWS Glue, Athena, and QuickSight; Elasticsearch; AWS Security Services | ✅ | A Cloud Guru - DAS, ACG Practice Exam, Udemy Practice Exam |
| 02 | 🎓 AWS Certified Machine Learning - Specialty (MLS-C01): Data Preparation; Data Analysis and Visualization; Modeling; Algorithms; Evaluation and Optimization; Implementation and Operations | ☑️ | A Cloud Guru - MLS-C01, ACG Practice Exam, Udemy Practice Exam |
| 03 | 🎓 AWS Certified Database - Specialty (DBS-C01): Relational Database Service; Amazon Aurora / DynamoDB / DocumentDB / Redshift; Migrating Data to Databases; Monitoring & Optimization | ☑️ | A Cloud Guru - DBS-C01, ACG Practice Exam, Udemy Practice Exam |
| 04 | 🛠 Reproducible local development for Data Science and Machine Learning projects | | Data Science |
| 05 | 👨‍💻 Python Project: Spotify Data Analysis using Python | | Project |
| 06 | 📚 Statistics (descriptive statistics: mean, median, mode, variance, and standard deviation) | | Statistics for Data Science with Python |
| 07 | Tableau Project: Sales Insights - Data Analysis using Tableau & SQL | | Project |
| 08 | 🚀 Project: Data Analysis using Python | | Project |
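The descriptive statistics listed in lesson 06 can all be computed with Python's standard library; a small illustrative sketch (the data values are made up):

```python
"""Descriptive statistics from lesson 06, using only the stdlib."""
import statistics

data = [2, 4, 4, 4, 5, 7, 9]  # illustrative sample, not real project data

mean = statistics.mean(data)          # arithmetic average
median = statistics.median(data)      # middle value of the sorted data
mode = statistics.mode(data)          # most frequent value
variance = statistics.variance(data)  # sample variance (n - 1 denominator)
stdev = statistics.stdev(data)        # sample standard deviation

print(mean, median, mode, variance, stdev)
```

Note that `statistics.variance` and `statistics.stdev` compute the *sample* statistics; use `pvariance`/`pstdev` for population figures.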

Project Organization

🛠 Production-grade project structure for successful data-science or machine-learning projects 🚀

```
├── Makefile           <- Makefile with convenience commands like `make data` or `make train`
├── README.md          🤝 Explain your project and its structure for better collaboration.
├── config/
│   └── logging.config.ini
├── data               🔍 Where all your raw and processed data files are stored.
│   ├── external       <- Data from third-party sources.
│   ├── interim        <- Intermediate data that has been transformed.
│   ├── processed      <- The final, canonical data sets for modeling.
│   └── raw            <- The original, unprocessed, immutable data dump.
│
├── docs               📓 A default docusaurus | mkdocs project; see docusaurus.io | mkdocs.org for details
│
├── models             🧠 Store your trained and serialized models for easy access and versioning.
│
├── notebooks          💻 Jupyter notebooks for exploration and visualization.
│   ├── data_exploration.ipynb
│   ├── data_preprocessing.ipynb
│   ├── model_training.ipynb
│   └── model_evaluation.ipynb
│
├── pyproject.toml     <- Project configuration file with package metadata for analytics
│                         and configuration for tools like black
│
├── references         <- Data dictionaries, manuals, and all other explanatory materials.
│
├── reports            📊 Generated analysis (reports, charts, and plots) as HTML, PDF, LaTeX.
│   └── figures        <- Generated graphics and figures to be used in reporting
│
├── requirements.txt   🛠 The requirements file for reproducing the analysis environment.
│
├── setup.cfg          <- Configuration file for flake8
│
├── src                💾 Source code for data processing, feature engineering, and model training.
│   ├── data/
│   │   └── data_preprocessing.py
│   ├── features/
│   │   └── feature_engineering.py
│   ├── models/
│   │   └── model.py
│   └── utils/
│       └── helper_functions.py
├── tests/
│   ├── test_data_preprocessing.py
│   ├── test_feature_engineering.py
│   └── test_model.py
├── setup.py           🛠 A Python script to make the project installable.
├── Dockerfile
├── docker-compose.yml
├── .gitignore
└── analytics          🧩 Source code for use in this project.
    │
    ├── __init__.py    <- Makes analytics a Python module
    │
    ├── data           <- Scripts to download, preprocess, or generate data
    │   └── make_dataset.py
    │
    ├── features       <- Scripts to turn raw data into features for modeling
    │   └── build_features.py
    │
    ├── models         <- Scripts to train models and then use trained models to make predictions
    │   ├── predict_model.py
    │   └── train_model.py
    │
    └── visualization  <- Scripts to create exploratory and results-oriented visualizations
        └── visualize.py
```
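The layout above includes `config/logging.config.ini` for centralized logging setup. A sketch of how project code might load it with `logging.config.fileConfig`; the ini contents shown are an illustrative assumption, not the repo's actual file, and the fallback keeps the sketch self-contained:

```python
"""Sketch: load logging settings from config/logging.config.ini."""
import io
import logging
import logging.config
from pathlib import Path

# Illustrative ini contents (an assumption, not the repo's real file).
SAMPLE_INI = """\
[loggers]
keys=root

[handlers]
keys=console

[formatters]
keys=plain

[logger_root]
level=INFO
handlers=console

[handler_console]
class=StreamHandler
level=INFO
formatter=plain
args=(sys.stdout,)

[formatter_plain]
format=%(asctime)s %(levelname)s %(name)s: %(message)s
"""

def setup_logging(path: Path = Path("config/logging.config.ini")) -> logging.Logger:
    """Load the project's ini if present, else fall back to the sample."""
    if path.exists():
        logging.config.fileConfig(path, disable_existing_loggers=False)
    else:
        logging.config.fileConfig(io.StringIO(SAMPLE_INI),
                                  disable_existing_loggers=False)
    return logging.getLogger("analytics")

logger = setup_logging()
logger.info("logging configured")
```

Loading the configuration once at startup (e.g. in `analytics/__init__.py` or the entry script) keeps every module's `logging.getLogger(__name__)` output consistent.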

Resources