podesport/2110446_DS_2022s2

 
 


2110446 Data Science and Data Engineering @Chula (2022/2)

Syllabus:

Syllabus

Registration request (expired)

Course material:

Week01: Intro to Numpy, Pandas

  1. Numpy: Open In Colab

  2. Pandas: Open In Colab

  3. Pandas with Youtube stat data: Open In Colab

  4. (Advanced) Pandas with Youtube stat data: Open In Colab

Assignment (Pandas with Youtube stat data): Open In Colab

Week02: Data Preparation

  1. EDA: Open In Colab

  2. Impute Missing Value: Open In Colab

  3. Split Train/Test: Open In Colab

  4. Outliers with Log: Open In Colab

  5. Outliers with Log (Titanic DataSet): Open In Colab

Assignment: Open In Colab

Week03-04: Traditional ML

  1. Decision Trees: Open In Colab

  2. Linear Regression: Open In Colab

  3. Logistic Regression: Open In Colab

  4. Neural Network: Open In Colab

  5. K Nearest Neighbors: Open In Colab

  6. SVM: Open In Colab

  7. Save and Load Model: Open In Colab

  8. K-Means: Open In Colab

  9. Market-Basket Analysis: Open In Colab

Week05-06: Intro to Deep Learning

  1. Image classification (basic): CIFAR10 Open In Colab

  2. Image classification (advanced): Animal Open In Colab

  3. Object detection: VOCDetection Open In Colab

  4. Semantic segmentation: CamSeq2007 Open In Colab

  5. Time series Forecasting: Stock Price Open In Colab

Assignment: Open In Colab

Week08: Introduction to Data Science and Big Data Architecture

  1. Simple example Open In Colab

Assignment: Open In Colab

Week09: Data Extraction

  1. Basic Webpage Scraping Open In Colab -- Note: File 'simple_page.html' will be uploaded to Colab automatically.

  2. Wikipedia page data extraction Open In Colab

  3. REST API Data Extraction Open In Colab

  4. Twitter Data Extraction Open In Colab -- Note: Do not forget to upload "twitter.yml" to your Colab machine and set the bearer token value in it.

  5. Selenium -- Note: this example cannot be run on Colab.

Assignment: Open In Colab

Week10: Data Extraction

This section contains examples for Kafka. To test them, you can use the Kafka server at IP 35.240.149.229, port 9092, or a local server.
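As a rough illustration of what connecting to one of these brokers might look like, here is a minimal sketch of a JSON producer. It assumes the `kafka-python` client (`pip install kafka-python`), which may differ from the client used in the course notebooks; the topic name `test-topic` and the helper names are hypothetical and only serve the example.

```python
import json

# Broker address quoted in this README; use "localhost:9092" for a local server.
BOOTSTRAP_SERVER = "35.240.149.229:9092"


def encode_message(payload: dict) -> bytes:
    """Serialize a dict to UTF-8 JSON bytes (an assumed, common value format)."""
    return json.dumps(payload).encode("utf-8")


def send_test_message(topic: str, payload: dict,
                      bootstrap: str = BOOTSTRAP_SERVER) -> None:
    """Send one message to `topic` and wait until it has been delivered."""
    # Imported lazily so encode_message works without kafka-python installed.
    from kafka import KafkaProducer

    producer = KafkaProducer(
        bootstrap_servers=bootstrap,
        value_serializer=encode_message,  # applied to every value passed to send()
    )
    producer.send(topic, payload)
    producer.flush()   # block until buffered messages are sent
    producer.close()


if __name__ == "__main__":
    # "test-topic" is a hypothetical topic name, not one defined by the course.
    send_test_message("test-topic", {"sensor": "A", "value": 1})
```

Keeping the serializer as a separate function makes it possible to check the message format without a running broker, and switching between the shared and a local server only requires changing the `bootstrap` argument.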

To run a local server, install Kafka locally or use the following Docker Compose file

Basic Example:

  • Producer Open In Colab

  • Consumer Open In Colab

  • Consumer 2 Open In Colab

  • Producer 2 Open In Colab

Complex Example

  • Sensor A Producer Open In Colab

  • Sensor B Producer Open In Colab

  • File Writer Consumer Open In Colab

  • Counter Consumer Open In Colab

  • Notifier Consumer Open In Colab

AVRO Example

  • AVRO Producer Open In Colab

  • AVRO Consumer Open In Colab

  • Sample AVRO Schema sample.avsc

Consumer Group Example

  • Partition Producer Open In Colab

  • Consumer Group 1 Open In Colab

  • Consumer Group 2 Open In Colab

  • Consumer Group 3 Open In Colab

Assignment
