AmirAli5/Data-Science-Intern-at-Data-Glacier


Data Science Intern at Data Glacier

Week 1

Week 1 (30 March - 06 April)

ML

Version Control Assignment

Clone the VC repo (link below), create a new branch, and check out the newly created branch. Run add.py and provide my name and favorite sport as input. Run the test script with the command: pytest test/test.py -s. Ignore any warnings; if there are no errors, add, commit, and push the changes to the repo, then create a pull request and assign it to a reviewer.

Link: https://github.com/AmirAli5/VC

Week 2

Week 2 (06 April - 13 April)

ML

Project: G2M insight for Cab Investment firm

Client XYZ is a private firm in the US. Given the remarkable growth of the cab industry over the last few years and the presence of multiple key players in the market, it is planning an investment in the cab industry. As part of its Go-to-Market (G2M) strategy, it wants to understand the market before making a final decision.

The datasets contain information on 2 cab companies. Each file (dataset) represents a different aspect of the customer profile. XYZ is interested in using your actionable insights to identify the right company to invest in.

Tasks
• Identify relationships across the files
• Exploratory Data Analysis (EDA)
• Formulate multiple hypotheses and investigate them
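
As a rough illustration of the first task, one way to look for relationships across files is to check which column names they share, since shared columns are candidate join keys. The file names and columns below are hypothetical placeholders, not the actual dataset schema.

```python
from itertools import combinations

# Hypothetical file names and column headers (assumptions for illustration only).
FILE_HEADERS = {
    "trips.csv": ["transaction_id", "company", "city", "price_charged"],
    "transactions.csv": ["transaction_id", "customer_id", "payment_mode"],
    "customers.csv": ["customer_id", "age", "gender"],
}


def shared_columns(headers):
    """Map each pair of files to the sorted list of column names they have in common."""
    result = {}
    for left, right in combinations(sorted(headers), 2):
        result[(left, right)] = sorted(set(headers[left]) & set(headers[right]))
    return result


for pair, columns in shared_columns(FILE_HEADERS).items():
    # An empty list means the two files have no directly shared column.
    print(pair, columns)
```

Pairs with a non-empty list can be joined directly; pairs with an empty list may still be related through a third file.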

Link: https://github.com/AmirAli5/Data-Science-Intern-at-Data-Glacier/tree/main/Week%202

Week 3

Week 3 (13 April - 20 April)

ML

Project: G2M insight for Cab Investment firm

The same work as in Week 2, with the addition of a linear regression model to predict the price charged.
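
For readers unfamiliar with the technique, here is a minimal sketch of one-variable linear regression fitted by ordinary least squares. The feature (distance travelled) and the sample numbers are assumptions made for illustration; they are not the features or data used in the project.

```python
from statistics import fmean


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    x_bar, y_bar = fmean(xs), fmean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(x, slope, intercept):
    """Predict the target for a single feature value."""
    return slope * x + intercept


# Hypothetical data: distance travelled (km) and price charged.
km_travelled = [1.0, 2.0, 3.0, 4.0]
price_charged = [7.0, 9.0, 11.0, 13.0]

slope, intercept = fit_line(km_travelled, price_charged)
print(slope, intercept, predict(10.0, slope, intercept))
```

A real model for this project would likely use several features and a library implementation; the closed-form fit above only shows the underlying idea.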

Link: https://github.com/AmirAli5/Data-Science-Intern-at-Data-Glacier/tree/main/Week%203

Week 4

Week 4 (20 April - 27 April)

ML

Deployment on Flask

This week, we deployed a machine learning model (SVM) using the Flask framework. As a demonstration, the model predicts whether a YouTube comment is spam or ham. First, we built a machine learning model for YouTube comment spam detection, then created an API for the model using Flask, the Python micro-framework for building web applications. This API makes the model's predictions available through HTTP requests.
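
The shape of such an API can be sketched as follows. This is not the code in the Week 4 folder: the /predict route, the JSON field names, and the keyword-based predict_label function (a stand-in for the trained SVM) are all assumptions made to keep the example self-contained.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical markers used by the stand-in classifier below.
SPAM_MARKERS = ("subscribe", "check out my channel", "http://", "https://")


def predict_label(comment: str) -> str:
    """Stand-in for the trained SVM: flags a comment as spam if it contains a marker."""
    text = comment.lower()
    return "spam" if any(marker in text for marker in SPAM_MARKERS) else "ham"


@app.route("/predict", methods=["POST"])
def predict():
    """Accept JSON like {"comment": "..."} and return the predicted label."""
    payload = request.get_json(silent=True) or {}
    comment = payload.get("comment", "")
    if not comment:
        return jsonify(error="missing 'comment' field"), 400
    return jsonify(comment=comment, label=predict_label(comment))
```

In a real deployment, predict_label would load the serialized model and vectorizer and call the model's prediction method instead of matching keywords. The app can be started with Flask's development server and queried by sending a POST request with a JSON body to /predict.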

Link: https://github.com/AmirAli5/Data-Science-Intern-at-Data-Glacier/tree/main/Week%204

Week 5

Week 5 (27 April - 4 May)

ML

Cloud and API Deployment

This week, we took the machine learning model (SVM) and the Flask application built last week and deployed them to the cloud using Heroku, both as an API and as a web app.

Link: https://github.com/AmirAli5/Data-Science-Intern-at-Data-Glacier/tree/main/Week%205

Week 6

Week 6 (4 May - 11 May)

ML

File Ingestion and schema

This week, we took a large data file and first read it with several tools (Dask, Modin, Ray, and Pandas) to compare their computational efficiency. We then applied basic validation to the data columns and validated the number of columns and the column names of the ingested file against a YAML file. Finally, we wrote the file (txt) in gz format and produced a summary of the file.
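
The validation and output steps can be sketched with the standard library alone. In this sketch the expected schema is a plain dictionary standing in for the contents of the YAML file, and the column names, delimiter, and helper names are hypothetical.

```python
import csv
import gzip

# Hypothetical schema; in the workflow above it would be loaded from a YAML file.
EXPECTED_SCHEMA = {"delimiter": ",", "columns": ["id", "name", "amount"]}


def normalize(name: str) -> str:
    """Basic column-name cleanup: trim, lowercase, replace spaces with underscores."""
    return name.strip().lower().replace(" ", "_")


def validate_header(header, schema) -> bool:
    """Check that the column count and the column names match the schema."""
    return [normalize(column) for column in header] == schema["columns"]


def write_gz(rows, path, delimiter=","):
    """Write rows as delimited text into a gzip-compressed file."""
    with gzip.open(path, "wt", newline="", encoding="utf-8") as handle:
        csv.writer(handle, delimiter=delimiter).writerows(rows)


def summarize(rows):
    """Return the number of data rows (excluding the header) and columns."""
    return {"rows": len(rows) - 1, "columns": len(rows[0])}
```

Comparing the list of normalized names against the schema checks the number of columns, their names, and their order in a single step.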

Link: https://github.com/AmirAli5/Data-Science-Intern-at-Data-Glacier/tree/main/Week%206

Week 7

Week 7 (11 May - 18 May)

ML

Project: Deliverable

This week: data collection, data intake report, dataset upload, and problem statement.

Link: https://github.com/AmirAli5/Data-Science-Intern-at-Data-Glacier/tree/main/Week%207

Week 8

Week 8 (18 May - 25 May)

ML

Project: Deliverable

This week: understanding the data and data preprocessing (text cleaning, to be continued).

Link: https://github.com/AmirAli5/Data-Science-Intern-at-Data-Glacier/tree/main/Week%208

Week 9

Week 9 (25 May - 1 June)

ML

Project: Deliverable

This week: data preprocessing (preprocessing operations, feature extraction, and splitting the data into train and test sets).
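
The three preprocessing stages can be illustrated with a small standard-library sketch. The cleaning rules, the bag-of-words features, and the 80/20 split are assumptions chosen for illustration; the project's notebooks may use different operations.

```python
import random
import re
from collections import Counter


def clean_text(text: str) -> str:
    """Lowercase the text, drop URLs and non-letter characters, collapse whitespace."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)
    text = re.sub(r"[^a-z\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def bag_of_words(text: str) -> Counter:
    """Very simple feature extraction: token counts of the cleaned text."""
    return Counter(clean_text(text).split())


def train_test_split(items, test_ratio=0.2, seed=42):
    """Shuffle a copy of the items with a fixed seed and split off a test portion."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]


print(clean_text("Check THIS out!!! http://spam.example 100%"))
print(bag_of_words("Good good movie"))
```

Fixing the seed keeps the split reproducible between runs, which makes later model comparisons fair.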

Link: https://github.com/AmirAli5/Data-Science-Intern-at-Data-Glacier/tree/main/Week%209

Week 10

Week 10 (1 June - 8 June)

ML

Project: Deliverable

This week: building the CNN with LSTM model.

Link: https://github.com/AmirAli5/Data-Science-Intern-at-Data-Glacier/tree/main/Week%2010

Week 11

Week 11 (8 June - 14 June)

ML

Project: Deliverable

This week: evaluating the model's performance.
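
The README does not say which metrics were used. As a sketch under that caveat, the function below computes accuracy, precision, recall, and F1 for a binary classifier from true and predicted labels; the sample labels are made up.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels for illustration only.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
print(classification_metrics(y_true, y_pred))
```

Reporting precision and recall alongside accuracy matters when the classes are imbalanced, because accuracy alone can look good for a model that rarely predicts the minority class.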

Link: https://github.com/AmirAli5/Data-Science-Intern-at-Data-Glacier/tree/main/Week%2011

Week 12

Week 12 (14 June - 23 June)

ML

Project: Deliverable

This week: building the ML application using the Flask framework.

Link: https://github.com/AmirAli5/Data-Science-Intern-at-Data-Glacier/tree/main/Week%2012

Week 13

Week 13 (23 June - 30 June)

ML

Project: Final Submission

Final project submission, including source code, application, report, and presentation.

Link: https://github.com/AmirAli5/Data-Science-Intern-at-Data-Glacier/tree/main/Week%2013

END
