ashmitan/BuildingDataMLPipelines

Building Data ML Pipelines

This repository consists of various projects involving scripts for building data pipelines.

It contains the following assignments and tasks:

  1. Social Media Analytics Pipeline: This project designs and develops a data ingestion pipeline that collects real-time tweets for the top 5 companies and is scheduled in Apache Airflow. Raw data is inserted into an AWS S3 bucket, then cleansed and transformed; the processed data is then available for further analysis.

  2. License Number Plate Detection: This project builds data ingestion, machine learning, and data inference pipelines that detect text from car number plate images. The detected text is verified against tasks created in Amazon Mechanical Turk, and the results are displayed in a Flask application.

  3. Time Series Analysis: This project retrieves data from the FRED website using the FRED API and then develops a forecasting model.

  4. E-Commerce Log Monitoring and Visualization using ELK stack: This project designs an end-to-end analytical pipeline that performs analytics on Windows/server logs generated for an e-commerce website. The logs are stored in Elasticsearch and visualized in Kibana.
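
To give a feel for the cleanse-and-transform step mentioned in the Social Media Analytics Pipeline (item 1), here is a minimal sketch in plain Python. It is not the code from this repository: the record fields (`id`, `company`, `text`), the function names `clean_tweet` and `transform`, and the cleansing rules themselves (dropping URLs and @mentions, lowercasing) are all assumptions chosen for illustration.

```python
import re

# Hypothetical cleansing rules; the project's actual rules are not described here.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
WHITESPACE_RE = re.compile(r"\s+")


def clean_tweet(text: str) -> str:
    """Strip URLs and @mentions, collapse whitespace, and lowercase the text."""
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    return WHITESPACE_RE.sub(" ", text).strip().lower()


def transform(raw_records: list[dict]) -> list[dict]:
    """Turn raw tweet records into cleaned records, dropping ones left empty."""
    processed = []
    for record in raw_records:
        cleaned = clean_tweet(record.get("text", ""))
        if cleaned:  # skip tweets that contained only URLs or mentions
            processed.append(
                {
                    "id": record.get("id"),            # assumed field name
                    "company": record.get("company"),  # assumed field name
                    "text": cleaned,
                }
            )
    return processed
```

In a scheduled pipeline, a function like `transform` would typically run as its own task after the raw data has landed in storage, so that the raw and processed copies stay separate and the cleansing step can be rerun without collecting the tweets again.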