  • Richardson, Texas
mehroosali/README.md

< Hello World, I'm Mehroos Ali />

  • I am a collaborative data engineering professional with substantial experience in the analysis, design, development, implementation, migration, convergence, management, and support of large-scale databases, data warehouses, and big data systems. I create intuitive architectures and frameworks that help organizations capture, store, process, visualize, and analyze large volumes of structured, semi-structured, unstructured, and streaming heterogeneous data.
  • I am currently pursuing my Master's in Computer Science at the University of Texas at Dallas, specializing in Intelligent Systems.
  • I interned at Amazon as a Data Engineer this past summer, where I gained experience designing and developing streaming data pipelines.
  • I previously worked as a Data Engineer at Onward Technologies, a global IT service provider in domains such as data analytics, data science, Artificial Intelligence (AI), and Machine Learning (ML). Before that, I worked at Cognizant for Suncorp, their flagship core banking and insurance customer.
  • I am interested in Big Data Engineering, Cloud Data Warehousing, DevOps, and Full Stack Development.
  • 📩 Feel free to reach me at mehroosali@gmail.com.

🛠 My Toolkit

Java, Python, SQL, Hadoop, Spark, Kafka, Airflow, Hive, Sqoop, NiFi, Docker, IntelliJ

Oracle, MySQL, AWS, GCP, Azure, Maven, GitHub, Postman, Linux, Databricks, Jenkins, VS Code

🏆 GitHub Stats

Mehroos's GitHub Stats

🤝 Let's stay connected!

       

Pinned

  1. databricks-F1-Project Public

    A data pipeline project built on Databricks and Azure to demonstrate the lifecycle of a cloud data project.

    Jupyter Notebook 5 3

  2. s3-redshift-batch-etl-pipeline Public

    Built a functional Python ETL script with functions that initialized Spark clusters using the PySpark library to extract songs stored in an S3 bucket. Partitioned songs data by year and artist_id and compre…

    Python 5 3

  3. bigquery-sparksql-batch-etl Public

    Batch ETL pipeline project on GCP to load and transform daily flight data using Spark to update tables in BigQuery. The pipeline is automated using Airflow.

    Python 2

  4. ABCStoresPipeline Public

    Batch ETL data pipeline built on HDP 3.0 to process daily sales and business data to produce Power BI reports. Automated the pipelines using Airflow.

    Scala

  5. Twitter-Sentiment-Analysis Public

    Personal project that pulls live Twitter data using the NiFi getTwitter processor and pushes it to a Kafka topic, which is then consumed by a Spark Streaming application where basic sentiment analysis is perfor…

    Scala 1 1

  6. Realtime-Customer-Viewership-Analysis Public

    A data pipeline using the lambda architecture that unifies and consolidates real-time customer web events, weblogs, and profile data into a Hive warehouse for ad hoc analysis.

    Scala
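
The Realtime-Customer-Viewership-Analysis description above mentions the lambda architecture. As a rough illustration of that idea only (not the repository's actual code, which is written in Scala), the sketch below shows the three layers in plain Python. The event fields (`user`, `page`) and the function names are hypothetical, chosen for this example.

```python
from collections import Counter

# Hypothetical event shape: {"user": ..., "page": ...}. Not taken from the repository.

def batch_view(master_events):
    """Batch layer: recompute page-view counts from the full historical dataset."""
    return Counter(event["page"] for event in master_events)

def speed_view(recent_events):
    """Speed layer: count only the events that arrived after the last batch run."""
    return Counter(event["page"] for event in recent_events)

def serve(batch, speed):
    """Serving layer: merge the batch and real-time views to answer a query."""
    return batch + speed

# Illustrative sample data (made up for this sketch).
historical = [
    {"user": "u1", "page": "/home"},
    {"user": "u2", "page": "/home"},
    {"user": "u1", "page": "/shows"},
]
recent = [{"user": "u3", "page": "/shows"}]

merged = serve(batch_view(historical), speed_view(recent))
print(dict(merged))
```

In a real deployment the batch view would typically be rebuilt on a schedule from the warehouse, while the speed view would cover only the window since the last rebuild; here both are simple in-memory counters.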