mishrayashi/README.md

Hi, I'm Yashi Mishra 👋



About Me

I'm an AI & Data Engineer based in India, with over 4 years of experience bridging large-scale data engineering with production AI systems. I work across Python (PySpark), SQL, Databricks, and cloud-native architectures on Azure / GCP / AWS, with a focus on data quality and real-time pipelines.

I've delivered projects onsite with ministry stakeholders in Dubai and Abu Dhabi, translating government-level requirements into production data systems while coordinating across UAE, UK, and India time zones.

name:        Yashi Mishra
role:        AI & Data Engineer
location:    India  →  delivering globally (UAE • UK • IN)
focus:       Data Quality • Real-Time Pipelines • LLM & Voice AI
education:   B.Tech, Computer Science — GLA University (81.9%)
currently:   Researching data · Creating architectures · Medallion pipelines
ask-me-about: Microsoft Fabric, Databricks, PySpark, RAG, Whisper/Deepgram/ElevenLabs

What I've Shipped

  • 🏛️ Authored 300+ data-quality rules for UAE Official Development Assistance pipelines on Microsoft Fabric + Azure Databricks — audit-grade ministry reporting with Unity Catalog lineage.
  • Cut SQL latency by more than 30%, to under 2 seconds, for live-ops dashboards by rewriting queries and stored procedures across PostgreSQL + BigQuery.
  • 🧠 Fine-tuned multilingual LLMs on the Hugging Face stack with parallel and batch training pipelines on distributed Azure + Ray infrastructure.
  • 🎙️ Prototyped a real-time Voice AI interview assistant — Whisper + Deepgram + ElevenLabs — with full production architecture and sizing presented to the client.
  • ❄️ Built a Bronze → Diamond medallion architecture on GCP for a cold-chain IoT logistics client, with n8n / Make.com alerting on reefer-temperature thresholds.
  • 💼 Designed the backend data architecture for ZipApply's candidate-recruiter matching using a life-quality scoring signal, HPCC ECL ETL pipelines, and OpenAI-powered resume tooling.

Tech Stack

AI / Voice / LLM Systems

OpenAI • Hugging Face • Whisper • Deepgram • ElevenLabs • Ray • RAG

Data Engineering

PySpark • Databricks • Microsoft Fabric • BigQuery • Snowflake • dbt • Airflow • SQL

Backend & Languages

Python • FastAPI • PostgreSQL • MySQL • MongoDB • React

Cloud, DevOps & Viz

Azure • GCP • AWS • Terraform • Docker • Power BI • Looker • Tableau • n8n




Currently Exploring

  • 🎙️ Production architectures for low-latency Voice AI (sub-300ms end-to-end)
  • 🧪 Distributed fine-tuning patterns for multilingual LLMs on Ray + Azure
  • 🏗️ Lakehouse design with Unity Catalog and lineage-first data governance
  • 🔄 Event-driven medallion pipelines that hold up under audit

Let's Connect

LinkedIn • Portfolio • Gmail • GitHub


"Data that ships, AI that talks, pipelines that hold under audit."


Popular repositories

  1. mishrayashi (Public)

    Config files for my GitHub profile.

  2. ETLWeather_Airflow (Public)

    Python

  3. Introduction-to-DataBricks (Public)

    HTML

  4. mishrayashi.github.io (Public)

    Forked from subhayu99/subhayu99.github.io

    My Portfolio

    TypeScript

  5. mishrayashiportfolio.github.io (Public)

    HTML

  6. calltracker (Public)

    HTML