sienlonglim/README.md
  • 👋 Hi, I’m sienlonglim
  • 👀 I’m interested in Data Science, Data Engineering and Data Analytics, and anything related to AI/ML.
  • 🌱 I’m currently pursuing a Master of Science in Analytics at Georgia Tech.
  • 💞️ I’m looking to collaborate on anything that can help me learn.
  • 📫 How to reach me: [contact badges]

Personal projects:

  1. Document Query Bot (RAG Framework)
  • Document splitting
  • Embeddings (OpenAI)
  • Vector database (Chroma / FAISS)
  • Semantic search
  • Retrieval chain
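
The splitting and semantic search steps listed above can be sketched in plain Python. This is a minimal illustration, not the project's code: the bag-of-words `embed` function is a toy stand-in for OpenAI embeddings, the in-memory list stands in for a Chroma / FAISS index, and the function names and sample text are hypothetical.

```python
import math
from collections import Counter


def split_document(text, chunk_size=7, overlap=2):
    """Split text into chunks of `chunk_size` words that overlap by `overlap` words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(words):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
        start += step
    return chunks


def embed(text):
    """Toy embedding (assumption): word counts instead of a learned vector."""
    tokens = [w.strip(".,;:!?()").lower() for w in text.split()]
    return Counter(t for t in tokens if t)


def cosine_similarity(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, chunks, top_k=1):
    """Return the `top_k` chunks most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(
        chunks,
        key=lambda c: cosine_similarity(query_vec, embed(c)),
        reverse=True,
    )
    return ranked[:top_k]


# Made-up sample document for illustration only.
document = (
    "Dagster schedules the pipeline runs every night. "
    "DuckDB stores the transformed tables on disk. "
    "Flask serves the dashboard to end users."
)
chunks = split_document(document)
best = search("Who serves the dashboard?", chunks)
print(best[0])
```

In a retrieval chain, the top-ranked chunks would then be passed to a language model as context for answering the query.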
  1. HDB Resale Prices Predictor and Dashboard
  • Large dataset involving geodata 🌏
  • REST API calls to Data.gov.sg and OneMap API 🗺️
  • Feature creation and selection (KBest, L1 regularisation)
  • Hyperparameter tuning (randomised search with cross-validation)
  • Ensemble models (Gradient boosting, Random forest)
  • Web Application (Flask) with Bootstrap 5

  1. Dagster-dbt-duckdb Pipeline
  • Dagster for orchestration
  • dbt for data modeling and transformation
  • DuckDB for storage

  1. Stock portfolio analysis (K-means), forecasting (ARIMA) and stock recommendation
  • Web scraping (BS4)
  • ETL
  • RDBMS (MySQL)
  • K-means clustering
  • ARIMA
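
As a rough illustration of the K-means clustering step, the sketch below groups stocks by two features in plain Python. It is not the project's code: the (return, volatility) pairs are made-up values, the function names are hypothetical, and the centroids are initialised from the first `k` points to keep the example deterministic, whereas library implementations usually use randomised initialisation.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(dim) / len(points) for dim in zip(*points))


def nearest_centroid(point, centroids):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)), key=lambda i: squared_distance(point, centroids[i]))


def kmeans(points, k, iterations=20):
    """Lloyd's algorithm with deterministic initialisation (first k points)."""
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest_centroid(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            mean_point(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # assignments can no longer change
        centroids = new_centroids
    labels = [nearest_centroid(p, centroids) for p in points]
    return centroids, labels


# Hypothetical (average return, volatility) pairs, for illustration only.
stocks = [
    (0.02, 0.10),
    (0.03, 0.12),
    (0.15, 0.40),
    (0.14, 0.38),
    (0.022, 0.105),
    (0.16, 0.42),
]
centroids, labels = kmeans(stocks, k=2)
print(labels)
```

With this toy data the low-return, low-volatility stocks end up in one cluster and the high-return, high-volatility stocks in the other.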

  1. Web application that summarises attendance taking for the SkillsFuture website
  • Web Application (Flask) with Bootstrap 5
  • Telegram Bot API
  • RDBMS (MariaDB)

  1. EDA of Real Anonymized Financial Dataset with SQL (Czech Republic PKDD'99 Discovery Challenge)
  • Database design
  • MariaDB with CLI and DbVisualizer
  • SQL queries, connectors
  • Report writing

  1. Other Exploratory Data Analysis / Machine Learning projects
  • Classification of Watson Healthcare Employee Attrition (Decision Tree) 👨‍⚕️

  • Regression fill on World Economic Data (Linear, polynomial regression) 💵

  • EDA of World Population Data (Interactive charts with Plotly) 🌏

  • EDA and recommendation on pet store sales (Fictitious dataset) 🐶

Pinned

  1. dbt-elt (Public)

    This project aims to build a modern data pipeline with CI/CD practices using Dagster and dbt on top of DuckDB.

    Jupyter Notebook

  2. ml_webapp (Public)

    This project utilises open data from Data.gov.sg to build several Machine Learning (ML) models that help predict HDB Resale Prices.

    Jupyter Notebook

  3. LangChain (Public)

    This project implements RAG using OpenAI's embedding models and LangChain's Python library.

    Jupyter Notebook, 9 stars, 2 forks

  4. financial_analysis_forecasting (Public)

    A web scraper that retrieves your stock portfolio for cluster analysis and recommendation.

    Jupyter Notebook, 2 stars

  5. jobs_retriever_automailer (Public)

    This project automates job listing retrieval from LinkedIn via GitHub Actions.

    Jupyter Notebook

  6. healthhack (Public)

    This project implements RAG as a conversational model (chatbot) for explaining medical reports to patients. Part of the HealthHack Singapore competition.

    Jupyter Notebook, 1 star