ruz023

Pinned

  1. Unsupervised-Fraud-Algorithm-on-the-NY-Property-Tax-Submission-Data Public

    Automated, unsupervised outlier identification from 1 million+ NYC properties. Two fraud models were constructed via z-score/PCA and autoencoders, respectively, and combined to identify 100+ instan…

    Jupyter Notebook

  2. Bayesian_Recommender_System Public

    Used Probabilistic Matrix Factorization (PMF), implemented in PyMC3, to recommend movies and TV shows to Netflix users. Showed that the Bayesian method outperformed the baseline models.

    Jupyter Notebook 2

  3. DS-Take-Home-Challenges Public

    Used linear and tree-based models and visualization techniques to solve common data science problems, including calculating conversion rates, analyzing A/B tests, churn/retention prediction, fr…

    Jupyter Notebook

  4. Pricing-Analytics---Hotel-Pricing-through-Casual-Inference-Analysis Public

    I used 28 relevant attributes to price hotel rooms through causal inference analysis of the relationship between price and demand. PCA and K-Means clustering were used to compare prices only among rooms with similar eno…

    Jupyter Notebook 1

  5. Spark-Desmontration Public

    A demonstration of exploring a large dataset with Spark, using PySpark and SparkR. The files cover loading data, data exploration, and clustering the words in Shakespeare's works.

    Jupyter Notebook

  6. time-series-sales-prediction Public

    A data challenge to predict future store sales from past store sales. The data exceeded 10 gigabytes and covered multiple years and retail stores across the United States. Unfortunately, due to an …

    Jupyter Notebook 1