YLTsai0609/DataScience101

What is it?

Machine Learning

Learning Theory (Optimization and Generalization)

Overfitting or Perfect Fitting? Risk Bound for Interpolated Models

over_parameterization

optimization_deep_learning_I (gradient descent-like methods)

Explaining the Success of AdaBoost and Random Forests as Interpolating Classifiers

EM_algorithm

implementation_survey

Information Theory

ShannonEntropy_to_KL_divergence

from KL_divergence_to_conditional_entropy

Supervised Learning

XGBoost theory and code (700+ thumbs up)

Naive Bayes and how it works when features are dependent

Unsupervised Learning

Clustering

Introduction

cs229_11_GMM

ML_13_Unsupervised_Learning_HungYi_Lee

ML_14_Unsupervised_Learning_WordEmbedding_HungYiLee

ML_15_Unsupervised_Learning_NeighborhoodEmbedding_HungYiLee

Self-Supervised Learning

Introduction

auto-encoder and its application

Research Flow

model_size_gpu_resource_estimation

YuLong's paper note

PaperReading_Hints

Paper development

efficient_research_workflow

Deep Learning

Network Components

CNN by Hung-yi Lee

Convolution, dot, and Cross-correlation

all about the convolutions

Brief_Introduction_of_4_well-known_CNN Architecture

Affine_transformation_bilinear_interpretation

Attention, Self-Attention

Practical nn tuning

Batch Normalization

layer, instance, group normalization

Transfer Learning / Domain Adaptation

Transfer Learning

domain_adaptation.md

Natural Language Processing (NLP)

materials

NLP_Model_evaluation_metric

KeyWord_Extraction_TFIDF_RAKE_Graph_TextRank

Topic_Modeling_LDA

LDA source code survey

Ngram

Word Embedding

Tokenization

algorithm in jieba

Data Diversity / Data Augmentation

Introduction to Text Augmentation

Contextualized Word Embedding & Transformer

transformer 2019 - optimize sequential data in parallel

transformer 2021

ELMO, BERT, GPT

ChatGPT

Named Entity Recognition (NER)

Introduction to NER

kashgari - production level bert model source code analysis

POS Tagging

Introduction to POS-Tagging

Graph Convolutional Network (GCN)

Graph deep learning study material

Graph Neural Networks: Models and Applications AAAI 2020 tutorial

GCN QAs

Stanford_Graph_Representation_Learning

Stanford_Properties_of_Network_and_Random_Graph_Models

Graph Convolution implementation using numpy and MXNet

Recommendation System (RecSys)

Introduction

recsys_overview

learning to rank

metrics

ml-application on recsys by facebook ml engineer

ANN Introduction

Recommendation with Respect to Product

recommender from product perspective

data property of recommendation system

Case Study

Netflix

Instagram - explore (2019)

airbnb - search ranking (2018)

YoutubeDNN for Video - 2016

Sourcing Model

factorization machine implementation survey

tffm source code analysis

personal rank

Search

google search guideline

Search Survey

Computer Vision (CV)

Introduction

materials

Object Detection History (SVM, HOG, SlidingWindow CNN, R-CNN, YOLO) from Kaggle Notebook

The 5 Computer Vision Techniques That Will Change How You See The World from medium post

CV_SomeKeyword

CNN Architecture: AlexNet, VGG, GoogLeNet, ResNet, NIN, DenseNet, SqueezeNet

Image Classification

Feature Extraction - Histogram of Oriented Gradient HOG

Object Detection

Data labeling

Selective Search

MTCNN

YOLO

How to design a system like yolo

YOLO source code

Object Detection Evaluation Metric

focal_loss_for_dense_object_detection

The Diversity of Images in Object Detection

image_attributes

Make_Object_Detection_Algorithm_Useful

[Topic survey]chop_object_with_different_background_to_boost_robustness

Multiple_object_tracking_sort_deep_sort_survey

Detection_Localization_Segmentation_Stanford

Model compression / Speedup algorithm / hardware

Efficient Methods and hardware for deep learning

Network Compression II by Hung-yi Lee

Faces

awesome-Face_Recognition

facenet practical usage

Face3D

face_clustering

facial_landmark_detection

Image Search

KD-Tree with cnn

3D Vision

Introduction

Business Analysis

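Business Analysis in Data Science 1: Define the Problem, Don't Do Useless Projects

Business Analysis in Data Science 2: Hypothesis-Driven Thinking Is Crucial for Data Scientists

Information Quality (I)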

Information Quality (II)

In-House EDA guideline

Case_Study_Questionnaire_analysis_improve_website_UIUX

yelp business model

eBay-User-Journey

Experiment Design

The type of experiments

A/B Testing

T - test, Z - test and ANOVA

Chp1 Chp2 Experiment and Testing

Chp_3_Statistics_and_Hypothesis

Chp_4_Experiments_and_analysis

Chp_5_button_experiments

How Pinterest Supercharged its Growth Team With Experiment Idea Review

A/B Testing: "Peeking at the Results" Will Be Your Biggest Mistake

Decision Analysis

Using Decision Tree for decision making 1, 2, 3, 4, 5

Geospatial Analysis (Location-Based Service)

Kaggle Learn All Mash Up

LineSpot - GeoHashing, KD-Tree

Statistics

materials

Probability

P-Value

Univariate distribution

displaying pdf, cdf, sf, ppf, and isf

All_about_the_distribution

t, F, chi distribution

Bayesian Statistics

How_Bayesian_inference_works

Bayesian_Hyperparameter_tuning

Bayesian_LinearRegression

Bayesian_analysis_python

Price Dynamic in airbnb

Statistical Regression Analysis

Generalized_Additive_Models(Non Linear)

Linear regression analysis

Probabilistic_Graphical_Models

Plate Notation for Directed Graphical Models

Markov Chain I : Markov_chain_MCMC

Markov Chain II : absorbing Markov Chain

Probabilistic_Graphical_Models_PGM

HiddenMarkovModel_HMM

HMM_Viterbi_Algorithm

From_Markov_Chain_to_LSTM

Causal Inference

introduction

DevOps/Computer Science

Course

Missing_Semester_of_Your_CS_Education

Introduction for Computer Science cs50

Algorithm by python

Advanced python

Data Structures and Algorithms: In-Depth using Python

Deployment of Machine Learning Models

Practical hints

hardware_introduction

Tech Estimation

DevOps_SomeKeyWord

How to read a big repository

garbage collection, shallow copy, deep copy

Standard Data Science project template

Kaggle_Six_steps_to_more_professional_data_science_code

Kaggle_Making_an_app_from_your_modeling_code

BCG GAMMA Data Science in Production — Advanced Python Best Practices

Serialization_Encoding_Bytes_ByteArray

Model Saving Keras, Tensorflow, Pytorch, Other

FaceNet_install

SpeedUp Strategy

TensorRT highlight

tflite and edgetpu

Manage_package_by_Conda

Python Versioning & Package Management (python + virtualenv vs pipenv vs poetry)

python command line

Compile and Install

dlib_compile

opencv_compile

Building YOLOv3

Raspberry Pi 4 hints

Deployment and maintaining

API_docstring_generator

test_your_code_pytest_unit_test

packaging_your_code

Coding style

code formatter

pep 484 Type Hinting

pep 498 f_string

Math Stuff

sympy : use python to do calculus

semi_positive_matrix

all_about_the_distance

variance_distance_correlation

lemma_theorem_proposition

Deep Learning Python Framework

DL software

tensorflow-note

keras-note

pytorch-note

Open dataset

wiki

Physics

Polarization by reflection (the principle behind lens glare)

Data Culture Building / Management

Introduction

Science vs Analytics vs Engineering vs Product

Scaling knowledge / lab

Code Review Guideline

001 - Google The Standard of Code Review

002 - Google What to Look for in a Code Review

003 - Google Navigating a CL in review

004 - Google Speed of Code Reviews

summary_deds_code_review

agile for Software development

talk from infuseAI

story point

MLOps and ML System Design

Course & papers

Machine Learning System Design Stanford CS329S

materials

Scale Up ML Lifecycle

Concept and Best Practice

mlops from gcp introduction

ml best practice from google

ml system tech debt

why your pipeline is slow?

Framework

overview

Data Career

Jobs and Companies

Others

Question Asking Guide

Paper_template

Data pipeline design principle

computation_profiling

Efficient Vim + Markdown

Learning ML DL 3 stage

some_name_explanation_software_engineering

Tableau

Markdown_Ref

LaTeX_Ref

Solving Coding Problem Guideline

Data Science and Robots by Brandon Rohrer

About

Yu Long's note about data science
