This is a repository of all the things I would like to convert into projects, preferably using both R and Python. Really it's somewhere for me to store all the notes, links, blogs, tweets, and hallway whispers I would like to revisit at some point. Step 1: organize the notes.
- structural time series; aka Kalman filter; state space modeling
+ R packages dlm and dse (Python sketch below)
- Bayesian structural time series
+ R package brms
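The item above points at R's dlm/dse; as a rough Python stand-in (my assumption, not something from the original note), statsmodels' UnobservedComponents fits a local-level state space model and runs the Kalman filter/smoother under the hood:

```python
# Local-level structural time series (Kalman filter/smoother) via statsmodels.
# Python stand-in for the R dlm/dse workflow; data below are synthetic.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
level = np.cumsum(rng.normal(0, 0.5, 200))        # random-walk level
y = pd.Series(level + rng.normal(0, 1.0, 200))    # plus observation noise

# "local level" = random-walk level + noise; add trend/seasonal terms as needed.
model = sm.tsa.UnobservedComponents(y, level="local level")
res = model.fit(disp=False)

print(res.summary())
smoothed_level = res.smoothed_state[0]            # Kalman-smoothed level estimate
forecast = res.get_forecast(steps=12).predicted_mean
```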
- Frequent Pattern growth (FP-growth) tree -- use for seeing what things happen at the same time (sketch below)
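Not from the original note, but a minimal FP-growth sketch for the "what happens at the same time" question, assuming the mlxtend package and made-up transactions:

```python
# Frequent itemsets via FP-growth: which items tend to occur together.
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import fpgrowth

transactions = [
    ["milk", "bread", "eggs"],
    ["milk", "bread"],
    ["bread", "butter"],
    ["milk", "bread", "butter"],
]

# One-hot encode the transactions into a boolean DataFrame.
te = TransactionEncoder()
onehot = pd.DataFrame(te.fit_transform(transactions), columns=te.columns_)

# Itemsets appearing in at least 50% of transactions.
itemsets = fpgrowth(onehot, min_support=0.5, use_colnames=True)
print(itemsets.sort_values("support", ascending=False))
```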
- ARIMA modeling
- exponential smoothing
- GAMs (generalized additive models)
- Neural Networks
- Natural Language Processing
+ sentiment analysis
+ term frequency-inverse document frequency -- TF-IDF (http://www.tfidf.com/)
+ creating a word cloud in a pre-defined shape and color
+ Latent Dirichlet allocation (LDA) -- used to discover underlying topics (http://blog.echen.me/2011/08/22/introduction-to-latent-dirichlet-allocation/); see the TF-IDF/LDA sketch after this list
+ (https://algorithmia.com/tags/text-analysis)
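A rough Python sketch of the TF-IDF and LDA items above, assuming scikit-learn (the linked posts are library-agnostic):

```python
# TF-IDF weighting and LDA topic discovery on a toy corpus (scikit-learn).
from sklearn.feature_extraction.text import TfidfVectorizer, CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "the cat sat on the mat",
    "dogs and cats are pets",
    "stock markets fell on inflation fears",
    "the central bank raised interest rates",
]

# TF-IDF: upweight terms frequent in a document but rare across the corpus.
tfidf = TfidfVectorizer(stop_words="english")
X_tfidf = tfidf.fit_transform(docs)
print(tfidf.get_feature_names_out())

# LDA works on raw counts, not TF-IDF weights.
counts = CountVectorizer(stop_words="english")
X_counts = counts.fit_transform(docs)

lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(X_counts)          # per-document topic mixture

# Top words per discovered topic.
terms = counts.get_feature_names_out()
for k, topic in enumerate(lda.components_):
    top = topic.argsort()[::-1][:5]
    print(f"topic {k}:", [terms[i] for i in top])
```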
- Poisson Regression
- Random Forest
- Gradient boosting
- Stanford's NLP suite, CoreNLP (https://stanfordnlp.github.io/CoreNLP/) -- tool/framework that includes tokenization (splitting text into words), part-of-speech tagging, grammar parsing (nouns vs. verbs), named entity recognition, etc. Else give this Python utility a try: http://www.nltk.org/ ; similar to the Stanford library -- supports tokenizing, parsing, and named entity identification (see the NLTK sketch below)
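A quick NLTK version of the tokenize → POS tag → named entities pipeline described above (resource names can vary slightly across NLTK versions):

```python
# Tokenization, part-of-speech tagging, and named entity recognition with NLTK.
import nltk

# One-time resource downloads (tokenizer models, POS tagger, NE chunker).
for pkg in ["punkt", "averaged_perceptron_tagger", "maxent_ne_chunker", "words"]:
    nltk.download(pkg, quiet=True)

text = "Stanford University is located in California."

tokens = nltk.word_tokenize(text)       # ['Stanford', 'University', 'is', ...]
tagged = nltk.pos_tag(tokens)           # [('Stanford', 'NNP'), ('is', 'VBZ'), ...]
entities = nltk.ne_chunk(tagged)        # tree with ORGANIZATION/GPE chunks

print(tagged)
print(entities)
```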
- DeepAR forecasting algorithm at Amazon (https://docs.aws.amazon.com/sagemaker/latest/dg/deepar.html). See the white paper.
(https://www.youtube.com/channel/UCYO_jab_esuFRV4b17AJtAw/videos)
- Containers
+ Docker
+ S3 cluster
+ ECR
- Amazon Athena Optimization
- Amazon SageMaker
+ How to productionize ML models
+ End-to-End ML Platform
+ Zero setup
+ Flexible model training
+ TensorFlow
+ mxnet
+ Gluon
+ Scales out with demand; pay per use, by the second
+ UX -- SageMaker console + Jupyter notebooks
+ Use SageMaker's hosted Notebook Instances
+ or Apache Spark through EMR and the SageMaker Spark SDK
+ or SageMaker's Console for a point and click experience
+ or your own device (EC2, laptop, etc.)
+ Training/Hosting
+ Custom models via Docker/ECR
+ Low latency, high throughput, and high reliability
+ Zero downtime deployment and A/B testing
+ Trained model artifact is uploaded to S3 (see the SDK sketch at the end of these SageMaker notes)
+ Built-in algorithms
+ XGBoost, FM, Linear, and DeepAR Time-Series Forecasting for supervised learning
+ Kmeans, PCA, and Word2Vec for clustering and pre-processing
+ Image Classification
+ Native TensorFlow and MXNet support
+ Build your own algorithm
1. Pick your preferred framework
+ scikit-learn
+ R
+ PyTorch
+ Java
2. ... upload to Docker .. then...
+ Hyperparameter optimization (automatic model tuning)
+ Spark SDK reads from S3
+ SageMaker instances are typically smaller and not typically used for data munging; assume the data has already been munged
+ See the SageMaker Jupyter notebook repository
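A rough sketch of the custom-container train-and-host flow from the notes above, using the SageMaker Python SDK (v2-style argument names; the role ARN, bucket, and ECR image below are placeholders, not real resources):

```python
# Train and host a custom Docker/ECR model on SageMaker (SDK v2-style).
import sagemaker
from sagemaker.estimator import Estimator

session = sagemaker.Session()
role = "arn:aws:iam::111122223333:role/MySageMakerRole"      # placeholder role

estimator = Estimator(
    image_uri="111122223333.dkr.ecr.us-east-1.amazonaws.com/my-algo:latest",  # custom image in ECR (placeholder)
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    output_path="s3://my-bucket/artifacts/",   # trained model artifact lands here
    sagemaker_session=session,
)

# Each channel becomes /opt/ml/input/data/<channel> inside the container.
estimator.fit({"train": "s3://my-bucket/train/"})

# Deploy the trained artifact behind a real-time endpoint.
predictor = estimator.deploy(initial_instance_count=1, instance_type="ml.m5.large")
```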
- L1 is still a shrinkage penalty; it just shrinks coefficients exactly to zero in a finite amount of time, whereas L2 shrinks them asymptotically toward zero.
- L2 is preferred over L1 for several reasons. L2 is a shrinkage penalty, where L1 will just set values to zero. So L2 "moderates" your parameter values, while L1 just eliminates them. Also, L2 is differentiable everywhere, so it makes backpropagation-style gradient updates better behaved (quick check below).
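A tiny scikit-learn check of the two bullets above (my example, not from the notes): with noisy features, Lasso (L1) drives some coefficients exactly to zero, while Ridge (L2) only shrinks them toward zero:

```python
# L1 (Lasso) sets some coefficients exactly to zero; L2 (Ridge) only shrinks them.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge

# 10 features, only 3 actually informative.
X, y = make_regression(n_samples=200, n_features=10, n_informative=3,
                       noise=10.0, random_state=0)

lasso = Lasso(alpha=5.0).fit(X, y)
ridge = Ridge(alpha=5.0).fit(X, y)

print("lasso exact zeros:", np.sum(lasso.coef_ == 0))   # several exact zeros
print("ridge exact zeros:", np.sum(ridge.coef_ == 0))   # typically none, just small values
print("lasso coef:", np.round(lasso.coef_, 2))
print("ridge coef:", np.round(ridge.coef_, 2))
```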
- Paper on Dropout: http://www.cs.cmu.edu/~rsalakhu/papers/srivastava14a.pdf
- As for dropout, it's kind of like what you do in a random forest, where you only use a subset of variables in each tree. You're suppressing the signal from some features so that other features have a chance to stand out. It also helps decorrelate variables, so they stand more on their own (toy sketch below).
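A bare-bones inverted-dropout mask in NumPy to make the "suppress some features each pass" idea concrete (my sketch, not from the paper):

```python
# Inverted dropout: randomly zero a fraction of activations during training,
# rescaling the survivors so the expected activation stays the same.
import numpy as np

def dropout(activations, p_drop=0.5, training=True, rng=np.random.default_rng()):
    if not training or p_drop == 0.0:
        return activations                      # no-op at inference time
    keep = rng.random(activations.shape) >= p_drop
    return activations * keep / (1.0 - p_drop)  # rescale so E[output] == input

h = np.ones((4, 5))            # pretend hidden-layer activations
print(dropout(h, p_drop=0.5))  # roughly half the units zeroed each call
```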
- NN for NLP best practices: http://ruder.io/deep-learning-nlp-best-practices/index.html
- random matrix theory to explain either saddle behaviour in neural net losses, or that most local minima are similar in depth
- If a NN isn't training: https://stats.stackexchange.com/questions/352036/what-should-i-do-when-my-neural-network-doesnt-learn