
Exercises :)

This repo gathers various exercises on ML problems. Some of them are still active projects; others are completed.

Collaborative filtering exercise

This comes from the Computational Intelligence Lab class @ETH during the 2020 spring semester.
The exercise explores basic matrix completion/factorization techniques and exposes their limitations. You'll also find a hand-made implementation of SGD.
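
To give a flavor of the idea, here's a minimal sketch of SGD on the squared reconstruction error of a low-rank factorization. The function name, hyperparameters, and the dense `R`/`mask` representation are illustrative assumptions, not the notebook's actual code:

```python
import numpy as np

def sgd_factorize(R, mask, k=10, lr=0.01, reg=0.1, epochs=20, seed=0):
    """Factorize R ~= U @ V.T with SGD over the observed entries.

    R    : (n_users, n_items) rating matrix
    mask : boolean array, True where a rating is observed
    """
    rng = np.random.default_rng(seed)
    n, m = R.shape
    U = rng.normal(scale=0.1, size=(n, k))
    V = rng.normal(scale=0.1, size=(m, k))
    rows, cols = np.nonzero(mask)
    for _ in range(epochs):
        for i, j in zip(rows, cols):
            err = R[i, j] - U[i] @ V[j]
            # gradient step on the squared error with L2 regularization;
            # both rows are updated from the same (pre-update) values
            U[i], V[j] = (U[i] + lr * (err * V[j] - reg * U[i]),
                          V[j] + lr * (err * U[i] - reg * V[j]))
    return U, V
```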

PLSA exercise

This comes from the Computational Intelligence Lab class @ETH during the 2020 spring semester.
The exercise explores probabilistic latent semantic analysis (PLSA) as presented in the paper https://arxiv.org/pdf/1301.6705.pdf. The goal of the exercise is to implement PLSA using the EM algorithm and to apply it to the Associated Press corpus.
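
For reference, here's a compact sketch of the EM updates for PLSA on a document-word count matrix. The vectorized NumPy formulation and all names are assumptions; the notebook's implementation may differ:

```python
import numpy as np

def plsa(N, n_topics=10, iters=50, seed=0):
    """PLSA via EM on a document-word count matrix N of shape (n_docs, n_words)."""
    rng = np.random.default_rng(seed)
    n_docs, n_words = N.shape
    # P(z|d): topic mixture per document, P(w|z): word distribution per topic
    p_z_d = rng.random((n_docs, n_topics)); p_z_d /= p_z_d.sum(1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words)); p_w_z /= p_w_z.sum(1, keepdims=True)
    for _ in range(iters):
        # E-step: posterior P(z|d,w) for every (document, word) pair
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]        # shape (d, z, w)
        post = joint / (joint.sum(1, keepdims=True) + 1e-12)
        # M-step: re-estimate both parameter sets from expected counts
        weighted = N[:, None, :] * post                      # shape (d, z, w)
        p_w_z = weighted.sum(0); p_w_z /= p_w_z.sum(1, keepdims=True)
        p_z_d = weighted.sum(2); p_z_d /= p_z_d.sum(1, keepdims=True)
    return p_z_d, p_w_z
```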

Sentiment analysis exercise -> SA repo

This comes from the Computational Intelligence Lab class @ETH during the 2020 spring semester.
This is an exploratory notebook used for the (ongoing) Kaggle Competition https://www.kaggle.com/c/cil-text-classification-2020/data.

So far, the notebook implements:

  • vocabulary extraction
  • GloVe embedding training with stochastic gradient descent (see the sketch after this list)
  • a training pipeline (using the learned embeddings to train a sentiment classifier)
  • a prediction pipeline (using the learned embeddings and the trained classifier to predict tweet sentiment)
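
As referenced above, here's a minimal sketch of SGD on the GloVe weighted least-squares objective. The co-occurrence-triple input format and all names are illustrative assumptions:

```python
import random
import numpy as np

def glove_sgd(cooc, vocab_size, dim=50, lr=0.05, epochs=10,
              x_max=100.0, alpha=0.75, seed=0):
    """SGD on the GloVe objective.

    cooc : list of (i, j, x_ij) co-occurrence triples with x_ij > 0
    """
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=0.1, size=(vocab_size, dim))   # main word vectors
    C = rng.normal(scale=0.1, size=(vocab_size, dim))   # context vectors
    bw = np.zeros(vocab_size)
    bc = np.zeros(vocab_size)
    shuffler = random.Random(seed)
    for _ in range(epochs):
        shuffler.shuffle(cooc)
        for i, j, x in cooc:
            f = min(1.0, (x / x_max) ** alpha)          # GloVe weighting function
            err = W[i] @ C[j] + bw[i] + bc[j] - np.log(x)
            g = f * err
            # simultaneous update of word and context vectors
            W[i], C[j] = W[i] - lr * g * C[j], C[j] - lr * g * W[i]
            bw[i] -= lr * g
            bc[j] -= lr * g
    return W + C  # summing main and context vectors is a common final step
```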

Update 12.04.20:
The project moved to a dedicated repo. You can see all the changes there.

Note: to run this notebook you need the .zip datasets in your current working file system, or in your Google Drive folder.

Writing like Dante

This comes from the Machine Perception class @ETH during the 2020 spring semester.
The inspiration for this notebook was drawn from this beautiful blog post. The idea is to train an LSTM on character-level text prediction using a single .txt file. I chose a text from the Project Gutenberg repository, namely La Divina Commedia by Dante Alighieri.
Perhaps what is most exciting about this work is perfectly summarized by Andrej Karpathy (the author of the blog post):

There’s something magical about Recurrent Neural Networks (RNNs). [...] Sometimes the ratio of how simple your model is to the quality of the results you get out of it blows past your expectations, and this was one of those times.
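
For concreteness, here's a minimal character-level LSTM training loop in PyTorch, in the spirit of the notebook. The file name `divina_commedia.txt`, the architecture sizes, and the random sampling of training windows are all assumptions:

```python
import torch
import torch.nn as nn

class CharLSTM(nn.Module):
    """Predict the next character from the previous ones."""
    def __init__(self, n_chars, embed_dim=64, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(n_chars, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, n_chars)

    def forward(self, x, state=None):
        out, state = self.lstm(self.embed(x), state)
        return self.head(out), state

# Train on a single .txt file: each target is the input shifted by one character.
text = open("divina_commedia.txt", encoding="utf-8").read()
chars = sorted(set(text))
stoi = {c: i for i, c in enumerate(chars)}
data = torch.tensor([stoi[c] for c in text])

model = CharLSTM(len(chars))
opt = torch.optim.Adam(model.parameters(), lr=3e-3)
loss_fn = nn.CrossEntropyLoss()
seq_len, batch = 100, 32
for step in range(1000):
    starts = torch.randint(0, len(data) - seq_len - 1, (batch,)).tolist()
    x = torch.stack([data[i:i + seq_len] for i in starts])
    y = torch.stack([data[i + 1:i + seq_len + 1] for i in starts])
    logits, _ = model(x)
    loss = loss_fn(logits.reshape(-1, len(chars)), y.reshape(-1))
    opt.zero_grad(); loss.backward(); opt.step()
```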

Here's a preview of the results:

E io a lui: «Dor, fosse, tra miggia,
non li oche giaco, se Cercoresto daicra
feruto alcundon par de' Fuorchi sonno,
me steso le sue cadëa in avro si tecco.

Tattir si poccia a corpo par savilsa;
e ch'io forse di quanto a terra soscruto,
mi fu maca scricciatà il focose
al figuboro e deggea feder puoso.

Per convien la bercine dol posova
de l'un cheggio, e a le piaggi traddume.

Image compression by clustering

This comes from the Computational Intelligence Lab class @ETH during the 2020 spring semester.
The goal of this notebook was to explore different clustering algorithms in the setting of image compression.
Above you can see the result obtained with K-means for different values of k (the number of clusters).
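
Here's a minimal sketch of the technique, assuming scikit-learn's `KMeans` and Pillow for I/O (the notebook may implement K-means by hand instead; the file names are placeholders):

```python
import numpy as np
from sklearn.cluster import KMeans
from PIL import Image

def compress(path, k):
    """Quantize an image down to k colors with K-means."""
    img = np.asarray(Image.open(path).convert("RGB"), dtype=np.float64) / 255.0
    h, w, _ = img.shape
    pixels = img.reshape(-1, 3)                 # one RGB point per pixel
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(pixels)
    # replace each pixel by the centroid of its cluster
    quantized = km.cluster_centers_[km.labels_].reshape(h, w, 3)
    return Image.fromarray((quantized * 255).astype(np.uint8))

compress("photo.jpg", k=16).save("photo_k16.png")
```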

Dijkstra’s algorithm (in fieri)

Here the goal is to implement a Monte Carlo simulation that estimates the average shortest-path length in a graph. The shortest-path algorithm will be Dijkstra's.
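
A sketch of what such a simulation could look like (the adjacency-list format and function names are assumptions):

```python
import heapq
import random
from math import inf

def dijkstra(adj, src):
    """Shortest-path distances from src; adj[u] is a list of (v, weight) pairs."""
    dist = {u: inf for u in adj}
    dist[src] = 0.0
    pq = [(0.0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue                        # stale queue entry, skip it
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(pq, (dist[v], v))
    return dist

def avg_shortest_path(adj, samples=1000, seed=0):
    """Monte Carlo estimate of the mean distance over random node pairs."""
    rng = random.Random(seed)
    nodes = list(adj)
    total = count = 0
    for _ in range(samples):
        s, t = rng.sample(nodes, 2)
        d = dijkstra(adj, s)[t]
        if d < inf:                         # skip disconnected pairs
            total += d
            count += 1
    return total / count if count else inf
```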

Tweet Generator (in fieri)

An implementation of the paper Generating Sentences from a Continuous Space.
Here's a link to the paper.

The goal

What I want to explore here is the expression of sentiment in generative models.
The dataset consists of two different samples of tweets, one with positive sentiment and one with negative sentiment.
The goal is to train two generators on the two sets separately and analyse the qualitative differences.
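
For context, here's a minimal sketch of the paper's model family: an LSTM encoder/decoder with a Gaussian latent, trained on reconstruction plus a KL term. All sizes and names are assumptions; the paper also anneals `kl_weight` from 0 to 1 during training:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SentenceVAE(nn.Module):
    """LSTM encoder/decoder with a Gaussian latent, in the spirit of Bowman et al."""
    def __init__(self, vocab, embed=128, hidden=256, latent=32):
        super().__init__()
        self.emb = nn.Embedding(vocab, embed)
        self.enc = nn.LSTM(embed, hidden, batch_first=True)
        self.to_mu = nn.Linear(hidden, latent)
        self.to_logvar = nn.Linear(hidden, latent)
        self.z_to_h = nn.Linear(latent, hidden)
        self.dec = nn.LSTM(embed, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab)

    def forward(self, x):
        _, (h, _) = self.enc(self.emb(x))
        mu, logvar = self.to_mu(h[-1]), self.to_logvar(h[-1])
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization
        h0 = torch.tanh(self.z_to_h(z)).unsqueeze(0)          # seed decoder with z
        dec_out, _ = self.dec(self.emb(x), (h0, torch.zeros_like(h0)))
        return self.out(dec_out), mu, logvar

def vae_loss(logits, targets, mu, logvar, kl_weight):
    """Reconstruction plus KL divergence to the standard normal prior."""
    rec = F.cross_entropy(logits.reshape(-1, logits.size(-1)), targets.reshape(-1))
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + kl_weight * kl
```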

Note: to run this notebook you need the .zip datasets in your current working file system, or in your Google Drive folder.

Turning my sister into an old painting (in fieri)

This notebook will explore the magic of GANs.
We are going to use a particular GAN architecture: the CycleGAN. Here's a link to the paper for the more curious.

The goal? Taking a picture of my beautiful sister and turning it into a painting, to see how she would have looked a few centuries ago.
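
To give an idea of what makes CycleGAN tick, here's a sketch of its generator-side loss: least-squares adversarial terms plus cycle consistency. All module names are placeholders; `lam` is the cycle-consistency weight (10 in the paper):

```python
import torch
import torch.nn.functional as F

def cycle_gan_generator_loss(G, F_net, D_x, D_y, real_x, real_y, lam=10.0):
    """Generator-side CycleGAN loss.

    G : photo -> painting, F_net : painting -> photo,
    D_x / D_y : discriminators on photos / paintings.
    """
    fake_y = G(real_x)                 # photo translated to a painting
    fake_x = F_net(real_y)             # painting translated to a photo
    # least-squares adversarial terms: fool both discriminators
    adv = (F.mse_loss(D_y(fake_y), torch.ones_like(D_y(fake_y)))
           + F.mse_loss(D_x(fake_x), torch.ones_like(D_x(fake_x))))
    # cycle consistency: translating there and back should recover the input
    cyc = F.l1_loss(F_net(fake_y), real_x) + F.l1_loss(G(fake_x), real_y)
    return adv + lam * cyc
```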

Signal processing

This exercise draws inspiration from a pair of lectures in the Computational Intelligence Lab class @ETH during the 2020 spring semester. The topic is signal processing in the realm of lossy data compression.
The notebook goes through some maths and applies it to 1D signals first, and to images at the end. In the plot below you can see the results of image compression using the FFT (center) and the DCT (right).
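
Here's a minimal sketch of the compression idea for both transforms, assuming SciPy's `dctn`/`idctn` and NumPy's FFT (the `keep` fraction and function names are assumptions):

```python
import numpy as np
from scipy.fft import dctn, idctn

def dct_compress(img, keep=0.05):
    """Keep only the largest `keep` fraction of 2-D DCT coefficients."""
    coeffs = dctn(img, norm="ortho")
    thresh = np.quantile(np.abs(coeffs), 1 - keep)
    coeffs[np.abs(coeffs) < thresh] = 0.0   # zero out the small coefficients
    return idctn(coeffs, norm="ortho")

def fft_compress(img, keep=0.05):
    """Same idea with the 2-D FFT; the result is real up to round-off."""
    coeffs = np.fft.fft2(img)
    thresh = np.quantile(np.abs(coeffs), 1 - keep)
    coeffs[np.abs(coeffs) < thresh] = 0.0
    return np.real(np.fft.ifft2(coeffs))
```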
