
Study of Private Word Embeddings

This repository was made as part of an assignment for the Privacy Preserving Machine Learning class of the University of Lille's MSc in Data Science, taught by Aurelien Bellet.

Authors:

  • Samy Zouhri
  • Rony Abecidan

Here, we study the paper "Differentially Private Representation for NLP: Formal Guarantee and An Empirical Study on Privacy and Fairness", written by Lingjuan Lyu, Xuanli He and Yitong Li and published at the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020).

In this article, the authors propose, for the first time, a method that formally guarantees the privacy of a word embedding while maintaining satisfactory utility and, in most cases, removing discrimination.
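
To give a rough intuition of the kind of mechanism at play (the exact formulation and its guarantee are detailed in the paper and in our report), here is a minimal sketch assuming a simple "clip then add Laplace noise" scheme on the representation. The function name and the noise calibration below are illustrative assumptions, not the authors' exact method.

import numpy as np

def privatize(representation, clip_norm=1.0, epsilon=1.0, rng=None):
    # Illustrative sketch only: clip the L2 norm of a representation to bound
    # its sensitivity, then add Laplace noise scaled by the privacy budget epsilon.
    if rng is None:
        rng = np.random.default_rng()
    norm = np.linalg.norm(representation)
    if norm > clip_norm:
        representation = representation * (clip_norm / norm)
    scale = 2.0 * clip_norm / epsilon  # illustrative calibration, see the report
    noise = rng.laplace(loc=0.0, scale=scale, size=representation.shape)
    return representation + noise

The smaller epsilon is, the larger the noise and the stronger the privacy guarantee, which is exactly the privacy/utility trade-off studied in the notebooks.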


This repo is made of three parts:

  • The studied article, in .pdf format

  • A short report discussing the strategy proposed in the paper for building a private word representation while maintaining utility in NLP models. It also contains additional information that helps to better understand the logic of the paper.

  • Two illustrative notebooks in which we propose two experiments (minimal sketches of both are given right after this list):

    • The first one consists in studying to what extent a word embedding can leak sensitive information. For this experiment we used the Word2Vec embedding from the gensim library.
    • The second one consists in implementing the strategy proposed by the authors while studying its impact on utility for different classification tasks. This time we considered a "custom" embedding built with PyTorch.
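
For the first experiment, here is a minimal sketch of the kind of probing involved, assuming gensim's pre-trained "word2vec-google-news-300" vectors (the notebook may rely on a different model or corpus). Nearest-neighbour queries alone can already surface sensitive associations encoded in the embedding space.

import gensim.downloader as api

# Load pre-trained Word2Vec vectors (large download; the model name is an assumption here).
model = api.load("word2vec-google-news-300")

# Nearest neighbours of a word can reveal sensitive associations.
print(model.most_similar("nurse", topn=5))

# Analogy-style queries can expose demographic biases captured by the embedding.
print(model.most_similar(positive=["doctor", "woman"], negative=["man"], topn=5))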
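
For the second experiment, here is a minimal sketch assuming the privatization is applied to the pooled output of a custom embedding inside a small text classifier (hypothetical class name and hyper-parameters; the notebook's architecture and noise calibration may differ).

import torch
import torch.nn as nn

class NoisyTextClassifier(nn.Module):
    # Toy classifier: custom embedding -> clipped, noised representation -> linear head.
    def __init__(self, vocab_size, embed_dim, num_classes, clip_norm=1.0, epsilon=1.0):
        super().__init__()
        self.embedding = nn.EmbeddingBag(vocab_size, embed_dim)
        self.fc = nn.Linear(embed_dim, num_classes)
        self.clip_norm = clip_norm
        self.epsilon = epsilon

    def forward(self, text, offsets):
        x = self.embedding(text, offsets)
        # Clip the L2 norm of each pooled representation to bound its sensitivity.
        norms = x.norm(dim=1, keepdim=True).clamp(min=1e-12)
        x = x * torch.clamp(self.clip_norm / norms, max=1.0)
        # Perturb the clipped representation with Laplace noise (illustrative calibration).
        scale = 2.0 * self.clip_norm / self.epsilon
        x = x + torch.distributions.Laplace(0.0, scale).sample(x.shape)
        return self.fc(x)

Comparing the accuracy of such a noised model to a non-noised baseline on a classification task gives a sense of the utility cost studied in the second notebook.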

Installation

If you want to reproduce our experiments, you'll have to install the requirements listed in requirements.txt.

pip install -r requirements.txt

Part of the code for the second experiment is inspired by the PyTorch tutorial on text classification.