GitHub - apozas/private-tn: Code repository for arXiv:2202.12319

Code to accompany Physics solutions to machine learning privacy leaks

Alejandro Pozas-Kerstjens, Senaida Hernández-Santana, José Ramón Pareja Monturiol, Marco Castrillón López, Giannicola Scarpa, Carlos E. González-Guillén, and David Pérez-García

This repository contains the code used for the article "Physics solutions to machine learning privacy leaks. Alejandro Pozas-Kerstjens, Senaida Hernández-Santana, José Ramón Pareja Monturiol, Marco Castrillón López, Giannicola Scarpa, Carlos E. González-Guillén, and David Pérez-García. arXiv:2202.12319." It provides the code for cleaning the global.health database, training neural network and matrix product state models on the resulting dataset, and attacking the models via shadow training.

All code is written in Python.

Libraries required:

matplotlib and seaborn for plots
numpy for array operations
pandas for database operations
scikit-learn for machine learning utils
tensorflow and pytorch for training the matrix product states and the neural networks, respectively
tensornetwork for defining the matrix product states
tqdm for progress bars
copy, glob, os, pickle, random, sys, time and typing (all part of the Python standard library)
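
Before running the scripts, it can help to check that the third-party libraries listed above are installed. The following is a minimal sketch, not part of the repository: the function name `missing_modules` and the list `REQUIRED_MODULES` are made up for illustration. It relies on scikit-learn being imported as `sklearn` and PyTorch as `torch`.

```python
from importlib.util import find_spec

# Import names of the third-party libraries listed above.
# scikit-learn is imported as `sklearn` and PyTorch as `torch`.
REQUIRED_MODULES = [
    "matplotlib", "seaborn", "numpy", "pandas",
    "sklearn", "tensorflow", "torch", "tensornetwork", "tqdm",
]


def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing libraries:", ", ".join(missing))
    else:
        print("All required libraries are available.")
```

The check only looks the modules up without importing them, so it runs quickly even when tensorflow and pytorch are installed.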

Files:

General files
- create_accuracy_figure: Create Figure 2c in the paper, showing the accuracies of the models in predicting the outcome of COVID-19 patients given demographics and symptoms.
- create_attack_figure: Create Figure 2d in the paper, showing the accuracies of attacks inferring the parity of the registration day of the models' training data.
- create_vulnerability_figure: Create Figure 1 in the paper, showing how neural networks store data from the training set that is irrelevant for the target task.
- database_processing: Clean the global.health database to generate the dataset used in the experiments.
Neural networks
- attack_nn: Attacks inferring the parity of the registration day of the neural networks' training data.
- create_nn_dataset_from_models: Generate the dataset with all the neural networks' model parameters.
- generate_nn_models: Train neural network models on predicting COVID-19 outcome given demographics and symptoms.
- utils_nn: Helper functions for data processing and model training.
Matrix product states
- attack_mps: Attacks, based on shadow training, inferring the parity of the registration day of the matrix product states' training data.
- batchtensornetwork: Functions for evaluating matrix product states on input data.
- classifier: Definition of the classifier matrix product state model.
- create_mps_dataset_from_models: Generate the dataset with all the matrix product states' model parameters, either in standard or in canonical form.
- generate_mps_models: Train matrix product state models on predicting COVID-19 outcome given demographics and symptoms.
- training: Functions for training matrix product state models.
- utils_mps: Helper functions for data processing and model training.
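
The attacks described above target the parity of the registration day. As a rough illustration of what such a label could look like, the sketch below assumes that "parity" means whether the day of the month is even or odd; this reading, the function name `day_parity`, and the example dates are assumptions made for illustration and are not taken from the repository.

```python
from datetime import date


def day_parity(d: date) -> int:
    """Return 0 if the day of the month is even and 1 if it is odd."""
    return d.day % 2


# Hypothetical registration dates, made up for illustration.
registration_dates = [date(2021, 3, 14), date(2021, 3, 15), date(2021, 4, 1)]
parities = [day_parity(d) for d in registration_dates]
print(parities)  # [0, 1, 1]
```

A binary label of this kind is irrelevant to predicting the patient outcome, which is why it serves as a test of how much unrelated information about the training data a model retains.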

If you would like to cite this work, please use the following format:

A. Pozas-Kerstjens, S. Hernández-Santana, J. R. Pareja Monturiol, M. Castrillón López, G. Scarpa, C. E. González-Guillén, and D. Pérez-García, Physics solutions to machine learning privacy leaks, arXiv:2202.12319

@misc{pozaskerstjens2022privatetn,
author = {Pozas-Kerstjens, Alejandro and Hern\'andez-Santana, Senaida and Pareja Monturiol, Jos\'e Ram\'on and Castrill\'on L\'opez, Marco and Scarpa, Giannicola and Gonz\'alez-Guill\'en, Carlos E. and P\'erez-Garc\'ia, David},
title = {Physics solutions to machine learning privacy leaks},
eprint = {2202.12319},
archivePrefix={arXiv}
}
