
Minibatch Wasserstein distance

Python3 implementation of the paper Learning with minibatch Wasserstein: asymptotic and gradient properties (AISTATS 2020)

Optimal transport distances are powerful tools to compare probability distributions and have found many applications in machine learning. Yet their algorithmic complexity prevents their direct use on large-scale datasets. To overcome this challenge, practitioners compute these distances on minibatches, i.e., they average the outcome of several smaller optimal transport problems. In this paper we propose an analysis of this practice, whose effects are not well understood so far. We notably argue that it is equivalent to an implicit regularization of the original problem, with appealing properties such as unbiased estimators, gradients, and a concentration bound around the expectation, but also with defects such as the loss of the distance property. Along with this theoretical analysis, we also conduct empirical experiments on gradient flows, GANs, and color transfer that highlight the practical interest of this strategy.
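In code, the estimator amounts to drawing pairs of minibatches, solving an exact OT problem on each pair with POT [1], and averaging the costs. The sketch below is a minimal illustration under assumed choices: the batch size m, the number of draws k, and the helper name minibatch_wasserstein are ours, not values or APIs fixed by the paper.

```python
# Minimal sketch of the minibatch Wasserstein estimator using POT [1].
# m (batch size), k (number of minibatch draws) and the function name are
# illustrative assumptions, not values prescribed by the paper or this repo.
import numpy as np
import ot  # POT: Python Optimal Transport


def minibatch_wasserstein(x, y, m=64, k=100, seed=None):
    """Average the exact OT cost over k pairs of m-point minibatches."""
    rng = np.random.default_rng(seed)
    uniform = np.full(m, 1.0 / m)  # uniform weights on each minibatch
    total = 0.0
    for _ in range(k):
        xb = x[rng.choice(len(x), size=m, replace=False)]
        yb = y[rng.choice(len(y), size=m, replace=False)]
        cost = ot.dist(xb, yb)  # squared Euclidean ground cost by default
        total += ot.emd2(uniform, uniform, cost)  # exact OT on the minibatch
    return total / k


# Toy usage: two shifted Gaussian clouds.
x = np.random.randn(5000, 2)
y = np.random.randn(5000, 2) + 2.0
print(minibatch_wasserstein(x, y, m=64, k=50, seed=0))
```

Because each draw only solves an m x m problem, the estimator trades the cubic cost of exact OT on the full dataset for k much smaller solves, which is the large-scale appeal described above.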

We also wrote a Medium blog post; feel free to reach out if you have any questions.

If you use this toolbox or minibatch Wasserstein in your research and find them useful, please cite the paper using the following BibTeX reference:

@InProceedings{pmlr-v108-fatras20a,
  title     = {Learning with minibatch Wasserstein : asymptotic and gradient properties},
  author    = {Fatras, Kilian and Zine, Younes and Flamary, R\'emi and Gribonval, Remi and Courty, Nicolas},
  booktitle = {Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics},
  pages     = {2131--2141},
  year      = {2020},
  editor    = {Silvia Chiappa and Roberto Calandra},
  volume    = {108},
  series    = {Proceedings of Machine Learning Research},
  month     = {26--28 Aug},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v108/fatras20a/fatras20a.pdf},
  url       = {http://proceedings.mlr.press/v108/fatras20a.html},
  abstract  = {Optimal transport distances are powerful tools to compare probability distributions and have found many applications in machine learning. Yet their algorithmic complexity prevents their direct use on large scale datasets. To overcome this challenge, practitioners compute these distances on minibatches i.e., they average the outcome of several smaller optimal transport problems. We propose in this paper an analysis of this practice, which effects are not well understood so far. We notably argue that it is equivalent to an implicit regularization of the original problem, with appealing properties such as unbiased estimators, gradients and a concentration bound around the expectation, but also with defects such as loss of distance property. Along with this theoretical analysis, we also conduct empirical experiments on gradient flows, GANs or color transfer that highlight the practical interest of this strategy.}
}

Prerequisites

  • Python3
  • POT (Python Optimal Transport) [1]

What is included?

  • Minibatch Wasserstein color transfer (large scale)
  • Deviation of the minibatch OT matrix and its marginals (a sketch of this computation follows this list)
  • Timing experiment
  • Slides
  • Poster
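For the deviation experiment listed above, the following hedged sketch accumulates exact minibatch OT plans into a full-size matrix and measures how far the row marginal of the averaged plan deviates from the uniform marginal 1/n. All sizes and variable names are illustrative assumptions, not the repository's exact script.

```python
# Sketch: average minibatch OT plans into a full n x n matrix and check how
# far its row marginal deviates from the uniform marginal 1/n. Sizes (n, m, k)
# are illustrative assumptions.
import numpy as np
import ot  # POT: Python Optimal Transport

n, m, k = 200, 20, 500
rng = np.random.default_rng(0)
x = rng.normal(size=(n, 2))
y = rng.normal(size=(n, 2)) + 1.0

plan = np.zeros((n, n))
w = np.full(m, 1.0 / m)
for _ in range(k):
    idx = rng.choice(n, size=m, replace=False)
    idy = rng.choice(n, size=m, replace=False)
    G = ot.emd(w, w, ot.dist(x[idx], y[idy]))  # exact minibatch OT plan
    plan[np.ix_(idx, idy)] += G                # lift to the full-size matrix
plan /= k

# In expectation each row marginal equals 1/n; at finite k it deviates.
row_marginal = plan.sum(axis=1)
print(f"max marginal deviation: {np.abs(row_marginal - 1.0 / n).max():.2e}")
```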

Authors

  • Kilian Fatras
  • Younes Zine
  • Rémi Flamary
  • Rémi Gribonval
  • Nicolas Courty

References

[1] Rémi Flamary and Nicolas Courty, POT: Python Optimal Transport library.
