Markov-Lipschitz Deep Learning (MLDL)

Comparison of training processes of three autoencoders on the Spheres dataset.

This is a PyTorch implementation of the MLDL paper :

@article{Li-MLDL-2020,
  title={Markov-Lipschitz Deep Learning},
  author={Stan Z Li and Zelin Zang and Lirong Wu},
  journal={arXiv preprint arXiv:2006.08256},
  year={2020}
}

The main features of MLDL for manifold learning and generation in comparison to other popular methods are summarized below:

	MLDL (ours)	AE/TopoAE	MLLE	ISOMAP	t-SNE
Manifold Learning without decoder	Yes	No	Yes	Yes	Yes
Learned NLDR model applicable to test data	Yes	Yes	No	No	No
Able to generate data of learned manifold	Yes	No	No	No	No
Compatible with other DL frameworks	Yes	No	No	No	No
Scalable to large datasets	Yes	Yes	No	No	No

The code includes the following modules:

Datasets (Swiss Roll, S-Curve, MNIST, Spheres)
Training for ML-Enc and ML-AE (ML-Enc + ML-Dec)
Test for manifold learning (ML-Enc)
Test for manifold generation (ML-Dec)
Visualization
Evaluation metrics
The compared methods include: AutoEncoder (AE), Topological AutoEncoder (TopoAE), Modified Locally Linear Embedding (MLLE), ISOMAP, t-SNE. (Note: We modified the original TopoAE source code to make it able to run the Swiss roll dataset by adding a swiss roll dataset generation function and modifying the network structure for fair comparison.)

Requirements

pytorch == 1.3.1
scipy == 1.4.1
numpy == 1.18.5
scikit-learn == 0.21.3
csv == 1.0
matplotlib == 3.1.1
imageio == 2.6.0

Description

main.py
- SetParam() -- Parameters for training
- Train() -- Train a new model (encoder and/or decoder)
- Train_MultiRun() -- Run the training for multiple times, each with a different seed
- Generation() -- Testing generation of new data of the learned manifold
- Generalization() -- Testing dimension reduction from unseen data of the learned manifold
- InlinePlot() -- Inline plot intermediate results during training
dataset.py
- LoadData() -- Load data of selected dataset
loss.py
- MLDL_Loss() -- Calculate six losses: ℒ_Enc, ℒ_Dec, ℒ_AE, ℒ_lis, ℒ_push, ℒ_ang
model.py
- Encoder() -- For latent feature extraction
- Decoder() -- For generating new data on the learned manifold
eval.py -- Calculate performance metrics from results, each being the average of 10 seeds
utils.py
- GIFPloter() -- Auxiliary tool for online plot
- CompPerformMetrics() -- Auxiliary tool for evaluating metric
- Sampling() -- Sampling in the latent space for generating new data on the learned manifold

Running the code

Clone this repository

git clone https://github.com/westlake-cairi/Markov-Lipschitz-Deep-Learning

Install the required dependency packages
To get the results for 10 seeds, run

python main.py -MultiRun

To get the metrics for ML-Enc and ML-AE

python eval.py -M ML-Enc
python eval.py -M ML-AE

The evaluation metrics are available in ./pic/PerformMetrics.csv

To choose a dataset among SwissRoll, Scurve, MNIST, Spheres5500 and Spheres10000 for tow modes (ML-Enc and ML-AE)

python main.py -D "dataset name" -M "mode"

To test the generalization to unseen data

python main.py -M Test

The results are available in ./pic/file_name/Test.png

To test the manifold generation

python main.py -M Generation

The results are available in ./pic/file_name/Generation.png

Results

1. ML-Enc: Dimension reduction results -- embeddings in latent spaces

Swiss Roll and S-Curve

A symbol √ or X represents a success or failure in unfolding the manifold. The methods in the upper-row ML-Enc succeed and by calculation, the ML-Enc best maintains the true aspect ratio.

MNIST (10 digits)

Spheres 5500 (data designed by the TopoAE project )

Spheres 10000 (data designed by the TopoAE project )

2. ML-Enc: Performance metrics for dimension reduction on Swiss Roll (800 points) data

This table demonstrates that the ML-Enc outperforms all the other 6 methods in all the evaluation metrics, particularly significant in terms of the isometry (LGD, RRE, Cont and Trust) and Lipschitz (K-Min and K-Max) related metrics.

	#Succ	L-KL	RRE	Trust	Cont	LGD	K-Min	K-Max	MPE
ML-Enc	10	0.0184	0.000414	0.9999	0.9985	0.00385	1.00	2.14	0.0718
TopoAE	0	0.0349	0.022174	0.9661	0.9884	0.13294	1.27	189.95	0.1307
t-SNE	0	0.0450	0.006108	0.9987	0.9843	3.40665	11.1	1097.62	0.1071
MLLE	6	0.1251	0.030702	0.9455	0.9844	0.04534	7.37	238.74	0.1709
HLLE	6	0.1297	0.034619	0.9388	0.9859	0.04542	7.44	218.38	0.0978
LTSA	6	0.1296	0.034933	0.9385	0.9859	0.04542	7.44	215.93	0.0964
ISOMAP	6	0.0234	0.009650	0.9827	0.9950	0.02376	1.11	34.35	0.0429
LLE	0	0.1775	0.014249	0.9753	0.9895	0.04671	6.17	451.58	0.1400

3. ML-Enc: Ability to generalize on unseen data of the learned manifold

The learned ML-Enc network can unfold unseen data of the learned manifold, demonstrated using the Swiss-roll with a hole, whereas the compared methods cannot.

4. ML-AE: For dimension reduction and manifold data generation

In the learning phase, the ML-AE taking (a) the training data as input, output (b) embedding in the learned latent space, and then reconstruct back (c). In the generation phase, the ML-Dec takes (d) random input samples in the latent space, and maps the samples to the manifold (e).

5. ML-AE: Evolution of training evolution

The ML-AE training gradually unfolds the manifold from input layer to the latent layer and reconstructs the latent embedding back to data in the input space.

Feedback

If you have any issue about the implementation, please feel free to contact us by email:

Zelin Zang: zangzelin@westlake.edu.cn
Lirong Wu: wulirong@westlake.edu.cn

Name		Name	Last commit message	Last commit date
Latest commit History 216 Commits
figs		figs
param		param
LICENSE		LICENSE
README.md		README.md
dataset.py		dataset.py
eval.py		eval.py
loss.py		loss.py
main.py		main.py
model.py		model.py
samples_generator_new.py		samples_generator_new.py
utils.py		utils.py

License

Westlake-AI/Markov-Lipschitz-Deep-Learning

Folders and files

Latest commit

History

Repository files navigation

Markov-Lipschitz Deep Learning (MLDL)

Requirements

Description

Running the code

Results

1. ML-Enc: Dimension reduction results -- embeddings in latent spaces

2. ML-Enc: Performance metrics for dimension reduction on Swiss Roll (800 points) data

3. ML-Enc: Ability to generalize on unseen data of the learned manifold

4. ML-AE: For dimension reduction and manifold data generation

5. ML-AE: Evolution of training evolution

Feedback

About

Resources

License

Stars

Watchers

Forks

Languages