kFactorVAE: Self-Supervised Regularization for Better A.I. Disentanglement

This repo. contains all the work for my honors thesis project at William & Mary. It has subdirectories for the different VAE-based models I investigated adding my regularization term to: notably kFactorVAE in the kFactorVAE folder, Beta-VAE in the Disentangling folder, and Beta-TCVAE in the beta-tcvae folder.

It borrows from Professor Shao's ControlVAE GitHub repository. Note, though, that I do not use the ControlVAE model itself in kFactorVAE, although that remains an open avenue!

Dependencies

I used a conda environment. The YML file for its CUDA dependencies is requirements_CUDA_11.6.yml, and the one for its non-CUDA dependencies is requirements_no_CUDA.yml.

Reproducing Thesis Paper Results

Hardware Setup

If you are affiliated with William & Mary, I recommend using the lab machines in McGlothlin-Street (McG) Hall. To remotely log onto these machines, if you do not possess a CS account already and/or you are new to using the machines, follow the guidance provided by Professor Timothy A. Davis here.

Otherwise, I recommend using a computer that has an NVIDIA GPU and is compatible with NVIDIA CUDA. The NVIDIA GPU models I used were:

NVIDIA RTX A4000 (fastest at around 70 training iterations/second)
NVIDIA A40 (middle at around 50-70 training iterations/second)
NVIDIA RTX A5000 (slowest at around 30-40 training iterations/second)

Those numbers are not typos. Surprisingly, the A4000 was faster than the A5000 in my runs, even though the specifications suggest otherwise.

And of course, all these models are available on the lab machines. GPU specifications for each machine may be found here. As years pass by, these computers will probably have significant upgrades in their GPUs. However, you can probably expect these GPUs to stick around if you are reading this in the year 2023 or a few years afterwards. But maybe I will be proven wrong.

Software Instructions

Visdom Server Initialization

You only need this step if you want to view the training metric graphs, reconstructions, and latent traversals in a convenient web interface while reproducing experimental results. All of these results are also stored in directories.

On a (Linux) shell, run:

❯ chmod +x run_visdom_server.sh

❯ ./run_visdom_server.sh [port number] [optional relative path to Visdom log file to replay]

You may replay as many log files as you want. Keep in mind that you will get a .out file every time you run run_visdom_server.sh.

kFactorVAE Scripts Setup

You may head over to the kFactorVAE README.

Technical Support for McG Hall Computers

If you have any issues/requests, please reach out to Joseph Hause in the W&M Computer Science Slack.

Useful Tidbits Learned

This is just for personal memory's sake or for your own benefit too, in case you were curious as to the minor techniques I have acquired.

(1) How to make a forked repository private

This was initially a challenge since by default, GitHub does not allow you to make a forked repo. private.
Source

# 1. Make a bare clone of the fork (full history and all refs, no working tree)
git clone --bare [URL or path of the repo. fork]

# 2. Create a new private repo. on the GitHub website

# 3. Push repo. contents + all branches
cd [local bare clone directory, typically ending in .git]
git push --mirror [copied link to the private repo. you just created]

# 4. Delete your local bare clone
cd ..
rm -rf [local bare clone directory]

# 5. Now just clone the private repo., and voila!

I needed to do this because I wanted to make a private clone of Professor Shao's ControlVAE and work with that private clone and finish the thesis before making it public.

(2) How the transpose operator works for higher dimensions

It's still switching "rows" and "columns," i.e. $(A^T)_{ij} = a_{ji}$.
But each element $a_{ij}$ of the "matrix" you are transposing might be something other than a single number. An element could now be an entire matrix, an actual row/column, or a tensor.

I needed to learn this to understand how Prof. Shao implemented the latent variable traversal (viz_traverse) and displayed the results on the Visdom server.
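To make the transpose idea concrete, here is a minimal pure-Python sketch. The helper name swap_first_two_axes and the toy data are my own illustration, not code from this repo. or from viz_traverse. Tensor libraries offer the same operation directly (for example, torch.transpose(x, 0, 1) swaps two dimensions), but I am not claiming that this is the exact call the traversal code uses.

```python
def swap_first_two_axes(a):
    """Return b with b[j][i] == a[i][j]; each a[i][j] may itself be a list."""
    rows, cols = len(a), len(a[0])
    return [[a[i][j] for i in range(rows)] for j in range(cols)]


# A 2 x 3 "matrix" whose elements are length-2 vectors, i.e. shape (2, 3, 2).
# Each element records its own original (row, column) position.
grid = [
    [[0, 0], [0, 1], [0, 2]],
    [[1, 0], [1, 1], [1, 2]],
]

flipped = swap_first_two_axes(grid)  # shape is now (3, 2, 2)
print(flipped[2][0])  # the vector that used to sit at grid[0][2] -> [0, 2]
```

The inner vectors are moved around as whole units and never rearranged internally, which is the sense in which an "element" can be something bigger than a single number.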

(3) How to Search Within a Highlighted Text Section within Vim

Shift + V and select all that you want
You can yank or just press esc twice
/\%V[your_search_string]

(4) Shell scripting

conditionals to check directories
operations on directories
checking if a port is being used by a specific process
for loops
variables
realizing that ChatGPT is solid in shell scripting :)
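
The port check in the list above also has a rough Python analogue, which can be handy before choosing the [port number] for the Visdom server. This is only a sketch using the standard library: the function name port_in_use is hypothetical, and unlike a shell tool it only tells you whether something is listening on the port, not which process owns it.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # 8097 is Visdom's default port; substitute the port you plan to use.
    print(port_in_use(8097))
```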

(5) Literature Review

When I got bogged down in the technical details, making a slide presentation in which each paper gets only one slide (abstract, conclusions, advantages, and disadvantages) helped me stay engaged with the literature. Reviewing the literature and working out a new idea in light of it was by far the most challenging part.

(6) Logging data for visualizations

Visdom, CSV filewriting
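
As an illustration of the CSV side, here is a minimal sketch of appending one row of training metrics per call with Python's csv module. The function name append_metrics and the metric names (iteration, recon_loss, kl_div) are hypothetical placeholders, not the ones used in this repo.

```python
import csv
import tempfile
from pathlib import Path


def append_metrics(csv_path, row, fieldnames):
    """Append one row of metrics, writing the header only for a new file."""
    path = Path(csv_path)
    is_new = not path.exists() or path.stat().st_size == 0
    # newline="" lets the csv module control line endings itself.
    with path.open("a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        if is_new:
            writer.writeheader()
        writer.writerow(row)


if __name__ == "__main__":
    fields = ["iteration", "recon_loss", "kl_div"]  # hypothetical metric names
    with tempfile.TemporaryDirectory() as tmp:
        log_file = Path(tmp) / "train_log.csv"
        append_metrics(log_file, {"iteration": 1, "recon_loss": 0.5, "kl_div": 0.1}, fields)
        append_metrics(log_file, {"iteration": 2, "recon_loss": 0.4, "kl_div": 0.2}, fields)
        print(log_file.read_text())
```

Opening the file in append mode on every call means a crashed or interrupted run still leaves a usable log behind, at the cost of reopening the file each time.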

(7) LaTeX

Side-by-side plots, images, adjacent text to images to label factors of variation
\input command to break down a LaTeX file into subfiles
inserting PDFs


joegenius98/Undergrad_Honors_Thesis