camlaedtke/imagen_pytorch

imagen_pytorch

Training pipeline for Imagen, Google's text-to-image neural network, on the Conceptual 12M dataset. It is built on Phil Wang's excellent repo.

Training runs are logged in Wandb: https://wandb.ai/camlaedtke/imagen?workspace=user-camlaedtke

Conceptual 12M

The dataset is downloaded with img2dataset.

On Windows ...

curl.exe --output cc12m.tsv --url https://storage.googleapis.com/conceptual_12m/cc12m.tsv
sed -i "1s/^/url\tcaption\n/" cc12m.tsv
img2dataset --url_list cc12m.tsv --input_format "tsv" \
    --url_col "url" --caption_col "caption" --output_format webdataset \
    --output_folder cc12m --processes_count 8 --thread_count 32 --image_size 256 \
    --enable_wandb True
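
The sed step above adds a "url<TAB>caption" header row so that img2dataset can find the columns named by --url_col and --caption_col. On a Windows machine without sed, the same step could be done in Python. The following is a minimal sketch, not part of this repo: the function name prepend_header is hypothetical, and it assumes the TSV is UTF-8 encoded. It streams the file rather than loading it into memory, since the full TSV is large.

```python
import shutil
from pathlib import Path


def prepend_header(tsv_path, header="url\tcaption"):
    """Prepend a header row to a TSV file unless it is already there.

    Returns True if the header was added, False if it was already present.
    """
    path = Path(tsv_path)
    tmp = path.with_suffix(path.suffix + ".tmp")

    # newline="" keeps the file's original line endings untouched.
    with path.open("r", encoding="utf-8", newline="") as src:
        first = src.readline()
        if first.rstrip("\r\n") == header:
            return False  # header already present, nothing to do
        with tmp.open("w", encoding="utf-8", newline="") as dst:
            dst.write(header + "\n")
            dst.write(first)
            # Stream the rest of the file in chunks.
            shutil.copyfileobj(src, dst)

    # Swap the temporary file in only after both files are closed.
    tmp.replace(path)
    return True
```

Calling prepend_header("cc12m.tsv") once after the curl download would replace the sed line. Calling it a second time leaves the file unchanged.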

Some running notes

  • Batch sizes between 64 and 512 seem to work well.
  • Setting max_grad_norm = 1.25 makes training more stable, but appears to slow convergence considerably and hurt performance.
  • The best results so far were obtained with a learning rate of around 1.5e-5 combined with a batch size of 256.
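
To make the max_grad_norm note concrete, here is a minimal pure-Python sketch of clipping by global gradient norm. It assumes that max_grad_norm refers to the usual scheme (the one PyTorch's torch.nn.utils.clip_grad_norm_ implements, up to a small epsilon), and the function name clip_by_global_norm is hypothetical. It is an illustration of the idea, not the trainer's actual code.

```python
import math


def clip_by_global_norm(grads, max_norm):
    """Rescale a flat list of gradient values so their L2 norm is <= max_norm.

    Returns (clipped_grads, original_norm). Gradients already within the
    limit are returned unchanged.
    """
    total = math.sqrt(sum(g * g for g in grads))
    if total == 0.0 or total <= max_norm:
        return list(grads), total
    scale = max_norm / total  # always < 1 here, so every step is shrunk
    return [g * scale for g in grads], total


# A gradient of norm 5 is scaled down to norm 1.25; a small one is untouched.
clipped, norm = clip_by_global_norm([3.0, 4.0], max_norm=1.25)
print(norm, clipped)
```

Whenever the raw gradient norm is above the threshold, the update is shrunk by the factor max_norm / norm. This is one plausible reason a tight threshold stabilizes training while also slowing convergence.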

About

Training and inference scripts for Imagen, Google's Text-to-Image Neural Network, on the CocoCaptions dataset