[Share]How to run demo and training in a docker container #31

liupgd opened this issue Jun 28, 2021 · 0 comments

After a miserable experience getting training to run on this repo, I think I need to write it down and share it with others. My situation: I failed to run the model directly on my own Ubuntu server because I didn't have CUDA 10.0 installed. But running it in Docker is not so easy either.

Run the demo

  • Build the Docker image and start a container
docker build -t new-comod-gan .
docker run -itd -v /your_work_dir:/work  -v /your_data_dir:/data --name comod -p 7200-7220:7200-7220 --gpus all new-comod-gan /bin/bash
  • Install extra packages in your container
pip install opencv-python 
pip install tqdm
pip install scikit-learn
  • Prepare your test image dataset
python dataset_tools/create_from_images.py --val-image-dir ./your_test_images_dir --tfrecord-dir ./tfrecords
  • Add CUDA to the PATH in your container; you'd better add this line to your .bashrc too (a way to persist it is included in the sketch after this list)
export PATH="/usr/local/nvidia/bin:/usr/local/cuda/bin:$PATH"
  • Now you can run the demo with the pretrained model
python run_demo.py -c ./pretrained/co-mod-gan-places2-050000.pkl -d ./tfrecords
  • But... you didn't get a GUI from your remote container? VcXsrv may help if you are on Windows locally. My solution (a manual DISPLAY alternative is sketched right after this list):
    • Install VcXsrv on Windows.
    • Use VS Code to access the container, and install the Remote X11 extension in VS Code.
    • In VS Code Settings -> Remote, search for the Host option and set the remote host IP to your remote server's IP (not the container's IP).
    • Start VcXsrv on your local Windows machine first, then run the remote demo. The GUI will be displayed if there are no errors.
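
If you'd rather not rely on the Remote X11 extension, a manual alternative is to point DISPLAY at the VcXsrv server yourself. This is a minimal sketch, not tested on every setup: it assumes VcXsrv is started with access control disabled and listening on its default display :0, and your_windows_ip is a placeholder for your local machine's IP. The nvidia-smi call is just a sanity check that --gpus all took effect.

# Attach to the running container
docker exec -it comod /bin/bash

# Sanity check: the GPUs passed with --gpus all should be listed here
nvidia-smi

# Persist the CUDA path from the step above so new shells pick it up
echo 'export PATH="/usr/local/nvidia/bin:/usr/local/cuda/bin:$PATH"' >> ~/.bashrc

# Point X11 at the VcXsrv server on your local Windows machine
# (placeholder IP; :0.0 assumes VcXsrv's default display number)
export DISPLAY=your_windows_ip:0.0

# Re-run the demo; the window should now show up via VcXsrv
python run_demo.py -c ./pretrained/co-mod-gan-places2-050000.pkl -d ./tfrecords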

Run training

  • Prepare your own training dataset
    Here I prepared some images in ./imgs/png_samples/ as a quick training test.
python dataset_tools/create_from_images.py --train-image-dir ./imgs/png_samples/ --val-image-dir ./imgs/png_samples/ --tfrecord-dir ./train_dataset --resolution 512 --num-channels 3

Note:
1. Only 3 channels can be used. If you're using PNG files, do not set --num-channels to 4, or you'll get an error during training (one way to flatten 4-channel PNGs is sketched below).
2. --val-image-dir must be specified, or you'll get an error during training.
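
In case your PNGs carry an alpha channel, one way to flatten them to 3-channel RGB before building the tfrecords is ImageMagick. This is just a sketch under a couple of assumptions: ImageMagick is not part of the Dockerfile, so you'd have to install it in the container first (assuming a Debian/Ubuntu base image), and mogrify edits the files in place, so keep a backup.

# Install ImageMagick inside the container
apt-get update && apt-get install -y imagemagick

# Flatten the alpha channel against a white background, in place
mogrify -background white -alpha remove -alpha off ./imgs/png_samples/*.png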

  • Run your training
python run_training.py --data-dir=./  --dataset=train_dataset --metrics=ids10k --mirror-augment True --num-gpus=4
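
The command above assumes 4 GPUs. If your container sees a different number (for example, you passed something narrower than --gpus all to docker run), matching --num-gpus to the visible count should be all that's needed. A quick check followed by a hypothetical single-GPU run:

# List the GPUs visible inside the container
nvidia-smi -L

# Same training command, adjusted for one visible GPU
python run_training.py --data-dir=./ --dataset=train_dataset --metrics=ids10k --mirror-augment True --num-gpus=1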

For researchers in China, you may need a VPN or proxy, because the training process downloads an Inception model file. You can:

export https_proxy="https://your_vpn_ip:port"
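
Depending on how the download is made, the http_proxy variable may be needed as well. A sketch with placeholder values; keep whatever scheme (http:// or https://) your proxy actually expects:

# Route both plain-HTTP and HTTPS traffic through the proxy before training
export http_proxy="http://your_vpn_ip:port"
export https_proxy="http://your_vpn_ip:port"
python run_training.py --data-dir=./ --dataset=train_dataset --metrics=ids10k --mirror-augment True --num-gpus=4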