RuntimeError: The size of tensor a (1024) must match the size of tensor b (512) at non-singleton dimension 1 #629

Open
creatorcao opened this issue Nov 29, 2023 · 2 comments

Comments

@creatorcao

creatorcao commented Nov 29, 2023

python train.py --outdir=./test --data=./images256x256.zip --cfg=stylegan3-r --gpus=1 --batch=32 --gamma=0.5 \
--freezed=13 --workers=2 --mirror=1 --kimg=2000 --tick=1 --snap=10 --metrics=none --cbase=16384 --cond=1 \
--resume=./weights/stylegan3-r-ffhqu-256x256.pkl

I received the error below when trying to train on labeled images starting from pretrained weights. Could somebody help me fix this?

File "stylegan3/torch_utils/misc.py", line 162, in copy_params_and_buffers
tensor.copy_(src_tensors[name].detach()).requires_grad_(tensor.requires_grad)
RuntimeError: The size of tensor a (1024) must match the size of tensor b (512) at non-singleton dimension 1
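
One way to narrow this down, offered as a sketch rather than a confirmed fix, is to compare the tensor shapes stored in the checkpoint with those of the freshly constructed network before `copy_params_and_buffers` runs. The helper name `find_shape_mismatches` and the example parameter names and shapes are hypothetical. In practice the two dicts could be built with something like `{k: tuple(v.shape) for k, v in module.state_dict().items()}`. A plausible cause here (an assumption, not verified against this run) is that `--cond=1` widens the input of the first mapping layer, while the FFHQ-U checkpoint was trained without labels.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def find_shape_mismatches(
    src_shapes: Dict[str, Shape], dst_shapes: Dict[str, Shape]
) -> List[Tuple[str, Shape, Shape]]:
    """Return (name, source shape, destination shape) for every tensor that
    exists in both mappings but has a different shape."""
    mismatches = []
    for name, dst_shape in dst_shapes.items():
        src_shape = src_shapes.get(name)
        # Tensors missing from the checkpoint are ignored here; only
        # tensors present on both sides can trigger a size error on copy.
        if src_shape is not None and src_shape != dst_shape:
            mismatches.append((name, src_shape, dst_shape))
    return mismatches


# Hypothetical shapes for illustration only (not read from a real pickle).
checkpoint_shapes = {
    "mapping.fc0.weight": (512, 512),
    "mapping.fc1.weight": (512, 512),
}
new_network_shapes = {
    "mapping.fc0.weight": (512, 1024),  # assumed wider input when labels are used
    "mapping.fc1.weight": (512, 512),
    "mapping.embed.weight": (512, 10),  # assumed label-embedding layer
}

for name, src, dst in find_shape_mismatches(checkpoint_shapes, new_network_shapes):
    print(f"{name}: checkpoint {src} vs new network {dst}")
```

Any name this reports is a tensor that cannot be copied as-is, so either the training options need to match the checkpoint or that tensor has to be left at its fresh initialization.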

@sans-dev

I get the same error but with different shapes:

Number of GPUs:      1
Batch size:          32 images
Training duration:   5000 kimg
Dataset path:        dataset/inat-insects.zip
Dataset size:        84524 images
Dataset resolution:  512
Dataset labels:      False
Dataset x-flips:     True

Creating output directory...
Launching processes...
Loading training set...

Num images:  169048
Image shape: [3, 512, 512]
Label shape: [0]

Constructing networks...
Resuming from "models/stylegan3-r-afhqv2-512x512.pkl"
Traceback (most recent call last):
  File "train.py", line 286, in <module>
    main() # pylint: disable=no-value-for-parameter
  File "/opt/conda/lib/python3.8/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/opt/conda/lib/python3.8/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/opt/conda/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/opt/conda/lib/python3.8/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "train.py", line 281, in main
    launch_training(c=c, desc=desc, outdir=opts.outdir, dry_run=opts.dry_run)
  File "train.py", line 96, in launch_training
    subprocess_fn(rank=0, c=c, temp_dir=temp_dir)
  File "train.py", line 47, in subprocess_fn
    training_loop.training_loop(rank=rank, **c)
  File "/scratch/training/training_loop.py", line 162, in training_loop
    misc.copy_params_and_buffers(resume_data[name], module, require_all=False)
  File "/scratch/torch_utils/misc.py", line 162, in copy_params_and_buffers
    tensor.copy_(src_tensors[name].detach()).requires_grad_(tensor.requires_grad)
RuntimeError: The size of tensor a (512) must match the size of tensor b (1024) at non-singleton dimension 1

This is my cmd:
docker run --gpus all -it --shm-size 50G --rm --user $(id -u):$(id -g) -v `pwd`:/scratch --workdir /scratch -e HOME=/scratch stylegan3 python train.py --outdir=~/training-runs --cfg=stylegan3-t --data=dataset/inat-insects.zip --gpus=1 --batch=32 --gamma=8.2 --mirror=1 --kimg=5000 --snap=5 --resume=models/stylegan3-r-afhqv2-512x512.pkl
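
Note that the command passes `--cfg=stylegan3-t` while the resumed file is named `stylegan3-r-...`. If the two configs use different channel widths, that could account for the 512 vs 1024 mismatch, though this is an assumption and not confirmed in the thread. A small pre-flight check along these lines might catch it before training starts. The helpers `parse_pretrained_name` and `check_resume_compat` are hypothetical and rely only on the `<cfg>-<dataset>-<W>x<H>.pkl` naming pattern seen in the file names above.

```python
import re
from typing import List, Optional, Tuple

# Assumed naming pattern: <cfg>-<dataset>-<W>x<H>.pkl
_PKL_NAME = re.compile(r"(stylegan[23](?:-[rt])?)-.+-(\d+)x(\d+)\.pkl$")


def parse_pretrained_name(path: str) -> Optional[Tuple[str, int]]:
    """Extract (cfg, resolution) from a checkpoint file name, or None if
    the name does not follow the assumed pattern."""
    match = _PKL_NAME.search(path)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def check_resume_compat(cfg: str, resolution: int, resume_path: str) -> List[str]:
    """Return human-readable warnings when the requested config or dataset
    resolution differs from what the checkpoint name suggests."""
    parsed = parse_pretrained_name(resume_path)
    if parsed is None:
        return []  # Unknown naming scheme: nothing can be inferred.
    pkl_cfg, pkl_res = parsed
    problems = []
    if pkl_cfg != cfg:
        problems.append(f"--cfg={cfg} but the checkpoint name suggests {pkl_cfg}")
    if pkl_res != resolution:
        problems.append(
            f"dataset resolution {resolution} but the checkpoint name suggests {pkl_res}"
        )
    return problems


for warning in check_resume_compat(
    "stylegan3-t", 512, "models/stylegan3-r-afhqv2-512x512.pkl"
):
    print("warning:", warning)
```

A file name is only a hint, so a clean result here does not guarantee that the shapes match.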

@qq272574497

I have the same problem. Could we discuss it with each other?
