Running training in Google Colab #630

Open
AaratiAkkapeddi opened this issue Dec 11, 2023 · 0 comments

Here is a link to my Colab notebook for reference: https://colab.research.google.com/drive/1_V9FjMqQOT7VXgKoLcKR_a93ADQbLwwp?usp=sharing
GPU 0: Tesla V100-SXM2-16GB

I am getting the following error:

Creating output directory...
Launching processes...
Loading training set...

Num images:  1006
Image shape: [3, 512, 512]
Label shape: [0]

Constructing networks...
Setting up PyTorch plugin "bias_act_plugin"... Failed!
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/torch/utils/cpp_extension.py", line 1666, in _run_ninja_build
    subprocess.run(
  File "/usr/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['ninja', '-v']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "train.py", line 286, in <module>
    main() # pylint: disable=no-value-for-parameter
  File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 1134, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 1059, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 1401, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 767, in invoke
    return __callback(*args, **kwargs)
  File "train.py", line 281, in main
    launch_training(c=c, desc=desc, outdir=opts.outdir, dry_run=opts.dry_run)
  File "train.py", line 96, in launch_training
    subprocess_fn(rank=0, c=c, temp_dir=temp_dir)
  File "train.py", line 47, in subprocess_fn
    training_loop.training_loop(rank=rank, **c)
  File "/content/drive/MyDrive/real-sg3/stylegan3/training/training_loop.py", line 168, in training_loop
    img = misc.print_module_summary(G, [z, c])
  File "/content/drive/MyDrive/real-sg3/stylegan3/torch_utils/misc.py", line 216, in print_module_summary
    outputs = module(*inputs)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1071, in _call_impl
    result = forward_call(*input, **kwargs)
  File "/content/drive/MyDrive/real-sg3/stylegan3/training/networks_stylegan3.py", line 511, in forward
    ws = self.mapping(z, c, truncation_psi=truncation_psi, truncation_cutoff=truncation_cutoff, update_emas=update_emas)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1071, in _call_impl
    result = forward_call(*input, **kwargs)
  File "/content/drive/MyDrive/real-sg3/stylegan3/training/networks_stylegan3.py", line 151, in forward
    x = getattr(self, f'fc{idx}')(x)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 1071, in _call_impl
    result = forward_call(*input, **kwargs)
  File "/content/drive/MyDrive/real-sg3/stylegan3/training/networks_stylegan3.py", line 100, in forward
    x = bias_act.bias_act(x, b, act=self.activation)
  File "/content/drive/MyDrive/real-sg3/stylegan3/torch_utils/ops/bias_act.py", line 84, in bias_act
    if impl == 'cuda' and x.device.type == 'cuda' and _init():
  File "/content/drive/MyDrive/real-sg3/stylegan3/torch_utils/ops/bias_act.py", line 41, in _init
    _plugin = custom_ops.get_plugin(
  File "/content/drive/MyDrive/real-sg3/stylegan3/torch_utils/custom_ops.py", line 136, in get_plugin
    torch.utils.cpp_extension.load(name=module_name, build_directory=cached_build_dir,
  File "/usr/local/lib/python3.8/dist-packages/torch/utils/cpp_extension.py", line 1080, in load
    return _jit_compile(
  File "/usr/local/lib/python3.8/dist-packages/torch/utils/cpp_extension.py", line 1293, in _jit_compile
    _write_ninja_file_and_build_library(
  File "/usr/local/lib/python3.8/dist-packages/torch/utils/cpp_extension.py", line 1405, in _write_ninja_file_and_build_library
    _run_ninja_build(
  File "/usr/local/lib/python3.8/dist-packages/torch/utils/cpp_extension.py", line 1682, in _run_ninja_build
    raise RuntimeError(message) from e
RuntimeError: Error building extension 'bias_act_plugin': [1/2] c++ -MMD -MF bias_act.o.d -DTORCH_EXTENSION_NAME=bias_act_plugin -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /usr/local/lib/python3.8/dist-packages/torch/include -isystem /usr/local/lib/python3.8/dist-packages/torch/include/torch/csrc/api/include -isystem /usr/local/lib/python3.8/dist-packages/torch/include/TH -isystem /usr/local/lib/python3.8/dist-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /usr/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -c /root/.cache/torch_extensions/bias_act_plugin/3cb576a0039689487cfba59279dd6d46-tesla-v100-sxm2-16gb/bias_act.cpp -o bias_act.o 
FAILED: bias_act.o 
c++ -MMD -MF bias_act.o.d -DTORCH_EXTENSION_NAME=bias_act_plugin -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -isystem /usr/local/lib/python3.8/dist-packages/torch/include -isystem /usr/local/lib/python3.8/dist-packages/torch/include/torch/csrc/api/include -isystem /usr/local/lib/python3.8/dist-packages/torch/include/TH -isystem /usr/local/lib/python3.8/dist-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /usr/include/python3.8 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++14 -c /root/.cache/torch_extensions/bias_act_plugin/3cb576a0039689487cfba59279dd6d46-tesla-v100-sxm2-16gb/bias_act.cpp -o bias_act.o 
In file included from /usr/local/lib/python3.8/dist-packages/torch/include/torch/csrc/Device.h:3,
                 from /usr/local/lib/python3.8/dist-packages/torch/include/torch/csrc/api/include/torch/python.h:8,
                 from /usr/local/lib/python3.8/dist-packages/torch/include/torch/extension.h:6,
                 from /root/.cache/torch_extensions/bias_act_plugin/3cb576a0039689487cfba59279dd6d46-tesla-v100-sxm2-16gb/bias_act.cpp:9:
/usr/local/lib/python3.8/dist-packages/torch/include/torch/csrc/python_headers.h:11:10: fatal error: Python.h: No such file or directory
   11 | #include <Python.h>
      |          ^~~~~~~~~~
compilation terminated.
ninja: build stopped: subcommand failed.
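
The last compiler message, `fatal error: Python.h: No such file or directory`, suggests the CPython development headers are not present in the include directory the build uses (`/usr/include/python3.8` in the command above). A minimal sketch for checking this from the notebook follows. It assumes the interpreter running the cell is the same one that runs `train.py`; the helper name `find_python_header` is hypothetical and not part of StyleGAN3 or PyTorch.

```python
import os
import sysconfig


def find_python_header(include_dir=None):
    """Return (path_to_Python_h, found) for the given include directory.

    If include_dir is None, use the include directory that the running
    interpreter reports through sysconfig (an assumption: this should match
    the -isystem path in the failing compile command).
    """
    if include_dir is None:
        include_dir = sysconfig.get_paths()["include"]
    header = os.path.join(include_dir, "Python.h")
    return header, os.path.isfile(header)


if __name__ == "__main__":
    header, found = find_python_header()
    if found:
        print(f"Found {header}")
    else:
        print(f"Missing {header}: the Python development headers may not be installed")
```

If the header is reported missing, installing the development package that matches the interpreter version is the usual remedy on Debian/Ubuntu images (for Python 3.8 that would typically be `python3.8-dev`). Whether that package is available on this particular Colab image is an assumption, not something the log confirms.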