RuntimeError during default execution #3

AlexanderGri · 2017-12-16T14:52:43Z

Hello, thank you for your implementation!

I just tried to run the default experiment with

python main.py --no-cuda --epochs 1

and ran into the following problem:

/opt/conda/lib/python3.6/importlib/_bootstrap.py:205: RuntimeWarning: compiletime version 3.5 of module 'tensorflow.python.framework.fast_tensor_util' does not match runtime version 3.6
  return f(*args, **kwds)
Prepare files
Define model
        Statistics
        Create model
Optimizer
Logger
=> no best model found at './checkpoint/qm9/mpnn/model_best.pth'
Check cuda
Traceback (most recent call last):
  File "main.py", line 321, in <module>
    main()
  File "main.py", line 182, in main
    train(train_loader, model, criterion, optimizer, epoch, evaluation, logger)
  File "main.py", line 242, in train
    output = model(g, h, e)
  File "/opt/conda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 319, in __call__
    result = self.forward(*input, **kwargs)
  File "/data/grishin/nmp_qc/models/MPNN.py", line 78, in forward
    m = self.m[0].forward(h[t], h_aux, e_aux)
  File "/data/grishin/nmp_qc/MessageFunction.py", line 43, in forward
    return self.m_function(h_v, h_w, e_vw, args)
  File "/data/grishin/nmp_qc/MessageFunction.py", line 175, in m_mpnn
    h_w_rows = h_w[..., None].expand(h_w.size(0), h_v.size(1), h_w.size(1)).contiguous()
RuntimeError: The expanded size of the tensor (25) must match the existing size (73) at non-singleton dimension 1

Am I doing something wrong? Thank you in advance.


priba · 2018-02-01T09:59:29Z

Hi,

Sorry for the long delay in answering. In my opinion, the errors you reported come from the PyTorch version. I got similar errors in another codebase I am working on when changing the PyTorch release, due to changes in the behaviour of "sum".

https://github.com/pytorch/pytorch/releases
"All reduce functions such as sum and mean now default to squeezing the reduced dimension."

I suggest adding keepdim=False to the sum operations as a quick and easy fix for this problem.

In a few weeks, I will try to update the code for newer PyTorch versions.
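
A note on the release-note quote above: read literally, it says reductions now drop the reduced dimension by default, which corresponds to keepdim=False. Under that reading, code written for the older behaviour would need keepdim=True to get the old shapes back, and passing keepdim=False would leave the new default unchanged. The plain-Python sketch below only models the resulting shapes; reduced_shape is a hypothetical helper, not a PyTorch function, and the sizes are illustrative.

```python
def reduced_shape(shape, dim, keepdim=False):
    """Model the output shape of a reduction such as x.sum(dim, keepdim=...).

    Hypothetical helper for illustration; it does not call PyTorch.
    """
    shape = list(shape)
    if keepdim:
        shape[dim] = 1      # reduced dimension is kept with size 1
    else:
        del shape[dim]      # reduced dimension is squeezed away
    return tuple(shape)


x_shape = (4, 25, 73)       # illustrative sizes only

squeezed = reduced_shape(x_shape, dim=1)               # default: keepdim=False
kept = reduced_shape(x_shape, dim=1, keepdim=True)

print(squeezed)  # (4, 73)
print(kept)      # (4, 1, 73)
```

If downstream code indexes or expands the result as a 3-D tensor, only the keepdim=True shape still has the dimension it expects.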

ay27 · 2018-02-01T14:24:15Z

I tried to fix the problem and made some improvements, but I am not confident they are correct. Someone may want to verify them.

josejimenezluna · 2018-06-25T11:32:52Z

Hello, @priba

To make things easier, which version of PyTorch are we supposed to be running?

adamxyang · 2019-01-19T01:03:45Z

Hello, @priba

Thanks for the implementation! I encountered the same issue here. I experimented with PyTorch versions 0.2.0, 0.3.0 and 1.0.0, and I also added keepdim=False to all sum operations in datasets.utils.py and models.MPNN.py, but none of these worked.

(rdkit) Adams-MacBook-Pro-4:mpnn iron4dam$ python main.py --no-cuda
Prepare files
Define model
	Statistics
	Create model
Optimizer
Logger
=> no best model found at './checkpoint/qm9/mpnn/model_best.pth'
Check cuda
Traceback (most recent call last):
  File "main.py", line 320, in <module>
    main()
  File "main.py", line 182, in main
    train(train_loader, model, criterion, optimizer, epoch, evaluation, logger)
  File "main.py", line 241, in train
    output = model(g, h, e)
  File "/Users/iron4dam/anaconda3/envs/rdkit/lib/python3.6/site-packages/torch/nn/modules/module.py", line 325, in __call__
    result = self.forward(*input, **kwargs)
  File "/Users/iron4dam/Google_Drive/Part_C/Dissertation/dissertation_code/mpnn/models/MPNN.py", line 78, in forward
    m = self.m[0].forward(h[t], h_aux, e_aux)
  File "/Users/iron4dam/Google_Drive/Part_C/Dissertation/dissertation_code/mpnn/MessageFunction.py", line 43, in forward
    return self.m_function(h_v, h_w, e_vw, args)
  File "/Users/iron4dam/Google_Drive/Part_C/Dissertation/dissertation_code/mpnn/MessageFunction.py", line 174, in m_mpnn
    h_w_rows = h_w[..., None].expand(h_w.size(0), h_v.size(1), h_w.size(1)).contiguous()
RuntimeError: The expanded size of the tensor (25) must match the existing size (73) at non-singleton dimension 1. at /Users/soumith/minicondabuild3/conda-bld/pytorch_1512381214802/work/torch/lib/TH/generic/THTensor.c:309

rmrmg · 2019-03-03T08:42:18Z

@ay27 I've applied your patch and have another problem:

(nmpqc) rmrmg@kolos:/chematica/pka/nmpqc/nmp_qc$ LD_PRELOAD=$CONDA_PREFIX/lib/libstdc++.so python ./main.py --no-cuda
loaeed
Prepare files
Define model
Statistics
Create model
Optimizer
Logger
=> no best model found at './checkpoint/qm9/mpnn/model_best.pth'
Check cuda
Traceback (most recent call last):
  File "./main.py", line 330, in <module>
    main()
  File "./main.py", line 191, in main
    train(train_loader, model, criterion, optimizer, epoch, evaluation, logger)
  File "./main.py", line 254, in train
    losses.update(train_loss.data[0], g.size(0))
IndexError: invalid index of a 0-dim tensor. Use tensor.item() to convert a 0-dim tensor to a Python number
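
The IndexError above points at its own fix: on PyTorch versions where the loss is a 0-dim tensor, train_loss.data[0] no longer works and the message suggests tensor.item() instead. A minimal sketch of a version-tolerant helper follows; scalar_value is a hypothetical name, and the fallback branch assumes an older release where indexing with [0] was still valid.

```python
def scalar_value(loss):
    """Return a plain Python number from a loss value.

    Hypothetical helper: prefers .item() (suggested by the error message
    for 0-dim tensors) and falls back to the older .data[0] indexing.
    """
    if hasattr(loss, "item"):
        return loss.item()
    return loss.data[0]


# Assumed usage in the training loop from the traceback:
# losses.update(scalar_value(train_loss), g.size(0))
```

If only one PyTorch version needs to be supported, replacing train_loss.data[0] with train_loss.item() directly is probably simpler.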

wmmxk · 2019-06-09T18:09:48Z

I ran into the same error:
h_w_rows = h_w[..., None].expand(h_w.size(0), h_v.size(1), h_w.size(1)).contiguous()
RuntimeError: The expanded size of the tensor (24) must match the existing size (73) at non-singleton dimension 1

So if it is due to a version update, could you tell me which version you are using? (I am using PyTorch 0.4.1.)

priba · 2019-06-10T15:02:53Z

At that time I was using PyTorch 0.3.0.

njwm · 2021-08-08T13:04:01Z

I made a small change like this:
h_w_rows = h_w[..., None].expand(h_w.size(0), h_w.size(1), h_v.size(1)).contiguous()
It seems to work, but I am not sure about the results.

sthakurr · 2022-03-24T06:07:35Z

> I made a small change like this: h_w_rows = h_w[..., None].expand(h_w.size(0), h_w.size(1), h_v.size(1)).contiguous() It seems to work, but I am not sure about the results.

@njwm I did the same to get past that error and it worked (although another, similar error then came up regarding a .sum operation). Could you please verify whether it affects the results?

njwm · 2022-04-22T03:43:57Z

Perhaps it is better like this:
h_w_rows = h_w[:, None,:].expand(h_w.size(0), h_v.size(1), h_w.size(1)).contiguous()
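
Going by the traceback, h_w appears to be 2-D with h_w.size(1) == 73 and h_v.size(1) == 25, so h_w[..., None] has shape (B, 73, 1) and cannot be expanded to (B, 25, 73), whereas h_w[:, None, :] has shape (B, 1, 73) and can. The sketch below models that size check in plain Python for shapes of equal rank; expand_shape is a hypothetical helper, not a PyTorch function, and the batch size of 4 is made up. It says nothing about whether the resulting values match what the original code computed on PyTorch 0.3.0.

```python
def expand_shape(existing, target):
    """Model the size check behind Tensor.expand for equal-rank shapes.

    Hypothetical helper: a dimension can only be expanded if it has size 1
    or already matches the requested size.
    """
    if len(existing) != len(target):
        raise ValueError("this sketch only handles shapes of equal rank")
    for dim, (have, want) in enumerate(zip(existing, target)):
        if have != 1 and have != want:
            raise RuntimeError(
                f"The expanded size of the tensor ({want}) must match the "
                f"existing size ({have}) at non-singleton dimension {dim}"
            )
    return tuple(target)


batch, n_v, n_w = 4, 25, 73      # batch is made up; 25 and 73 come from the traceback
target = (batch, n_v, n_w)       # expand(h_w.size(0), h_v.size(1), h_w.size(1))

# h_w[:, None, :] -> shape (batch, 1, 73): the singleton dimension expands to 25
ok_shape = expand_shape((batch, 1, n_w), target)

# h_w[..., None] -> shape (batch, 73, 1): dimension 1 is 73, not 1 or 25
try:
    expand_shape((batch, n_w, 1), target)
    failed = False
except RuntimeError as err:
    failed = True
    message = str(err)
```

The earlier variant, expand(h_w.size(0), h_w.size(1), h_v.size(1)), also passes this check starting from (B, 73, 1), but it produces a (B, 73, 25) layout, which may be why later steps such as the .sum operation complained.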

njwm · 2022-04-22T03:48:02Z

> > I made a small change like this: h_w_rows = h_w[..., None].expand(h_w.size(0), h_w.size(1), h_v.size(1)).contiguous() It seems to work, but I am not sure about the results.
>
> @njwm I did the same in order to get past that error and it worked (even though another similar error came regarding a .sum operation). But can you please verify if it affected the results?

I don't think it makes sense; it just gets past that error.

ay27 linked a pull request Feb 1, 2018 that will close this issue

Solve Issue #3 & #4 #5

Open
