Floating-point overflow error #5

Open
foolishflyfox opened this issue Dec 7, 2018 · 1 comment
Comments

@foolishflyfox

A numeric overflow occurred during training. Here is the output from the run:

python main.py train --train-data-root=/home/linux_fhb/data/cat_vs_dog/train --use-gpu --env=classifier
user config:
env classifier
model ResNet34
train_data_root /home/linux_fhb/data/cat_vs_dog/train
test_data_root ./data/test1
load_model_path None
batch_size 32
use_gpu True
num_workers 4
print_freq 20
debug_file /tmp/debug
result_file result.csv
max_epoch 10
lr 0.1
lr_decay 0.95
weight_decay 0.0001
parse <bound method parse of <config.DefaultConfig object at 0x7f3e4a85b400>>
/home/linux_fhb/anaconda3/lib/python3.6/site-packages/torchvision/transforms/transforms.py:188: UserWarning: The use of the transforms.Scale transform is deprecated, please use transforms.Resize instead.
  "please use transforms.Resize instead.")
/home/linux_fhb/anaconda3/lib/python3.6/site-packages/torchvision/transforms/transforms.py:563: UserWarning: The use of the transforms.RandomSizedCrop transform is deprecated, please use transforms.RandomResizedCrop instead.
  "please use transforms.RandomResizedCrop instead.")
  0%|                                                 | 0/17500 [00:00<?, ?it/s]main.py:99: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
  loss_meter.add(loss.data[0])
  3%|█▏                                   | 547/17500 [02:09<1:05:07,  4.34it/s]
main.py:138: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  val_input = Variable(input, volatile=True)
main.py:139: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  val_label = Variable(label.type(t.LongTensor), volatile=True)
Traceback (most recent call last):
  File "main.py", line 171, in <module>
    fire.Fire()
  File "/home/linux_fhb/anaconda3/lib/python3.6/site-packages/fire/core.py", line 127, in Fire
    component_trace = _Fire(component, args, context, name)
  File "/home/linux_fhb/anaconda3/lib/python3.6/site-packages/fire/core.py", line 366, in _Fire
    component, remaining_args)
  File "/home/linux_fhb/anaconda3/lib/python3.6/site-packages/fire/core.py", line 542, in _CallCallable
    result = fn(*varargs, **kwargs)
  File "main.py", line 121, in train
    if loss_meter.value()[0] > previous_loss:          
RuntimeError: value cannot be converted to type float without overflow: 10000000000000000159028911097599180468360808563945281389781327557747838772170381060813469985856815104.000000

The versions in my environment are:

Python 3.6.5 :: Anaconda, Inc.
fire                               0.1.3    
numpy                              1.14.3   
numpydoc                           0.8.0    
torch                              0.4.1    
torchfile                          0.1.0    
torchnet                           0.0.4    
torchvision                        0.2.1    
visdom                             0.1.8.5  

GPU: NVIDIA Corporation GP102 [GeForce GTX 1080 Ti] (rev a1), 11 GB of memory.

Has anyone else run into the same problem? How did you solve it?

@lijie2160

Change this value: previous_loss
