gadgetCPU.gadgetInit() report an error! #7

Open
xscjun opened this issue Oct 8, 2019 · 19 comments

Comments

@xscjun
xscjun commented Oct 8, 2019

When I run the code:
gadgetCPU = SpeedTorch.DataGadget( 'data.npy',CPUPinn=True)
gadgetCPU.gadgetInit()

it reports an error like this:
Exception ignored in: <function PMemory.__del__ at 0x7fcef8ca86a8>
Traceback (most recent call last):
  File "/usr/local/python3/lib/python3.7/site-packages/SpeedTorch/CUPYLive.py", line 19, in __del__
AttributeError: 'NoneType' object has no attribute 'runtime'

But it works when I run:
gadgetGPU = SpeedTorch.DataGadget( 'data.npy' )
gadgetGPU.gadgetInit()

I can't find the reason; it confuses me.
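
An "Exception ignored in ... __del__" message is printed when an exception escapes a `__del__` method. Python's documentation notes that `__del__` can run during interpreter shutdown, when module-level names it relies on may already have been set to `None`. Whether that is what happens inside SpeedTorch on Python 3.7 is not confirmed anywhere in this thread. The sketch below only illustrates the failure mode and one defensive pattern; `FakeBackend`, `UnsafeBuffer`, `SafeBuffer`, and `simulate_shutdown` are hypothetical names, not SpeedTorch or CuPy code.

```python
class FakeRuntime:
    """Stand-in for a GPU runtime that frees host memory (hypothetical)."""

    def __init__(self):
        self.freed = []

    def free_host(self, ptr):
        self.freed.append(ptr)


class FakeBackend:
    """Stand-in for an imported library exposing a `runtime` attribute."""

    def __init__(self):
        self.runtime = FakeRuntime()


# Module-level global, playing the role of an imported module.
backend = FakeBackend()


def simulate_shutdown():
    """Mimic interpreter teardown by clearing the module-level global."""
    global backend
    backend = None


class UnsafeBuffer:
    def __init__(self, ptr):
        self.ptr = ptr

    def __del__(self):
        # Looks up the global at destruction time. If `backend` is None,
        # this raises AttributeError: 'NoneType' object has no attribute 'runtime'.
        backend.runtime.free_host(self.ptr)


class SafeBuffer:
    def __init__(self, ptr):
        self.ptr = ptr
        # Keep a private reference so cleanup does not depend on the global.
        self._runtime = backend.runtime

    def __del__(self):
        runtime = getattr(self, "_runtime", None)
        if runtime is not None:
            runtime.free_host(self.ptr)
```

If the message only appears when the script exits, the data itself has usually been loaded fine and the failure is confined to cleanup; that is an assumption worth checking rather than a diagnosis.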

@Santosh-Gupta
Owner

Hi xscjun,

That is very confusing. I am investigating by trying to recreate the issue. So far I am unable to re-create the issue in Colab.

Here's the code I used

!pip install SpeedTorch
#Always import cupy before SpeedTorch 
import cupy
import SpeedTorch
import torch
import numpy as np
import torch.nn as nn

sampl = np.random.uniform(low=-1.0, high=1.0, size=(10, 10, 10, 10))
np.save('data.npy', sampl)
del sampl

gadgetGPU = SpeedTorch.DataGadget( 'data.npy', CPUPinn=True )
gadgetGPU.gadgetInit()

For convenience, here's the colab notebook I used.

https://colab.research.google.com/drive/1TbqKwZ94p_B6q0t_orYObKsWwa7Fg0ld

Are you able to recreate the issue in Colab? If so, link a notebook for further investigation.

If not, it looks like it's an issue with your system. In that case, please provide your system information so we can try to figure out what is causing the error.
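
A minimal sketch of how the requested system information could be gathered in one place. The function name `collect_environment` and the default package list are assumptions chosen for this thread; the list keeps cupy ahead of SpeedTorch, matching the import order used above.

```python
import importlib
import platform
import sys


def collect_environment(packages=("numpy", "cupy", "torch", "SpeedTorch")):
    """Return a dict describing the interpreter, the OS, and package versions."""
    info = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    }
    for name in packages:
        try:
            module = importlib.import_module(name)
        except Exception as exc:
            # Covers a missing package as well as a failure inside the
            # package's own import (for example, no CUDA driver found).
            info[name] = "import failed: {}".format(type(exc).__name__)
        else:
            info[name] = str(getattr(module, "__version__", "unknown"))
    return info
```

Printing the result of `collect_environment()` on the failing machine and on Colab would make differences such as Python 3.7 versus 3.6 visible side by side.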

@xscjun
Author

xscjun commented Oct 9, 2019

Thanks for your reply. It's Python 3.7 that reports the error; I changed the Python version to 2.7 and it's OK now.

@Santosh-Gupta
Owner

Glad you were able to get it working. I am wondering what the cause is; I did all my testing in Python 3.

@xscjun
Author

xscjun commented Oct 9, 2019

I can't get it working in Python 3.7, which is confusing.

@Santosh-Gupta
Owner

I noticed Colab has Python 3.6.8 by default, so perhaps there is something off about 3.7.

Are you able to recreate the issue in Colab? Is the 'data.npy' the same as in the notebook?

@xscjun
Author

xscjun commented Oct 10, 2019

The 'data.npy' is the same as in the notebook. I haven't been able to recreate the issue in Colab.

@Approximetal

I got an OOM error. How can I deal with it? BTW, how can I load multiple datasets in one container?
gadgetGPU = SpeedTorch.DataGadget(target_mel)
gadgetGPU.gadgetInit()
Traceback (most recent call last):
  File "/home/zzy/anaconda3/envs/StarGAN-VC/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 3326, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-19-226edd99569f>", line 1, in <module>
    gadgetGPU.gadgetInit()
  File "/home/zzy/anaconda3/envs/StarGAN-VC/lib/python3.6/site-packages/SpeedTorch/CUPYLive.py", line 265, in gadgetInit
    self.CUPYcorpus = cupy.load( self.fileName)
  File "/home/zzy/anaconda3/envs/StarGAN-VC/lib/python3.6/site-packages/cupy/io/npz.py", line 71, in load
    return cupy.array(obj)
  File "/home/zzy/anaconda3/envs/StarGAN-VC/lib/python3.6/site-packages/cupy/creation/from_data.py", line 43, in array
    return core.array(obj, dtype, copy, order, subok, ndmin)
  File "cupy/core/core.pyx", line 1768, in cupy.core.core.array
  File "cupy/core/core.pyx", line 1845, in cupy.core.core.array
  File "cupy/core/core.pyx", line 1920, in cupy.core.core._send_object_to_gpu
  File "cupy/core/core.pyx", line 134, in cupy.core.core.ndarray.__init__
  File "cupy/cuda/memory.pyx", line 540, in cupy.cuda.memory.alloc
  File "cupy/cuda/memory.pyx", line 1234, in cupy.cuda.memory.MemoryPool.malloc
  File "cupy/cuda/memory.pyx", line 1255, in cupy.cuda.memory.MemoryPool.malloc
  File "cupy/cuda/memory.pyx", line 1033, in cupy.cuda.memory.SingleDeviceMemoryPool.malloc
  File "cupy/cuda/memory.pyx", line 1053, in cupy.cuda.memory.SingleDeviceMemoryPool._malloc
  File "cupy/cuda/memory.pyx", line 775, in cupy.cuda.memory._try_malloc
cupy.cuda.memory.OutOfMemoryError: Out of memory allocating 86,528 bytes (allocated so far: 0 bytes).
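
The failing request in this traceback is only 86,528 bytes, with nothing allocated through CuPy's pool so far. That suggests, without proving it, that the device had almost no free memory to begin with, for example because another process was holding it. A minimal sketch of a pre-flight check follows; `fits_on_device` and `query_free_bytes` are hypothetical helper names, while `cupy.cuda.runtime.memGetInfo()` is the CuPy call that returns the free and total device memory in bytes.

```python
def fits_on_device(required_bytes, free_bytes, headroom=0.10):
    """Return True if required_bytes fits in free_bytes with a headroom fraction left over."""
    if required_bytes < 0 or free_bytes < 0:
        raise ValueError("byte counts must be non-negative")
    return required_bytes <= free_bytes * (1.0 - headroom)


def query_free_bytes():
    """Ask CUDA how much device memory is free; needs cupy and a working GPU."""
    import cupy  # imported lazily so the pure check above works without a GPU

    free_bytes, _total_bytes = cupy.cuda.runtime.memGetInfo()
    return free_bytes
```

On the affected machine, `fits_on_device(86528, query_free_bytes())` returning `False` would point at the GPU being occupied rather than at the data file.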

@Santosh-Gupta
Owner

Can you make a colab that reproduces this error? That way I can interact with the bug.

BTW, how can I load multiple datasets in one container?

I haven't put that feature in, but I can add it. So you would want
ModelFactoryObject.loadCupy( loadFileName)
for the first dataset, and then, for new datasets, something like
ModelFactoryObject.appendCupy( loadFileName2)
?
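
Until something like the proposed `appendCupy` exists (it is only a suggestion in this thread, not an existing method), one possible workaround is to merge the datasets into a single `.npy` file first and load that one file. A minimal sketch, assuming the arrays have matching shapes except along the concatenation axis and that the merged array fits in memory; `merge_npy_files` is a hypothetical helper name.

```python
import numpy as np


def merge_npy_files(paths, out_path, axis=0):
    """Concatenate several .npy arrays along `axis` and save them as one .npy file."""
    arrays = [np.load(path) for path in paths]
    merged = np.concatenate(arrays, axis=axis)
    # numpy.save appends ".npy" to out_path if the extension is missing.
    np.save(out_path, merged)
    return merged.shape
```

The merged file can then be passed to `SpeedTorch.DataGadget` in the same way as `'data.npy'` in the examples above.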

@Approximetal

Approximetal commented Jan 6, 2020

import cupy
import SpeedTorch
gadgetGPU = SpeedTorch.DataGadget('mel-20170001P00084I0004.npy')
gadgetGPU.gadgetInit()
mel-20170001P00084I0004.zip
It seems gadgetGPU.gadgetInit() results in this error no matter which data I load.

@Santosh-Gupta
Owner

It looks like the data format is incorrect. There may be an issue with how you're saving the data, and/or how you're zipping the file.

Check out this notebook, which saves and loads numpy data:

https://colab.research.google.com/drive/185Z5Gi62AZxh-EeMfrTtjqxEifHOBXxF

Try using numpy.save to directly save your data into a numpy format file.
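
A minimal sketch of that suggestion, together with a quick sanity check on the resulting file. Files written by `numpy.save` start with the six magic bytes `\x93NUMPY`, so a file that does not start with them (a zip archive, for instance) is not in `.npy` format. The helper names `looks_like_npy` and `save_as_npy` are hypothetical, and the sketch assumes all samples have the same shape.

```python
import numpy as np

NPY_MAGIC = b"\x93NUMPY"  # first six bytes of every .npy file


def looks_like_npy(path):
    """Return True if the file at `path` starts with the .npy magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(len(NPY_MAGIC)) == NPY_MAGIC


def save_as_npy(samples, path):
    """Stack equally shaped samples into one array and write it with numpy.save."""
    array = np.stack([np.asarray(sample) for sample in samples])
    # numpy.save appends ".npy" to `path` if the extension is missing.
    np.save(path, array)
    return array.shape
```

Running `looks_like_npy` on the file that triggers the error is a cheap way to rule the format in or out before involving the GPU at all.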

@Approximetal

Approximetal commented Jan 7, 2020

It seems hard to change the data format, as my model has been training for a long time... Is there any way to transfer data from the CPU to your container? Or is there anything that can replace torch.utils.data.DataLoader in PyTorch? For example, I've already preprocessed my data and saved it in a list.

@Santosh-Gupta
Owner

Yup. I forget the exact commands, but you can access your embedding data and move it to the CPU in numpy form. It looks something like this: YourModel.YourEmbeddingVariable.weight.data.cpu().numpy()

Details:

https://discuss.pytorch.org/t/how-to-transform-variable-into-numpy/104/5

@Approximetal

Approximetal commented Jan 7, 2020

I can't open the link... I don't mean embedding data such as the weights or parameters of the model; I mean the training data, a dataset loaded on the CPU. Most of the time is usually spent loading batches from CPU to GPU, so I was wondering if I could keep the training data in SpeedTorch. It would be helpful if there were documentation explaining the functions in SpeedTorch.

@Approximetal

Approximetal commented Jan 7, 2020

Thank you for replying! I can load files using SpeedTorch now. But cupy doesn't support multi-threading, so I had to reduce the thread count from 8 to 1; after that, the time cost is even longer...

@Santosh-Gupta
Owner

Yes, as long as the data is saved in numpy format, the data gadget can open it, or you could transfer live data onto it. If you give me a Colab notebook that loads your data, I can tinker around with it. I think the easiest way to do this is to upload your data to Google Drive, then use !gdown --id followed by the Google Drive ID to download it directly to your notebook.

There's documentation at the bottom of the readme, and here's a colab notebook which shows how to use the data gadget:

https://colab.research.google.com/drive/1TbqKwZ94p_B6q0t_orYObKsWwa7Fg0ld

@Santosh-Gupta
Owner

How many cores does your CPU have? The main SpeedTorch advantages are for a low number of CPU cores, like 1-4. Beyond that, PyTorch's indexing kernels become more efficient.

I would love to see a Colab version of your code; maybe I can tinker a bit.

@Approximetal

Approximetal commented Jan 10, 2020

My CPU info: 8 Intel(R) Xeon(R) W-2123 CPU @ 3.60GHz
The model I use is based on https://github.com/NVIDIA/tacotron2
And I replaced this line by using

melspec = SpeedTorch.DataGadget(full_path)
melspec.gadgetInit()
melspec = melspec.getData()

(BTW, I don't know how to get the full data, so I modified getData() and removed the index parameter.)

@Santosh-Gupta
Owner

How many cores does that CPU have? I can't seem to look it up.

I'm not too familiar with that model. But with a colab notebook perhaps I can tinker around.
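
A count such as the "8" in the CPU info above is often the number of logical CPUs (hardware threads) rather than physical cores. A minimal sketch for checking what Python sees on a given machine; `usable_cpu_count` is a hypothetical helper, and the standard library does not report physical cores directly.

```python
import os


def usable_cpu_count():
    """Logical CPUs this process may run on, falling back to os.cpu_count()."""
    if hasattr(os, "sched_getaffinity"):
        # Available on Linux; respects CPU affinity restrictions such as taskset.
        return len(os.sched_getaffinity(0))
    return os.cpu_count() or 1
```

On a processor with two hardware threads per core, this number would be twice the physical core count.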

@Approximetal

4 cores. I'm afraid I can't upload the model to Colab; every time I open that link, my computer nearly freezes...
