Dtype issues with gpu backend #449

Open
zhiltsov-max opened this issue Apr 21, 2018 · 1 comment

@zhiltsov-max

Hello, I was experimenting with Neon and ran into an issue with the convolutional and pooling layers. The task was image classification, so the input data shape was (3, H, W). If an ArrayIterator or HDF5Iterator is used as the dataset, the input shape values may have numpy datatypes such as numpy.int64 (for ArrayIterator they come from the lshape parameter; for HDF5Iterator they are read from file['input'].attrs['lshape']). When these values are passed to the model's configure method as in_obj, they are assigned to layer.in_shape, and in_shape is then used to initialize the layer parameters. During the forward pass, the following errors arise:

  • conv layer:
  File "<user>/neon/backends/nervanagpu.py", line 1990, in fprop_conv
    return self._execute_conv("fprop", layer, layer.fprop_kernels, repeat)
  File "<user>/neon/backends/nervanagpu.py", line 2072, in _execute_conv
    kernels.execute(repeat)
  File "<user>/neon/backends/convolution.py", line 224, in execute
    kernel.prepared_async_call(*self.launch_args, shared_size=self.shared)
  File "<user>/pycuda-2017.1.1-py3.5-linux-x86_64.egg/pycuda/driver.py", line 516, in function_prepared_async_call
    func._launch_kernel(grid, block, arg_buf, shared_size, stream)
TypeError: No registered converter was able to produce a C++ rvalue of type unsigned int from this Python object of type numpy.int64
  • pool layer:
  File "<user>/neon/backends/nervanagpu.py", line 2316, in fprop_pool
    layer.fprop_lut_size, repeat)
  File "<user>/neon/backends/nervanagpu.py", line 2349, in _execute_pool
    kernel.prepared_async_call(*params, shared_size=shared)
  File "<user>/pycuda-2017.1.1-py3.5-linux-x86_64.egg/pycuda/driver.py", line 516, in function_prepared_async_call
    func._launch_kernel(grid, block, arg_buf, shared_size, stream)
TypeError: No registered converter was able to produce a C++ rvalue of type unsigned int from this Python object of type numpy.int64
  • memory allocation in conv:
  File "<user>/neon/backends/convolution.py", line 1307, in bind_params
    input_data = self.lib.scratch_buffer_offset(self.size)
  File "<user>/neon/backends/nervanagpu.py", line 875, in scratch_buffer_offset
    data = int(_get_scratch_data(self.scratch_size)) + self.scratch_offset
  File "<decorator-gen-62>", line 2, in _get_scratch_data
  File "<user>/pycuda-2017.1.1-py3.5-linux-x86_64.egg/pycuda/tools.py", line 430, in context_dependent_memoize
    result = func(*args)
  File "<user>/neon/backends/nervanagpu.py", line 3287, in _get_scratch_data
    return drv.mem_alloc(scratch_size)
Boost.Python.ArgumentError: Python argument types in
    pycuda._driver.mem_alloc(numpy.int64)
did not match C++ signature:
    mem_alloc(unsigned long)
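
All three tracebacks reduce to the same mismatch: with the pycuda version listed below (2017.1.1), pycuda does not coerce numpy integer scalars into the C integer types expected by its prepared kernel launches and by mem_alloc. A minimal standalone illustration (assuming a machine with a CUDA device and pycuda installed; Neon is not involved):

    import numpy as np
    import pycuda.autoinit  # noqa: F401 -- creates a CUDA context
    import pycuda.driver as drv

    size = np.int64(1024)

    try:
        drv.mem_alloc(size)  # rejected: Boost.Python sees numpy.int64, not unsigned long
    except Exception as exc:
        print(type(exc).__name__, exc)

    buf = drv.mem_alloc(int(size))  # accepted once cast to a built-in int
    print(buf)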

Layer parameters:

In "<>/neon/backends/convolution.py", line 75, in __init__:
(N, C, K, D, H, W, T, R, S, M, P, Q, pad_d, pad_h, pad_w, str_d, str_h, str_w, dil_d, dil_h, dil_w)

Have following values (idx, type, value):

[(0, <class 'int'>, 128), (1, <class 'numpy.int64'>, 3), (2, <class 'int'>, 32), (3, <class 'int'>, 1), (4, <class 'numpy.int64'>, 128), (5, <class 'numpy.int64'>, 128), (6, <class 'int'>, 1), (7, <class 'int'>, 3), (8, <class 'int'>, 3), (9, <class 'int'>, 1), (10, <class 'numpy.int64'>, 128), (11, <class 'numpy.int64'>, 128), (12, <class 'int'>, 0), (13, <class 'int'>, 2), (14, <class 'int'>, 2), (15, <class 'int'>, 1), (16, <class 'int'>, 1), (17, <class 'int'>, 1), (18, <class 'int'>, 1), (19, <class 'int'>, 2), (20, <class 'int'>, 2)]

Casting all parameters to int in the layer initialization fixes the issue for me, but that does not seem like a proper solution. Casting the elements of lshape to int also helps (a sketch of that workaround is below). I think it would be great if the input values were checked or converted to the expected types on the library side. Other layer types (linear, batchnorm, recurrent, etc.) and backends (cpu, mkl) that I used did not show this issue.
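
A user-side workaround, for anyone hitting the same errors: do the cast where the dataset is built. The data and shapes below are only illustrative; the relevant part is the int() cast applied to the lshape values.

    import numpy as np
    from neon.backends import gen_backend
    from neon.data import ArrayIterator

    # Same backend and batch size as in the report above.
    be = gen_backend(backend='gpu', batch_size=128)

    # Dummy classification data: flattened 3x128x128 images, 10 classes.
    X = np.random.rand(1024, 3 * 128 * 128).astype(np.float32)
    y = np.random.randint(0, 10, size=(1024, 1))

    # Shape values taken from a numpy array (or an HDF5 attribute) are
    # numpy.int64; casting each to a built-in int avoids the pycuda errors.
    lshape = tuple(int(d) for d in np.array([3, 128, 128]))

    train_set = ArrayIterator(X, y, nclass=10, lshape=lshape)

For HDF5Iterator the shape comes from file['input'].attrs['lshape'], so the same cast has to be applied to the attribute values before they reach the layers.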

Environment: Python 3.5.2, Neon 2.6.0 (f9d771b), CUDA 8.0, GPU K40s, Ubuntu 16.04, Boost 1.58.0, PyCUDA 2017.1.1, NumPy 1.13.1.

@baojun-nervana
Contributor

@zhiltsov-max Agreed. A type check is needed here.
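
For illustration, the kind of normalization that could happen once in the layer/kernel setup might look like the sketch below (the helper name is hypothetical, not an existing Neon function):

    import numpy as np

    def as_builtin_ints(*dims):
        # Cast dimension/stride/pad values (possibly numpy integer scalars)
        # to plain Python ints, so downstream pycuda calls receive types that
        # convert cleanly to C unsigned int / unsigned long.
        return tuple(int(d) for d in dims)

    # Example with the mixed int / numpy.int64 values from the parameter dump above.
    N, C, K, H, W = as_builtin_ints(128, np.int64(3), 32, np.int64(128), np.int64(128))
    assert all(type(v) is int for v in (N, C, K, H, W))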
