
[BUG] Elementwise value assignment silently fails after in-place array downsize using rows() #3534

Open
warnellg opened this issue Feb 16, 2024 · 1 comment · May be fixed by #3538
warnellg commented Feb 16, 2024

In some scenarios, after downsizing an array with an in-place call to rows() (i.e., overwriting an array with a subset of its own rows), elementwise value assignment silently fails: the element's value does not change, and there is no compile-time or runtime error.

Description

  • Did you build ArrayFire yourself or did you use the official installers: Built myself.
  • Which backend is experiencing this issue? CUDA.
  • Do you have a workaround? No.
  • Can the bug be reproduced reliably on your system? Yes.
  • A clear and concise description of what you expected to happen: I expected elementwise value assignment to succeed (e.g., in the example below, Drows(0,0) should end up with the value 1234). Failing that, I expected a compile-time or runtime error.
  • Run your executable with AF_TRACE=all and AF_PRINT_ERRORS=1 environment variables set:
# AF_TRACE=all AF_PRINT_ERRORS=1 aftest
[platform][1708117940][9998] [ /tmp/arrayfire/src/backend/common/DependencyModule.cpp:102 ] Attempting to load: libforge.so
[platform][1708117940][9998] [ /tmp/arrayfire/src/backend/common/DependencyModule.cpp:107 ] Unable to open forge
[platform][1708117942][9998] [ /tmp/arrayfire/src/backend/cuda/device_manager.cpp:497 ] CUDA Driver supports up to CUDA 12.3.0 ArrayFire CUDA Runtime 11.3.0
[platform][1708117942][9998] [ /tmp/arrayfire/src/backend/cuda/device_manager.cpp:478 ] CUDA driver version(12.3.0) not part of the CudaToDriverVersion array. Please create an issue or a pull request on the ArrayFire repository to update the CudaToDriverVersion variable with this version of the CUDA runtime.

[platform][1708117942][9998] [ /tmp/arrayfire/src/backend/cuda/device_manager.cpp:566 ] Found 1 CUDA devices
[platform][1708117942][9998] [ /tmp/arrayfire/src/backend/cuda/device_manager.cpp:588 ] Found device: NVIDIA RTX A3000 Laptop GPU (sm_86) (5.80 GB | ~12187.5 GFLOPs | 32 SMs)
[platform][1708117942][9998] [ /tmp/arrayfire/src/backend/cuda/device_manager.cpp:652 ] AF_CUDA_DEFAULT_DEVICE: 
[platform][1708117942][9998] [ /tmp/arrayfire/src/backend/cuda/device_manager.cpp:670 ] Default device: 0(NVIDIA RTX A3000 Laptop GPU)
[mem][1708117942][9998] [ /tmp/arrayfire/src/backend/common/DefaultMemoryManager.cpp:127 ] memory[0].max_bytes: 4.8 GB
[mem][1708117942][9998] [ /tmp/arrayfire/src/backend/cuda/memory.cpp:155 ] nativeAlloc:    1 KB 0x7fe876800000
[jit][1708117942][9998] [ /tmp/arrayfire/src/backend/cuda/compile_module.cpp:472 ] {14966320269747309860 : loaded from /root/.arrayfire/KER14966320269747309860_CU_86_AF_39.bin for NVIDIA RTX A3000 Laptop GPU }
[kernel][1708117942][9998] [ /tmp/arrayfire/src/backend/cuda/Kernel.hpp:37 ] Launching arrayfire::cuda::range<float>: Blocks: [1, 1, 1] Threads: [32, 8, 1] Shared Memory: 0
Drows
[5 5 1 1]
[mem][1708117942][9998] [ /tmp/arrayfire/src/backend/cuda/memory.cpp:155 ] nativeAlloc:    1 KB 0x7fe876800400
[jit][1708117942][9998] [ /tmp/arrayfire/src/backend/cuda/compile_module.cpp:472 ] {291210446400920389   : loaded from /root/.arrayfire/KER291210446400920389_CU_86_AF_39.bin for NVIDIA RTX A3000 Laptop GPU }
[kernel][1708117942][9998] [ /tmp/arrayfire/src/backend/cuda/Kernel.hpp:37 ] Launching arrayfire::cuda::transpose<float,false,false>: Blocks: [1, 1, 1] Threads: [32, 8, 1] Shared Memory: 0
    0.0000     0.0000     0.0000     0.0000     0.0000 
    1.0000     1.0000     1.0000     1.0000     1.0000 
    2.0000     2.0000     2.0000     2.0000     2.0000 
    3.0000     3.0000     3.0000     3.0000     3.0000 
    4.0000     4.0000     4.0000     4.0000     4.0000 

[jit][1708117942][9998] [ /tmp/arrayfire/src/backend/cuda/compile_module.cpp:472 ] {8447298384643760287  : loaded from /root/.arrayfire/KER8447298384643760287_CU_86_AF_39.bin for NVIDIA RTX A3000 Laptop GPU }
[kernel][1708117942][9998] [ /tmp/arrayfire/src/backend/cuda/Kernel.hpp:37 ] Launching arrayfire::cuda::memCopy<float>: Blocks: [1, 5, 1] Threads: [32, 1, 1] Shared Memory: 0
[jit][1708117942][9998] [ /tmp/arrayfire/src/backend/cuda/compile_module.cpp:472 ] {8641389271879371835  : loaded from /root/.arrayfire/KER8641389271879371835_CU_86_AF_39.bin for NVIDIA RTX A3000 Laptop GPU }
[kernel][1708117942][9998] [ /tmp/arrayfire/src/backend/cuda/jit.cpp:512 ] Launching : Dims: [1,1,1,1] Blocks: [1 1 1] Threads: [128 1 1] threads: 128
Drows
[4 5 1 1]
[kernel][1708117942][9998] [ /tmp/arrayfire/src/backend/cuda/Kernel.hpp:37 ] Launching arrayfire::cuda::transpose<float,false,false>: Blocks: [1, 1, 1] Threads: [32, 8, 1] Shared Memory: 0
    0.0000     0.0000     0.0000     0.0000     0.0000 
    1.0000     1.0000     1.0000     1.0000     1.0000 
    2.0000     2.0000     2.0000     2.0000     2.0000 
    3.0000     3.0000     3.0000     3.0000     3.0000

Reproducible Code and/or Steps

Program/output (note the (0,0) element of the last print):

#include <arrayfire.h>

using namespace af;

int main(int argc, char **argv)
{
    array Drows = range(dim4(5,5));
    af_print(Drows);

    Drows = Drows.rows(0,3);
    Drows(0,0) = 1234;
    af_print(Drows);

    return 0;
}


# aftest
Drows
[5 5 1 1]
    0.0000     0.0000     0.0000     0.0000     0.0000 
    1.0000     1.0000     1.0000     1.0000     1.0000 
    2.0000     2.0000     2.0000     2.0000     2.0000 
    3.0000     3.0000     3.0000     3.0000     3.0000 
    4.0000     4.0000     4.0000     4.0000     4.0000 

Drows
[4 5 1 1]
    0.0000     0.0000     0.0000     0.0000     0.0000 
    1.0000     1.0000     1.0000     1.0000     1.0000 
    2.0000     2.0000     2.0000     2.0000     2.0000 
    3.0000     3.0000     3.0000     3.0000     3.0000
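My guess at the failure shape (purely speculative; I don't know ArrayFire's internals): if rows() yields a lazily evaluated view, an elementwise write that lands in a materialized cache could be discarded when a later operation (such as af_print) re-evaluates the underlying expression. The following self-contained C++ sketch, using a hypothetical LazyArray type that is not part of ArrayFire, shows how that write-then-re-evaluate pattern silently loses the write:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical lazy-array model (an assumed mechanism, not ArrayFire's
// actual internals): the array carries an expression that recomputes its
// data from the source every time it is evaluated.
struct LazyArray {
    std::function<std::vector<float>()> recompute;  // the lazy expression
    std::vector<float> data;                        // materialized values
    bool materialized = false;

    void eval() {  // (re)materialize from the expression
        data = recompute();
        materialized = true;
    }
    void set(std::size_t i, float v) {  // elementwise write into the cache
        if (!materialized) eval();
        data[i] = v;
    }
};

// Reproduces the suspected failure shape: a write into the materialized
// cache is discarded when the expression is re-evaluated.
float lost_write_demo() {
    LazyArray a{[] { return std::vector<float>{0, 1, 2, 3}; }};
    a.set(0, 1234.0f);  // write lands in the materialized cache
    a.eval();           // a later evaluation (e.g. a print) recomputes...
    return a.data[0];   // ...and the write is silently gone (0.0f)
}
```

If something like this is happening, it would also explain why no error is raised: the write itself succeeds, but its destination is no longer the data that the print reads.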

Interestingly, initializing the array as a copy of an existing array yields the expected behavior:

#include <arrayfire.h>

using namespace af;

int main(int argc, char **argv)
{
    array D = range(dim4(5,5));
    array Drows = D;
    af_print(Drows);

    Drows = Drows.rows(0,3);
    Drows(0,0) = 1234;
    af_print(Drows);

    return 0;
}


# aftest
Drows
[5 5 1 1]
    0.0000     0.0000     0.0000     0.0000     0.0000 
    1.0000     1.0000     1.0000     1.0000     1.0000 
    2.0000     2.0000     2.0000     2.0000     2.0000 
    3.0000     3.0000     3.0000     3.0000     3.0000 
    4.0000     4.0000     4.0000     4.0000     4.0000 

Drows
[4 5 1 1]
 1234.0000     0.0000     0.0000     0.0000     0.0000 
    1.0000     1.0000     1.0000     1.0000     1.0000 
    2.0000     2.0000     2.0000     2.0000     2.0000 
    3.0000     3.0000     3.0000     3.0000     3.0000

EDIT: Additionally, here is the full trace output for this succeeding case (note the additional [mem] line before the final af_print):

# AF_TRACE=all AF_PRINT_ERRORS=1 aftest
[platform][1708119537][10870] [ /tmp/arrayfire/src/backend/common/DependencyModule.cpp:102 ] Attempting to load: libforge.so
[platform][1708119537][10870] [ /tmp/arrayfire/src/backend/common/DependencyModule.cpp:107 ] Unable to open forge
[platform][1708119537][10870] [ /tmp/arrayfire/src/backend/cuda/device_manager.cpp:497 ] CUDA Driver supports up to CUDA 12.3.0 ArrayFire CUDA Runtime 11.3.0
[platform][1708119537][10870] [ /tmp/arrayfire/src/backend/cuda/device_manager.cpp:478 ] CUDA driver version(12.3.0) not part of the CudaToDriverVersion array. Please create an issue or a pull request on the ArrayFire repository to update the CudaToDriverVersion variable with this version of the CUDA runtime.

[platform][1708119537][10870] [ /tmp/arrayfire/src/backend/cuda/device_manager.cpp:566 ] Found 1 CUDA devices
[platform][1708119537][10870] [ /tmp/arrayfire/src/backend/cuda/device_manager.cpp:588 ] Found device: NVIDIA RTX A3000 Laptop GPU (sm_86) (5.80 GB | ~12187.5 GFLOPs | 32 SMs)
[platform][1708119537][10870] [ /tmp/arrayfire/src/backend/cuda/device_manager.cpp:652 ] AF_CUDA_DEFAULT_DEVICE: 
[platform][1708119537][10870] [ /tmp/arrayfire/src/backend/cuda/device_manager.cpp:670 ] Default device: 0(NVIDIA RTX A3000 Laptop GPU)
ArrayFire v3.9.0 (CUDA, 64-bit Linux, build b59a1ae53)
Platform: CUDA Runtime 11.3, Driver: 545.23.08
[0] NVIDIA RTX A3000 Laptop GPU, 5938 MB, CUDA Compute 8.6
[mem][1708119537][10870] [ /tmp/arrayfire/src/backend/common/DefaultMemoryManager.cpp:127 ] memory[0].max_bytes: 4.8 GB
[mem][1708119537][10870] [ /tmp/arrayfire/src/backend/cuda/memory.cpp:155 ] nativeAlloc:    1 KB 0x7fe1c8800000
[jit][1708119537][10870] [ /tmp/arrayfire/src/backend/cuda/compile_module.cpp:472 ] {14966320269747309860 : loaded from /root/.arrayfire/KER14966320269747309860_CU_86_AF_39.bin for NVIDIA RTX A3000 Laptop GPU }
[kernel][1708119537][10870] [ /tmp/arrayfire/src/backend/cuda/Kernel.hpp:37 ] Launching arrayfire::cuda::range<float>: Blocks: [1, 1, 1] Threads: [32, 8, 1] Shared Memory: 0
Drows
[5 5 1 1]
[mem][1708119537][10870] [ /tmp/arrayfire/src/backend/cuda/memory.cpp:155 ] nativeAlloc:    1 KB 0x7fe1c8800400
[jit][1708119537][10870] [ /tmp/arrayfire/src/backend/cuda/compile_module.cpp:472 ] {291210446400920389   : loaded from /root/.arrayfire/KER291210446400920389_CU_86_AF_39.bin for NVIDIA RTX A3000 Laptop GPU }
[kernel][1708119537][10870] [ /tmp/arrayfire/src/backend/cuda/Kernel.hpp:37 ] Launching arrayfire::cuda::transpose<float,false,false>: Blocks: [1, 1, 1] Threads: [32, 8, 1] Shared Memory: 0
    0.0000     0.0000     0.0000     0.0000     0.0000 
    1.0000     1.0000     1.0000     1.0000     1.0000 
    2.0000     2.0000     2.0000     2.0000     2.0000 
    3.0000     3.0000     3.0000     3.0000     3.0000 
    4.0000     4.0000     4.0000     4.0000     4.0000 

[jit][1708119537][10870] [ /tmp/arrayfire/src/backend/cuda/compile_module.cpp:472 ] {8447298384643760287  : loaded from /root/.arrayfire/KER8447298384643760287_CU_86_AF_39.bin for NVIDIA RTX A3000 Laptop GPU }
[kernel][1708119537][10870] [ /tmp/arrayfire/src/backend/cuda/Kernel.hpp:37 ] Launching arrayfire::cuda::memCopy<float>: Blocks: [1, 5, 1] Threads: [32, 1, 1] Shared Memory: 0
[jit][1708119537][10870] [ /tmp/arrayfire/src/backend/cuda/compile_module.cpp:472 ] {8641389271879371835  : loaded from /root/.arrayfire/KER8641389271879371835_CU_86_AF_39.bin for NVIDIA RTX A3000 Laptop GPU }
[kernel][1708119537][10870] [ /tmp/arrayfire/src/backend/cuda/jit.cpp:512 ] Launching : Dims: [1,1,1,1] Blocks: [1 1 1] Threads: [128 1 1] threads: 128
Drows
[4 5 1 1]
[mem][1708119537][10870] [ /tmp/arrayfire/src/backend/cuda/memory.cpp:155 ] nativeAlloc:    1 KB 0x7fe1c8800800
[kernel][1708119537][10870] [ /tmp/arrayfire/src/backend/cuda/Kernel.hpp:37 ] Launching arrayfire::cuda::transpose<float,false,false>: Blocks: [1, 1, 1] Threads: [32, 8, 1] Shared Memory: 0
 1234.0000     0.0000     0.0000     0.0000     0.0000 
    1.0000     1.0000     1.0000     1.0000     1.0000 
    2.0000     2.0000     2.0000     2.0000     2.0000 
    3.0000     3.0000     3.0000     3.0000     3.0000

System Information

Please provide the following information:

  1. ArrayFire version: 3.9.0
  2. Devices installed on the system: NVIDIA RTX A3000 Laptop GPU
  3. Output from the af::info() function if applicable.
ArrayFire v3.9.0 (CUDA, 64-bit Linux, build b59a1ae53)
Platform: CUDA Runtime 11.3, Driver: 545.23.08
[0] NVIDIA RTX A3000 Laptop GPU, 5938 MB, CUDA Compute 8.6
  4. Output from the following scripts (Linux):
# bash afbugreport.sh 
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 20.04.6 LTS
Release:	20.04
Codename:	focal
name, memory.total [MiB], driver_version
NVIDIA RTX A3000 Laptop GPU, 6144 MiB, 545.23.08
rocm-smi not found.
clinfo not found.

Checklist

  • Using the latest available ArrayFire release
  • GPU drivers are up to date
@warnellg warnellg added the bug label Feb 16, 2024
@syurkevi syurkevi assigned syurkevi and umar456 and unassigned syurkevi Feb 16, 2024
willyborn (Contributor) commented:

As a current workaround, you can add .copy() after the rows() call:

#include <arrayfire.h>

using namespace af;

int main(int argc, char **argv)
{
    array D = range(dim4(5,5));
    array Drows = D;
    af_print(Drows);

    Drows = Drows.rows(0,3).copy();
    Drows(0,0) = 1234;
    af_print(Drows);

    return 0;
}
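Presumably .copy() forces the sliced result into its own uniquely owned buffer, so the subsequent elementwise write has a concrete destination. A plain-C++ sketch of that difference, using a hypothetical Span type (an assumed shared-buffer model, not ArrayFire's internals):

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical shared-buffer model: a view aliases its parent's storage,
// while copy() detaches into uniquely owned storage.
struct Span {
    std::shared_ptr<std::vector<float>> buf;
    std::size_t offset;
    std::size_t size;

    static Span make(std::vector<float> v) {
        std::size_t n = v.size();
        return {std::make_shared<std::vector<float>>(std::move(v)), 0, n};
    }
    Span rows(std::size_t first, std::size_t last) const {  // shared view
        return {buf, offset + first, last - first + 1};
    }
    Span copy() const {  // detached deep copy (the workaround)
        auto owned = std::make_shared<std::vector<float>>(
            buf->begin() + offset, buf->begin() + offset + size);
        return {owned, 0, owned->size()};
    }
    float& at(std::size_t i) { return (*buf)[offset + i]; }
};

// After a deep copy, a write is guaranteed to land in the copy's own
// buffer and cannot alias (or miss) the original storage.
bool copy_detaches() {
    Span a = Span::make({0, 1, 2, 3, 4});
    Span b = a.rows(0, 3).copy();  // force a uniquely owned buffer
    b.at(0) = 1234.0f;             // write goes to b's own storage...
    return a.at(0) == 0.0f && b.at(0) == 1234.0f;  // ...a is untouched
}
```

Under this model, the extra [mem] nativeAlloc line in the succeeding trace above is consistent with a fresh buffer being allocated for the written-to array.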

On Feb 28, 2024, willyborn linked pull request #3538, which may close this issue.