Program terminated when trying to get a kernel_map #579

Open
kirilllzaitsev opened this issue Feb 8, 2024 · 0 comments

kirilllzaitsev commented Feb 8, 2024

Describe the bug
The program terminates with a segmentation fault when cm.kernel_map is called on the coordinate manager of a tensor returned by MinkowskiPruning. In the reproduction below the pruning mask is all False, so MinkowskiPruning generates an empty SparseTensor before the crash.

To Reproduce

  • Minimal reproducible code:
import MinkowskiEngine as ME
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")


pruning = ME.MinkowskiPruning()
alpha = 1
x = ME.SparseTensor(
    features=torch.rand(10, 3),
    coordinates=torch.randint(0, 100, (10, 3)).int(),
    device=device,
)
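y = torch.rand(10, 1)
# torch.rand samples from [0, 1), so with alpha = 1 the mask is all False
keep = (y > alpha).squeeze().to(x.device)
# pruning with an all-False mask yields an empty SparseTensor
out = pruning(x, keep)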

cm = out.coordinate_manager

batched_target = ME.SparseTensor(
    features=torch.rand(4, 3),
    coordinates=torch.randint(0, 100, (4, 3)).int(),
    coordinate_manager=None,
    quantization_mode=ME.SparseTensorQuantizationMode.UNWEIGHTED_AVERAGE,
    device=device,
)
target_key, _ = cm.insert_and_map(batched_target.C, string_id="target")

strided_target_key = cm.stride(target_key, out.tensor_stride[0])
kernel_map = cm.kernel_map(
    out.coordinate_map_key,
    strided_target_key,
    kernel_size=1,
)

The output:

/opt/miniconda3/envs/tr/lib/python3.8/site-packages/MinkowskiEngine/__init__.py:36: UserWarning: The environment variable `OMP_NUM_THREADS` not set. MinkowskiEngine will automatically set `OMP_NUM_THREADS=16`. If you want to set `OMP_NUM_THREADS` manually, please export it on the command line before running a python script. e.g. `export OMP_NUM_THREADS=12; python your_program.py`. It is recommended to set it below 24.
  warnings.warn(
/tmp/pip-req-build-xctt5_hh/src/pruning_gpu.cu:132, (true) MinkowskiPruning: Generating an empty SparseTensor
[1]    17221 segmentation fault (core dumped)  python a.py
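
The "Generating an empty SparseTensor" warning follows from the mask in the script: the samples lie in [0, 1), so none of them exceeds alpha = 1. A minimal sketch of that step, assuming only the standard library so it runs without MinkowskiEngine or a GPU (random.random() stands in for torch.rand here; both sample from [0, 1)):

```python
import random

alpha = 1  # same threshold as in the reproduction script

# random.random() is a stand-in for torch.rand; both sample from [0, 1)
y = [random.random() for _ in range(10)]

# Element-wise comparison, mirroring `keep = (y > alpha)` in the script
keep = [value > alpha for value in y]

print(any(keep))  # False: no sample can exceed 1, so every point is pruned
```

Lowering alpha below 1 would let some points survive, which may help separate the empty-tensor case from other causes of the crash.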

Expected behavior

The program should have exited normally.


Desktop (please complete the following information):

  • OS: Ubuntu 22.04 (the diagnostics below report Ubuntu 20.04.6 LTS)
  • Python version: 3.10.13
  • Pytorch version: 2.1.2
  • CUDA version: 12.3
  • NVIDIA Driver version: 545.23.08
  • Minkowski Engine version: 0.5.4
/MinkowskiEngine/MinkowskiEngine/__init__.py:36: UserWarning: The environment variable `OMP_NUM_THREADS` not set. MinkowskiEngine will automatically set `OMP_NUM_THREADS=16`. If you want to set `OMP_NUM_THREADS` manually, please export it on the command line before running a python script. e.g. `export OMP_NUM_THREADS=12; python your_program.py`. It is recommended to set it below 24.
  warnings.warn(
==========System==========
Linux-5.15.0-89-generic-x86_64-with-glibc2.31
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=20.04
DISTRIB_CODENAME=focal
DISTRIB_DESCRIPTION="Ubuntu 20.04.6 LTS"
3.10.13 (main, Sep 11 2023, 13:44:35) [GCC 11.2.0]
==========Pytorch==========
2.1.2+cu121
torch.cuda.is_available(): True
==========NVIDIA-SMI==========
/usr/bin/nvidia-smi
Driver Version 545.23.08
CUDA Version 12.3
VBIOS Version 95.06.25.00.56
Image Version G002.0000.00.03
GSP Firmware Version N/A
==========NVCC==========
/usr/local/cuda-12.3/bin/nvcc
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Fri_Nov__3_17:16:49_PDT_2023
Cuda compilation tools, release 12.3, V12.3.103
Build cuda_12.3.r12.3/compiler.33492891_0
==========CC==========
/usr/bin/c++
c++ (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

==========MinkowskiEngine==========
0.5.4
MinkowskiEngine compiled with CUDA Support: True
NVCC version MinkowskiEngine is compiled: 12030
CUDART version MinkowskiEngine is compiled: 12030 

Additional context

My original problem also involves cm.kernel_map, but in the training pipeline it produces the following output instead:

0:05,  1.54it/s]/tmp/pip-req-build-xctt5_hh/src/pruning_gpu.cu:132, (true) MinkowskiPruning: Generating an empty SparseTensor
/opt/miniconda3/envs/tr/lib/python3.8/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

I don't know how this is related to the segmentation fault above; my guess is that both failures have the same cause. Happy to try reproducing the exact original error if needed.

The same problem happens in Python 3.8.15 as well.
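
Until the crash itself is understood, one possible stopgap is to skip the pruning and kernel map step whenever the mask keeps no points. This is only a sketch under assumptions: keeps_any and run_if_nonempty are hypothetical helper names, not part of MinkowskiEngine, and compute stands for a zero-argument callable that would wrap the pruning and cm.kernel_map calls from the script above. It avoids the empty-tensor path rather than fixing the segmentation fault.

```python
from typing import Callable, Iterable, Optional, TypeVar

T = TypeVar("T")


def keeps_any(keep: Iterable) -> bool:
    """Return True if at least one entry of the pruning mask is truthy."""
    return any(bool(flag) for flag in keep)


def run_if_nonempty(keep: Iterable, compute: Callable[[], T]) -> Optional[T]:
    """Call `compute` only when the mask keeps at least one point.

    `compute` is a hypothetical zero-argument callable that would wrap the
    pruning and kernel map calls from the reproduction script.
    """
    if not keeps_any(keep):
        return None  # skip the step instead of working on an empty tensor
    return compute()


# Stand-in usage with plain lists; no MinkowskiEngine call is made here.
skipped = run_if_nonempty([False, False, False], lambda: "kernel_map")
computed = run_if_nonempty([False, True, False], lambda: "kernel_map")
print(skipped, computed)  # None kernel_map
```

With a torch mask, bool(keep.any()) would be the more direct check; the generic loop is used here only so the sketch runs without torch.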
