
Add PatchMatchNet module for MVS and calculation of normals from depth #1129

Open · wants to merge 20 commits into base: main
Conversation

anmatako
Contributor

This work mainly integrates PatchMatchNet functionality into colmap using a TorchScript pre-trained module. Additionally, it introduces functionality to calculate normal maps from depth maps, since PatchMatchNet evaluation does not create normal maps as part of its process. More details about the changes:

  • Colmap can compile with Torch support to enable PatchMatchNet. For this, the pre-compiled LibTorch library needs to be downloaded from https://pytorch.org/ for the desired configuration (GPU or CPU-only) and the archive extracted under <colmap-root>/lib/, creating a libtorch subfolder. CMake should then be able to find the dependency and set the correct compilation flags.
  • PatchMatchNet can now be enabled from patch_match_stereo by setting the mvs_module_path option to a valid TorchScript module. One such module is included as part of this PR in <colmap-root>/mvs-modules/patchmatchnet-module.pt
    • The TorchScript interface is fairly generic, using the following input structure: (images: List[Tensor], intrinsics: Tensor, extrinsics: Tensor, depth_params: Tensor), with the output being a tuple of (depth: Tensor, confidence: Tensor). Thus any module that conforms to this input/output format for forward evaluation can be used instead.
  • Functionality of standard patch-match remains unchanged. There is now an inheritance structure used to select between standard and PMNet processing.
  • Normal maps are no longer required for stereo fusion. If missing, they are calculated from the depth maps themselves. This is needed to accommodate PMNet processing, which does not produce normal maps as part of its estimation work.
    • Note that the use of calculated normal maps can be forced even for standard patch-match processing through the new stereo fusion option --StereoFusion.calculate_normals.
  • Confidence maps can now be used for stereo fusion; they are optional. If missing, a confidence of 1 is assumed everywhere. This makes use of the confidence maps created as part of PMNet estimation.
  • A new method for finding related images for fusion based on triangulation scoring is introduced and can be enabled with the option --StereoFusion.use_triangulation_scoring. This is included for parity with PatchMatchNet, which uses this method for finding related images instead of the colmap default (useful for comparing results between colmap and Python).
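For illustration, the forward calling convention described above can be mimicked by any module. Below is a minimal shape-checking mock (a sketch, not part of this PR): numpy arrays stand in for torch tensors, and the per-view [min_depth, max_depth] layout of depth_params is an assumption made here for the example.

```python
import numpy as np

class MockMVSModule:
    """Illustrative stand-in for a TorchScript MVS module.

    forward(images, intrinsics, extrinsics, depth_params) -> (depth, confidence)
      images:       list of (C, H, W) arrays; the first entry is the reference view
      intrinsics:   (N, 3, 3) camera matrices
      extrinsics:   (N, 4, 4) world-to-camera poses
      depth_params: (N, 2) assumed here to hold [min_depth, max_depth] per view
    """

    def forward(self, images, intrinsics, extrinsics, depth_params):
        n = len(images)
        assert intrinsics.shape == (n, 3, 3)
        assert extrinsics.shape == (n, 4, 4)
        assert depth_params.shape == (n, 2)
        _, h, w = images[0].shape
        # A real module would run its network here; the mock returns a
        # constant depth at the near plane and full confidence.
        depth = np.full((h, w), depth_params[0, 0], dtype=np.float32)
        confidence = np.ones((h, w), dtype=np.float32)
        return depth, confidence
```

Any TorchScript module whose forward evaluation matches this signature could then be selected via --PatchMatchStereo.mvs_module_path.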

Antonios Matakos and others added 15 commits February 10, 2021 20:48
…stereo fusion

- Normal maps are optional during stereo fusion. If a map is not found, one is estimated from the depth map using the cross-product method.
- Confidence maps are now also used in stereo fusion, along with a threshold specified in the options. If the confidence probability is below the threshold, the specific depth is ignored. If the confidence map does not exist, a default map with probability 1.0 is used.
- Introduced an alternative calculation of overlapping images in the `Model` class based on triangulation score, instead of using the median triangulation angle and sorting by number of common points. This calculation can be enabled for fusion with the new `use_triangulation_scoring` option. The new method is what MVSNet and its variants use when processing a COLMAP workspace.
- Added a flag to allow use of calculated normals (from depth maps) in stereo fusion, instead of the ones estimated from patch-match.
- Added a flag to control whether the normal maps should be renormalized when rescaled. This completes earlier work that was supposed to avoid normalization of normals during fusion.
  - As a corollary, we now have to explicitly normalize the normal vectors used to calculate the angular difference when filtering points during fusion.
- Added a utility method `SetSlice` to the `Mat` class to allow setting an entire normal-map entry more conveniently.
- Minor cleanup to fix compilation warnings.

Merged PR 3157: Include confidence maps in the MVS setup

- Confidence maps are now part of the MVS setup and participate as dependencies in undistortion and batching
- Calculated normal and confidence maps are written out when using a cached workspace to avoid redoing the calculations when a map gets evicted from the cache
- Minor change in normal map calculation to avoid using pixels with invalid (<=0) depth

Merged PR 3356: Robust estimation of normals using planes

- Improved calculation of normals from depth maps using plane estimation with a configurable window around the pixel of interest
- Fixed a bug in the normal calculation that was using pixel coordinates instead of local-frame coordinates (this needed multiplication with the inverse of the camera intrinsics matrix)
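The plane-based estimation can be illustrated with a total-least-squares fit over the window (a sketch, not the PR's implementation; it assumes the window's pixels have already been back-projected to camera-frame 3D points with the inverse intrinsics, per the coordinate fix above):

```python
import numpy as np

def normal_from_plane_fit(points):
    """Fit a plane to camera-frame 3D points (k, 3) and return its unit
    normal: the eigenvector of the centered scatter matrix with the
    smallest eigenvalue (classic total least squares)."""
    centered = points - points.mean(axis=0)
    # np.linalg.eigh returns eigenvalues in ascending order, so column 0
    # of the eigenvector matrix is the plane normal direction.
    _, vecs = np.linalg.eigh(centered.T @ centered)
    n = vecs[:, 0]
    # Orient toward the camera at the origin (against the viewing ray
    # through the patch centroid).
    return -n if np.dot(n, points.mean(axis=0)) > 0 else n
```

Compared with the cross-product method, fitting over a window averages out depth noise at the cost of smoothing across depth discontinuities.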

Cherry picked from !3327

Remove option to re-normalize normals
Initial implementation of PatchmatchNet evaluation modules. This is currently a standalone "library" not connected to any other parts of ColMap yet.

Merged PR 3299: Add PatchmatchNet processing through TorchScript module

- Added a PatchMatchNet implementation as an alternative to standard patch-match
  - The new functionality is controlled by the new option to load a TorchScript module from file: `--PatchMatchStereo.mvs_module_path`
- Created an inheritance structure with `PatchMatch` (base class) and `PatchMatchCuda` and `PatchMatchNet` (derived) to facilitate the choice of processing method

Remove LibTorch components

Merged PR 3450: Update PatchMatchNet module and interface

Fixes two issues from the previous module:
- Sizes are now handled internally to ensure each dimension is a multiple of 8
- Images are a vector of tensors to allow different sizes between reference and source images
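The multiple-of-8 size handling can be illustrated with a pad-then-crop wrapper around evaluation. This is a hypothetical sketch; the module handles sizes internally and may do it differently.

```python
import numpy as np

def pad_to_multiple(image, multiple=8):
    """Pad an (H, W, ...) image on the bottom/right so that H and W become
    multiples of `multiple`; returns the padded image and the original size."""
    h, w = image.shape[:2]
    ph = (-h) % multiple  # rows to add
    pw = (-w) % multiple  # columns to add
    pad = [(0, ph), (0, pw)] + [(0, 0)] * (image.ndim - 2)
    return np.pad(image, pad, mode="edge"), (h, w)

def crop_to_size(image, size):
    """Undo pad_to_multiple after inference (e.g. on the predicted depth map)."""
    h, w = size
    return image[:h, :w]
```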
This reverts commit 57113e3.
-#ifndef CUDA_ENABLED
+#if !defined(CUDA_ENABLED) && !defined(TORCH_ENABLED)
   std::cerr << "ERROR: Dense stereo reconstruction requires CUDA, which is not "
                "available on your system."
Contributor Author

The logic here is changed so that we now fail immediately only if both CUDA and Torch are missing. If either is present, we can do patch-match through the existing method or through PMNet.

Antonios Matakos and others added 2 commits February 22, 2021 15:15
@Dawars
Contributor

Dawars commented Feb 28, 2021

I've been trying to compile this but I get the following error:

-- Caffe2: CUDA detected: 11.2
-- Caffe2: CUDA nvcc is: /usr/local/cuda/bin/nvcc
-- Caffe2: CUDA toolkit directory: /usr/local/cuda
-- Caffe2: Header version is: 11.2
-- Found cuDNN: v8.1.0  (include: /usr/include, library: /usr/lib/x86_64-linux-gnu/libcudnn.so)
-- Autodetected CUDA architecture(s):  6.1
-- Added CUDA NVCC flags for: -gencode;arch=compute_61,code=sm_61
-- Build type specified as Release
-- Enabling SIMD support
-- Enabling OpenMP support
-- Disabling interprocedural optimization
-- Autodetected CUDA architecture(s):  6.1
-- Enabling CUDA support (version: 11.2, archs: sm_61)
-- Enabling LibTorch support
-- Enabling OpenGL support
-- Disabling profiling support
-- Enabling CGAL support
-- Configuring done
CMake Error in src/CMakeLists.txt:
  Imported target "torch" includes non-existent path

    "MKL_INCLUDE_DIR-NOTFOUND"

  in its INTERFACE_INCLUDE_DIRECTORIES.  Possible reasons include:

  * The path was deleted, renamed, or moved to another location.

  * An install or uninstall procedure did not complete successfully.

  * The installation package was faulty and references files it does not
  provide.


Which libtorch/cuda version are you using?
I've tried CUDA 11.2, cuDNN v8.1.0, MKL 2020.04, and libtorch 1.7.1 on a 1080Ti. The same happens with CUDA 10.2 and cuDNN v7.

@anmatako
Contributor Author


@Dawars It seems that LibTorch requires MKL as a dependency even though it already contains the headers and binaries in the LibTorch package itself. See if installing MKL on your system would resolve your issue.

On my end I made some modifications in the CMake configurations of LibTorch itself to make things work. I'll see if I can make changes in colmap CMake instead and have things work with vanilla LibTorch.

For reference here's a diff between my modified LibTorch and the vanilla one (LibTorch 1.7.1 for CUDA 10.1 with CUDNN 7.6.0)

diff --git "a/c:\\Users\\anmatako\\Downloads\\libtorch/include/ATen/Parallel.h" "b/lib\\libtorch/include/ATen/Parallel.h"
index 9e2f9be..cc652f2 100644
--- "a/c:\\Users\\anmatako\\Downloads\\libtorch/include/ATen/Parallel.h"
+++ "b/lib\\libtorch/include/ATen/Parallel.h"
@@ -38,7 +38,7 @@ namespace internal {

 // Initialise num_threads lazily at first parallel call
 inline CAFFE2_API void lazy_init_num_threads() {
-  thread_local bool init = false;
+  static thread_local bool init = false;
   if (C10_UNLIKELY(!init)) {
     at::init_num_threads();
     init = true;
diff --git "a/c:\\Users\\anmatako\\Downloads\\libtorch/include/c10/util/StringUtil.h" "b/lib\\libtorch/include/c10/util/StringUtil.h"
index d2744f1..79da0ae 100644
--- "a/c:\\Users\\anmatako\\Downloads\\libtorch/include/c10/util/StringUtil.h"
+++ "b/lib\\libtorch/include/c10/util/StringUtil.h"
@@ -74,7 +74,7 @@ struct _str_wrapper<const char*> final {
 template<>
 struct _str_wrapper<> final {
   static const std::string& call() {
-    thread_local const std::string empty_string_literal;
+    static thread_local const std::string empty_string_literal;
     return empty_string_literal;
   }
 };
diff --git "a/c:\\Users\\anmatako\\Downloads\\libtorch/share/cmake/Caffe2/public/cuda.cmake" "b/lib\\libtorch/share/cmake/Caffe2/public/cuda.cmake"
index 8b60915..041e19b 100644
--- "a/c:\\Users\\anmatako\\Downloads\\libtorch/share/cmake/Caffe2/public/cuda.cmake"
+++ "b/lib\\libtorch/share/cmake/Caffe2/public/cuda.cmake"
@@ -480,7 +480,7 @@ endforeach()
 # Set C++14 support
 set(CUDA_PROPAGATE_HOST_FLAGS_BLACKLIST "-Werror")
 if(MSVC)
-  list(APPEND CUDA_NVCC_FLAGS "--Werror" "cross-execution-space-call")
+  # list(APPEND CUDA_NVCC_FLAGS "--Werror" "cross-execution-space-call")
   list(APPEND CUDA_NVCC_FLAGS "--no-host-device-move-forward")
 else()
   list(APPEND CUDA_NVCC_FLAGS "-std=c++14")
diff --git "a/c:\\Users\\anmatako\\Downloads\\libtorch/share/cmake/Caffe2/public/mkl.cmake" "b/lib\\libtorch/share/cmake/Caffe2/public/mkl.cmake"
index 9515a4a..c68074b 100644
--- "a/c:\\Users\\anmatako\\Downloads\\libtorch/share/cmake/Caffe2/public/mkl.cmake"
+++ "b/lib\\libtorch/share/cmake/Caffe2/public/mkl.cmake"
@@ -1,4 +1,4 @@
-find_package(MKL QUIET)
+set(MKL_INCLUDE_DIR ${CMAKE_TORCHLIB_PATH}/include)

 if(NOT TARGET caffe2::mkl)
   add_library(caffe2::mkl INTERFACE IMPORTED)

@anmatako
Contributor Author

anmatako commented Mar 1, 2021

@Dawars @ahojnnes I updated colmap's CMake to set the MKL flags without needing the full dependency for LibTorch to build. Also, I removed an NVCC flag set by LibTorch that was causing issues with Eigen/Core.

However I'm not sure what to do with this part of the diff:

diff --git "a/c:\\Users\\anmatako\\Downloads\\libtorch/include/ATen/Parallel.h" "b/lib\\libtorch/include/ATen/Parallel.h"
index 9e2f9be..cc652f2 100644
--- "a/c:\\Users\\anmatako\\Downloads\\libtorch/include/ATen/Parallel.h"
+++ "b/lib\\libtorch/include/ATen/Parallel.h"
@@ -38,7 +38,7 @@ namespace internal {

 // Initialise num_threads lazily at first parallel call
 inline CAFFE2_API void lazy_init_num_threads() {
-  thread_local bool init = false;
+  static thread_local bool init = false;
   if (C10_UNLIKELY(!init)) {
     at::init_num_threads();
     init = true;
diff --git "a/c:\\Users\\anmatako\\Downloads\\libtorch/include/c10/util/StringUtil.h" "b/lib\\libtorch/include/c10/util/StringUtil.h"
index d2744f1..79da0ae 100644
--- "a/c:\\Users\\anmatako\\Downloads\\libtorch/include/c10/util/StringUtil.h"
+++ "b/lib\\libtorch/include/c10/util/StringUtil.h"
@@ -74,7 +74,7 @@ struct _str_wrapper<const char*> final {
 template<>
 struct _str_wrapper<> final {
   static const std::string& call() {
-    thread_local const std::string empty_string_literal;
+    static thread_local const std::string empty_string_literal;
     return empty_string_literal;
   }
 };

I'm not sure if the issue with thread_local having to be static is specific to MSVC (Windows) or if it happens on other platforms as well, since I have no good way to test this cross-platform.

@Dawars
Contributor

Dawars commented Mar 1, 2021 via email

@Dawars
Contributor

Dawars commented Mar 1, 2021

Now it compiles and runs fine, no additional cmake config needed for mkl.

However the model file seems to be corrupted. I get the following error at: torch::jit::load(options_.mvs_module_path, kDevIn);

cache_size: 20
write_consistency_graph: 0
mvs_module_path: /home/dawars/projects/colmap_torch/mvs-modules/patchmatchnet-module.pt
allow_missing_files: 0
First definition of patch-match module for thread index: 0
Signal: SIGSEGV (signal SIGSEGV: invalid address (fault address: 0x0))

Process finished with exit code 9

I checked it in Python and Netron as well:
Error loading Python module. Unknown expression '=' in 'patchmatchnet-module3.pt'.

Python 3.7.9 (default, Aug 31 2020, 12:42:55) 
Type 'copyright', 'credits' or 'license' for more information
IPython 7.18.1 -- An enhanced Interactive Python. Type '?' for help.
PyDev console: using IPython 7.18.1
Python 3.7.9 (default, Aug 31 2020, 12:42:55) 
[GCC 7.3.0] on linux
import torch
with open('/home/dawars/projects/colmap_torch/mvs-modules/patchmatchnet-module.pt') as f:
    model = torch.load(f)
    
Traceback (most recent call last):
  File "/home/dawars/miniconda3/envs/historic/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3417, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-3-f97813dbac00>", line 2, in <module>
    model = torch.load(f)
  File "/home/dawars/miniconda3/envs/historic/lib/python3.7/site-packages/torch/serialization.py", line 572, in load
    if _is_zipfile(opened_file):
  File "/home/dawars/miniconda3/envs/historic/lib/python3.7/site-packages/torch/serialization.py", line 56, in _is_zipfile
    byte = f.read(1)
  File "/home/dawars/miniconda3/envs/historic/lib/python3.7/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa2 in position 72: invalid start byte
with open('/home/dawars/projects/colmap_torch/mvs-modules/patchmatchnet-module3.pt') as f:
    model = torch.load(f)
    
Traceback (most recent call last):
  File "/home/dawars/miniconda3/envs/historic/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3417, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-4-a6ef56580e99>", line 2, in <module>
    model = torch.load(f)
  File "/home/dawars/miniconda3/envs/historic/lib/python3.7/site-packages/torch/serialization.py", line 572, in load
    if _is_zipfile(opened_file):
  File "/home/dawars/miniconda3/envs/historic/lib/python3.7/site-packages/torch/serialization.py", line 56, in _is_zipfile
    byte = f.read(1)
  File "/home/dawars/miniconda3/envs/historic/lib/python3.7/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa2 in position 72: invalid start byte

@anmatako
Contributor Author

anmatako commented Mar 1, 2021

I can load the module just fine in C++ and Python 3.8.5 on Windows using torch.jit.load; even torch.load works, with a warning like this:

...Python\Python38\site-packages\torch\serialization.py:589: UserWarning: 'torch.load' received a zip file that looks like a TorchScript archive dispatching to 'torch.jit.load' (call 'torch.jit.load' directly to silence this warning)
  warnings.warn("'torch.load' received a zip file that looks like a TorchScript archive"

I'm wondering if there's some issue with committing the binary as part of the repo, or an issue with the Python version. See if it will run with a different Python version. Also, I can send you the module file directly so we can see if it's an issue caused when the file gets committed.
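Since TorchScript .pt archives are zip files (that is what torch's _is_zipfile probes for), one quick way to rule out text-mode or newline corruption of a committed binary is to check the zip magic bytes. A diagnostic sketch, not part of this PR:

```python
def looks_like_torchscript_archive(path):
    """TorchScript .pt modules are zip archives; a valid file starts with
    the zip local-file-header magic 'PK\\x03\\x04'. Text-mode checkouts or
    newline conversion typically destroy this header."""
    with open(path, "rb") as f:  # binary mode: never decode as text
        return f.read(4) == b"PK\x03\x04"
```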

@anmatako
Contributor Author

anmatako commented Mar 1, 2021

@Dawars one more thing you can try, in case it's an issue with encodings between Windows and Linux, would be to pull PatchMatchNet from the tip of my branch here: https://github.com/anmatako/PatchmatchNet

Then uncomment these 3 lines here: https://github.com/anmatako/PatchmatchNet/blob/e21992b1c2d028536403632eb1bf4bfb1aa8f176/eval.py#L97-L99

and you can run from within the root folder of PatchmatchNet as follows:

python eval.py --output_folder <your output folder> --checkpoint_path checkpoints/patchmatchnet-params.pt --input_type params --output_type depth

This will create a new TorchScript module named patchmatchnet-module.pt in your specified output folder. If you can load that module then it must be some conversion issue between OSes.

@Dawars
Contributor

Dawars commented Mar 2, 2021

With PyTorch 1.7.1 I can read the model file properly.
I think the problem is that libtorch tries to open the file as a text file, not binary; that was one of my problems with Python as well.

I tried explicitly setting the file mode via:

std::ifstream model_file(options_.mvs_module_path,
                         std::ios::in | std::ios::binary);
model_[thread_index_] = torch::jit::load(model_file, kDevIn);

but I still get the same result.

I'll probably have to compile a debug version of libtorch for Linux to get more info. I have little experience with it, but I'll try.

Here is the stack trace:

First definition of patch-match module for thread index: 0
Signal: SIGSEGV (signal SIGSEGV: invalid address (fault address: 0x0))
*** Aborted at 1614714901 (unix time) try "date -d @1614714901" if you are using GNU date ***
PC: @     0x7f2b751ee986 std::__detail::_Executor<>::_M_dfs()
*** SIGSEGV (@0x3e8000044a0) received by PID 17575 (TID 0x7f2b22fc4700) from PID 17568; stack trace: ***
    @     0x7f2b84b3a631 (unknown)
    @     0x7f2b8305f3c0 (unknown)
    @     0x7f2b751ee986 std::__detail::_Executor<>::_M_dfs()
    @     0x7f2b751eeb53 std::__detail::_Executor<>::_M_dfs()
    @     0x7f2b751eec6c std::__detail::_Executor<>::_M_dfs()
    @     0x7f2b751ef412 std::__detail::__regex_algo_impl<>()
    @     0x7f2b319995fe c10::Device::Device()
    @     0x7f2b7544963d torch::jit::Unpickler::readInstruction()
    @     0x7f2b7544b540 torch::jit::Unpickler::run()
    @     0x7f2b7544baf1 torch::jit::Unpickler::parse_ivalue()
    @     0x7f2b753ef9c2 torch::jit::readArchiveAndTensors()
    @     0x7f2b753efcdd torch::jit::(anonymous namespace)::ScriptModuleDeserializer::readArchive()
    @     0x7f2b753f2605 torch::jit::(anonymous namespace)::ScriptModuleDeserializer::deserialize()
    @     0x7f2b753f2bd9 torch::jit::load()
    @     0x7f2b753f5455 torch::jit::load()
    @     0x55620f2f4c46 colmap::mvs::PatchMatchNet::InitModule()
    @     0x55620f2f43d6 colmap::mvs::PatchMatchNet::PatchMatchNet()
    @     0x55620ec7c9b0 colmap::mvs::PatchMatchController::ProcessProblem()
    @     0x55620ec8fb63 std::__invoke_impl<>()
    @     0x55620ec8fa50 std::__invoke<>()
    @     0x55620ec8f851 _ZNSt5_BindIFMN6colmap3mvs20PatchMatchControllerEFvRKNS1_17PatchMatchOptionsEmEPS2_S3_mEE6__callIvJEJLm0ELm1ELm2EEEET_OSt5tupleIJDpT0_EESt12_Index_tupleIJXspT1_EEE
    @     0x55620ec8f367 std::_Bind<>::operator()<>()
    @     0x55620ec8efdd std::__invoke_impl<>()
    @     0x55620ec8ed55 std::__invoke<>()
    @     0x55620ec8ea7d _ZZNSt13__future_base11_Task_stateISt5_BindIFMN6colmap3mvs20PatchMatchControllerEFvRKNS3_17PatchMatchOptionsEmEPS4_S5_mEESaIiEFvvEE6_M_runEvENKUlvE_clEv
    @     0x55620ec8f436 _ZNKSt13__future_base12_Task_setterISt10unique_ptrINS_7_ResultIvEENS_12_Result_base8_DeleterEEZNS_11_Task_stateISt5_BindIFMN6colmap3mvs20PatchMatchControllerEFvRKNSA_17PatchMatchOptionsEmEPSB_SC_mEESaIiEFvvEE6_M_runEvEUlvE_vEclEv
    @     0x55620ec8f08c _ZNSt17_Function_handlerIFSt10unique_ptrINSt13__future_base12_Result_baseENS2_8_DeleterEEvENS1_12_Task_setterIS0_INS1_7_ResultIvEES3_EZNS1_11_Task_stateISt5_BindIFMN6colmap3mvs20PatchMatchControllerEFvRKNSD_17PatchMatchOptionsEmEPSE_SF_mEESaIiEFvvEE6_M_runEvEUlvE_vEEE9_M_invokeERKSt9_Any_data
    @     0x55620eacd258 std::function<>::operator()()
    @     0x55620eacc75e std::__future_base::_State_baseV2::_M_do_set()
    @     0x55620ead4019 std::__invoke_impl<>()
    @     0x55620ead1136 std::__invoke<>()
    @     0x55620eacce3e _ZZSt9call_onceIMNSt13__future_base13_State_baseV2EFvPSt8functionIFSt10unique_ptrINS0_12_Result_baseENS4_8_DeleterEEvEEPbEJPS1_S9_SA_EEvRSt9once_flagOT_DpOT0_ENKUlvE_clEv
Signal: SIGSEGV (unknown crash reason)

Process finished with exit code 11

This is the error I got with PyTorch 1.6; it might be related:

with open('/home/dawars/projects/colmap_torch/mvs-modules/patchmatchnet-module_windows.pt', 'br') as f:
    model = torch.jit.load(f)
Traceback (most recent call last):
  File "/home/dawars/miniconda3/envs/historic/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3417, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-4-d6e3587a7e88>", line 2, in <module>
    model = torch.jit.load(f)
  File "/home/dawars/miniconda3/envs/historic/lib/python3.7/site-packages/torch/jit/__init__.py", line 277, in load
    cpp_module = torch._C.import_ir_module_from_buffer(cu, f.read(), map_location, _extra_files)
RuntimeError: 
Arguments for call are not valid.
The following variants are available:
  
  aten::upsample_nearest1d.out(Tensor self, int[1] output_size, float? scales=None, *, Tensor(a!) out) -> (Tensor(a!)):
  Expected a value of type 'List[int]' for argument 'output_size' but instead found type 'Optional[List[int]]'.
  
  aten::upsample_nearest1d(Tensor self, int[1] output_size, float? scales=None) -> (Tensor):
  Expected a value of type 'List[int]' for argument 'output_size' but instead found type 'Optional[List[int]]'.
The original call is:
  File "C:\Users\anmatako\AppData\Roaming\Python\Python38\site-packages\torch\nn\functional.py", line 3130
    if input.dim() == 3 and mode == 'nearest':
        return torch._C._nn.upsample_nearest1d(input, output_size, scale_factors)
               ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    if input.dim() == 4 and mode == 'nearest':
        return torch._C._nn.upsample_nearest2d(input, output_size, scale_factors)
Serialized   File "code/__torch__/torch/nn/functional/___torch_mangle_46.py", line 155
    _49 = False
  if _49:
    _51 = torch.upsample_nearest1d(input, output_size3, scale_factors6)
          ~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    _50 = _51
  else:
'interpolate' is being compiled since it was called from 'FeatureNet.forward'
Serialized   File "code/__torch__/models/net.py", line 139
  def forward(self: __torch__.models.net.FeatureNet,
    x: Tensor) -> List[Tensor]:
    _35 = __torch__.torch.nn.functional.___torch_mangle_46.interpolate
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    _36 = torch.empty([1], dtype=None, layout=None, device=None, pin_memory=None, memory_format=None)
    _37 = torch.empty([1], dtype=None, layout=None, device=None, pin_memory=None, memory_format=None)

@anmatako
Contributor Author

anmatako commented Mar 2, 2021

Being able to load the module with PyTorch 1.7.1 at least means that the module does not seem to be corrupted. The PyTorch 1.6 issue you see looks like a simple incompatibility with older versions.

As for the error you get when you try to load with LibTorch, I'm quite confused as well, as it should not need any special configuration of the fstream and should load without issues. Can you try LibTorch 1.7.1 for CUDA 10.1 and cuDNN 7.6.0? That's the same package I'm using, and I'm wondering if there's something in these dependencies that makes the loading incompatible when done from colmap.

With PyTorch 1.7.1 I can read the model file properly.
I think the problem is that libtorch tries to open the file as a text file, not binary, that was one of my problems with Python.

I tried explicitly setting the file mode via:

std::ifstream model_file(options_.mvs_module_path, std::ios::in | std::ios::binary);

    model_[thread_index_] = torch::jit::load(model_file, kDevIn);

but I still get the same result.

Probably I'll have to compile a debug version of libtorch for linux to get more info. I have little experience with it but I'll try.

Here is the stack trace:

First definition of patch-match module for thread index: 0
Signal: SIGSEGV (signal SIGSEGV: invalid address (fault address: 0x0))
*** Aborted at 1614714901 (unix time) try "date -d @1614714901" if you are using GNU date ***
PC: @     0x7f2b751ee986 std::__detail::_Executor<>::_M_dfs()
*** SIGSEGV (@0x3e8000044a0) received by PID 17575 (TID 0x7f2b22fc4700) from PID 17568; stack trace: ***
    @     0x7f2b84b3a631 (unknown)
    @     0x7f2b8305f3c0 (unknown)
    @     0x7f2b751ee986 std::__detail::_Executor<>::_M_dfs()
    @     0x7f2b751eeb53 std::__detail::_Executor<>::_M_dfs()
    @     0x7f2b751eec6c std::__detail::_Executor<>::_M_dfs()
    @     0x7f2b751ef412 std::__detail::__regex_algo_impl<>()
    @     0x7f2b319995fe c10::Device::Device()
    @     0x7f2b7544963d torch::jit::Unpickler::readInstruction()
    @     0x7f2b7544b540 torch::jit::Unpickler::run()
    @     0x7f2b7544baf1 torch::jit::Unpickler::parse_ivalue()
    @     0x7f2b753ef9c2 torch::jit::readArchiveAndTensors()
    @     0x7f2b753efcdd torch::jit::(anonymous namespace)::ScriptModuleDeserializer::readArchive()
    @     0x7f2b753f2605 torch::jit::(anonymous namespace)::ScriptModuleDeserializer::deserialize()
    @     0x7f2b753f2bd9 torch::jit::load()
    @     0x7f2b753f5455 torch::jit::load()
    @     0x55620f2f4c46 colmap::mvs::PatchMatchNet::InitModule()
    @     0x55620f2f43d6 colmap::mvs::PatchMatchNet::PatchMatchNet()
    @     0x55620ec7c9b0 colmap::mvs::PatchMatchController::ProcessProblem()
    @     0x55620ec8fb63 std::__invoke_impl<>()
    @     0x55620ec8fa50 std::__invoke<>()
    @     0x55620ec8f851 _ZNSt5_BindIFMN6colmap3mvs20PatchMatchControllerEFvRKNS1_17PatchMatchOptionsEmEPS2_S3_mEE6__callIvJEJLm0ELm1ELm2EEEET_OSt5tupleIJDpT0_EESt12_Index_tupleIJXspT1_EEE
    @     0x55620ec8f367 std::_Bind<>::operator()<>()
    @     0x55620ec8efdd std::__invoke_impl<>()
    @     0x55620ec8ed55 std::__invoke<>()
    @     0x55620ec8ea7d _ZZNSt13__future_base11_Task_stateISt5_BindIFMN6colmap3mvs20PatchMatchControllerEFvRKNS3_17PatchMatchOptionsEmEPS4_S5_mEESaIiEFvvEE6_M_runEvENKUlvE_clEv
    @     0x55620ec8f436 _ZNKSt13__future_base12_Task_setterISt10unique_ptrINS_7_ResultIvEENS_12_Result_base8_DeleterEEZNS_11_Task_stateISt5_BindIFMN6colmap3mvs20PatchMatchControllerEFvRKNSA_17PatchMatchOptionsEmEPSB_SC_mEESaIiEFvvEE6_M_runEvEUlvE_vEclEv
    @     0x55620ec8f08c _ZNSt17_Function_handlerIFSt10unique_ptrINSt13__future_base12_Result_baseENS2_8_DeleterEEvENS1_12_Task_setterIS0_INS1_7_ResultIvEES3_EZNS1_11_Task_stateISt5_BindIFMN6colmap3mvs20PatchMatchControllerEFvRKNSD_17PatchMatchOptionsEmEPSE_SF_mEESaIiEFvvEE6_M_runEvEUlvE_vEEE9_M_invokeERKSt9_Any_data
    @     0x55620eacd258 std::function<>::operator()()
    @     0x55620eacc75e std::__future_base::_State_baseV2::_M_do_set()
    @     0x55620ead4019 std::__invoke_impl<>()
    @     0x55620ead1136 std::__invoke<>()
    @     0x55620eacce3e _ZZSt9call_onceIMNSt13__future_base13_State_baseV2EFvPSt8functionIFSt10unique_ptrINS0_12_Result_baseENS4_8_DeleterEEvEEPbEJPS1_S9_SA_EEvRSt9once_flagOT_DpOT0_ENKUlvE_clEv
Signal: SIGSEGV (unknown crash reason)

Process finished with exit code 11

This is the error I got with PyTorch 1.6; it might be related:

with open('/home/dawars/projects/colmap_torch/mvs-modules/patchmatchnet-module_windows.pt', 'br') as f:
    model = torch.jit.load(f)
Traceback (most recent call last):
  File "/home/dawars/miniconda3/envs/historic/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3417, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-4-d6e3587a7e88>", line 2, in <module>
    model = torch.jit.load(f)
  File "/home/dawars/miniconda3/envs/historic/lib/python3.7/site-packages/torch/jit/__init__.py", line 277, in load
    cpp_module = torch._C.import_ir_module_from_buffer(cu, f.read(), map_location, _extra_files)
RuntimeError: 
Arguments for call are not valid.
The following variants are available:
  
  aten::upsample_nearest1d.out(Tensor self, int[1] output_size, float? scales=None, *, Tensor(a!) out) -> (Tensor(a!)):
  Expected a value of type 'List[int]' for argument 'output_size' but instead found type 'Optional[List[int]]'.
  
  aten::upsample_nearest1d(Tensor self, int[1] output_size, float? scales=None) -> (Tensor):
  Expected a value of type 'List[int]' for argument 'output_size' but instead found type 'Optional[List[int]]'.
The original call is:
  File "C:\Users\anmatako\AppData\Roaming\Python\Python38\site-packages\torch\nn\functional.py", line 3130
    if input.dim() == 3 and mode == 'nearest':
        return torch._C._nn.upsample_nearest1d(input, output_size, scale_factors)
               ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    if input.dim() == 4 and mode == 'nearest':
        return torch._C._nn.upsample_nearest2d(input, output_size, scale_factors)
Serialized   File "code/__torch__/torch/nn/functional/___torch_mangle_46.py", line 155
    _49 = False
  if _49:
    _51 = torch.upsample_nearest1d(input, output_size3, scale_factors6)
          ~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    _50 = _51
  else:
'interpolate' is being compiled since it was called from 'FeatureNet.forward'
Serialized   File "code/__torch__/models/net.py", line 139
  def forward(self: __torch__.models.net.FeatureNet,
    x: Tensor) -> List[Tensor]:
    _35 = __torch__.torch.nn.functional.___torch_mangle_46.interpolate
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    _36 = torch.empty([1], dtype=None, layout=None, device=None, pin_memory=None, memory_format=None)
    _37 = torch.empty([1], dtype=None, layout=None, device=None, pin_memory=None, memory_format=None)
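
The schema error above points to a version mismatch between export and load: the `functional.py` shown passes a `scale_factors` argument to `upsample_nearest1d`, a signature that PyTorch 1.6's TorchScript parser does not know, so the serialized call fails schema resolution at load time. Below is a minimal, torch-free sketch of a pre-load version guard; the `(1, 7, 0)` threshold is an assumption inferred from this traceback, not a documented bound, and `can_load_exported_module` is a hypothetical helper, not part of the PR.

```python
def version_tuple(version):
    # "1.6.0+cu101" -> (1, 6, 0); PyTorch appends a build tag after '+'.
    return tuple(int(part) for part in version.split("+")[0].split(".")[:3])

# Assumed minimum runtime able to resolve the exported module's operator
# schemas (hypothetical threshold inferred from the traceback above).
MIN_RUNTIME = (1, 7, 0)

def can_load_exported_module(runtime_version):
    # Version tuples compare lexicographically, which matches semver here.
    return version_tuple(runtime_version) >= MIN_RUNTIME
```

With the versions in this thread, `can_load_exported_module("1.6.0")` returns False, matching the RuntimeError above; re-exporting the TorchScript module with the same PyTorch version used at load time sidesteps the mismatch entirely.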

@Dawars (Contributor) commented Mar 5, 2021 via email

Comment on lines +80 to +88
if (model_.count(thread_index_) == 0) {
  std::cout << "First definition of patch-match module for thread index: "
            << options_.gpu_index << std::endl;
  model_[thread_index_] = torch::jit::load(options_.mvs_module_path, kDevIn);
} else {
  std::cout << "Patch-match module already defined for thread index: "
            << options_.gpu_index << std::endl;
}
Contributor
Only the first run succeeds; on subsequent runs the thread terminates at model.forward(...) (https://github.com/colmap/colmap/pull/1129/files#diff-ec9150c5522870ad0fd07f523905377b4d0670e7f42d92f2bbe11ceb42adb1beR59).

When I load the PyTorch module every time, this problem doesn't occur.

@anmatako (Contributor, Author) commented Mar 5, 2021

That's a very interesting failure mode. Failing at the forward() evaluation likely means the module is no longer there for the subsequent runs. Are you executing in a single- or multi-GPU environment? If multi-GPU, try using just a single GPU index in the patch-match options and see if it fails the same way.

The reason I ended up with this setup, instead of loading the module for each problem, is to take advantage of the optimizations LibTorch applies to JIT modules. My main finding was that loading the module every time, with no optimizations, is about 2x slower than reusing the module and letting the optimizations kick in.

If you can share a dataset that causes this failure, I can try to reproduce it on my end and see if I can debug it effectively.
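
The load-once-per-worker trade-off described above can be sketched without LibTorch. In the sketch below, `load_module` is a hypothetical stand-in for `torch::jit::load` / `torch.jit.load`, and the dictionary mirrors the PR's `model_` map keyed by `thread_index_`; this illustrates the caching pattern only, not the PR's actual code:

```python
class ModuleCache:
    """Caches one loaded module per worker thread index."""

    def __init__(self, load_module):
        self._load = load_module   # stand-in for torch.jit.load
        self._models = {}          # thread_index -> loaded module
        self.num_loads = 0

    def get(self, thread_index, module_path):
        # Load only on the first request from this worker, so the JIT's
        # runtime optimizations persist across problems handled by the
        # same thread (~2x faster per the discussion above).
        if thread_index not in self._models:
            self._models[thread_index] = self._load(module_path)
            self.num_loads += 1
        return self._models[thread_index]
```

The first `get(0, path)` loads the module; every later `get(0, path)` returns the same instance, which is where the reuse saving comes from.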

Contributor
I'm using a single GPU; not loading the model each time makes sense.

I'll send you the dataset.

@jingyibo123 commented
Upvote for the integration of third-party learning-based MVS methods.

With the recent popularity of COLMAP in the greater CV community, and the advances in learning-based SfM & MVS methods, it would be very beneficial for both sides to be able to incorporate methods such as PatchMatchNet, MVSNet, SuperPoint, SuperGlue, etc.
