How to use cuda_sparse as the sparse linear algebra library when solving BAL problems with version 2.2.0? #1021

Open
zhoupengwei opened this issue Nov 2, 2023 · 4 comments

Comments

@zhoupengwei

I tested the bundle adjuster example from the latest release, version 2.2.0, with cuda_sparse as the sparse linear algebra library. The program crashed as follows:

./bundle_adjuster --input=problem-1723-156502-pre.txt --sparse_linear_algebra_library=cuda_sparse
F1102 17:46:22.171268 101672 sparse_cholesky.cc:92] Unknown sparse linear algebra library type : CUDA_SPARSE
*** Check failure stack trace: ***
    @     0x7efd9b803b03  google::LogMessage::Fail()
    @     0x7efd9b80b9d1  google::LogMessage::SendToLog()
    @     0x7efd9b8037c2  google::LogMessage::Flush()
    @     0x7efd9b80578f  google::LogMessageFatal::~LogMessageFatal()
    @     0x5590a712f81a  ceres::internal::SparseCholesky::Create()
    @     0x5590a711feeb  ceres::internal::SparseSchurComplementSolver::SparseSchurComplementSolver()
    @     0x5590a70f71b8  ceres::internal::LinearSolver::Create()
    @     0x5590a70a7399  ceres::internal::(anonymous namespace)::SetupLinearSolver()
    @     0x5590a70a8993  ceres::internal::TrustRegionPreprocessor::Preprocess()
    @     0x5590a705a114  ceres::Solver::Solve()
    @     0x5590a703722c  ceres::examples::(anonymous namespace)::SolveProblem()
    @     0x5590a7034370  main
    @     0x7efd70a29d90  (unknown)
    @     0x7efd70a29e40  __libc_start_main
    @     0x5590a7035365  _start
Aborted (core dumped)
@sandwichmaker
Contributor

This means that you did not compile Ceres with CUDA linked into it. Please take a look at your CMake log; you will find that CUDA was not detected during configuration.

@zhoupengwei
Author

@sandwichmaker The problem is not the one you mentioned. Here is the cmake command I used and its output:

cmake -DBUILD_SHARED_LIBS=ON -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=/usr/local ../
-- The C compiler identification is GNU 11.4.0
-- The CXX compiler identification is GNU 11.4.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Performing Test HAVE_BIGOBJ
-- Performing Test HAVE_BIGOBJ - Failed
-- Looking for pow in m
-- Looking for pow in m - found
-- Detected Ceres version: 2.2.0 from /home/zpw/ThirdParty/ceres-solver-2.2.0/include/ceres/version.h
-- Found Eigen version 3.4.0: /usr/share/eigen3/cmake
-- Enabling use of Eigen as a sparse linear algebra library.
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE  
-- Found CUDA version 11.8.89 installed in: /usr/local/cuda-11.8
-- The CUDA compiler identification is NVIDIA 11.8.89
-- Detecting CUDA compiler ABI info
-- Detecting CUDA compiler ABI info - done
-- Check for working CUDA compiler: /usr/local/cuda-11.8/bin/nvcc - skipped
-- Detecting CUDA compile features
-- Detecting CUDA compile features - done
-- Setting CUDA Architecture to 50;60;70;80
-- Found LAPACK library: /usr/local/lib/libopenblas.so;-lm;-ldl
-- Found CHOLMOD headers in: /usr/local/include
-- Found CHOLMOD library: /usr/local/lib/libcholmod.so
-- Found SPQR headers in: /usr/local/include
-- Found SPQR library: /usr/local/lib/libspqr.so
-- Found Config headers in: /usr/local/include
-- Found Config library: /usr/local/lib/libsuitesparseconfig.so
-- Found AMD headers in: /usr/local/include
-- Found AMD library: /usr/local/lib/libamd.so
-- Found CAMD headers in: /usr/local/include
-- Found CAMD library: /usr/local/lib/libcamd.so
-- Found CCOLAMD headers in: /usr/local/include
-- Found CCOLAMD library: /usr/local/lib/libccolamd.so
-- Found COLAMD headers in: /usr/local/include
-- Found COLAMD library: /usr/local/lib/libcolamd.so
-- Found Intel Thread Building Blocks (TBB) library (2021.10 / 12100) include location: /usr/local/include. Assuming SuiteSparseQR was compiled with TBB.
-- Looking for shm_open in rt
-- Looking for shm_open in rt - found
-- Adding librt to SuiteSparse_config libraries (required on Linux & Unix [not OSX] if SuiteSparse is compiled with timing).
-- Found METIS: /usr/include (found version "5.1.0") 
-- Looking for cholmod_metis
-- Looking for cholmod_metis - found
-- Found SuiteSparse: /usr/local/include (found suitable version "7.3.1", minimum required is "4.5.6") found components: CHOLMOD SPQR Partition Config AMD CAMD CCOLAMD COLAMD 
-- Found SuiteSparse 7.3.1, building with SuiteSparse.
-- Building without Apple's Accelerate sparse support.
-- Found Google Flags (gflags) version 2.2.2: /usr/local/lib/cmake/gflags
-- No preference for use of exported glog CMake configuration set, and no hints for include/library directories provided. Defaulting to preferring an installed/exported glog CMake configuration if available.
-- Found installed version of glog: /usr/local/lib/cmake/glog
-- Detected glog version: 0.6.0
-- Found Glog: glog::glog  
-- Found Google Log (glog). Assuming glog was built with gflags support as gflags was found. This will make gflags a public dependency of Ceres.
-- Failed to find Google benchmark library, disabling build of benchmarks.
-- Building Ceres as a shared library.
-- Performing Test CHECK_CXX_FLAG_Wmissing_declarations
-- Performing Test CHECK_CXX_FLAG_Wmissing_declarations - Success
-- Performing Test CHECK_CXX_FLAG_Wno_unknown_pragmas
-- Performing Test CHECK_CXX_FLAG_Wno_unknown_pragmas - Success
-- Performing Test CHECK_CXX_FLAG_Wno_sign_compare
-- Performing Test CHECK_CXX_FLAG_Wno_sign_compare - Success
-- Performing Test CHECK_CXX_FLAG_Wno_unused_parameter
-- Performing Test CHECK_CXX_FLAG_Wno_unused_parameter - Success
-- Performing Test CHECK_CXX_FLAG_Wno_missing_field_initializers
-- Performing Test CHECK_CXX_FLAG_Wno_missing_field_initializers - Success
-- Creating configured Ceres config.h output directory: /home/zpw/ThirdParty/ceres-solver-2.2.0/build/include/ceres/internal
-- Enabling CERES_USE_EIGEN_SPARSE in Ceres config.h
-- Enabling CERES_NO_ACCELERATE_SPARSE in Ceres config.h
-- Performing Test CHECK_CXX_FLAG_Wno_missing_declarations
-- Performing Test CHECK_CXX_FLAG_Wno_missing_declarations - Success
-- Performing Test COMPILER_HAS_HIDDEN_VISIBILITY
-- Performing Test COMPILER_HAS_HIDDEN_VISIBILITY - Success
-- Performing Test COMPILER_HAS_HIDDEN_INLINE_VISIBILITY
-- Performing Test COMPILER_HAS_HIDDEN_INLINE_VISIBILITY - Success
-- Performing Test COMPILER_HAS_DEPRECATED_ATTR
-- Performing Test COMPILER_HAS_DEPRECATED_ATTR - Success
-- Build the examples.
-- Configuring done
-- Generating done
-- Build files have been written to: /home/zpw/ThirdParty/ceres-solver-2.2.0/build

Ceres detected the CUDA compiler, and the USE_CUDA option is ON by default. I then tested the bundle adjustment problem with cuda_sparse and two different sparse linear solvers, and both failed!
Test 1

./bundle_adjuster --input=../../data/problem-16-22106-pre.txt --sparse_linear_algebra_library=cuda_sparse --linear_solver=sparse_normal_cholesky
F20240107 11:21:43.061152 53942 sparse_cholesky.cc:92] Unknown sparse linear algebra library type : CUDA_SPARSE
*** Check failure stack trace: ***
    @     0x7f9346609dc3  google::LogMessage::Fail()
    @     0x7f934660d78c  google::LogMessage::SendToLog()
    @     0x7f93466098a7  google::LogMessage::Flush()
    @     0x7f934660aeef  google::LogMessageFatal::~LogMessageFatal()
    @     0x7f934632740d  ceres::internal::SparseCholesky::Create()
    @     0x7f9346327c18  ceres::internal::SparseNormalCholeskySolver::SparseNormalCholeskySolver()
    @     0x7f93462dc338  ceres::internal::LinearSolver::Create()
    @     0x7f9346337711  ceres::internal::TrustRegionPreprocessor::Preprocess()
    @     0x7f9345e5a5e4  ceres::Solver::Solve()
    @     0x7f9345e5b89c  ceres::Solve()
    @     0x5619422cf490  main
    @     0x7f9345629d90  (unknown)
    @     0x7f9345629e40  __libc_start_main
    @     0x5619422d1245  _start
Aborted (core dumped)

Test 2

./bundle_adjuster --input=../../data/problem-16-22106-pre.txt --sparse_linear_algebra_library=cuda_sparse --linear_solver=sparse_schur
F20240107 11:21:56.712909 54007 sparse_cholesky.cc:92] Unknown sparse linear algebra library type : CUDA_SPARSE
*** Check failure stack trace: ***
    @     0x7fd7ad18adc3  google::LogMessage::Fail()
    @     0x7fd7ad18e78c  google::LogMessage::SendToLog()
    @     0x7fd7ad18a8a7  google::LogMessage::Flush()
    @     0x7fd7ad18beef  google::LogMessageFatal::~LogMessageFatal()
    @     0x7fd7acf2740d  ceres::internal::SparseCholesky::Create()
    @     0x7fd7acf1b9bb  ceres::internal::SparseSchurComplementSolver::SparseSchurComplementSolver()
    @     0x7fd7acedc158  ceres::internal::LinearSolver::Create()
    @     0x7fd7acf37711  ceres::internal::TrustRegionPreprocessor::Preprocess()
    @     0x7fd7aca5a5e4  ceres::Solver::Solve()
    @     0x7fd7aca5b89c  ceres::Solve()
    @     0x55a8f62ea490  main
    @     0x7fd7ac229d90  (unknown)
    @     0x7fd7ac229e40  __libc_start_main
    @     0x55a8f62ec245  _start
Aborted (core dumped)

I finally found that SparseCholesky::Create in sparse_cholesky.cc only handles the SuiteSparse, Eigen sparse, and Apple Accelerate libraries and has no case for CUDA_SPARSE, as shown in the following:

std::unique_ptr<SparseCholesky> SparseCholesky::Create(
    const LinearSolver::Options& options) {
  std::unique_ptr<SparseCholesky> sparse_cholesky;

  switch (options.sparse_linear_algebra_library_type) {
    case SUITE_SPARSE:
#ifndef CERES_NO_SUITESPARSE
      if (options.use_mixed_precision_solves) {
        sparse_cholesky =
            FloatSuiteSparseCholesky::Create(options.ordering_type);
      } else {
        sparse_cholesky = SuiteSparseCholesky::Create(options.ordering_type);
      }
      break;
#else
      LOG(FATAL) << "Ceres was compiled without support for SuiteSparse.";
#endif

    case EIGEN_SPARSE:
#ifdef CERES_USE_EIGEN_SPARSE
      if (options.use_mixed_precision_solves) {
        sparse_cholesky =
            FloatEigenSparseCholesky::Create(options.ordering_type);
      } else {
        sparse_cholesky = EigenSparseCholesky::Create(options.ordering_type);
      }
      break;
#else
      LOG(FATAL) << "Ceres was compiled without support for "
                 << "Eigen's sparse Cholesky factorization routines.";
#endif

    case ACCELERATE_SPARSE:
#ifndef CERES_NO_ACCELERATE_SPARSE
      if (options.use_mixed_precision_solves) {
        sparse_cholesky =
            AppleAccelerateCholesky<float>::Create(options.ordering_type);
      } else {
        sparse_cholesky =
            AppleAccelerateCholesky<double>::Create(options.ordering_type);
      }
      break;
#else
      LOG(FATAL) << "Ceres was compiled without support for Apple's Accelerate "
                 << "framework solvers.";
#endif

    default:
      LOG(FATAL) << "Unknown sparse linear algebra library type : "
                 << SparseLinearAlgebraLibraryTypeToString(
                        options.sparse_linear_algebra_library_type);
  }

  if (options.max_num_refinement_iterations > 0) {
    auto refiner = std::make_unique<SparseIterativeRefiner>(
        options.max_num_refinement_iterations);
    sparse_cholesky = std::make_unique<RefinedSparseCholesky>(
        std::move(sparse_cholesky), std::move(refiner));
  }
  return sparse_cholesky;
}
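To make the failure path easier to see, here is a minimal, self-contained model of that switch. It is only a sketch: the enum `SparseLibrary` and the function `SelectCholeskyBackend` are hypothetical stand-ins, not Ceres names, and the compile-time `#ifdef` guards of the real code are left out. It shows that any value without its own `case` ends up in `default:`, which is where the real code calls LOG(FATAL).

```cpp
#include <optional>
#include <string>

// Hypothetical stand-in for the Ceres enum; only the values that appear
// in this thread are modeled.
enum class SparseLibrary {
  kSuiteSparse,
  kEigenSparse,
  kAccelerateSparse,
  kCudaSparse,
};

// Simplified model of the switch in SparseCholesky::Create(). Returns the
// name of the factory class that would be chosen, or std::nullopt for the
// values that reach `default:` (where the real code aborts via LOG(FATAL)).
std::optional<std::string> SelectCholeskyBackend(SparseLibrary type) {
  switch (type) {
    case SparseLibrary::kSuiteSparse:
      return "SuiteSparseCholesky";
    case SparseLibrary::kEigenSparse:
      return "EigenSparseCholesky";
    case SparseLibrary::kAccelerateSparse:
      return "AppleAccelerateCholesky";
    default:
      // kCudaSparse has no case of its own, so it lands here.
      return std::nullopt;
  }
}
```

In this model, `SelectCholeskyBackend(SparseLibrary::kCudaSparse)` yields an empty optional, which corresponds to the "Unknown sparse linear algebra library type : CUDA_SPARSE" abort in both tests above.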

So do you plan to add CUDA sparse support here, so that bundle adjustment problems can be solved with it?

@sandwichmaker
Contributor

Ah, I did not realize which linear solver you were trying to use. I should have paid more attention to the fact that you are using SPARSE_NORMAL_CHOLESKY. We currently do not plan to support SPARSE_NORMAL_CHOLESKY using CUDA. We are, however, looking at solving bundle adjustment problems on the GPU using ITERATIVE_SCHUR, but I do not have an ETA for that.

Part of the reason we do not support SPARSE_NORMAL_CHOLESKY on the GPU is that we lack the CUDA expertise to implement it. If you or anyone else has the expertise and wants to contribute it, we are open to it.
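Until such support exists, a caller could reject the failing combination before calling the solver instead of hitting the fatal check. The following is a rough sketch under stated assumptions: the enums and `CheckSparseCholeskyConfig` are hypothetical helpers, not part of Ceres, and the rule encoded here comes only from the two stack traces in this thread, where both sparse_normal_cholesky and sparse_schur go through SparseCholesky::Create(). Other solver types are not covered by this thread, so the sketch does not judge them.

```cpp
#include <string>

// Hypothetical stand-ins for the options discussed in this thread.
enum class LinearSolver { kSparseNormalCholesky, kSparseSchur, kOther };
enum class SparseBackend {
  kSuiteSparse,
  kEigenSparse,
  kAccelerateSparse,
  kCudaSparse,
};

// Returns an empty string when the combination is not known to fail, or a
// human-readable message when it would reach the fatal check seen above.
std::string CheckSparseCholeskyConfig(LinearSolver solver,
                                      SparseBackend backend) {
  // Per the stack traces, both of these solvers construct a SparseCholesky.
  const bool uses_sparse_cholesky =
      solver == LinearSolver::kSparseNormalCholesky ||
      solver == LinearSolver::kSparseSchur;
  if (uses_sparse_cholesky && backend == SparseBackend::kCudaSparse) {
    return "cuda_sparse is not handled by SparseCholesky::Create(); "
           "choose suite_sparse, eigen_sparse, or accelerate_sparse.";
  }
  return "";
}
```

A caller would run this check on the parsed command-line flags and print the message rather than proceeding to the solve.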

@zhoupengwei
Author

@sandwichmaker Thank you for your suggestion. I will try to make it possible.
