Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use struct Slab in grow() so that tape elements do not relocate #795

Closed
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
49 changes: 24 additions & 25 deletions include/clad/Differentiator/Tape.h
Original file line number Diff line number Diff line change
Expand Up @@ -114,34 +114,33 @@ namespace clad {
T(std::move(*first));
}
}
/// Initial capacity (allocated whenever a value is pushed into empty tape).

// Slab struct to store the elements. Contains the pointer to the next slab
template <typename T>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: declaration of 'T' shadows template parameter [clang-diagnostic-error]

    template <typename T>
                       ^
Additional context

include/clad/Differentiator/Tape.h:13: template parameter is declared here

  template <typename T>
                     ^

struct Slab {
T elements[32]; // can store 32 elements
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: do not declare C-style arrays, use std::array<> instead [cppcoreguidelines-avoid-c-arrays]

      T elements[32]; // can store 32 elements
      ^

Slab *nextSlab;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: expected ';' after struct [clang-diagnostic-error]

Suggested change
}
};

constexpr static std::size_t _init_capacity = 32;
CUDA_HOST_DEVICE void grow() {
// If empty, use initial capacity.
if (!_capacity)
_capacity = _init_capacity;
else
// Double the capacity on each reallocation.
_capacity *= 2;
T* new_data = AllocateRawStorage(_capacity);

if (!new_data) {
// clean up the memory mess just in case!
destroy(begin(), end());
printf("Allocation failure during tape resize! Aborting.\n");
trap(EXIT_FAILURE);
Slab<T>* newSlab = new Slab<T>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: initializing non-owner 'Slab *' with a newly created 'gsl::owner<>' [cppcoreguidelines-owning-memory]

      Slab<T>* newSlab = new Slab<T>;
      ^

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: use auto when initializing with new to avoid duplicating the type name [modernize-use-auto]

Suggested change
Slab<T>* newSlab = new Slab<T>;
auto* newSlab = new Slab<T>;

newSlab->nextSlab = nullptr;

if (!_capacity) {
// the Tape is empty and we can update
// the _data pointer to the newly allocated slab.
_data = newSlab;
} else {
// find the last slab by iterating the chain of slabs
Slab<T>* currentSlab = static_cast<Slab<T>*>(_data);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: use auto when initializing with a cast to avoid duplicating the type name [modernize-use-auto]

Suggested change
Slab<T>* currentSlab = static_cast<Slab<T>*>(_data);
auto* currentSlab = static_cast<Slab<T>*>(_data);

while(currentSlab->nextSlab != nullptr) {
currentSlab = currentSlab->nextSlab;
}
// update pointer to the newly allocated slab
currentSlab->nextSlab = newSlab;
}

// Move values from old storage to the new storage. Should call move
// constructors on non-trivial types, otherwise is expected to use
// memcpy/memmove.
MoveData(begin(), end(), new_data);
// Destroy all values in the old storage.
destroy(begin(), end());
// delete the old data here to make sure we do not leak anything.
::operator delete(const_cast<void*>(
static_cast<const volatile void*>(_data)));
_data = new_data;
// Double the capacity on each slab addition
_capacity *= 2;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably should have same size slabs. Can you create/run some of the benchmarks and see how this new implementation works?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am setting the size of each Slab to 32, which is the size of the elements array inside the Slab struct. Did you mean to suggest using _capacity += 32 instead of _capacity *= 2 for maintaining same-sized slabs?

 template <typename SlabT>
    struct Slab {
	  std::array<SlabT, 32> elements; // can store 32 elements
	  std::unique_ptr<SlabT> nextSlab;
    };

I tried to run the existing benchmarks, but got this error:

In file included from /home/okabe/open-source/clad/unittests/Kokkos/main.cpp:1:
In file included from /usr/include/Kokkos_Core.hpp:45:
In file included from /usr/include/KokkosCore_Config_DeclareBackend.hpp:22:
In file included from /usr/include/decl/Kokkos_Declare_OPENMP.hpp:21:
In file included from /usr/include/OpenMP/Kokkos_OpenMP.hpp:235:
/usr/include/OpenMP/Kokkos_OpenMP_Instance.hpp:23:2: error: "You enabled Kokkos OpenMP support without enabling OpenMP in the compiler!"
#error \
 ^

Does this mean I have to build the OpenMP part of LLVM in debug mode as well?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have kokkos installed?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have installed Kokkos with spack. But running cmake -DLLVM_DIR=../llvm-project/build -DCMAKE_BUILD_TYPE=DEBUG -DLLVM_EXTERNAL_LIT="$(which lit)" ../, gives me this error

CMake Warning at unittests/CMakeLists.txt:37 (find_package):
  By not providing "FindKokkos.cmake" in CMAKE_MODULE_PATH this project has
  asked CMake to find a package configuration file provided by "Kokkos", but
  CMake did not find one.

  Could not find a package configuration file provided by "Kokkos" with any
  of the following names:

    KokkosConfig.cmake
    kokkos-config.cmake

  Add the installation prefix of "Kokkos" to CMAKE_PREFIX_PATH or set
  "Kokkos_DIR" to a directory containing one of the above files.  If "Kokkos"
  provides a separate development package or SDK, be sure it has been
  installed.

Is the KokkosConfig.cmake or kokkos-config.cmake file provided by the Kokkos installation or do I have to create it myself?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect you are hitting a similar issue as described in #798. Kokkos is an optional dependency. If you remove it clad will still run...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay. But interestingly the tests run fine in master branch despite giving the error message:

warning: input 'LIT' contained no tests

If there were problems with the build, then it should have made the tests in the master branch fail as well. But they only fail in the branch I am working on the grow() function. Could it be because of my implementation?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, ok, I did not realize that the tests ran. Yes, if they pass with the master and fail in your branch it's probably due to the changes that were made.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please give a feedback on my implementation. The only changes I've made are adding the Slab struct and the changes to the grow() function to use this struct for allocating data.

namespace clad {
    
  /// Dynamically-sized array (std::vector-like), primarily used for storing
  /// values in reverse-mode AD inside loops.
  template <typename T>
  class tape_impl {
    struct Slab {
	    T* elements; // can store 32 elements
	    Slab *nextSlab;
    };
    T* _data = nullptr;
    std::size_t _size = 0;
    std::size_t _capacity = 0;
    Slab* _last_added_slab = nullptr;
    //
    // other member functions
    //
    constexpr static std::size_t _init_capacity = 32;
    CUDA_HOST_DEVICE void grow() {
	    auto newSlab = (Slab*) new Slab;

      if (!_capacity) {
        // the Tape is empty and we can update
        // we can set the capacity with the initial capacity
        _capacity = this->_init_capacity;
      } else {
        // double the capacity if Slab exists
        _capacity *= 2;
      }
      auto* new_data = this->AllocateRawStorage(_capacity);
      if(!new_data) {
        delete newSlab; // delete the allocated slab on data allocation failure
        printf("Allocation failure during tape resize! Aborting\n");
      } else {
        newSlab->elements = new_data;
        _last_added_slab = newSlab;
      }
    }

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looks good as a start. Now you need to find where it falls short when clad uses it. Maybe you can start with some of the failing tests that explicitly test the Tape. Maybe TapeMemory.C is something that is worth looking into...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was mistakenly thinking that the grow() function was also responsible for pushing the data, where it is actually emplace_back which is doing it. I was keeping a T* elements field in the struct to keep this data. But since we do not need to put data in grow() we need to change the struct fields.

I was considering we need to also change the emplace_back function since we are moving to structs and convert T* elements to T element, but I am unsure if it would be the correct way to do it. Could you please give me a feedback on this approach?

}

template <typename It>
Expand Down