Extent hooks and opt.retain #2544

Open
antoniofilipovic opened this issue Sep 21, 2023 · 0 comments
I am trying to use extent_hooks to count memory allocations in my program, and I ran into a strange problem. If opt.retain is set to true, the alloc hook is rarely called with *commit == true, yet there is a constant stream of calls to decommit and purge.

The docs state that when purge_forced succeeds, the memory should be removed from physical memory. Likewise, when decommit succeeds (returns false), the range should also be decommitted from physical memory.

I understand that opt.retain=true exists so that the virtual address space does not become fragmented, but from the documentation it seems that either the commit hook should be called more often, or the alloc hook should be called with *commit == true.
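
As a sanity check (a minimal sketch added for this report, not part of the program below), the option can be read back at run time through the documented read-only opt.retain mallctl, since a typo in MALLOC_CONF is easy to miss:

#include <jemalloc/jemalloc.h>
#include <iostream>

// Confirm that retain is actually enabled for this run.
void CheckRetain() {
  bool retain = false;
  size_t sz = sizeof(retain);
  if (mallctl("opt.retain", &retain, &sz, nullptr, 0) == 0) {
    std::cout << "opt.retain = " << (retain ? "true" : "false") << std::endl;
  }
}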

When I count allocations from the hooks, the total goes negative due to the frequent calls to decommit and purge.

Here are the stats I get from the program by counting calls to my wrapper functions for the extent hooks:

[TOTAL] RAM: -80244736 Bytes, VIRT: 32.50MiB
[ALLOC] committed calls: 14, uncommited calls: 0
[DALLOC] committed calls: 0, uncommited calls: 0
[PURGE] forced calls: 1002,  lazy calls 0
COMMIT: 0
DECOMMIT calls: 1002

Here is how I fetch the original hooks and install my custom hooks:


static extent_hooks_t custom_hooks = {
    .alloc = &my_alloc,
    .dalloc = &my_dalloc,
    .destroy = &my_destroy,
    .commit = &my_commit,
    .decommit = &my_decommit,
    .purge_lazy = &my_purge_lazy,
    .purge_forced = &my_purge_forced,
    .split = &my_split,
    .merge = &my_merge,
};

static extent_hooks_t *new_hooks = &custom_hooks;

static std::vector<extent_hooks_t *> original_hooks_vec;

extent_hooks_t *old_hooks = nullptr;

void SetHooks() {
  unsigned narenas{0};
  size_t sz = sizeof(narenas);
  int err = mallctl("opt.narenas", &narenas, &sz, nullptr, 0);

  if (err) {
    return;
  }

  std::cout << narenas << " : n arenas" << std::endl;

  if (nullptr != old_hooks) {
    return;
  }

  // get original hooks and update alloc
  original_hooks_vec.reserve(narenas);

  for (unsigned i = 0; i < narenas; i++) {
    std::string func_name = "arena." + std::to_string(i) + ".extent_hooks";

    size_t hooks_len = sizeof(old_hooks);
    int err = mallctl(func_name.c_str(), &old_hooks, &hooks_len, nullptr, 0);

    if (err) {
      LOG_FATAL("Error getting hooks for jemalloc arena {}", i);
    }
    original_hooks_vec.emplace_back(old_hooks);

    // Due to the way jemalloc works, we first need to set the original hooks
    // back, which triggers creation of the arena; only then can we install our
    // custom hook wrappers.

    err = mallctl(func_name.c_str(), nullptr, nullptr, &old_hooks, sizeof(old_hooks));
    
    if (err) {
      LOG_FATAL("Error setting jemalloc hooks for jemalloc arena {}", i);
    }

    err = mallctl(func_name.c_str(), nullptr, nullptr, &new_hooks, sizeof(new_hooks));

    if (err) {
      LOG_FATAL("Error setting custom hooks for jemalloc arena {}", i);
    }
  }
}
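
One subtlety worth noting (my assumption, not something I have verified): the number of arenas that actually exist can differ from opt.narenas, for example under percpu_arena or after arenas are created manually. A minimal sketch of querying jemalloc itself via the documented arenas.narenas mallctl:

// Returns the current limit on the number of arenas as reported by jemalloc
// itself, which can differ from the configured "opt.narenas".
unsigned NumArenas() {
  unsigned narenas = 0;
  size_t sz = sizeof(narenas);
  if (mallctl("arenas.narenas", &narenas, &sz, nullptr, 0) != 0) {
    return 0;  // treat a mallctl failure as "no arenas known"
  }
  return narenas;
}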


Once these hooks are set, here are the definitions of my wrappers around the default jemalloc hooks:

void *my_alloc(extent_hooks_t *extent_hooks, void *new_addr, size_t size, size_t alignment, bool *zero, bool *commit,
               unsigned arena_ind) {
  auto *ptr = original_hooks_vec[arena_ind]->alloc(extent_hooks, new_addr, size, alignment, zero, commit, arena_ind);
  if (ptr == nullptr) {
    return ptr;
  }

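  // *commit is in/out: after the call it reflects whether the extent actually
  // ended up committed, not just what was requested.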
  if (*commit) {
    amount_.fetch_add(size, std::memory_order_relaxed);
  } else {
    amount_virt_.fetch_add(size, std::memory_order_relaxed);
  }
  return ptr;
}

static bool my_dalloc(extent_hooks_t *extent_hooks, void *addr, size_t size, bool committed, unsigned arena_ind) {
  auto err = original_hooks_vec[arena_ind]->dalloc(extent_hooks, addr, size, committed, arena_ind);

  if (err) {
    extent_hook_stats.dalloc.error.fetch_add(1);
    return err;
  }

  if (committed) {
    amount_.fetch_sub(size, std::memory_order_relaxed);
  } else {
    amount_virt_.fetch_sub(size, std::memory_order_relaxed);
  }
  return false;
}

static void my_destroy(extent_hooks_t *extent_hooks, void *addr, size_t size, bool committed, unsigned arena_ind) {
  if (committed) {
    amount_.fetch_sub(size, std::memory_order_relaxed);
  } else {
    amount_virt_.fetch_sub(size, std::memory_order_relaxed);
  }
  original_hooks_vec[arena_ind]->destroy(extent_hooks, addr, size, committed, arena_ind);
}

static bool my_commit(extent_hooks_t *extent_hooks, void *addr, size_t size, size_t offset, size_t length,
                      unsigned arena_ind) {
  auto err = original_hooks_vec[arena_ind]->commit(extent_hooks, addr, size, offset, length, arena_ind);

  if (err) {
    return err;
  }
  // commit applies only to [offset, offset + length) within the extent,
  // so account for length rather than the full extent size
  amount_.fetch_add(length, std::memory_order_relaxed);
  amount_virt_.fetch_sub(length, std::memory_order_relaxed);

  return false;
}

static bool my_decommit(extent_hooks_t *extent_hooks, void *addr, size_t size, size_t offset, size_t length,
                        unsigned arena_ind) {
  // use the stored original hooks for this arena (old_hooks only points at
  // the last arena processed in SetHooks)
  auto err = original_hooks_vec[arena_ind]->decommit(extent_hooks, addr, size, offset, length, arena_ind);

  if (err) {
    return err;
  }

  // decommit also applies only to [offset, offset + length)
  amount_.fetch_sub(length, std::memory_order_relaxed);
  amount_virt_.fetch_add(length, std::memory_order_relaxed);

  return false;
}

static bool my_purge_forced(extent_hooks_t *extent_hooks, void *addr, size_t size, size_t offset, size_t length,
                            unsigned arena_ind) {
  auto err = original_hooks_vec[arena_ind]->purge_forced(extent_hooks, addr, size, offset, length, arena_ind);
  if (err) {
    return err;
  }
  // purging affects [offset, offset + length); the pages stay mapped
  amount_.fetch_sub(length, std::memory_order_relaxed);

  return false;
}

static bool my_purge_lazy(extent_hooks_t *extent_hooks, void *addr, size_t size, size_t offset, size_t length,
                          unsigned arena_ind) {
  // If memory is purged lazily (MADV_FREE rather than MADV_DONTNEED, i.e.
  // muzzy_decay_ms != 0), it is not reclaimed immediately, so we do not
  // adjust the counters here.
  // amount_.fetch_sub(size, std::memory_order_relaxed);
  return original_hooks_vec[arena_ind]->purge_lazy(extent_hooks, addr, size, offset, length, arena_ind);
}

static bool my_split(extent_hooks_t *extent_hooks, void *addr, size_t size, size_t size_a, size_t size_b,
                     bool committed, unsigned arena_ind) {
  return original_hooks_vec[arena_ind]->split(extent_hooks, addr, size, size_a, size_b, committed, arena_ind);
}

static bool my_merge(extent_hooks_t *extent_hooks, void *addr_a, size_t size_a, void *addr_b, size_t size_b,
                     bool committed, unsigned arena_ind) {
  return original_hooks_vec[arena_ind]->merge(extent_hooks, addr_a, size_a, addr_b, size_b, committed, arena_ind);
}


I am using jemalloc version 5.2.1. As you can see, my custom hooks are only thin wrappers around the default jemalloc hooks. purge_forced and decommit are called constantly, trying to unmap certain pages, but judging by the RES column in htop, nothing actually gets released.
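
As a cross-check for the hook-based counters, jemalloc's own statistics can be read back (a sketch, assuming --enable-stats; the epoch mallctl must be written first so that the stats.* values are refreshed):

void PrintJemallocStats() {
  // Advance the epoch so that the stats.* values below are up to date.
  uint64_t epoch = 1;
  size_t sz = sizeof(epoch);
  mallctl("epoch", &epoch, &sz, &epoch, sizeof(epoch));

  size_t allocated = 0, resident = 0, mapped = 0;
  sz = sizeof(size_t);
  mallctl("stats.allocated", &allocated, &sz, nullptr, 0);
  mallctl("stats.resident", &resident, &sz, nullptr, 0);
  mallctl("stats.mapped", &mapped, &sz, nullptr, 0);

  std::cout << "allocated=" << allocated << " resident=" << resident
            << " mapped=" << mapped << std::endl;
}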

To avoid lazy purging, I force jemalloc to use MADV_DONTNEED instead of MADV_FREE by setting both decay options to 0: muzzy_decay_ms:0 and dirty_decay_ms:0.
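
For reference, the same decay settings can also be applied at run time; a sketch assuming the per-arena dirty_decay_ms/muzzy_decay_ms mallctls together with the MALLCTL_ARENAS_ALL pseudo-index from jemalloc.h:

// Set both decay times to 0 for all arenas, i.e. purge immediately.
void DisableDecay() {
  ssize_t decay_ms = 0;
  const std::string prefix = "arena." + std::to_string(MALLCTL_ARENAS_ALL);
  mallctl((prefix + ".dirty_decay_ms").c_str(), nullptr, nullptr, &decay_ms, sizeof(decay_ms));
  mallctl((prefix + ".muzzy_decay_ms").c_str(), nullptr, nullptr, &decay_ms, sizeof(decay_ms));
}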

This is how I configure jemalloc:

 CFLAGS="-O1" CXXFLAGS="-O1" MALLOC_CONF="prof:true,retain:true,percpu_arena:percpu,oversize_threshold:1000000000000,muzzy_decay_ms:0,dirty_decay_ms:0" \
    ./configure \
        --disable-cxx \
        --enable-stats \
        --enable-debug \
        --enable-prof \
        --enable-shared=no --prefix=$PREFIX \
        --with-malloc-conf="prof:true,retain:true,percpu_arena:percpu,oversize_threshold:1000000000000,muzzy_decay_ms:0,dirty_decay_ms:0"

Also, by setting oversize_threshold to a huge value, I want to stop jemalloc from routing huge allocations to its dedicated oversize arena, since you cannot set a custom hook on that arena without modifying jemalloc itself.

From debugging, it seems that jemalloc's arena_decay_to_limit calls arena_decay_stashed, which calls extent_dalloc_wrapper, which first tries to decommit (and fails) and then falls back to purging the memory. I don't understand why that happens. It all starts with a call to free.

My question is: why does purge report success if nothing actually happens, and how should I adapt my program to count allocations correctly?
