xmalloc-test benchmark 40x slower with tcmalloc than with glibc's malloc #1448

Open
JulianMarcusSchroeder opened this issue Oct 3, 2023 · 1 comment

Comments

@JulianMarcusSchroeder

This is a corner case that may be of interest.
I am running on a 32-core AArch64 Neoverse V1 system with Ubuntu 22 and linux-5.15/linux-6.2.
The xmalloc-test benchmark runs 32 malloc threads and 32 free threads that concurrently allocate and free 409600 x 64 bytes.
Repro:
git clone https://github.com/daanx/mimalloc-bench
cd mimalloc-bench
sed -i "s/if defined(APPLE)/if 1/g" bench/alloc-test/test_common.h
./build-bench-env.sh tc bench
cd out/bench
SYSMALLOC=1 ./xmalloc-test -w 32 -t 5 -s 64
rtime: 1.085, free/sec: 92.164 M
LD_PRELOAD=/home/ubuntu/gperftools/.libs/libtcmalloc_minimal.so.4 ./xmalloc-test -w 32 -t 5 -s 64
rtime: 39.578, free/sec: 2.527 M
This is with gperftools-2.13.
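
For readers unfamiliar with the benchmark, the allocation pattern described above (one set of threads calling malloc, a separate set calling free on the same objects) can be sketched roughly as follows. This is not the xmalloc-test source. The `BatchQueue` type, the `run_handoff` function, and all of its parameters are hypothetical names chosen for illustration, and a plain mutex-protected queue is assumed for the handoff.

```cpp
#include <atomic>
#include <condition_variable>
#include <cstddef>
#include <cstdlib>
#include <deque>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper (not from xmalloc-test): a queue of pointer batches
// shared between allocating threads and freeing threads.
struct BatchQueue {
    std::mutex mu;
    std::condition_variable cv;
    std::deque<std::vector<void*>> batches;
    std::size_t producers_left = 0;
};

// Starts `workers` allocating threads and `workers` freeing threads.
// Every object is allocated on one thread and freed on a different one.
// Returns the total number of objects freed.
std::size_t run_handoff(std::size_t workers, std::size_t batches_per_worker,
                        std::size_t batch_size, std::size_t object_size) {
    BatchQueue q;
    q.producers_left = workers;
    std::atomic<std::size_t> freed{0};
    std::vector<std::thread> threads;

    for (std::size_t w = 0; w < workers; ++w) {
        // Allocating thread: builds batches of objects and hands them off.
        threads.emplace_back([&] {
            for (std::size_t b = 0; b < batches_per_worker; ++b) {
                std::vector<void*> batch;
                batch.reserve(batch_size);
                for (std::size_t i = 0; i < batch_size; ++i) {
                    if (void* p = std::malloc(object_size)) batch.push_back(p);
                }
                {
                    std::lock_guard<std::mutex> lock(q.mu);
                    q.batches.push_back(std::move(batch));
                }
                q.cv.notify_one();
            }
            {
                std::lock_guard<std::mutex> lock(q.mu);
                --q.producers_left;
            }
            q.cv.notify_all();  // wake freeing threads so they can exit
        });

        // Freeing thread: releases objects it did not allocate.
        threads.emplace_back([&] {
            for (;;) {
                std::vector<void*> batch;
                {
                    std::unique_lock<std::mutex> lock(q.mu);
                    q.cv.wait(lock, [&] {
                        return !q.batches.empty() || q.producers_left == 0;
                    });
                    if (q.batches.empty()) return;  // all producers finished
                    batch = std::move(q.batches.front());
                    q.batches.pop_front();
                }
                for (void* p : batch) std::free(p);
                freed.fetch_add(batch.size(), std::memory_order_relaxed);
            }
        });
    }

    for (auto& t : threads) t.join();
    return freed.load();
}
```

Calling something like `run_handoff(32, 1000, 100, 64)` from `main` would approximate the thread counts and the 64-byte object size used in the repro, though the real benchmark's batching and timing logic differ. The point of the sketch is only that memory never returns to the allocator on the thread that obtained it, which stresses whatever cross-thread path the allocator has.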

@alk
Contributor

alk commented Oct 22, 2023

Thanks for letting us know. I would argue that this specific benchmark is somewhat artificial, in that it has threads releasing bunches of objects into the same glibc malloc heap "partition". Our design is a non-partitioned heap, which actually works surprisingly well across lots of workloads but occasionally loses quite a bit, as highlighted by this workload.

But there are definitely lessons for us too. Thanks again.
