Reduce memory footprint #667

Open
howardjohn opened this issue Aug 24, 2023 · 3 comments

@howardjohn (Member)
This issue tracks lowering the memory usage of ztunnel at large scales. I have been testing at 200k-1M pods.

Some raw notes:

get_by_namespaced_host
insert_endpoint: add endpoint
remove_endpoint: remove endpoint

From: 14% total
we clone the whole thing. 30% is the services hashmap that we ignore (we do use it later, so it's not totally wasted).
we can easily clone individual parts instead of the whole thing, and intern here. I bet we can drop the clone entirely though - just need to swap the hashmaps.
25% is string clones, so interning will help (sketch below)
20% is addresses, not sure how to handle that tbh
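
A minimal interning sketch (hypothetical types, not the real ztunnel code): each distinct string is stored once, and callers get back a cheap `Arc<str>` handle, so the 25% spent on string clones becomes refcount bumps.

```rust
use std::collections::HashSet;
use std::sync::Arc;

/// Minimal interner: store each distinct string once, hand out `Arc<str>`
/// handles (a pointer copy + refcount bump) instead of allocating a fresh
/// `String` on every clone.
#[derive(Default)]
struct Interner {
    strings: HashSet<Arc<str>>,
}

impl Interner {
    fn intern(&mut self, s: &str) -> Arc<str> {
        // `Arc<str>: Borrow<str>`, so we can look up by `&str` directly.
        if let Some(existing) = self.strings.get(s) {
            return existing.clone();
        }
        let interned: Arc<str> = Arc::from(s);
        self.strings.insert(interned.clone());
        interned
    }
}

fn main() {
    let mut interner = Interner::default();
    let a = interner.intern("foo.default.svc.cluster.local");
    let b = interner.intern("foo.default.svc.cluster.local");
    // Both handles point at the same allocation.
    assert!(Arc::ptr_eq(&a, &b));
}
```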
store insert: 14% on hashmap insert
insert_endpoint: 22%; 8% from service clone, 6% hashmap, 4% nsName clone
insert_svc: 40%
10% hashmap entry (sketch below)
10% hashset insert
10% hashmap insert
10% string clone on insert
10% hashmap iter
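
One common way to shave the entry/insert costs above (a sketch with a hypothetical index shape, not the actual insert_svc code) is the `entry` API, which hashes the key once for both the lookup and the insert, combined with interned keys so inserting doesn't clone a `String`:

```rust
use std::collections::{HashMap, HashSet};
use std::sync::Arc;

// Hypothetical index shape: service key -> set of endpoint uids.
// `entry` hashes the key once for both lookup and insert, and
// `or_default` only allocates the HashSet when the key is new.
fn index_endpoint(
    index: &mut HashMap<Arc<str>, HashSet<Arc<str>>>,
    service: &Arc<str>,
    endpoint_uid: &Arc<str>,
) {
    index
        .entry(service.clone()) // refcount bump if interned, not a string clone
        .or_default()
        .insert(endpoint_uid.clone());
}

fn main() {
    let mut index = HashMap::new();
    let svc: Arc<str> = Arc::from("default/foo");
    let ep: Arc<str> = Arc::from("default/foo-abc123");
    index_endpoint(&mut index, &svc, &ep);
    assert_eq!(index[&svc].len(), 1);
}
```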

1M pods, 66k svcs: 3 GB RSS in ztunnel
7s to process

heappy is so slow it cannot process all the xds in 3 min

endpoint_uid is a bit slow, and we don't need to make it a string... 4% total
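
A sketch of the "not a string" idea (field names are assumptions, not the real layout): derive `Hash`/`Eq` over the parts as a composite key instead of formatting a fresh `String` per endpoint.

```rust
use std::collections::HashMap;
use std::net::SocketAddr;
use std::sync::Arc;

// Hypothetical composite key: hashing the parts directly avoids
// allocating and formatting a `String` uid for every endpoint.
#[derive(Clone, PartialEq, Eq, Hash)]
struct EndpointUid {
    workload_uid: Arc<str>, // interned, so cloning is a refcount bump
    address: SocketAddr,
}

fn main() {
    let mut endpoints: HashMap<EndpointUid, ()> = HashMap::new();
    let uid = EndpointUid {
        workload_uid: Arc::from("Kubernetes//Pod/default/example"),
        address: "10.0.0.1:8080".parse().unwrap(),
    };
    endpoints.insert(uid, ());
    assert_eq!(endpoints.len(), 1);
}
```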

notify_on_demand is useless when on-demand is disabled. remove it. 1% CPU

insert_endpoint is 20%

a Workload is 528 bytes in Rust. quite large!
So 1M pods is 528 MB, which is basically what the memory profile shows

Service is 216 bytes

Endpoint is 168 bytes
port hashmap 48, NamespacedHostname 48, NetworkAddress 48, uid 24 (all bytes)
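
These sizes are easy to audit with `std::mem::size_of`; a sketch with stand-in shapes (not the real struct definitions) also shows how interned fields shrink them:

```rust
use std::mem::size_of;
use std::net::IpAddr;
use std::sync::Arc;

// Stand-in shapes, not the real definitions. A `String` field costs
// 24 bytes inline (ptr/cap/len); an `Arc<str>` costs 16 (ptr/len).
struct AddrOwned {
    network: String,
    address: IpAddr,
}

struct AddrInterned {
    network: Arc<str>,
    address: IpAddr,
}

fn main() {
    // At 1M pods, every byte on a per-endpoint struct is roughly 1 MB of RSS.
    println!("owned:    {} bytes", size_of::<AddrOwned>());    // typically 48 on 64-bit
    println!("interned: {} bytes", size_of::<AddrInterned>()); // typically 40 on 64-bit
}
```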

workload is behind an Arc, so we have 3 indexes but share the same workload
so the indexes take a lot: 60 MB from strings (keys), 200 MB from the maps
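
A sketch of that sharing (hypothetical store shape, not the real ztunnel types): each index holds a pointer-sized `Arc` plus its key, while the ~528-byte `Workload` itself is allocated once.

```rust
use std::collections::HashMap;
use std::net::SocketAddr;
use std::sync::Arc;

struct Workload {
    name: String,
    // ... in practice ~528 bytes of fields
}

// Hypothetical store shape: three lookup paths, one shared allocation.
#[derive(Default)]
struct WorkloadStore {
    by_uid: HashMap<Arc<str>, Arc<Workload>>,
    by_addr: HashMap<SocketAddr, Arc<Workload>>,
    by_hostname: HashMap<Arc<str>, Arc<Workload>>,
}

impl WorkloadStore {
    fn insert(&mut self, uid: Arc<str>, addr: SocketAddr, hostname: Arc<str>, w: Workload) {
        let w = Arc::new(w); // one allocation...
        self.by_uid.insert(uid, w.clone()); // ...each clone is just a refcount bump
        self.by_addr.insert(addr, w.clone());
        self.by_hostname.insert(hostname, w);
    }
}

fn main() {
    let mut store = WorkloadStore::default();
    store.insert(
        Arc::from("default/example"),
        "10.0.0.1:15008".parse().unwrap(),
        Arc::from("example.default"),
        Workload { name: "example".into() },
    );
    assert!(store.by_uid.len() == 1 && store.by_addr.len() == 1);
}
```

The maps and their keys still cost memory (the 60 MB + 200 MB above), which is why interned keys and dropping redundant indexes matter even with shared payloads.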

dropping service_endpoints saves 500 MB per pprof, unsure of the real usage

insert_endpoint is 36.7% now
mostly the triple hashmap. NSName clone is 7%. endpoint_uid is 3%

@hzxuzhonghu (Member)

IMO, if we migrate to real on-demand fetch, memory usage could be heavily reduced

@howardjohn (Member, Author)

Yes, agreed. Memory is basically a non-concern at that point - but it brings some new complexity

@wulianglongrd (Member)

Can you describe how you obtained this performance data? The starting point for optimization is probably getting familiar with performance evaluation, which may be a difficult point for most people - I am one of them.
