Reduce memory footprint #667

Open
howardjohn opened this issue Aug 24, 2023 · 3 comments

@howardjohn (Member)
This issue tracks lowering the memory usage of ztunnel at large scales. I have been testing at 200k-1M pods.

Some raw notes:

get_by_namespaced_host
insert_endpoint: add endpoint
remove_endpoint: remove endpoint

From: 14% total
we clone the whole thing. 30% is the services hashmap that we ignore (we do use it later, so it's not totally wasted).
we can easily clone individual parts instead of the whole thing, and intern here. I bet we can drop the clone entirely though - just need to swap the hashmaps.
25% is string clones, so interning will help (sketch below)
20% is addresses, not sure how to handle that tbh
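
A minimal interning sketch (hypothetical types, not the real ztunnel code): each distinct string is stored once, and callers get back a cheap `Arc<str>` handle, so the 25% spent on string clones becomes refcount bumps.

```rust
use std::collections::HashSet;
use std::sync::Arc;

/// Minimal interner: store each distinct string once, hand out `Arc<str>`
/// handles (a pointer copy + refcount bump) instead of allocating a fresh
/// `String` on every clone.
#[derive(Default)]
struct Interner {
    strings: HashSet<Arc<str>>,
}

impl Interner {
    fn intern(&mut self, s: &str) -> Arc<str> {
        // `Arc<str>: Borrow<str>`, so we can look up by `&str` directly.
        if let Some(existing) = self.strings.get(s) {
            return existing.clone();
        }
        let interned: Arc<str> = Arc::from(s);
        self.strings.insert(interned.clone());
        interned
    }
}

fn main() {
    let mut interner = Interner::default();
    let a = interner.intern("foo.default.svc.cluster.local");
    let b = interner.intern("foo.default.svc.cluster.local");
    // Both handles point at the same allocation.
    assert!(Arc::ptr_eq(&a, &b));
}
```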
store insert: 14% on hashmap insert
insert_endpoint: 22%; 8% from service clone, 6% hashmap, 4% nsName clone
insert_svc: 40%
10% hashmap entry (sketch below)
10% hashset insert
10% hashmap insert
10% string clone on insert
10% hashmap iter
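
One common way to shave the entry/insert costs above (a sketch with a hypothetical index shape, not the actual insert_svc code) is the `entry` API, which hashes the key once for both the lookup and the insert, combined with interned keys so inserting doesn't clone a `String`:

```rust
use std::collections::{HashMap, HashSet};
use std::sync::Arc;

// Hypothetical index shape: service key -> set of endpoint uids.
// `entry` hashes the key once for both lookup and insert, and
// `or_default` only allocates the HashSet when the key is new.
fn index_endpoint(
    index: &mut HashMap<Arc<str>, HashSet<Arc<str>>>,
    service: &Arc<str>,
    endpoint_uid: &Arc<str>,
) {
    index
        .entry(service.clone()) // refcount bump if interned, not a string clone
        .or_default()
        .insert(endpoint_uid.clone());
}

fn main() {
    let mut index = HashMap::new();
    let svc: Arc<str> = Arc::from("default/foo");
    let ep: Arc<str> = Arc::from("default/foo-abc123");
    index_endpoint(&mut index, &svc, &ep);
    assert_eq!(index[&svc].len(), 1);
}
```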

1M pods, 66k svcs: 3 GB RSS in ztunnel
7s to process

heappy is so slow it cannot process all the xds in 3 min

endpoint_uid is a bit slow, and we don't need to make it a string... 4% total
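
A sketch of the "not a string" idea (field names are assumptions, not the real layout): derive `Hash`/`Eq` over the parts as a composite key instead of formatting a fresh `String` per endpoint.

```rust
use std::collections::HashMap;
use std::net::SocketAddr;
use std::sync::Arc;

// Hypothetical composite key: hashing the parts directly avoids
// allocating and formatting a `String` uid for every endpoint.
#[derive(Clone, PartialEq, Eq, Hash)]
struct EndpointUid {
    workload_uid: Arc<str>, // interned, so cloning is a refcount bump
    address: SocketAddr,
}

fn main() {
    let mut endpoints: HashMap<EndpointUid, ()> = HashMap::new();
    let uid = EndpointUid {
        workload_uid: Arc::from("Kubernetes//Pod/default/example"),
        address: "10.0.0.1:8080".parse().unwrap(),
    };
    endpoints.insert(uid, ());
    assert_eq!(endpoints.len(), 1);
}
```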

notify_on_demand is useless when on-demand is disabled. remove it. 1% CPU

insert_endpoint is 20%

a Workload is 528 bytes in Rust. quite large!
So 1M pods is 528 MB, which is basically what the memory profile shows

Service is 216 bytes

Endpoint is 168 bytes
port hashmap 48, NamespacedHostname 48, NetworkAddress 48, uid 24 (all bytes)
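
These sizes are easy to audit with `std::mem::size_of`; a sketch with stand-in shapes (not the real struct definitions) also shows how interned fields shrink them:

```rust
use std::mem::size_of;
use std::net::IpAddr;
use std::sync::Arc;

// Stand-in shapes, not the real definitions. A `String` field costs
// 24 bytes inline (ptr/cap/len); an `Arc<str>` costs 16 (ptr/len).
struct AddrOwned {
    network: String,
    address: IpAddr,
}

struct AddrInterned {
    network: Arc<str>,
    address: IpAddr,
}

fn main() {
    // At 1M pods, every byte on a per-endpoint struct is roughly 1 MB of RSS.
    println!("owned:    {} bytes", size_of::<AddrOwned>());    // typically 48 on 64-bit
    println!("interned: {} bytes", size_of::<AddrInterned>()); // typically 40 on 64-bit
}
```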

workload is behind an Arc, so we have 3 indexes but share the same workload
so the indexes take a lot: 60 MB from strings (keys), 200 MB from the maps
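
A sketch of that sharing (hypothetical store shape, not the real ztunnel types): each index holds a pointer-sized `Arc` plus its key, while the ~528-byte `Workload` itself is allocated once.

```rust
use std::collections::HashMap;
use std::net::SocketAddr;
use std::sync::Arc;

struct Workload {
    name: String,
    // ... in practice ~528 bytes of fields
}

// Hypothetical store shape: three lookup paths, one shared allocation.
#[derive(Default)]
struct WorkloadStore {
    by_uid: HashMap<Arc<str>, Arc<Workload>>,
    by_addr: HashMap<SocketAddr, Arc<Workload>>,
    by_hostname: HashMap<Arc<str>, Arc<Workload>>,
}

impl WorkloadStore {
    fn insert(&mut self, uid: Arc<str>, addr: SocketAddr, hostname: Arc<str>, w: Workload) {
        let w = Arc::new(w); // one allocation...
        self.by_uid.insert(uid, w.clone()); // ...each clone is just a refcount bump
        self.by_addr.insert(addr, w.clone());
        self.by_hostname.insert(hostname, w);
    }
}

fn main() {
    let mut store = WorkloadStore::default();
    store.insert(
        Arc::from("default/example"),
        "10.0.0.1:15008".parse().unwrap(),
        Arc::from("example.default"),
        Workload { name: "example".into() },
    );
    assert!(store.by_uid.len() == 1 && store.by_addr.len() == 1);
}
```

The maps and their keys still cost memory (the 60 MB + 200 MB above), which is why interned keys and dropping redundant indexes matter even with shared payloads.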

dropping service_endpoints saves 500 MB per pprof, unsure of the real usage

insert_endpoint is 36.7% now
mostly the triple hashmap. NSName clone is 7%. endpoint_uid is 3%

@hzxuzhonghu (Member)

IMO, if we migrate to real on-demand fetch, memory usage could be heavily reduced

@howardjohn (Member, Author)

Yes, agreed. Memory is basically a non-concern at that point - but it brings some new complexity

@wulianglongrd (Member)

Can you describe how you obtained this performance data? The starting point for optimization is probably getting familiar with performance evaluation, which may be a difficult point for most people - I am one of them.
