Note that the JVM's native memory usage can often be improved significantly just by switching to an alternative allocator such as jemalloc. In our system we saw native memory usage decrease by about 60% in some instances, and it also resolved a slow "leak" caused by glibc allocating memory and not returning it to the OS. In our case the trigger was opening a lot of class loaders, and hence zip files, from different threads.
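For anyone wanting to try this, here's a minimal sketch of swapping in jemalloc via `LD_PRELOAD`; the library path is the Debian/Ubuntu default and `app.jar` is a placeholder, so adjust both for your setup:

```shell
# Preload jemalloc so it replaces glibc malloc for the child process.
# Path is the Debian/Ubuntu location after `apt install libjemalloc2`;
# it differs on other distros. app.jar is a placeholder for your service.
export LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.2
java -jar app.jar
```

No JVM flags are needed; the preload applies to all native allocations in the process, including those made by the JVM itself and by JNI code.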
I can second what you wrote about jemalloc. Some internal services at Amazon use it with solid outcomes. I'd also recommend trying version 5.3.0, released earlier this year.
Last time I did benchmarking, the vast majority of memory allocations were strings that were typically dereferenced right away and cleaned up in the GEN1 GC. I had contemplated whether string pooling would be useful but never got around to testing it. It would be interesting to see whether you could reduce memory usage, and potentially improve performance, by decreasing pressure on the GC during the GEN1 phase.
(Side note: this was when I was co-maintaining MCPC, so it was typically with mods installed, and they heavily use NBT, which I suspect is where a lot of that string allocation was happening.)
This is very interesting. Could you share more details on this particular glibc issue? Jar files get memory-mapped, so I'm really curious where glibc failed to release memory.
Not the OP, but we had a similar issue — our service was leaking while allocating native memory through JNI. We onboarded jemalloc for its better debugging capabilities, but the leak disappeared and performance improved, so we never got around to root-causing the original leak.
For performance reasons, glibc may not return freed memory to the OS. You can encourage it to do so by reducing MALLOC_ARENA_MAX, e.g. to 2.
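In case it helps anyone, the tunable is just an environment variable that glibc reads at process startup; a minimal sketch (the `java -jar app.jar` launch line is a placeholder):

```shell
# Cap glibc malloc at 2 arenas. Fewer arenas means less fragmented,
# retained native memory; the 64-bit default scales as 8 * CPU cores.
export MALLOC_ARENA_MAX=2
java -jar app.jar   # placeholder for your actual launch command
```

The trade-off is more lock contention on the remaining arenas under heavily multithreaded allocation, so it's worth benchmarking before rolling out.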
https://github.com/prestodb/presto/issues/8993