Tuesday, 28 July 2009

Java Heaps and Virtual Memory - Part 2

The story so far... in my earlier post I described a memory stress test that demonstrated the soundness of the advice to keep your Java heap size smaller than your physical memory.

I also wanted to check on the very limited explanations that I'd found and get a better understanding of what was going on. In particular, I wanted to know how Java (i.e. the Sun JVM) allocates heap memory and whether it adopts any strategies to avoid swap thrashing.

I did some further digging, using the various files under the /proc file system and the JDK source code, to find out.

The first surprise was in /proc/meminfo - the only counter that was going up significantly during the test was 'Mapped' - i.e. memory mapped files. I was expecting this approach to be used for reading in .jar files and native libraries (and it was), but I wasn't expecting it for the heap. Digging into the source code explains why - the JDK uses the mmap system call to request more heap memory from the O/S.
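For anyone wanting to watch the same counter, here is a minimal sketch of pulling a single value out of /proc/meminfo. It is not the tooling used for the test; the class name MeminfoMapped and the method counterKb are made up for illustration, and it assumes the usual "Name:   value kB" line layout.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

public class MeminfoMapped {
    /** Returns the value (in kB) of the named counter, or -1 if it is absent. */
    public static long counterKb(List<String> lines, String name) {
        String prefix = name + ":";
        for (String line : lines) {
            if (line.startsWith(prefix)) {
                // Lines are assumed to look like "Mapped:          123456 kB"
                String[] parts = line.substring(prefix.length()).trim().split("\\s+");
                return Long.parseLong(parts[0]);
            }
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        Path meminfo = Paths.get("/proc/meminfo");
        if (!Files.isReadable(meminfo)) {
            System.out.println("/proc/meminfo is not available on this system");
            return;
        }
        List<String> lines = Files.readAllLines(meminfo);
        System.out.println("Mapped kB: " + counterKb(lines, "Mapped"));
    }
}
```

Running it every few seconds while the stress test is going should show whether 'Mapped' is the counter that moves on your kernel.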

I also took several snapshots of /proc/PID/smaps to see exactly what memory regions were being used in the process's address space. What this showed was:-
  1. There was a memory region (in my case starting from 0x51840000) that was clearly growing as the app allocated more and more heap. During the early part of the app's execution this would show up with a resident size 7MB less than its overall size and with all of the resident pages showing up as being dirty.
  2. Once memory started to become scarce, many of the other memory regions started to show a reduction in their resident sizes and their shared sizes.
  3. Once swap thrashing was happening, the memory region which had been growing still had a 6-7MB difference between the resident size and the allocated size. The big difference, however, was that 15MB of the space was now showing up as 'Private_Clean'.
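The snapshots above were compared by eye, but the same figures can be extracted programmatically. The following is a rough sketch only: the class SmapsRegion and its region method are hypothetical names, and it assumes the common smaps layout of an address-range header line followed by "Field:   value kB" lines.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SmapsRegion {
    /** Collects the kB counters of the mapping whose start address (hex, no 0x) matches. */
    public static Map<String, Long> region(List<String> smapsLines, String startHex) {
        Map<String, Long> fields = new LinkedHashMap<>();
        boolean inRegion = false;
        for (String line : smapsLines) {
            if (line.matches("^[0-9a-f]+-[0-9a-f]+ .*")) {
                // Header line, e.g. "51840000-60000000 rw-p 00000000 00:00 0"
                inRegion = line.startsWith(startHex + "-");
            } else if (inRegion && line.endsWith(" kB")) {
                String[] kv = line.split(":\\s+");
                if (kv.length == 2) {
                    fields.put(kv[0], Long.parseLong(kv[1].replace(" kB", "").trim()));
                }
            }
        }
        return fields;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: SmapsRegion <pid> <start-address-hex>");
            return;
        }
        Path smaps = Paths.get("/proc", args[0], "smaps");
        if (!Files.isReadable(smaps)) {
            System.out.println(smaps + " is not readable");
            return;
        }
        Map<String, Long> f = region(Files.readAllLines(smaps), args[1]);
        System.out.println(f);
        if (f.containsKey("Size") && f.containsKey("Rss")) {
            // The gap between allocated and resident size discussed above
            System.out.println("Size - Rss (kB): " + (f.get("Size") - f.get("Rss")));
        }
    }
}
```

Taking the output at intervals and diffing Rss, Private_Clean and Private_Dirty for the growing region gives the same picture as the manual snapshots.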
So what does it all mean? Here's what I think is happening...
  1. During the early stages of execution the app is getting as much memory as it asks for but Linux delays giving it real physical memory until the specific pages are really accessed. This explains why the resident size is less than the allocated size - Java has probably extended the heap but hasn't yet accessed all of the allocated space.
  2. Memory is getting scarce, so Linux starts to reclaim pages that have the smallest impact. In the first instance it is hunting around for less critical pages (e.g. pages of jar files or libraries that haven't been used recently) that it can reclaim.
  3. This behaviour surprised me a little - I was expecting the resident size of the heap to have reduced, but this doesn't seem to have happened. What we can see is that part of the heap is now 'clean' - this tells us that Linux has indeed flushed part of the heap out to the swap file. The fact that the resident size has not reduced significantly tells us that we aren't getting much benefit - basically I think that the swapper is trying to swap pages out but the garbage collector is pulling them all back in again.
Finally I went back to the JDK sources again to see whether these would help me to understand what was going on. What I really wanted to understand was where the per-object data used by the garbage collector resides. The answer appears to be that it resides at the beginning of the memory block allocated to the object itself. The implication of this is that with 'normal' sized objects, a garbage collector run will need to access a few fields at the start of every single object on the heap, thus generating read and write accesses to practically every page contained in the heap.
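To put a rough number on that, here is a back-of-envelope sketch. It assumes a simplified model that is not taken from the JDK sources: objects of one uniform size packed back to back, every object visited once, and a 4096-byte page. The class and method names are invented for the example.

```java
public class HeaderPageEstimate {
    /**
     * Estimates how many pages contain at least one object header,
     * assuming equally sized objects laid out contiguously.
     */
    public static long pagesWithHeaders(long heapBytes, long objectBytes, long pageBytes) {
        long totalPages = (heapBytes + pageBytes - 1) / pageBytes;
        long objects = heapBytes / objectBytes;
        // Objects smaller than a page put a header on every page;
        // larger objects put a header on one page per object.
        return Math.min(totalPages, objects);
    }

    public static void main(String[] args) {
        long heap = 1L << 30;   // 1 GiB heap (illustrative figure)
        long page = 4096;       // assumed page size
        System.out.println("64-byte objects: " + pagesWithHeaders(heap, 64, page) + " pages");
        System.out.println("1 MiB objects:   " + pagesWithHeaders(heap, 1L << 20, page) + " pages");
    }
}
```

Under these assumptions a heap of small objects has a header on every one of its pages, whereas a heap of very large objects would only need a small fraction of its pages touched for the header accesses.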

So it would seem to me that the design choices in terms of the in-memory layout of objects and their garbage collector data mean that the garbage collectors really do conflict with the swapper once memory becomes tight. Based on the simple test that I did earlier, this happens both suddenly and with a severe impact.

In real life situations there may be several other Java and non-Java apps running on the same machine. I think this has a couple of implications:-
  1. The requirements of other apps may mean that memory becomes scarce much sooner - i.e. well before your Java heap size reaches the amount of physical memory.
  2. The swapper is not redundant - there may be plenty of 'low risk' pages belonging to other apps (or JAR mappings used only at startup time) that can be swapped out before the system gets to the point of swap thrashing.

Java Heaps and Virtual Memory - Part 1

Virtual memory has been around for a long time - Wikipedia reckons that it was first introduced in the early 1960s and it's still with us today. When we start using Java for large scale applications, however, it seems that virtual memory is perhaps not such a good thing. Several sources around the Internet recommend sizing the Java heap so that it fits within physical memory. The reason given is that the Java garbage collector likes to visit every page, so if some pages have been swapped out the GC will take a long time to run.

This question has cropped up on several occasions in my current project. While I have no reason to disagree with the advice on heap sizing, I was a little uncomfortable that I hadn't seen much real evidence to back it up or indication of how bad things would be once the limit was reached, so I decided to find out for myself.

The first thing I tried was creating a simple Java class to stress the heap. This class progressively populates an ArrayList with a large number of Java objects, each owning five 80-byte random strings. It can also be asked to 'churn' the objects by selecting and replacing groups of them, thus making the old ones eligible for garbage collection. I ran this on a small Linux box and watched what happened using 'top' and 'vmstat' ...
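The original class isn't reproduced here, so the following is only a reconstruction from the description above. The names (HeapStress, Item, fill, churn), the batch sizes and the command-line arguments are all assumptions, and "80 byte" is interpreted as 80 characters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class HeapStress {
    static final int STRINGS_PER_ITEM = 5;
    static final int STRING_LENGTH = 80;   // 80 characters per string

    /** One heap object owning five random strings. */
    public static final class Item {
        public final String[] strings = new String[STRINGS_PER_ITEM];

        Item(Random rnd) {
            for (int i = 0; i < strings.length; i++) {
                strings[i] = randomString(rnd, STRING_LENGTH);
            }
        }
    }

    static String randomString(Random rnd, int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append((char) ('a' + rnd.nextInt(26)));
        }
        return sb.toString();
    }

    /** Adds count new items, growing the live heap. */
    public static void fill(List<Item> items, int count, Random rnd) {
        for (int i = 0; i < count; i++) {
            items.add(new Item(rnd));
        }
    }

    /** Replaces a randomly placed group of items, leaving the old ones as garbage. */
    public static void churn(List<Item> items, int groupSize, Random rnd) {
        if (items.isEmpty()) {
            return;
        }
        int start = rnd.nextInt(items.size());
        int end = Math.min(items.size(), start + groupSize);
        for (int i = start; i < end; i++) {
            items.set(i, new Item(rnd));
        }
    }

    public static void main(String[] args) {
        // Deliberately small defaults; raise them (and -Xmx) to stress a real machine.
        int batches = args.length > 0 ? Integer.parseInt(args[0]) : 10;
        int batchSize = args.length > 1 ? Integer.parseInt(args[1]) : 1000;
        boolean churnEnabled = args.length > 2 && Boolean.parseBoolean(args[2]);
        Random rnd = new Random();
        List<Item> items = new ArrayList<>();
        Runtime rt = Runtime.getRuntime();
        for (int b = 0; b < batches; b++) {
            fill(items, batchSize, rnd);
            if (churnEnabled) {
                churn(items, batchSize / 10, rnd);
            }
            long usedMb = (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024);
            System.out.println("items=" + items.size() + " usedMB=" + usedMb);
        }
    }
}
```

Something like `java -Xmx2g HeapStress 100000 1000 true`, with the heap limit set above physical memory, would be one way to try to reproduce the behaviour while watching 'top' and 'vmstat' in another terminal.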

What I found was this...
  1. While there was plenty of free memory, the resident size of the process grew.
  2. Once free memory became short, the shared size started to shrink.
  3. Very soon after that, the swap file usage started to grow.
  4. If the 'churn' feature of the stress test was enabled, the system quickly got into heavy swap thrashing and the stress test ground to a halt.
  5. With no churn (probably not realistic for most real apps), the app could get a little further, but not much, and would still get into swap thrashing.
My original intention was to capture some numbers and draw a graph or two to illustrate what happens. In practice what I found was that the results were rather variable, even on the same machine. In every case though there was a point soon after swapping started where the test tipped dramatically into swap thrashing and was unable to make any further progress.

I drew two conclusions from my simple test:-
  1. The advice to keep the Java heap smaller than physical memory is very sound.
  2. The degradation in performance if you let your Java heap grow bigger than physical memory is both sudden and severe.
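One way to act on the first conclusion is to compare the JVM's maximum heap with the machine's physical memory at startup. This is a minimal sketch, not part of the original test: HeapFitCheck and heapFits are hypothetical names, and the com.sun.management interface it relies on is specific to Sun/Oracle/OpenJDK JVMs (and deprecated in favour of a newer method in recent releases).

```java
import java.lang.management.ManagementFactory;

public class HeapFitCheck {
    /** True if the maximum heap is strictly smaller than physical memory. */
    public static boolean heapFits(long maxHeapBytes, long physicalBytes) {
        return maxHeapBytes < physicalBytes;
    }

    public static void main(String[] args) {
        long maxHeap = Runtime.getRuntime().maxMemory();   // reflects -Xmx or the default
        java.lang.management.OperatingSystemMXBean os =
                ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.OperatingSystemMXBean) {
            long physical = ((com.sun.management.OperatingSystemMXBean) os)
                    .getTotalPhysicalMemorySize();
            System.out.println("max heap bytes: " + maxHeap);
            System.out.println("physical bytes: " + physical);
            System.out.println("heap fits in physical memory: " + heapFits(maxHeap, physical));
        } else {
            System.out.println("physical memory size is not available on this JVM");
        }
    }
}
```

A strict comparison is only a lower bar, since the JVM uses memory beyond the heap and other processes need memory too, so in practice some headroom is wise.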
I also wanted to check on the very limited explanations that I'd found and get a better understanding of what was going on. In particular, I wanted to know how Java (i.e. the Sun JVM) allocates heap memory and whether it adopts any strategies to avoid swap thrashing. I'll save this for a later post.