Intermediate SQL Color Coded SQL, UNIX and Database Essays

23 Jul 2010

How AIX Paging Space works. Part 1: Why your program memory footprint gets bloated sometimes

If you recall the AIX memory saga from my previous “AIX memory” posts (Process memory, SGA Memory), one of the points was that, at any given moment during the lifetime of a process (or shared memory segment), each allocated memory page resides in one of two places:

  • Either the page is located in physical RAM
  • OR the page is located in paging space

The in-memory location is, obviously, preferred. If I were a memory page I would WANT to be there 🙂 In paging space ? Not so much, but, of course, I could be forced there if I’m not popular enough and AIX needs memory for other things.

As you can see, logically, “page location” is an exclusive EITHER/OR deal – either the page is in memory OR it is in paging space – it can never be “in between”. In a way, virtual memory “flows” between physical memory and the paging space and because of that, the following approximate formula should work:

Simple Memory Allocation Formula

Virtual (memory requested by the process) = InMemory + InPagingSpace

This formula is very logical and very easy to understand … and would even work in a perfect world (that does not require memory optimizations).

Unfortunately, in our world we may get stuck with memory allocations that look like this:

---------- -- ---------- ---------- ---------- ----------
Vsid       Pg   InMem       Pin       Paging     Virtual
---------- -- ---------- ---------- ---------- ----------
3d0778      s       0.00       0.00       0.00       0.00
...
37876d      s     255.28       0.00     221.55     255.94
36076e      s     255.31       0.00     219.55     255.94
358769      s     255.33       0.00     220.97     255.94
---------- -- ---------- ---------- ---------- ----------
TOTAL:           7796.02       0.00    6550.62   7914.35

Requested SGA: 8192.02 Segments: 33

Here, we obviously have a problem with our calculation as 7,796 + 6,550 = 14,346, which is much larger than the virtual size of: 7,914.
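The mismatch is easy to check mechanically. Here is a minimal Python sketch that plugs the TOTAL row into the simple formula. The function name `formula_gap` is made up for illustration, and I am assuming that all three columns are in the same unit (Mb, as the listing suggests).

```python
def formula_gap(in_mem, paging, virtual):
    """Return how far InMemory + InPagingSpace overshoots Virtual."""
    return in_mem + paging - virtual

# TOTAL row from the listing above (assumed to be in Mb)
gap = formula_gap(in_mem=7796.02, paging=6550.62, virtual=7914.35)
print(f"InMem + Paging exceeds Virtual by {gap:.2f}")
```

If the simple formula held, the gap would be close to zero. Here it is almost as large as the paging column itself.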

So, what happened ? Where is the memory ? And how much of it are we really using ?

How AIX Paging Space works


The answer to the “what happened” question sends us again to the AIX bag of tricks.

In short:

In AIX, paging space may not be released when the pages are read back (PAGED IN) into memory

In this post, to show how AIX paging space works in detail, I’m going to use a few small C programs that allocate an AIX shared memory segment (mimicking ORACLE SGA), but the same logic should apply to any AIX computational memory that is regularly managed and is eligible for page out (*). So, let’s begin, shall we ?

(*) And here is an example of not regularly managed AIX memory

Step 0. Setting the environment …

To see how paging space behaves, we clearly need to use it. That means that we have to create system conditions that will encourage paging … or, in other words, our programs should request (a lot) more memory than available in the system …

First of all, so as not to make our job too difficult, let’s artificially limit the amount of available memory to ~ 3 Gb

AIX> rmss -c 3000
Simulated memory size changed to 3000 Mb.

Next, to the test programs. I’m going to need 3 of them:

  1. shmalloc: A program to allocate shared memory. We will use it to mimic ORACLE SGA as well as (in a separate instance) to create “external” workload and push “SGA” into paging space
  2. shmwrite: A ‘writer’ program to change SGA memory, to represent UPDATE operations
  3. shmread: A ‘reader’ program to read SGA memory and emulate SELECT operations

You can find the source code for these programs in the TOOLS section.

Ok, now all the preparations are made, so let’s start the test.

Step 1: Starting up “SGA” …

# Here we are "Creating ORACLE SGA" 1.2 Gb in size
AIX> shmalloc 1200000000 1
Shared memory segment: 98566147 (1200000000) has been CREATED
Shared memory segment: 98566147 has been ATTACHED.
Shared memory segment: 98566147 has been POPULATED.
Initial allocated size: 180000000 bytes.
Now sleeping for 6000 seconds ...

# Let's double check that shared memory segment has really been created
AIX> ipcs -bm | grep 98566147
m  98566147 0x01035077 --rw-------   oracle      dba 1200000000

# And finally let's see where "SGA" memory is
AIX> ps -fu oracle | grep shmalloc
  oracle 192654 577598   0 09:48:25  pts/0  0:00 shmalloc 1200000000 1

AIX> svmon -P 192654 | grep shmat
    Vsid      Esid Type Description              PSize  Inuse   Pin Pgsp Virtual
   6878f  70000000 work default shmat/mmap           s  43946     0    0 43946
   48a0b  70000003 work default shmat/mmap           s      0     0    0     0
   7898d  70000001 work default shmat/mmap           s      0     0    0     0
   689ef  70000004 work default shmat/mmap           s      0     0    0     0
   68a0f  70000002 work default shmat/mmap           s      0     0    0     0

This is the first snapshot of our “SGA” right after it started up. Notice that right now it uses ~ 180 Mb of physical RAM (43946 pages of 4k each) and not the full 1.2 Gb that was requested. This is because, as you might recall, AIX does not allocate memory until it is actually used (and at the moment, we are only using 180 Mb).
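svmon reports sizes in pages, so it helps to have the conversion handy. A small Python sketch, assuming 4 KB pages (the “s” in the PSize column); the helper names are my own:

```python
import math

PAGE_SIZE = 4096  # bytes; "s" in the svmon PSize column denotes 4 KB pages

def pages_to_bytes(pages):
    """Size in bytes of the given number of 4 KB pages."""
    return pages * PAGE_SIZE

def bytes_to_pages(nbytes):
    """Pages needed to hold nbytes; a partially used page still counts as a whole page."""
    return math.ceil(nbytes / PAGE_SIZE)

print(pages_to_bytes(43946))      # Inuse of segment 6878f, in bytes
print(bytes_to_pages(180000000))  # pages needed for the initial 180000000 bytes
```

The Inuse value of the first segment lines up with the “Initial allocated size” that shmalloc printed, rounded up to a whole page.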

[Figure: “SGA” memory snapshot after startup, with ~ 180 Mb in memory and nothing in paging space]

Step 2: Loading “SGA” into memory … All “SGA” memory is “in memory”

Let’s load more of the “SGA” into memory … We are going to use the shmwrite program for that (with a real ORACLE instance you could use, e.g., a FULL TABLE SCAN of a large table … at least if you are still on ORACLE 10g).

AIX> shmwrite 1200000000 1 A
SHMGET successful for: 98566147
SHMAT successful for: 98566147
Memory space has been WRITTEN: 1200000000 bytes, symbol: A

After updating “SGA” memory it is now fully allocated. Moreover, as our “SGA” is still the only game in town, it fits entirely in the available physical RAM:

AIX> svmon -P 192654 | grep shmat
   6878f  70000000 work default shmat/mmap           s  65536     0    0 65536
   68a0f  70000002 work default shmat/mmap           s  65536     0    0 65536
   7898d  70000001 work default shmat/mmap           s  65536     0    0 65536
   48a0b  70000003 work default shmat/mmap           s  65536     0    0 65536
   689ef  70000004 work default shmat/mmap           s  30825     0    0 30825

[Figure: overall “SGA” memory picture, with all 1.2 Gb in memory and nothing in paging space]

Notice a subtle thing here: No paging space has been allocated as of yet !

You might wonder why it is “subtle”, after all, obviously, there is still plenty of “real” memory available …

But what seems obvious right now is, in fact, a fairly recent improvement in AIX paging space behavior. Only since AIX 5.1 did IBM introduce a deferred paging space allocation policy that does not ALLOCATE paging space until the system needs to PAGE OUT.

In older versions, with the then-default late paging space allocation policy, AIX would not USE paging space until PAGE OUT, but would ALLOCATE it anyway. So, before AIX 5.1, the same snapshot would already show paging space allocated for these pages, even though none of them had been paged out.

Step 3: Creating “external” memory pressure to force “SGA” pages to PAGE OUT

Ok, moving on … Now, let’s simulate some workload and create a huge memory pressure for the system. This will hopefully make AIX page out some of the “SGA” pages…

To simulate “external memory pressure” let’s request lots of RAM by another process.

First of all, how much memory do we have right now ?

AIX> svmon -G
               size       inuse        free         pin     virtual     stolen
memory       798420      522077      276343      179670      590293    1167660

That would be: 276343 * 4k or ~ 1.1 Gb of available RAM. To be sure, let’s request 2 Gb now. Hopefully, this will force some of our “SGA” out to the paging space.

AIX> shmalloc 2000000000 2
Shared memory segment: 368050180 (2000000000) has been CREATED
Shared memory segment: 368050180 has been ATTACHED.
Shared memory segment: 368050180 has been POPULATED.
Initial allocated size: 300000000 bytes.
Now sleeping for 6000 seconds ...

AIX> shmwrite 2000000000 2 B
SHMGET successful for: 368050180
SHMAT successful for: 368050180
Memory space has been WRITTEN: 2000000000 bytes, symbol: B

Since our “SGA” was idle and thus represented older workload, it must give way … and it does:

AIX> svmon -P 192654 | grep shmat
   68a0f  70000002 work default shmat/mmap           s  60362     0 5174 65536
   7898d  70000001 work default shmat/mmap           s  53229     0 12307 65536
   6878f  70000000 work default shmat/mmap           s      0     0 65536 65536
   48a0b  70000003 work default shmat/mmap           s      0     0 65536 65536
   689ef  70000004 work default shmat/mmap           s      0     0 30825 30825

A significant portion of the “SGA” memory (roughly 60% of it) was moved to the paging space.

Note that our simple memory estimation formula still works as Virtual (that is: overall requested) memory size is equal to “in memory” + “in paging space”. In other words, memory more or less cleanly “flows” from “in memory” to “in paging space” location.
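To double check that claim for every segment, and not just by eyeballing a line or two, here is a short Python sketch. The numbers are copied by hand from the svmon output above; the variable names are mine.

```python
# (Vsid, Inuse, Pgsp, Virtual) in 4 KB pages, copied from the svmon output above
segments = [
    ("68a0f", 60362, 5174, 65536),
    ("7898d", 53229, 12307, 65536),
    ("6878f", 0, 65536, 65536),
    ("48a0b", 0, 65536, 65536),
    ("689ef", 0, 30825, 30825),
]

# The simple formula holds when Inuse + Pgsp == Virtual for every segment
formula_holds = all(inuse + pgsp == virtual for _, inuse, pgsp, virtual in segments)

total_pgsp = sum(pgsp for _, _, pgsp, _ in segments)
total_virtual = sum(virtual for _, _, _, virtual in segments)
paged_out_share = total_pgsp / total_virtual

print(formula_holds)
print(f"{paged_out_share:.0%} of the SGA pages are in paging space")
```

At this point every page is counted exactly once, either in memory or in paging space.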

Step 4: Bringing “SGA” back in memory FOR UPDATE

We’ve seen how memory and paging space behave on PAGE OUTs … Now, let’s see how they behave on PAGE INs …

To do that, let’s kill the “outside workload” (so that the system once again has lots of available memory) and request “SGA” pages back in RAM by requesting another update on them:

# Killing "external memory workload"
AIX> ipcrm -m 368050180

# Let's check the status of our "SGA" BEFORE update
AIX> svmon -P 192654 | grep shmat
   68a0f  70000002 work default shmat/mmap           s  60362     0 5174 65536
   7898d  70000001 work default shmat/mmap           s  53229     0 12307 65536
   6878f  70000000 work default shmat/mmap           s      0     0 65536 65536
   48a0b  70000003 work default shmat/mmap           s      0     0 65536 65536
   689ef  70000004 work default shmat/mmap           s      0     0 30825 30825

# Now, let's update some of the "SGA"
# (this will recall "SGA" memory pages back in memory) ...
AIX> shmwrite 480000000 1 C
SHMGET successful for: 98566147
SHMAT successful for: 98566147
Memory space has been WRITTEN: 480000000 bytes, symbol: C

# And now, let's check "SGA" state AFTER update
AIX> svmon -P 192654 | grep shmat
   7898d  70000001 work default shmat/mmap           s  65536     0    0 65536
   6878f  70000000 work default shmat/mmap           s  65536     0    0 65536
   68a0f  70000002 work default shmat/mmap           s  60362     0 5174 65536
   48a0b  70000003 work default shmat/mmap           s      0     0 65536 65536
   689ef  70000004 work default shmat/mmap           s      0     0 30825 30825

Ok, the memory “flowed” again. 100% of our memory is still accounted for by our simple memory allocation formula, e.g.: 60362 + 5174 = 65536. Our “SGA” memory is still roughly a zero sum game …

It looks like paging space is deallocated once the associated pages are read into memory during an UPDATE operation.

This paging space behavior, by the way, is actually another major feature that became available only with AIX 5.3. Paging space deallocation is controlled by a garbage collection mechanism, which, by default, works only during PAGE IN operations …

Step 5: Bringing “SGA” back in memory FOR READ

We’ve seen that paging space is deallocated when memory pages are brought back in memory for UPDATE … Would the same hold true if we simply READ ? Let’s find out …

First, let’s repeat the steps to spill “SGA” out to paging space again:

AIX> shmalloc 2000000000 2
Shared memory segment: 371195908 (2000000000) has been CREATED
Shared memory segment: 371195908 has been ATTACHED.
Shared memory segment: 371195908 has been POPULATED.
Initial allocated size: 300000000 bytes.
Now sleeping for 6000 seconds ...

AIX> shmwrite 2000000000 2 B
SHMGET successful for: 371195908
SHMAT successful for: 371195908
Memory space has been WRITTEN: 2000000000 bytes, symbol: B

And now, let’s kill the “external workload” and read “SGA” memory …

# Kill "external memory workload"
AIX> ipcrm -m 371195908

# Checking status of "SGA" BEFORE read
AIX> svmon -P 192654 | grep shmat
   6878f  70000000 work default shmat/mmap           s  65536     0    0 65536
   7898d  70000001 work default shmat/mmap           s  28243     0 37293 65536
   68a0f  70000002 work default shmat/mmap           s  11325     0 54211 65536
   48a0b  70000003 work default shmat/mmap           s      0     0 65536 65536
   689ef  70000004 work default shmat/mmap           s      0     0 30825 30825

# Let's read some memory
AIX> shmread 480000000 1
SHMGET successful for: 98566147
SHMAT successful for: 98566147
Memory space has been READ: 480000000 bytes, symbol: C

# And now let's check "SGA" status AFTER read
AIX> svmon -P 192654 | grep shmat
   48a0b  70000003 work default shmat/mmap           s  65536     0 65536 65536
   7898d  70000001 work default shmat/mmap           s  65536     0 37293 65536
   6878f  70000000 work default shmat/mmap           s  65536     0    0 65536
   68a0f  70000002 work default shmat/mmap           s  65536     0 54211 65536
   689ef  70000004 work default shmat/mmap           s  30825     0 30825 30825

Wow, this is weird ! Take the 2nd line for example: 65536 + 37293 = 102829 … Yet only 65536 (4k) pages are really requested …

As you can see, during READ operations paging space is NOT released. Instead, many memory pages now appear to be “double allocated”: they “exist” in both “real memory” and paging space.

This is where our simple formula finally breaks down !
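If you want to spot double allocated segments without doing the arithmetic by hand, something like the Python sketch below could work. It parses the svmon lines from the AFTER-read snapshot and assumes that the last four columns are always Inuse, Pin, Pgsp and Virtual, and that every virtual page lives in memory, in paging space, or in both. The function name is hypothetical.

```python
SVMON_AFTER_READ = """\
   48a0b  70000003 work default shmat/mmap           s  65536     0 65536 65536
   7898d  70000001 work default shmat/mmap           s  65536     0 37293 65536
   6878f  70000000 work default shmat/mmap           s  65536     0    0 65536
   68a0f  70000002 work default shmat/mmap           s  65536     0 54211 65536
   689ef  70000004 work default shmat/mmap           s  30825     0 30825 30825
"""

def double_allocated(svmon_text):
    """Map Vsid -> number of pages counted both in memory and in paging space."""
    result = {}
    for line in svmon_text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        vsid = fields[0]
        # Assumption: the last four columns are Inuse, Pin, Pgsp, Virtual
        inuse, _pin, pgsp, virtual = (int(x) for x in fields[-4:])
        overlap = inuse + pgsp - virtual
        if overlap > 0:
            result[vsid] = overlap
    return result

overlaps = double_allocated(SVMON_AFTER_READ)
print(overlaps)
print(sum(overlaps.values()), "pages are double allocated")
```

Only segment 6878f, which never had anything in paging space in this round, comes out clean.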

And the question naturally arises – why is AIX not releasing paging space on reads ? What good are the “leftover” paging space copies if our memory is “truly” in RAM ?

Well, this is actually pretty simple to figure out once you realize that:

  • Memory that has already been paged out once is likely to be paged out again
  • After PAGE IN caused by READ, leftover “in paging space” page copy remains identical to its “in memory” twin

Thus, on a likely subsequent PAGE OUT (and assuming the memory did not change in between), AIX can simply discard the memory page and will NOT need to copy its contents back to paging space … which does represent some serious savings for AIX.

So, essentially, by keeping “associated” page copies in paging space, AIX is positioning itself to potentially (and likely) skip some serious work in the future. The extra disk space used for that is a small price to pay for the gain in efficiency.
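The saving is easier to see in a toy model. The Python sketch below is NOT how the AIX VMM is implemented; it is just my simplified illustration of the reasoning above, with a made-up `Page` class that tracks whether a page is dirty and whether a copy of it sits in paging space.

```python
from dataclasses import dataclass

@dataclass
class Page:
    in_memory: bool = True
    dirty: bool = True           # modified since it was last written to paging space
    has_pgsp_copy: bool = False  # a copy exists in paging space

def page_out(page):
    """Evict a page; return the number of disk writes it cost."""
    writes = 0
    if page.dirty or not page.has_pgsp_copy:
        writes = 1               # contents must be written to paging space
        page.has_pgsp_copy = True
        page.dirty = False
    page.in_memory = False
    return writes

def page_in_for_read(page):
    """Bring a page back for reading; the paging space copy is kept."""
    page.in_memory = True

page = Page()
first = page_out(page)    # the first eviction has to write the page out
page_in_for_read(page)    # read it back; the copy in paging space stays valid
second = page_out(page)   # the second eviction can simply drop the in-memory copy
print(first, second)
```

In this model the second PAGE OUT costs no disk write at all, which is exactly the work AIX hopes to skip.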

Step 6: Bringing “SGA” back in memory FOR READ and then UPDATING it

You might wonder – what would happen if memory pages are PAGED IN “for read” initially, but then changed afterwards ? Are they going to be “released” from the paging space at the time of change ?

Let’s find out …

# Let's read ALL the "SGA" memory to be 100 % sure
AIX> shmread 1200000000 1
SHMGET successful for: 98566147
SHMAT successful for: 98566147
Memory space has been READ: 1200000000 bytes, symbol: C

# Check "SGA" status BEFORE memory update
AIX> svmon -P 192654 | grep shmat
   48a0b  70000003 work default shmat/mmap           s  65536     0 65536 65536
   7898d  70000001 work default shmat/mmap           s  65536     0 37293 65536
   6878f  70000000 work default shmat/mmap           s  65536     0    0 65536
   68a0f  70000002 work default shmat/mmap           s  65536     0 54211 65536
   689ef  70000004 work default shmat/mmap           s  30825     0 30825 30825

# Update ALL the "SGA" memory
AIX> shmwrite 1200000000 1 R
SHMGET successful for: 98566147
SHMAT successful for: 98566147
Memory space has been WRITTEN: 1200000000 bytes, symbol: R

# And now, let's check "SGA" state AFTER
AIX> svmon -P 716870 | grep shmat
   48a0b  70000003 work default shmat/mmap           s  65536     0 65536 65536
   68a0f  70000002 work default shmat/mmap           s  65536     0 54211 65536
   7898d  70000001 work default shmat/mmap           s  65536     0 37293 65536
   6878f  70000000 work default shmat/mmap           s  65536     0    0 65536
   689ef  70000004 work default shmat/mmap           s  30825     0 30825 30825

Looks like the answer is: NO. “SGA” memory is still double allocated even though memory has really changed and no longer matches its “in paging space” copy.

This is because the default garbage collection process only works during PAGE IN operations (at least, up until 6.1 ML5) and if there is no PAGE IN (and remember, after the READ, all “SGA” pages were already “in memory”) then there cannot be a paging space release.
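To summarize Steps 4 through 6, here is a tiny Python state machine for a single page. Again, this is only a sketch of the behavior observed in these experiments with default settings, not a description of AIX internals; the event names and the `touch` and `replay` functions are invented for the example.

```python
def touch(state, event):
    """state is (in_memory, in_paging_space) for a single page."""
    in_memory, in_pgsp = state
    if event == "page_out":
        return (False, True)
    if event == "read":
        # PAGE IN for read (or a read of a resident page): the copy is kept
        return (True, in_pgsp)
    if event == "write":
        if not in_memory:
            # PAGE IN for update: this is where the copy was released (Step 4)
            return (True, False)
        # Page already resident: no PAGE IN, so nothing is released (Step 6)
        return (True, in_pgsp)
    raise ValueError(f"unknown event: {event}")

def replay(events, state=(True, False)):
    """Apply a sequence of events to a page that starts in memory only."""
    for event in events:
        state = touch(state, event)
    return state

print(replay(["page_out", "write"]))          # Step 4
print(replay(["page_out", "read"]))           # Step 5
print(replay(["page_out", "read", "write"]))  # Step 6
```

The only path that ends with the page in memory and nothing in paging space is the one where the very first access after PAGE OUT is a write.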

AIX memory usage formula: Be careful what you ask for …

The bottom line here is that with the way AIX paging space works our simple memory estimation formula may no longer yield correct results.

So, is there a way to fix it ?

The answer is a qualified YES and, curiously, the formula (and the results) that you get will depend on the exact question that you ask.

Let’s say that you want to know how much of your program’s memory is “really” in the paging space ? (this is a performance tuning question).

The key insight here is that during the time when memory pages are double allocated, the primary page copy is ALWAYS in memory. Thus, svmon “inuse” field returns “true” data, while “pgsp” value is artificially “bloated” (double counting paging space pages that are “really” in memory).

Thus, “real” paging space use will be:

InPagingSpace (“real”) = Virtual – InMemory
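Applied to the listing from the beginning of this post, the formula could be sketched in Python like this. The rows are copied by hand, the values are assumed to be in Mb, and `real_paging` is a name I made up.

```python
# (Vsid, InMem, Paging, Virtual) rows from the listing at the top of the post
rows = [
    ("37876d", 255.28, 221.55, 255.94),
    ("36076e", 255.31, 219.55, 255.94),
    ("358769", 255.33, 220.97, 255.94),
]

def real_paging(in_mem, virtual):
    """Memory that is only in paging space: Virtual minus InMemory."""
    return virtual - in_mem

for vsid, in_mem, paging, virtual in rows:
    real = real_paging(in_mem, virtual)
    print(f"{vsid}: svmon reports {paging:.2f}, really paged out {real:.2f}")

# Same calculation for the TOTAL row
total_real = real_paging(in_mem=7796.02, virtual=7914.35)
print(f"TOTAL: really paged out {total_real:.2f}")
```

So, from the performance point of view, that “SGA” is almost entirely in memory, no matter how scary the Paging column looks.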

Alternatively, you might want to know how much paging space is being used by your program ? (this is a paging space sizing question).

In this case, the svmon “pgsp” value is not “bloated” any longer: yes, the pages are double counted here, but they are nevertheless physically allocated.

Hence the resource usage formula would look like our old simple memory allocation formula:

Resources Used = InMemory + InPagingSpace

It’s just that “what is used” may NOT equal “what is requested” (Virtual) anymore …
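For the sizing question, a quick Python sketch using the Step 6 snapshot (numbers copied by hand, 4 KB pages assumed, variable names are mine):

```python
PAGE_SIZE = 4096  # bytes per page, assuming 4 KB pages

# (Inuse, Pgsp, Virtual) per segment, from the svmon output at the end of Step 6
segments = [
    (65536, 65536, 65536),
    (65536, 54211, 65536),
    (65536, 37293, 65536),
    (65536, 0, 65536),
    (30825, 30825, 30825),
]

used_pages = sum(inuse + pgsp for inuse, pgsp, _ in segments)
requested_pages = sum(virtual for _, _, virtual in segments)

used_mb = used_pages * PAGE_SIZE / 2**20
requested_mb = requested_pages * PAGE_SIZE / 2**20
print(f"requested: {requested_mb:.0f} Mb, RAM + paging space used: {used_mb:.0f} Mb")
```

In other words, when sizing paging space, plan for what svmon reports in “pgsp”, not for what the Virtual column alone would suggest.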

Comments (20) Trackbacks (0)
  1. In my image, memory management on AIX looks like a pandora’s box.

  2. Indeed 🙂 But it also makes it that much more interesting to work with …

  3. Such a great explanation and demonstration. Thank you.

  4. Yeah, it really is interesting to work with if you have the courage 🙂 On the other hand, when you start experiencing memory leaks, it is not always an easy task to track them down. Or sometimes a mate asks “provide me a map of all allocated system memory, per process”; now, how do you do that (more or less) accurately, so that a non-memory guru understands it? 🙂 Nevertheless, articles like this are very, and I mean very, helpful. Thanks!

  5. Hello Leon,

    Thanks, and, you know what they say: knowledge breeds courage, eh ? 🙂

    Memory is actually quite easy to explain, once you figure it out … Starting from the top and going deep, you will usually deal with a few things at most, something you can easily write as boxes on a whiteboard …

    For example, “ORACLE memory” is really just two things:

    • Shared segment for SGA
    • “Process memory” that is allocated by a bunch of instance processes

    Going deeper, “Process memory” is also mostly just two things:

    • Shared CODE Segment (The text of the program)
    • Private DATA Segment (variables, dynamically allocated memory etc … – in other words, the important stuff, that you really care about)
    • (well, there are other “segments” of course, such as stack or loader, but they are usually insignificantly small)

    Important memory stats (i.e.: sizes) of these are easily viewable by simple UNIX commands, such as ps, svmon (ok, it is specific for AIX) or ipcs

    Now, there is, of course, a limit of what OS level commands can see and (at least in my mind) they are best used for comparisons and “big picture” questions rather than process memory map as developers would probably want to see …

    E.g., OS commands will tell you that process 123 uses 10x more “private” memory than process 456, but will not tell you whether that space is used by array aBigArray or by an out of control sort operation (still, a good developer should have a pretty good idea where memory is (mis?)spent). But that is why they have specific “product” tools for, say, Java or ORACLE

    Cheers,
    Maxym Kharchenko

  6. to force AIX freeing page space for any page operation, just adjust the following parameter :

    vmo -o rpgclean=1

  7. do you have experience with Oracle Tuning by using sga memory pinning (LOCK_SGA)
    – is this a useful option on VPAR (lpm) Systems ?

  8. Hello Imhotep,

    We are locking SGA quite successfully and in fact, I recently wrote about it here: http://intermediatesql.com/index.php/aix/how-oracle-uses-memory-on-aix-part-3-locking-sga/

    Cheers,
    Maxym Kharchenko

  9. May I know where the paging spaces are located?

  10. I do not have access to AIX systems anymore, but I’m pretty sure either lsps -s or smitty paging will help you.

  11. Hello Maxym, great post.

    Have you ever come across this bug on aix:

    Metalink: Paging Space Growth May Occur Unexpectedly on AIX Systems With 64K (medium) Pages Enabled [ID 1088076.1]
    IBM: https://www-304.ibm.com/support/docview.wss?uid=isg1IZ71987

    Quote from metalink note: “Unexpected growth in paging space utilization may occur on AIX systems running database applications (including Oracle RDBMS), even where sufficient memory exists to keep the application (Oracle) memory in RAM.”

    So, it looks like this is a normal behavior as you demonstrated on this post, can i say that? I am now very confused, is this a bug or a normal behavior? Did i understand wrong?

    Could you share your knowledge please?

    Regards,

    Thiago Maciel

  12. Hello Maciel,

    I do not think I encountered this specific bug. However, as with any software (ORACLE and AIX included) there might sometimes be bugs that will change the way paging space behaves (and which will simply need to be “fixed”).

    Without seeing metalink note, it is hard to judge what happened, but no, AIX paging space (if configured properly) should not grow unexpectedly absent memory pressure. However, at the same time, if it ever grows, it might never be released if the first operation out of paging space is read.

    Cheers,
    Maxym Kharchenko

  13. Fantastic article! Thank you so much for explaining this concept!!!

    All the best,

    Steve

  14. Thanks, Steven. Glad it helped you 🙂

  15. Thanks Maxym, for a great explanation.
    I had a memory leak problem which I only found by logging process VSZ every 10 seconds and relating to spikes in paging space levels. The offending process was not freeing allocated memory and used 4GB each time it ran. The fact that it only took 20 seconds to run made it hard to catch without regular logging.
    cheers,
    Anthony

  16. Nice article ! As an AIX admin I’d like to know how to really monitor the active usage of the swap, as “lsps -a” is not a real indicator anymore for the health status of the system …

  17. Thanks Marcus.

    lsps command on AIX gives an overall overview of paging space being used at the moment and you can always use svmon if you need detailed per process stats. Is that what you were looking for ?

  18. I see a difference between the outputs of lsps and svmon.
    lsps reports 84% usage of a total of 4 GB of paging space.
    When I do
    svmon -U `ps -ef|cut -c1-8|sort -u` -O segment=off -O unit=MB
    (list memory usage for all users currently having a process running)
    the sum of all Pgsp values totals just 1.4 GB. Where is the difference coming from?

  19. Hello Kurt,

    Your question is fairly technical, which means that I would have to run a few experiments to get to the answer.
    Unfortunately, since I changed jobs I no longer have access to AIX systems (I’m living in 100% Linux world now) …

    Perhaps one of the readers can help …

    Thanks for the question though,
    Maxym Kharchenko

  20. The solution seems to be that AIX even keeps pages in the paging space that do not belong to a process anymore.
    Following another hint I got from the ‘net, I set up a secondary paging space of equal size to the primary, and then did
    swapon secondary
    swapoff primary
    swapon primary
    swapoff secondary
    voila, now svmon and lsps do agree. During swapoff primary, one can easily see how the percentage of used paging space shrinks continually while only the needed pages are transferred to the secondary paging space.

    Thanks, Kurt

