Monday, December 5, 2011

Why doesn't SSAS cache the entire cube?

Would you expect a 36GB server dedicated to SSAS to eventually cache an entire 11GB cube in memory?  If so, you would be wrong.  SSAS does not necessarily keep aggregations in memory, even though memory used is less than Memory\LowMemoryLimit.

I have seen this effect on many production servers but never quite believed it.  There is much more RAM than cube size, yet SSAS is continually losing data from cache and asking for physical IO.  Actually, SSAS relies on the Windows File Cache to reduce much of its IO.  However, the Windows File Cache does not cache the entire cube either.

To demonstrate this effect, I have set up a 36GB server, dedicated to SSAS, and set up an 11GB MOLAP cube.  Then, over a period of 19 hours, I have client tasks querying the cube in a semi random fashion.  The MOLAP cube was not updated over the period.  Over the 19 hours, if SSAS or Windows File Cache were effective, there wouldn't be much more than 11GB of read IO on the SSAS data drive over the 19 hours.  However, the system reached an equilibrium, where it was requesting over 20MB of read IO/second.  If you add this up, the system read 1.3 terabytes of data from the 11GB cube (20MB*60*60*19).  Obviously, the system was not effectively caching the cube.  This is very different from SQL Server, which would quite happily keep an 11GB database in memory, effectively eliminating further read IO.

My server was running Windows 2008, SQL Server 2008 R2 with default configuration settings.  The SSAS database was located on the d: drive.  Nothing else is located on the d: drive
Here is a Perfmon chart with some of the key counters during my test.
SSAS (msmdsrv.exe) did not use more than 12GB of memory during the test, which is well below the LowMemoryLimit 65% (which is about 23GB).

During the 19 hours, SSAS requested, on average, 102MB/second of read IO.  The Windows File Cache was able to satisfy about 80% of these read requests, so the physical disk bytes/second was only 20MB/second. 

The Perfmon chart above shows some of the key counters.  Notice
  • The Process (msmdsrv) Bytes/second y axis scale is in MB, often going off scale at 100MB/sec. These are not necessarily physical IO, as the Windows cache will satisfy many.
  • The PhysicalDisk Bytes/second is averaging 20.1MB/second for the 19 hours
  • The Windows Cache Bytes is sitting between 0.6GB and 1.1 GB.
  • The red line (Cache Copy Reads/sec) is the IO requested of the cache.  These are IO that the Windows cache has managed to satisfy by finding the page in its cache.
  • See how the Cache Copy Reads/sec is satisfying IO requested of the msmdsrv process, reducing the PhysicalDisk Bytes/sec.  Not so easy to see, as the reads are reads/sec, while the PhysicalDisk and Process are in bytes/sec, but you can see it.
Now that I understand this effect, I am much more appreciative of the Windows Cache.  This is very different from a SQL Server system, where the Windows Cache has little to do as SQL disables the Windows File Cache for its database files.

However, you can clearly see that SSAS (or the system) is performing much more IO than would be necessary if the entire 11GB cube was cached in memory.
There are a couple of takeaways from this exercise
  1. Be thankful that we have a Windows Cache, and consider configuring it for SSAS.
  2. Try and put SSAS databases on very fast (ideally solid state disks) storage devices.  If you can't avoid the IO, at least make it fast.
  3. Partition large cubes.  This helps reduce logical IO requests and to the extent it bunches the popular data together it will probably help the Windows Cache hold the hot data more effectively.
  4. Hopefully, in a future version, either SSAS or the Windows File Cache will cache more of the cube.
By the way, this effect is not always noticeable.  If your cube is less than 2-3GB, then it is likely to be well cached between SSAS and Windows File Cache.  Also, if you cube is much larger than the memory available to SSAS, then you would expect to see continual IO, and it is likely to be quite well optimised.  However, when you have a 64 bit server with a cube that is larger than 3GB but is comfortably less than the server memory, you might be surprised to see the volume of continual IO.

7 comments:

Unknown said...
This comment has been removed by the author.
Unknown said...

Richard, this is really strange to me. Are you sure that your server is not paging?
You should track much more performance counter, including private bytes from SSAS, SSAS counters (especially memory) and system memory counters (paging in particular).

Richard Lees said...

Hi Unknown,
I was tracking every Permon counter. I only showed the salient counters in the picture above, or it would be quite unreadable. The system was not paging. It had free memory all day, and the paging file is on the c: drive.
Why don't you try and reproduce the same results? All you need is an 11GB cube and a server with, at least 2-3 times the RAM.

cwebbbi said...

Hi Richard,

I tried to leave a comment yesterday but it hasn't showed up...? Anyway, I was going to suggest that you're running into the problem I describe here: http://cwebbbi.wordpress.com/2009/02/04/arbitrary-shaped-sets-and-the-storage-engine-cache/

Chris

Richard Lees said...

Hi Chris,
I think the shaped sets have something to do with it. But I would like an 11GB cube to be cached in a 36GB server.

shalini said...

This is one of the most incredible blogs Ive read in a very long time. The amount of information in here is stunning, like you practically wrote the book on the subject. Your blog is great for anyone who wants to understand this subject more. Great stuff; please keep it up!
domain name web hosting

Richard Lees said...

Thanks Shalini,
This comment will keep me blogging for sometime.