Wednesday, February 15, 2012

Cube processing goes slow when there are dimension key errors

I try and keep my cubes as clean as possible, so that dimension key errors are minimised. However, as a safeguard I tend to enable UnknownMembers. This way the totals are correct even if the customer, item, cashier etc isn't found in the dimension. 
Unfortunately, SSAS (2008 R2) still consumes quite a lot of cpu dealing with these UnknownMembers. This is fine if you only have a few million facts to process, but if you have billions (or even just hundreds of millions) processing noticeably slows down. This manifests as a single cpu busy in msmdsrv. The difference in cube processing time can be by a factor of 20, or so. My rule of thumb is that a partition should be able to process tens of millions of facts per minute. However, if there are key errors, this might drop to less than a million per minute.

If you have configured the ErrorConfiguration to "IgnoreError" you won't even see any error messages when you process in the foreground. But the cpu overhead is still there. The same goes for KeyErrorLimit, cpu will continue to be consumed even though it has stopped logging.

My suggestion is to try, as much as possible, to avoid dimension key errors. A simple coalesce(DimKey,-1) with a "unknown" key value (-1) as the last parameter value, combined with a "unknown" (-1) dimension member. Of course, you don't have to remove every dimension key error, just ensure that there aren't millions (or billions) of them.

No comments: