Since SQL Server 2005, it is possible to put dozens of fact tables (cubes in the old days) into one cube with separate measure groups for each fact table. This is a great feature. For example, Sales and Budget have slightly different dimensionality as Sales is by cashier and hour of day, whereas, budget does not have cashier and time granularity is only available down to week. Having these in the same cube is great, since it makes querying of sales against budget (by all the common dimensions) very easy. Also, if a user is only interested in one measure group, say sales, they can ask the browser that measure group, and they will on see the associated measures and dimensions. In the same way they might browse a separate cube.
You might ask, are there any performance implications of having one super cube?
- In terms of cube size and processing overhead, the design alternatives are comparable.
- MDX queries can be broken down to FE (formula engine) and SE (storage engine) components. The SE component of a query isn't materially affected by unrelated dimensions, since the storage is comparable. However, the FE component of a query is affected by the unrelated dimensions as it is works within the entire cube space (larger with more dimensions).
There might also be operational considerations. The design alternative I have discussed is having multiple cubes in the same database, but if the cubes really are unrelated, you might consider having separate SSAS databases for them. This would lead to more operational options including, parallel processing, and even hosting databases on separate servers.
So, as a cube designer, you must weigh up the performance, functional and maintenance costs. As an over simplified summary, on the side of mini cubes is the minimisation of FE performance costs, while on side of super cubes are increased user functionality.
For another discussion on this topic see Chris Webb's blog
No comments:
Post a Comment