Wednesday, December 10, 2008

Mixing SQL and MDX queries

A very useful tool in any BI developers toolbox is the OpenRowset query that can combine the data from an SQL and AS database. If you do Data Mining, the chances are that you use it all the time. The basic format is quite simple.
SELECT * FROM OPENROWSET(provider, connection, query)

There are many occasions when this heterogeneous query is useful, for example
  1. ETL processes that need to query source systems depending on what data is in the OLAP database, or where the OLAP database is up to.
  2. SQL transaction system might need to query summary data, and while the programmers can query the OLAP database with a MSOLAP provider, they might find it easier to use their existing SQL provider with an OpenRowset.
  3. You need a combination of aggregated and atomic data for a query
  4. You are data mining and want to predict values based on data within an SQL database
Here is a trivial example of an SQL query joining with an OLAP query. The OLAP database is the Perfmon cube (on http://RichardLees.com.au/Sites/Demonstrations Perfmon) which is joined with sql data from sys.databases.

SELECT
db.*,
olap."[Measures].[Value]" as "Average Transactions/Second"
FROM OpenRowset('MSOLAP','DATASOURCE=Localhost; Initial Catalog=Perfmon;',
'SELECT
Measures.[Value] ON COLUMNS,
[Counter Details].[Database Name].[Database Name] ON ROWS
FROM Perfmon
Where ([Counter Details].[Counter].[Object].&[SQLServer:Databases].&[Transactions/sec])
') as olap
INNER JOIN sys.databases db on convert(varchar(255),olap."[Counter Details].[Database Name].[Database Name].[MEMBER_CAPTION]")=db.name


Richard,
http://RichardLees.com.au

2 comments:

Өнөрбат.Х said...

hi.
How do I improve performance of OpenRowset for OLAP server?

Unuruu.

Richard Lees said...

It really depends on what your performance issue is. You will need to note that there is no optimiser managing a join between the openrowset and the other half of the query (if there is one). I would encourage you to start with two Profiler traces for each side of the query.

Also note, hybrid queries can be very efficient. Every request to the live data mining pages on http://RichardLees.com.au/Sites/Demonstrations are hybrid queries. They tend to be sub second and can scale quite well.