Saturday, August 10, 2019

AWS Redshift versus SQL Server cube

Here is a straight out performance benchmark between AWS' Redshift, SQL Server cube, and SQL Server columnstore indexing. If you are choosing between these (and potentially other) platforms, this will provide some interesting reading. 
Essentially, the Microsoft cube was 97% faster than Redshift.

Note, I tried to make the test as independent and representative of a retail enterprise's query load as possible. The underlying fact tables contained 100 million, 430 million and 600 million rows. The queries ranged from the banal to common calcs such as "year to date" and "stock days on hand".

If you are serious about making the right decision for your organisation, you might set up a similar test with your own data and popular queries. 

Here is the full report:
Data Platform Performance Paper

This benchmark has been updated to include SQL Tabular cube performance. Here is the updated paper.

4 comments:

Unknown said...

rticle! A couple points -
1. Distribution keys based on query usage is key for Redshift performance. I agree with the person above that if we choose appropriate distribution keys, it will make a significant difference.
2. Cubes in SQL Server have already pre-computed a lot of the query and stored the results in an optimized manner. It would be interesting to compare Microstrategy cubes to SQL Server cubes.

Unknown said...

SSAS multidimensional automatically aggregates fact data based on the dimensions. So it doesn't matter how many rows the fact tables have, but the unique permutations of the dimension keys does. The total aggregation count may be less than 1 million out of a table with 600 million rows

This will tell how many aggregations are in the measure group
SELECT *
FROM SystemRestrictSchema($system.discover_partition_stat
,DATABASE_NAME = 'Adventureworks'
,CUBE_NAME = 'Adventureworks'
,MEASURE_GROUP_NAME = 'FactInternetSales'
,PARTITION_NAME = 'FactInternetSales')

Alas it's still disappointing to see how slow redshift is compared to SQL server columnstored indexes

Richard said...

Yes, it is disappointing that redshift doesn't perform as well as SQL columnstore indexes. However, it's not surprising when you think SQL has the advantage of a high performing optimiser.

Bhanu Sree said...

Very informative and creative content. This concept is a good way to enhance the knowledge.thanks for sharing
RedShift Training
AWS RedShift Training
Amazon RedShift Online Training