SSIS: Dynamic Processing of SSAS Partitions

SSAS partitions are primarily geared towards better query performance: they narrow an MDX query to a pre-defined subset of data. Each partition must be in a processed state for the cube to be available for browsing. However, in a delta ETL run it is likely that only some partitions receive new data while the rest remain unchanged. For example, in a hospital admissions cube partitioned by month, data movements are frequent for admissions made in the current or previous month and rare for earlier months. This means that if the partitions updated by the ETL run can be identified, the SSIS package can be tailored to dynamically process only those partitions that require a data refresh while leaving the remaining partitions untouched.

XMLA Command to Process Partitions

SSAS partitions are processed by XMLA commands similar to the one shown below:

<Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Parallel>
    <Process xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      <Object>
        <DatabaseID>AdventureWorks</DatabaseID>
        <CubeID>Adventure Works</CubeID>
        <MeasureGroupID>Fact Internet Sales 1</MeasureGroupID>
        <PartitionID>Internet_Sales_2005</PartitionID>
      </Object>
      <Type>ProcessFull</Type>
      <WriteBackTableCreation>UseExisting</WriteBackTableCreation>
    </Process>
  </Parallel>
</Batch>

Create XMLA Command to Dynamically Process Partitions

The key to dynamic processing of partitions is to generate the <Process> XMLA fragment for each partition that requires a data refresh. To get there, first create a table, dbo.SSASPartitions, that holds information about the SSAS partitions (a minimal CREATE TABLE sketch follows the column list):

DatabaseID – SSAS database ID (not the name)
CubeID – Cube ID (not the name)
MeasureGroupID – Measure group ID (not the name)
PartitionID – Partition ID (not the name)
PartitionQuery – SQL query of the partition
ProcessType – ProcessData, ProcessFull, ProcessDefault or ProcessIndex
RequiresProcessFlag – 1 = partition requires data refresh; 0 = partition already processed and in refreshed state
CreatedDateTime – Audit field
LastProcessedDateTime – Audit field
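
A minimal sketch of the table, assuming the columns above; the data types, defaults and key are reasonable guesses and should be adjusted to your environment:

create table dbo.SSASPartitions
(
    DatabaseID            varchar(200) not null,  -- SSAS database ID (not the name)
    CubeID                varchar(200) not null,  -- cube ID (not the name)
    MeasureGroupID        varchar(200) not null,  -- measure group ID (not the name)
    PartitionID           varchar(200) not null,  -- partition ID (not the name)
    PartitionQuery        varchar(max) null,      -- SQL query bound to the partition
    ProcessType           varchar(50)  not null,  -- ProcessData, ProcessFull, ProcessDefault or ProcessIndex
    RequiresProcessFlag   bit          not null default 0,  -- 1 = needs refresh, 0 = already refreshed
    CreatedDateTime       datetime     not null default getdate(),
    LastProcessedDateTime datetime     null,
    constraint PK_SSASPartitions
        primary key (DatabaseID, CubeID, MeasureGroupID, PartitionID)
);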

Then populate this table with all SSAS partitions. Note that the database ID, cube ID, measure group ID and partition ID may not be the same as their corresponding names (especially if the objects were renamed after they were first deployed), although in most instances they are. The ETL framework will usually indicate which fact records were inserted or updated in the last run, and from that information the partitions that need a refresh can be determined. Update those partition records with RequiresProcessFlag = 1; a sketch of such an update follows.
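
As an illustration only: assuming the ETL framework records the affected partitions in a hypothetical staging table dbo.ETL_ChangedPartitions (a name invented for this sketch), the flag update could look like this:

-- dbo.ETL_ChangedPartitions is an assumed staging table listing the
-- partition IDs touched by the latest delta load.
update p
set    p.RequiresProcessFlag = 1
from   dbo.SSASPartitions p
join   dbo.ETL_ChangedPartitions c
    on c.PartitionID = p.PartitionID
where  p.DatabaseID = 'AdventureWorks'
  and  p.CubeID = 'Adventure Works';

With the flags in place, create a stored procedure to return the XMLA as shown below: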

create procedure [dbo].[getProcessingCommand]
    @DatabaseID        as varchar(200),
    @CubeID            as varchar(200),
    @ProcessingCommand as varchar(max) output
as
begin

    declare @ProcessXmla      as varchar(max);
    declare @ProcessBatchXmla as xml;

    -- Build one <Process> fragment per partition flagged for refresh.
    -- xsd/xsi are declared so they appear on each <Process> element.
    with XMLNAMESPACES ('http://www.w3.org/2001/XMLSchema' as xsd
                       ,'http://www.w3.org/2001/XMLSchema-instance' as xsi)
    select @ProcessXmla =
    (
        select DatabaseID     as "Object/DatabaseID"
              ,CubeID         as "Object/CubeID"
              ,MeasureGroupID as "Object/MeasureGroupID"
              ,PartitionID    as "Object/PartitionID"
              ,ProcessType    as "Type"
              ,'UseExisting'  as "WriteBackTableCreation"
        from dbo.SSASPartitions
        where DatabaseID = @DatabaseID
          and CubeID = @CubeID
          and RequiresProcessFlag = 1
        for xml path ('Process')
    );

    -- Wrap the <Process> fragments in the <Parallel> and <Batch> elements.
    -- Note: if no partitions are flagged, @ProcessXmla is NULL and so is the final command.
    set @ProcessBatchXmla = convert(xml,
        '<Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
        <Parallel>'
        + @ProcessXmla
        + '</Parallel>
        </Batch>'
        , 1);

    -- Return the command as a string so SSIS can hold it in a String variable.
    set @ProcessingCommand = convert(varchar(max), @ProcessBatchXmla, 1);

end
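
A quick way to test the procedure from SSMS; the parameter values here are assumptions matching the AdventureWorks example:

declare @cmd varchar(max);

exec dbo.getProcessingCommand
     @DatabaseID        = 'AdventureWorks',
     @CubeID            = 'Adventure Works',
     @ProcessingCommand = @cmd output;

print @cmd;  -- prints the generated <Batch> command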

This stored procedure uses FOR XML with XMLNAMESPACES to generate the <Process> XMLA fragments, which are then wrapped in the <Parallel> and <Batch> XMLA elements. It is better to return the XMLA as an output parameter rather than as a result set (SELECT @ProcessingCommand): when returned as a result set, SQL Server implicitly converts the XMLA string to XML, which SSIS treats as the Object data type rather than a string. Because SSIS offers only limited operations on the Object data type, it is more convenient to work with a string, especially when setting task expressions. The output of the stored procedure will be:

<Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Parallel>
    <Process xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      <Object>
        <DatabaseID>AdventureWorks</DatabaseID>
        <CubeID>Adventure Works</CubeID>
        <MeasureGroupID>Internet Orders</MeasureGroupID>
        <PartitionID>Internet_Orders_2008</PartitionID>
      </Object>
      <Type>ProcessData</Type>
      <WriteBackTableCreation>UseExisting</WriteBackTableCreation>
    </Process>
    <Process xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      <Object>
        <DatabaseID>AdventureWorks</DatabaseID>
        <CubeID>Adventure Works</CubeID>
        <MeasureGroupID>Internet Orders</MeasureGroupID>
        <PartitionID>Internet_Orders_2009</PartitionID>
      </Object>
      <Type>ProcessData</Type>
      <WriteBackTableCreation>UseExisting</WriteBackTableCreation>
    </Process>
    <Process xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      <Object>
        <DatabaseID>AdventureWorks</DatabaseID>
        <CubeID>Adventure Works</CubeID>
        <MeasureGroupID>Internet Orders</MeasureGroupID>
        <PartitionID>Internet_Orders_2011</PartitionID>
      </Object>
      <Type>ProcessData</Type>
      <WriteBackTableCreation>UseExisting</WriteBackTableCreation>
    </Process>
  </Parallel>
</Batch>

The output of the stored procedure can be used in SSIS with either the Analysis Services Processing Task or the Analysis Services Execute DDL Task to dynamically process SSAS partitions.

Using XMLA Command in SSIS Analysis Services Processing Task

The Analysis Services Processing Task can be used as shown in Screen Capture 1.

Screen Capture 1 – Using XMLA Command in Analysis Services Processing Task

The stored procedure getProcessingCommand is called in an Execute SQL Task and the output XMLA is returned into a string variable.

Screen Capture 2 – Stored Procedure Call
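
As a rough sketch, the SQLStatement in the Execute SQL Task (OLE DB connection; the parameter mapping and the variable name User::ProcessingXMLA are assumptions, not taken from the screen capture) could be:

-- Parameter mapping (assumed): 0 = DatabaseID (Input), 1 = CubeID (Input),
-- 2 = ProcessingCommand (Output), bound to the String variable User::ProcessingXMLA
exec dbo.getProcessingCommand ?, ?, ? output;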

The XMLA string variable is then assigned to the ProcessingCommands property of the Analysis Services Processing Task via the Expressions option.

Screen Capture 3 – Analysis Services Processing Task

At design time, the Analysis Services Processing Task does require a partition to be selected so the package will build; that partition is overridden by the XMLA string at run time. Finally, RequiresProcessFlag is reset to 0 on successful execution.
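
A minimal sketch of that reset, run from a final Execute SQL Task after successful processing (the literal database and cube values are assumptions):

update dbo.SSASPartitions
set    RequiresProcessFlag = 0,
       LastProcessedDateTime = getdate()
where  DatabaseID = 'AdventureWorks'
  and  CubeID = 'Adventure Works'
  and  RequiresProcessFlag = 1;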

Using XMLA Command in Analysis Services Execute DDL Task

Another way to process SSAS partitions is to use the Analysis Services Execute DDL Task and supply the XMLA command as shown in Screen Capture 4.

Screen Capture 4 – Using XMLA Command in Analysis Services Execute DDL Task

From the DDL options, set the SourceType property to Variable and Source to the XMLA string variable generated by the getProcessingCommand stored procedure.

Screen Capture 5 – Analysis Services Execute DDL Task Editor

As far as SSAS processing is concerned, there is no difference between the two SSIS Analysis Services tasks; it is only a matter of preference which one to use.

Download Source Code

You can download the SSIS package and Stored Procedure used in this post from http://1drv.ms/1avLkl7

Related Posts

  • SSIS: Creating Dynamic Data-driven SSAS Partitions
  • SSIS Package to Process SSAS Cube
  • SSAS: Process Incremental – What it is & isn’t

29 thoughts on “SSIS: Dynamic Processing of SSAS Partitions”

    1. Great post. Much appreciated! Unfortunately the download for the example does not work. Can you re-upload?

    2. Hi Benny and thanks for sharing this.

      I reused your code that builds the processing XMLA and it works great! One question though: we are on SQL Server 2017 and when scripting out the process command for an SSAS object, such as a Cube, there are references to 11 different XML namespaces in the code, all with a microsoft.com URL, in addition to the two w3.org namespaces. Example: xmlns:ddl500_500="http://schemas.microsoft.com/analysisservices/2013/engine/500/500"

      Now, when building the XMLA dynamically, are all those namespaces really necessary in the processing command? I ask because they add a substantial amount of characters to the overall XMLA script, once for every partition, and we have many partitions… I notice you only include the two w3.org namespaces in your WITH XMLNAMESPACES() code. Maybe it is good enough to include just those two then? But then what are the 11 others for? What is the purpose of them in the autogenerated XMLA (if any)? Just curious…

      Thanks!

      1. Martin, I am not sure. Haven’t gone back to multi-dimensional models since moving to Azure. Perhaps when you figure it out, you could reply here for the benefit of others. I do have a solution for partitioning Azure Analysis Services tabular models. I will publish a blog post soon.

        1. Thanks for replying Benny

          I found a thread on the SSAS forum regarding the same question, in which I replied too:
          https://social.msdn.microsoft.com/Forums/sqlserver/en-US/645f49dd-ce90-42e4-a081-cd1237398e1e/just-wondering-about-the-namespace-declarations-in-xmla?forum=sqlanalysisservices

          I actually tested removing all namespace references but the first one in the tag, and then executed the script, and the processing seemed to work fine / just as normal. So the question is really whether there are any downsides to doing that or not. If I ever find out, yes, I will try to remember to reply here again 🙂

          Martin

    3. Actually I created an SSIS package to load Tabular cube partitions and I’m adding new partitions manually (if I want to load a new one).

      How can I load these Tabular cube partitions dynamically in the SSIS package, without manually adding the ones that need to be processed?
      Can you please explain it to me with a small example.

      Or I’m ready to take it over Skype.

    4. Hi Benny!
      Wonderful and very useful article. Can you tell us how to dynamically list all the available partitions? We have lots of partitions and we want to avoid writing all of them by hand.
      Thanks a lot for your help

    5. Benny – Incredible stuff – thank you. I have a scenario where I continuously reprocess a 12-month window (6 months back / 6 months forward) of partitions. Do you have any recommendations on how to tweak this and incorporate a for-loop somewhere?

      1. KDW, the solution already loops through the metadata table to process relevant partitions. Just set the process flag from your ETL based on your 12 month window.

    6. Benny – This is a great post. I successfully set up dynamic partitioning in my cube using this post and all of your examples. Now when I go back to modify my cube, my solution overrides the dynamic partitioning – how do you overcome the changes that need to be made from time to time when you make changes in the solution?

    7. I am having a problem with SSIS 2014 adding the \ escape character to the XMLA command, which makes it fail. The XMLA coming from the stored procedure is great, but once it is passed to the string variable the escape characters are added. Any ideas?

      1. Can you explain in a bit more detail? Based on what you have mentioned, I am not sure whether you have values with \ in them or whether you are trying to append something to the generated XMLA.

      2. Hi guys, late reply but for other people today: the issue I had as well was related to the string being too long. The varchar(max) was not sufficient in my case when it created the XMLA. When I did the test with 5 partitions instead of 20, everything worked smoothly. I don’t know the best practice, but if you have 20+ partitions that you need to process daily I think it would be a good idea to run them in parallel with a batch counter or something…

    8. Hi! Is there any scenario where it would be useful to apply this technique to the Dimensions? Or is “ProcessDefault and ProcessUpdate” smart enough to see that a dimension has not changed in source?

      Ex: We have 18 dimensions and usually only 2 dimensions are updated daily.

      1. You could extend the metadata approach to dynamically process dimensions; however, it might not be as feasible as for measure group partitions. On balance, ProcessDefault plus ProcessUpdate will figure out whether dimension data has changed, but it comes with a little extra processing overhead.

    9. Hi Benny,
      Thanks for the reply. Yes, I noticed that as soon as we set dimensions to ProcessFull it brings the dependent objects to an unprocessed state.
      I did ProcessDefault followed by ProcessUpdate and everything worked fine.

      But I have one more question: if I am not wrong, ProcessUpdate only works well when the attribute relationship type is flexible, or when the relationship type is rigid and there is no change in the source data.
      But if it is rigid and there is some change in the source data, then it errors out.

      So will doing ProcessDefault first and then ProcessUpdate resolve this problem?

    10. Hi Benny,
      I implemented this process and it is working fine, but I have one issue.
      I have a cube partitioned by year, currently with partitions for 2013, 2014 and 2015.
      When I use this process to only process 2015, then after processing the data for the other two partitions is gone and I am not able to browse it.
      Before processing 2015 I am fully processing the dimensions.

      Why is the data not visible for the other two partitions? Am I missing something in the settings?

      thanks.
