Recursive parent-child dimensions are painful to query using SQL. A common example of a recursive parent-child dimension is a ragged organization hierarchy containing Manager/Employee relationships. You usually have to flatten them out to make them work in a star schema. That is, instead of having two columns in the dimension for child id and parent id, you need to create n columns, one for each level of the parent-child relationship. Even this is less than optional for users.
Luckily, Analysis Services OLAP Cubes make this easy.
Consider the classic situation of an organization structure. There are four levels in the organization:
- Mary is the CEO and has two direct reports: Bob the Sales manager and Charlie the Engineering manager
- Bob has two salespeople: Debbie and Elsa
- Charlie has two engineers: Fred and Gary
- Fred has 2 helpers, Harry and Ina
To represent this data in a dimension we have two options. The first is a flattened structure so that every level in the organization is visible at each row. For example:
This structure is good for querying via SQL. It’s useful when the number of levels in the hierarchy is small, known, and reasonably static. When the number of levels is large (i.e. more than 10) or when the number of levels is unknown or can change this structure becomes difficult to populate and query. There’s also the question of what to do with the blanks in the unpopulated cells at the lower levels.
An alternative approach is to represent the data in a recursive parent-child structure. Each row has only its parent listed. For example:
This structure is very flexible however it is very difficult to write SQL queries across multiple levels. The significant advantage of this structure is that OLAP cubes can flatten the hierarchy on the fly when they are processed. This means that a hierarchy with changing levels or a large number of levels can be catered for with relative ease. It’s also relatively straightforward to write an MDX query to browse the hierarchy in a cube. Most query tools (for example Excel) will do this dynamically without any code being written.
Here’s an example:
A simple warehouse built from Adventure Works contains a sales orders star schema. The fact table has five dimensions - customer, order date, ship date, product and salesperson. Both the ship date and order date dimensions are role-playing dimension views on dim_date. Similarly, the salesperson dimension is a role-playing dimension view of the employee dimension table. The employee dimension stores the data in the parent-child structure described above - every employee record had its manager business key recorded.
The star schema looks like this:
The objective is to produce a Microsoft Analysis Services OLAP cube of the star schema with the salesperson dimension being an OLAP parent-child dimension.
Microsoft Analysis Services expects an OLAP parent child dimension to have the artificial key for the child AND the artificial key for the parent on the same record. By default, WhereScape RED only creates the artificial key for the child when it builds a standard dimension.
To create an OLAP parent-child dimension, we need to add the artificial key for the parent (manager) to the SalesPerson view. Make sure this is NOT an artificial key in RED and has the end user display check box turned on. Set the source table to be blank and in the transformation of the parent key put the source table (an alias of dim_Employee) and source column. Using the alias of dim_Employee facilitates the join of the dimension to itself to provide the parent artificial key. The column list looks like this:
Set a where clause on the SalesPerson dimension view as follows:
LEFT JOIN dim_Employee dim_Employee1
ON dim_Employee.ManagerID = dim_Employee1.EmployeeID
Next create an OLAP cube called olap_SalesOrder. The wizard in RED also created the OLAP dimensions automatically. You need to delete any hierarchy that gets created - it’s not necessary for this exercise. Edit the attribute list for the dimension so that it has just these two keys:
Create an attribute relationship between the two keys so that Analysis Services knows how to relate the keys to each other. To do this right-click on the OLAP dimension for SalesPerson and select “Add Attribute Relationships”. The details should be:
Now change the properties of the attributes so that the sales person’s name is displayed, not their key (which would be meaningless to any user). To do this, display the attributes again and edit the properties of the child (sales person) key. Ensure the usage was “Key” and change the name column to the descriptive column in the dimension.
Define the relationship as a parent-child relationship. To do this edit the properties of the parent (manager) key attribute and change the usage to “Parent”. Leave the name column the same as the key column. It looks like this:
Now just create the cube and update or process it as you usually would with any other cube.
Based on a white paper by Michael Tantrum, WhereScape USA Inc.