A common problem encountered when building a data warehouse is to populate a single target table (dimension or fact) from multiple sources. WhereScape RED supports this capability through staging tables via the "Set Merge" Procedure option.
The Set Merge Procedure option allows the merging of two or more identical tables. The table to be merged must have exactly the same number of columns and column names. If necessary, blank columns could be added to one or other of the tables to make them identical. To use this procedure you must have the tables to be merged mentioned at least once in the ‘Source Table’ field of a columns properties.
To illustrate this capability, we will step through a simple example. We are attempting to build a customer dimension that will contain customer id, name and other demographic information. Depending on the type of customer (online or store) their information is stored in separate tables within the source system and the process for sourcing each type of customer varies:
1. First step is to create a load table (load_customer_store) that will source all customer information for those customers which purchase directly from a store.
2. Next step is to create a load table (load_customer_online) that will source all customer information for those customers which are deemed online (ie: purchase via web). Ensure that the column names for the table are identical to the ones in the load_customer_store table.
3. We now create a staging table (stage_customer) that will initially source from the load_customer_store table. Then alter one of the columns on the stage_customer table and modify the ‘Source Table’ property to source from load_customer_online table.
4. Now create the update procedure for stage_customer table and select procedure type of ‘Set Merge’. If you look at the generated code for this procedure you will notice that it loads data from both load_customer_store and load_customer_online tables into the stage_customer table.
5. Last step is to create the dimension dim_customer, using the stage_customer staging table as the source for this dimension.
Note: When scheduling the table for loading you must ensure that the dimension table (dim_customer) is not processed until after the staging table (stage_customer) is updated.
By taking advantage of the built-in functionality of WhereScape RED you eliminate the need to write and maintain custom code.