Menu Request Demo

Loading Data Using RED

05 June 2008

The are several different ways to load data from files and source systems into the data warehouse load table using WhereScape RED.  Each load table type has different attributes, options and performance advantages in different situations.

RED automates loading data from files by either parsing files into columns as they're loaded or by loading files into single column tables.  The following kinds of flat files:

  • fixed width
  • delimited
  • XML

The default is to have RED manage the code for loading file.  This is called a file load.  You may decide to generated a script and manage this yourself.  This is called a script load.

When load fixed width and delimited files, RED uses the load utilities in the target data warehouse database environment.  For SQL Server this is BULK insert.  In an Oracle data warehouse, SQL*Loader is used.  Teradata has several different loaders and RED can utilize fastload, multiload and TPT.  DB2 uses the LOAD statement.

All options available for these loaders are able to be used by RED.

Two options are available for XML.  The default is to allow RED to read and load the XML file itself.  Another option available for SQL Server data warehouses.  This involves using the XML OLEDB provider for SQL Server via a Script load.

RED also automates extracting and loading data from other databases into the data warehouse.  Again, there are several approaches available:

  • ODBC loads
  • Native ODBC loads
  • Database Link loads via a database link (Oracle warehouses only)
  • Database Link loads via a linked server (SQL Server warehouses only)
  • Externally Loaded load tables

For large volume loads from another database into the data warehouse, the fastest option is almost always the Native ODBC load.  The exceptions for this rule are:

  • In an Oracle data warehouse loading data from another oracle database, a database link load may be faster if one of the following is the case:
    • the two databases are connected via a really fast network
    • the two databases on the same physical server

For smaller volumes, you should always use database link loads for SQL Server->SQL Server and Oracle->Oracle load tables.  Similarly, for small volumes, loading data from a different kind of database, it may be better to use ODBC loads rather than Native ODBC loads.

The final option RED provides is Externally loaded load tables.  These provide integration with existing ETL or EAI infrastructure.  RED knows about the load table in its metadata and can build downstream objects from the load table.  RED does not load the load table.  It waits for the other system to load it.  Job integration is achieved using a combination of RED and the other products APIs.

Refer to the WhereScape RED User Guide for full information on all of the load types discussed.


Comments 0

Leave a Comment

No comments.