Menu Request Demo

Solving Load Problems in SQL Server

Date:
03 March 2011
Author:
Douglas Barrett

This article looks at the different levels of error reporting provided by the different database load methods into SQL Server.  There are a number of ways that we can define how to load data into SQL Server using RED:

  •          SSIS – this is Integration Services, SQL Server’s ETL tool that RED will generate and call dynamically.
  •          Database links – which utilize OLEDB providers to connect to from SQL Server to the source database.
  •          Native ODBC – which uses an ODBC connection to read source data and SQL Server’s BULK INSERT command to efficiently load the data.
  •          ODBC - which utilizes an ODBC client to read and load data into the data warehouse 1 row at a time.

The factors which determine the best load method to choose are typically:

  •          Load performance – a separate blog covers performance, but generally SSIS is the fastest, followed by Native ODBC or a Database Link (depending on the OLEDB and ODBC driver for the source system) and finally ODBC loading. 
  •          Error reporting - to assist troubleshooting.  RED will capture error messages that are generated by the method that is used to load data.  Some of these methods generate better error messages than others. 
  •          Environment – which mechanism is supported by the standards, policies and software environment. 

Load issues are usually caused by incorrect data type mapping (incorrect load table definition, data type incompatibilities or subsequent changes to the source system) or incorrect column NULLability.  Data type conversion can be significantly improved by editing the mapping of data types from source systems using Tools / Data Type Mappings – see the RED documentation for more details.  Other factors that can cause load failure are source system changes, network failure or duplicates when using a unique constraint. 

The ideal error reporting will identify:

  •          the reason that the load failed,
  •          the row that caused the failure (either by showing the value that failed, or a row number) ,
  •          the column that caused the failure. 

With all three of these parts in the failure report a developer can easily find the offending data that caused the problem and amend the load table attribute definition or apply a transformation to cater to load issue. 

The following table records the message recorded for each of the 4 load methods for loading data from SQL Server to SQL Server.  This test was run by altering the load table to generate a failure, and then capturing the error message during the load.  Interestingly some methods would allow a load where other methods fail.  For example an SSIS load will cope with a shorter destination data type by automatically truncating the data – it will generate a warning, but not a failure. 

 

Incorrect data type Length (varchar(15) to varchar(5)

Incorrect data type (char into an int)

NULLs into an NON-NULL column

SSIS

Warning:

Truncation may occur due to inserting data from data flow column "shipcity" with a length of 15 to database column "shipcity" with a length of 5.

 

Note:  data is loaded successfully but data is truncated to fit the destination column.

Failure:

Fails the load if implicit conversion fails:

There was an error with input column "shipcity" (92) on input "OLE DB Destination Input" (14). The column status returned was: "The value could not be converted because of a potential loss of data.".

Failure:

SSIS Error Code There was an error with input column "shipregion" (93) on input "OLE DB Destination Input" (14). The column status returned was: "The value violated the integrity constraints for the column."

Native ODBC

Failure:

Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 921, column 12 (shipregion).

Failure:

Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 1, column 11 (shipcity).

Completion or Failure:

The bulk load failed. Unexpected NULL value in data file row 921, column 12. The destination column (shipregion) is defined as NOT NULL.

 

Note: a NULL will be converted to an empty text string and not fail loading into a text column.

Database Links

Failure:

[Microsoft][ODBC SQL Server Driver][SQL Server]String or binary data would be truncated.

Failure:

[Microsoft][ODBC SQL Server Driver][SQL Server]Conversion failed when converting the nvarchar value 'Reims' to data type int.

Failure:

[Microsoft][ODBC SQL Server Driver][SQL Server]Cannot insert the value NULL into column 'shipregion', table 'WSLWarehouse.dbo.load_orders_link'; column does not allow nulls. INSERT fails.

ODBC

Failure:

Failed on insert after 8104 rows. [Microsoft][ODBC SQL Server Driver][SQL Server]String or binary data would be truncated.

Failure:

Failed on insert after 1 rows. [Microsoft][ODBC SQL Server Driver][SQL Server]String or binary data would be truncated.

 

Failure:

Failed on insert after 0 rows. [Microsoft][ODBC SQL Server Driver][SQL Server]Cannot insert the value NULL into column 'shipregion'

 

In conclusion a Native ODBC load consistently provides the best error reporting, identifying the failure reason, row number (of the intermediate text file) and the column that caused the problem.  In the event of a failure the failed data file will be persisted as a text file, and the reported error row can be found if the text file is opened with a text editor (eg Notepad or Textpad) and using Goto line.   

If another load method is used which does not clearly identify the failure details required to identify and fix the problem, then it is recommended that the particular load table is temporarily changed to use a Native ODBC load method to get better error reporting. 

CATEGORIES:

Comments 0

Leave a Comment

No comments.