Data Warehouse Automation 


To capitalize on the value of enterprise data, organizations must create a modern data warehousing environment that  embodies agile design and development, rapid time-to value, and an adaptable infrastructure that can flex quickly  and easily as business needs change. 

Unsurprisingly,  achieving this goal requires not only the right approach, but the appropriate toolset.

This guide highlights the uses and advantages of data  warehouse automation compared to traditional extract,  transform, and load (ETL) and extract, load, and transform  (ELT) tools, and helps you make informed decisions about  the capabilities your team needs for a modern approach to  data warehousing.

What is ETL?

ETL tools are typically server-based, data integration solutions for moving and  manipulating data from its sources to a target data warehouse. When ETL tools first emerged four decades ago, the servers that databases ran on did not have the  computing power of today, so ETL solutions were developed to alleviate the data  processing workload. They typically provided additional database and application connectivity and data manipulation functions that were previously limited in  database engines.

What is ELT?

Instead of using the older ETL method, today some vendors take an ELT approach.  With ELT, data transformation happens in the target data warehouse rather than requiring a middle-tier ETL server. This approach takes advantage of today’s  database engines that support massively parallel processing (MPP) as well as its  availability within cloud-based data platforms such as Snowflake, Amazon Redshift  and Microsoft Azure SQL Data Warehouse.

The Data Warehouse Lifecycle  

While ELT certainly represented a step forward in thinking compared to ETL,  both types of data movement solutions still only cover a small portion of the  data warehousing lifecycle. This means that organizations must rely on many  disparate tools to support everything else involved in designing, developing, deploying, documenting and operating their data warehouses and other  data infrastructure.

Data Warehouse Automation

In comparison to the limited scope of ETL and ELT tools, data infrastructure  automation encompasses the entire data warehousing lifecycle. From  planning, data discovery and design through development, deployment,  operations, change management — and even documentation — automation  unifies it all.

While ETL and ELT tools generate code to perform the movement and  transformation of data, automation goes beyond code generation. For  example, WhereScape generates database objects such as tables, indexes,  views, dimensions and facts – and comprehensive business and technical  documentation, including full data lineage – that is always in sync with  deployed data warehousing systems.

From design through operation, WhereScape automation helps teams reduce  the data warehouse development lifecycle by 80 percent and increase  developer productivity five-fold.

See the chart on page 5 for a look at the steps involved within the data  warehousing lifecycle and the tools data warehousing teams use to perform  the work within each.

7 Scenarios Where Automation Outperforms ETL/ELT Tools

Automation is designed for the needs of modern data warehousing. Review these seven scenarios where automation provides value beyond ETL/ELT to see if it is a fit for your data warehousing environment:

  1. Our users are unsure of which data they need and are unclear on their requirements
  2. Our requirements change rapidly during the project once users begin to see the real data
  3. Repetitive, time-consuming SQL hand-coding requires additional developers and/or lengthens project duration
  4. Consistent development standards and documentation have been lacking
  5. Our source systems change from time to time
  6. Time-to-value must be achieved in a matter of days and weeks versus months and years
  7. Simplifying deployment and ongoing operations is key to long-term viability of our data warehouse

Metadata for Data Infrastructure

Unlike ETL and ELT tools, WhereScape is metadata-driven, automatically producing and maintaining metadata to keep track of upstream and  downstream dependencies of all objects in the data infrastructure. This means developers can create, manage and document dependent objects knowing that they will automatically remain integrated and appropriately updated whenever there are changes to the underlying infrastructure that affects them.

By using metadata to synchronize and automate change management and documentation, WhereScape makes it much easier and faster for teams to respond to change. Whether evolving existing environments or migrating to new ones, metadata-backed automation gives developers a leg up compared to utilizing previous ETL- or ELT-only methods.See the chart on page 5 for a  look at the steps involved within the data warehousing lifecycle and the tools  data warehousing teams use to perform the work within each.

Data Warehouse Automation 

Automation is designed for the needs of modern data warehousing. Review these three scenarios where automation provides value beyond ETL/ELT to see if it is a fit for your data warehousing environment:

  1. Save months or years of upfront development time
  2. Automate documentation and data lineage to increase efficiency and decrease reliance on rare and siloed tribal knowledge
  3. Iterate in hours instead of weeks or month with automated change management where metadata synchronizes all development tasks.

Data Warehousing Tools 

Using Automation and ETL/ELT Tools Together

For situations where ETL or ELT code is already set up and working, data warehouse automation can be introduced without the need to reengineer or rebuild the existing code. Both solutions can co-exist, letting you reap the benefits of automating the rest of the data warehouse lifecycle.

Automated Data Warehouse

When efficiency, speed, and agility are critical to delivering insights to the business, data warehouse automation is the technology of choice. Fast-track the delivery and operation of your data infrastructure by using automation to

  • Eliminate repetitive, manual efforts
  • Reduce the need for broad, non-integrated toolsets
  • More easily address needed change within the infrastructure  you create.

Learn more about data infrastructure automation and how it can help your data warehousing team unite and fast-track the entire data warehousing lifecycle to deliver data infrastructure projects faster. See how your organization can deliver more with automation.