Join us for a 90-minute, insightful webinar...
Maintaining a Data Warehouse
In the final article of this series, I address the topic of maintaining the data warehouse that has been designed, built, and deployed over the previous three articles. But, maintenance is too small a word for what is entailed!
In more traditional IT projects, when a successful system is tested, deployed and in daily operation, its developers can usually sit back and take a well-deserved rest as users come on-board, and leave ongoing maintenance to a small team of bug-fixers and providers of minor enhancements. At least until the start of the next major release cycle. Developers of today’s data warehouses have no such luxury.
The measure of success of a data warehouse is only partly defined by the number and satisfaction level of active users. The nature of creative decision-making support is that users are continuously discovering new business requirements, changing their mind about what data they need, and thus demanding new data elements and structures on a weekly or monthly basis. Indeed, in some cases, the demands may arrive daily!
This need for agility in regularly delivering new and updated data to the business through the data warehouse has long been recognized by vendors and practitioners in the space. Unfortunately, such agility has proven difficult to achieve in the past. Now, ongoing digitalization of business is driving ever higher demands for new and fresh data. Current—and, in my view, short-sighted—market thinking is that a data lake filled with every conceivable sort of raw, loosely managed data will address these needs. That approach may work for non-critical, externally sourced social media and Internet of Things data. However, it really doesn’t help with the legally-binding, historical and (increasingly) real-time internally and externally sourced data currently delivered via the data warehouse.
Fortunately, the agile and automated characteristics of the Data Vault / data warehouse automation (DWA) approach described in the design, build and operate phases discussed in earlier posts apply also to the maintenance phase. In fact, it may be argued that these characteristics are even more important in the maintenance phase than in the earlier ones of data warehouse development.
One explicit design point of the Data Vault data model is agility. A key differentiator between Hub, Link, and Satellite tables is that they have very different usage types and temporal characteristics. Such separation of concerns allows changes in both data requirements (frequent and driven by business needs) and data sources (less frequent, but often requiring deep “data archeology”) to be handled separately and more easily than in traditional designs. In effect, the data warehouse is structured according to good engineering principles, while the data marts flow with user needs. This structuring enables continuous iteration of agile updates to the warehouse, continuing through to the marts, by reducing or eliminating rework of existing tables when addressing new needs. For a high-level explanation of how this works, see Sanjay Pande’s excellent “Agile Data Warehousing Using the Data Vault Architecture” article.
The engineered components and methodology of the Data Vault approach are particularly well-suited to the application of DWA tools, as we saw in the design and build phases. However, it is in the maintain phase that the advantages of DWA become even more apparent. Widespread automation is essential for agility in the maintenance phase, because it increases developer productivity, reduces cycle times, and eliminates many types of coding errors. WhereScape® Data Vault Express™ incorporates key elements of the Data Vault approach within the structures, templates, and methodology it provides to improve a team’s capabilities to make the most of potential automation gains.
Furthermore, WhereScape’s metadata-driven approach means that all the design and development work done in preceding iterations of data warehouse/mart development is always immediately available to the developers of a subsequent iteration. This is provided through the extensive metadata that WhereScape stores in the relational database repository and makes available directly to developers of new tables and/or population procedures. This metadata plays an active role in the development and runtime processes of the data warehouse (and marts) and is thus guaranteed to be far more consistent and up-to-date than typical separate and manually maintained metadata stores such as spreadsheets or text documents.
In addition, WhereScape automatically generates documentation, which is automatically maintained, and related diagrams, including impact analysis, track back/forward, and so on. These artefacts aid in understanding and reducing the risk of future changes to the warehouse, by allowing developers to discover and avoid possible downstream impacts of any changes being considered.
Another key factor in ensuring agility and success in the maintenance phase is the ongoing and committed involvement of business people. WhereScape’s automated, templated approach to the entire design, build and deployment process allows business users to be involved continuously and intimately during every stage of development and maintenance of the warehouse and marts.
With maintenance, we come to the end of our journey through the land of automating warehouses, marts, lakes, and vaults of data. At each step of the way, combining the use of the Data Vault approach with data warehouse automation tools simplifies technical procedures and eases the business path to data-driven decision making. WhereScape Data Vault Express represents a further major stride toward the goal of fully agile data delivery and use throughout the business.
You can find the other blog posts in this series here:
- Week 1: Designing a Data Warehouse
- Week 2: Building a Data Warehouse
- Week 3: Operating a Data Warehouse
Dr. Barry Devlin is among the foremost authorities on business insight and one of the founders of data warehousing, having published the first architectural paper on the topic in 1988. Barry is founder and principal of 9sight Consulting. A regular blogger, writer and commentator on information and its use, Barry is based in Cape Town, South Africa and operates worldwide.
Optimizing Enterprise Data Management Solutions with WhereScape RED
Empowering Enterprise Data Management with WhereScape RED Choosing the best data warehouse automation software can make enterprises more scalable, accurate, and competitive. WhereScape RED is one of the most empowering enterprise data management solutions available,...
Enhancing Data Interoperability: Connecting Diverse Systems with WhereScape
Understanding Data Interoperability in Modern Enterprises Modern enterprises use countless different systems to gather, store, analyze, and process data. From varying data types to unique purposes and levels of security, a single enterprise handles data in several...
Gartner Highlights the Rise of Data Warehouse Automation
Imagine a world where the manual, tedious tasks of data warehouse development are a thing of the past. This isn't a far-off fantasy but a present-day reality, thanks to advances in Data Warehouse Automation (DWA). Gartner's latest report by analyst Henry Cook,...
Experience the Power of WhereScape 3D 9.0.3: New Features and Improvements
We’re thrilled to introduce our latest iteration of WhereScape 3D! Version 9.0.3 brings a host of new features and enhancements designed to make your data warehousing journey smoother, faster, and more efficient. Let’s dive into the details of what you can expect from...
Ahead of the Curve: Future Trends in Data Automation and WhereScape’s Pioneering Solutions
The Evolving Landscape of Data Automation As new technologies emerge and existing tools constantly change and improve, the world of data automation transforms rapidly. Even the most well-versed data teams find themselves disoriented and overwhelmed in the face of...
Investing in Data Automation: A Strategic Approach to Business Growth
Unlocking Growth: The Strategic Advantage of Data Automation Organizations reaping the benefits of data automation stay ahead of industry trends and improve the efficiency of their operations and decision-making. Data automation tools offer a strategic advantage for...
Data + AI Summit 2024: Key Takeaways and Innovations
The Data + AI Summit 2024, hosted by Databricks at the bustling Moscone Center in San Francisco, has concluded with remarkable revelations and forward-looking innovations. Drawing over 16,000 attendees in person and virtually connecting over 60,000 participants from...
WhereScape RED 10.1 is Here: Enhanced Scheduling and Customization
We’re proud to announce the highly anticipated WhereScape RED 10.1 is now available, and it’s packed with exciting new features and enhancements designed to make your data warehousing experience more efficient and enjoyable. Let's take a closer look at what’s new and...
Supercharging Data Integration: The WhereScape and Databricks Advantage
The demand for robust data management systems has never been higher, and Databricks has quickly become a favored choice for cloud-based solutions. Its powerful capabilities make it a top contender for managing large-scale data, but when combined with WhereScape's...
Empowering Customer Success: WhereScape’s Comprehensive Support and Training Resources
Enhancing Operational Success with WhereScape’s Support Systems At WhereScape, we understand that a data warehouse is only useful to the extent that it is understood. In order to drive your organization closer to your key goals and objectives, you need full mastery of...
Related Content
Optimizing Enterprise Data Management Solutions with WhereScape RED
Empowering Enterprise Data Management with WhereScape RED Choosing the best data warehouse automation software can make enterprises more scalable, accurate, and competitive. WhereScape RED is one of the most empowering enterprise data management solutions available,...
Enhancing Data Interoperability: Connecting Diverse Systems with WhereScape
Understanding Data Interoperability in Modern Enterprises Modern enterprises use countless different systems to gather, store, analyze, and process data. From varying data types to unique purposes and levels of security, a single enterprise handles data in several...
Gartner Highlights the Rise of Data Warehouse Automation
Imagine a world where the manual, tedious tasks of data warehouse development are a thing of the past. This isn't a far-off fantasy but a present-day reality, thanks to advances in Data Warehouse Automation (DWA). Gartner's latest report by analyst Henry Cook,...
Experience the Power of WhereScape 3D 9.0.3: New Features and Improvements
We’re thrilled to introduce our latest iteration of WhereScape 3D! Version 9.0.3 brings a host of new features and enhancements designed to make your data warehousing journey smoother, faster, and more efficient. Let’s dive into the details of what you can expect from...