Introduction
Data management has become an important aspect of modern businesses, as the volume of data being generated and stored continues to grow at an exponential rate. This is certainly true for our clients at Scalefree. As data becomes increasingly important for them, organizations must ensure that the data they collect, store, and use is reliable and trustworthy. To do so, businesses require a robust data warehousing solution that provides a high level of accuracy, security, and traceability. Data Vault 2.0 is one way to implement such a solution, and auditability is one of its key features.
In this blog post, we will explore the advantages of auditing and traceability in Data Vault and how WhereScape can help you unlock these advantages.
Explanation of the Concept of Auditability in Data Vault
Data Vault is a data warehousing concept that provides a centralized, flexible, and scalable approach to data management. One of its key features is the concept of auditability. The concept of auditability in Data Vault not only focuses on the tracking of data changes and lineage but also covers the various processes involved in data management. In Data Vault, the four key areas that are covered in auditability are the data model, the operational process, the development process, and security. The data model includes both the data and the information it represents. The operational process encompasses processing timelines, runtime and error logs, ensuring that data processing runs smoothly and efficiently. The development process involves documenting and tracing code changes and implementing version control, allowing for easier debugging and maintaining the integrity of the data. Finally, the security aspect covers defined roles and responsibilities, as well as access control lists, to ensure that the data is protected and accessed by authorized individuals only. These four areas combined provide a comprehensive approach to auditability in Data Vault, ensuring that businesses have accurate and trustworthy data for decision-making purposes.
Typical Requirements
The requirements for auditability in the Data Vault modeling approach can be broadly categorized into two categories: data auditability and information auditability.
In terms of data auditability, it is crucial that every source delivery is reconstructable, to ensure that no data is lost and that no artificial data is generated. True duplicates and failed or incomplete deliveries must also be included in this process. Additionally, business rules must be managed through proper configuration management. By meeting these requirements, Data Vault helps to maintain the integrity of data used in decision-making processes, allowing for an exact representation of the source data in the raw Data Vault at the time of loading and the traceability and lineage of data changes.
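To make the idea of a reconstructable delivery more concrete, here is a minimal Python sketch. It is not how Data Vault or WhereScape implement this; the names `record_delivery`, `reconstruct_delivery`, `landing`, and the `crm.customers` record source are assumptions made up for illustration. The sketch keeps every delivered row, including true duplicates, together with its record source, load date, and position in the delivery, so the delivery can be rebuilt exactly as it arrived.

```python
from datetime import datetime, timezone


def record_delivery(landing, rows, record_source, load_date):
    """Append every delivered row as-is, including true duplicates."""
    for sequence, row in enumerate(rows):
        landing.append({
            "record_source": record_source,
            "load_date": load_date,
            "sequence": sequence,   # position of the row within the delivery
            "payload": dict(row),   # untouched copy of the source row
        })


def reconstruct_delivery(landing, record_source, load_date):
    """Return the rows of one delivery in their original order."""
    matching = [
        r for r in landing
        if r["record_source"] == record_source and r["load_date"] == load_date
    ]
    matching.sort(key=lambda r: r["sequence"])
    return [r["payload"] for r in matching]


# Hypothetical delivery containing a true duplicate.
delivered = [
    {"customer_id": 1, "city": "Hannover"},
    {"customer_id": 1, "city": "Hannover"},  # true duplicate, kept on purpose
    {"customer_id": 2, "city": "Berlin"},
]
landing = []
load_date = datetime(2023, 1, 1, tzinfo=timezone.utc)
record_delivery(landing, delivered, "crm.customers", load_date)
restored = reconstruct_delivery(landing, "crm.customers", load_date)
print(restored == delivered)
```

Because nothing is filtered or deduplicated on the way in, the reconstructed delivery neither loses rows nor contains rows the source never sent.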
Information auditability refers to the ability to track and reconstruct the transformation of raw data into meaningful information. This requires the ability to track the lineage of data changes, as well as the application of business rules and calculations that generate insights. By ensuring information auditability, Data Vault ensures the reliability and trustworthiness of the information used in decision-making processes.
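One way to picture information auditability is to store, next to every derived value, which business rule and which rule version produced it and which raw row it came from. The following Python sketch illustrates that idea under assumptions of our own: the `order_class` rule, its two versions, and the `apply_rule` helper are hypothetical and not part of Data Vault 2.0 or WhereScape.

```python
from datetime import datetime, timezone

# Hypothetical business rule in two versions: classify an order by its amount.
RULES = {
    ("order_class", 1): lambda amount: "large" if amount >= 1000 else "standard",
    ("order_class", 2): lambda amount: "large" if amount >= 500 else "standard",
}


def apply_rule(raw_row, rule_name, rule_version):
    """Derive a value and keep the rule version and raw-row reference with it."""
    rule = RULES[(rule_name, rule_version)]
    return {
        "value": rule(raw_row["amount"]),
        "rule_name": rule_name,
        "rule_version": rule_version,
        # Lineage back to the raw Data Vault row the value was derived from.
        "source_hub_key": raw_row["hub_key"],
        "source_load_date": raw_row["load_date"],
    }


raw = {
    "hub_key": "O-42",
    "load_date": datetime(2023, 1, 1, tzinfo=timezone.utc),
    "amount": 750,
}
old = apply_rule(raw, "order_class", 1)
new = apply_rule(raw, "order_class", 2)
print(old["value"], new["value"])
```

Since the raw row is never changed, the same input can be re-run against any rule version, which is what makes a reported figure explainable after the rule has changed.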
In conclusion, the Data Vault modeling approach emphasizes both data and information auditability, allowing organizations to make confident, data-driven decisions based on a complete and accurate representation of their data. In the next section, we will explore the key advantages of implementing auditability in Data Vault.
Advantages of Auditability in Data Vault
In a modern data environment, data passes through various layers. To maintain data quality across all of them, it must always be clear where the data came from.
Exact copy of data from the source at the time of loading
To achieve this, we store an exact copy of the source data in the raw Data Vault. Every row is enriched with a record source and a load timestamp. Furthermore, Data Vault relies on an insert-only architecture. That means we do not update or delete any data. Instead, each change is tracked as a new row in a satellite. Deletes are tracked in a so-called record-tracking satellite. The result is a fully auditable data warehouse that reflects the complete history of the source data and lets you access the data as it was at any point in time.
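The insert-only pattern can be sketched in a few lines of Python. This is a simplified illustration, not the WhereScape-generated load logic: the in-memory list standing in for a satellite table and the names `hash_diff`, `load_satellite`, and `as_of` are assumptions. A row is appended only when the descriptive attributes differ from the latest stored version, and existing rows are never updated or deleted.

```python
import hashlib
from datetime import datetime, timezone


def hash_diff(attributes):
    """Hash the descriptive attributes in a stable column order."""
    payload = "|".join(f"{key}={attributes[key]}" for key in sorted(attributes))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def load_satellite(satellite, hub_key, attributes, record_source, load_date):
    """Append a new row only if the attributes changed; return True if inserted."""
    new_diff = hash_diff(attributes)
    versions = [row for row in satellite if row["hub_key"] == hub_key]
    latest = max(versions, key=lambda row: row["load_date"], default=None)
    if latest is not None and latest["hash_diff"] == new_diff:
        return False  # nothing changed, so nothing is written
    satellite.append({
        "hub_key": hub_key,
        "load_date": load_date,
        "record_source": record_source,
        "hash_diff": new_diff,
        "attributes": dict(attributes),
    })
    return True


def as_of(satellite, hub_key, point_in_time):
    """Return the version of a key that was valid at the given point in time."""
    candidates = [
        row for row in satellite
        if row["hub_key"] == hub_key and row["load_date"] <= point_in_time
    ]
    return max(candidates, key=lambda row: row["load_date"], default=None)


t1 = datetime(2023, 1, 1, tzinfo=timezone.utc)
t2 = datetime(2023, 1, 2, tzinfo=timezone.utc)
t3 = datetime(2023, 1, 3, tzinfo=timezone.utc)

satellite = []
load_satellite(satellite, "C1", {"city": "Hannover"}, "crm", t1)  # first version
load_satellite(satellite, "C1", {"city": "Hannover"}, "crm", t2)  # unchanged
load_satellite(satellite, "C1", {"city": "Berlin"}, "crm", t3)    # changed

print(len(satellite))
print(as_of(satellite, "C1", t2)["attributes"]["city"])
```

In this run the satellite ends up with two rows, and the query for the second day still returns the first version, because the change to Berlin was only loaded on the third day.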
Traceability and lineage of data
Auditability in Data Vault also provides organizations with the ability to trace data back to its source, or to track changes to a particular data element over time. This helps to ensure data accuracy and integrity, as well as providing organizations with greater control over their data. WhereScape offers tools that visualize the data lineage, which is covered in detail later in the article.
Accountability and continuous improvement
Auditability in Data Vault allows organizations to monitor and control access to data, as well as set up role-based access control. This helps to ensure that data is being used appropriately and that any changes to the data are tracked and documented.
Handling of changes in the source system
Data Vault's modular design allows changes in source systems to be handled easily and efficiently. As discussed above, satellites in Data Vault only ever receive new records and never update existing ones, which eliminates the risk of data loss. When a structural change is needed, such as adding a new hub to an existing link structure, a new link is created while the old link remains intact. The same applies to satellites. This approach ensures that the lineage and traceability of data remain intact, while also providing convenience and flexibility when changes occur in source systems.
Changing business rules and versioning
In Data Vault, auditability also facilitates version control and access management of data. Because every modification to the data is tracked and documented, changes can be made with greater control and ease. Role-based access control is also established to ensure that data changes are properly authorized.
Leveraging WhereScape Features for Auditability in Data Vault
WhereScape is a powerful tool for building and managing data warehouses, and it provides several key features that support the implementation of auditability in Data Vault. Here are some of the best practices for leveraging these features to maintain auditing in your Data Vault implementation:
- Versioning: WhereScape provides robust versioning capabilities, allowing you to track changes to your data and maintain a history of those changes over time. This helps you to understand how your data has evolved over time, making it easier to identify any issues or problems that may have arisen along the way.
- Role-based access control: WhereScape also provides role-based access control, which allows you to control who can access your code and data and what they can do with it. This helps you to maintain the integrity of your data and to ensure that only authorized users have access to the information they need.
- Workflow management: WhereScape 3D provides a comprehensive set of tools for managing workflows, so you can automate and streamline your data management processes. This helps you to maintain a high level of accuracy and control over your data by properly documenting and tracking all your data management activities, which is crucial for maintaining auditability.
- Documentation and reporting: Finally, WhereScape provides a range of reporting and documentation tools, so you can easily create reports and other forms of documentation that can be used to support your audit processes. This helps to properly document and track your activities, essential for supporting any audits or compliance requirements.
Conclusion
Auditability and traceability are crucial components in modern data management at Scalefree’s clients’ workplaces and across the broader industry. By combining auditing in various areas, including data modeling, operational processes, development processes, and security, Data Vault helps ensure that the collected, stored, and utilized data is trustworthy and reliable. With WhereScape, a powerful tool for constructing and managing data warehouses, you can make the most of key features such as versioning, role-based access control, workflow management, and documentation and reporting to maintain a fully auditable Data Vault implementation with high accuracy and control. However, it remains important to ensure that the solution stays aligned with the foundational principles of Data Vault 2.0.
Lorenz Kindling is a Consultant in Business Intelligence and Enterprise Data Warehousing (EDW) with a focus on data warehouse automation and Data Vault modelling. Since 2021, he has advised renowned companies in various industries for Scalefree International. Throughout his career, Lorenz has acquired profound knowledge and skills in effectively utilizing WhereScape alongside Data Vault 2.0.
About Scalefree:
“Scalefree International is focused on offering companies, from a variety of industries, practical yet innovative solutions towards leveraging Big Data within modern business.
Built upon the tenets set out by Data Vault’s inventor, Dan Linstedt, Scalefree also provides clients with in-depth training to maximize the success of implementing Data Vault 2.0 within their business.”
Link to the Website: https://scalefr.ee/ws-webinar