1. Incremental delivery: While it is sensible to build any Data Warehouse within the context of an overall enterprise model, Data Vault supports entirely incremental delivery. Just like Kimball’s Dimensional Design approach, you can start small and add further sources over time.
2. Flexibility: Unlike the 3rd Normal Form modelling approach, which can be inflexible, Data Vault does not require existing structures to be reworked when new sources are added. Because Data Vault stores the Raw and Business derived data separately, it also supports changes to business rules with ease.
3. Reduced Complexity: As Data Vault is built out in a two-step approach, it separates the technical data restructuring from the application of business rules, which helps isolate these potentially complex stages. Likewise, data cleaning is considered a business rule and can be managed independently of the initial data load effort.
4. Raw Data Included: Because Data Vault records the raw data, it’s possible to back-populate the presentation area with historical attributes that were not initially made available. If the Data Marts are implemented as views, this can be as simple as adding a column to an existing view.
5. Elegantly supports change over time: Like slowly changing dimensions in the Kimball approach, Data Vault tracks how data changes over time. Unlike the pure Dimensional Design, however, Data Vault separates the Raw and Business derived data, so it supports changes resulting from both the source system and the business rules.
6. Lineage and Audit: As Data Vault includes metadata identifying the source systems, it makes it easier to support data lineage. Unlike the Dimensional Design approach in which data is cleaned before loading, Data Vault changes are always incremental, and results are never lost, which provides an automatic audit trail.
7. High-Performance Parallel Loads: With the introduction of Hash Keys in Data Vault 2.0, data load dependencies are eliminated, which means near real-time data loading is possible in addition to parallel loads of terabytes to petabytes of data.
8. Possible to Automate: Both Entity Relationship Modelling and Dimensional Design require time and experience to build skills. Data Vault, by contrast, tends to be easier to automate, and there are several tools (listed below) to help deliver the solution.
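
To make point 7 more concrete, the following is a minimal Python sketch of why hash keys remove load dependencies. It is an illustration under stated assumptions, not a reference implementation: the `hash_key` helper, the `||` delimiter, the trim-and-uppercase normalisation, the choice of MD5, and the sample rows are all hypothetical. The idea is that each loader derives the key from the business key alone, so a Link load never has to look up a surrogate key that a Hub load generated earlier.

```python
import hashlib


def hash_key(*business_key_parts: str, delimiter: str = "||") -> str:
    """Derive a deterministic hash key from one or more business key parts.

    Assumption: parts are trimmed and upper-cased before hashing so that
    the same business key always produces the same hash.
    """
    normalised = delimiter.join(part.strip().upper() for part in business_key_parts)
    # MD5 is used here only as an example of a deterministic hash function.
    return hashlib.md5(normalised.encode("utf-8")).hexdigest()


# Hypothetical staged rows arriving from two different source feeds.
customer_row = {"customer_no": "C-1001", "name": "Ada"}
order_row = {"order_no": "SO-42", "customer_no": " c-1001 "}

# The Hub load computes its key from the business key alone.
hub_customer_key = hash_key(customer_row["customer_no"])

# The Link load computes the same customer key independently,
# without querying the Hub, so both loads could run in parallel.
link_customer_key = hash_key(order_row["customer_no"])
link_key = hash_key(order_row["order_no"], order_row["customer_no"])

print(hub_customer_key == link_customer_key)
```

With sequence-generated surrogate keys, by contrast, the Link load would have to wait for the Hub load to finish and then look up the generated key, which is the dependency that point 7 describes as eliminated.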