Prototype
Data Warehouse Prototyping - Progressive Discovery and User Commitment
WhereScape Data Warehouse Prototyping White Paper:
Reducing Risk, Securing Commitment and Improving Project Governance
IDC White Paper:
Fail Often, Fail Fast: Using Rapid Prototyping for Successful Data Warehousing
The number one nontechnical business analytics challenge identified by IDC surveys in 2007 was unclear end-user requirements, together with the difficulty of identifying and gathering those requirements.
The need to engage in iterative prototyping in data warehousing / decision support projects has been well understood since the early 1990s, when Bill Inmon made note of the non-traditional data warehouse development cycle in Building the Data Warehouse. Inmon noted the need for prototype-and-iterate cycles in order to discover, over time, precisely what informational needs a particular analytical community has. He operated on the assumption - since clearly demonstrated to be a good rule-of-thumb for practitioners - that analysts starved for data, and unaware of the data elements available to them, often have to be shown what they ask for (and do not actually need) before they can formulate any accurate description of their real needs and desires. People, in other words, need to see what they can have - and what it means to have what they want - before designers can trust their answers to questions like "Do you need X?" or "At what level of granularity does Y need to appear?" or "How do you relate X to Y?" - the basic sorts of questions designers ask.
The other reason for prototype-and-iterate cycles - as Ralph Kimball and others have pointed out - is that such cycles both raise the probability that the data warehouse design solves the broadest possible set of analyst needs, and socialize both the data warehouse design itself and the process of doing data warehouse design collaboratively with the end-user community. When compared to traditional data modeling exercises - in which designers (whom the end-users had never seen before, and were unlikely to ever see again) arrived in a conference room to unroll a 4-square-meter third-normal-form Entity Relationship (ER) diagram and ask the pointless question "Does this make sense to you?" - a well-done prototype-and-iterate cycle can do more to gain end-user commitment to a data warehousing project than any other activity in the traditional data warehouse methodological process.
That's the theory, anyway: better designs, all relevant data at the right level of granularity, strong user commitment.
In practice, our experience with successful and with failed data warehouse projects tells us that:
- fewer than 50% of data warehouse projects include any prototype-and-iterate cycle
- those projects that do include a prototype-and-iterate cycle execute the loop only once (a preliminary design is tested, and then modified, and then implemented)
- most of those single-loop cycles use paper-based designs: at best, small subsets (typically measured in weeks or months along a time dimension) of data may sometimes be available but rarely in the user communities' preferred analytical tools
- the use of a predefined logical model does not obviate the need for prototyping; data-driven prototypes used alongside logical data models produce better outcomes
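To make the idea of a data-driven prototype concrete, here is a minimal sketch, not a description of how any particular tool works. It assumes a hypothetical sales source (the rows, table names and column names are all invented for illustration), loads only a two-month slice of it into a small in-memory star schema using Python's standard sqlite3 module, and lets a user see the same figures at two levels of granularity.

```python
import sqlite3

# Hypothetical source rows: (sale_date, product, region, amount).
SOURCE_ROWS = [
    ("2007-01-03", "widget", "north", 120.0),
    ("2007-01-17", "widget", "south", 80.0),
    ("2007-02-05", "gadget", "north", 200.0),
    ("2007-02-21", "widget", "north", 50.0),
    ("2007-03-09", "gadget", "south", 75.0),
]


def build_prototype(rows, start, end):
    """Load rows with start <= sale_date < end into an in-memory star schema."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE dim_date (date_key TEXT PRIMARY KEY, month TEXT);
        CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE fact_sales (date_key TEXT, product_key INTEGER,
                                 region TEXT, amount REAL);
    """)
    for sale_date, product, region, amount in rows:
        # ISO dates compare correctly as strings, so this keeps only the slice.
        if not (start <= sale_date < end):
            continue
        con.execute("INSERT OR IGNORE INTO dim_date VALUES (?, ?)",
                    (sale_date, sale_date[:7]))
        con.execute("INSERT OR IGNORE INTO dim_product (name) VALUES (?)",
                    (product,))
        product_key = con.execute(
            "SELECT product_key FROM dim_product WHERE name = ?", (product,)
        ).fetchone()[0]
        con.execute("INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
                    (sale_date, product_key, region, amount))
    con.commit()
    return con


def sales_by(con, grain):
    """Return total sales per product at the requested grain: 'day' or 'month'."""
    column = {"day": "d.date_key", "month": "d.month"}[grain]
    sql = f"""
        SELECT {column}, p.name, SUM(f.amount)
        FROM fact_sales f
        JOIN dim_date d ON d.date_key = f.date_key
        JOIN dim_product p ON p.product_key = f.product_key
        GROUP BY {column}, p.name
        ORDER BY {column}, p.name
    """
    return con.execute(sql).fetchall()


# Populate the prototype with January and February only, then query both grains.
con = build_prototype(SOURCE_ROWS, "2007-01-01", "2007-03-01")
monthly = sales_by(con, "month")
daily = sales_by(con, "day")
for row in monthly:
    print(row)
```

Even a toy like this lets a user answer "At what level of granularity does Y need to appear?" by looking at real numbers rather than a diagram; a real prototype would do the same against the users' own data and preferred tools.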
The reason for short-circuiting the design process by not including data-driven prototypes is twofold: time and cost. Simply put, with the creaky tools we have had available to us in the past, it takes too long to actually implement and populate a data warehouse or data mart prototype with sufficient data for user constituencies to test-drive the model in their real-world analytical environment. Time is of the essence, and we have very little money - so let's show them a few design diagrams and call it good.
But it isn't good, and every decent data warehouse designer knows that. With WhereScape RED, you can fix this short-circuit, and:
- design with technically savvy users in the loop
- produce a fully-populated prototype from a design in a few hours
- rework designs and repopulate prototypes in minutes, depending on extraction windows and load set sizes
- maintain multiple candidate designs, and their working prototypes, simultaneously, allowing users to compare various design solutions, or suggest design solutions that are combinations of two or more competing designs
- perhaps most importantly, allow large numbers of "influencer" users to see "their data" in "their tools" during the entire prototype phase
In addition, when a prototype is judged effective by designers and users, WhereScape RED designers are never in a position where the user community's now-much-loved prototype is scrapped, and the logical design redeveloped from scratch in "the production environment." WhereScape RED designers, on design approval, can take their data warehouses into production the next day with WhereScape's integrated deployment and operations facilities.
For more information on how prototyping can be used, read WhereScape's Data Warehouse Prototyping White Paper, "Reducing Risk, Securing Commitment and Improving Project Governance", or the IDC White Paper, "Fail Often, Fail Fast: Using Rapid Prototyping for Successful Data Warehousing".