A data warehouse is both an application and a concept. As an application, it collects and organizes data from various parts of an organization to enable report development and analytics. By that definition, an organization should need only one data warehouse; in practice, however, many organizations run multiple data warehouses, each serving a specific business function. In these organizations, a warehouse is better understood as an idea or a concept.
Many organizations invest in data warehouses to satisfy specific business objectives, yet a data warehouse has several presumed functions. Because it collects data from multiple sources, it could act as a hub for data distribution or for integrations between applications. Alternatively, one could use the warehouse to develop operational reports for business applications. And many use the data warehouse for its original, well-known purpose of developing analytics and business intelligence. If all these and many more are acceptable functions of a warehouse, then why the specificity when making the investment? Can a single data warehouse do everything? The simple answer is no.
One needs to establish and understand the primary function of a warehouse at its inception. This is critical for the data warehouse design and support strategies. For example, if the data warehouse is responsible for generating invoices for customers, then any disruption in the collection of data or lapse in the quality of data will directly impact business function and revenue. In such an implementation, greater rigor is placed on the stability of data collection processes and on the quality of data to ensure we meet the established service level agreement with the business. On the other hand, if the warehouse’s primary function is analytics and supporting executive decisions, then one must place more emphasis on collecting data from across the institution.
At Washington University in St. Louis, we are shifting our paradigm. Our legacy data warehouse supported operational reporting within units. This approach was necessary because our applications did not have robust reporting capabilities; that is changing, however, and most of our newer business applications come with robust reports and report development platforms. Our new direction is to move the data warehouse towards a more analytically oriented tool, while also supporting historical and cross-domain reporting. This shift is significant and intentional: there is limited value in creating another environment to fulfill an operational reporting goal that the applications themselves now meet.
Additionally, we feel interdisciplinary programs and research are gaining prominence, and the data warehouse is the one application that can rise to the challenge. The data warehouse can span domains and applications to support analytics and report development. University executives and senior leaders within schools can lean on the data warehouse for insight into complex business questions. Lastly, the data warehouse program is a journey, where the data warehouse’s value for an institution grows with time. As we get more and more data into the warehouse, the analytics get richer and more accurate.
The data warehouse program is like the Star Wars movie franchise – with every movie released, the characters become richer and the audience gains a better understanding of the overall plot. Similarly, with every data warehouse release we add more data to the warehouse, making insights richer, more meaningful, and more precise. This program will enable WashU leadership to make decisions that are rooted in rich data.