An Update on Data Warehouse Work at SMU
SMU is often required to report information and make decisions based on institution-wide data, but accumulating that data can be extremely time-consuming and difficult. Even for questions as simple as what defines an academic department, degree or major, discrepancies often make it impossible to compare data from school to school or to report accurately across the University – for instance, some departments list “majors,” others “concentrations,” still others “tracks.” For more complicated issues, such as student retention, substance abuse or university admissions, the problem of data siloed in different formats across myriad departments means months of labor by teams of people just to gather basic information.
To combat this problem, the Office of Operational Excellence launched a Data Warehouse Initiative in June 2016. The purpose of a Data Warehouse is to take large amounts of data, from different sources on campus and in different formats, and transform it into consistent, accurate and useful information. The ultimate goal is to enable SMU faculty and staff to access information more effectively to make better strategic, data-informed decisions.
Organizations and universities nationwide use data warehouses to transform thousands of pieces of data into coherent information. A data warehouse is a relational database designed for reporting and analysis rather than for transaction processing. It pulls complex, unmatched data from systems across the university, then simplifies, consolidates and transforms it into consistent data that flows easily into analytics and visualization tools. All parts of the organization can then draw from a rich, trusted information source.
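As a rough illustration of what "consolidating" unmatched data can look like, the sketch below maps records from three schools onto one shared layout, using the "majors," "concentrations" and "tracks" discrepancy described earlier. It is a minimal sketch only: the school names, field names and sample records are invented for the example and do not describe SMU's actual systems.

```python
# Hypothetical field names: each school labels a program of study differently.
PROGRAM_FIELD_ALIASES = ("major", "concentration", "track")


def normalize_record(record: dict, source: str) -> dict:
    """Map one source record onto a shared schema with a single 'program' field."""
    for alias in PROGRAM_FIELD_ALIASES:
        if alias in record:
            # Clean up spacing and capitalization so values compare consistently.
            program = record[alias].strip().title()
            break
    else:
        raise ValueError(f"no program field in record from {source}")
    return {"student_id": str(record["id"]), "program": program, "source": source}


def consolidate(sources: dict) -> list:
    """Combine records from every source into one consistently shaped list."""
    rows = []
    for source, records in sources.items():
        for record in records:
            rows.append(normalize_record(record, source))
    return rows


# Invented sample data standing in for three separate school systems.
sample_sources = {
    "school_a": [{"id": 1, "major": "history"}],
    "school_b": [{"id": 2, "concentration": " Finance "}],
    "school_c": [{"id": 3, "track": "studio art"}],
}

warehouse_rows = consolidate(sample_sources)
for row in warehouse_rows:
    print(row)
```

Once every record carries the same "program" field, a single report can count students by program across all schools without per-school special cases.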
Led by Patty Alvey, director of Assessment and Accreditation for SMU, the Data Warehouse Initiative team spent its first six months doing extensive research on data warehouses, interviewing other universities about their experiences, and reviewing all of SMU’s data holdings, processes and systems. The team confirmed that student data, financial data, alumni data, donor data and a host of other data sets exist in different locations and formats all over campus, making it difficult to conduct good strategic research and planning. More than 150 separate data sets were identified – examples include faculty teaching loads, student academic progress, semester course offerings, student fraternity and sorority affiliation, and many more.
To begin the process of synthesizing all this data, in spring 2017 the team recommended, and the Operational Excellence Executive Committee approved, the establishment of a Data Governance model based on a similar successful model at the University of Notre Dame. The model includes a Data Governance Steering Committee of 22 senior administrators to discuss overarching data needs, and a Data Governance Committee of 34 people who are involved with data management on a daily basis. Both committees are led by Michael Tumeo, director of Institutional Research at SMU, and include representatives from all areas of the University.
“Basically, data governance is looking at data that an institution owns as an asset, just like any other asset the business has – its buildings, its personnel, etc. And if you’re going to see it as an asset, then it needs to have active, intentional oversight,” said Tumeo. “As a University, we need to get a handle on what data we own and how we are using it. That means looking at questions such as, what are the data systems that hold potential information? Where in these systems are the data stored? How do we link data together from different systems? Who is responsible for the data in those systems? Who has access, and what are the policies and procedures for accessing data?”
Since May 2017, the Data Governance committees have been studying data policy issues, common terminologies, data sharing and access, and business processes surrounding the input, maintenance, extraction and reporting of SMU data holdings.
In October 2017, the Data Warehouse Initiative, working in collaboration with the Data Governance committees, made several key recommendations, all of which were approved by SMU’s executive leadership.
“Newer thinking on data warehouses isn’t that you grow the whole thing at once, but that you pick a project and then get the data for that project together,” said Alvey. “That’s called a Data Mart – a smaller version of a Data Warehouse. Eventually, our all-encompassing Data Warehouse will be built from a series of Data Marts. We picked five Data Mart projects to tackle over the next several years, all of which are important in the University’s 2016-2025 Strategic Plan.”
The projects, in order, are Substance Abuse Prevention, Retention and Graduation, Admission and Recruitment, Academic Program Performance, and Research and Creative Scholarship.
Each project is expected to take about six months.
Michael Hites, who joined SMU this summer as the University’s new chief information officer, worked with the Data Warehouse team to identify the hardware and software needed to begin the Data Marts, as well as the personnel. “We will appoint a Data Architect to design how the myriad pieces of data live with one another and ‘talk’ to each other, and a Data Visualization Specialist to help the software tell an interactive and visual story,” said Hites, a member of the Data Governance Steering Committee. “The specialist will also train someone in each college on campus on how to use the visualization software. Both of these data professionals will support the schools and departments to answer their own important questions using the shared Data Warehouse.
“The Data Architect will help SMU create and build the best model, the best framework on which to hang our data,” Hites said. “Once that is designed, our own OIT experts in extraction, translation and loading will grab the data sets from different data repositories of the University and load them into the Data Mart so that the new, comprehensive data set can be easily used. The Data Visualization person works with the technical and departmental staff to develop intuitive and visual ways to interact with the data throughout the University. The goal is to put data analytics in the hands of those who need it, beyond that of simple reports, statistics and graphs.”
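The process Hites describes is commonly known as extract, transform, load (ETL). The sketch below shows the general shape of such a pipeline, assuming two made-up source repositories and an in-memory SQLite database standing in for a Data Mart. The table name, columns and sample values are hypothetical and are not drawn from SMU's systems.

```python
import sqlite3


def extract():
    """Stand-in for reading from two separate campus repositories (invented data)."""
    registrar = [("1001", "Fall 2017", "enrolled"), ("1002", "Fall 2017", "withdrawn")]
    advising = [{"student": 1001, "advisor": "Lee"}, {"student": 1002, "advisor": "Cruz"}]
    return registrar, advising


def transform(registrar, advising):
    """Link the two sources on student ID, converting IDs to a common text form."""
    advisor_by_id = {str(row["student"]): row["advisor"] for row in advising}
    return [
        (student_id, term, status, advisor_by_id.get(student_id))
        for student_id, term, status in registrar
    ]


def load(conn, rows):
    """Write the combined rows into a single table in the data mart."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS retention_mart ("
        "student_id TEXT PRIMARY KEY, term TEXT, status TEXT, advisor TEXT)"
    )
    conn.executemany("INSERT OR REPLACE INTO retention_mart VALUES (?, ?, ?, ?)", rows)
    conn.commit()


def run_etl():
    """Run extract, transform and load in order against an in-memory database."""
    conn = sqlite3.connect(":memory:")
    load(conn, transform(*extract()))
    return conn


conn = run_etl()
enrolled = conn.execute(
    "SELECT COUNT(*) FROM retention_mart WHERE status = 'enrolled'"
).fetchone()[0]
print("Students still enrolled:", enrolled)
```

The point of the design is that questions spanning both sources, such as enrollment status by advisor, can be answered with one query against the mart instead of a manual merge of two systems.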
Once the personnel are identified, the first Data Mart project will begin, with a targeted completion in summer 2018.
The Data Warehouse Initiative has now officially dissolved, and most of the team members have joined one of the two Data Governance committees, where work will continue for years. Six subcommittees have also been formed and are in the process of talking with departments across campus about data storage, definitions, existing reports, policies, access and stewardship related to the first Data Mart.
“Getting SMU’s data in line so that it can be linked and analyzed, and the schools have faster access to cleaner information – that’s the win, the most important result of this whole effort,” said Alvey. “Thanks to the extremely dedicated work of the entire team over the past 18 months, SMU is now positioned to make that goal a reality.”