Awash in Data: Promises and Pitfalls

The word “data” may appear to many policymakers and managers as a modern-day “open sesame” to enter the cave of well-run states. But while gathering facts and figures is a crucial first step, actually analyzing, utilizing and communicating them is the key to progress. That’s not easy.


About the Authors
Katherine Barrett and Richard Greene are a husband and wife team who are senior fellows at the Council of State Governments, senior advisers to the Pew Charitable Trusts government performance unit, senior advisers to the Fels Institute at the University of Pennsylvania and fellows in the National Academy of Public Administration. They are also columnists for Governing Magazine and senior fellows at the Governing Institute.

The use of valid, timely data to effectively run a government is not a new factor in the world of the states. The first census—just 14 years after the Declaration of Independence was signed—was an indicator that the Founding Fathers understood that if this new republic couldn’t count things (people in this case), it could hardly be expected to manage itself.

In the pre-computer age, of course, the utility of data was little more sophisticated than it was in a mom-and-pop candy store, where cash sat in the register and was supposed to match a pile of receipts kept in a nearby drawer. Even after computers began to enter the scene, the states’ use of simple financial data was still in its infancy.

When Edward Regan took the comptroller’s helm in New York state in 1978, he lamented the state of the art of the day. As he told Forbes magazine in 1980, “There [has been] no accountability. The books [have been] so loose and kept in such an undisciplined manner that governors and legislators [have not] been held responsible for their actions.”1

While there is wide variation in the quality of today’s data, both financial and performance related, and in its utility, one significant trend has emerged in the states: the sheer quantity of available data grows every minute of every day across the 50 states.

Consider a sunset review that recently was completed in Texas about the five agencies that make up the health and human services system: “According to informal estimates, the total volume of information maintained by system agencies could top 200 terabytes of data. For comparison, a digitized version of the Library of Congress’s 17 million printed holdings would total about 136 terabytes; while all data sent from the Hubble Telescope from its first 24 years was about 100 terabytes.”2

Meanwhile, as the review points out, Texas’ health and human services system now includes 800 underlying data systems that have grown up over time with different standards and different data definitions.

Despite obstacles such as those cited above, the potential for the technology that permits states to gather nearly unthinkable quantities of data is huge. Increasingly, people have referred to “big data” as the science of utilizing huge quantities of data across a variety of databases to come up with better policies and practices.3 As a result of this phenomenon, the vision of what such information can do has grown immensely.

This potential can be realized with knowledge, skills and creativity. It is not necessarily dependent on buying expensive new technology systems.

“You can simulate things that you couldn’t have done in the past,” said Max Arinder, executive director of Mississippi’s Joint Committee on Performance Evaluation and Peer Review (PEER). “You can do some magical things.”4

A few examples:

  • In order to improve collection of child support payments, Pennsylvania in 2009 determined to provide new, real-time information about parents’ financial, employment and asset history. It began to use electronic data exchanges to lessen reliance on manual data collection, improve case management and develop more sophisticated predictive abilities. That was the beginning. In 2010, the state’s data exchanges were enhanced and case managers had access to reliable, relevant and current data, which allowed them to quickly determine the most effective actions for collecting child support or providing medical insurance to children. By 2011, the most recent year for which national statistics are available, Pennsylvania had been able to raise its child support collection rate to about 80 percent, compared with the 50-state average of 62.4 percent.5
  • One audit, recently issued by the Massachusetts State Auditor’s Office, examined all 710,025 Medicaid claims paid by MassHealth Limited, the program designed to aid illegal immigrants with emergency treatments. Data analytics allowed the state to fully analyze every transaction processed through this program over a three-year period in a relatively short amount of time. Some 45 percent of those claims were discovered to be questionable, a remarkably high figure that has galvanized the state to target attention on specific issues, ameliorate them and prevent them from recurring.6
  • Indiana attacked the problem of infant mortality with intensive data analytics in 2014, combining multiple data sets to look at the drivers of infant mortality and give policymakers information on how best to lower the state’s above-average infant mortality rate. Analysts started with 17 integrated data sets drawn from five agencies and public sources like the U.S. Census, and ended up concentrating on five data sets that helped identify which population subgroups were most at risk. The data revealed that about 65 percent of deaths occurred among mothers who had fewer than 10 prenatal visits and that younger mothers on Medicaid were most at risk of not getting the prenatal care they needed.7

This knowledge, combined with more qualitative research, is contributing to new policy approaches that will help the state target pregnant women who may be most vulnerable to poor birth outcomes and develop ways to improve their prenatal care. For example, while the deep data dig revealed that distance to a health facility or doctor was not a factor in lowering access to prenatal care, subsequent follow-up research has suggested that helping individuals with transportation could have positive effects on increasing that care.8
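The kind of subgroup analysis Indiana performed can be sketched as a simple aggregation over linked records. The records and field names below are hypothetical stand-ins for illustration only; the state’s actual analysis integrated 17 data sets across five agencies.

```python
from collections import defaultdict

# Hypothetical linked birth records: (prenatal_visits, on_medicaid, infant_died).
# Real analyses would join records from vital statistics, Medicaid and other systems.
records = [
    (4, True, True), (12, False, False), (6, True, True),
    (14, True, False), (3, True, True), (11, False, False),
]

# Tally outcomes by whether the mother had fewer than 10 prenatal visits.
counts = defaultdict(lambda: [0, 0])  # group -> [deaths, births]
for visits, on_medicaid, died in records:
    group = "under_10_visits" if visits < 10 else "10_or_more_visits"
    counts[group][1] += 1
    if died:
        counts[group][0] += 1

for group, (deaths, births) in sorted(counts.items()):
    print(f"{group}: {deaths} deaths out of {births} births")
```

On real data, comparing rates across such subgroups is what lets analysts point policymakers toward the populations where interventions would matter most.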

Like others, Gary Blackmer, the director of the audit division in Oregon’s Secretary of State’s Office, talks about the difference in the way his office makes use of full data sets now as opposed to the sampling that occurred in the past. For example, it did some matching between lottery winners and recipients of public assistance since some of these winners now have enough cash to make it unnecessary for the public to subsidize their living costs. 

“We were matching millions of records to millions of records. We worked with the Department of Human Services to make sure that what we came up with is not a keying error,” said Blackmer.9

The auditors found about 9,000 files in which there could be problems and they pointed the department to those 9,000 files to look more closely. In the past, they would have drawn a sample of files and reached a conclusion about the number of potential problems in that sample. But they wouldn’t have been able to point specifically to the problem cases to look at.

“Now, we can hand them 9,000 records in which we think there are errors,” Blackmer said. “In the past, we would have said, ‘There are fish out there. Go catch them.’”10
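The full-dataset matching Blackmer describes can be illustrated with a minimal sketch. The record IDs below are hypothetical; real audits match on fields such as names and identification numbers, and verify candidate matches against keying errors.

```python
def match_records(assistance_ids, winner_ids):
    """Return the exact set of IDs appearing in both files.

    Unlike sampling, this flags every specific case for follow-up
    rather than estimating how many problem cases might exist.
    """
    return set(assistance_ids) & set(winner_ids)

# Hypothetical ID lists standing in for millions of real records.
assistance = ["A101", "A102", "A103", "A104"]
winners = ["A102", "A104", "B200"]

flagged = match_records(assistance, winners)
print(sorted(flagged))
```

The point of the sketch is the contrast with sampling: the output is a complete list of specific cases to hand to the agency, not a statistical estimate of how many fish are in the pond.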

One key to getting more and better utility out of data is to persuade state agencies to share the information they’ve gathered, considering it to be a statewide asset rather than an agency possession. Doug Robinson, executive director of the National Association of State Chief Information Officers, recommended that every state build a data management element into its architecture and that data be regarded as a major strategic asset. He said discussions need to occur at the enterprise level. This requires a major shift in thinking.

“This is an asset of the state government, not data owned by individual state agencies,” he said.11 

Getting buy-in for this concept isn’t always the easiest thing in the world. Some agencies—particularly those with a great deal of private information about citizens—feel they are restrained from sharing that information with anyone outside their particular agencies. Even when no legislative mandate or federal regulation requires this kind of privacy, it’s often part of the ethos of the agencies themselves.

Beyond that, there’s often a technological blockage to sharing data across agencies. Multiple platforms have developed over the course of years, and so there’s no magic button to push to combine data from the department of mental health, say, and the department of corrections. Yet, these are exactly the kinds of agencies that can benefit from dealing with one complete database. 

“The really big obstacle to using data is culture,” said Catherine Lyles, Louisiana’s senior auditor. “People don’t understand the value of the information they collect and therefore they collect it in an inconsistent way that limits its usefulness.”12

The Louisiana state audit performance division has been particularly active in working with agencies to help them understand how to use data more effectively. 

“Agencies collect a lot of data, but they don’t use it for management purposes,” said Karen Leblanc, director of performance audit services in Louisiana. “We try to teach them to use the data they collect.”13 

Of course, drawing data from the agencies is only useful if the material they’ve gathered is reliable, and that’s not always the case.

“The legislature needs to make decisions, but the decisions are only as good as the information they receive,” said Jan Yamane, acting auditor in Hawaii.14

Ohio Auditor Dave Yost says auditors must do a much better job of ensuring that the data driving decisions can be trusted. At the summer 2015 meeting of the National Association of State Auditors, Comptrollers and Treasurers, he plans to come armed to discuss the speedily evolving need to ensure data integrity.

“I think this is probably the most important emerging trend for government executives, across the board at all levels,” Yost said. If data is distorted, he said, “then making decisions based on that data is worse than making decisions based on no data at all.”15

Even as states make progress toward gathering and validating data, there’s a sense of a receding shore phenomenon here. The more information states have to make good policy decisions, the more they seem to want and the greater demands policymakers are imposing on their data-crunchers. As John Turcotte, the director of the North Carolina General Assembly’s division of program evaluation put it, “There is strong and sustained legislative interest in decision analytics in North Carolina and frustration with lagging capability within state government.”16

Medicaid payment reform, for example, is built on the principle that payments be based on quality and performance. But that requires that the quality and performance information be up to the task. James Nobles, legislative auditor in Minnesota, has worked in the legislative auditor’s office for 36 years and has seen many improvements in the quality and use of data.

“But our expectations get raised,” Nobles said. “So there’s always that gap. I think we have greater expectations that if we have big data systems and powerful computers, why can’t we answer these questions more easily. There’s a frustration level there.”17

One obstacle in many states to making the highest and best use of data is a lack of a governing structure over its use.

In Maine, data governance is getting attention from both the legislative and executive branches. The Office of Program Evaluation and Government Accountability started looking into ways to move the state forward in a report it issued almost 10 years ago that asked how Maine could make better use of data. It’s now doing a follow-up review. One of the issues that has materialized is the role that the Office of Information Technology plays. As in many states, the technology officials see their role as supporting the technology and the tools that agencies use, and ensuring the security of the data.

“But they clearly don’t think their role includes how consistent the data is or being able to use the data,” said Beth Ashcroft, director of Maine’s Office of Program Evaluation and Government Accountability. “Part of our effort has been to see whether there is some place for data leadership to emerge.”

Although chief information officers have everything to do with the technology that houses data, they are often quite removed from the management of the data. 

“Honestly, they don’t have a lot of authority in that space,” said Robinson of the National Association of State Chief Information Officers.18

The next step for many states is to develop a governance structure: a way of bringing together groups that represent the various agencies and assigning responsibilities for data stewardship. Some states have talked about having a master data index and centralized rules for data management, but this involves many policy discussions that states generally have not yet confronted successfully.

“A lot of times, nobody has really thought about it,” Ashcroft said. “Nobody has thought through proactively what kind of data we need. When the federal government requires certain data to be reported, it’s fine, but beyond that it doesn’t get a lot of time with folks thinking through the key things to be looking at and what kinds of different analyses we might do at the management level to help inform us. If this isn’t thought through, the data isn’t captured. If nobody has focused on data and the key pieces of data that need to be gathered in a consistent way, then it’s weak.”19

There are a handful of states that already are focusing more intently on these questions. Virginia and Utah, often leaders in matters of management, have been developing an enterprise approach to data. Virginia’s move toward improved data governance includes developing standards and ways to manage data like an asset; the state is developing data about its data.

Indiana presents a powerful example of the ways in which attention to data itself—as opposed to the technology that is used to store and access it—likely will become a greater focus of attention.  

Paul Baltzell, the chief information officer for Indiana, took on that position when Gov. Mike Pence took office in January 2013. In the first months of the administration, Baltzell saw his role as many CIOs do: he was in charge of the technology and the technology infrastructure.

“We didn’t really manage data,” Baltzell said.20 

But Pence had a strong belief in the importance of using data more effectively to manage government programs and policies. Shortly after he took office, the state made its first major plunge into data analytics with its study of infant mortality. Then in March 2014, Pence issued an executive order to officially create the Governor’s Management and Performance Hub. The order requires agencies to provide central executive branch access to data and systems, in effect making the data itself a property of the state enterprise, not just of any individual agency.21

With the importance of data as a strategic asset evolving, Baltzell’s office was moved from its traditional position under the chief of staff, to the Office of Management and Budget, where it has a closer link to the governor himself. 

“When you have a petabyte22 of data, that has a value,” Baltzell said. “We’ve taken on data management as a challenge because we believe we need to centralize and manage our data better. We believe it’s an asset that is helping to leverage the M in OMB.”

Following its analysis of infant mortality, Indiana officials have embarked on an effort to use data to analyze recidivism and are looking into ways that the data can help reduce child abuse and domestic violence. It’s also pursuing the more typical data analytics that target a reduction in fraud. 

Baltzell believes other states that want to leverage data likely will move in a similar direction, making the information aspect of information technology a much clearer focus. 

“It’s making us think about it,” he said. “Quite honestly, before, when I talked with other CIOs, it wasn’t a topic. This wasn’t on my radar as CIO until we did this.”

1 Forbes, “You Can’t Fight City Hall if You Can’t Understand It,” March 1980
2 Texas Sunset Advisory Commission Staff Report on Health and Human Services Commission and System Issues, October 2014
3 It’s worth being cautious with the term “big data,” though the definition used in this article is reasonably generic. In telephone conversations with high-level representatives of 10 entities (eight states, one city, one university), it emerged that there is no refined and universally accepted definition of “big data.”
4 Authors’ interview with Arinder, Jan. 19, 2015
5 “2011 State by State Child Support Collections,” National Conference of State Legislatures, updated January 2013
6 Office of Medicaid (MassHealth) – Review of the MassHealth Limited Program (Emergency Medical Services, Massachusetts), The Massachusetts Office of the State Auditor, Dec. 10, 2014
7 Hughes, Jessica, “Data Analytics Helps Indiana Change its Approach to Infant Mortality,” Government Technology, Feb. 3, 2015
8 Hughes, Jessica, “Data Analytics Helps Indiana Change its Approach to Infant Mortality,” Government Technology, Feb. 3, 2015
9 Authors’ interview with Gary Blackmer, Jan. 22, 2015
10 Ibid.
11 Authors’ interview, Jan. 20, 2015
12 Authors’ interview, Dec. 19, 2014
13 Authors’ interview, Dec. 19, 2014
14 Authors’ interview, Dec. 10, 2014
15 Authors’ interview, Feb. 4, 2015
16 Authors’ interview, Jan. 14, 2015
17 Authors’ interview, Jan. 6, 2015
18 Ibid.
19 Ibid.
20 Authors’ interview, March 5, 2015
21 Ibid.
22 A petabyte (PB) is 1,000 terabytes (TB) or 1,000,000 gigabytes (GB).

