We love data. Especially “big data” that fits neatly into spreadsheets and generates impressive graphs. We call some of those graphs infographics, or infauxgraphics, but even that witticism hints at an underlying discomfort: Data can be cool and interesting without being useful.
This note is the first in a proposed series about making data useful. My own occupation and curiosity lie in city government, so the notes will consider the questions attendant to using data to make better decisions there. For the last four years, I have tried to do so in my own organization, the government in a city of 41,000 people in South Carolina. I have not had any outstanding success, and I would certainly not call myself an expert. But I have learned. I plan to ask questions, encourage comments, welcome communication, and adapt. I hope that people from inside and outside city government will find the series useful, and will consider participation worth their effort.
In short, I am trying to start a conversation about the data-driven city.
I’m sure there are other theories, but mine is that the ordinary earth person became familiar with the data-driven organization through Michael Lewis’s Moneyball: The Art of Winning an Unfair Game. Or perhaps through Brad Pitt and Jonah Hill’s Moneyball. The core insight there is that the ways in which many organizations traditionally make decisions — reliance on intuition, expert assessment, heuristics, rules of thumb, anecdotes, comparisons — are inefficient. Objective, unvarnished, unemotional data is (or at least can be) a better tool for decision-making. Particularly, but by no means only, in baseball.
Lewis’s most recent book, The Undoing Project: A Friendship That Changed Our Minds, begins with the observation that using data to inform decision-making is not just for baseball anymore:
I’ve read articles about Moneyball for Education, Moneyball for Movie Studios, Moneyball for Medicare, Moneyball for Golf, Moneyball for Farming, Moneyball for Book Publishing(!), Moneyball for Presidential Campaigns, Moneyball for Government, Moneyball for Bankers, and so on. “All of a sudden we’re ‘Moneyballing’ offensive linemen?” an offensive line coach for the New York Jets complained in 2012. After seeing the diabolically clever data-based approach taken by the North Carolina legislature in writing laws to make it more difficult for African Americans to vote, the comedian John Oliver congratulated the legislators for having “Money-balled racism.”
Lewis, Michael. The Undoing Project: A Friendship That Changed Our Minds (p. 16). W. W. Norton & Company. Kindle Edition.
Thoughtful people are trying to Moneyball cities as well. Not too long ago, “New York City challenged top institutions around the world to design an applied science campus that would make the city a world capital of science and technology, as well as dramatically grow its economy.” The accepted proposal, from New York University and NYU-Poly, led to the creation of the Center for Urban Science and Progress (CUSP) at NYU. The core program at CUSP is urban informatics, which “uses data to better understand how cities work.” In Europe, the City of Amsterdam sponsored a similar contest to design an international technology institute. The winner (a group including Delft University of Technology, Wageningen University, and MIT) has since established a research center to develop urban solutions, particularly through data-based “smart city” collaborations involving academic and research institutions, private enterprise, cities, and citizens.
This is all just great, and it would be relatively easy to write a hyperventilatory article about using data to make cities better: “Five Ways to Make Your City Smarter.” But using data to improve any organization, much less a city, is hard.
In the spirit of the age, I will adhere to the “five things” model. So to start the conversation, I suggest five challenges to becoming a data-driven city. In doing so, I offer tools to frame the conversation that I hope will follow.
1. Staff Might Resist the Effort to Engage Data.
Several years ago, I took a course in productivity improvement and emerged flush with enthusiasm, a crusader on a quest to recapture the holy land of endlessly improving efficiency. And then an odd thing happened. My coworkers didn’t particularly share my enthusiasm. Indeed, some of the staff were overtly or at least covertly (but in any case actively) hostile to the effort.
I mention this because modern productivity improvement theory draws heavily on performance management, which uses data to analyze and improve performance. My initiative would thus involve collecting, analyzing, and applying data to improve city operations, a step towards becoming a more data-driven organization. To me, the effort made perfect sense. To many of my coworkers, it was an incipient disaster. I was nonplussed.
Not long after, I took another course, this time in organizational theory. If you have studied organizational theory in the last thirty or so years, you are probably familiar with Bolman and Deal’s Reframing Organizations: Artistry, Choice, and Leadership. As it happens, the book helps explain the different reactions to data-driven management.
Bolman and Deal propose that a person encounters reality through one of four primary frames of reference. The structural frame assumes that appropriate, rationally designed systems will maximize organizational effectiveness and minimize personal disruption. The human resource frame focuses on people rather than systems, and reflects a core belief that personal skills, morale, motivation, and satisfaction undergird a successful organization. The political frame views organizations as primarily an arena in which people relentlessly compete for scarce resources. And the symbolic frame envisions that symbols, culture, figurative understandings, and organizational self-identity determine the success of an organization.
The Bolman and Deal frames are a highly useful way to understand organizational behavior — that’s why the book has become a modern classic. My frame is deeply structural; I suspect that many who succumb to the call of data-driven management are similarly situated. My resistant coworkers generally shared a political frame. To them, data is a way to discredit or at least question their leadership and departmental performance and thus to diminish their influence within the organization. Yes, I would think, that’s kind of the point. But Bolman and Deal relentlessly remind the reader that no frame of reference is right; the analysis is not normative but positive. Whether you like it or not, your organization is full of people with political frames, and in many or even most cases those people are highly valuable employees. They just don’t share your passion for numbers and infographics.
To probe further: the interactions among frames of reference at different levels of leadership matter as well. If the city manager herself has a structural frame, then resistance to data-driven analysis from department heads with political frames may merely create some discomfort without stalling the effort. In my experience, however, the majority of city managers have a human resources frame. After all, the soft skills of people management that emerge from a human resources frame tend to be useful in getting promoted to the city manager level. A city manager with a human resources frame, confronted with resistant employees using a political frame, may conclude that the potential harm to the organization’s morale and harmony outweighs the potential gain of tracking the number of missed trash pickups. This is not an irrational conclusion; it just relies on a different frame of reference. And it might slow or even stop the effort to become a data-driven city.
My conclusion is that countless cities underuse data-driven analysis not because they are unaware of the tools or lack the desire to get better. Instead, they tried, encountered resistance, and decided that the effort was not worth the organizational pain. Any successful effort to transition to a data-driven model, then, must account for intra-organizational compliance and adoption.
2. Elected Officials Might Not Get It.
Consider the following conversation:
Elected Official: “My neighbor called the building department all day last Tuesday and never got an answer.”
Staff Member: “Our new phone system tracks incoming calls, percent answered, number of rings, and average call duration. The rate at which incoming calls are answered is actually over 95% within the first three rings.”
Elected Official: “Yeah, but my neighbor said.”
Variants of this conversation happen every day in government. These conversations are maddening both to staff (who trust the statistics) and to elected officials (who answer to individual constituents regardless of the statistics). They also challenge the effort to become more data-driven. If elected officials don’t particularly care about the statistics, from where will the resources and leadership in data usage come?
John Nalbandian, an emeritus professor of public administration and a former mayor, explains what is going on in these conversations. He notes that elected officials view their activity as a type of game or contest, and relate to the world through stories:
Politicians and citizens communicate with each other through stories and anecdotes because stories convey symbols better than statistics and reports. These representatives play the game of politics by trading, exchanging, manipulating, and dealing with interests and symbols.
On the other hand, administrators (here called “experts”) view their activity as problem solving, and relate to the world through facts:
Experts deal with information, money, people, buildings, machinery and other tangibles, and when they have to deal with interests and political symbols, they find themselves often moving uncomfortably outside the realm of administrative problem solving and into the realm of politics — an arena they justifiably do not like. Experts solve problems by exchanging and applying knowledge. The symbolic world of the expert is much more narrowly drawn in comparison to the politician’s because democratic values are broader than traditional professional values and because politicians must communicate with a broader audience than do technically trained professionals.
Shorter version: Politicians and administrators speak different languages, and the language of the politician doesn’t have much to say about data-driven governance.
I have found this insight especially useful, and it has allowed me to preserve my patience in many a conversation with politicians. Nalbandian, like Bolman and Deal, reminds me that my preference for data isn’t necessarily right, and that the politicians’ preference for narrative isn’t necessarily wrong. In any event, the main point for now is that a successful effort to transition to a data-driven model must also provide stories for the politicians and their constituents to understand and tell.
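For readers who want to see how a statistic and a story can both be true, here is a minimal sketch of the kind of figure the staff member cites in the conversation above. Everything in it is an assumption for illustration: the `Call` record, the function names, and the call log itself are invented, and a real phone system would export its own fields. The sketch computes the share of calls answered within three rings overall, then breaks the same figure out by day, which is where a neighbor's bad Tuesday would show up.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Call:
    """One incoming call (hypothetical record layout)."""
    day: date
    rings_before_answer: Optional[int]  # None means the call was never answered


def answered_within(calls, max_rings=3):
    """Return the share (0.0 to 1.0) of calls answered within max_rings rings."""
    if not calls:
        return 0.0
    hits = sum(
        1
        for c in calls
        if c.rings_before_answer is not None and c.rings_before_answer <= max_rings
    )
    return hits / len(calls)


def rate_by_day(calls, max_rings=3):
    """Return the same share, computed separately for each day in the log."""
    by_day = {}
    for c in calls:
        by_day.setdefault(c.day, []).append(c)
    return {d: answered_within(cs, max_rings) for d, cs in sorted(by_day.items())}


# Made-up log: three days of 20 calls each, with two unanswered calls on Tuesday.
monday, tuesday, wednesday = date(2024, 1, 1), date(2024, 1, 2), date(2024, 1, 3)
call_log = (
    [Call(monday, 2) for _ in range(20)]
    + [Call(tuesday, 2) for _ in range(18)]
    + [Call(tuesday, None) for _ in range(2)]
    + [Call(wednesday, 3) for _ in range(20)]
)

overall = answered_within(call_log)
daily = rate_by_day(call_log)

print(f"Overall: {overall:.1%}")
for d, rate in daily.items():
    print(f"{d.isoformat()}: {rate:.1%}")
```

In this invented log the overall rate is above 95 percent, yet Tuesday is the one day on which calls went unanswered. The daily breakdown does not settle who is right, but it gives staff something to say to the elected official besides the aggregate.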
3. You Might Not Know What You Want to Do With the Data.
Perhaps you have heard that you should not go looking for a solution without a problem. In information technology, the adage means that you should select IT solutions that address your organization’s unique processes and problems, not the other way around. In using data, it means that you should start with a clear idea of the questions you are asking, rather than with the data that happens to be readily available.
There are three primary ways in which a city may use data. First, a city may use data for performance management, which involves the inward-facing mission of improving the organization’s efficiency and effectiveness. Second, a city may use data for policy and program development, which involves the outward-facing mission of improving outcomes in the community that the city serves. Third, the city may use data for community engagement, which involves increasing knowledge and participation by the city’s citizenry.
These categories certainly may overlap, and the same data (and even the same application of the data) may further two or even three of the stated purposes. For example, if a city’s police chief believes that neighborhood crime watches are a helpful tool in preventing and detecting crime, she may seek out data that correlates rates of crime with the active presence of a neighborhood watch association. That data may implicate performance management (i.e., by increasing efficiency through more “eyes on the street”), policy and program development (i.e., by supporting policy initiatives to encourage and support neighborhood watches), and community engagement (i.e., by increasing community participation in and knowledge of police activities).
Even in such cases, however, the application of data will be more fruitful if the city starts with a clear knowledge of its goals. At least in my experience, it is easier (and therefore more common) to start with whatever data is readily at hand and capable of being displayed in fetching graphs.
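To make the police chief example concrete, here is a minimal sketch of what starting from the question might look like. The question is narrow: do neighborhoods with an active watch report less crime than those without? The neighborhood labels, the rates, and the `mean_rate` helper are all hypothetical, chosen only so the arithmetic is easy to follow.

```python
from statistics import mean

# Hypothetical records:
# (neighborhood, has an active watch?, reported crimes per 1,000 residents)
neighborhoods = [
    ("A", True, 12.0),
    ("B", True, 15.0),
    ("C", False, 21.0),
    ("D", False, 18.0),
    ("E", True, 9.0),
    ("F", False, 24.0),
]


def mean_rate(records, has_watch):
    """Average crime rate across neighborhoods with the given watch status."""
    rates = [rate for _, watch, rate in records if watch == has_watch]
    return mean(rates)


with_watch = mean_rate(neighborhoods, True)
without_watch = mean_rate(neighborhoods, False)
gap = without_watch - with_watch

print(f"With a watch:    {with_watch:.1f} per 1,000")
print(f"Without a watch: {without_watch:.1f} per 1,000")
print(f"Gap:             {gap:.1f} per 1,000")
```

A gap like this is a correlation and nothing more. Neighborhoods that organize a watch may differ from the others in ways that also affect crime, so the comparison frames the question for the chief rather than answering it.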
4. You Might Not Know Where to Get the Data.
The problem with finding data for city use is that there is too much, and not enough, all at the same time. In my city, we would like to inventory the availability and condition of rental housing for low-income tenants. The exact data we need apparently doesn’t exist, or at least not in places that I know where to look. But we could easily drown ourselves in very closely related data, from thousands of publicly available datasets.
Even when there is data, it may not be structured or organized:
“Companies that come to Amsterdam, for instance, expect ‘data that’s structured, that’s suitable, and everybody actually thinks we have that. We don’t even know how many bridges we have,’ says Baron, a large man with unruly hair, wearing on this day an ‘I Amsterdam’ polo shirt. In the case of the bridges, that’s because the city’s individual districts used to maintain the bridges within them, and their data has not been centralized. ‘We have a lot of stuff. It’s not structured at all,’ he says.”
I don’t intend to write much on this issue now. I expect that future notes in this series will probe much deeper into sources of data. I do note that logical starting places for publicly available data include the Census Bureau and https://www.data.gov.
I also note that cities are increasingly turning to private sources of data. For example, the City of Amsterdam, as part of a health initiative, wanted to encourage children to eat more vegetables. Tracking the success of the program was obviously difficult. The city therefore persuaded local grocery stores to share their sales data. “Instead of trying to evaluate the program’s impact after a couple of years, the city can now make speedier assessments of, and adjustments to, its outreach efforts.” At least in larger cities, this public-private approach to data collection seems certain to increase.
5. The Data Might Not Always Be Useful.
I recently reviewed a low-income housing survey prepared, at great expense, by a private consultant for a larger city. Over one hundred pages of the survey presented detailed statistical information about the city, by neighborhood. It is a very impressive survey. But the more I think about it — given that I may be wrong — the more convinced I am that most of the data in the survey is not useful for guiding city policy intervention in the housing sector.
The survey covers everything from demographics to per capita income to educational attainment in the designated neighborhoods. It also surveys supply and demand for housing, again by neighborhood. So the data is interesting, and generally quite useful, but it doesn’t obviously support any particular policy action by the city government. Michael Flowers, the Chief Analytics Officer for New York City, notes:
A focus on outcomes is often lost in the discussion of big data because it is so frequently an afterthought. We have a huge fire hose of information, but even a fire hose is only valuable when it’s pointed at a fire.
For an example of hard-to-get data that would be highly useful for policy intervention in the housing sector, consider a recent article that tracks mobility within the household unit. Available data tracks mobility at the individual level, but a thoughtful housing intervention would consider the needs of a persistent household unit. The problem is that household-level mobility is very hard to track.
Unfortunately, such is often the case. Data that is useful for a specific question is often the most difficult to collect.
Closing Thoughts.
I expect this has been a frustrating read, very unlike a “Five Ways to Use Data” note. Questions and problems are less exciting than answers. But as I noted from the outset, I hope that the conversation will lead to some answers in future notes. Keep coming back.