Free as in Puppies: Taking on the Community Resource Directory Problem
Posted: 2013/10/22
Last week Code for America (CfA) released Beyond Transparency: Open Data and the Future of Civic Innovation, an anthology of essays about open civic data. The book aims to examine what is needed to build an ecosystem in which open data can become the raw materials to drive more effective decision-making and efficient service delivery, spur economic activity, and empower citizens to take an active role in improving their own communities.
An ecosystem of open data? How might this brave new thing intersect with human service organizations? That’s mostly beyond the scope of CfA’s current book. One chapter, though—“Toward a Community Data Commons” by Greg Bloom—takes a very serious stab at resolving a perennial headache of information and referral efforts.
Bloom is trying to solve what he calls the community resource directory problem. Various players—government agencies, libraries, human service organizations—develop lists of community resources, i.e. programs to which people can be referred. Originally the lists were paper binders. Later they became databases. Almost always, each directory belongs to the agency that created and maintains it. That’s a problem: what the community needs isn’t a bunch of directories, it’s a single unified directory that would allow one-stop shopping. And the overlap among directories is inefficient too: each one has to separately update its information about the same programs.
One solution might be to somehow make directory data free and open to all. Bloom describes one experiment in that direction; it’s instructive because of the way it failed. Open211, which a CfA team built in 2011 in the Bay Area, scraped together directory data from many sources and made it available via the web or mobile phone—and it also allowed users at large to submit data. As Bloom tells it: “This last part was key: Open 211 not only enabled users to create and improve their community’s resource directory data themselves, it was counting on them to do so. But the crowd didn’t come. This, it turns out, is precisely what administrators of 211 programs had predicted. In fact, the most effective 211s hire a team of researchers who spend their time calling agencies to solicit and verify their information.”
Exactly right. Reading this, I vividly remembered a project I did years ago: updating the Queens Library’s entire directory of immigrant-serving agencies. It took well over a hundred hours of tedious phone work. (There was time on the pavement too… for example, an afternoon wandering around Astoria looking for a Franciscan priest—I did not have his address or last name—who was rumored to be helpful to Portuguese-speaking immigrants. I never found him.) And then each piece of data had to be fashioned and arranged to fit into a consistent format.
That’s what goes into maintaining a high-quality community resource directory. It will not just happen. It cannot be crowdsourced. And this harsh fact—the labor cost of carefully curated information products—can be hard to reconcile with the aspirations of the open civic data movement.
The lesson: it’s certainly possible to collect community resource information and then set it free… but it will be, as Bloom says, free as in puppies. (This wry expression comes from the open source software movement, where people make the distinction between free as in beer—meaning something that can be used gratis—and free as in speech—meaning the liberty to creatively modify. Someone noticed that freely modifiable software might also require significant extra labor to maintain—like puppies offered for free that need to be fed, trained, and cleaned up after.)
But then Bloom takes the problem toward an interesting possible solution. He invokes the idea of a commons—not the libertarian commons of the proverbial tragedy but rather an associational commons that would include the shared resource (in this case, the data) and a formal set of social relationships around it. He suggests that a community data co-op might be an effective organizational framework for producing the common data pool and facilitating its use.
It’s an intriguing idea. It acknowledges the necessary complexity and cost of maintaining a directory. It might be able to leverage communitarian impulses among nonprofits. And if successful, it could be a far more efficient way of working than the current situation of multiple independent and overlapping directories. Of course, it would face all the usual practical difficulties that cooperatives do; but there’s no reason those should be insurmountable.
This framework could solve a lot of current problems in information and referral. But how might it eventually fit into some larger imagined ecosystem of open data?
Bloom offers a vision for how such a unified directory could be widely used by social workers, librarians, clients, community planners, and emergency managers. That seems entirely feasible, because all those players would need the same kind of information: which programs offer what services when and where.
But Bloom also takes the vision a couple of steps further: he imagines a journalist encountering this same data while researching city contracts, and the directory being combined with other shared knowledge bases—about the money flowing into and out of these services, about the personal and social outcomes the services produce, and so on—to be understood and applied by different people in different ways throughout this ecosystem, as a collective wisdom about the “State of Our Community.”
This, I think, is where Bloom’s vision will face strong headwinds. I don’t mean political or inter-organizational resistance (though those might crop up too). The problem is that across the broad domain of human service data, very few sub-domains have achieved much clarity or uniformity. Community resource records happen to be one of the more advanced areas. For a couple of decades there’s been a library system standard (MARC21) for storing these records, and now AIRS offers an XSD data standard. So the way is fairly clear toward creating large standardized databases of community resources. Those could then be meshed with, say, the databases of 990 forms that U.S. nonprofits submit to the Internal Revenue Service. The problem is that outside of these few (relatively) clean small sub-domains, human service data gets very murky and chaotic indeed.
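To make the clean case concrete, here is a minimal sketch, in Python, of how a standardized resource record might be meshed with 990 data. The field names and the EIN-based join key are my own illustrative assumptions, not the actual MARC21 or AIRS XSD schema.

```python
from dataclasses import dataclass

# Hypothetical, simplified record -- illustrative only, not the actual
# MARC21 or AIRS XSD schema.
@dataclass
class ResourceRecord:
    agency_name: str
    ein: str          # IRS Employer Identification Number (assumed join key)
    services: list    # e.g. ["ESL classes", "job counseling"]
    hours: str
    address: str

# A hypothetical slice of an IRS 990 dataset, keyed by EIN.
FORM_990 = {
    "11-1234567": {"total_revenue": 1_250_000, "total_expenses": 1_100_000},
}

def mesh(record: ResourceRecord) -> dict:
    """Join one directory record to its filer's 990 financials, if any."""
    financials = FORM_990.get(record.ein, {})
    return {"agency": record.agency_name, "services": record.services, **financials}

# Example: a record whose EIN matches a 990 filing.
rec = ResourceRecord("Neighborhood Literacy Center", "11-1234567",
                     ["ESL classes"], "Mon-Fri 9-5", "123 Main St, Queens NY")
print(mesh(rec))
```

Even this toy join depends on both datasets sharing a stable identifier—exactly the kind of agreement the murkier sub-domains lack.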
A project to mesh community resource records with data on contracts, funding sources, and public expenditures, for example, would immediately run into the problem that the linkage would need to be made through the concept of a program. Yet that core term is used in very different ways: sometimes it means a discrete set of resources, practices, and goals; sometimes it implies a separate organizational structure; and sometimes it seems to be a mere synonym for a funding stream. The term would need to be tightened, or replaced by some clearer concept. But even then, people trying to mesh community resource data with fiscal administrative data would find that the latter are equally unruly. A city’s contract with a nonprofit, for example, may fund one program or many; and the original source of the contract’s funding may be one or many government funding streams at the federal, state, or city level. There is no uniform pattern for how these arrangements are made, nor are there well-developed data standards.
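A companion sketch shows why the fiscal side resists linkage. Under my own simplifying assumptions about the entities involved (these classes are guesses, not any established human service data standard), a single contract fans out to many programs and many funding streams at once, so there is no clean one-to-one path from a directory entry to “its” money.

```python
from dataclasses import dataclass
from typing import List

# Illustrative sketch only: these classes are my own assumptions about
# the entities involved, not an established human service data standard.

@dataclass
class FundingStream:
    name: str    # e.g. "federal block grant", "state ESL initiative"
    level: str   # "federal" | "state" | "city"

@dataclass
class Program:
    name: str            # the slippery concept: an activity? an org unit? a funding label?
    services: List[str]

@dataclass
class Contract:
    contract_id: str
    programs: List[Program]        # one contract may fund one program or many...
    sources: List[FundingStream]   # ...and may itself blend many funding streams

# A single contract fans out in both directions:
esl = Program("Adult ESL", ["ESL classes"])
jobs = Program("Workforce Readiness", ["job counseling"])
contract = Contract(
    contract_id="CITY-2013-0042",
    programs=[esl, jobs],
    sources=[FundingStream("federal block grant", "federal"),
             FundingStream("city discretionary funds", "city")],
)

# Any join from a resource record to "its" funding must pass through
# Program -- and Program means different things in different datasets.
```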
Meshing community resource records with programmatic statistics such as outputs and outcomes would be equally fraught. While there’s a movement toward standardizing some performance measures, concrete results on the ground have been slow in coming. Even if complications from the politics around performance measurement were miraculously eliminated, there would still be the issue of murky and chaotic data that don’t easily support performance measures.
So what’s the solution?
In a nutshell: for the human service sector’s work to become a significant part of the ecosystem of open civic data downstream, the sector will have to embark on a new kind of conversation about the way data is organized upstream. This will necessarily be a longer-term conversation. It will have to involve a more diverse set of stakeholders than are usually assembled at the same table. It will have to ask (and answer) unfamiliar questions, such as: how can information system designers create good, interrogable models of public service work, rather than merely meeting stated user requirements? It will have to take a hard look at sub-domains that have often not been modeled very well. (Funny-looking taxonomies are an important red flag for identifying those.)
Eventually, that kind of conversation can lead the sector toward far more coherent and holistic ways of organizing its data. The downstream benefits: more successful information system projects, more efficient production of performance measures, and more meaningful data for open civic uses.
If you found this post useful, please pass it on! (And subscribe to the blog, if you haven’t already.)
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License