The Diagnostic Value of Funny-Looking Taxonomies
Posted: 2013/10/08
Taxonomies are everywhere in information management, yet they are hardly ever formally acknowledged and managed.
So begins a very entertaining article published last year by Malcolm Chisholm. It’s entitled “The Celestial Emporium of Benevolent Knowledge” after a famous essay in which Jorge Luis Borges presented a (fictional) ancient Chinese taxonomy of animals. The deliberately absurd chaos of Borges’ taxonomy serves as a jumping-off point for Chisholm to outline the (non-fictional) ancient Western art of taxonomy: the logical and constant division of a genus into the mutually exclusive and jointly exhaustive species that compose it. The very existence of this practice is now almost newsworthy because, as Chisholm notes, traditional logic is hardly taught in the West anymore. (Once upon a time that statement itself would have seemed as unlikely as one of Borges’ fantasies; but no, it is true.)
How do taxonomies show up in software? The most obvious way is in lookup tables. Those tables determine the categories from which the user is allowed to choose.
I’ve seen strangely constructed taxonomies create costly problems in a lot of information systems, and it continues to puzzle me that there’s relatively little discussion of taxonomy among software designers; so I was glad to see Chisholm’s article come out in a popular newsletter.
Why so little discussion? Perhaps it’s because people don’t want to simply bemoan a problem. Fair enough. But in this case, the problem actually points to its own solution.
In fact, the taxonomies already embedded in lookup tables can have enormous diagnostic value for information system designers. They can be a very fast track to understanding past problems and eliciting current requirements.
Several reasons why:
1 – They’re often an important (albeit perhaps ill-organized) embodiment of the way the organization thinks about its work.
2 – Their virtues and vices are easy to discuss with the stakeholders who are used to them.
3 – They’re a gateway to understanding and critiquing larger architectural decisions, since a lookup table is the domain of a particular attribute within a particular entity within a conceptual data model.
4 – Their very nature carries with it the expectation of a rigorous, internally consistent structure. That doesn’t mean that the ancient tradition of classical logic need be considered sacrosanct. But when we see a list that is clearly not being guided by those rules, it should at least lead us to ask: Why not, and what price is being paid for it?
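The classical rules can also be checked mechanically against sample data. The following is a minimal Python sketch, not anything from Chisholm's article: the function name `check_partition` and the age-band categories are hypothetical, chosen only to show what "mutually exclusive and jointly exhaustive" means in practice. Note that it can only show gaps and overlaps for the items it is given; it cannot prove a taxonomy is exhaustive in general.

```python
def check_partition(items, categories):
    """Test sample items against a taxonomy.

    `categories` maps a category name to a predicate (a function that
    returns True when an item belongs to that category).
    Returns (unclassified, overlapping):
      - unclassified: items that match no category (not jointly exhaustive)
      - overlapping: (item, matching names) pairs for items that match
        more than one category (not mutually exclusive)
    """
    unclassified, overlapping = [], []
    for item in items:
        matches = [name for name, belongs in categories.items() if belongs(item)]
        if not matches:
            unclassified.append(item)
        elif len(matches) > 1:
            overlapping.append((item, matches))
    return unclassified, overlapping


# Hypothetical age bands; "adult" and "senior" overlap for ages 60 to 64.
age_bands = {
    "child": lambda age: age < 18,
    "adult": lambda age: 18 <= age < 65,
    "senior": lambda age: age >= 60,
}

unclassified, overlapping = check_partition([5, 30, 62, 70], age_bands)
```

Here the age 62 lands in two bands at once, which is exactly the kind of funny-looking result that should prompt the question of how the list got that way.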
This is an almost archaeological use of taxonomies—but the purpose isn’t merely to dig up the past, it’s to better understand the present and to better design the future.
The trick is to keep an eye out for things that look funny, then ask how they got that way, then ask how they’re working out right now and what might need to be different. (After all, taxonomies from yesterday determine how data analysts can slice and dice today.)
Here’s an example, a lookup table in a human service information system. The software tracks referrals, meaning the attempts by one human service program to help people receive services from another human service program. Referrals are a basic part of what such organizations do; yet data about referrals is often of rather poor quality.
This lookup table lists the permissible statuses of a referral:
- Client Received Service
- Client Refused Service
- Client On Waiting List
- Service Not Available
- Referral Inappropriate
- Appointment Pending
- Client No Show For Appointment
- Pending—Client In Hospital
- Pending—Client Too Ill
- Pending—Letter/Info Sent
- Pending—Needs Home Visit
- Pending—Scheduling Conflict
- Pending—Unable To Contact
- Pending—Requires Reassessment
- Pending—Needs Spanish-Speaking Staff
- Lost to Follow-up
Having suggested that this may be a funny-looking taxonomy, I now risk being accused of suffering from some strange Aristotelian snobbery. After all, what’s wrong with it?
Well, in real life the process of a human service referral generally involves several stages and decision points, each of which involves various parties and possibilities. There is an initial outreach by one service provider to another, (often) the making of an appointment for the client, (often) an assessment by the second service provider, the offer (or not) of services, the acceptance (or not) by the client, and then the actual provision (or not) of the services.
Meanwhile, the original service provider follows up with the second one to find out what happened. This list looks like a grab bag of responses that the second service provider might give at any point along the way. But it lacks any explicit representation of the expected stages. So perhaps, strictly speaking, it’s really several taxonomies—belonging to several stages of a referral—that have all been joined together willy-nilly in a table. The problem is that without explicitly representing each stage, there’s no way to analyze whether (and how aptly) the values cover all possible situations. Furthermore, without that context, there are few cues about the exact meaning of some of the statuses. (What exactly is pending? Is it the service itself or some prerequisite stage?) It also points to the question: is it really enough for the organization to record only the current status without capturing any information about how the process unfolds?
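To make the alternative concrete, here is a minimal Python sketch of a model that represents each stage explicitly and keeps a history rather than a single current status. It is one possible shape under stated assumptions, not the design of any real system: the stage names paraphrase the process described above, and the outcome values, class names, and `OUTCOMES` table are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum
from typing import List, Optional


class Stage(Enum):
    OUTREACH = "outreach"
    APPOINTMENT = "appointment"
    ASSESSMENT = "assessment"
    OFFER = "offer"
    ACCEPTANCE = "acceptance"
    PROVISION = "provision"


# Hypothetical outcome lists: each stage gets its own small taxonomy,
# so coverage can be reviewed one stage at a time.
OUTCOMES = {
    Stage.OUTREACH: {"contact made", "unable to contact"},
    Stage.APPOINTMENT: {"scheduled", "kept", "no show"},
    Stage.ASSESSMENT: {"completed", "referral inappropriate"},
    Stage.OFFER: {"service offered", "waiting list", "service not available"},
    Stage.ACCEPTANCE: {"accepted", "refused"},
    Stage.PROVISION: {"service received", "service not received"},
}


@dataclass(frozen=True)
class ReferralEvent:
    stage: Stage
    outcome: str
    on: date

    def __post_init__(self) -> None:
        # Reject an outcome that does not belong to this stage's taxonomy.
        if self.outcome not in OUTCOMES[self.stage]:
            raise ValueError(
                f"{self.outcome!r} is not a valid outcome for {self.stage.value}"
            )


@dataclass
class Referral:
    history: List[ReferralEvent] = field(default_factory=list)

    def record(self, stage: Stage, outcome: str, on: date) -> None:
        """Append an event, so the way the process unfolded is preserved."""
        self.history.append(ReferralEvent(stage, outcome, on))

    def current(self) -> Optional[ReferralEvent]:
        """Return the most recent event, or None if nothing is recorded."""
        return self.history[-1] if self.history else None


referral = Referral()
referral.record(Stage.OUTREACH, "contact made", date(2013, 10, 1))
referral.record(Stage.APPOINTMENT, "no show", date(2013, 10, 8))
```

With this shape, "no show" can only mean a missed appointment, and the earlier outreach is still on record after the status moves on.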
So this taxonomy has certain limitations. They’re not necessarily horrible or laughable, but they do lead to useful questions about the original design decisions and about how well the resulting artifact is serving the organization’s needs. It looks as though the original decision was simply not to tease out what goes on in the referral process very precisely, but instead to stuff everything into an ill-defined status field. (If so, it’s probably an example of merely meeting stated requirements instead of creating a good interrogable model.) Then the already loose taxonomy may have been further changed as administrators added new values when users requested them. The next question should be: How well has this approach worked out for the data analysts downstream? And if the answer is “not very well,” then how might this area be better modeled in the future?
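One quick way to start that conversation is to take the existing status list and note, for each value, which stage of the referral it could plausibly belong to. The Python sketch below does this for the sixteen statuses above. The stage assignments are my own illustrative guesses, not facts about the system; in practice they would come from talking with the people who use the list. The names `CANDIDATE_STAGES` and `audit` are hypothetical.

```python
# Stage names paraphrase the referral process described in this post.
STAGES = ("outreach", "appointment", "assessment", "offer", "acceptance", "provision")

# ASSUMPTION: these candidate stages are illustrative guesses only.
# A status with more than one candidate stage is ambiguous as written.
CANDIDATE_STAGES = {
    "Client Received Service": {"provision"},
    "Client Refused Service": {"acceptance"},
    "Client On Waiting List": {"offer"},
    "Service Not Available": {"offer"},
    "Referral Inappropriate": {"assessment"},
    "Appointment Pending": {"appointment"},
    "Client No Show For Appointment": {"appointment"},
    "Pending—Client In Hospital": {"appointment", "assessment", "provision"},
    "Pending—Client Too Ill": {"appointment", "assessment", "provision"},
    "Pending—Letter/Info Sent": {"outreach", "appointment"},
    "Pending—Needs Home Visit": {"assessment", "provision"},
    "Pending—Scheduling Conflict": {"appointment", "provision"},
    "Pending—Unable To Contact": {"outreach", "appointment"},
    "Pending—Requires Reassessment": {"assessment"},
    "Pending—Needs Spanish-Speaking Staff": {"appointment", "assessment", "provision"},
    "Lost to Follow-up": set(STAGES),
}


def audit(candidates):
    """Split statuses into those tied to exactly one stage and the rest."""
    clear = sorted(s for s, stages in candidates.items() if len(stages) == 1)
    unclear = sorted(s for s, stages in candidates.items() if len(stages) != 1)
    return clear, unclear


clear, unclear = audit(CANDIDATE_STAGES)
```

Under these guesses, the statuses in `unclear` are the ones an analyst cannot roll up by stage without asking someone what was actually pending.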
That’s the value of paying attention to funny-looking taxonomies.
And beyond the individual organization, they might even be helpful for understanding how well (or poorly) the work of an entire sector is currently being modeled. Right now the human service sector is fitfully advancing toward various ways of standardizing its data: common performance measures, common data exchange standards, and others. Taxonomies are a necessary part of that mix.
Do you have any favorite funny-looking human service taxonomies? If so, please share them in the comments!
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License