Jon Husband over at Wirearchy has a piece on Wirearchy
(his term for what I think of as Expert Communities; I like Wirearchies better, but for now they have to be explained, so I stick with EC).
He also mentions Tag Clouds, which I encountered a couple of days ago, and gets into a discussion with a visitor about what this tagging stuff is.
Some thoughts.
- Jon has officially launched Qumana and, if you haven't checked it out, do it now.
- Also have a look at Tag Cloud. It's an interesting idea, although I've asked them for some tools to edit the cloud, dispose of words that aren't relevant to me, map duplicate terms such as Matt Simmons and Matthew Simmons to each other and generally clean up a potentially useful tool.
(BTW, I mean only that my view of the tag cloud should be controllable, not that my personalisation would affect anyone else's, although a tool to import someone else's structure might be useful.)
- Go and read Clay Shirky's piece Ontology is Overrated: Categories, Links, and Tags.
I wish I was as smart as that guy; his ability to dig into a topic and tease out some startling ideas is just brilliant.
His point is that tags are not just a way of organising information, but that they change the rules about organising information. Tags enable information to live in any number of locations simultaneously: a book on Balkan Art can be on the Balkans shelf, the Art shelf, the Beautiful Binding shelf and the Appalling Translation shelf all at once, where those interested in each of those factors may have nothing but that single book in common.
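To make the "many shelves at once" idea concrete, here is a minimal sketch of a tag index. The titles, tag names and the `build_tag_index` helper are all made up for illustration; nothing here comes from Shirky's piece or from any real tagging service.

```python
from collections import defaultdict

def build_tag_index(tagged_items):
    """Map each tag to the set of items carrying it (one 'shelf' per tag)."""
    index = defaultdict(set)
    for item, tags in tagged_items.items():
        for tag in tags:
            index[tag].add(item)
    return index

# Hypothetical catalogue: titles and tags are illustrative only.
library = {
    "A Book on Balkan Art": {"balkans", "art", "beautiful-binding", "appalling-translation"},
    "A History of the Balkans": {"balkans", "history"},
    "Renaissance Painting": {"art", "history"},
}

shelves = build_tag_index(library)

# The same book sits on four shelves without being copied or moved.
for tag in sorted(shelves):
    print(tag, "->", sorted(shelves[tag]))
```

Nobody had to decide in advance which shelf is the "real" one; each reader walks in through whichever tag they care about.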
Like me, he is heavily down on Taxonomies and Ontologies because they make impossible demands and lead to impossible, and highly inefficient, processes as soon as they are released into the wild. Unlike tags, Ontologies and Taxonomies don't scale.
There is a lovely set of diagrams that leads from the periodic table of the elements to the wheel on which top down information structures are broken, "The Parable of the Ontologist, or, 'There Is No Shelf' ". One of the key paragraphs for getting the information structure of the net is this:
The Filtering is Done Post Hoc - There's an analogy here with every journalist who has ever looked at the Web and said "Well, it needs an editor." The Web has an editor, it's everybody. In a world where publishing is expensive, the act of publishing is also a statement of quality -- the filter comes before the publication. In a world where publishing is cheap, putting something out there says nothing about its quality. It's what happens after it gets published that matters. If people don't point to it, other people won't read it. But the idea that the filtering is after the publishing is incredibly foreign to journalists.
Then he starts to look at the second order benefits of looking at how people tag their information. Not only can you learn something by sharing information structures among users, but you can learn something about the information by looking at the distribution of tagging strategies applied by many users to that single source. It turns out that different kinds of information have different "tag signatures".
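One rough way to picture a "tag signature" is as the distribution of tags that many users apply to a single source. The sketch below is my own guess at a simple version, not Shirky's method; the `tag_signature` function and the sample taggings are hypothetical.

```python
from collections import Counter

def tag_signature(taggings):
    """Return each tag's share of all tags applied to one source by many users."""
    counts = Counter(tag for user_tags in taggings for tag in user_tags)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tag: n / total for tag, n in counts.most_common()}

# Hypothetical data: the tags four different users gave the same article.
taggings = [
    {"tagging", "shirky"},
    {"tagging", "ontology"},
    {"tagging"},
    {"folksonomy", "tagging"},
]

signature = tag_signature(taggings)
for tag, share in signature.items():
    print(f"{tag}: {share:.2f}")
```

A source where one tag dominates has a very different shape from one where the tags are spread thinly, and it is that shape, compared across sources, that might tell you what kind of information you are looking at.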
This fascinates me because a couple of years ago Jonathan Schull was looking at what he called the Macroscope Manifesto. It involved looking at the flows of information across the net and trying to figure out what they meant.
My problem with it then, and still, is that it would no doubt produce many interesting pictures and charts, but that we would have no way of assigning meaning to them. More probably we would assign speculative meanings that may or may not be true, then we would fight about them and get nowhere. After all, we have been using macroscopes of a type for generations, to look at the cosmos.
With every generation of star gazers we find that the meaning of what we are looking at needs, sometimes massive, revision. This seems to be happening ever faster as the rate at which the tools improve accelerates.
Now, instead of watching the flows of information and trying to assign them a meaning, what happens if we look at the flow of meaning through an information space? Tags already guarantee us that the data has a meaning, or group of meanings. We can chart that meaning and watch it spread through a system; we could watch it grow and discard new meanings and, at any moment, we could swoop down from our macroscopic perch and select any one of those meanings, or group of them, to see exactly what was going on, for some fine, but arbitrary value of "exactly".
As I told both conferences I presented to last month, the future of this stuff will be much bigger, much more interesting and much more disruptive than its past. People like Shirky are drawing the maps and people like Jon Husband are building the vehicles.
Clay is a good writer and offers many insights, but on these topics he has an unfortunate tendency toward a political style of argument. Typically he creates a straw-man - the Hierarchic Taxonomy - and conflates it with Ontology and the Semantic Web.
Imagine a Victorian armchair scientist who decides to catalog the world's food ingredients. Drawing from the leather-bound volumes in his library he and his assistant write out the first hundred cards, and they have a problem - retrieval. So he makes a command decision (being recently retired from Her Majesty's service) to sort the cards first by Animal-Vegetable-Mineral and to chart the subdivisions below. Maybe he persevered through his long retirement to build a curious if not very useful reference.
This is Clay's mythical Hierarchic Taxonomy. It was born in a time with different information processing constraints and survives in cases where it truly matches the problem at hand. The hierarchy of species helps to understand evolution. Maintenance of modular assemblies is guided by a hierarchic reference. But no one uses it for general information management.
A century later a retired gentleman undertakes the same task. He writes note cards at the public library and types them into a flat data file on his second-generation personal computer - to make searching easy. He puts fields in his database including the name of the foodstuff, cuisines that use it, cost, and AVM - AVM referring to one of the labels Animal, Vegetable, or Mineral.
Now there's no hierarchic classification needed and our gentleman has invented Tags! In fact the technology of the day doesn't support hierarchy but tags are quite natural.
A few decades later a group of culinary arts students decides to create an open catalog of the world's recipes. One of the first challenges is to design an ontology of foodstuffs so that cooks around the planet can describe their diverse local ingredients.
Does this mean they need to make a "global" classification scheme? Of course not. An ontology is merely a flexible database scheme, first delineating some useful category - foodstuffs - and then stating an open list of things that one might say about a foodstuff - name, cuisine, AVM, etc.
Would tagging accomplish the same thing? Not as currently conceived. First, Clay seems to reject any classification of the problem domain. While practitioners naturally think of different sets of tags for, say, ingredients and recipes, apparently this two-level hierarchy is too much. Second, tags are not as expressive as a database field because the field name carries semantics - Dollars-per-pound: 15 vs. the tag "15DPP" ??
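A small sketch may help show the difference in expressiveness. Everything in it is hypothetical: the `Foodstuff` record, the sample ingredients and prices, and the "15DPP" parsing convention are invented for illustration and are not part of any real ontology or tagging system.

```python
from dataclasses import dataclass

@dataclass
class Foodstuff:
    """A database-style record: each field name says what the value means."""
    name: str
    cuisines: list
    dollars_per_pound: float
    avm: str  # "Animal", "Vegetable" or "Mineral"

def cheaper_than(records, limit):
    """Query on a named field; no guessing about what the number means."""
    return [r.name for r in records if r.dollars_per_pound < limit]

def price_from_tags(tags):
    """Recover a price from bare tags, assuming a made-up 'NNDPP' convention."""
    for tag in tags:
        if tag.endswith("DPP") and tag[:-3].isdigit():
            return float(tag[:-3])
    return None  # no tag followed the convention

# Illustrative data only.
records = [
    Foodstuff("lentils", ["indian"], 2.0, "Vegetable"),
    Foodstuff("lamb", ["greek"], 15.0, "Animal"),
]
lamb_tags = {"lamb", "greek", "15DPP", "animal"}

print(cheaper_than(records, 10))   # uses the field name directly
print(price_from_tags(lamb_tags))  # works only if every tagger agrees on "DPP"
```

The tag version can be made to work, but only once everyone agrees on a naming convention, at which point the tag has quietly become a field.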
Going further our students make ontologies for ingredients, recipes, utensils, and skills using semantic web technologies. In time the ad hoc standards they establish catch on with influential grocers, publishers, chefs, manufacturers, and schools. This creates an open ecosystem in which we all become better cooks.
The key feature of the semantic web is to be able to combine, search, and assemble data from all sources. The semantic web is scalable; tagging is not because it is a muddle. Tags are certainly part of this picture. But if tagging is to evolve to do real work it will have to take on more structure and it will begin to look like a distributed open database - a lot like the semantic web.
Think: which would you prefer to eat - a nicely structured dish, or a big tag mush?
Posted by: Rick Thomas | June 17, 2005 at 03:57 AM
Clay starts with the premise that all classification schemes and all tags are transient. No doubt they are, in some absolute sense. But let's be real; no one is pursuing a "theoretically perfect view of the world". In most cases there's a plain and obvious view of the world that just lacks organization. Everybody knows the world changes constantly, but stable classifications are common, practical tools, not binary-mindedness. Ontology and the semantic web (and typical XML web applications for that matter) are useful distributed database applications, not the descendants of the antique absolute Taxonomy.
Clay seems to have some hope that, because of the simplicity of tags and thus their profusion, tagging will lead to some new kind of emergent applications, which incidentally will vanquish all classification. Maybe.
But more likely we'll want to extract classifications from them - blatant categorization. That's what comes to mind when you suggest using Clay's "tag signatures" to identify different kinds of information. Likewise when Clay observes that tags are often correlated. The structure teased out of tags will start to look a lot like ontologies: a commonly observed class and an open list of things that might be said about members of that class.
You would like to "look at the flow of meaning through an information space". But Clay insists dogmatically that "tag semantics are in the users. This is not a way to inject linguistic meaning into the machine." I suspect this is one reason that Clay is hot for tags: tagging naively short-circuits the basic methods of representing semantics in data. Politics, ugh.
Really, semantic web technology is your friend so don't throw it out before you see how it complements tagging. Ontology may be flexibly extended so it doesn't restrict what we can say; tags may be expressed unchanged. Ontology is also adopted socially, admittedly with a heavier cost and requiring more mature commitment compared to the teen-like flightiness of tagging. Ontology guides expression that is more kin to natural language than the truncated notion of "tag".
Posted by: Rick Thomas | June 18, 2005 at 09:10 AM
I'm thinking that this is one of those domains/areas where *both/and* will evolve, with probably relative degrees of *loose/tight* in terms of applicabilities.
And it will become the practice to "cook your own", in an environment of chacun à son goût ... which is what tags, generally, can add to the recipes ... no?
Posted by: Jon Husband | June 21, 2005 at 05:03 PM