Dave Weinberger points with approval to Jay Fienberg's reflexive post on Metadata and vagueness. I agree with most of what both of them are saying, so I wonder why all this stuff about metadata is such a dog's breakfast. Why are we having so much trouble organising ourselves around this subject and agreeing on whatever is useful?
Jay hits the nail on the head with his reference to vagueness and, by his title, seems to say that one somehow stands in the way of the other. But one of the most intractable problems that I see is that data and metadata are both linguistically deployed tools. Among the vital characteristics of language are that it has varying degrees of vagueness which are utterly essential to effective communication, that its internal boundaries are fluid and adaptive, and that it has very ragged edges to enable growth and complexity.
Those fundamentals are inherently at odds with anything binary, and any attempt to draw boundaries that a binary model can "manage" will have one of two possible problems. It will either constrain the potential for leakage, subversion, malleability and vagueness necessary to actual communication, or it will proliferate categories and sub-categories until they themselves become unmanageable. In other words, the more clearly we define something, the harder it is to link to anything else, and the more broadly we define it, the harder it is to chain effectively. A Heisenberg uncertainty principle of meaning.
Our Internet problem is that we have to find a way to make this valuable vagueness work in the fundamentally binary model of machine-to-machine communication.
From where I sit, all data is inherently "meta": it is always "about" something, not the thing itself. The only reason I ask you to pass me a ladle is because I don't have it, I need it and you need to know what to pass me. "Ladle" is metadata about an implement and data about a class of implements and, of course, I can treat it as data or metadata depending on whether I want to know how many ladles I have or how much steel is contained in all the ladles in my warehouse or some other measurement. That would be fine if all I wanted to do with information was count stuff. (Even those things we just want to count won't hold still while we do it. Who would have thought that agreeing on what is a name and address would be like herding cats?)
But you can know everything about ladles, from their design to their manufacture to their many uses to where they are used, and still have no clue about what Spike Milligan was doing when he wrote (in a piece whose origin I now forget) "She ladled her breasts in to the next room". Jokes, attitude and metaphor are woven into everything to do with communication, and they stop working as soon as we pin them down.
Language grows and shifts and develops through puns, jokes, abuse, insult and co-option. It accretes new words from other languages at the edges of disciplines and activities which can be right at the heart of our social interactions. It elevates and denigrates terms, concepts and actions, and with those shifts go power and influence, control and innovation. It doesn't work without this kind of vagueness, and there are few things more pathetic than watching some guardian of linguistic purity trying to force people to refrain from using it in ways the authority deems unacceptable.
My solution to this is to shortcut the process a bit. The metadata is needed because we are trying to get our machines to talk to each other about what their humans will think is interesting and useful, and to build for them a structure that will encode "what we mean by interesting". In attempting this we will no doubt create some fascinating and valuable insights into machine code and what it can do, and we will realise that we know a hell of a lot more about how we "think" about thinking. But while we continue trying to achieve this desirable outcome by forcing data, at whatever level of metaness, into a binary straitjacket, we will fail.
Which is why PageRank is still a stroke of genius and why the future of metadata is not in information about information, but in opinion about information, opinion and people. By outsourcing the ambiguity to the people, who handle it without having to define it, then accumulating their shared pointings, we will continue to make progress long before we have agreed on anything much about metadata, including the boundaries where the meta kicks in.
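For anyone who wants to see how little definition the "accumulated pointings" idea needs, here is a minimal sketch of a PageRank-style score in Python. It is the textbook power-iteration version, not Google's actual implementation; the function name, the page labels and the iteration count are my own inventions for illustration, and the damping value of 0.85 is simply the figure usually quoted. Note that nothing in it knows what any page is "about".

```python
def rank_by_pointers(links, damping=0.85, iterations=50):
    """Score pages purely by who points at whom (PageRank-style sketch).

    links maps each page to the list of pages it points to.
    All names here are hypothetical; this is an illustration only.
    """
    # Collect every page that either points or is pointed at.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with every page equally interesting.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Pages that point nowhere spread their score evenly over everyone.
        dangling = sum(rank[page] for page in pages if not links.get(page))
        new_rank = {
            page: (1.0 - damping) / n + damping * dangling / n
            for page in pages
        }
        # Every other page splits its score among the pages it points to.
        for source, targets in links.items():
            if not targets:
                continue
            share = damping * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank

    return rank


# Three hypothetical posts: "a" and "b" both point at "c"; "b" also points at "a".
example_links = {"a": ["c"], "b": ["c", "a"], "c": []}
scores = rank_by_pointers(example_links)
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 3))
```

The people doing the pointing supply all the judgement; the code only adds it up.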
