David Harvey says a mouthful on filling in metadata right here, which links with my post on Tim Langeman's openreferences.
The thing that troubles me most about metadata is that it is generally construed as proprietary; the "owner" of the information also controls the metadata. That is unrealistic not only for all the practical reasons David mentions, but also because the person who creates the document cannot be trusted to describe its quality and cannot know more than a small part of its meaning. Which is why I'm fascinated by David's comment on the environmental modelling tool vision that he and Jim Hall have been developing.
He says:
"We need to store our data and definitions of our models in such a way that structured metadata can be attached, and however we do this, we must do it in such a way that the metadata schema is open ended. I don't yet know what metadata will prove necessary next year."
Precisely. And for it to be worth anything, it must also be held separately from the original data. That way, others can contribute to the development of the metadata, or annotation, of the document. That way too, new documents could easily inherit many characteristics of other documents by the same author, on related subjects, or created within the same organisation, project, religious affiliation or football team. Even if a document by a given author is not annotated at all, I could find out a great deal about that source of information from the separate repository, filtered according to my own needs. Then I could add the new, or old, document to the list and submit some further rating information about the content, enriching the metadata of all related documents while saving the time and resources of everyone with an interest in that information.
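The idea of metadata held apart from the data, open to outside contributors and inherited across related documents, can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the names Document, AnnotationStore, annotate and describe are hypothetical, as is the "doc:", "author:" and "org:" key scheme, and none of it comes from David's tool.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Document:
    # Hypothetical minimal document record; the content itself lives elsewhere.
    doc_id: str
    author: str
    organisation: str


@dataclass
class AnnotationStore:
    """Metadata repository kept separate from the documents it describes."""

    # target -> list of (contributor, key, value). Keys are free-form strings,
    # so the schema stays open ended.
    _notes: dict = field(default_factory=lambda: defaultdict(list))

    def annotate(self, target, contributor, key, value):
        """Anyone, not just the document's owner, can attach an annotation."""
        self._notes[target].append((contributor, key, value))

    def describe(self, doc):
        """Return annotations on the document itself plus those it inherits
        from its author and organisation."""
        targets = [
            f"doc:{doc.doc_id}",
            f"author:{doc.author}",
            f"org:{doc.organisation}",
        ]
        return [note for t in targets for note in self._notes.get(t, [])]


store = AnnotationStore()
store.annotate("author:hamish", "reader_a", "field", "environmental modelling")
store.annotate("doc:d1", "reader_b", "rating", 4)

# A brand-new, unannotated document still picks up what is known of its author.
new_doc = Document("d2", "hamish", "example-org")
inherited = store.describe(new_doc)
print(inherited)
```

Because the store is keyed by identifiers rather than embedded in the documents, a reader can rate or describe a document without the author's permission, and the author cannot edit those ratings away.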
To me that sounds like a web service, one that might look a bit like this,
"We need to build into our model development and exploration tools mechanisms for the harvesting of valuable information. The discussions have been mostly about collecting information on the sensitivity of models to parameters, for example, where the details of such studies, which tend to use very many realisations (realisation = model + particular parameter set). There are other, more mundane things to collect though. Who modified a model, and when? What processing has been applied to data, and where is the raw data it was applied to?"
If I understand this properly, both open and easily relatable parameter sets are essential to the development of metadata tools that are sufficiently flexible to meet our actual needs, as distinct from the "this is what we can do, make the most of it" approach. I can go with that any time. I hope they make it work.
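The "mundane things to collect" in that passage (who modified a model and when, what processing was applied to data, and where the raw data is) amount to a provenance log. A minimal sketch follows, again under my own assumptions: ProvenanceEvent, ProvenanceLog and the example dataset names are hypothetical and are not taken from the tool being described.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class ProvenanceEvent:
    subject: str            # identifier of the model or dataset that changed
    actor: str              # who made the change
    action: str             # free-text description of the processing
    source: Optional[str]   # identifier of the data it was derived from, if any
    when: datetime


@dataclass
class ProvenanceLog:
    """Hypothetical append-only record of changes to models and data."""

    events: list = field(default_factory=list)

    def record(self, subject, actor, action, source=None, when=None):
        when = when or datetime.now(timezone.utc)
        event = ProvenanceEvent(subject, actor, action, source, when)
        self.events.append(event)
        return event

    def history(self, subject):
        """Who modified this model or dataset, and when?"""
        return [e for e in self.events if e.subject == subject]

    def raw_source(self, subject):
        """Follow source links back until reaching data with no recorded source."""
        seen = set()
        while subject not in seen:  # guard against accidental cycles
            seen.add(subject)
            sources = [e.source for e in self.events
                       if e.subject == subject and e.source]
            if not sources:
                return subject
            subject = sources[-1]
        return subject


log = ProvenanceLog()
log.record("rainfall_clean", "alice", "gap-filled", source="rainfall_raw")
log.record("rainfall_daily", "bob", "aggregated to daily", source="rainfall_clean")
log.record("flood_model", "alice", "changed roughness parameter")

origin = log.raw_source("rainfall_daily")
editors = [e.actor for e in log.history("flood_model")]
print(origin, editors)
```

The point of harvesting these events inside the tools themselves is that nobody has to remember to write them down afterwards.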
One of these days I should try to clear up the ambiguity a bit. As you detected, my name is David. But I'm called Hamish ...
No big reason, it's just a nickname that stuck.
Posted by: Hamish Harvey | October 06, 2003 at 05:16 AM