July 21, 2004

The Problem with Data that is Meta

by psu

Talk to geeks the world over, and they will wax lyrical about all the ways in which meta-data will save the world.

It will make your disk searchable.

It will provide a semantic framework for WWW content.

It will allow tools from different vendors to manage your workflow and asset files.

It can form the basis for archiving your digital life.

Sadly, it's all a lie.

The problem here is that by its nature, the data that is meta is still data, and therefore it is subject to all the same problems that the original data had in the first place:

- Everyone has to agree on a schema.

- Meta-data will evolve but the tools you have to manage it will not.

- Vendors with power will try and use meta-data as a leverage point to lock you into their tools, rather than as an interface point to allow multiple tools to interoperate.

and so on.

An example of this problem that is near and dear to my heart is meta-data related to digital photographs. There are multiple standards (EXIF, IPTC, XMP) and multiple standards for embedding those standards into various standard file formats like JPEG. In addition, different camera vendors embed the data differently in their various proprietary formats (Nikon NEF files, for example). I had a workflow set up between Photoshop and iView that was working great for me mostly because Photoshop and iView agreed to write the EXIF data in the same place in all their files. This is something of a minor miracle, actually, as I was soon to find out.
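To make the "multiple standards inside one file" point concrete, here is a minimal Python sketch that walks the header segments of a JPEG and reports which metadata blocks it finds. It relies on the conventional signatures: EXIF and XMP both live in APP1 segments, and IPTC usually rides inside a Photoshop APP13 segment. The function names `find_metadata` and `segment` are made up for this illustration, and the parser is deliberately naive: it stops at the start of the image data and does not handle padding bytes or other oddities that real files contain.

```python
import struct

# Signatures that conventionally open each kind of metadata payload.
EXIF_SIG = b"Exif\x00\x00"                     # in an APP1 segment (0xFFE1)
XMP_SIG = b"http://ns.adobe.com/xap/1.0/\x00"  # also in an APP1 segment
IPTC_SIG = b"Photoshop 3.0\x00"                # APP13 segment (0xFFED), usually holds IPTC


def find_metadata(data: bytes) -> set:
    """Return the names of the metadata blocks found in a JPEG byte string."""
    if data[:2] != b"\xff\xd8":  # every JPEG starts with the SOI marker
        raise ValueError("not a JPEG stream")
    found = set()
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            break  # lost sync; a real parser would try harder
        marker = data[pos + 1]
        if marker in (0xDA, 0xD9):  # start of scan or end of image
            break
        # The two-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        payload = data[pos + 4:pos + 2 + length]
        if marker == 0xE1 and payload.startswith(EXIF_SIG):
            found.add("EXIF")
        elif marker == 0xE1 and payload.startswith(XMP_SIG):
            found.add("XMP")
        elif marker == 0xED and payload.startswith(IPTC_SIG):
            found.add("IPTC")
        pos += 2 + length
    return found


def segment(marker: int, payload: bytes) -> bytes:
    """Build one JPEG segment (hypothetical helper, used only for the demo)."""
    return bytes([0xFF, marker]) + struct.pack(">H", len(payload) + 2) + payload


if __name__ == "__main__":
    # A fake file with EXIF and XMP blocks but no image data.
    sample = (b"\xff\xd8"
              + segment(0xE1, EXIF_SIG + b"tiff")
              + segment(0xE1, XMP_SIG + b"<x/>")
              + b"\xff\xd9")
    print(sorted(find_metadata(sample)))
```

Even this toy version needs three separate signature checks across two segment types for a single file format, and that is before parsing any of the actual fields or touching a vendor's raw format.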

Then I got Photoshop CS. One of the new features in the Adobe CS line is a completely new standard (XMP) for embedding meta-data into files. Adobe trumpets this new standard as a new extensible base for all your meta-data needs now and in the future. All I know is that iView can't read the new data, so my entire workflow is shot, and I downgraded back to Photoshop 7 until I can find another solution.

Of course, if I were to just use the Adobe tools for this, I'd be all set, since presumably they all know how to read the same formats.

In other words, Adobe changed the schema, didn't give people time to evolve the tools, and thus used the new meta-data standard to try and lock me in to their stuff, even though their file browser software blows.

Hallelujah and Amen.

Posted by psu at July 21, 2004 12:27 AM

hey pete remember me?
found you thru rich engel's blog.

Posted by megan dietz at July 21, 2004 12:53 PM

8 hours after writing this in a fit of anger, I found that Adobe released a patch to their Camera RAW plugin that fixes this problem. So my workflow is intact and I get to keep my beloved iView and use all the new Photoshop toys too.

Thank goodness for good apps.

Posted by psu at July 21, 2004 09:31 PM

I couldn't have said it better myself!

As someone who spends all day cooped up with Information Science grad students*, I can tell you that, yes, metadata does give lots of idealists wet dreams. As someone with a background in IM protocols**, metadata strikes me the same way as it does you. I foresee a future full of weirdo kludges to bridge between one person's idea of a document and someone else's, with plenty of third-party "extraction" tools. That is, it'll be a lot like today.

The thing that is positive about metadata is that at least now people are thinking about it. I used to do accounting system migrations (read: the world's dullest and least sexy reverse engineering gig), where I had to spend days of my life writing perl to handle all the nuances of text files containing COBOL-generated, bizarro-formatted accounting reports. Yeah, we would just replace their serial printer with a serial port on a linux box, and suck up all the data as it got printed. It was easier than divining the nuances of a dozen or more binary files. At least with "metadata," people are thinking that encapsulation is needed, instead of just spewing data onto screens or drives. Rigidly structured data is a lot easier to work with than "amorphous" data. The problems of missing/incongruous information are still there, though.

As someone who's made parts of a career out of this sort of thing, I like the idea. It means that my jobs get easier, but I'll probably still be able to bill the same =)

* I work at a little place called ibiblio, nee MetaLab, nee SunSite... we have a few RAs that do a lot of the site development. I'm just an admin there: it's a nice quiet student job while I finally knuckle down and get a degree.

** I worked on reverse engineering AOL's OSCAR protocol, better known as AIM.

Posted by Josh at August 1, 2004 10:44 PM

Hey Psu,

Even after finding the update, I'm personally still pissed. Adobe is force-feeding their standard down everyone's throat.

"The nice thing about standards is there are so many" NOT!!!!!!!!!

There are a million+ digital cameras and printers that support EXIF meta-data, and then Adobe introduces a new format that doesn't work alongside what already exists to give folks a choice or even an opportunity to transition, but just kicks it to the curb like yesterday's garbage.

Sure, all the big boys are adopting XMP, but they completely left out the lil folks that create/use utilities to work as they want to.

I have a Nikon D70 DSLR and even the latest version of Nikon's software doesn't display all the EXIF data. I have to rely upon third party utilities to grab all of it.

Not even Photoshop CS can see/view all the available data (even with the new RAW plugin).

I'm in the process of writing a script to read/write IPTC, EXIF and XMP data, all at the same time. A very slow process btw (writing the script that is).

Just too darn difficult to bulk edit (crop and watermark in/on) 250+ photos at a time with Adobe products. And I agree that the image browser in PS completely sucks. Paint Shop Pro has that feature down flat (I use both just to be productive).

[end rant]

To email: name + at symbol + email address

Posted by Jymmm at October 9, 2004 02:00 AM
