November 05, 2004

iTunes OCD

by peterb

You'd think that having an iPod would be an endless parade of musical bliss. And, mostly, it is. But the one worm in the apple is that now you have 5,000 songs in the library, and you have to rate them.

"Oh, come now," you say. "You don't have to rate those songs." And maybe you'd be right in a sort of narrow-minded, positivist, literal way. But when you're talking about me, you're talking about someone who eats all of the green M&M's first, and is convinced that they actually taste better. The robots demand that I rate my music; therefore, I must rate it. I am a man without free will.

So the first problem you face is what the ratings actually "mean." iTunes lets you rate songs anywhere from 0 to 5 stars. I quickly disposed of the "0" rating -- with 5,000 songs, I needed some way to distinguish those that I haven't rated at all. So a zero rating means "not rated yet."

It only took me a few days to assign semantic meanings to each of the ratings:

  • 5 stars: A song that I love, and would consciously add to a playlist and want to hear.

  • 4 stars: A song that I love, but for some reason don't ever think of until the randomizer picks it for me. Then I enjoy it until it's over, and promptly forget about it again.

  • 3 stars: I like this song. Sometimes I might skip it, and sometimes I might listen to it.

  • 2 stars: I don't really like this song, and am likely to skip it if it comes on, other factors notwithstanding.

  • 1 star: This is unlistenable, and I probably won't even sync it on to the iPod.

The rating isn't everything, of course. Ratings are focused on songs, and that doesn't tell the whole story. For example, The Ex's Joggers and Smoggers album is composed of bits of found sound and rhythm that, considered on their own, are probably 1 or 2 stars no matter who is "rating" them; but taken as a whole, the album is a masterwork -- the songs need their neighbors. This is also true for the brief, nameless interstitial audio collages on The Loud Family's Days For Days. But at least for your average pop song, it's a workable system. I just took a look at the distribution of ratings, and it is pleasingly bell-shaped, with perhaps a slight bias towards low ratings.

The next problem is what to do with the ratings. Sure, I made the inevitable smart playlists and such, but fundamentally I've generated this mass of data, and I'm a geek, so: I want to data mine it. What music do I like? Off the top of my head, if you said "Who are your favorite artists?" I'd say: Aimee Mann, Loud Family (a.k.a. Game Theory), Nick Cave, Tom Waits, and Shadowy Men on a Shadowy Planet. Does the view of my library I get when I look at my individual song ratings match what I would say that I like if you asked me?

It turns out that iTunes doesn't have any built-in way to do this easily. Lots of people have written scripts to do various things, but I was only able to find one script that came close to what I wanted to do: a script which ranked albums, rather than artists. So as a first cut, I decided to run it and see what it said.

Rank  Artist                       Album           Rating
   1  David Bowie                  Scary Monsters    4.2
   2  Aimee Mann                   Bachelor No 2     4.08
   3  Nick Cave And The Bad Seeds  Live Seeds        4.0
   4  They Might Be Giants         Lincoln           3.78
   5  Nick Cave And The Bad Seeds  Let Love In       3.7

Hmmm, not too bad. It's a deceptive view, though. The album ranking script is brittle. In particular, it will only rank albums for which you've rated every single song. Since I know for a fact I haven't even come close to that yet, I know it's missing a lot of data (and by the very nature of rating psychology, there's probably a bias to rate songs that you like before rating songs that you don't like). Furthermore, the "album" is the wrong unit of measure for this sort of averaging. Going back to my Loud Family example: Days for Days has 10 superb songs that I rated highly and 10 little interstitial pieces that I gave low ratings because they don't really stand on their own. This kills the curve. The end result is I have a script telling me that I think, subconsciously, that The Proclaimers' Sunshine on Leith is marginally better than Days for Days, which isn't true at all.

So, I decided to take the script and see if I could use it as a base to generate some more interesting views of the data. This meant I had to play with AppleScript. Have you ever used AppleScript? It's like COBOL, but less versatile. Oh my God, what an ugly language. Basically, you can have your script interact with iTunes, sending iTunes commands asking it to give you the ratings for all the songs by such and such an artist. Really, if you want to do something like this, I think a better way would be to just parse the iTunes Library XML database directly, rather than using AppleScript. But, what can I say, I was lazy.
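For the curious, here's a rough sketch of that parse-the-XML-directly approach in Python, using the standard plistlib module. The library file is a plist whose "Tracks" dictionary stores each rating as 0-100 (twenty points per star); unrated tracks simply omit the key. The tiny in-memory library below is made up for illustration:

```python
import plistlib

def ratings_by_artist(library):
    """Group star ratings by artist from a parsed iTunes library plist.

    iTunes stores a track's rating as 0-100 (20 points per star);
    tracks with no rating omit the 'Rating' key entirely.
    """
    out = {}
    for track in library.get("Tracks", {}).values():
        artist = track.get("Artist")
        rating = track.get("Rating")
        if artist is None or rating is None:
            continue  # skip unrated tracks ("not rated yet")
        out.setdefault(artist, []).append(rating // 20)
    return out

# Tiny invented library for illustration. With the real file you'd do:
#   library = plistlib.load(open("iTunes Music Library.xml", "rb"))
library = {"Tracks": {
    "1": {"Artist": "Aimee Mann", "Rating": 100},
    "2": {"Artist": "Aimee Mann", "Rating": 80},
    "3": {"Artist": "Aimee Mann"},              # unrated
    "4": {"Artist": "Kate Bush", "Rating": 60},
}}
print(ratings_by_artist(library))  # {'Aimee Mann': [5, 4], 'Kate Bush': [3]}
```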

So with just a little tweaking, I got a list of artists and rankings, rather than albums. The invariants were a bit different: you didn't have to have rated every song by an artist, but only rated songs count towards an artist's score. Artists with fewer than 10 rated songs weren't included at all:

Rank  Artist                         Rating
   1  Nick Cave And The Bad Seeds      3.85
   2  Aimee Mann                       3.81
   3  DJ Z-Trip & DJ P                 3.8
   4  Kate Bush                        3.64
   5  Me First And The Gimme Gimmes    3.42
   6  Nick Cave                        3.35
   7  Paola & Chiara                   3.33
   8  Michiru Oshima                   3.29
   9  Billy Bragg                      3.29
  10  Jane Siberry                     3.26
  11  Richard Thompson                 3.25
  12  Tom Waits                        3.22
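For what it's worth, the ranking rule behind that list (only rated songs count, and artists with fewer than 10 rated songs are dropped) boils down to something like this Python sketch; the data here is invented for illustration:

```python
def artist_averages(ratings, min_rated=10):
    """Mean star rating per artist. Only rated songs count toward the
    average, and artists with fewer than min_rated rated songs are
    dropped entirely, mirroring the invariants described above.

    ratings: dict mapping artist -> list of star ratings (1-5).
    """
    table = []
    for artist, stars in ratings.items():
        if len(stars) < min_rated:
            continue
        table.append((artist, round(sum(stars) / len(stars), 2)))
    table.sort(key=lambda row: row[1], reverse=True)
    return table

# Invented data: ten rated Nick Cave songs, two rated one-hit-wonder songs.
ratings = {
    "Nick Cave And The Bad Seeds": [4] * 9 + [3],
    "One-Hit Wonder": [5, 5],  # under the 10-song cutoff, so excluded
}
print(artist_averages(ratings))  # [('Nick Cave And The Bad Seeds', 3.9)]
```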

Better, but still some odd results. Fundamentally, I started converging on the idea that taking the mean of the star ratings is broken. The stars have semantic meanings that don't really map smoothly onto a real 0-to-5 scale. So I gave it one more try.

This time, I decided that what I really wanted to know was: which artists have the strongest repertoires (subjectively) in my library? So I decided to use the idea of "hot songs." If I gave a song a 4 or 5 rating then -- by my personal scale, described above -- it was a song that I'd be glad to listen to nearly any time. Whether it's a 4 or a 5 doesn't matter while I'm listening to it. If a song is below that threshold, it doesn't really matter whether it's a 1, 2, or 3. Back to the AppleScript editor once again (oh, the pain, the unending pain) to generate a list of artists. This time, the score is an integer between 0 and 100, which is approximately "the number of 'hot' songs by that artist in the library, divided by the total number of songs by that artist in the library, times 100". So an artist for whom I rated every song a 4 or 5 would get a score of 100, and an artist who got no 4s or 5s would get a 0. I also decided to actually display the number of songs affecting the calculation, rather than just showing the score. I find this a bit more interesting:

Rank  Artist                         Score  # Hot Songs  # Songs
   1  Aimee Mann                        59           22       37
   2  Bob Mould                         55            6       11
   3  Billy Bragg                       52           11       21
   4  Nick Cave                         45           14       31
   5  Nick Cave And The Bad Seeds       44           18       41
   6  Lou Reed/John Cale                40            6       15
   7  Indigo Girls                      40            4       10
   8  DJ Z-Trip & DJ P                  39            9       23
   9  Pixies                            38           36       94
  10  Game Theory                       37           10       27
  11  Tom Waits                         36           49      135
  12  Karl Hendricks Trio               34           23       68
  13  Talking Heads                     33            5       15
  14  The Ex + Tom Cora                 33            4       12
  15  Me First And The Gimme Gimmes     33            4       12
  16  Mary's Danish                     33            4       12
  17  Los Straitjackets                 33            4       12
  18  Paola & Chiara                    31            8       26
  19  Wall Of Voodoo                    30            3       10
  20  The Sisters of Mercy              30            3       10
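The "hot songs" score amounts to this little calculation, sketched here in Python rather than AppleScript. The data is invented, chosen to match the Indigo Girls case (4 hot songs out of 10):

```python
def hot_song_scores(songs, hot_threshold=4, min_songs=10):
    """Score each artist by the percentage of their songs rated 'hot'
    (4 or 5 stars). Unrated songs (stars=None) still count in the
    denominator, and artists with fewer than min_songs songs in the
    library are dropped.

    songs: list of (artist, stars) pairs.
    """
    counts = {}
    for artist, stars in songs:
        total, hot = counts.get(artist, (0, 0))
        is_hot = stars is not None and stars >= hot_threshold
        counts[artist] = (total + 1, hot + is_hot)
    table = [
        (artist, round(100 * hot / total), hot, total)
        for artist, (total, hot) in counts.items()
        if total >= min_songs
    ]
    table.sort(key=lambda row: row[1], reverse=True)
    return table

# Invented data: 4 hot songs, 3 so-so, 3 not yet rated -- 4 of 10 hot,
# for a score of 40.
songs = ([("Indigo Girls", 5)] * 4
         + [("Indigo Girls", 3)] * 3
         + [("Indigo Girls", None)] * 3)
print(hot_song_scores(songs))  # [('Indigo Girls', 40, 4, 10)]
```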

Now we're starting to approximate my subjective worldview much better. Kate Bush drops completely out of the top 20 (thank God, my hipster status is secure, unless people notice that Paola e Chiara are there also). Furthermore, it's immediately apparent which of the top 20 members are outliers, just by seeing the low number of tracks that have been rated. I have one Indigo Girls album; it's the one that is universally known as "the good one," and it has 4 songs on it that I really like. That's enough to give them a probably unjustly high position in the list. So this version of the script is unfairly generous to artists who only have a few tracks in the library. Probably the next step is to give a bonus to artists who have many tracks in the library, and a penalty to artists who only have a few (David Bowie, for example, has pretty much the same rating as Roxy Music, even though Roxy Music has only 2 highly rated songs in my library and he has 30. That's unjust. Unjust, I tell you!)

And, of course, I'll need to continue with the arduous, backbreaking work of listening to all my music. And rating it.

The original AppleScript that I used as the pattern for all of this stat generation can be found here. If you've got a script that you like that does something similar, please mention it in the comments below. I can't possibly be the only person in the world who is obsessive-compulsive about this.

Posted by peterb at November 5, 2004 07:20 PM

I feel better about my personal rating OCD now.

I haven't written scripts to rank artists/albums (though I must confess that I have thought about it), but I do tend to create smart playlists of unrated-only songs to encourage me to actually rate everything. Combined with Party Shuffle, that makes it pretty easy to actually remember to rate stuff -- I can listen while I work, and every now and then flip over to iTunes and rate the last N songs that shuffled through.

I've gotten to the point where pretty much the only stuff in my library that isn't rated is the brand-new stuff, and that makes me happier than it really ought to.

Posted by Nat Lanza at November 5, 2004 07:51 PM

My god. Wall of Voodoo actually wrote more than one song?

I don't feel the compulsion to rate all my songs in iTunes, but that means I do have to try to rate them more or less when I rip the CD. If it's a brand-new CD, that can be difficult.

On the other hand, the green M&M's definitely taste better.

Posted by Christina Schulman at November 5, 2004 08:28 PM

So I've rated nearly all my albums, using a similar methodology. The problem is that I need some way to adjust the probability that a song gets played. I could listen to 4s and 5s all day, but that gets boring... I need the occasional 3 or 2, and every once in a while a 1-star song, to act as a musical sorbet to cleanse the palate. As far as I can tell, iTunes can't handle this.

The best I've come up with is a smart playlist with all my 4s and 5s, plus all unranked songs (which are mostly singles). If the proportion of songs is right it's OK, but it's still far from optimal.

Posted by Mark Denovich at November 5, 2004 11:47 PM

I resist my robot overlords. I don't even know how to rate the damn songs.

Posted by psu at November 6, 2004 07:18 AM

Bah! Do not mock Kate Bush!

Er... That aside, one thing that might be useful is to weight ratings by song length--that way, things like the little interstitial pieces that only really make sense when you're listening to a whole album don't drag the rating down as much as they might otherwise.

Posted by John Prevost at November 6, 2004 01:18 PM

Hey, buddy, don't even talk to me about musical OCD until you've signed up for Audioscrobbler, gotten a plugin and started scrobbling... It's a whole 'nother kinda obsession...

Posted by Kelly DeYoe at November 6, 2004 07:21 PM

I'm confounded. Or -fused.

Isn't the point of data mining to find interesting data that wasn't visible at a first glance? Why then produce a list that exactly reflects your likes/dislikes that you know in the first place?

Unless, of course, this is only a first step. Have you tried matching your likes/dislikes against total playtime? Scrutinized your listening habits? Maybe you *do* like Kate Bush after all? That's what OCD is all about ;)

That said, I *have* to scrobble now. That's a bit more interesting. What I'd really like to see is something like Scrobbler pointing me to music I've never listened to, though. Now *there* is an interesting application...

Posted by Robert 'Groby' Blum at November 7, 2004 03:10 PM

Hm. Let me reconsider my last statement. Scrobbler is actually happy to share all your listening habits publicly - that rules it out for me. I'm all for anonymous data sharing, but I don't like that data tied to my identity...

Posted by Robert 'Groby' Blum at November 7, 2004 03:25 PM

I have the same need-a-little variety issue with "only high-ranked songs" playlists.

So what I do instead is make fairly broad smart playlists to use with Party Shuffle and use the 'play high-rated songs more often' option combined with manual fiddling with the next-N-songs list every now and then.

It's still not perfect, but it's good enough for me.

Posted by Nat Lanza at November 7, 2004 03:58 PM

Robert, the whole point is that I -don't- know my likes/dislikes to a sufficient level of detail. I know the "big" ones, but am constantly being surprised. Getting a different view of the data can help enlighten me.

Maybe we're talking past each other? I'm not sure.

Posted by peterb at November 8, 2004 08:56 AM

Audioscrobbler data is public, yes, but it isn't like you have to attach your real world identifying data to it unless you want to.

Also, once you've built up a listening profile, it does automatically generate a listing of other people whose listening habits are similar to yours, as well as a list of artist recommendations for artists you haven't listened to but might want to check out.

Its ranking strategies aren't perfect, but they are interesting.

And just because I don't care if anyone knows what I'm listening to:

Posted by Kelly DeYoe at November 8, 2004 03:46 PM

I have been looking for an album rating script for some time. I too have spent months rating my iTunes collection. Is there somewhere I could download your scripts? The original doesn't work for me, as a lot of albums have songs that are not rateable (intros, bridges, etc.).

Posted by Derek Brown at November 11, 2004 01:38 PM

