Hi there,
I have a data set that contains, for about 38,000 artists with a MusicBrainz ID, a list of the artists that are most related to each of them according to last.fm.
I have about 60,000 more artists without a MusicBrainz ID, and I can obviously retrieve more data through the last.fm web services.
My plan is, for each artist, to add every related artist that has a MusicBrainz ID and is more than 80% similar (usually about two or three artists) as a 'similar artist' relation. Is that OK?
As an example, I added links for about 100 artists to the sandbox. I do my lookups based on the MusicBrainz ID, and do not create new artists. See: Édith Piaf
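In rough Python, the selection would look something like the sketch below. The record layout, the names (Related, select_similar) and the sample rows are made up for illustration and are not the actual script or real last.fm data. It only assumes that each related artist comes with an optional MusicBrainz ID and a similarity percentage.

```python
from typing import List, NamedTuple, Optional


class Related(NamedTuple):
    """One related-artist entry (hypothetical layout, for illustration only)."""
    name: str
    mbid: Optional[str]  # MusicBrainz ID, or None when there is none
    match: float         # similarity as a percentage, 0 to 100


THRESHOLD = 80.0  # keep only artists that are more than 80% similar


def select_similar(related: List[Related], threshold: float = THRESHOLD) -> List[Related]:
    """Keep entries that have a MusicBrainz ID and score above the threshold.

    Entries without an ID are skipped, so no new artists are ever created.
    """
    return [r for r in related if r.mbid and r.match > threshold]


# Placeholder rows, not real data.
example = [
    Related("Artist A", "mbid-a", 100.0),
    Related("Artist B", None, 95.0),      # no MusicBrainz ID: skipped
    Related("Artist C", "mbid-c", 80.0),  # exactly 80%: not "more than", skipped
    Related("Artist D", "mbid-d", 81.5),
]

selected = select_similar(example)
for r in selected:
    print(r.name, r.mbid, r.match)
```

Whether the cut-off should be strict (more than 80%) or inclusive is a one-character change in the comparison.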
Any comments? Should I somehow link the artists to their last.fm page after processing? Add their last.fm urlname as a key, perhaps?
Thanks,
- Jeroen

