> I'm particularly interested in your opinions on tagging. Can you explain what most software gets wrong about tagging?
First up, tagging is very much a matter of personal preference, from high-level choices like which fields to fill in (e.g. Album Artist), to small differences (e.g. do track numbers have a leading 0?), to technical and compatibility choices (e.g. which tagging format? For MP3 you can have any of APE1, APE2, ID3v1 or ID3v2.{2,3,4}).
Next, these different tagging formats are in no way uniform:
APE2 and vorbiscomments are string maps with some generally agreed-upon conventions,
ID3v1 has a very limited set of fields,
ID3v2 is extremely over-specified, with a tag for everything; in practice this just seems to mean that every program gets them wrong somewhere.
There are several other formats too...
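
To make the "very limited fields" point concrete, here is a minimal sketch of reading an ID3v1 tag, which is just the last 128 bytes of the file laid out as fixed-width fields. The function name `parse_id3v1` is made up for this example, and decoding as Latin-1 is an assumption (real files use all sorts of encodings).

```python
def parse_id3v1(data: bytes):
    """Return the ID3v1 fields found at the end of `data`, or None."""
    if len(data) < 128:
        return None
    tag = data[-128:]
    if tag[:3] != b"TAG":
        return None

    def text(raw: bytes) -> str:
        # Fields are padded with NULs (or sometimes spaces).
        # Latin-1 is an assumption; the format does not record an encoding.
        return raw.split(b"\x00", 1)[0].decode("latin-1").rstrip()

    fields = {
        "title": text(tag[3:33]),
        "artist": text(tag[33:63]),
        "album": text(tag[63:93]),
        "year": text(tag[93:97]),
        "comment": text(tag[97:127]),
        "genre": tag[127],  # index into a fixed genre table
    }
    # ID3v1.1 convention: a zero byte followed by a non-zero byte at the
    # end of the comment field means the last byte is the track number.
    if tag[125] == 0 and tag[126] != 0:
        fields["comment"] = text(tag[97:125])
        fields["track"] = tag[126]
    return fields
```

Thirty bytes per text field, a one-byte genre and a track number squeezed into the comment: that is the whole format.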
Next, the tags themselves are very tricky to locate and modify: they all have various forms of headers, footers, padding, and odd storage formats.
E.g. FLAC/vorbiscomments keep swapping back and forth between little and big endian.
This is made more complex by the various competing container formats, which add their own escaping, rules, etc. on top, e.g. Ogg, MPEG, Matroska...
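
A rough sketch of the endianness flip in FLAC: the metadata block header stores its length as a 24-bit big-endian integer, while the Vorbis comment data inside the block uses 32-bit little-endian lengths. The function names are invented for illustration, and folding keys to upper case is just one possible convention.

```python
import struct

VORBIS_COMMENT = 4  # FLAC metadata block type for Vorbis comments


def parse_flac_block_header(buf: bytes, pos: int = 0):
    """Decode a 4-byte FLAC metadata block header (big-endian length)."""
    is_last = bool(buf[pos] & 0x80)   # top bit: last metadata block?
    block_type = buf[pos] & 0x7F      # low 7 bits: block type
    length = int.from_bytes(buf[pos + 1:pos + 4], "big")
    return is_last, block_type, length


def parse_vorbis_comment(body: bytes):
    """Decode a Vorbis comment block body (little-endian lengths)."""
    pos = 0
    (vendor_len,) = struct.unpack_from("<I", body, pos)
    pos += 4
    vendor = body[pos:pos + vendor_len].decode("utf-8")
    pos += vendor_len
    (count,) = struct.unpack_from("<I", body, pos)
    pos += 4
    tags = {}
    for _ in range(count):
        (n,) = struct.unpack_from("<I", body, pos)
        pos += 4
        key, _sep, value = body[pos:pos + n].decode("utf-8").partition("=")
        pos += n
        # Keys are case-insensitive and may repeat, so keep a list per key.
        tags.setdefault(key.upper(), []).append(value)
    return vendor, tags
```

This covers only the FLAC case. Inside Ogg the same comment data is split across pages and packets, which is where the container rules come in.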
To make things even harder, things aren't even correct without context. To take the duration issue from the article: a VBR MP3 has no way to calculate duration without parsing the complete file. That takes too long for most files (reading the whole file into memory just to get the length? No way!), so there are all sorts of heuristics and weird headers (e.g. Xing).
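
As a sketch of the Xing shortcut: the first frame of a VBR file can carry a `Xing` (or `Info`) header with the total frame count, so duration is roughly frames × samples per frame ÷ sample rate. The function name is hypothetical, the sample rate and samples per frame are assumed to have been decoded from the MPEG frame header already, and searching for the marker with `find` is a simplification (a careful parser would compute its offset from the frame header).

```python
import struct


def xing_duration_seconds(first_frame: bytes, sample_rate: int,
                          samples_per_frame: int = 1152):
    """Estimate duration from a Xing/Info header, or return None."""
    idx = first_frame.find(b"Xing")
    if idx < 0:
        idx = first_frame.find(b"Info")
    if idx < 0 or len(first_frame) < idx + 8:
        return None
    # The flags word is big-endian; bit 0 means a frame count follows.
    (flags,) = struct.unpack_from(">I", first_frame, idx + 4)
    if not flags & 0x1 or len(first_frame) < idx + 12:
        return None
    (frames,) = struct.unpack_from(">I", first_frame, idx + 8)
    return frames * samples_per_frame / sample_rate
```

If the header is missing or wrong (and it sometimes is), you are back to guessing from the bitrate or scanning every frame.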