Anyone more knowledgeable in assembly and file formats care to expand on this:
>It serves no purpose, except proving that files format not starting at offset 0 are a bad idea
What exactly does it mean to start at offset 0 and why don't these file formats do that? Is there an advantage in not starting at offset 0 or is it simply oversight/indifference? Any kind of background on the problem would be appreciated, I'm really quite intrigued.
Every major file type (or nearly every, anyway) has a set of signature bytes, a "magic number" or something equivalent that identifies it as being of that type. This lets programs identify what kind of object a file represents without requiring this information to be supplied by the user.
Most file types have this magic signature as the initial few bytes of the file. For example, a Windows executable always begins with the ASCII characters "MZ".
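To make that concrete, here is a rough sketch of checking the leading bytes of a file against a few well-known signatures. The table and the function name `identify_at_offset_zero` are made up for illustration. A real tool such as `file` uses a much larger database and more checks than a prefix match.

```python
# Hypothetical, tiny signature table: a few well-known leading magic numbers.
SIGNATURES = {
    b"MZ": "Windows/DOS executable",
    b"\x89PNG\r\n\x1a\n": "PNG image",
    b"\xff\xd8\xff": "JPEG image",
    b"GIF8": "GIF image",
}


def identify_at_offset_zero(data: bytes) -> list[str]:
    """Return the label of every signature that matches the first bytes."""
    return [label for magic, label in SIGNATURES.items() if data.startswith(magic)]


# Only the first bytes are inspected; nothing here validates the rest of the file.
print(identify_at_offset_zero(b"MZ\x90\x00"))
print(identify_at_offset_zero(b"plain text"))
```

Because every signature in this table is anchored at the first byte and none is a prefix of another, at most one entry can match a given file.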
The point is that when formats keep their magic signatures at different, non-overlapping positions in the file, a single file can be simultaneously identified as more than one type.
It's a little more complicated than that, actually. Any given application of a file format may use various obfuscation techniques on the file's header or contents that render the file invalid from the perspective of the published standard (if there is one; it is also common in these cases to change the file extension to further disguise what format the file actually uses). Programs that do this may or may not de-obfuscate the file prior to use, depending largely on how and why the file was obfuscated.
For instance, a common obfuscation method is simply removing the magic number from the file; in this case, the program may simply try to use the file as the given format and return an error (or crash; we are talking largely about proprietary software in these cases after all) if the file can't be read.
When a file format starts at offset 0, it simply means that it starts at the first byte of the file.
Other than that, I can't provide any information on file formats allowed to start at offsets other than 0, or why this may or may not be a good idea (I suppose maybe it would allow an enterprising programmer to hide a malicious file by embedding it in an otherwise-innocuous format?), though I am certainly curious as well.
I think you're on to the right answer (though I don't know for sure myself).
It seems to me that if all file format identifiers started at the zero offset, it would be impossible for a single file to identify as more than one format. However, when different formats use different offsets to identify themselves, it is possible to construct the file in such a way that it validly identifies as more than one format.
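A small sketch of that idea, assuming a checker that looks for each signature at its own offset. The table and the `identify` helper are hypothetical. The one non-zero offset used is the `ustar` magic, which sits at byte 257 of a tar header. The buffer is synthetic and is neither a working executable nor a working archive. It only shows that two signature checks can pass on the same bytes.

```python
# (offset, magic, label) triples; a hypothetical table for illustration.
SIGNATURES = [
    (0, b"MZ", "Windows/DOS executable"),
    (0, b"\xff\xd8\xff", "JPEG image"),
    (257, b"ustar", "tar archive (ustar)"),
]


def identify(data: bytes) -> list[str]:
    """Return every label whose magic bytes appear at its expected offset."""
    return [
        label
        for offset, magic, label in SIGNATURES
        if data[offset:offset + len(magic)] == magic
    ]


# Synthetic 512-byte buffer carrying two signatures that do not overlap.
buf = bytearray(512)
buf[0:2] = b"MZ"
buf[257:262] = b"ustar"

print(identify(bytes(buf)))
```

Two signatures that both live at offset 0 compete for the same bytes, so only one of them can match (unless one is a prefix of the other). Signatures at different offsets never compete, which is what lets one file pass both checks.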
That's kind of a different issue though. My understanding is that .jpeg allows a footer of unlimited size (extra data after the end of the image) and .rar allows a header of unlimited size (extra data before the archive itself). It gets similar results, though.
A lot of archive formats are read starting from the end, because the writer doesn't know beforehand what is going to be written. But there is very little reason not to have magic bytes at either the very start or the very end of a file.
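As an illustration, ZIP is one format that is located from the end of the file (the thread doesn't name it, so treat it as my example). The sketch below glues a JPEG-looking prefix onto a small ZIP archive and checks it both ways with Python's standard `zipfile` module. The prefix is not a decodable image, just a start marker, filler and an end marker, and the member name `hello.txt` is made up.

```python
import io
import zipfile


def make_zip_bytes() -> bytes:
    """Build a tiny ZIP archive in memory (hypothetical member name)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("hello.txt", "hi")
    return buf.getvalue()


# Not a real image: JPEG start marker, some filler, and the end-of-image marker.
jpeg_like = b"\xff\xd8\xff\xe0" + b"\x00" * 16 + b"\xff\xd9"
polyglot = jpeg_like + make_zip_bytes()

# A start-anchored check sees JPEG magic at offset 0 ...
starts_like_jpeg = polyglot.startswith(b"\xff\xd8\xff")

# ... while zipfile looks for the archive's end record and still finds it.
opens_as_zip = zipfile.is_zipfile(io.BytesIO(polyglot))

with zipfile.ZipFile(io.BytesIO(polyglot)) as zf:
    payload = zf.read("hello.txt")

print(starts_like_jpeg, opens_as_zip, payload)
```

Neither check is wrong on its own terms: one reads from the front, the other from the back, and neither sees anything that contradicts what it expects.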