Incorrect encoding, incorrect separators (record and field both), incorrect escaping / quoting, etc…
> If I know what my data looks like
If you control the entirety of the pipeline, the format you're using is basically irrelevant. You can pick whatever you want and call it whatever you want.
> Have you ever worked in embedded systems? Writing XML files and then zipping them on a platform with 32 kilobytes of RAM would be hell. CSV is easy, I can write the file a line at a time through a lightweight microcontroller-friendly filesystem library like FatFS.
You can pretty literally do that with XML and zip files: write the uncompressed data, keep track of how much you've written (for the parts that aren't fixed-size), write the file header, done. You just need to track your file sizes and offsets in order to write the central directory. And realistically, if you're replacing a CSV file, the only dynamic part will be the one worksheet; everything else will be constant.
> If you control the entirety of the pipeline, the format you're using is basically irrelevant.
I think you are missing the point -- you only need to know about the generator to know the format.
Since the parent poster was talking about embedded, here is one example: a data logger on a tiny embedded board records tuples of (elapsed-time, voltage, current). You need this to be readable in the widest variety of programs possible. What format do you use?
I think the answer is pretty clear: CSV. It is compatible with every programming language and spreadsheet out there, and in a pinch you can even open it in a text editor and inspect the data manually.
Using something like XLSX here would be total craziness: it would make the code significantly bigger, and it would severely reduce compatibility.