It's a completely different format. IIRC, .doc files are basically implementation-defined and consist of C structs dumped to disk. .docx is a properly specified format of compressed XML.
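To make the "compressed XML" part concrete, here's a rough sketch: a .docx is a ZIP archive whose main body lives in an XML part at `word/document.xml`. The helper names `build_toy_docx` and `extract_text` are made up for illustration, and the toy archive is not a complete file that Word would open (a real one also needs `[Content_Types].xml` and relationship parts).

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# WordprocessingML namespace used by the main document part.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def build_toy_docx(text: str) -> bytes:
    """Hypothetical helper: build a toy .docx-like ZIP holding one paragraph.

    This is NOT a complete, valid Word file; it only has the main XML part.
    """
    body = (
        f'<w:document xmlns:w="{W_NS}"><w:body><w:p><w:r>'
        f"<w:t>{escape(text)}</w:t></w:r></w:p></w:body></w:document>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("word/document.xml", body)
    return buf.getvalue()


def extract_text(docx_bytes: bytes) -> str:
    """Hypothetical helper: unzip the main part and join all <w:t> text runs."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    return "".join(t.text or "" for t in root.iter(f"{{{W_NS}}}t"))


if __name__ == "__main__":
    data = build_toy_docx("hello")
    print(data[:4])            # ZIP local file header signature
    print(extract_text(data))  # text recovered from the XML part
```

The same `extract_text` should work on the bytes of a real .docx, since it only relies on the archive containing `word/document.xml`.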
It's not "C structs dumped to disk". It's https://en.wikipedia.org/wiki/COM_Structured_Storage, which is basically a filesystem-in-a-file. And it has been documented for a long time, ever since Microsoft was forced to write docs for Office file formats because of antitrust: