The lesson I took from the MS Office XML fiasco is that a poorly abstracted single-client data model remains impenetrable no matter how you serialize it. It's not impossible to commit to a documented and stable wire format behind a web app, but almost nobody has the diligence to actually do it when it's so much easier to make random changes and then tweak your own client code to handle whatever you did (instantly making your client the only one that still works). Before JavaScript happened, HTML wasn't that easy to scrape, but it at least had the virtue of forcing everyone to generate output that made some kind of sense without first being munged by one idiosyncratic piece of code.