>If a Web page is to contain data and a service wants to act on this data, it has to scrape the Web page.
This is completely the wrong approach, and it isn't how people who know what they're doing work now.
A web page is just a presentation layer. If you have a service that wants data, it needs to work with a model or presenter/controller layer. On the web this can be a REST service, SOAP, something proprietary, etc. Ideally, the web site will be using this same source to get its data.
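To make that concrete, here is a minimal sketch in Python of one model feeding both an HTML view and a JSON view. Everything in it is hypothetical: `get_products`, `render_html`, `render_json`, `total_price` and the sample data are illustrative names, not any real site's API, and a real service would fetch the JSON over HTTP rather than call the function directly.

```python
import json
from html import escape


def get_products():
    """Hypothetical model layer: the single source of truth for the data."""
    return [
        {"id": 1, "name": "Widget", "price": 9.99},
        {"id": 2, "name": "Gadget <Pro>", "price": 24.50},
    ]


def render_html(products):
    """Presentation for humans: an HTML table built from the model."""
    rows = "".join(
        f"<tr><td>{escape(p['name'])}</td><td>{p['price']:.2f}</td></tr>"
        for p in products
    )
    return f"<table>{rows}</table>"


def render_json(products):
    """Presentation for services: JSON built from the same model."""
    return json.dumps({"products": products})


def total_price(json_body):
    """A consuming service parses the JSON and never touches the HTML."""
    data = json.loads(json_body)
    return sum(p["price"] for p in data["products"])


if __name__ == "__main__":
    products = get_products()
    print(render_html(products))
    print(total_price(render_json(products)))
```

The markup can be redesigned freely without breaking the consuming service, because both views read from the same source.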
If the web application presents data only via a web interface and doesn't offer a presenter/controller layer that lets you access the same underlying API, then yes, you will have to scrape if you want that data for some reason. But I don't see this as wrong: you're doing something the owners of the data didn't intend for you to do. You'll have similar issues if you want to get data out of any application view (e.g. screen scraping a native Windows app).
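For contrast, a rough sketch of what scraping looks like when no such layer is offered, using only Python's standard `html.parser`. The `CellScraper` class, the `scrape_rows` helper and the sample markup are assumptions made up for illustration.

```python
from html.parser import HTMLParser


class CellScraper(HTMLParser):
    """Collects the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag == "td":
            if not self.rows:  # tolerate a <td> with no enclosing <tr>
                self.rows.append([])
            self._in_cell = True
            self.rows[-1].append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.rows[-1][-1] += data


def scrape_rows(html):
    """Return the table cells as a list of rows of strings."""
    parser = CellScraper()
    parser.feed(html)
    parser.close()
    return parser.rows


if __name__ == "__main__":
    page = "<table><tr><td>Widget</td><td>9.99</td></tr></table>"
    print(scrape_rows(page))
```

The scraper only knows cell positions, not meaning. If the site owner swaps the two columns, the code still runs but the price is now in the first slot, which is exactly the kind of breakage you sign up for when you read a view instead of the data behind it.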
EDIT: I read the rest of your post and see that you addressed much of this already. I still maintain that this is already how people who want others to use their data are working.