I didn't mean it as "Hey GPT: format the following text neatly"; rather, it could learn to produce a static program that takes ugly-looking pages and formats them to look professionally published. It wouldn't really be operating on the text, but instead changing font size, column width, and other CSS-type variables.
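To make the "static program" idea concrete, here's a minimal sketch (my own hypothetical names, not from the thread): a function that never touches the article text, and only emits a CSS override normalizing the variables mentioned above.

```python
# Hypothetical sketch of the kind of static program the model might
# learn to emit: it produces a CSS override that reflows any page into
# a readable column, without ever reading the page's text.

def readable_css(base_font_px=18, column_width_ch=70, line_height=1.6):
    """Emit a stylesheet that normalizes font size and column width."""
    return f"""
body {{
  font-size: {base_font_px}px;
  line-height: {line_height};
  max-width: {column_width_ch}ch;
  margin: 0 auto;
  font-family: Georgia, serif;
}}
/* hide the usual clutter; selectors here are guesses, not a real list */
aside, nav, iframe, [class*="promo"], [id*="ad-"] {{ display: none; }}
img {{ max-width: 100%; height: auto; }}
""".strip()

print(readable_css())
```

The point of the sketch is that the model's output is a reusable artifact (a stylesheet), not a per-page text transformation.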
That latter thing is a much harder cat-and-mouse game, though. Is the proposal to use the rendered output as feedback to help the LLM converge faster than the people already making de-shittified UIs? If you can't use the actual content of the page to assist the AI, then that seems like a very hard problem.
How do you deal with the text being distributed across many JS files, half of it only delivered over the wire after you click a button, "text" being displayed as nested divs to generate some sort of formatting (especially when the HTML for such a monstrosity doesn't fit in a single context window), ...?
You could just render it as if it were being shown to a person. That's where the fuzzy logic of AI would come in. It could be trained to identify what an article looks like, based on the layout, size, and other inferences (that it figures out, hence AI).
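A trained model would learn these layout features itself, but a crude stand-in for "what an article looks like" can be sketched with plain text density — the element that accumulates the most raw text is probably the article. This is my own toy heuristic, not anyone's actual pipeline:

```python
from html.parser import HTMLParser

class DensestBlock(HTMLParser):
    """Toy 'article detector': score each tag path by how much raw
    text lands under it. A real model would replace this score with
    learned layout features (position, font size, link density, ...)."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.scores = {}

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in self.stack:
            # pop back to (and including) the matching open tag
            while self.stack and self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if text and self.stack:
            path = "/".join(self.stack)
            self.scores[path] = self.scores.get(path, 0) + len(text)

def guess_article(html):
    p = DensestBlock()
    p.feed(html)
    return max(p.scores, key=p.scores.get) if p.scores else None

page = """
<html><body>
  <nav><a href="/">Home</a><a href="/x">More</a></nav>
  <div class="article"><p>Long paragraph of real content, the kind a
  reader actually came for, with enough words to dominate.</p></div>
  <footer>tiny footer</footer>
</body></html>
"""
print(guess_article(page))  # -> html/body/div/p
```

It works here because the article paragraph dwarfs the nav links and footer; real pages defeat this instantly, which is exactly why you'd want learned inference instead.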
For a while now, I've thought about outsourcing this to a third-world sweatshop. Basically, pay people to click on the scummiest ad-loaded pages all day, saving copies of just the content and re-hosting it as, say, web 1.0 content: text and pictures, nothing more. Whether they used copy-paste or just "save page" and then piped it through another program, who cares. Just extract the content and host it as web 1.0 that would load super fast, but maybe keep the same font and formatting as the original.
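The "pipe it through another program" step can be sketched in a few lines: re-emit a saved page as bare web 1.0 HTML, keeping only text-bearing tags and images and dropping scripts and scaffolding. Tag lists here are illustrative guesses, and the whitespace handling is deliberately crude:

```python
from html.parser import HTMLParser

# Illustrative tag lists, not a definitive readability algorithm.
KEEP = {"p", "h1", "h2", "h3", "li", "ul", "ol", "blockquote", "a", "img"}
DROP = {"script", "style", "nav", "aside", "footer", "iframe"}

class Web10(HTMLParser):
    """Re-emit a page as plain 'web 1.0' HTML: text and pictures,
    nothing more. Everything under a DROP tag is skipped entirely."""

    def __init__(self):
        super().__init__()
        self.out = []
        self.skip = 0  # depth inside dropped subtrees

    def handle_starttag(self, tag, attrs):
        if tag in DROP:
            self.skip += 1
        elif self.skip == 0 and tag in KEEP:
            a = dict(attrs)
            if tag == "img":
                self.out.append(f'<img src="{a.get("src", "")}">')
            elif tag == "a":
                self.out.append(f'<a href="{a.get("href", "")}">')
            else:
                self.out.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in DROP and self.skip:
            self.skip -= 1
        elif self.skip == 0 and tag in KEEP and tag != "img":
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip == 0 and data.strip():
            self.out.append(data.strip())

def simplify(html):
    p = Web10()
    p.feed(html)
    return "<html><body>" + "".join(p.out) + "</body></html>"
```

The output loads fast precisely because nothing interactive survives; whether a human or a script did the clicking upstream doesn't matter to this step.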