> What I like about your comment is that it points out that all technical work-arounds are moot if people as a whole are not willing to stand up with pitchforks and torches to defend their freedoms.
If your system requires extraordinary political efforts from large numbers of people, your system will fail. We are the elites, we have to oppose this. If Netflix asks us to implement this kind of DRM, we have to resign. If Facebook asks us to implement sophisticated surveillance, we have to resign. Etc. etc. We can't keep cashing the checks and then point to the body politic like "I beg you to stop me".
This is because whatever Google tech is checking for this stuff is too simple, right? Or, maybe it's too aggressive because of music industry contracts?
Am I right that, while we can't have SQLite because there's only 1 implementation, we can have XSLT even though there's only 1--unmaintained--implementation?
Nah, libxslt is a spinoff of Expat. At the very least (and Mozilla maintains its own XSLT library), there's a full implementation by the standard writer called Saxon[0].
I assume you're referring to Web SQL? As I understand it, the argument against isn't just "there's only 1 implementation," it's "there's no standard and there's only 1 implementation," so the standard would have to devolve to "whatever that 1 implementation does."
As others have pointed out there are multiple implementations of XSLT, but I’d also argue that this situation seems like a decent argument in favor of that policy. If everyone is using a single implementation then in practice that implementation is the standard and things like Hyrum’s law become serious considerations.
XSLT is grandfathered in from the early days of the web, and while it’s turned out better than Microsoft exposing random COM interfaces which even Windows developers hated it’s still something of a cautionary example of a feature which never really caught on but browser developers have to support decades later or be willing to break a modest number of sites, some relatively important in particular niches like government information. I think of what happened with WebSQL as a reaction to the maintenance costs of a decade earlier.
Isn't the situation essentially the opposite? We apparently can't have it in the standard just because Google don't want to maintain the specific implementation they have chosen for their browser.
That's at least not something you can accuse XSLT 1.0 of. Like most parts of the old "XML stack", XSLT 1.0 has ample implementations in Xalan/C, Xalan/J, Saxon, libxslt, MSXML, to name only mainstream ones. And the portability for XSLT 1.0 is almost perfect/gives identical results (up to DOM equivalency, e.g. attribute ordering, and even beyond) in my experience.
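To make "identical up to DOM equivalency" concrete, here is a minimal sketch of one way to compare the output of two processors while ignoring attribute order. It assumes Python 3.8+ and uses the standard library's C14N 2.0 canonicalizer; the helper name `dom_equivalent` and the two sample outputs are hypothetical, not taken from any of the processors named above.

```python
from xml.etree.ElementTree import canonicalize


def dom_equivalent(xml_a: str, xml_b: str) -> bool:
    """Return True if two XML strings are equal after C14N 2.0 canonicalization."""
    # Canonicalization sorts attributes and expands empty-element tags,
    # so differences in attribute order or <a/> vs <a></a> disappear.
    # strip_text=True also trims whitespace around text content, which is
    # an assumption about what should count as "equivalent" here.
    return canonicalize(xml_a, strip_text=True) == canonicalize(xml_b, strip_text=True)


# Hypothetical outputs from two different XSLT processors.
out_a = '<row id="1" name="x"/>'
out_b = '<row name="x" id="1"></row>'
print(dom_equivalent(out_a, out_b))
```

This only covers serialization differences; it says nothing about whether two processors agree on the transformation itself.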
XSLT 2.x/3.y however, while still a "W3C recommendation", violates (or had violated for the longest time) W3C's own policy of at least two interworking implementations to reach "recommendation" stage, and is authored by the vendor of the single XSLT 2.0/3.0 product, which used to be a problem I pointed out several times. Of course, nobody cares about W3C, Inc. anymore, precisely because of those pay-as-you-go and other self-serving policies among other things.
Just for the record, Xalan-C is even less maintained than libxslt. It had no releases for over a decade, and I made a final 1.12 release in 2020 adding CMake support, since the existing builds had bitrotted significantly, along with a number of outstanding bugfixes.
It's a great shame we are now in a situation where there is only a single proprietary implementation of the very latest version of the standard, but even the open-source 1.x implementations are fading fast. These technologies have fallen out of favour, and the size and complexity of the standards is such that it's a non-trivial undertaking to keep them maintained or create a modern reimplementation.
> And the portability for XSLT 1.0 is almost perfect/gives identical results (up to DOM equivalency, e.g. attribute ordering, and even beyond) in my experience.
Not my experience: they all have different ideas of what the current node is at any one point in the execution of a script.
We actually have three distinct one-of-one libraries.
There's no standard, so there's no way to really evaluate how standards-compliant they are.
It seems like the idea is for the Turso projects to be compatible with sqlite, but it's not clear exactly what that means.
As a fork, libsql could be kept reasonably backward compatible with recent versions of sqlite by keeping up with merging changes, and avoiding extending sql in conflicting ways. That seems doable if they keep up with the merges, though mainly because they share very large parts of the implementation, so it's not clear it counts as a separate implementation from a web standards perspective.
Turso seems like a reimplementation and has a while to go before it achieves some level of compatibility. It probably needs to be much further along before we can really even evaluate it.
What are you talking about? There are many maintained implementations of XSLT at various levels of compliance and versions.
The problem is that libxslt is built on top of libxml, and libxml is used for a bunch of stuff throughout browsers, etc. It's also a C implementation, which most others aren't. Actually, I say most, but I'm not sure there is a C implementation other than libxslt.
> I say the tools are better at reading than writing.
No way, models are much, much better at writing code than giving you true and correct information. The failure modes are also a lot easier to spot when writing code: it doesn't compile, tests got skipped, it doesn't run right, etc. If Claude Code gave you incorrect information about a system, the only way to verify is to build a pretty good understanding of that system yourself. And because you've incurred a huge debt here, whoever's building that understanding is going to take much more time to do it.
Until LLMs get way closer (not entirely) to 100%, there's always gonna have to be a human in the loop who understands the code. So, in addition to the above issue you've now got a tradeoff: do you want that human to be able to manage multiple code bases but have to come up to speed on a specific one whenever intervention is necessary, or do you want them to be able to quickly intervene but only in 1 code base?
More broadly, you've also now got a human resource problem. Software engineering is pretty different than monitoring LLMs: most people get into it because they like writing code. You need software experts in the loop, but when the LLMs take the "fun" part for themselves, most SWEs are no longer interested. Thus, you're left with a small subset of an already pretty small group.
Apologists will point out that LLMs are a lot better in strongly typed languages, in code bases with lots of tests, and using language servers, MCP, etc, for their actions. You can imagine more investments and tech here. The downside is models have to work much, much harder in this environment, and you still need a software expert because the failure modes are far more obscure now that your process has obviated the simple stuff. You've solved the "slop" problem, but now you've got a "we have to spend a lot more money on LLMs and a lot more money on a rare type of expert to monitor them" problem.
---
I think what's gonna happen is a division of workflows. The LLM workflows will be cheap and shabby: they'll be black boxes, you'll have to pull the lever over and over again until it does what you want, you'll build no personal skills (because lever pulling isn't a skill), practically all of your revenue--and your most profitable ideas--will go to your rapacious underlying service providers, and you'll have no recourse when anything bad happens.
The good workflows will be bespoke and way more expensive. They'll almost always work, there will be SLAs for when they don't, you'll have (at least some) rights when you use them, they'll empower and enrich you, and you'll have a human to talk to about any of it at reasonable times.
I think the jury's out on whether or not this is bad. I'm sympathetic to the "an LLM brain may be better than no brain" argument, but that's hugely contingent on how expensive LLMs actually end up being and any deleterious effects of outsourcing core human cognition to LLMs.
Coding is how I build a sufficiently deep understanding of the problem space--there's no separating coding and understanding for me. I acknowledge there's different ways of working (and I imagine this is one of the reasons a lot of people think they get a lot more value out of LLMs than I do), but like, having Cursor crank code out for me actually slows me down. I have to read all the stuff it does so I can coach it into doing better, and also use its work to build a good mental model of the problem, and all that takes longer than writing the code myself.
Well, actually there could be a separate step: understanding is done during and after gathering requirements, before and while writing specifications. Only then are specifications turned into code.
But almost no-one really works like that, and those three separate steps are often done ad-hoc, by the same person, right when the fingers hit the keys.
I can use those processes to understand things at a high level, but when those processes become detailed enough to give me the same level of understanding as coding, they're functionally code. I used to work in aerospace, and this is the work systems engineers are doing, and their output is extremely detailed--practically to the level of code. There's downsides of course, but the division of labor is nice because they don't need to like, decide algorithms or factoring exactly, and I don't need to be like, "hmm this... might fail? should there be a retry? what about watchdog blah blah".
> Well, actually there could be a separate step: understanding is done during and after gathering requirements, before and while writing specifications. Only then are specifications turned into code.
The promise of coding AI is that it can maybe automate that last step so more intelligent humans can actually have time to focus on the more important first parts.
Waterfall is a caricature, a straw man process where you can never ever go back to the drawing board and change the requirements or specifications. The defining characteristic is the part where you design up front, can never go back, and really, really have to do everything in strict order for the whole of the project.
Just having requirements and a specification isn't necessarily waterfall. Almost all agile processes at least have requirements, and the more formal ones also have specifications. You just do it more than once in a project, like once per sprint, story or whatever.
Waterfall certainly has processes for going back and adjusting previous steps after learning things later in the process. The design was updated if something didn’t work out during implementation, and of course implementation was changed after errors were found during testing.
Now that agile practitioners have learned that requirements and upfront design are actually helpful, the only difference seems to be that the loops are tighter. That might not have been possible earlier without proper version control, without automated tests, and with the software being delivered on solid media. A tight feedback loop is harder when someone has to travel to your customer and sit down at their machines to do any updates.
That thinking and understanding can be done before coding begins, but I think we need to understand the potential implementation layer well in order to spec the product or service in the first place.
My feeling is that software developers will end up needing to work in this type of technical consultant role once LLM dominance has been universally accepted.
> For a large LLM I think the science in the end will demonstrate that verbatim reproduction is not coming from verbatim recording
We don't need all this (seemingly pretty good) analysis. We already know what everyone thinks: no relevant AI company has had their codebase or other IP scraped by AI bots they don't control, and there's no way they'd allow that to happen, because they don't want an AI bot they don't control to reproduce their IP without constraint. But they'll turn right around and be like, "for the sake of the future, we have to ingest all data... except no one can ingest our data, of course". :rolleyes:
People are gonna pretty quickly quit paying for AI--we're well into the "let me see what everyone's talking about" phase and that'll wear off soon. The price is already skyrocketing way ahead of quality or utility, so that'll accelerate the decline. Businesses incorporating AI into their products will scale that back as costs increase, or as they replace the most commonly used functionality with purpose-built code.
The real question is how do we continue the grift? AI's a huge, economy-sustaining bubble, and there's currently no off-ramp. My guess is we'll rebrand ML: it's basically AI, it actually works, and it can use video cards.
AI is a great feature funnel in terms of like, "what workflows are people dumping into AI that we can write purpose-built code for", but it has to transition from revenue generator to loss leader. The enormity of the bubble has made this very difficult, but I have faith in us.