> Recursive descent is fine if you trust that you won't write buggy code. If you implement a generator for it (easy enough), this may be a justifiable thing to trust (though this is not a given).
The idea that you're going to hand-roll a parser generator and then use that to generate a parser and the result is going to be less buggy than just hand-rolling a recursive descent parser, screams "I've never written code outside of an academic context".
Sure, if you need parsers in a dozen languages, then a parser generator might make sense, because you're not writing one parser, you're writing a dozen.
But, the vast majority of parsers I've written didn't have this requirement. I needed to write one parser in one language.
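To make the terms concrete, here is a minimal hand-rolled recursive descent parser for a toy arithmetic grammar (a hypothetical sketch for illustration, not code from any project discussed here):

```python
import re

# Toy grammar:
#   expr := term (('+' | '-') term)*
#   term := NUMBER | '(' expr ')'
def tokenize(src):
    return re.findall(r"\d+|[()+\-]", src)

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def eat(self, tok=None):
        cur = self.peek()
        if cur is None or (tok is not None and cur != tok):
            raise SyntaxError(f"expected {tok!r}, got {cur!r}")
        self.pos += 1
        return cur

    def expr(self):
        # One function per grammar rule: that's all recursive descent is.
        value = self.term()
        while self.peek() in ("+", "-"):
            op = self.eat()
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):
        if self.peek() == "(":
            self.eat("(")
            value = self.expr()
            self.eat(")")
            return value
        return int(self.eat())

def evaluate(src):
    return Parser(tokenize(src)).expr()
```

Each grammar rule becomes one plain function, which is why this style is easy to debug and hard to justify replacing with generated code for a one-off parser.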
> The SQL language parser for SQLite is generated using a code-generator program called "Lemon".
> ...
> Lemon was originally written by D. Richard Hipp (also the creator of SQLite) while he was in graduate school at Duke University between 1987 and 1992.
SQLite is kind of cheating here; you won't catch me writing my own source control management system either.
But I do think the wider point is still true, that there can be real benefit to implementing 2 proper layered abstractions rather than implementing 1 broader abstraction where the complexity can span across more of the problem domain.
Yeah, let me know when you're writing the next SQLite. For your average parser, you're not writing the SQLite parser, you don't have the SQLite parser's problems, and you don't need SQLite's solutions.
Most people aren't writing something as complex as SQLite, but most people aren't writing parsers either. Those writing parsers are disproportionately writing things like programming languages and language servers that are quite complex.
SQLite isn't some kind of universal template; I'm not saying people should copy it or that recursive descent is a bad choice. But empirically, parser generators are used in real production systems. SQLite is unusual in that they also wrote the parser generator, but otherwise it's in good company. Postgres uses Bison, for example.
Additionally, I think the fact that Lemon started as a personal learning project in grad school (as academic a project as it gets) and evolved into a component of what is probably the most widely deployed software system of all time shows that this distinction between what is academic and what is practical isn't all that meaningful to begin with. What's academic becomes practical when the circumstances are right. Better to evaluate a technique in the context of your problem than to prematurely bin things into artificial categories.
> Those writing parsers are disproportionately writing things like programming languages and language servers that are quite complex.
Sure, but adding the complexity of a parser generator doesn't help with that complexity in most cases.
[General purpose] programming languages are a quintessential example. Yes, a compiler or an interpreter is a very complex program. But unless your programming language needs to be parsed in multiple languages, you definitely do not need to generate the parser in many languages like SQLite does. That just adds complexity for no reason.
You can't just say "it's complex, therefore it needs a parser generator" if adding the parser generator doesn't address the complexity in any way.
I'm not saying everyone or even anyone in particular needs a parser generator. I'm saying real, widely deployed projects - as far as you can get from academic - empirically find it useful.
Creating abstractions does decrease complexity if one (or more) of the following is true:
- The abstraction generates savings in excess of its own complexity
- The abstraction is shared by enough projects to amortize the cost of writing/maintaining it to a tolerable level
- There are additional benefits like validating your grammars are unambiguous or generating flow charts of your syntax in your documentation, amortizing the cost across different features of the same project
It's up to you as the implementer to weigh the benefits and costs. If you choose to use recursive descent, more power to you. (For what it's worth, I personally use parser combinators to split the difference between writing grammars and hand-rolling parsers. But I've used parser generators before and found them helpful.)
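For anyone unfamiliar with the parser-combinator middle ground, the idea can be sketched in a few lines (a hypothetical toy, not any particular combinator library):

```python
# A parser is a function: (text, pos) -> (value, new_pos), or None on failure.
def char(c):
    def parse(text, pos):
        if pos < len(text) and text[pos] == c:
            return c, pos + 1
        return None
    return parse

def number():
    def parse(text, pos):
        end = pos
        while end < len(text) and text[end].isdigit():
            end += 1
        if end == pos:
            return None
        return int(text[pos:end]), end
    return parse

def seq(*parsers):
    # Run parsers one after another; fail if any fails.
    def parse(text, pos):
        values = []
        for p in parsers:
            result = p(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos
    return parse

def alt(*parsers):
    # Try alternatives in order; return the first success.
    def parse(text, pos):
        for p in parsers:
            result = p(text, pos)
            if result is not None:
                return result
        return None
    return parse

# The grammar reads almost like BNF:  pair := number ',' number
pair = seq(number(), char(","), number())
```

You still write ordinary host-language code (like recursive descent), but the grammar stays visible and declarative (like a generator's input), which is the trade-off being described.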
I agree, I just don't understand why you seem to think this is a correction to anything I said.
If your goal is simply to reduce bugs--not something more complex like generating parsers in a bunch of languages--then hand rolling a parser generator and then using it to generate your parser [singular] is not a path to achieving your goals. That's what I said, and that's actually just true, which you probably know.
This is not an invitation to bring up irrelevant, exceptional cases, it's the rule of thumb you should operate on. Put another way, don't add layers when there isn't a reason to do so. If there is a reason to do so, have at it. Obviously.
In a meta sense, it's pretty socially inept to jump in with corrections like this. In a complex field like programming, of course there are exceptions, and it's disrespectful to the group of professionals in the room to assume that they don't know about the exceptions. I'm guilty of this myself: it's because I was brought up being praised for knowing things, so I want to demonstrate that I know things. But as an adult, I had to learn that I'm not the only knowledgeable person in the room, and it's rude to assume that I am.
The only time I have used this myself was an expat-style transformer for Terraform (HCL). We had a lot of Terraform and they kept changing the language, so I would build a fixer to make code written for, say, 0.10 work with 0.12, and then again for 0.14. It was very fun and let us keep updating to newer Terraform versions. Pretty simple language, except for distinguishing quoted blocks from non-quoted.
> The only time I have used this myself was an expat-style transformer for Terraform (HCL). We had a lot of Terraform and they kept changing the language, so I would build a fixer to make code written for, say, 0.10 work with 0.12, and then again for 0.14. It was very fun and let us keep updating to newer Terraform versions. Pretty simple language, except for distinguishing quoted blocks from non-quoted.
I hear stories like this and I just wonder how we got here. Like, did this work provide any monetary value to anyone? It sounds like your team just got way too lost in the abstractions and forgot that they were supposed to make a product that did something, ostensibly something that makes money.
I mean, I guess if you can persuade people to give you money to do something, it's profitable. :shrug:
> The idea that you're going to hand-roll a parser generator and then use that to generate a parser and the result is going to be less buggy than just hand-rolling a recursive descent parser, screams "I've never written code outside of an academic context"
Your comment is quite funny as hand-rolling a recursive descent parser is the kind of thing that is often accused of being a) bug-prone, b) only done in academic environments.
What? Accused of only being done in academic environments? Never heard that. Academics seem to spend 99% of their time talking about parser generators and LR parsing for some reason while most production compilers have handwritten recursive descent parsers...
Having written several parser generators, all my production parsers are hand-written - either pure recursive descent or a combination of recursive descent and operator precedence parsing.
The reason there are so many parser generators is largely that we keep desperately looking for a way of writing one that isn't sheer pain in production use.
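For reference, the recursive-descent-plus-operator-precedence combination usually means something like precedence climbing for the expression part (a hedged sketch, not the commenter's actual code):

```python
# Precedence climbing for binary operators, grafted onto a trivial
# recursive descent "primary" rule (hypothetical toy example).
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}

def parse_expr(tokens, min_prec=1):
    # tokens is a list consumed in place via pop(0), for brevity only
    lhs = parse_primary(tokens)
    while tokens and tokens[0] in PRECEDENCE and PRECEDENCE[tokens[0]] >= min_prec:
        op = tokens.pop(0)
        # Left-associative: the right operand binds at one level tighter.
        rhs = parse_expr(tokens, PRECEDENCE[op] + 1)
        lhs = (op, lhs, rhs)
    return lhs

def parse_primary(tokens):
    tok = tokens.pop(0)
    if tok == "(":
        inner = parse_expr(tokens)
        assert tokens.pop(0) == ")", "expected ')'"
        return inner
    return int(tok)
```

The precedence table replaces a tower of one-rule-per-precedence-level functions, which is the usual reason for mixing the two techniques.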
> SPA is not only about seamless transitions but also being able to encapsulate a lot of user journey on the client side, without the need of bothering server too much.
True, but as a user, I don't want you encapsulating my journey. You can wax poetic about hypothetical book categories, but the reality of SPAs is that they break back buttons, have terrible accessibility anti-patterns, attempt to control my attention, and expose my computer to all your bad security practices. SPAs usually contain half-assed implementations of half the features that ship standard in a modern browser, and the only motivation for all that effort is to make the site more favorable to the server's owner.
When a site is implemented with simple HTML and careful CSS, I can configure it to my needs in the browser quite easily. That's a site that favors me, not your opaque blob nonsense.
> Like most "CIA coups", the role the CIA played in Chile is more of a "hey let's help this guy who is already planning a coup" and if you dig into the details, it raises the question if the CIA had done nothing whether the outcome would have changed at all.
Helping a fascist coup is bad, even if the fascist coup didn't need your help.
Is it worse if the alternative is another authoritarian?
It wasn't a choice between democracy and a fascist (Allende was going regardless); it was a choice between a US-friendly authoritarian and a USSR-friendly authoritarian.
This is a nice summary of the situation in Chile at the time, the actors involved (domestic and international) and the role of the CIA.
To get a sense of the CIA’s role, they didn’t even think Pinochet had it in him - they had others pegged as the coup leader. They were surprised to find out it was Pinochet.
No. Non sequitur. If I say Pinochet is bad, saying "Allende is bad too" is a completely nonsensical comparison, as they aren't even remotely similar and the things people are claiming are bad aren't even in the same categories.
It would be like someone saying "Kim Jong Un is a horrific leader" and then another person responding with "yeah well my home owner's association president is terrible too." Just a complete non sequitur.
It's really not complicated. Don't support people who murder, torture, and expel people from their homes.
"The other guy was worse," is factually pretty off base. You can pick and choose sources all you want, but the fact is that Allende was elected as democratically as any US president in the last few decades: the idea that foreign interference invalidates an election is pretty specious. And even if you want to call Allende a dictator, he's definitely a better dictator than Pinochet: he killed far, far fewer innocent people. I give zero fucks about US-friendly vs. USSR-friendly in this case: if the US friendly dictator kills hundreds of thousands of people and the USSR friendly one doesn't, the USSR friendly one is better.
Let me make this clear: if you choose capitalism over preventing mass-murder, your morality is screwed up.
And even if somehow Allende was worse (which again, is not true), that doesn't make supporting Pinochet morally right. Most 5 year olds know two wrongs don't make a right.
> To get a sense of the CIA’s role, they didn’t even think Pinochet had it in him - they had others pegged as the coup leader. They were surprised to find out it was Pinochet.
If your argument is that the CIA was incompetent, that doesn't look much better for them.
> As if the money was for the strikers and not just their leaders...
So you're claiming the leaders were able to convince 250,000 union members to strike because... the leaders wanted it? That makes no sense.
As laid out in the article, the truckers were already upset at the government undermining their entire industry. They didn't need $28 USD to convince them to strike.
It's wild how people think the CIA, with a few million dollars, can convince an otherwise stable democratic nation to overthrow its leader in a coup.
As the article lays out, the CIA were mostly observers who tossed a bit of money to opposition parties. It's questionable if the CIA had any impact at all considering they weren't backing Pinochet himself, and the timing of the coup caught them by surprise. It's pretty clear they weren't very plugged in to what was happening.
> Is it worse if the alternative is another authoritarian?
Yes, the USA shouldn't be meddling in the domestic affairs of other countries to action its proxy cold war against a rival super power.
I acknowledge that the USA determined this was a correct course of action in order to strengthen its hegemony, and the hegemony of global capitalism, however it was still unethical and in opposition to the needs of people in the USA.
But if the USSR is already meddling, it's no longer purely “domestic affairs”, is it?
If your take is that it’s unethical, that’s fine, but you need to consider the alternative - giving the USSR free rein to meddle in the domestic politics of the Southern hemisphere. The citizens of those countries end up living under an authoritarian anyways.
I’m not saying it isn’t an ugly business, but I’m not sure the alternative is much better.
I also believe it's unethical for the USSR to meddle. I don't think two wrongs make a right. Also, let's not be naive and pretend like the USA supported Pinochet out of the goodness of the CIA's heart - it was absolutely to use the country as a pawn in the country's cold war against the USSR.
First, I'll answer the post hoc aspect: The USA did meddle, and was that good? In the case of Pinochet, no, because he was a brutal authoritarian and was obviously the worse alternative to the leftist (not even communist) government he overthrew. Also, if the people voted for communism, then that's self-determination; let them have it. If it works, it works; if it doesn't, it doesn't; that's no business of America's. A military coup is "might makes right," an unethical ideology. So if we compare the two forms of meddling, the USSR's was actually more ethical, since it was aligned with the will of the people. Overall, though, I still think neither country should have meddled.
What should have been done instead? If the USSR is meddling, the USA as a nation state should do nothing more than leverage its platform to expose any instances of meddling, especially those against the will of the people (e.g. fraudulent votes). The people in the USA are a different thing entirely; if I knew what direct action people could take to resist nation-state meddling entirely, I'd write it here. Since I don't, I'll just say the usual: form subversive relationships with neighbors in opposition to authority, mutual aid in opposition to capital-derived infrastructure, mutual education, mutual bonds.
As for Hitler, who also rose to power undemocratically I might add (Reichstag fire and the like), he was committing a genocide, any and all means to stop that is ethical, including full invasion by other nation states. On the other hand, I can't think of an ethical way for a nation state to prevent him coming to power. After all, at the time, I'm not sure it was possible to predict what he was about to do - an anti-semitic politician wasn't exactly groundbreaking, and nobody had ever seen a Holocaust before. If Germany can't prevent itself from becoming a fascist hellhole I don't really see America's responsibility there other than to offer safe haven to any fleeing Jewish people, gays, trans people, communists, etc. Since time machines don't exist, I can't think of an ethical justification for USA meddling in Germany pre-Holocaust or pre-invasion of Poland.
What do you think? I think an interesting question is, "what is ethical and allowed if Hitler 2 arrived today and began seeking power?" Such questions could have interesting answers depending on what you think America should be allowed to do to the current person and nation conducting a genocide, Netanyahu in Israel.
> It sounds like you’re backing away from meddling is always bad?
No, they just never said that in the first place. What they said was, "Yes, the USA shouldn't be meddling in the domestic affairs of other countries *to action its proxy cold war against a rival super power*." Emphasis mine.
It is insane that this is downvoted. You have to be wrong in the head to think that a country helping a coup that clearly damaged another country is a good thing.
I think it's even worse than that. The CIA simply was not concerned with the well-being of the Chilean people, seeking only to further US cold war interests no matter how many people it killed.
This is only bewildering to people who refuse to admit the problems of our current economic system because our current economic system benefits them.
Advertising needs to go. Advertising is why worse products at higher prices beat out better products at lower prices. Advertising isn't information, it's lies: nobody tells you the problems with their product or things their competitor does better. We don't need advertising to find out about products: word of mouth, experts, and independent review sites are much better sources of information already. And it's a huge drain on our economy: once you let one company advertise, then advertising is no longer optional for all their competitors.
Advertisers of HN will surely refuse to admit these pretty basic, obvious facts, use their advertising platforms to make sure pro-advertising talking points are louder than reason, and the enshittification of everything will continue.
I think fundamentally, no parsing/normalizing library can be effective for addresses. A much better approach is to have a search library which finds the address you're looking for within a dataset of all the addresses in the world.
Addresses are fundamentally unstructured data. You can't validate them structurally. It's trivial to create nonexistent addresses which any parsing library will parse just fine. On the flipside, there's enough variety in real addresses that your parser has to be extremely tolerant in what it accepts--so tolerant that it basically tolerates everything. The entire purpose of a parser for addresses is to reject invalid addresses, so if your parser tolerates everything it's pointless.
The only validation that makes any sense is "does this address exist in the real world?". And the way to do that is not parsing, it's by comparing to a dataset of all the addresses in the world.
I haven't evaluated this project enough to understand confidently what they're doing, but I hope they're approaching this as a search engine for address datasets, and not as a parsing/normalizing library.
And keeping such datasets up to date is another matter entirely, because clearly a lot of companies rely on datasets that were outdated before their company even existed.
A trivially simple example of just how messy this is when people try to constrain it is that it's nearly random whether or not a given carrier would insist on me giving an incorrect address for my previous place, seemingly because traditionally and prior to 1965 the address was in Surrey, England.
The "postcode area name" for my old house is Croydon, and Croydon has legally been in London since 1965, and was allocated its own postcode area in 1966. "Surrey" hasn't been correct for addresses in Croydon since then.
But at least one delivery company insisted my old address was invalid unless I changed the town/postcode area to "Surrey", and refused to even attempt a delivery. Never mind they had my house number and postcode, which was sufficient to uniquely identify my house.
Agreed. Keeping an up-to-date dataset of addresses is enormously hard. It's impossible to do perfectly, and only a few companies are capable of doing it passably, while the rest of us have no choice but to buy from them.
But notably, to validate a parser/normalizer, you need this dataset anyway, so creating a parser/normalizer isn't even saving you that work. It's just giving you a worse result for more work.
I know that you are equating things that are not equatable, since I have personally been affected by businesses relying on "datasets" to claim that my real world address, which definitely existed, did not exist. Data is not the same as reality.
> I know that you are equating things that are not equatable, since I have personally been affected by businesses relying on "datasets" to claim that my real world address, which definitely existed, did not exist.
It sounds like people at those businesses equated a dataset to the real world, not me. You're an adult, direct your frustrations appropriately.
> Data is not the same as reality.
That glosses over a lot of nuance.
Obviously, no dataset perfectly represents reality. But, this fact is often used to dismiss data entirely, resulting in people making decisions with absolutely no evidence whatsoever.
An appropriate use of an address database might be: when the user enters an address not in the database, do a fuzzy search and suggest the best match you can find, asking "Did you mean X?" At that point, if the user says, "No, I really meant what I put in," then you accept the data they gave you. This catches most mistakes while allowing users to put in addresses that aren't in your dataset.
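That flow can be sketched with nothing but stdlib fuzzy matching (a toy illustration; a real system would query a large, regularly refreshed address dataset with a smarter similarity metric):

```python
import difflib

# Toy "dataset of all addresses" -- in practice a maintained database,
# not a hardcoded list. These addresses are made up.
KNOWN_ADDRESSES = [
    "12 High Street, Croydon, CR0 1AA",
    "14 High Street, Croydon, CR0 1AA",
    "3 Station Road, Bromley, BR1 1AA",
]

def suggest(user_input):
    """Return the closest known address, or None if nothing is close."""
    matches = difflib.get_close_matches(user_input, KNOWN_ADDRESSES,
                                        n=1, cutoff=0.6)
    return matches[0] if matches else None

def validate(user_input, ask):
    """ask(question) -> bool plays the role of the 'Did you mean X?' prompt."""
    if user_input in KNOWN_ADDRESSES:
        return user_input
    match = suggest(user_input)
    if match and ask(f"Did you mean {match!r}?"):
        return match
    # User insists: accept their input even though the dataset lacks it.
    return user_input
```

The key property is the last line: the dataset catches typos, but a missing entry never blocks a real address the way the delivery company's validator did.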
I just don't understand why you'd want this, or any of the JS server-side nonsense. WebAssembly is designed for the web--why would you want to use it outside the browser, when there are dozens of much more mature frameworks out there that were designed to work on your machine?
It seems to me like a lot of the JS outside the browser stuff out there is motivated by JS people not wanting to learn something different. Meanwhile for those of us who have been doing dev outside the browser, all this is worse solutions to problems we've already got solutions for.
> WebAssembly (abbreviated Wasm) is a binary instruction format for a stack-based virtual machine. Wasm is designed as a portable compilation target for programming languages, enabling deployment on the web for client and server applications.
While the primary target for WebAssembly is indeed the web there are surely plenty of applications for a cross-platform stack-based virtual machine.
> It seems to me like a lot of the JS outside the browser stuff out there is motivated by JS people not wanting to learn something different.
Well that certainly can't be the case here because WebAssembly specifically isn't JavaScript. It's a well specified, reliable platform for different languages to target for cross-platform execution. Is that really so bad? To turn it on its head, if you're going to build a cross-platform application framework, why not use WebAssembly, aside from the fact that it has "web" in the name?
Sandboxing 3rd party code inside existing applications for example. Like plugins for Photoshop-like applications or game mods. Portability is also a big plus for this kind of thing too.
Agreed. Wasm lets you just revoke network access from things that don't need it. I know you can also do that with Docker, but it's an awfully big hammer for the job.
Yeah exactly, it is kinda insane how deep open source tech stacks have gone.
We need a world where we can seamlessly synchronously (no IPC, no network) call a function from a 3rd party and be sure it doesn't make any network requests, read the disk (or any other type of I/O) or read non-assigned RAM.
If we could do that the amount of scrutiny you need for your 3rd party code goes down massively. WASM is a step to get there.
I think what you're saying is that we need a world where more things are built out of pure functions with explicit inputs. And I agree.
Wasm lets you take something that is coincidentally pure and run it in a way that ensures that it'll stay that way even for novel inputs, but you're still treating it like a black box... Just a particularly well behaved one.
Sometimes you need that because you're dealing with somebody else's code and they weren't especially interested in providing the kinds of assurances that you'd like to have. So I agree, it is a step in the right direction.
But if you want to author code that gives users such assurances, building to wasm is a rather small step. There's only so much that can be done at the bytecode level. Much more can be done at a language level (Unison comes to mind, which goes so far as to eliminate the function's name as a source of works-differently-on-my-machine, though Haskell is probably a more common choice).
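One language-level pattern for this is capability passing: a function receives its I/O abilities as explicit parameters, so callers can hand it inert stubs and know it can't do hidden I/O (a hypothetical Python sketch; languages like Unison and Haskell enforce this far more rigorously than Python can):

```python
def fetch_denied(url):
    """Inert default capability: any hidden network dependency fails loudly."""
    raise PermissionError(f"network access denied: {url}")

def enrich(records, fetch=fetch_denied):
    """Hypothetical third-party-style function. All I/O goes through the
    'fetch' capability the caller explicitly passes in; there is no
    ambient socket or filesystem access to audit."""
    return [{"id": r, "extra": fetch(f"https://example.invalid/{r}")}
            for r in records]

# The caller grants a harmless in-memory capability instead of real network:
def fake_fetch(url):
    return "stub-data"

enriched = enrich([1, 2], fetch=fake_fetch)
```

Python won't stop a function from importing `socket` behind your back, which is exactly the gap a Wasm sandbox closes at the bytecode level.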
The solution to insanely deep open source tech stacks is to use simpler stacks. This is really my whole problem with the way JS is ubiquitously developed: you simply do not need 99% of the tools the JS devs are using these days. I use very simple full stacks: so simple that I don't need a JS package manager (I do use pip/PyPI, but I'm extremely guarded about what code I'll import--usually <10 packages). I do sometimes write code that could be imported as a library, but I also barely ever do any rewriting to deal with deprecations, and my builds never break due to a package change. My builds are often reproducible by accident.
In the short run, I'm not less productive than people who use all of npm. In the long run, I'm far more productive.
And even if that weren't the case, developer productivity at the expense of user security was a pretty bad tradeoff anyway. I really wish fewer people were doing that.
One possible use-case is a system like Inferno[0] (the Plan 9-like operating system). Since, on that OS, you can "mount" CPUs from other machines over the network[1], you either need to make sure all of your machines have the same CPU architecture (including additional instruction sets, lest you end up with AVX-512 code trying to run on a processor that doesn't support it), or you use a non-compiled language. Inferno went with the latter and introduced Limbo[2] for that purpose, but with webasm we could use any language and still leverage all the great tooling being written for it.
[0] https://en.wikipedia.org/wiki/Inferno_(operating_system)
[1] Behind-the-scenes it actually mounts your filesystem onto the remote machine and sends it commands to run, rather than actually "mounting" a CPU over the network. https://9p.io/magic/man2html/1/cpu
[2] https://en.wikipedia.org/wiki/Limbo_(programming_language)
Because people need to run their Apps on multiple targets and having one codebase for all targets helps with keeping maintenance and development costs down. Java would meet this requirement if the UX/UI story was anywhere near as good as the Web's. I'd argue that UI has been the main driver of all of this.
For safely running an untrusted, architecture- and operating-system-agnostic binary blob. If all operating systems could even run simple WASI cmdline blobs directly, that would actually solve a real problem for me (distributing a shader compiler executable for Linux, macOS, and Windows across x86 and ARM, which is built from massively big 3rd party C++ libraries, so compiling on the user machine isn't really an option).
AFAIK Cosmopolitan is missing the sandboxing features of WASM runtimes that would allow running untrusted code safely, and for x86 vs ARM it uses the universal binary approach (i.e. you'll have either the x86 or the ARM code as dead weight in the executable).
VSCode is an entire web browser (Electron packaged Chromium). It even ships with "Simple Browser" built in, which is just a web view.
Perhaps a more direct example is Node's/Bun's ability to package a JS/TS project into a single binary with only the JS engine browsers use rather than the whole thing.
I don't feel entitled to anything. YouTube is free to stop serving me content at any time. It's trivial to refuse to serve people content they haven't paid for.
Why do advertisers feel entitled to my attention when I never agreed to give it to them? Simply visiting a page with ads doesn't mean I agree to view ads.
I will never pay for an ad-supported product. As long as YouTube accepts money from advertisers, their loyalty is split between users and advertisers. And advertisers will eventually win: if YouTube Premium gains traction, advertisers will be willing to pay more for access to premium users, and YouTube can only ignore that for so long. YouTube Premium will have ads eventually--it's just a matter of time. It already happened to cable, it happened to Prime, and it will happen to every streaming service that relies on ads eventually.
The only answer is to support companies that do not receive any money from ads (e.g. Kagi). Until that exists for streaming, I'm blocking ads and not giving them a cent.
> First it was "I hate how much ad companies track me and build profiles on me."
> Now it is "I hate how ads are irrelevant."
This is an HN echo-chamber complaint, made by people who work for advertisers trying to come up with a way to make their ads seem less awful.
The fact is, relevant ads aren't better. They're still ads, and ads are still inherently bad.
If I'm looking for a used car, I do not want to hear ads from Bob's Lemon Shop about why they're the best place to buy cars. If Bob's Lemon Shop is the best place to buy cars, I'll find that out from independent reviewers who have shopped there before. An ad from Bob's Lemon Shop is relevant to my interest, but that makes it worse, because now I'm susceptible to manipulation by the company that paid the most for ads instead of making a more rational decision based on true information from unbiased sources. Having more relevant ads is not good for me, it's good for advertisers.