In defense of complicated programming languages (viralinstruction.com)
209 points by jakobnissen on Jan 25, 2022 | 370 comments



I remember first encountering classes. I simply could not understand what they were from reading the documentation. Later, I picked up Bjarne's C++ book, read it, and could not figure out what classes were, either.

Finally, I obtained a copy of cfront, which translated C++ to C. I typed in some class code, compiled it, and looked at the emitted C code. There was the extra double-secret hidden 'this' parameter. Ding! The light finally went on.
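
A minimal sketch in C of what that kind of translation amounts to (names are illustrative; real cfront output used mangled names and a lot more machinery):

    #include <stdio.h>

    /* C++ source, roughly:  struct Counter { int n; void inc(int by) { n += by; } }; */
    struct Counter { int n; };

    /* the member function becomes an ordinary function whose hidden
       first parameter is now an explicit 'this' pointer */
    void Counter_inc(struct Counter *this, int by) { this->n += by; }

    int main(void) {
        struct Counter c = { 0 };
        Counter_inc(&c, 3);   /* what a C++ call like c.inc(3) turns into */
        printf("%d\n", c.n);  /* prints 3 */
        return 0;
    }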

Years later, Java appears. I couldn't figure that out, either. I thought those variables were value types. Finally, I realized that they were reference types, as they didn't need a *. I sure felt stupid.

Anyhow, I essentially have a hard time learning how languages work without looking at the assembler coming out of the compiler. I learn languages from the bottom up, but I've never seen tutorials written that way.


I understand this thought process, but in my opinion it's the wrong way to think about software concepts. Understanding what a bridge is doesn't mean knowing how to build one, and in fact tying your understanding to a certain implementation of a bridge just limits your ideas about what is, in fact, an abstract concept.

We understood functions as "mappings" between objects for hundreds of years, and when programming came along it gave us new ways to think about functions, but being able to "make" a function in hardware/software doesn't actually change what a function is at its core.

There's a reason why computer science professors explain concepts at a high or abstract level and don't jump into implementation to help students understand them. It's because the concepts ARE the high-level meanings, and if you need to see an implementation then you're not really understanding the idea for what it is.

If an idea stays abstract in your mind, it gives you more flexibility with how you apply it and more ways to find use out of it. But it does take a mindset shift, and is an actual learned skill, to be able to see something as purely abstract and accept that how it's made doesn't matter.

-- Edit - just realized who I was replying to. So take this comment as not meant to be lecturing, but just a 2c to offer :).


> if you need to see an implementation then you're not really understanding the idea for what it is.

I 100% disagree. At the very least, I think you're wrong if your assumption is that such a statement applies in general. That statement certainly doesn't fit me, nor many of the people I've taught in the past. I got my PhD in (pure) mathematics and I could only understand high level abstractions _after_ I worked through many concrete examples. I know the same applies for many successful mathematical researchers because we discussed the subject at length. Now such a statement certainly does apply for _some_ people (I've taught them as well), but certainly not all.

If you're someone that likes this sort of abstract thinking, that's great. If you're someone that needs concrete examples to understand, that's great too. The real lesson is that everyone learns differently.


But there's a difference between trying to understand, say, a theorem by applying it in concrete situations and by studying its proof.


> But there's a difference between trying to understand, say, a theorem by applying it in concrete situations and by studying its proof.

There may be a difference or there may not. You could study a proof within the context of a specific example. That is how I usually would do it. But yes, of course it's possible not to do it like that (many people don't study proofs that way). In any case, I don't really understand your point.


I was drawing an analogy, in this particular context, between studying the proof and taking something apart.


I didn't understand how engines worked until I took them apart, either. I was taking things apart to understand them long before computers :-)

But the notions of sending a "message" to a "method" just was way way too handwavy for me. I like examining the theory after I learn the nuts and bolts.

> Understanding what a bridge is doesn't mean knowing how to build one

If you don't know how to build one, you don't understand them. Architects who get commissions to design innovative skyscrapers definitely know how to build them.


Some months ago I had to learn React for a project. I was struggling a lot. The documentation was so poor. And everything looked so inconsistent (it still does...) and the "solutions" on stackoverflow looked so arbitrary. Until I remembered the lessons I had learned in the past and sat down and studied how it worked internally. Then things started to make sense.

I think it's an advantage to be able to mentally map high-level constructs to low-level operations. (Edit) Learning the low-level stuff first can help you understand what problems high-level languages are trying to solve. For example, the first languages that I learned were assembly and BASIC. Many people said that learning such low-level languages would make it harder for you to learn abstract thinking and structured programming with high-level languages. For me, it was quite the opposite. Writing complex programs in BASIC was so cumbersome, it made me appreciate programming in C with local variables and data structures. After mastering function pointers in C and discovering that you can elegantly express operations on "living" data structures with them (I wrote simple game engines and world simulations for fun), the concept of messages and methods in OO languages looked so natural when I first learned about them. And once you've witnessed the mess of complex interdependencies in large programs (with or without multithreading), immutability and functional programming looked like the right way.
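
For instance, that function-pointer step looked roughly like this minimal C sketch (entity names and behaviours are made up): each record in a "living" data structure carries a function pointer that decides how it updates.

    #include <stdio.h>

    struct Entity {
        const char *name;
        int hp;
        void (*update)(struct Entity *self);  /* per-entity behaviour, swappable at runtime */
    };

    static void regenerate(struct Entity *self) { self->hp += 1; }
    static void burn(struct Entity *self)       { self->hp -= 2; }

    int main(void) {
        struct Entity world[] = {
            { "troll",  10, regenerate },
            { "zombie",  8, burn },
        };
        /* one simulation tick: every entity runs whatever behaviour it currently holds */
        for (size_t i = 0; i < sizeof world / sizeof world[0]; ++i) {
            world[i].update(&world[i]);
            printf("%s hp=%d\n", world[i].name, world[i].hp);
        }
        return 0;
    }

From there, the jump to objects carrying their own methods felt small.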


> Writing complex programs in BASIC was so cumbersome, it made me appreciate programming in C with local variables and data structures

My language trajectory was approximately from BASIC to Pascal to C to Common Lisp, but I had a very similar reaction. My move from C to Common Lisp probably had the greatest increase in appreciation because a task I semi-failed to complete in three years of C programming took me six months in CL, which was perfectly suited to the task in hand.

(the C task was postgraduate work in AI, in the late 1980s. As well as the importance of choosing the right language for the task, I also learned a lot about the importance of clearly agreed requirements, integration tests and software project management, none of which really existed on the four-person project I was working on).


On the other hand, you can walk over a bridge any number of times without understanding the nuts and bolts of it. There are plenty of ways to build a bridge: there are static concrete bridges, there are bridges that can open, there are hanging bridges, and a lot more variants. But for the people making use of them, the implementation matters a lot less than the purpose - connecting two places.

But yes, you are of course right in that if you build bridges, you need to understand the mechanics, and someone that builds a language of course needs to have a deeper understanding of how the underlying abstractions work together than most of the users will have.


Thank you, what you wrote is exactly what I meant.


Is it possible to really learn how programming concepts work, though? Modern optimizing compilers are pretty amazing and just seeing the assembly output may make a concept harder to grasp.

To your bridge analogy: at this point what we are saying is "give me a method to traverse this river" and compilers are either building bridges, shooting out maps to fallen trees, or draining the river altogether. If you looked at that output you might consider "traversing rivers" to only be walking over natural bridges.

This gets even more sticky when talking about types. Computers don't care about types, they are strictly a concept to help developers.

Or to your class point, would you know better what a class does if you pulled up godbolt and found that the this pointer is completely removed? You might come to the mistaken conclusion that classes are simply a name-spacing technique.


> Is it possible to really learn how programming concepts work, though? Modern optimizing compilers are pretty amazing and just seeing the assembly output may make a concept harder to grasp.

They usually have debugging options that let you read the internal steps (SIL, LLVM, GIMPLE, etc). That can be easier to understand than full asm, but also, the asm can't hide anything from you unless it's obfuscated.


> the asm can't hide anything from you unless it's obfuscated.

I think that misses the point. It's not the case of the ASM hiding things, it's a case of the optimizing compiler erasing things.

Here's a simple example: You say "OK, the time I need to wait is 30 minutes. This method takes seconds, so I'll send in 60 (seconds in a minute) * 30 (minutes)." The compiler is free to take that and say "Oh hey, 60 * 30? I know that's simply 1800, so I'll put that there instead of adding the instructions for multiplication."
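
A rough C sketch of that situation (what the compiler actually emits depends on the compiler and flags):

    #include <stdio.h>

    int main(void) {
        int wait_seconds = 60 * 30;   /* "seconds per minute times 30 minutes" */
        printf("%d\n", wait_seconds);
        return 0;
    }

    /* with optimization on, a compiler will typically fold the constant, so the
       asm just contains 1800 somewhere (e.g. as an immediate operand); the 60
       and the 30 are gone entirely */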

Trying to learn how something like that works from the ASM output would leave you confused trying to reason where that "1800" came from. It's not hidden, it's not obfuscated. It's simply optimized.

That's a simple example, but certainly not the end of the optimizations a compiler can apply. The ASM isn't lying, but it also isn't showing the whole picture. The very nature of optimizing compilers is to erase parts of the picture that are ultimately only there for the programmer's benefit.


That's true, but often the things it erases are what's confusing you. For instance, it can remove highly abstracted C++ templates and dead code. Or if you don't know what a construct does, you can compile code with and without it and see exactly what it does.

Often new programmers think asm/low level programming is scary, but because it's so predictable it's actually quite easy to work with… in small doses.


> But the notions of sending a "message" to a "method" just was way way too handwavy for me. I like examining the theory after I learn the nuts and bolts.

Ooh-- are we talking about Smalltalk here? Java?

C'mon, Walter. You can't tell half a ghost story and then say goodnight. What did you find staring back at you deep in the void of all that object-orientedness?

Or, if this is a famous ghost story you've told before at least give us a link. :)


> sending a "message" to a "method"

I think you mean “to an object.”

The problem with an approach like yours is that implementations of abstract concepts often vary, and learning the “nuts and bolts” of one does not necessarily give you the true understanding of the concept itself.


Yeah, sending a message to an object via a method.


It’s because the “sending message” explanation is BS unless you’re talking about languages like Erlang. The OOP explanations don’t make sense because they are BS. In the real world you don’t ask your shoe to tie itself, you don’t ask a tree to chop itself, etc. And in OOP languages you normally don’t ask a string to send itself over a socket or draw itself in 3D using OpenGL. The reality is that you have code operating on data. The same way you have an axe operating on a tree, or your hands operating your shoe laces. That’s it. Everything else is BS.


>>"But the notions of sending a "message" to a "method" just was way way too handwavy for me."

This. I've felt like I haven't been able to keep up with commodity programming because I can't stand the way today's drivers (MSFT) control mindshare by emphasizing frameworks over mechanics. I feel like it's a people aggregation move instead of facilitating more creative solutions. The most enjoyable classes I had in uni were 370 assembler and all the Wirth language classes taught by the guy who ran the uni patent office.


> But the notions of sending a "message" to a "method"

You mean to an “object”?

Also, this looks like a perfect example of theory vs practice, having different implementations of object communication as in classic OOP vs actors etc?


Have you taken Rust apart yet?


It would have helped me a lot. From the outside it looks very complicated and capricious. I really do get the sense that the little axiomatic ur-language inside is a lot more tractable - would love to have learned that first.


Same, I'd like to see such an approach.


Chris, a friend of mine in college (who is unbelievably smart) decided one day to learn how to program. He read the FORTRAN-10 reference manual front to back, then wrote his first FORTRAN program. It ran correctly (as I said, the man is very smart) but ran unbelievably slowly.

Mystified, he asked another friend (Shal) to examine his code and tell him what he did wrong. Shal was amazed that his program worked the first time. But he instantly knew what was wrong with it - it wrote a file, character by character, by:

1. opening the file

2. appending a character

3. closing the file

Chris defended himself by saying he had no idea how I/O and disk systems worked, and so how could he know that the right way was:

1. open the file

2. write all the characters

3. close the file

and he was perfectly correct. This is why understanding only the abstractions does not work.
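
The same contrast, sketched in C rather than FORTRAN purely to illustrate the pattern (filenames and data are made up):

    #include <stdio.h>

    int main(void) {
        const char *text = "hello";

        /* the first version: open, append one character, close -- once per character */
        for (const char *p = text; *p; ++p) {
            FILE *f = fopen("slow.txt", "a");
            if (!f) return 1;
            fputc(*p, f);
            fclose(f);
        }

        /* the usual way: open once, write everything, close once */
        FILE *out = fopen("fast.txt", "w");
        if (!out) return 1;
        fputs(text, out);
        fclose(out);
        return 0;
    }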


> This is why understanding only the abstractions does not work.

I don't think your example shows that at all: If it didn't actually explicitly say in his Fortran reference manual that "The 'thing' you write between a file-open and a file-close can only be a single character", then... Sorry, but then AFAICS your example only shows that he didn't understand the abstraction that "file-open" just opens a file for writing, without specifying what to write. (Maybe he slavishly followed some example in the manual that only wrote one character?)

This needless sprinkling of file-open / file-close looks a bit like he did the work of a current optimising compiler (only here it was pessimising), "unrolled a loop"... So AIUI it shows the opposite of what you set out to show: Too concrete without higher-level understanding was what did him in.


> I understand this thought process, but in my opinion it's the wrong way to think about software concepts. ... just limits your ideas about what is, in fact, an abstract concept.

There's nothing abstract about language constructs. Learning about a language construct via translation is a perfectly fine way of clarifying its semantics, whereas an "abstract" description can easily be too general and fail to pin down the concept; it can also rely on unstated "rules of the game" for how the abstract construct will operate within the program, that not everyone will understand the same way.


I tend to agree with you in principle, but for me too, a lot of high-level features are better understood in terms of translations. Objects, coroutines, closures...

Even in formal CS, it's common to define the semantics of a language by translation, which can give more insight than operational semantics.

Now that I think of it, I think the problem is that most languages are defined informally which can be imprecise and inadequate.

The translation provided by the compiler is the closest thing we have to a formal semantics, it's natural to rely on it.


I've found that C compiler documentation on how their extensions work to be woefully inadequate. The only way to figure it out is to compile and examine the output. Most of the time, the implementors can't even be bothered to provide a grammar for their extensions.


> The translation provided by the compiler is the closest thing we have to a formal semantics, it's natural to rely on it.

Which translation, though? Depending on your compiler flags, you may get very different translations, sometimes (if your program contains undefined behavior) even with dramatically different runtime effects.


Yes, you are correct. But you are wrong too.

People need complete understanding of their tools. And complete understanding includes both how to use the concepts they represent and how those concepts map into real world objects. If you don't know both of those, you will be caught by surprise in a situation where you can't understand what is happening.

That focus on the high level only is the reason we had a generation of C++ developers that didn't understand the v-table while being perfectly capable of creating one by hand. It's also why we have framework-only developers that can't write the exact same code outside of the magical framework, even when it's adding no value at all.


IMO this is a very elitist view of the software developer's job.

The analogy from the tangible world would be all the bridge engineers using "proven" / "boring" / "regulator endorsed" practices and techniques to build a "standard" bridge versus those constantly pushing the limits of materials and construction machines to build another World-Wonder-Bridge. There is nothing wrong with having both types of engineers.


> There is nothing wrong with having both types of engineers.

Acksherly, yes there is. In this context, there is: The world doesn't need engineers "constantly pushing the limits of materials" when building bridges; let's stick with proven, boring, regulator endorsed practices and techniques for that.


>There's a reason why computer science professors explain concepts at a high or abstract level and don't jump into implementation to help students understand them.

It's because they're trying to teach something to people who don't have anything to build on. Later in their education that'll be different. At the school I attended Object Oriented Programming class had Computer Organization as a pre-req and the teacher would often tangent into generated assembly to help students understand what was happening.

Regardless of whether I agree with your thoughts about the ideal approach to understanding the concepts vs implementation, I live in the status quo where - sooner or later - a C++ programmer is going to encounter a situation where they need to know what a vtable is.


I did not understand what virtual functions were at all until I examined the output of cfront.

Oh, it's just a table of function pointers.
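
A minimal hand-rolled sketch in C of that idea (names and layout are illustrative; real compilers differ in the details):

    #include <stdio.h>

    struct Shape;

    /* the "vtable": one function pointer per virtual function */
    struct ShapeVtbl {
        double (*area)(const struct Shape *this);
    };

    /* every object starts with a pointer to its class's table */
    struct Shape {
        const struct ShapeVtbl *vtbl;
    };

    struct Square {
        struct Shape base;   /* base subobject first, so a Square* is usable as a Shape* */
        double side;
    };

    static double square_area(const struct Shape *this) {
        const struct Square *s = (const struct Square *)this;
        return s->side * s->side;
    }

    static const struct ShapeVtbl square_vtbl = { square_area };

    int main(void) {
        struct Square sq = { { &square_vtbl }, 3.0 };
        struct Shape *p = &sq.base;
        /* a "virtual call" is just an indirect call through the table */
        printf("%g\n", p->vtbl->area(p));   /* prints 9 */
        return 0;
    }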


You must have just had the bad luck to have read the most appallingly stupid books. I think you might be even a little older than I, and I got into the game late-ish; those C++ manuals or specs you read were probably from the 1970s or 80s? By the time I learned (imperative- and inheritance-based[1]) OOP from the Delphi manuals in the late nineties the virtual method table was explicitly mentioned and explained, along with stuff like "since all objects are allocated on the heap, all object references are implicitly pointers; therefore, Borland Object Pascal syntax omits the pointer dereferencing markers on object variable names" (which you also mention about C above). I'm fairly certain this gave the reader a pretty good grasp of how objects work in Delphi, but I still know nothing about what machine language the compiler generates for them.

It's not the idea of top-down learning from abstractions that's wrong, it's being given a shitty presentation -- more of an obfuscation, it seems -- to learn about the abstractions from that is the problem.

___

[1]: So yeah, that whole Smalltalk-ish "messaging paradigm" still feels like mumbo-jumbo to me... Perhaps because it's even older than C++, so there never were any sensible Borland manuals for it.


Sometimes textbook descriptions are just needlessly obtuse. I have noticed that there are some concepts which I already understood but if I were first introduced to them via the textbook, I would have been hopelessly confused. I wasn't confused only because I recognized this as something I already knew.


I agree that not everyone should need to know every implementation detail, but I'd argue there should be more emphasis on the low-level details in CS education.

Programming is often approached from a purely abstract point of view, almost a branch of theoretical mathematics, but imho in 99% of cases it's better understood as the concrete task of programming a CPU to manipulate memory and hardware. That framing forces you to consider tradeoffs more carefully in terms of things like time and performance.

You shouldn't need to be able to hand-translate every line of code you write into assembly, but in my experience you write much better code if you at least have an intuition about how something would compile.


I think that's the difference between computer science and programming. Yes, you'll be a better programmer if you focus on the low-level details, but you'll be a worse computer scientist.

I guess, if your goal is to write optimizations, focus on details. If your goal is to find solutions or think creatively, focus on abstractions. Obviously, there’s a lot of overlap, but I’m not sure how else to describe it.


I disagree that you'll be a worse computer scientist. Computer science doesn't happen in Plato's heaven, it happens in real processors and memory.

I tend to think focusing on abstractions is almost always the wrong approach. Abstractions should arise naturally as a solution to certain problems, like excess repetition, but you should almost always start as concrete as possible.


I would disagree with where you are drawing that line. I would say that computer science does happen in Plato's heaven; software engineering happens in real processors and memory. But most of us are actually software engineers (writing programs to do something) rather than computer scientists (writing programs to learn or teach something).


I agree that there is some value to purely theoretical work, but I think this is over-valued in CS. For instance, in the first year of physics instruction at university, problems are often stated in the form of: "In a frictionless environment..."

I think a lot of problems are created in the application of computer science because we treat reality as if there are no physical constraints - because often it is the case that our computers are powerful enough that we can safely ignore constraints - but in aggregate this approach leads to a lot of waste that we feel in everyday life.

I think incremental cost should play a larger role in CS education, and if every practitioner thought about it more we would live in a better world.


> I think a lot of problems are created in the application of computer science because we treat reality as if there are no physical constraints - because often it is the case that our computers are powerful enough that we can safely ignore constraints - but in aggregate this approach leads to a lot of waste that we feel in everyday life.

ObTangent: Bitcoin.


Undergrad "CS" education, probably for better (considering the career paths and demand), is more about teaching what you call software engineering than what you call computer science.


My company fired two Computer Science PhDs because they knew the theory brilliantly but couldn’t turn it into code. That’s the problem with only learning the theory.


This depends heavily on the context. In The Art of Computer Programming, the analysis of algorithms is done in terms of machine code. On the other hand, the proverbial centipede lost its ability to move as soon as it started wondering about how it moves.


I tend to think you should be able to go back and forth between mental models. Like obviously when you're thinking through how to set up business logic, you should not be thinking in terms of registers and memory layouts.

But when you're considering two different architectural approaches, you should, at least in broad terms, be able to think about how the machine is going to execute that code and how the data is roughly going to be laid out in memory for instance.


The issue with viewing programming concepts as purely abstract is that the abstractions have varying degrees of leakiness. Even with the most mathematically pure languages like Haskell, you run into implementation details like space leaks which you have to understand to build reliable systems.

There’s certainly something to be said for abstract understanding, but one thing I’ve learned in software and in business is that details matter, often in surprising ways.


> I understand this thought process, but in my opinion it's the wrong way to think about software concepts.

You cannot really tell someone that the way their brain learns is the "wrong way". Different people's brains are wired differently and thus need to learn in different ways.

> Understanding what a bridge is doesn't mean knowing how to build one, and in fact tying your understanding to a certain implementation of a bridge just limits your ideas about what is, in fact, an abstract concept.

Software development is about more than just understanding what something is. It's about understanding how to use it appropriately. For some people, simply being told "you use it this way" is enough. Other people like to understand the mechanics of the concept and can then deduce for themselves the correct way to use it.

I also fall into that category. While I'm not comparing my capabilities to Bright's, I do very much need to understand how something is built to have confidence I understand how to use it correctly.

Neither approach is right nor wrong though, it's just differences in the way our brains are wired.

> There's a reason why computer science professors explain concepts at a high or abstract level and don't jump into implementation to help students understand them.

Professors need to appeal to the lowest common denominator, so their approach naturally wouldn't be ideally suited for every student. It would be impossible to tailor classes to suit everyone's needs perfectly without having highly individualised lessons, and our current educational system isn't designed to function that way.

> If an idea stays abstract in your mind, it gives you more flexibility with how you apply it and more ways to find use out of it. But it does take a mindset shift, and is an actual learned skill, to be able to see something as purely abstract and accept that how it's made doesn't matter.

The problem here is that abstract concepts might behave subtly differently in different implementations. So you still need to learn platform specifics when writing code on specific platforms, even if everyone took the high level abstract approach. Thus you're not actually gaining any more flexibility using either approach.

Also worth noting that your comment suggests that those of us who like to understand the construction cannot then derive an abstraction afterwards. Clearly that's not going to be true. The only difference between your approach and Bright's is the journey you take to arrive at that abstraction.


I like to understand the construction too, it's why I'm in engineering. I'm just saying it shouldn't be necessary in order to understand an idea. For me, it was a crutch I used for years before I grokked how to decouple interfaces from implementations because I naturally understand things better after I build them. But if you use an implementation to understand an idea, you couple implementation to interface in your mind, and so it changes your understanding.


> I'm just saying it shouldn't be necessary in order to understand an idea

Nobody said it is necessary. Some folk just said they find it easier learning this way.

> But if you use an implementation to understand an idea, you couple implementation to interface in your mind, and so it changes your understanding.

That's a risk either way. It's a risk that if you only learn the high level concept you miss the detail needed to use it correctly in a specific language too (OOP can differ quite significantly from one language to another). And it's also a risk that then when you learn that detail you might forget the high level abstract and still end up assuming the detail is universal. If we're getting worried about variables we cannot control then there's also a risk that you might just mishear the lecturer or read the source material incorrectly too. Heck, some days I'm just tired and nothing mentally sinks in regardless of how well it is explained.

There are infinite ways you could teach something correctly and the student could still misunderstand the topic. That's why practical exercises exist. Why course work exists. Why peer reviews exist. etc.

And just because you struggled to grasp abstract concepts one specific way it doesn't mean everyone else will struggle in that same way.


I am not a child psychologist, so take all of this with a grain of salt. I believe children first learn concepts by looking at and playing with concrete things first. "Oh look at this fun thing... Oh whoops I moved it, it looks slightly different, but if I rotate it, it looks like it used to... It doesn't really taste like anything, but it feels hard... Whoa, what's this new thing over here? Oh wait, this is the same size and shape as the thing I played with previously... In fact it behaves just like the first thing did. Oh cool, there's a whole stack of them over here, I bet they work just like the first things did!" This is how one might interpret a baby's first interactions with blocks. Later in life, they might find out about dice and understand some similarities. Later, still in school, the kid learns about cubes in geometry class, and can think back to all the concrete hands on experience he had and see how the various principles of cubes apply in real life.

So, people learn by experiencing concrete things first, and then grouping all those experiences into abstract concepts. Sometimes (ok, often) they'll group them incorrectly: Kid: "This thing doesn't have fur and moves without appendages. It's a snake. Whoa, look at this thing in the water, it moves without appendages either! It must also be a type of snake." Teacher: "That's an eel, not a snake." Kid: "oh. I guess snakes are for land and eels are for water" Teacher: "Water Moccasin is a type of snake that is quite adept in the water." Kid: "oh. They look kinda the same, what's the difference?" Teacher: [performs instruction]

This form of learning by compiling all sorts of concrete things down into a few abstract concepts is so powerful and automatic that we do it ALL THE TIME. It can even work against us, "overtraining" to use an ML term, like with our various biases, stereotypes, typecasting of actors ("this guy can only ever do comedies"). Sometimes folks need a little help in defining/refining abstract concepts, and that's the point that teachers will be most helpful.

So, for me anyway, and I suspect many others, the best way to learn a concept is to get many (as different as possible) concrete examples, maybe a few concrete "looks like the same thing but isn't", and THEN explain the abstract concept and its principles.

Or, to explain the process without words, look at Picasso's first drawing of a dog, and the progressively simpler drawings until he gets to a dog drawn with a single curvy line.


I don't really buy this. It's like saying we should teach about fields before addition of real numbers, or about measure spaces before simply C^n. The most abstract version of a concept is usually much more difficult to grok.


You've never seen tutorials written that way because roughly nobody but you learns programming languages from the bottom up. There is just no demand.

By the way, where can I read a D tutorial from the bottom up?


I just added a -vasm switch to the dmd D compiler so you can learn it from the bottom up!

https://news.ycombinator.com/item?id=30058418

You're welcome!


Walter, watch out. You want to talk about dumbing down? California is considering a new law in which every computer language must have a keyword 'quine' which prints 'quine' . And none of this looking under the hood stuff. That's doing your own research. Trust the computer science! :)


Is there really no demand? Or do those of us who like to learn that way just get used to researching these things ourselves and quietly get on with it? Many of the existing tutorials are at least a good starting point to teach engineers what topics they need to examine in more detail.

Anecdotally, when I've mentored junior engineers I've had no shortage of people ask me "why" when I've explained concepts at a high level; them preferring I start at the bottom and work my way up. So I quite believe there could be an untapped demand out there.


There’s a difference between understanding something and learning how and why it works the way it does. You can understand how a compilation pipeline works without ever working with any low-level code and without writing a compiler yourself. You can walk across a bridge and understand it connects point A with point B without understanding how a specific bridge has to be constructed. A concrete implementation is just an implementation detail, and if you focus too much on it you’ll get tunnel-visioned instead of understanding the concept behind it.

EDIT: And I say that as someone who likes both learning and teaching from the ground-up. But there’s no demand for it because that’s not how you efficiently learn the concepts and understand the basics so you can take a deeper dive yourself


> you’ll get tunnel-visioned instead of understanding the concept behind it

You might have gotten tunnel-visioned but it's not a problem I've suffered from when learning this way. And why do you think I cannot understand the concept behind the code after reading the code? If anything, I take the understanding I've grokked from that reference implementation and then compare it against other implementations. Compare usages. And then compare that back to the original docs. But usually I require reading the code before I understand what the docs are describing (maybe this is due to my dyslexia?)

Remember when I said everyone's brain is wired differently? Well there's a lot of people today trying to tell me they understand how my brain works better than I understand it. Which is a little patronising tbh.


> You've never seen tutorials written that way because roughly nobody but you learns programming languages from the bottom up.

I am indeed a unique snowflake.


I also have a hard time learning concepts if there are handwavey parts of them. I remember by recreating the higher-level concepts from lower-level ones at times.


To me, the abstraction is an oversimplification of actual, physical, systemic processes. Show me the processes, and it's obvious what problem the abstraction solves. Show me only the abstraction, and you might as well have taught me a secret language you yourself invented to talk to an imaginary friend.


> abstraction is an oversimplification of actual

I think it’s an oversimplification of what abstraction is.


I don't believe most productive programmers learned the quantum physics required for representing and manipulating 1s and 0s before they learned how to program. Abstractions are useful and efficient.

You're more comfortable with a certain level of abstraction that's different from others. I can't endorse others that try to criticize your way of understanding the world, but I'd also prefer if some people who in this thread subscribe to this "bottom up" approach had a bit more humility.


I think part of it comes from believability, or the inability to make a mental model of what is going on under the hood. If something seems magical, you don't really understand what is going on, and it can be hard to work with because you can't predict its behavior in a bunch of key scenarios. It basically comes down to what people are comfortable treating as their axiom set. It gets really bad when the axiom set is uneven when you're teaching it, and some higher abstractions are treated as axiomatic / hand-waved, while other higher abstractions are filled in. This is also probably an issue for the experienced, because they have some filled-in abstractions that they bring from experience, so their understanding is uneven and the unevenness of their abstraction understanding bugs them.

Like how limits in calculus involve infinity or dividing by an unspecified number, which seems non-functional or handwavy in itself. Like how the hell does that actually function in a finite world then? Why can't you actually specify the epsilon to be a concrete number, etc? If you hand wave over it, then using calculus just feels like magic spells and ritual, vs. actual understanding. The more that 'ritual' bugs you, the less you're able to accept it, and it becomes a blocker. This can be an issue if you learned math as a finite thing that matches reality for the most part.

For me to solve the calculus issue, I had to realize that math is basically an RPG game, and doesn't actually need to match reality with its finite limits or deal with edge cases like phase changes that might pop up once you reach certain large number thresholds. It's a game and it totally, completely does not have to match actual reality. When I dug into this with my math professors, they told me real continuous math starts in a 3rd year analysis class and sorry about the current handwaving, and no, we won't make an alternative math degree path that starts with zero handwaving and builds it up from the bottom.


The last time I learned a new programming language (Squirrel), I did so by reading the VM and compiler source code in detail rather than writing code. You get a far more complete picture of the semantics that way! I didn't even read much of the documentation first; it answered far too few of my questions. (Edit:) I want to know things such as: how much overhead do function calls have, what's the in-memory size of various data types, which strings are interned, can I switch coroutines from inside a C function called from Squirrel...


> I want to know things such as: how much overhead do function calls have, what's the in-memory size of various data types, which strings are interned, can I switch coroutines from inside a C function called from Squirrel...

So is that a problem with learning from abstractions, or just simply a problem that this stuff isn't mentioned in the manual?


I do. I recommend it as a way to avoid thinking Haskell is magic, which a lot of people seem to be convinced of. GHC has pretty good desugared printing options.

I'm not sure how to view asm for HotSpot or a JavaScript engine though.


I like my abstractions to be hidden, but I also like to be able to peek under the hood. That's one of the problems of C++ templates, sometimes I want to look at the expanded code.

The GNAT Ada compiler has an option to output a much-simplified desugared code. Not compilable Ada, but very inspectable unrolled, expanded code. Makes for a great teaching tool. 'Aaaaaaah this generic mechanism does that!'

Link https://docs.adacore.com/gnat_ugn-docs/html/gnat_ugn/gnat_ug... look up -gnatG[=nn]... Good stuff.


That's how I learnt C too. Couldn't grok how pointers worked. Took a few months to work with assembly. Returned. Didn't have to read any C tutorial. Everything came naturally


Fortunately, I picked up the K+R book after I was an experienced PDP-11 assembler programmer. I had never heard of C before, and basically just flipped through the pages and instantly got it. I quit all the other languages then, too, and switched to C.


To be fair though that's just an extremely well put together book. It's exceptional.

I happen to have the Second Editions of both Kernighan & Ritchie and of Stroustrup's much more long-winded book about his language sitting near this PC.

Even without looking at the actual material the indices give the game away. If I am wondering about a concept in K&R and can't instantly find it from memory, the index will take me straight there. Contrast Stroustrup where the index may lack an entry for an important topic (presumably it was in great part or entirely machine generated and the machine has no idea what a "topic" is, it's just matching words) or there may be a reference but it's for the wrong page (the perils of not-so-bright automatic index generation) and so the reader must laboriously do the index's job for it.

Now, today that's not such a big deal, I have electronic copies of reference works and so I have search but these aren't books from 2022, Stroustrup wrote his book in 1991 and the 2nd edition of K&R is a little older. This mattered when they were written, and when I first owned them. K&R is a much better book than almost any other on the topic.

The book, I would argue, actually holds up much better in 2022 than the language.


> the perils of not-so-bright

Fortunately, my parents defined me out of that category.


There are apparently whole books written about C pointers. It's definitely a topic where the best (?) way to teach it, in my view, is to sit there and just force a student to watch everything I do while I answer questions since you need a push over the activation energy to be able to work things out yourself.


> There are apparently whole books written about C pointers.

Spend 30 minutes teaching him assembly, and the pointer problem will vanish.
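
In that spirit, a tiny C sketch of the "a pointer is just an address" view (the usual caveats about provenance and undefined behaviour still apply):

    #include <stdio.h>

    int main(void) {
        int a[3] = { 10, 20, 30 };
        int *p = &a[0];                            /* p holds the address of a[0] */

        printf("%p holds %d\n", (void *)p, *p);    /* dereference: load from that address */
        p = p + 1;                                 /* pointer arithmetic: the address moves
                                                      by sizeof(int) bytes, much like an
                                                      add at the machine level */
        printf("%p holds %d\n", (void *)p, *p);    /* now a[1], i.e. 20 */
        return 0;
    }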


Unfortunately that's not quite how C pointers work - there are also things called pointer provenance, type aliasing, and out of bounds UB.


You’re very old school… I love it.

Edit: I just realized from the sister comment who I was replying to. Old school charm for sure. Now I love it even more.


FWIW, thanks a whole lot!

As an engineer in my forties it is somewhat encouraging to read that you felt the same way, even if it was about different topics.

For me, these days it is about frontend generally, Gradle on backend and devops. So much to learn, so little documentation that makes sense for me. (I'm considered unusually useful in all projects I touch it seems but for me it is an uphill struggle each week.)

I always win in the end even if it means picking apart files, debugging huge stacks and a whole lot of reading docs and searching for docs, but why oh why can't even amazingly well funded projects make good documentation...?


Just try, my pretties, just try to understand how exception handling actually works without staring at a lot of assembly.


Ah, doesn’t it just fly over to the nearest “catch”?

Btw, the worst misunderstandings I’ve seen were not from lacking knowledge; people actively believed in some magic that isn’t there if you dig deeper. That’s why I still think that teaching at least basic assembler is necessary for professional programming. It can’t make you a low-level bare metal genius, but it clears many implicit misconceptions about how computers really work.


I recently picked up C after years of python, devops and javascript. I realized it's simply impossible for me to understand the tradeoffs made when other languages are designed, or to understand my Unix-like operating system and other parts of it, without knowing enough C. My next target is of course assembly and the compiler. And if there's one thing I know, it's that I want to stay away from any kind of syntactic sugar and unnecessary abstractions on top of basic computer and programming concepts.


Stack unwinding is a complicated process.

It's tempting to think of them as a kind of return value, but most languages do not represent them this way. (I believe it's a performance optimization.)

Flying to the nearest catch can also be complicated, as it's a block that involves variable creation, and thus possible stack changes. Again, it's easier to model it as a normal block break and then a jump, but that's not the usual implementation.


I always expect them to use (thread-) global state, not unlike good old errno just more structured. There's always at most one bubbling up so that's how I would do it.


But then you'll have a branch on every failable operation and slow down the happy path. This is not too different from passing the error as a value.

Instead, compilers use long jumps and manually edit the stack. I'm not sure it makes a lot of difference today, but branches were really problematic back when OOP languages were becoming popular.
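
A crude sketch of the control-flow idea in C using setjmp/longjmp (real implementations use table-driven unwinding and are far more involved than this):

    #include <setjmp.h>
    #include <stdio.h>

    static jmp_buf handler;            /* where the "nearest catch" lives */

    static void might_fail(int fail) {
        if (fail)
            longjmp(handler, 1);       /* "throw": jump straight back to the handler */
        puts("happy path, no branch on an error return value");
    }

    int main(void) {
        if (setjmp(handler) == 0) {    /* "try" */
            might_fail(1);
            puts("never reached");
        } else {                       /* "catch" */
            puts("caught");
        }
        return 0;
    }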


They're not bad these days if they're predictable. There's a code size cost, but there's also a cost to emitting all the cleanups and exception safety too.

For instance Swift doesn't have exceptions - it has try/catch but they simply return, though with a special ABI for error passing.


I believe that today a cold branch just gets predicted as not taken and stays as such because it never jumps (in terms of a predictor’s statistics, not literally).


From the assembly one can learn what compilers do. But it cannot teach how modern CPUs actually work. I.e. even with assembly, reordering, branch prediction, register renaming, cache interaction, etc. are either hidden from the code or exposed in a rather minimal way.


Right, in particular this is vital for Concurrency. In the 1990s my class about multi-tasking began by explaining that the computer can't really do more than one thing at a time. But in 2022 your computer almost certainly can do lots of things at the same time. And your puny human mind is likely not very well suited to properly understanding the consequences of that.

What's really going on is too much to incorporate into your day-to-day programming, and you'll want to live in the convenient fiction of Sequential Consistency almost all the time. But having some idea what's really going on behind that façade seems to me to be absolutely necessary if you care about performance characteristics.


Then there is not much difference between learning C and assembly. Both are languages for an abstract machine that has less and less relation to how things are done for real.


That is true.

But before I learned programming beyond BASIC, I took a course in solid state physics which went from semiconductors to transistors to nand gates to flip flops to adders.

Which made me very comfortable in understanding how CPUs worked. Not that I could design something with a billion transistors in it, but at the bottom it's still flip flops and adders.


> just fly over to the nearest “catch”?

No? (The stack unwinding is a whole process in and of itself.)


Well, I imagine that it would be possible by expressing its semantics using continuations. Implementing exception handling using call/cc seems like one of them favorite Scheme homeworks. And if you implement it that way, you should then know exactly what it does.


Although it's said that call/cc is a poor abstraction that's too powerful, and it'd be better to have its components instead.

https://okmij.org/ftp/continuations/against-callcc.html


Groovy? Gradle is a build tool for JVM.


Gradle.

Thankfully Groovy is not in the equation except for old Gradle files from before the Kotlin Gradle syntax existed.


A thousand times this! At the very least, I always want to have good mental model of how something probably works or mostly works even if I couldn't reproduce the implementation line-for-line. To me, it can almost be dangerous to have the power to use something without any idea of what's under the hood. You don't know the cost of using it, you don't have a good basis for knowing what the tradeoffs and reasons are for using it over something else, and the concept isn't portable if you need to work in a language or environment that doesn't have it.

If I come across a feature I like in a programming language, I usually find myself trying to figure out how to implement or emulate it in a language I already know (ideally one that won't hide too much from me). Implementing coroutines in C using switch statements, macros, and a little bit of state for example.
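
For example, a minimal switch-based coroutine along those lines, a stripped-down cousin of the usual protothreads/Duff's-device trick (macro names are made up):

    #include <stdio.h>

    /* the coroutine's "state" is just the point to resume at */
    #define CO_BEGIN(state)    switch (*(state)) { case 0:
    #define CO_YIELD(state, v) do { *(state) = __LINE__; return (v); case __LINE__:; } while (0)
    #define CO_END(state)      } *(state) = 0; return -1

    /* yields 1, 2, 3, then signals completion with -1 and restarts */
    static int counter(int *state) {
        CO_BEGIN(state);
        CO_YIELD(state, 1);
        CO_YIELD(state, 2);
        CO_YIELD(state, 3);
        CO_END(state);
    }

    int main(void) {
        int state = 0;
        for (int i = 0; i < 5; ++i)
            printf("%d\n", counter(&state));   /* prints 1 2 3 -1 1 */
        return 0;
    }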


Since I've taken cars apart and put them back together, that has been very helpful to me in improving my driving skills. I also know when a problem with the car is just an annoyance and when I have to get it fixed. I can often know what to do to get the beast home again, without having to call a tow truck.


Absolutely! I have similar stories with other abstractions too (virtual methods, protocols, exceptions etc)

I just never quite understood where this "don't worry about the implementation" is coming from, as well as the tendency to explain abstractions in general terms, with analogies that make little sense, etc. The "don't worry about the implementation" attitude did so much harm to humanity by producing bloated, wasteful software.

In fact I think a good abstraction is the one (1) whose implementation can be explained in clear terms, and (2) whose benefits are as clear once you know how it's implemented.


I'm similar. I've narrowed down my learning to two things I need:

- Principles for how things fit together. This is similar to your comment about digging into the assembly. Understanding how something is built is one way of determining the principles.

- Understanding of why something is needed. I still remember back to first learning to program and not understanding pointers. I pushed through in reading the book I was using and eventually they talked about how to use it and it finally clicked.


My dad has had trouble understanding classes for decades, but he had mostly stopped programming during that timeframe as well so it wasn't something he was going to put much time into learning. Now, he's returned to programming more regularly but still is having trouble with classes. I figured the big problem for him is exactly what the problem for you was, the hidden this pointer.

I'd started working on manually writing the same code in both C++ and C, but your approach of using something automated is an even better idea. Showing the implicit this pointer isn't hard to do manually, but polymorphism is a bit more of a pain. But I think the best part about using a tool is that he can change the C++ code and see how it affects the emitted C. Being able to tinker with inputs and see how they affect the output is huge when it comes to learning how something works.


Well, there is a difference between understanding "what it does" and "how it does what it does," and conflating the two is often a mistake. I have seen people take complex code apart (e.g. by doing manual macro-expansion), and it was not just a waste of time, it, in fact, hindered their understanding of the framework as a whole.

When learning Git, I enjoyed reading a tutorial that explained it from the bottom up, but, in the end, having been shown, early on, what is merely an implementation detail created cognitive noise that is now hard to get rid of.


I had a similar experience in law school in one of my tax classes. I hit a couple things where I could just not get what the tax code, the IRS regulations, or my textbook were trying to tell me.

I went to the university bookstore and found the textbook section for the university's undergraduate business degree programs, and bought the textbook for an accounting class.

Seeing the coverage of those tax areas from the accounting point of view cleared up what was going on, and then I understood what was going on in the code and regulations.


Learning how double-entry accounting works is both very simple and extremely useful for understanding any finance related topics.


I think, like many programmers (including me), I am "bottom up". I had a hard time understanding virtual methods until I read an example of how they were implemented; then I was able to understand what they were, and then the explanation of why they were useful.

I remember two lessons about networks at school: the first one was "top to bottom" (layer 7 to layer 1) and I understood nothing; then there was another one, bottom up, and I finally understood networks.


>Anyhow, I essentially have a hard time learning how languages work without looking at the assembler coming out of the compiler. I learn languages from the bottom up, but I've never seen tutorials written that way.

Funny, that's similar to how I learned assembly! I wrote some small program, and then used lab equipment to monitor the bits flipping in the PC circuits...


> Years later, Java appears. I couldn't figure that out, either. I thought those variables were value types. Finally, I realized that they were reference types

I will resist replying to this.


Don't resist too hard, I don't want you to have an aneurysm!


This has been my approach to study also. Some people are fine with what some might call "magic" and they never worry about lower-level details. Anyway, if you want an approach from the bottom up you should look at learning a Lisp, especially Common Lisp:

http://clhs.lisp.se/Body/f_disass.htm


When learning ReasonML it really helped to understand what Variants were by seeing the outputted JS code (being familiar with JS).


Yeah, I also always have the need to look at what the compiler does with the code you throw at it.


Sounds like you have a certain mental model of computation, and you can't understand other types of semantics. I suggest playing with a term rewriting language. If you can grok that without mapping it to your existing mental model, then other languages can be viewed through that lens much more easily.


> In video game design there is a saying: Show locked doors before you show a key

This is something I've tried putting into words many times.

I'll try to solve a problem and get to know its challenges deeply. Then a tool is introduced that brings it all together. In these cases, I seem to quickly get a full grasp of the operation and essence of the tool.

I wish education was based around this principle. A bit like what Paul Lockhart advocated for in "A Mathematician's Lament" (https://www.maa.org/external_archive/devlin/LockhartsLament....)


That's because the use case for classes appears when the information needed to understand the program exceeds the programmer's working memory. At some point, you need some way to make something into a black box whose innards you do not need to understand when not working inside the black box. Languages which do this badly do not scale well.

This is a hard problem. We have, at least, structs, classes, objects, traits, modules, and namespaces. There still isn't consensus on how best to do that.

At a higher level of grouping, we have more trouble. This is the level of DLLs, microservices, and remote APIs. We barely have language for talking about that. We have no word for that level of structure. I sometimes refer to that as the "big object" level. Attempts to formalize that level have led to nightmares such as CORBA, XML schemas, and JSON schemas, and a long history of mostly forgotten interface languages.

It's interesting to read that the author likes Python's new typing system. It seems kind of half-baked to have unenforced type declarations. Optional type declarations, sure, but unenforced ones?


> We barely have language for talking about that

The paper "The Power of Interoperability: Why Objects Are Inevitable" uses the term "Service Abstraction": "This terminology emphasizes, following Kay, that objects are not primarily about representing and manipulating data, but are more about providing services in support of higher-level goals." It goes on to give examples of how service abstractions arise in a lot of places, including the Linux kernel.


Author here. I sure would prefer enforced type checking - but a good type checker that runs automatically in the editor and immediately warns me when I inevitably mess up is the next best thing.


> At a higher level of grouping, we have more trouble. This is the level of DLLs, microservices, and remote APIs. We barely have language for talking about that. We have no word for that level of structure. I sometimes refer to that as the "big object" level. Attempts to formalize that level have led to nightmares such as CORBA, XML schemas, and JSON schemas, and a long history of mostly forgotten interface languages.

Here I would like to think TLA+ would help. But it seems it doesn't either.


Python's static type system was kept out of the runtime object values to keep a slow language from getting slower.

Mypy is also not 1.0 yet.

You can still find bugs in advanced use cases i.e. when you are mixing inheritance, generics, data classes...

The new type system is both flawed and also a godsend for python developers.


I do not think JSON schema and XML schema are such nightmares though. They are not that difficult to write most of the time and are quite useful. Maybe they simply are not the best examples for what you want to express.


For the Python typing system: if you mostly code alone, you have the option to force yourself to enforce it, which would benefit the future you.


A modern IDE (I use PyCharm) also makes use of it for both warnings and autocompletion.


Yes, this is one of my pet peeves as well.

And something else that I think is important: don't ask for the key if the door doesn't need to be opened.

The examples in the text about Python and OOP are on point. Yes, people would reinvent classes if they needed to, but the great thing about classes (and type annotations) is that they're optional.

Compare with Java, where you need a 'static void main' method inside a class just to begin. Why? Not even C++ requires that.

By all means help keep things in their lanes, but keep simple things simple.


The sheer irony is that "lite" object systems like Python's and Ruby's are actually purer and more faithful to the ideals of OOP (as envisioned by Alan Kay and those with him and before him) than Java can ever be, with its ugly and unnecessary separation between classes and primitives - a completely irrelevant distraction that is primarily a VM optimization detail, which only compilers and other bytecode producers or consumers should have to know or care about.

In Python, nearly everything is a PyObject: data, code, source code files (which are just modules, which are just objects). It's highly misleading to call only the things you get by calling a class "objects"; it implies they are somehow special, as if dicts and lists and ints, and classes themselves for crying out loud, aren't objects as well.

This is why I cringe so much when people associate OOP with Java bloat; the language is just unimaginably badly designed, and this has nothing to do with OOP. You don't even have to be dynamic like Python and Ruby to be a good OOP system either, although that is a particularly low-friction path pioneered by Smalltalk and Lisp; languages like Scala and Kotlin prove you can be highly static, highly OOP, and beautiful to read and write without the sheer amount of paperwork Java heaps on you.


It's not really an optimization detail, it's because Java doesn't have user-defined value types. The primitives are there because they're value types copied from C.

(Java is a memory-safe variant of Objective-C with all the interesting features removed.)


> Java where you need a 'static void main' method inside a class just to begin

C# too, though starting with C# 9, top-level statements remove the need to write a static Main entry point yourself.


In school, I would always go to the end of the chapter and look at the difficult exercises, think about them for a while and fail to solve them. This gave me the motivation to read the chapter and do the easier tasks. It puts your mind in "the right state", knowing that you are working on a real problem.

Later in life I've discovered a trick that gets me into this state of mind more easily. For any feature, I ask myself: What problem does it solve?


I was thinking about this recently when learning about monads, and that quote hits the nail on the head. It seems to me one of the main reasons so many blog posts try and fail to explain monads is that they try to explain "what" a monad is without the motivation for why we need it. Unfortunately, the "why" is very tied up with the nature of the functional style, laziness, etc. The primary use of a monad is that it allows you to enforce execution order. If you're coming from an imperative language, execution order is a given, so the reason you'd want monads really isn't clear.
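
As a rough illustration in Python (a hand-rolled Maybe-style bind of my own, not how Haskell actually does it): each step can only run once the previous one has produced a usable value, so the chain pins down the order.

  # Minimal "Maybe"-style bind: None short-circuits the rest of the chain,
  # so each step is forced to run after, and depend on, the one before it.
  def bind(value, func):
      return None if value is None else func(value)

  def parse_int(s):    # str -> int, or None on failure
      return int(s) if s.isdigit() else None

  def halve_even(n):   # int -> int, or None if odd
      return n // 2 if n % 2 == 0 else None

  print(bind(bind("42", parse_int), halve_even))   # 21
  print(bind(bind("abc", parse_int), halve_even))  # None; the second step never runs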


But I think that's exactly why we need to learn simple languages first: so that we see the locked doors (the abstraction problem) before we create a key or find one elsewhere. Otherwise, if "class" is baked into the language from the very start, as in Java, you will not appreciate its importance to any extent and will have an overall bad experience...


You can start with the simple features of a language and use them to introduce the more complex features.


Yep, but in that case the language should be bundled with a good tutorial that introduces everything nicely. If a person just looks at the specification and code examples, they end up holding a lot of pointless keys...


> Then a tool is introduced that brings it all together.

I totally agree there and I think it's due to the fact that you've got a mental model of the problem space and your brain can see the empty hole. Once you have a tool of that shape, your brain knows where it goes and how it solves your problem.


This is very close to the image I have in my head. I visualize it as a puzzle: The more pieces you have, the more naturally more pieces come and the easier it becomes to see what you need and what is out of place.


hei, just read this. That's a really cool piece of writing.

It kinda put me in a mood of going over the basic principles of mathematics again, but in this new light.

Any recommendation of content that follows the advice given in the article?

Thanks!


Very quotable principle. Any attribution?


A succinct and level-headed assessment of the programming language landscape. Agree that beginners need to struggle with unwieldy personal projects before " getting" typing and classes.

"The idea is that you can't understand a solution without understanding the problem it solves," great quote!

"But mostly, software is bounded by its creation process: Programmers have limited time to create, and especially limited time to maintain, code." Software is this weird commodity with no marginal cost but some "complexity" cost.

Also, I remember taking a theory of mind course and learning that people whose native language didn't have words for things like other people's minds or consciousness couldn't solve tests on those topics, IIRC. Humans, in general, needed these words in our vocabulary to reason about these things. Just as adding words to our language helped us reason about higher-order concepts, these programming words and features allow us to reason about our code better. Furthermore, words in our language grow organically, just like these programming concepts.


> Also, I remember taking a theory of mind course and learning that people whose native language didn't have words for things like other people's minds or consciousness couldn't solve tests on those topics, IIRC.

Given that most animals can reason about other animals as agents with intentions, I'm not sure what those people would have been unable to answer. This also sounds a bit like the discredited strong version of the Sapir-Whorf hypothesis (linguistic determinism).


> Humans, in general, needed these words in our vocab to reason about these things. Just as adding words in our language helped us reason about higher-order concepts

Except that words for concepts nobody's ever bothered to consider don't just appear in the language. That can only happen when somebody tries to think about the topic and is forced into making up some new words.


One difficult thing to accept in programming language design is that what mathematically is simple, consistent, and straightforward is not simple, consistent, or straightforward to people.

For example, a simple, consistent, and straightforward expression syntax would be RPN (Reverse Polish Notation). But humans dislike it, and much prefer infix with its complicated operator precedences.

The real trick to programming language design is finding those constructs that are simple, consistent, and straightforward for users.


Just a quick word about polish notation:

The only reason humans seem to prefer infix notation is that it is what's taught in school. The first mathematical expression most people get to see is

    1 + 1
There is no reason why someone who was introduced to

    + 1 1
at an early age would find it less "natural" than infix. I think the only reason why infix is the dominant notation is historical: prefix notation relies on getting whitespace right as a delimiter, and that is a lot tougher to do in handwriting. Infix solved this problem by (ab)using the operator as a delimiter.


That is not necessarily true. The first tells me we are starting with 1 apple and adding another one. The second tells me that we are starting with... adding? Adding what? We never start with an action without having an object/subject in mind.


You are still starting from your own learned context. Other languages than English have different word order conventions. As a simple extreme example, in classical Latin the main verb generally is placed last in the sentence.


White space is bad for readability anyways. I don't think that would be doable.


> White space is bad for readability anyways.

Question{Which|of|these|sentences|is|easier|to|read;This|one?}

Or this one in the next line?


I meant in the context of programming.


Ancient texts always seem to have no word breaks, they're just a solid wall of letters.


Some humans dislike RPN.

I have yet to find out whether they really dislike it or just find it unfamiliar.

I am convinced that unfamiliar gets conflated with unintuitive and hard all the time.


I remember back in the 70s in college the great debate was whether the HP-35 calculator with RPN was better or worse than the TI SR-50 with infix notation.

My conclusion was that RPN required less to remember than infix, and with a calculator you had to be very careful to not mess up where you were in punching buttons.

But for a tty, infix wins every time. Like no book publisher ever writes math equations with RPN, unless they are writing about Forth or Lithp or some intermediate code.

I should have known, however, that my remark about RPN not being human would flush out the 3 people for whom it is!


> unfamiliar gets conflated with unintuitive and hard all the time

In a deep sense, unfamiliar and unintuitive/hard can be viewed as the same thing (see e.g. the invariance theorem for Kolmogorov complexity). Hence striving to be "familiar" is still a virtue that it makes sense for a programming language to aspire to (balanced, of course, against other concerns).


Familiarity is a characteristic of the agent.

Intuitiveness/difficulty is a characteristic of the object.


Objects are not intrinsically difficult or intuitive though. Different people will perceive the same thing to have different levels of difficulty or intuitiveness.

Or as Von Neumann said about mathematics (and I think the same is applicable to CS and programming): "You don't understand things. You just get used to them."


'unfamiliar' is 'unintuitive'. You only have an intuition for things that are similar to things you have encountered. Some unfamiliar or unintuitive things might be simpler than familiar things. But, at first, it is often easier to use familiar things, even if they are less simple.

Regarding RPN, I am not convinced that it is actually simpler for a human. In order to understand an RPN expression I try to keep the stack of operands and intermediate results in my ultra-short-term memory. This can easily exceed its typical capacity, whereas when trying to understand an infix expression, I can replace subtrees with an abstraction, i.e. a short name, in my mind. But perhaps I have just not yet seen the light and spent enough time working with RPN expressions.

On the other hand, RPN expressions are certainly far simpler to implement.
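
To give a rough idea of "simpler to implement": a toy postfix evaluator fits in a dozen lines of Python (the operator set and the lack of error handling are my own simplifications).

  # Toy RPN evaluator: one stack, one pass, no precedence or parentheses.
  import operator

  OPS = {"+": operator.add, "-": operator.sub,
         "*": operator.mul, "/": operator.truediv}

  def eval_rpn(expr):
      stack = []
      for tok in expr.split():
          if tok in OPS:
              b, a = stack.pop(), stack.pop()
              stack.append(OPS[tok](a, b))
          else:
              stack.append(float(tok))
      return stack.pop()

  print(eval_rpn("1 2 + 4 *"))  # (1 + 2) * 4 = 12.0

An infix parser needs precedence, associativity and parentheses on top of this.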


I never said RPN is simpler. It is just as hard as infix.

I do not like RPN any better or worse. It takes me about 5 minutes to switch from Lisps to other languages and back. I just put the parens in the wrong place a couple of times and I am done.

Paredit + RPN makes editing slightly more comfy, but that's it. They are the same thing.


I think this is the takeaway as well. I prefer RPN (and I can read and like K, which is considered 'unreadable/read-only', and Lisp, which is also apparently 'hard to read') because I am used to writing a lot of Forths and Forth-likes. I like discovery while programming, and it is quite trivial to stuff a Forth into anything, even when no one has done it before. For instance, 10 years ago, to speed up Xamarin (which was compile/run and quite slow at that), I sped up development by adding a Forth to C# so I could rapidly discover and prototype on my device without having to compile. And I have done that for the past 30 years, starting on my 80s home computer to avoid having to type hex codes and risk more crashes while discovering what I wanted to make.


I agree with your general principle but I do think infix notation is still better because so many equations are a tree of binary operators. Look at how binary trees are drawn - the nodes are in the middle of their children. It just makes sense for binary operators to have their operands on either side.

Otherwise you end up having to maintain some kind of mental stack which is just a bit mentally taxing.


There is something I am missing here.

How does your mental model change that much based on where the operator goes?

Can't you put the node to the right of a column of children, like you would in a piecewise function?

I am dyslexic; maybe that's why I do not see your point.


Prefix and postfix notations permit a great distance between the operands and the operators. If done in a disciplined fashion so that they're merely swapped (at least in the majority of cases), then sure, it's not too different. But consider this: (¬ used for unary negation)

  b ¬ b 2 ^ 4 a c * * - √ + 2 a * /
The two expressions being divided are quite large, so there's no easy way to just move the division one token to the right, it goes all the way to the end. Which two things are being divided at a glance, just point them out. I've even selected a more familiar expression for this exercise that should make it easier. But what about an expression people are less familiar with? Which thing are we calculating the square root of? This is what RPN looks like when entered into a calculator, which is handy and quick for computations (I mean, I have an HP-32S II at my desk next to my laptop, I like it). But even the HP-48G with its graphical equation editor didn't force you to think like this, algebraic notation provides locality of information that makes it much more comprehensible without overburdening your working memory.

And once you start adding parentheses or indentation to convey that information, you're on your way back to a tweaked infix notation. The same for prefix notations. If you can see past the parentheses of Lisp/Scheme it's not too hard to see what's being expressed (using the same symbols as above):

  (/ (+ (¬ b)
        (√ (- (^ b 2)
              (* 4 a c))))
     (* 2 a))
This is more comprehensible, but it's not a strict prefix notation anymore, we've used parentheses and indentation in order to convey information about the grouping. I've even borrowed Lisp's use of variable numbers of arguments to simplify *. If asked, "What is being divided by what", you can, rather quickly, identify the divisor and dividend, at least point them out if not express totally what they do. But a straight prefix notation or postfix notation makes that task very difficult.

You could start naming subexpressions, but then you get into one of the two hard problems in computer science (naming things).

And then we get to the other things people want to do with mathematical expressions, which is largely symbolic manipulation. The present infix notation (or a highly demarcated prefix/postfix notation like the Lisp-y one above) is much better for this task than a strict prefix/postfix notation.


Rust has this problem with its module system. It has top-level "items" that can be functions, types, global variables, …or modules. The way modules are defined and imported is consistent with defining and importing of named structs and functions.

But every new user is totally confused about this (I'm not even sure if my explanation above will be understood), because everyone expects modules to be a special case of imperative file loading operations, rather than declarative syntax creating named namespaced items, like everything else in the language.


D's original symbol table system was completely consistent, extensible, and orthogonal. I explained it till I was blue in the face, and exactly zero people understood it. D now has a significantly more complex one.


Rust has a mod declaration tree and a use import graph.


> But humans dislike, and much prefer infix with its complicated operator precedences

Depends on what you're writing; Gerald Jay Sussman claims that edge cases in mathematical notation (and corner-cutting of sorts in what is written) make physics quite hard to grasp in "The role of programming" [1]; hence the Structure and Interpretation of Classical Mechanics book, which includes many Scheme programs that explicate everything.

[1] https://www.youtube.com/watch?v=arMH5GjBwUQ


> One difficult thing to accept in programming language design is that what mathematically is simple, consistent, and straightforward is not simple, consistent, or straightforward to people.

The problem here is that most 'people' have already learned the non-simple, non-consistent and bent logic of non-mathematical (aka imperative) programming languages. (Almost) everybody has learned the basics of mathematics before programming, so for _everybody_ a term like `x = x + 1` was at first a violation of all three.


> The complexity of Rust's ownership model is not added to an otherwise simple program, it is merely the compiler being extremely pedantic about your code obeying rules it had to obey anyway.

Not exactly. I remember when I learned Rust a few years ago (so maybe things are different today), sometimes the compiler didn't let me do things that should have been perfectly fine. This forced me to write the code in a more complicated way than should have been necessary.

I think this often happens because the compiler's rules don't always match reality. Like if I have a struct S that contains members X and Y, the compiler doesn't let me have two mutable references to that struct even if one of them only changes X and the other one only Y. In reality there is no problem, but the compiler thinks there is, so it forces me to write more complicated code.

I have had this almost exact problem in a program I was writing. The reason I could not just pass X to one function and Y to the other is that one of those functions I had to call was from a library that could only take an S. The solution was that I had to write a macro that extracted Y from S and call that macro every time I wanted to pass Y to a function.


> the compiler doesn't let me have two mutable references to that struct even if one of them only changes X and the other one Y.

The semantics of mutable references and other interface constructs must be independent from actual implementation, to allow for changes in the latter. So Rust is behaving as expected here. If the changes to X and Y are truly independent, you can have a function that returns separate mutable references to both ("splits" the struct). But if your library function takes an S, it might rely on the whole struct and then the changes are no longer truly independent.


> then the changes are no longer truly independent.

But the author of the code is telling you that they are independent. The author may be proved wrong if some future change accidentally triggers dependence but we as humans can have knowledge that the compiler can not have. The compiler sees a possibility of conflict and wants to protect you from yourself. That is a usability issue.


Rust is exploring techniques to allow "the author of the code" to provide finer "separation" semantics without resorting to unsafe - this is what GhostCell, QCell, LCell etc. are all about though their "usability" is still low. There's plenty of ongoing research in this space, so feel free to stay tuned for more!


Fun computer science fact: any type system will disallow some otherwise valid programs and also allow some invalid programs. The trick for type system designers is to find a sweet spot.


I'm curious if this is a provable result or an empirical observation?

(Come to think of it, I'm not even sure what's meant by program validity in this context.)


Programs are proofs, and so this follows from Gödel's incompleteness theorem. But "invalid program" needs defining.


> The solution was that I had to write a macro that extracted Y from S and call that macro every time I wanted to pass Y to a function.

Why does that require a macro?

  struct S { x: X, y: Y }

  fn do_with_y(y: &mut Y) {}
  fn do_with_s(s: &mut S) {}

  fn main_program(mut s: S) {
      do_with_s(&mut s);
      do_with_y(&mut s.y);
      /* ... */
  }

I prefer seeing code like this rather than wrapping it in abstractions and macros/helpers, because it's not particularly verbose and it's very clear where the borrowing happens; you'd see the same in C++ or C, to be honest.


I made a macro because it was more complicated than just S.Y. It was more like `Something.get_stuff().unwrap().things`


The definition of "simple" and "complex" is very vague here.

If we use Clojure's philosophy of simple, Python is by no means a simple language, even without classes. There is a lot of special syntax like `with`, `as`, `elif` and `for .. else`, and concepts such as `nonlocal`, not to mention generators, decorators and lots of other fancy stuff, although it is easy to learn...

I don't agree with the author that static types make things more complicated... Instead, types make things much simpler and easier to reason about. In the Python example, the author indeed likes to reason about statically typed systems...

And as for classes, they are just a leaky abstraction legacy we have to deal with now. "So they would have accidentally re-introduced classes" - definitely not. We can use structs, traits, interfaces or much better means of abstraction than classes, which were introduced with inheritance in mind.


Author here. Indeed, when I tested my idea on a friend, his first response was "But what is complexity? Is J a complex language? What about assembly?". It can get tricky once you get into the weeds. But from the comments here, it looks like most people got what I was trying to express.

W.r.t. whether types make Python more complex - my point was that a type system makes the _language_ more complex, but it makes the _code_ using it easier to reason about. That's the main point of the blog post: be careful trying to simplify a programming language if it leads to the code itself becoming more complex.

And sure, I have no particular attachment to OOP "classes" as such, and I don't think they are necessary. In fact I much prefer Julia's approach. What I meant with classes being inevitable is that 1) structs, or something like structs, are inevitable, and 2) methods, in the sense of functions that are semantically tied to particular structs, are inevitable. In Python, these two things are provided by classes, so in Python, there really is no escaping classes.
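
For example, a hypothetical Python sketch of those two halves, both of which Python happens to route through `class`:

  from dataclasses import dataclass

  @dataclass
  class Point:
      # 1) the "struct" half: named, typed fields
      x: float
      y: float

      # 2) the "method" half: a function semantically tied to the struct
      def norm(self) -> float:
          return (self.x ** 2 + self.y ** 2) ** 0.5

  print(Point(3.0, 4.0).norm())  # 5.0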


The big-versus-small language problem has been debated for quite a long time, as far as I know especially in the Scheme community. There are a lot of insightful discussions around R6RS which you may find interesting.


> If we use Clojure's philosophy of simple, Python is by no means a simple language, even without classes. There are a lot of special syntax like `with` `as` `elif` `for .. else` and concepts such as `nonlocal`, not even mention generators, decorators and lots of fancy stuffs, although it is easy to learn...

Beware of the Turing tarpit where everything is simple but nothing is easy! Python is by no means perfect, but using different syntax when doing different things can make a lot of sense. "Everything is a X" can be a nice mental model, but on the other hand, as we've seen with classes, mixing data structures, structs and namespaces all in one concept is not always the best solution.


Yep, so as I said, the definitions of things like "simple" and "complex" are super vague.

> but using different syntax when doing different things can make a lot of sense.

For C users, yes; if we step back a little, for English users, yes; but for people with other language backgrounds it may not hold. Every piece of added syntax relies on some human priors that may be specific to someone. But some really minimal systems can be grasped by people from any background, albeit over a very long time.


My idea behind "make things that are different look different" doesn't come from language (at least not written language) but from visual media in general. For example, on an industrial machine or in a guide, people will often use lots of tricks to distinguish some things from others. It isn't really syntax, but considering that programming is written text, we don't have much other than syntax to differentiate the parts on a quick read.

I think that may be a bias due to how I read code. I tend to skim it quickly at first, and then read the "more interesting parts" on a second pass. On the other hand, if you read code linearly, syntax may not be as needed. I'd like to see a study on eye movement when reading code and language preference; that may help reveal some patterns (or not).


Static types definitely make the language more complicated. It will have a bigger specification and a more complicated implementation.

You could even argue they make writing programs more complicated. But they definitely also make writing programs much, much easier.

Kind of like how algebra is more complicated than basic arithmetic, but good luck proving Pythagoras's theorem without algebra.


What I meant is that the word "complicated" is vague. Even if you are using a dynamically typed language, it just hides the unavoidable complexity. All the problems with types resurface at runtime. The language by itself allows very complex interactions between types; it's just that the user isn't spelling them out. For me, the behaviour of the language is part of the language's complexity, not only the written part.


>> good luck proving Pythagoras's theorem without algebra.

https://avatars.mds.yandex.net/get-zen_doc/48747/pub_5c9e493...


I don't understand. That uses algebra.


The proof is on the left, and doesn't require algebra. You can use another picture for the right part, like this https://microexcel.ru/wp-content/uploads/2020/05/kvadrat-sum...

(it also uses algebraic notation which can be ignored)


Pythagoras's theorem is literally written using algebra. You can't even write it down without algebra.


algebraic notation != algebra

It can totally be written without "algebra"; it's not the notation that contains the idea, it's only the idea that happens to be often expressed in this notation. I could use any other notation, or even plain English, to express the idea, if I didn't mind the extra verbosity.

For Example:

1. The Berlin Papyrus 6619 (from 2000-1786 BC Egypt) uses prose [1]

2. The Ancient Chinese mathematical text Zhuobi Suanjing uses both prose and a pictorial notation [2]

3. The Baudhāyana Shulbasūtra (from 800-500 BC), a set of mathematical instructions for use in the construction of Vedic fire-altars, uses Sanskrit prose describing geometric constructions using rope [3] [4].

[1] https://en.wikipedia.org/wiki/Berlin_Papyrus_6619#/Connectio...

[2] https://commons.wikimedia.org/wiki/File:Chinese_pythagoras.j...

[3] https://en.wikipedia.org/wiki/Baudhayana_sutras#Pythagorean_...

[4] https://en.wikipedia.org/wiki/Shulba_Sutras#Pythagorean_theo...


> algebraic notation != algebra

Ha where did you get that crazy idea?

> The Berlin Papyrus 6619 (from 2000-1786 BC Egypt) uses prose

Just because it's prose doesn't mean it isn't algebra. For example: "If you square the lengths of the two shorter sides and add them up, it equals the square of the length of the longer side." That's just algebra with words.

The thing that makes algebra algebra is that you use variables, not constants. "the length of the longer side" is just a wordy variable.


Google Einstein’s proof of the Pythagorean theorem. I’ve won quite a few bets with it. :-)


I did, and they were full of algebra. Anyway, as I've said elsewhere, Pythagoras's theorem is an algebraic formula, so you can't even state it without using algebra.

Converting the equation to prose doesn't mean it isn't algebra anymore. The key feature of algebra is the manipulation of variables.


>Encapsulation, modularization, abstraction - let's not pick nits about the precise meaning of these terms, they are all about the same thing, really: Managing complexity of your own code.

The thing is, code complexity can be managed superbly without the concept of a class.

Golang doesn't have classes. C doesn't have classes. But both have structs, and they have functions taking a pointer to, or a value of, a struct type T (what OO calls a "method" for some reason). So we can have "objects", as in custom datatypes, and we can have functions that act on them.

And on top of that, we have modularization in libs and packages.

So, why do we need more complicated languages? "Show a locked door before you show a key" indeed.


> What more do we need?

Good languages with good IDE support?

I'm old enough that at some point assembly and C were my favourite languages, PHP was what I was most productive in, and Python was also something I enjoyed.

These days there's no way I'd go back to them for new projects.

After Java "clicked" for me, and with the introduction of Java generics shortly afterwards, it became my favourite language; later TypeScript and C# were added to that list.

I'm not too dumb, but I prefer using my smarts to develop programs, to work together or even pair program with the language to mold something, not to babysit languages that cannot even help me with trivial mistakes.


Isn't the point that a language with optional typing can help you out with trivial mistakes, if you give it sufficient information? Java makes (made?) you give all the info all the time, even when you do not need its help. Java holds your hand in an annoying way. TypeScript does not, because you can gradually add types to your program, but it is based on the shakier foundations of JS.


Yes, I would agree that structs are the solution to the complexity the author describes. But Python doesn't have structs: if you don't learn about classes you will probably eventually organize some data using dictionaries or heterogeneous tuples.

In another language you might introduce structs first (C and Go only have structs, C++ has both classes and structs but they are the same thing) but in Python you don't have that option - you can use data classes or named tuples but those already need most of the ceremony you introduce for dealing with classes.


If it's useful: Python now has dataclasses; they are very struct-like.
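
A tiny made-up example, for anyone who hasn't seen them:

  from dataclasses import dataclass

  @dataclass(frozen=True)
  class Vec2:
      x: float = 0.0
      y: float = 0.0

  v = Vec2(1.0, 2.0)  # auto-generated __init__, __repr__, __eq__
  print(v)            # Vec2(x=1.0, y=2.0)
  # v.x = 5.0 would raise FrozenInstanceError; frozen=True makes it value-like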


> The thing is, code complexity can be managed superbly without the concept of a class.

It can be managed by modules/namespaces, but without even that (a la C), I really don't see it managing complexity well. The important point of OOP is the visibility modifiers, not "methods on structs", which on their own would provide no added value whatsoever.

What OOP allows is - as mentioned - not having to worry about the underlying details at the call site. E.g., you have a class with some strict invariant that must be upheld. With structs you have to be careful not to change anything by hand and only go through the appropriate function. You can convince yourself that your manual modification is correct for the current implementation, but what classes give you is that the invariant stays enforced even when the class itself is later modified - whereas your hand-rolled call-site code may no longer fulfill the new constraints.
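
A made-up Python example, where the invariant is "the balance never goes negative" and it is enforced at the class's only mutation point:

  class Account:
      """Invariant: balance never goes negative."""

      def __init__(self, balance: int) -> None:
          self._balance = balance

      def withdraw(self, amount: int) -> None:
          # The check lives here, so every caller gets it for free,
          # even if the rule later changes (e.g. an overdraft limit).
          if amount > self._balance:
              raise ValueError("insufficient funds")
          self._balance -= amount

With a bare struct, every call site would have to repeat that check itself, and a later change to the rule silently invalidates the hand-written checks.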


> The important point of OOP is the visibility modifiers,

OOP is not required for implementation hiding. It can be done entirely by convention (eg. names starting with an underscore are internal and not to be used).

Go took this a step further and simply enforces a convention at compile time: any symbol in a package not starting with an uppercase letter is internal and cannot be accessed by consumers of the package.


Well yeah, and types don't need enforcement by compilers either; we can just manually check their usages. Maybe we should go back to Hungarian notation!

I mean, snark aside, I really don't think that variable/field names should be overloaded with this functionality. But I have to agree that this functionality is indeed not much more than the convention you mention.


OO implies far far more than simply grouping data in a struct and passing it by reference.


Such as?

The only other things that come to mind when I think about "OOP" are:

* inheritance, which at this point even lots of proponents of OOP have stopped defending

* encapsulation, which usually gets thrown out the window at some point anyway in sizeable codebases

* design patterns, which usually lead to loads of boilerplate code, and often hide implementation behind abstractions that serve little purpose other than to satisfy a design pattern...the best examples are the countless instances of "Dependency Injection" in situations where there is no choice of types to depend on.


How is encapsulation thrown out the window in sizeable codebases?? It is the single most important thing OOP gives, and is used by the majority of all programmers and has solid empirical evidence for its usefulness.


Could I see some examples of this evidence?

Let's consider a really simple object graph:

    A -> B
    C
A holds a reference to B; C has no reference to other objects. A is responsible for B's state. By the principles of encapsulation, B is part of A's state.

What if it turns out later that C has business with B? It cannot pass a message to A or B. So, do we do this?

    A -> B <- C
Wait, no, we can't do that, because then B would be part of C's state, and we violate the encapsulation. So, we have 2 options:

    1. C -> A -> B
    2. C <- X -> A -> B
Either we make C the god-object for A, or we introduce an abstract object X which holds references to A and C (but not to B, because, encapsulation). Both these implementations are problematic: in 1) A now becomes part of C's state despite C having no business with A, and in 2) we introduce another entity in the codebase that serves no purpose other than as a mediator. And of course, A needs to be changed to accommodate passing the message through to B.

And now a new requirement comes along, and suddenly B needs to be able to pass a message back to C without a prior call from C. B has no reference to A, X or C (because then these would become part of its state). So now we need a mechanism for B to mutate its own state being observed by A, which then mutates its own state to relay the message up to X, which then passes a message to C.

And we haven't even started to talk about error handling yet.

Very quickly, such code becomes incredibly complex, so what often happens in the wild is that people simply do this:

    A -> B <-> C
And at that point, there is no more encapsulation to speak of: B is part of A's state, and B is also part of C's state.


Encapsulation (to lock down an object like this) is used when there's a notion of coherence that makes sense. If B could be standalone, but is used by both A and C, then there's no reason for it to be "owned" by either A or C, only referenced by them (something else controls its lifecycle). Consider an HTTP server embedded in a more complex program. Where should it "live"?

If at the start of the program you decide that the server should be hidden like this:

  main
    ui
      http
    db
And main handles the mediation between db and ui (or they are aware of each other since they're at the same level), but later on you end up with something like this:

  main
    ui
      http
    db
    admin-ui
And the admin-ui has to push all interactions and receive all its interactions via the ui module, then it may make less sense (hypothetical and off the cuff so not a stellar example, I'll confess, but this is the concept). So you move the http portion up a level (or into yet-another-module so that access is still not totally unconstrained):

  main
    ui
    db
    admin-ui
    http
    -- or --
    web-interface
      http
Where `web-interface` or whatever it gets called offers a sufficiently constrained interface to make sense for your application. This movement happens in applications as they evolve over time, an insistence that once-written encapsulation is permanent is foolish. Examine the system, determine the nature of the interaction and relationship between components, and move them as necessary.

Arbitrary encapsulation is incoherent, I've seen plenty of programs that suffer from that. But that doesn't mean that encapsulation itself is an issue (something to be negotiated, but not a problem on its own).


Those are called callbacks; we use those everywhere. If you don't want to use callbacks, you can use something like a publish-subscribe pattern instead, so that B doesn't need to be indirectly linked to X through A and can publish directly to X.
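
A rough sketch of that in Python (a generic in-process event bus of my own, not any particular library; the letters refer to the example above):

  # B publishes without holding references to A, X or C;
  # C subscribes without knowing who publishes.
  from collections import defaultdict

  class Bus:
      def __init__(self):
          self._subscribers = defaultdict(list)

      def subscribe(self, topic, handler):
          self._subscribers[topic].append(handler)

      def publish(self, topic, payload):
          for handler in self._subscribers[topic]:
              handler(payload)

  bus = Bus()
  bus.subscribe("b-changed", lambda msg: print("C saw:", msg))  # C's handler
  bus.publish("b-changed", {"value": 42})                       # B fires the event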


The one thing that is extremely hard to fake well without some sort of language support is dynamic dispatch: the ability to call a method on an object x of type X and have the calling expression automatically "know" which method to invoke depending on the actual runtime type of the object (any subtype of X, any interface implemented by it, etc.).

This is extremely hard to fake in, say, C. I mean, it's not really that hard to fake the common use case demonstrated in classrooms; a rough sketch would be something like:

  enum vehicle_type { CAR, AIRPLANE };

  struct Vehicle {
      enum vehicle_type t;
      union {
          struct car_t      { /* <car-data-and-behavior> */ } car;
          struct airplane_t { /* <airplane-data-and-behavior> */ } airplane;
      } data;
      /* some common data and behavior */
  };

  void func(struct Vehicle *v) {
      switch (v->t) {
          case CAR:      /* <....> */ break;
          case AIRPLANE: /* <....> */ break;
      }
  }

But this is a horrible thing to do. First, it's a repetitive pattern: you're basically working like a human compiler, with a base template of C code that you translate your ideas into - a rich source of bugs and confusion. Second, it's probably hiding tons and tons of bugs: are you really sure that every function that needs to switch on a Vehicle's type actually does so? Correctly? What about the default case for the enum vehicle_type, which should never ever take any value besides the valid two? How do you handle the "this should NEVER happen" case of it actually taking an invalid value? How do you force the handling of this in a consistent way across the code base that always exposes the problem to the calling code in a loud-yet-safe manner?

There's a saying called Greenspun's tenth rule: "Any sufficiently complicated C or Fortran program contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Common Lisp". If you ignore the obnoxious Lisp-worshiping at the surface, this aphorism is trying to say the same thing as the article: there are patterns that appear over and over again in programming, so programming languages try to implement them once and for all. Users can just say "I want dynamic polymorphism on Vehicle and its subtypes Car and Airplane" and voila, it's there, correctly and efficiently and invisibly implemented the first time, with zero effort on your part.
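
For contrast, in a language with dynamic dispatch built in, the whole switch collapses to something like this (a Python sketch of mine, reusing the Vehicle/Car/Airplane names from the C example):

  class Vehicle:
      def describe(self):
          raise NotImplementedError

  class Car(Vehicle):
      def describe(self):
          return "car"

  class Airplane(Vehicle):
      def describe(self):
          return "airplane"

  def func(v: Vehicle):
      print(v.describe())  # the runtime picks the method; no enum tag, no switch

  func(Car())       # car
  func(Airplane())  # airplane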

If you don't want the language to implement those patterns for you, you're *still* going to need to implement them yourself when you need them (and given how common they are, you *will* need them sooner or later), only now haphazardly and repetitively, and while in the middle of doing a thousand other things unrelated to the feature, instead of getting a clean set of generic tools that the language meticulously specified, implemented and tested years before you even wrote your code.

>inheritance, which at this point even lots of proponents of OOP have stopped defending

Not really; there are two things commonly called "inheritance": implementation inheritance and interface inheritance. Implementation inheritance is inheriting all of the parent, both state and behaviour. It's not the work of the devil, you just need to be really careful with it: it represents a full is-a relationship, where literally anything that makes sense on the parent must make sense on the child. Interface inheritance is a more relaxed version of this; it's just inheriting the behaviour of the parent. It represents a can-be-treated-as relationship.

Interface inheritance is widely and wildly used everywhere; every time you use an interface in Go you're using inheritance. And it's not like implementation inheritance is a crime either; it's an extremely useful pattern that the Linux kernel, for instance, Greenspuns in C with quirky use of pointers. And it's still used in moderation in the languages that support it natively; its ignorant worshippers, who advocated for building entire systems as meticulous 18th-century taxonomies of types and subtypes, just got bored and found some other trend to hype.

>encapsulation, which usually gets thrown out the window at some point anyway in sizeable codebases

I don't understand this, like, at all. Encapsulation is having some state private to a class; only code inside the class can view it and (possibly) mutate it. Do you have a different definition than me? How, by the definition above, does encapsulation not matter in sizable code? It was invented precisely for large codebases.

>design patterns, which usually lead to loads of boilerplate code

100% agreement here; OOP design patterns are some of the worst and most ignorant things the software industry ever stumbled upon. It reminds me of pseudoscience: "What's new isn't true, and what's true isn't new". The 'patterns' are either useful but completely obvious and trivial, like the Singleton or the Proxy, or they are non-obvious but entirely ridiculous and convoluted, like the infamous Abstract Factory. This is without even getting into vague atrocities like MVC, of which I swear I can summon 10 conflicting definitions with a single Google search.

The idea of a "design pattern" itself is completely normal and inevitable; the above C code, for example, is a design pattern for generically implementing a (poor man's) dynamic dispatch mechanism in a language that doesn't support one. A design pattern is any repetitive template for doing something that is not immediately obvious in the language; it's given a name for easy reference and recognition among the language community. It doesn't have anything to do with OOP or "clean code" or whatever nonsense clichés get repeated mindlessly and performatively in technical job interviews. OOP hype in the 1990s and early 2000s poisoned the idea with so many falsehoods and so much ignorance that it's hard to discuss the reasonable version of it anymore. But that's not OOP's problem; it's us faulty humans who see a good solution to a problem and run around like children trying to apply it to every other problem.


A language with a short learning curve is like a toolbox that’s nearly empty. You quickly run out of ways it could help you. We should optimize for experts, because that’s where each of us is going to spend most of his career.


Programming languages are not unique to programmers anymore, and not all users will ever become experts.

These days, programming is being taught to more pupils and students than ever, because languages like Python have made it very accessible and easy to learn.

The fact that a non-CS/IT person who has never written code in their life can learn the basics with a language like Python, and build tools that increase productivity in mere weeks, is amazing. I know this, because I've witnessed it multiple times at work.

That would almost certainly never have been the case had we only had languages with steep learning curves. Imagine if someone wanted to make a web scraper in C, and some program that acts on the scraped data. In Python, that's basically under 20 lines of code... in C? Most likely hundreds.


I fully agree with this. I think we're in the middle of a paradigm shift in regards to programming as more and more people enter the field or use programming as an auxiliary skill.

Like with writing/reading and math there was a time where only experts would do these things. There is still quite a substantial difference between a seasoned literary scholar and the average person, but many have access to this skill now and the fundamentals have become normal.

With programming I assume the same will happen. The fundamentals are simple and there are many ways, technological and cultural, to learn and use this skill.

Here's a list of programming types: Web development, systems programming, game engines, data analysis, application/UI scripting, scientific computing, spreadsheets...

I think we have to acknowledge and welcome this change and the people who will newly access this beautiful craft, while at the same time being mindful about what that means for professional identity and assumptions around that issue.


Exactly. I never understood how being able to learn a tool in a couple of hours became something that makes a programming language desirable to use. This is our profession, something we will be doing for a large part of our lives; choose the tool that results in the best output, not the tool you can learn in an afternoon!


Is it possible to optimize too much for experts, and sacrifice learning curve too much?


Yes, as the creator of C++ has said [1] before.

Personally, I'd also distinguish between "complicated over time" and "complicated by default." For example, Clojure has a minimal syntax and instead uses a large vocabulary of terse primitive procedures. If you don't know the primitives, it's "complicated." A different example is Rust, which has many features expressed in syntax. If you haven't learned all of the syntax, it's "complicated." Compare these with C++, which began as a small set of extensions to a stronger-typed C, but has gradually grown into a family of sub-languages, each added for a particular purpose, and all interacting with each other (sometimes beautifully, other times horribly -- knowing the difference is "complicated").

[1]: https://blog.codinghorror.com/the-problem-with-c/


I don't think so, if you assume that learning is a linear thing and at the end of it you improve as a dev.

But the problem is that learning isn't one curve, it's a bunch of them and most of them lead to dead ends or even turn backwards after a while.

If you bloat out a language, what happens is that some subset of devs (like, maybe 10%) write really great code and have all the tools they'll ever need to do so. And the other 90% learn a chaotic mix of good and bad practices from each other, without the ability to properly distinguish them, and eventually reach some local maximum amongst themselves of overcomplicated "average" code that's shittier than I imagine it would be if the language had just been kept simple.


C++ before 2011 springs to mind. From "Direction for ISO C++"[1]: "C++ is expert-friendly, but it cannot be just expert-friendly without losing important streams of new talent. This was the case before C++11 and C++11 reversed that negative trend."

[1]: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p200...


I do rather wish C++ 11 was a blank slate for that reason. Not really, as that'd kill C++, but there's so much good about C++ that is held back by its backwards compatibility with C and the dark ages of pre-11 C++. Data types and features are intended as best practice, but it's stuffed away in libraries since it can't replace the core of the language.


C++ evolution since Rust came along has been amusing. Finally, the C++ designers take safety seriously, now that there's a competitive threat. Until a decade ago, they didn't. I used to argue with some of the committee members over that.

The trouble is, C++ does safety by using templates to paper over the underlying C memory model. This never quite works. The mold always seeps through the wallpaper. Raw pointers are needed for too many APIs.


Languages don't get popular unless they are very easy to learn, so any language popular today is almost surely biased in favor of learning curve and against complexity. We probably need some big industry-wide intervention to reach the optimal level of complexity in languages, as I doubt even C++ is too complex to be optimal if you consider that a career spans 40 years. It would work by having just a couple of standard languages, so developers don't need to learn so many; they spend a lot of time learning those until they are fluent, and then they don't have to learn anything new about programming languages for the rest of their lives. Very efficient.


> It would work by just having a couple of standard languages so developers don't need to learn so many languages, then they spend a lot of time learning those until they are fluent and then they don't have to learn anything new about programming languages for the rest of their lives, very efficient.

Programming methodology continues to evolve and languages need to address it to stay relevant.


I certainly doubt any extant language actually does it, and also fundamentally there may just not be so much that's worth building into a general-purpose language. The most extreme examples I can think of are something like Haskell (which takes, what, a weekend to be able to write code in?) or maybe APL (which takes, what, maybe two months?). A language that was truly expert-oriented might take as long to learn as human languages do, or longer - certainly, for something you use your whole career, that much up-front effort would be worthwhile. But are there even enough distinct general-purpose constructs to fill that much time?


Try learning Rust, then you reach this point

https://stackoverflow.com/questions/39558633/higher-rank-lif...

I still don't understand what a higher-rank lifetime bound means, and this affects whether I can write code that actually compiles.


When I was writing my comment I thought about Scala, where I was certainly learning useful new techniques 7 years into using the language, but that wasn't really about learning new parts of the language. I think Rust has picked an overly specific way to represent lifetimes, that makes them more of a special case than they should be. But maybe that distinction doesn't actually matter.

(Higher rank lifetime bounds are like higher rank types, it might be easier to understand those first and then understand Rust's weird special case version of them afterwards)


Rust is in this weird place, because it needs to be palatable to C/C++ programmers. C has pointers and a vague notion of their lifetimes, but barely any type system to speak of. That's the lens through which Rust needs to present the problem.


There are certainly languages that test whether this is possible. J comes to mind: https://www.jsoftware.com/#/README

Here's something like the tenth or fifteenth chunk of code I wrote in J. It solves a programming puzzle. I doubt people who have mastered J would call it well-written, but it solves the problem.

  list=:_9]\33 30 10 _6 18 _7 _11 23 _6 16 _19 9 _26 _8 _19 _8 _21 _14 17 12 _14 31 _30 13 _13 19 16 _6 _11 1 17 _12 _4 _7 14 _21 18 _31 34 _22 17 _19 20 24 6 33 _18 17 _15 31 _5 3 27 _3 _18 _20 _18 31 6 4 _2 _12 24 27 14 4 _29 _3 5 _29 8 _12 _15 _7 _23 23 _9 _8 6 8 _12 33 _23 _19 _4 _8 _7 11 _12 31 _20 19 _15 _30 11 32 7 14 _5 _23 18 _32 _2 _31 _7 8 24 16 32 _4 _10 _14 _6 _1 0 23 23 25 0 _23 22 12 28 _27 15 4 _30 _13 _16 _3 _3 _32 _3 27 _31 22 1 26 4 _2 _13 26 17 14 _9 _18 3 _20 _27 _32 _11 27 13 _17 33 _7 19 _32 13 _31 _2 _24 _31 27 _31 _29 15 2 29 _15 33 _18 _23 15 28 0 30 _4 12 _32 _3 34 27 _25 _18 26 1 34 26 _21 _31 _10 _13 _30 _17 _12 _26 31 23 _31 _19 21 _17 _10 2 _23 23 _3 6 0 _3 _32 0 _10 _25 14 _19 9 14 _27 20 15 _5 _27 18 11 _6 24 7 _17 26 20 _31 _25 _25 4 _16 30 33 23 _4 _4 23
  sumOfRows =: ([: +/"1 [: |: ] * [: |: _1 + 2 * [: #: [: i. 2 ^ #)"1 list
  rowSigns =: (;/@:,/@:>)"2 {|:(_1;_1 1;1) {~ (_1&<+0&<) sumOfRows
  candidates =:,/((_4]\4#"3 (_1 + 2 * #:i.2^9) (* &. |:)" 1 _ list)  * (>rowSigns))
  theAnswers =: (] #~ [: (0&<@:+/@:(0&<) * */@:(_1&<)) [: |: +/"2) candidates
  theSums =: +/"1 +/"1 theAnswers
  theBestAnswer =: ~. ((theSums = ] theBestSum =: <./ theSums) # theAnswers)
In case anyone wants an explanation: https://gcanyon.wordpress.com/2009/10/30/a-solution-to-einav...


Haskell comes to mind.

In all seriousness, not really. Learning curve doesn't matter for a language that will pay your bills for the next twenty years. And once you fully internalize a JVM/.NET-level platform or a Scala/C++-level language, a lot of that will be reusable in learning others.


Yes, if nothing else because the problems that are solved shifts over time, as do preferred solutions. So if it is too optimized, there is a risk of churn.


If you can't retain any of it, or the project is short-term, I suppose. Probably nobody could learn a new language just for a hackathon.


Then they are quite bad developers, because that is exactly what a team of ours did in a two-day hackathon.

Some learned C++ on the Arduino, some went with TypeScript and Angular, others with JavaScript and AWS Lambdas; together we got a robot arm to pick and drop pieces, with a web dashboard and remote control, all talking with each other via the "cloud".

Naturally all of them had several years of coding experience, but in other languages.


It's definitely going to vary according to your familiarity with other languages in the family.

I came into a C# shop late last year from never touching C# before and was writing more or less idiomatic C# very quickly, but I've worked with lots of semi-colon languages so this is mostly familiar territory.

On the other hand if you've never seen a Lisp before, ten years of C++ and VisualBasic won't prepare you to get anything much done in Scheme. Your reflexes are all wrong, and that's going to take some unwinding before you're productive.

Also, people will hate it if you oblige them to use language A they don't know and which is poorly suited to the problem when they know language B that's well suited. Even if you've got a sane business rationale (e.g. bus factor, they're the only person who knows B in your organisation) they're going to spend a lot of time moaning about how terrible it is at this job they could do better.

I do not like C++ but I'm immediately dubious about whether I'd rather try to control a robot arm from Javascript. Is there an option where I just have my foot surgically removed by people who know what they're doing? I guess maybe if the robot manages most of this itself and I'm really only overseeing it the Javascript is less awful, but if there's an actual real time control loop I feel like I'm very much between a rock and a hard place.


It was a hackathon, so all of that was throwaway code anyway.


> A language with a short learning curve is like a toolbox that’s nearly empty. You quickly run out of ways it could help you.

Very quotable!

That's why I like it when languages allow you to start writing simple code and gradually make it more complex. Haskell for example is not like that. Python does better here. I think Scala is one of the best languages in that regard.


> Haskell for example is not like that. Python does better here. I think Scala is one of the best languages in that regard.

This seems ripe for getting a lot of passionate anecdotes out of the woodwork.


Ha or not. Looks like I was wrong.


Not enough attention maybe? :)

Also, I like Haskell. It's just that the learning curve is much steeper at the beginning. This also has an advantage, in that you are more likely to find experienced people in a project and don't have to deal with different styles within a project. The drawback is that it's harder to learn while being productive at the same time.


One trend with complicated languages, though, is poor scalability. They don't compile fast, tend to have slow iteration speeds, and don't scale well with large teams or codebases. You see this with C++, Rust, Haskell, Scala and Swift.


I'm curious to hear where you've seen Rust not scaling to large teams or codebases?


Generate 1 million lines of stupid repetitive code that varies slightly with a 100-line script and you'll see it happen quickly. Same issues with Swift, and even Kotlin to some extent. I don't have specific experience with Rust, but I've run into this with other people who use both languages. My personal experience is with Swift.

If 1 million lines sounds crazy to you, you just need 200 engineers working on one project for a year or two.


I'd also be interested in a Haskell example.


Generally this problem arises because someone went crazy with Template Haskell or generics.


What problem? I'm looking for an example of where poor scalability has been a problem in a Haskell codebase.


A learning curve has two dimensions; a short but steep learning curve can actually achieve a lot.

And that's why we call it a curve instead of "learning time" or "learning content", which are two different dimensions.


How does a short learning curve necessarily correlate to a "nearly empty" toolbox? That seems like a fallacy to me.

Ruby is fairly easy to learn. Does that mean it has a nearly empty toolbox? No it doesn't.

A language with a GC is easier to learn than one without a GC. Does the one with a GC have less in the toolbox? What if it also allows opting out of GC?

ObjC is considered a hard language to learn, esp for people used to C++ and Java. Does this come from it having more tools in the toolbox?

"A language with a short learning curve is like a toolbox that’s nearly empty" is a nice quote, but it also objectively seems to be wrong.


> Ruby is fairly easy to learn

No, it's not. Ruby as a language is about as complex as they come.

It is easy to get to a productive state though, but that is not the same thing as having learned the language. Length of learning curve and accessibility are vastly different things.


Getting productive easily is pretty much the definition of not having a steep learning curve. So I have difficulties making sense of your argument.


yxhuvud referred to length of the learning curve, not slope of the learning curve.

Even if getting started is easy (no steep slope), mastery can take long.


Python is the same. It's easy to write simple code, but once you get deep enough, the skeletons come out of the closet.


I would assume that if you have a toolbox with only a small number of tools, you would spend some of your time using that set of tools to make other tools that extend your abilities.

I know that in earlier times, before the internet, when programming languages were more limited and access to library repositories was not possible, most experienced programmers built up their own libraries of functions that they would use and add to as required.

These days, we still have these extensions to the languages, but they are shared in repos and are called shared libraries or frameworks.


> if you have a toolbox with only a small number of tools, you would spend some of your time using that set of tools to make other tools that extend your abilities

There is a bigger issue: we are much worse than that. We iterate through workshops (even the ones we built ourselves), and our products grow out of them; they never leave the workshop they were made in. As a result, when you come to a new place, the workshop itself is unlike anything you’ve worked in before, and the product is unlike anything your previous toolkits apply to. It doesn’t really matter how easy it is to bring all your toolboxes with you, unless it’s a brand-new empty place. That’s why agreeing on a rich-from-the-start workshop is important.


I don’t disagree but there’s also needless confusion brought on by syntax like the C++ pure virtual function syntax (=0;) that encourages mental models that don’t map 1:1 with what’s really going on.


At the same time, a language with built-in abstractions that are not fitting the problem at hand will force its users to spend time working around the bad abstractions, which will be messier than not having the abstractions in the first place. More churn can definitely be expected in a complex language.


What do you have in mind? I think the language providing an abstraction doesn’t mean in most cases that it has to be used — if classes are not a good fit for this specific thing, just don’t use them. So that would still boil down to inexperienced developers not using proper abstractions - which can happen (with smaller blast radius) in less expressive languages as well.


The toolbox analogy breaks down when you realize that every language feature interacts with every other language feature. So it's more like a toolbelt that you have to wear. Hopefully, the tools aren't so plentiful and heavy that you spend all your energy dragging it along!


Most software is written and maintained by non-experts. Consequently you should optimize for time-to-onboard. Good tools are intuitive and easy to master; a hammer is basically better than a scanning electron microscope.


Are you talking about the languages themselves, or the languages plus their ecosystems (of libraries, design idioms, etc.)? Because if you're just talking about the PL itself, I'd really disagree


I think Excel shows us that you can do a lot with a beginner friendly toolbox.


Your point is well made, but in the case of Excel, the real question is should you do a lot with it?


Given their example about classes and "don't show the key before showing the lock", it's ironic the writer pans languages aiming to be "simpler" without understanding the kind of complexity they're actually avoiding. Zig, Go, (and Java and C# and other not as sexy and modern examples) are avoiding C++'s complexity - features with complicated interactions between each other that can sink your program without you being aware at all.

My favourite example is how the integer promotion rules make multiplication of unsigned shorts undefined behaviour, even though unsigned integral numbers have well-defined wrap-around arithmetic. Going by [1], we have the following rules:

1. Usual arithmetic conversions: The arguments of the following arithmetic operators undergo implicit conversions for the purpose of obtaining the common real type, which is the type in which the calculation is performed: (...) binary arithmetic: *, /, %, +, -; (...) Both operands undergo integer promotions.

2. If int can represent the entire range of values of the original type (or the range of values of the original bit field), the value is converted to type int

3. (added after the above rules existed for the purposes of better optimizations) If the result of an arithmetic operation on two ints over- or underflows, that is Undefined Behaviour.

4. 0xFFFF * 0xFFFF = 0xFFFE0001 > INT_MAX

5. Therefore, two unsigned short values cannot be safely multiplied and must be explicitly cast to "unsigned int" each time.

[1] https://en.cppreference.com/w/c/language/conversion


The problem with implicit wrap-around of unsigned integers is that it makes your program slower. The underlying hardware with 2's complement does not need to pay the cost of reserving the overflow space or shifting/masking it out.

Furthermore, it's more consistent if signed and unsigned behave identically than having to work out a number's type through the integer promotion rules.

If you prefer the current C/C++ way of less performance and inconsistency, that is fine.


And now I see that HackerNews ate my multiplication symbols. What I meant to type is that

    unsigned short a = 0xFFFF;
    unsigned short b = a;
    unsigned short c = a * b;
is currently undefined behaviour on platforms where int has more bits than short (like x64 and arm) due to the interaction between the integer promotion rules and undefined integer overflow.


Rust here says OK, we'll define Mul (the * operator) for the same type (and for references to that type) so

  let a: u16 = 0xFFFF;
  let b: u16 = a;
  let c: u16 = a * b;
... is going to overflow. Because 0xFFFF is a constant, Rust actually detects that at compile time, so this code, which says "silently do overflowing arithmetic", er, no, it doesn't compile. However, if you arrived at the same values via a black box or I/O, Rust doesn't know at compile time that this will overflow; in a Debug build it'll panic, and in a Release build it does the same thing as:

  let c: u16 = a.wrapping_mul(b);
Because the latter is not an overflow (it's defined to do this), you can write that even in the constant case; it's just 1, the multiplication evaporates, and c = 1.

In C++, if you insist on the evaluation happening at compile time, UB is not allowed there, so you get a compiler error like in Rust.


I don't like Rust's approach, but it is better than C's. Rust should either commit to wraparound or make the default int type support arbitrary values.

In C, the problem isn't the silent wraparound, the problem is that when the compiler sees that expression, it will assume that the resulting value is less than INT_MAX, and optimise accordingly. The other insidious problem is that wraparound is defined for other unsigned arithmetic, so a programmer that hasn't had this explained to them, or read the standard very carefully, would quite easily assume that arithmetic on unsigned short values is just as safe as it is for unsigned char, int or long, which is not the case.


I understand why you don't like C's behaviour here.

> Rust should either commit to wraparound or make the default int type support arbitrary values.

Committing to wrapping arithmetic everywhere just loses the ability to flag mistakes. Rust today has Wrapping<u32> and so on for people who know they want wrapped arithmetic. I'd guess there's a bunch of Wrapping<u8> out there, some Wrapping<i8> and maybe some Wrapping<i16>, but I doubt any large types see much practical use, because programmers rarely actually want wrapping arithmetic.

The mistakes are real; they are why (thanks to whoever told me about this) C++ UBSAN in LLVM can actually flag unsigned overflow even though that's not actually Undefined Behaviour. Because you almost certainly weren't expecting your "file offset" variable to wrap back to zero after adding to it.

For performance reasons your other preference isn't likely in Rust either. Type inference is not going to let you say "I don't care" and have BigNums in the same code where wrapping is most dangerous.

We can and should teach programmers to write checked arithmetic where that's what they meant, and Rust supports that approach much better than say C++. Also the most serious place people get tripped up is Wrangling Untrusted File Formats and you should use WUFFS to do that Safely.


As someone who strongly dislikes complicated languages, it's obvious to me that it's a matter of personal preference; personal aesthetics, if you like. If the choice is between complexity in the language and elsewhere, often I'd rather it be elsewhere. But while that preference is entirely subjective, what isn't subjective is the distribution of that preference among programmers. I think that we can say with as much certainty as we can about these things that the answer is clear: the vast majority of programmers prefer simpler languages. So much so that while language simplicity is by no means a sufficient condition for high popularity, it seems that it's a necessary one.

But I'll go further. My gut feeling, which is by no means even remotely scientific, is that those who prefer simpler languages outnumber those who prefer complicated ones 9:1. Moreover, my feeling is that the ratio is higher at the low end of programmer capability than at the higher end — and it makes sense, as more capable programmers can handle more complexity, so some — still a minority, but a larger one — prefer moving more of it to the language.


I’ve seen many of your comments (sorry, I’m not stalking you, I just find both your work on Loom and your fresh take on some CS problems really interesting) regarding languages, and would like to know what your ideal language would look like. I know that you are quite fond of Zig’s simplicity, especially the way it replaced many complex features of C++ and Rust with a single comptime feature. I would wager you also think of Java as a simple language, and I remember seeing Clojure as a positive example somewhere as well.

You also seem to suggest that a “high level low level” language is not really possible (the trap C++ and Rust fall into) due to low-level details always seeping through, making refactors much slower. So my question is: what would the ideal low-level and high-level language look like, in your opinion? Regarding the high-level languages, do you find Java’s Valhalla plans good? I personally really like the semantic-only distinction of basically loss-of-identity and loss-of-tearing, giving ample grounds for optimizations behind the scenes, and over time (in part thanks to some of your comments) I’ve grown to really like the simplicity a language with simple classes and methods can bring.


> What would the ideal low-level and high-level language look like, in your opinion?

I really don't know. I didn't know about the utility of affine types for memory safety in a low-level language before Rust, and I didn't know how much could be done with just comptime before Zig, so I certainly can't say what would be ideal for me. In terms of values, I would like a very simple language that helps write correct code and has fast compilation. None of the low-level languages I know about fully meet those goals, but so far Zig comes closest. Then again, it's easy to look good before you're actually out there, so we'll see.

> Regarding the high-level languages, do you find Java’s Valhalla plans good?

TBH, I haven't been following Valhalla closely, but knowing the people involved, I have the utmost confidence in what they'll do.


I don’t know. It seems to me the progression is “I like simple languages.” To “I like how expressive I can be. Look at this elegant one line program I wrote.” To “I like simple languages.”

Anecdotally, as I’ve gained experience, I and most people I know have only grown in valuing simplicity. It’s one reason the average Clojure developer is older.


Funny to see this

> As someone who strongly dislikes complicated languages

while

> Working on OpenJDK at Oracle

as if Java was uncomplicated…


Everything is relative. Java is more complicated than Clojure and Python, but it's less complicated than C#, Scala, and Kotlin, and no more complicated than Swift.


In case anyone hasn't seen this most excellent talk on this topic, I highly recommend: Guy Steele - Growing a Language (OOPSLA '98)

https://www.youtube.com/watch?v=lw6TaiXzHAE


One thing that isn't often mentioned is that complicated programming languages nearly always come with complicated tooling. C++ has CMake, which is self-explanatory, and even Cargo is pretty complicated. I'm still not sure what a crate is vs. a package, and the number of times I've had to look up the bindgen tutorial, and the number of steps involved, is too much. This isn't even to mention compile times, debug-mode executable speed, etc. This isn't the complexity moving from the source code to the tooling; this is complexity created in the tooling itself, required to manage the complexity created by the language.


And they attract programmers who love complicated code. Of course they don’t actually call it complicated; in fact they see it as elegant and state of the art.

People have figured this out about C++, but they’re still in denial about Rust. Notice how, in the article, the borrow checker is presented as the best kind of feature: one which emerged to handle an existing problem.

In reality the borrow checker is a hammer which forces all code to look like something the borrow checker can nail down. Rust is similar to C++, and it looks like it will be used outside low-level or systems programming; in those cases one could gladly give up, e.g., manual memory management, obsessively avoiding copying, and having to specify how wide an integer is, at least in parts of the code.


As a non-professional and, in essence, beginner programmer who likes to learn about programming and writes simple to moderately complicated scripts, I have to say I am currently in the mindset of not needing classes.

In fact, during the first part of the article I was like: yeah, no need for classes there, a function is way simpler.

Now, as I say, I write fairly simple scripts; I mostly automate manual processes. But even after reading the article I'm still not sold on classes. I have to admit to using dictionaries/associative arrays as the author showed, and I have often wondered about the difference between classes and objects. But I still don't see the problem.

I know my lack of knowledge and understanding is showing here; I'm no expert and have never written any massive software.

Can someone point me to an example where a class is more suitable than a function working on an object like an associative array? Genuine question from the ignorant.


It's mostly a matter of scale. In some sense, classes are just "functions working on an object like an associative array": x.y(z) is a fancy way of writing y(x, z), and Python even makes that explicit. The point is they're designed to keep those functions in the same place and not leaking all over the code, to keep similar utility and data in the same place.
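
A tiny sketch of that equivalence (the Dog class here is just an illustration, not anything from the article):

  class Dog:
      def bark(self, volume):
          return f"woof at volume {volume}"
  d = Dog()
  # These two calls are equivalent; the second just passes the hidden
  # 'self' argument explicitly.
  print(d.bark(11))
  print(Dog.bark(d, 11))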

If you are writing small scripts, it really doesn't matter. If you have a million lines of code and thousands upon thousands of source files, any structure you can impose on it makes a big difference.

That said, OOP is just one paradigm of imposing structure.


> Can someone point me to an example where a class is more suitable than a function working on an object like an associative array?

I/O or resources.

Any time you're interacting with a stateful system like an API or a database, there's inevitable setup and maintenance to keep access to that resource sound. Additionally, you rarely care how that resource is kept coherent, you just need to access it at parts of your program. Without a class or some system that keeps some bookkeeping data around with methods, every function has to have that bookkeeping state passed in as a handle, and all of a sudden all functions need to know something about how setup works, as opposed to what it does. This is a rare instance where namespaces + globals relative to those namespaces reduce complexity; even languages like Clojure have DI "objects" for resource access, and lots of languages that have IO monads are really a data structure over some state + functions.

This setup of objects as things you can't interact with directly--aka not data--is the original one in the literature. Once upon a time you needed to interact with a device driver or a mouse or a screen or a hard drive, and logically encapsulating that device was useful. You'd then interact via "messages" dispatched to the object which are effectively what methods are these days (the term "dispatch" is still used in the literature). It also makes mocking out that physical real-world thing easier.

Notably, objects never were meant as a way to represent data. I'm one of those functional programming weirdos that likes most things to be immutable, and I have simple stateless functions in an enclosing namespace. I pass simple structs (or case classes) of data around in those functions and it works well for me. But the I/O or externally interfacing parts of my program are always in specific places in my program, and they always use dependency injection, and it's always obvious what they are. They look like a sandwich where those objects are at the edges of the program (APIs/ingress and database/egress), but the meat of the program is in the inside and is boring pure functions.


> Without a class or some system that keeps some bookkeeping data around with methods, every function has to have that bookkeeping state passed in as a handle, and all of a sudden all functions need to know something about how setup works, as opposed to what it does.

Classes make so much sense for me in this space, that I can hardly see why people would defend not using them.

In addition to your points, having things in classes with clear interfaces makes it possible to easily integrate new underlying functionality without changing the contract. For example, today you only store things on S3 and have a nice class with store methods and so on; tomorrow you can just add a different storage backend. How would you do that using functions only, if they also need to take care of setup? A nightmare.
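
A hedged sketch of that idea; LocalStorage, S3Storage and archive are illustrative names, and the put_object call assumes a boto3-style client (treat it as pseudocode otherwise):

  class LocalStorage:
      def __init__(self, root):
          self.root = root
      def store(self, key, data):
          with open(f"{self.root}/{key}", "wb") as f:
              f.write(data)
  class S3Storage:
      def __init__(self, client, bucket):
          self.client = client      # setup/bookkeeping lives in the object
          self.bucket = bucket
      def store(self, key, data):
          # assumes a boto3-style client passed in by the caller
          self.client.put_object(Bucket=self.bucket, Key=key, Body=data)
  def archive(storage, key, data):
      storage.store(key, data)      # callers never care which backend it is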


> Can someone point me to an example where a class is more suitable than a function working on an object like an associative array? Genuine question from the ignorant

1) A class is basically a contract known at compile time. If there's a mistake (for example, you're trying to reference a non-existing property), it can be caught before the program is even run, saving time.

2) A class often acts as a namespace/facade, i.e. you encapsulate a piece of logic inside a class, and its users only see a simple-to-use interface, they don't have to know about implementation details (such as what dictionary must have what keys).

3) A class can be made self-validating (see the sketch after this list). For example, there can be a rule "price can't be negative". If we have a proper "Price" object, it can validate this rule itself whenever it's constructed or updated. Thanks to data encapsulation, we can prevent users from updating the price value directly; they can only ask the object to update itself (and it can refuse to do so!) -- making sure it can never be below zero. If you use free-standing functions on associative arrays, you can forget to validate certain rules in some places, leading to data corruption.

4) Classes allow polymorphic behavior, i.e. the implementation can be changed dynamically based on the current context.

5) If you follow domain-driven design, it's easier to discuss and model new requirements with "domain experts" (not necessarily programmers) using the concept of "objects", with well-established properties and behavior attached to them, rather than some ad hoc associative arrays and functions.

Not all of this is unique to OOP, of course. OOP is just one of the tools.
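
A minimal sketch of such a self-validating value object, with illustrative names only:

  class Price:
      def __init__(self, amount):
          if amount < 0:
              raise ValueError("price can't be negative")
          self._amount = amount          # "private" by convention
      @property
      def amount(self):
          return self._amount            # readable, but not directly writable
      def discounted(self, fraction):
          # returns a new, still-valid Price instead of mutating in place
          return Price(self._amount * (1 - fraction))
  print(Price(100).discounted(0.25).amount)    # 75.0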


There probably isn't a single example. But "quantity is a quality in itself". Classes sort of evolve when you need to manage dozens or hundreds of those associative arrays and don't want to keep in your head exactly what keys are in a given array. And when you have a dozen arrays and a hundred functions, which functions go with which array.

In itself every element is easy, but together they are hard.


I expected one of the other answers to encapsulate how I feel about this, but they didn't, so here goes:

You have probably noticed in your "fairly simple scripts" that your language has some notion of types. Your variables are one type or another type, in some languages they can change type and in others they mustn't, but likely the language has the notion that, for example, while dividing 9 by 3 is a reasonable thing to do, dividing your array of business departments by the name of a CSV file is not.

Now, these types aren't like the elements (whether classical or chemical); they were chosen by the language authors. There are probably a handful of obvious ones: probably strings like "Clown" and numbers like 19.03 are different types, your dictionary isn't a number, and if your scripts are allowed to access files, probably a file "Handle" or equivalent isn't the same kind of thing as a dictionary.

So, in this sense a class is merely a type that you can define in your program.

This has any number of useful properties. For example, suppose you deal with a lot of Social Security Numbers. If you had an SSN class, it'd be clearer that, while SSNs might well be "numbers" in some sense, you can't go around summing them up; any place where that seems to be happening is a bug waiting to happen, and as a class there's no reason it should even be possible to do arithmetic on SSNs. Your chosen language might provide easy ways to say e.g. "This dictionary is for numbers, don't put anything else in it", and chances are if that's available you can likewise say "Only for SSNs", and now there's no risk you accidentally treat the total number of people processed as a Social Security Number.
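
A small sketch of the SSN idea; Python enforces this at runtime rather than compile time, but the point is the same:

  class SSN:
      def __init__(self, digits):
          if len(digits) != 9 or not digits.isdigit():
              raise ValueError("not a valid SSN")
          self._digits = digits
      def __str__(self):
          d = self._digits
          return f"{d[:3]}-{d[3:5]}-{d[5:]}"
  total = 0
  for person in [SSN("123456789"), SSN("987654321")]:
      total += 1            # counting people is fine
      # total += person     # TypeError: SSNs aren't summable numbers
  print(total, SSN("123456789"))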

Does that justify classes as an idea to you?

The specific word "class" is associated with a particular approach to programming called Object Oriented Programming, which is much less popular today than it was at the end of last century, but this general idea of being able to make types for your application is more general.


Example from the article is a nice one.

Now you want to make hundreds of different dogs in your program and you want to make them bark at each other.

You want to loop over all the dogs and feed them.

You can make an infinite loop that checks all the dogs and, if one is hungry, calls Feed, and if one is not, then not.

Learn the difference between an Object and a Class. You can create objects and let them live in your program.


That's not very convincing. Why can't a "dog" be a perfectly ordinary data structure that you manipulate the usual way?

  while True:
      for dog in dogs:
          if dog["hungry"] == True:
              dog["lastfed"] = time()
              dog["hungry"] = False
Okay, I hear you say, but the fact that a dog is an associative array is an implementation detail. Supposing I want to abstract that - don't I need objects then? No! Just define some functions:

  while True:
      for dog in dogs:
          if is_hungry(dog) == True:
              update_last_fed(dog)
              satiate_dog(dog)
So you end up with an abstract type (dog) and a bunch of functions that operate on that abstract type. Is there any advantage to "gluing" the functions to the data? Well yes, but it's subtle. Imagine you had two types of animal:

  while True:
      for pet in pets:
          if species_of(pet) == "dog":
              if is_dog_hungry(pet) == True:
                  update_last_fed_dogfood(pet)
                  satiate_dog(pet)
          elif species_of(pet) == "canary":
              if is_canary_hungry(pet)
              ...etc etc ad nauseam

With objects, you can make this elegant with pet.satiate() etc. Note however that this problem goes away with multiple dispatch - there's no reason why the language couldn't be designed so that satiate(pet) called different functions contingent on the type of pet. We already do this for math! Adding two floats is mechanically distinct from adding two ints, but it's a rare language that makes the programmer use different functions.
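
For what it's worth, Python ships a limited form of this: functools.singledispatch picks an implementation based on the type of the first argument (single dispatch rather than full multiple dispatch, but the same flavour). A rough sketch:

  from functools import singledispatch
  from dataclasses import dataclass
  @dataclass
  class Dog:
      name: str
  @dataclass
  class Canary:
      name: str
  @singledispatch
  def satiate(pet):
      raise TypeError(f"don't know how to feed {pet!r}")
  @satiate.register
  def _(pet: Dog):
      print(f"{pet.name} gets dog food")
  @satiate.register
  def _(pet: Canary):
      print(f"{pet.name} gets seeds")
  for pet in [Dog("Rex"), Canary("Tweety")]:
      satiate(pet)          # dispatches on the runtime type of 'pet'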


If you don't have any imagination and you treat dogs as database entries, then yes, you might do just that. If you are the only person writing and using that code, the same.

You did not address interaction between dogs. Another idea: how do you pass dogs into a playpen, and how would they interact? Maybe you want different playpens you can send your dogs into, some playpens having food, some not, and different types of food. What if some other developer has to write code to heal the dog? If the function behind dog.Heal() lives in some different module that you have never heard of, how do you find it?

You don't want each playpen to duplicate "if species_of(pet)" checks when a pet can act on its own.

Writing code here doesn't seem that useful, because you would really have to imagine the possibilities and see how much code goes into wiring up such a world, and then see how having methods on objects helps organize the code and, in the end, prevents whole classes of errors.

Building all of that as an example is just too much work for a one-off discussion. Just imagine how much complexity you can add to a simple example and soon you will notice.


Well respectfully, you provided the original project spec as an example of where objects were required, and I demonstrated that no, they mostly weren't. I agree that it's unproductive for you to keep adding things to the spec so that I can demonstrate that nope, you still don't need objects to do all that.

All that objects are, are functions glued to data. That's it. Whether or not you choose to glue them together, the behavior of the dog lies in the functions, and the state of the dog lies in the data. Frankly, your examples seem to lead away from objects - how will another developer add the ability to heal dogs, for instance? They can't change the class definition, so they'll have to do something awful like subclass it into HealableDog(), which doesn't interoperate with all the existing dog code at all. Meanwhile, if 'dog' is simply a big ol' struct of data, they can write a function heal(dog) and call it a day.

I highly recommend the first part of this video, which explains the composability problems with objects very well - with dogs and cats even! https://www.youtube.com/watch?v=kc9HwsxE1OY


The link to the YouTube video was really nice.

I would not agree that one cannot change the class definition, as it really depends on the system one is writing. I change class definitions on a daily basis in line-of-business applications. If I were making frameworks then maybe I would not do that so often.


Make Dog implement Healable, duh.


The original author can do that. Another author cannot add that functionality later, as a library. They can only define a new kind of dog.


Subclass Dog into MyDog, which is exactly the same as Dog. From there, use and modify MyDog as though you were modifying Dog. Any class that references the base class Dog isn't modifiable, but in any future work you can use MyDog instances to get all the functionality you need (with some ugly casting). There are other solutions as well.


> Why can't a "dog" be a perfectly ordinary data structure that you manipulate the usual way?

In statically typed languages, it's easier to read data out of a struct with fixed types than out of a map[string]any where you convert to the expected type on each access.

> So you end up with an abstract type (dog) and a bunch of functions that operate on that abstract type.

C has no multiple dispatch and no classes. So libraries based around opaque objects (GTK and ALSA etc.) prefix each function name with its struct name, which is verbose and gets tedious to read or write after a while. Is multiple dispatch possible in a statically typed language with compile-time nonvirtual function resolution? I haven't looked into it; it might be possible.


> Is there any advantage to "gluing" the functions to the data?

Depends on the capabilities of the language. If the language supports function overloading and type-checking, then no.

Otherwise the "fun" begins if you add different variations of the same basic data that require subtle changes for working with them.

In the case of Python you'd end up with a manual type-checking pattern and nested if-else statements within the functions. In the OO case you can leverage protocols, composition and (to a lesser extent) inheritance.
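
As a concrete illustration of the "protocols" route in Python, typing.Protocol gives you a checked interface without the manual if/else chain; a rough sketch with made-up names:

  from typing import Protocol
  class Feedable(Protocol):
      def satiate(self) -> None: ...
  def feed_all(pets: list[Feedable]) -> None:
      for pet in pets:
          pet.satiate()     # no isinstance() chain on concrete types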


I can't reply to everyone so thanks to you for your replies.

My take is that for simple stuff I don't need classes. An example: I'm writing a simple GUI calculator for my brother, who's a stone mason. It's a stone calculator that accepts quality, height, width and length measurements in mm and outputs a table with volume and costs in either metres or feet cubed.

I could model it as a job class that accepts a stone list class, plus customer name, cubed cost, and tax parameters, and methods to get total costs, taxes, etc.

The stone list class is a list of stone classes. Each stone class has quality, height, width, length parameters and methods to get cost, get taxes, etc.

But all I do is store the measurements in a list, each list (stone) is stored in another list, and I iterate. Calculations are done on adding a stone, or when you recalculate the whole list. I know it's not a big, complicated, or impressive example, but for me it was easier to implement it classless.


A file is a class of objects you probably perform operations on with your scripting. You could instead use some kind of function to change, say, the entries in the disk index used to map sectors, but using a class function like "rename" is just so much more understandable from the human point of view.

You asked for an example where a class is more suitable than a function. I can think of no better example than the abstract concept of files on a modern operating system. In fact it's so useful you will probably have difficulty coming up with ways to perform the same operations without using an object-oriented strategy.


The two main advantages of using classes are:

- a class is also a type (if you have a typed language),

- a class is also a module: you group together values that make sense together (the x and the y of a point) with the functions (methods) that interact with it.

When you have a lot of code flying around, you can add encapsulation (the private/public thingy) so the user of the code does not see the implementation, which helps to create libraries that can evolve independently from the applications using them.

Also compared to an associative array, a class is more compact in memory (granted JavaScript runtimes see dictionaries as hidden classes).
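
A tiny sketch of both points at once: a type you define yourself, plus the functions that belong with its data, grouped in one place:

  import math
  from dataclasses import dataclass
  @dataclass(frozen=True)
  class Point:
      x: float
      y: float
      def distance_to(self, other):
          return math.hypot(self.x - other.x, self.y - other.y)
  print(Point(0, 0).distance_to(Point(3, 4)))   # 5.0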


To be honest, I don't care about classes themselves. What I do like are interfaces, which are usually expressed as collections of methods. Classes provide easy grouping of those methods, so you can not only implement particular methods in a type-safe way, but also make sure that a whole group of methods implements a particular contract.

To be honest, this does not require classes per se - Golang manages to have interfaces without classes, Rust has traits.


Classes are about encapsulation. They contain state and have functions (methods) to change said state. It's nice to have things encapsulated when programs get very big. For one-off scripts, though, it's not really needed.


The crux of this is whether the complexity is worth its weight. Different programmers will disagree, not least because this is tightly bound to how you prefer to think about programming. Moreover, what seems complex to one person might seem entirely straightforward to another.

However, at the end of the day my personal worry is that the central thesis of the article is often inverted. A complicated programming language might not tame complexity within itself, rather it may instead breed even more complexity in its community and the code that is written in it. That complexity spirals ever more outward as new features grow that attempt to tame the complexity, but only add to the growing morass.

This is not necessarily true, but I've seen it happen often enough that I'm very wary of how new features in a programming language interact with each other.


Interestingly, the strongest bouts of architecture astronautics in the Java ecosystem occurred when the language was simple: it lacked generics, functional features, etc. It was the time of EJBs and other such J2EE stuff.

I'd say that feature cohesion is important: in Java, the new(er) features blended into the language without creating incongruities and footguns. In C++, it was not as smooth, though.

Another, and maybe more important factor, is the prevailing tastes in the community. PHP 8 is more complex than PHP 4, but the community attitude changed, many aspects of the stdlib and key frameworks evolved, and now PHP mostly shook off the stigma of being a cluster of antipatterns.


I would like to bring up some counter-examples:

- 1. Generators that can allocate prevent comptime-overapproximation of potential stack usage (https://github.com/ziglang/zig/issues/157#issuecomment-76395...) or make analysis inside the compiler significant harder.

- 2. There is no reason why static analysis must be part of the compiler, though its still an open research problem how to present lifetimes properly to the user (https://github.com/ziglang/zig/issues/782#issuecomment-10149...)

- 3. Requiring static analysis to be inside the compiler pushes back on external static analysis. While this may be good overall, because it forces upstreaming of work and removes the monetary incentive to introduce bugs as optimizations detectable by tool xyz, it reduces the incentive for modularization and extensibility for experimentation.

- 4. Trying to encompass everything in static analysis has the same problem as an all-encompassing logic: soundness cannot be proven, and the proof of soundness becomes more and more problematic the more rules one adds (slow, and hard to reason about which rule has been applied). And this does not even account for human errors or the fact that a PL does not accurately implement the logic.

- 5. The biggest problem of complicated languages is that they have no process to define what underlying logic (see point 4) the compiler should follow and "what is too complex". Somehow folks also forget about Gödel's incompleteness theorem (you can't reason about everything in the compiler).


"Classes would still exist, but as implicit patterns."

Classes are not a design inevitability, but just one way of managing state. Languages that mostly avoid mutable state don't tend to have object systems, for instance.


> Classes are not a design inevitability, but just one way of managing state

What "Classes" are is a tragedy of things.

Classes are closures, with a syntactic form that allows for nested composition. This is a particularly brittle form and these qualities are found in all languages with Classes. Language maintainers, as they currently exist, have failed to concede that this is a bad idea, because "it works good enough for lots of cases". This has hurt the industry by creating a culture that can only support the view that "all code is ugly". Given the choices, the variability in how code evolves randomly to deal with choices made under these conditions, is unsurprising.

Why? Because Classes are a good feature that has been misrepresented as a compositional tool instead of a syntactical shortcut and nobody is dealing with the gravity of that...given the amount of history, industry, effort, et al based around supporting their faux necessity, who could be blamed for not raising a stink? And why is there a "new" keyword in 2022? Python gets this right.

Almost every language ends up heaping on additional syntax for behavior that is either compiler-added (e.g. PHP/Rust traits, Ruby aspects), a meta-language (e.g. Java aspects, PHP attributes, various C++ impls, ES6 JavaScript classes), or a straight mixin syntax (Python decorators). This alone makes it obvious that the support for composition is a major problem that has been both neglected and then mishandled by heaping on the same mistakes. Interfaces were a particularly laughable solution, which has made the problem worse by tying that limited form of composition to reflection.

> Languages that mostly avoid mutable state don't tend to have object systems, for instance.

They do, in the form of closures. How infrequently composition-by-nesting is used in functional languages is telling (basically the inverse of nonFP languages).


Classes are closures just as much as closures are classes (as Java showed).

What classes are at their core is bundles of behavior, possibly with encapsulated state, and usually some way to decide at runtime which of a family of classes' behavior to use. Closures are naturally tied to one single behavior (though of course you can invent ad-hoc methods of extending that, such as a "methodName" parameter, but no one ever does this in practice).
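
The "methodName parameter" trick mentioned above looks roughly like this (a classic SICP-style sketch, not something anyone ships in practice):

  def make_account(balance=0):
      def dispatch(message, *args):
          nonlocal balance
          if message == "deposit":
              balance += args[0]
              return balance
          if message == "balance":
              return balance
          raise ValueError(f"unknown message {message!r}")
      return dispatch
  acct = make_account(100)
  acct("deposit", 50)
  print(acct("balance"))    # 150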

This is the reason why classes and interfaces actually exist in some form or another in all useful languages today. For example, Go has structs with methods + interfaces, Rust has structs + traits, OCaml has modules + module types (well, it also has classes, but I understand those are less popular), Haskell has data types and type classes, and on and on, JavaScript has objects + dynamic name resolution.

The one idea from classic OOP that has turned out to mostly be a dead-end is implementation inheritance + subtyping. This is basically the only class feature you'll only find in specific OO concepts, and basically all modern guides tell you to avoid it.


> and basically all modern guides tell you to avoid it.

If only... There's still a thriving OO-heavy mindset in much of the Java ecosystem, but it's less evangelical than it used to be.

In other languages... Inheritance tends to be de-emphasised compared to a decade ago, thankfully. I rarely see advice to avoid it - instead moderation is advised.

I think the main problem is that it is hard to explain to a learner what the pitfalls are with inheritance-heavy OO design except in the most general terms. It's something that can be quite useful when used judiciously, in the right context, but knowing when this is requires experience.


Closures are the superior mechanism though: you can implement classes plus all the crazy decorators / metaclasses / mixins / etc. The reverse is true of course, but in the dumbest way possible.

A closure-based implementation of a class is just a function that returns a structure full of closures (an interface if you will) all with the same shared state.
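
In Python terms, that description looks roughly like this, with SimpleNamespace standing in for "a structure full of closures" (an illustrative sketch only):

  from types import SimpleNamespace
  def make_counter(start=0):
      state = {"n": start}              # shared, encapsulated state
      def increment(by=1):
          state["n"] += by
          return state["n"]
      def value():
          return state["n"]
      # the "object" is just a record of closures over the same state
      return SimpleNamespace(increment=increment, value=value)
  c = make_counter()
  c.increment()
  c.increment(5)
  print(c.value())                      # 6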

But now because a class is just a function, you can do all kinds of crazy things with it, the "constructor" can even pick and choose which functions to use to implement the interface at runtime, while retaining full typesafety.

Classes are not just unnecessary, they're a bad abstraction.


> Classes are not just unnecessary, they're a bad abstraction.

And yet, there are approximately 0 examples of the "closure-backed classes" example you give, while every vaguely popular language except C has some variant of polymorphic classes, from StandardML to Lisp.

Your closure example is not used in any language because it's extremely un-ergonomic. If the language doesn't have some kind of concept of "interface" (abstract class, type class, trait, structural typing etc), then you'll either miss polymorphism (return a single specific type), or be too dynamic (return an object that could do anything). It also risks being inefficient if you try to naively add function pointer members to each object to support runtime dispatch.

The closure-based alternative also obscures the relationship between product data types (structs/record types) and classes (structs with methods).

Closure-backed classes are a neat experiment, but just like manual virtual table management in C, they are not a realistic solution to the problems that classes address, if you can possibly avoid them.


Using C, which doesn't even have closures, as an example of why this doesn't work is rather silly. How can you have closure-based classes without closures? The ergonomics are beyond horrific in this case but I don't see how that's relevant.

> And yet, there are approximately 0 examples of the "closure-backed classes" example you give, while every vaguely popular language except C has some variant of polymorphic classes, from StandardML to Lisp.

SML is actually one of the best languages to implement this approach! And as far as I know this is even the recommended way to do it (see the MLton docs). No idea what polymorphic classes you're thinking of when it comes to SML. Nobody uses objects in OCaml either.

> Your closure example is not used in any langauge because it's extremely un-ergonomic.

I'm not arguing ergonomics, just implementation details, you can add all the syntax sugar you want :) Also Erlang Actors are functions with an explicit message loop. So that's even lower level than this.

> If the langauge doesn't have some kind of concept of "interface" (abstract class, type class, trait, structural typing etc), then you'll either miss polymorphism (return a single specific type), or be too dynamic (return an object that could do anything).

I don't follow, neither of those are true. Yes you return a single type, that single type is the interface type (which can have infinite implementations, just like in any language where you return an interface). But you don't need a specific concept of an interface, it's just a record of closures.

Also every language should have a way to say "this struct is the union of these structs", which gives you the subtyping you're typically missing with this approach (i.e. an object that implements N "interfaces" can be cast to all of them).

Without that you need to add an extra layer of indirection with explicit casting functions. Not ergonomic as you say.

> It also risks being inefficient if you try to naively add function pointer members to each object to support runtime dispatch.

If you're not using function pointers then you don't have runtime polymorphism, what do you think a vtable is? It's a struct of function pointers. An object is a pair (voidptr, vtableptr), where the vtable can be created at compile time or at program start. Of course that's not very ergonomic, but I'm not arguing ergonomics.
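
Spelled out in Python purely for illustration (obviously not how a compiler lays it out, but the same shape: plain data plus a shared table of functions):

  dog_vtable = {
      "speak": lambda data: f"{data['name']} says woof",
  }
  def make_dog(name):
      return ({"name": name}, dog_vtable)     # (data, vtable) pair
  def send(obj, method, *args):
      data, vtable = obj
      return vtable[method](data, *args)      # runtime dispatch via the table
  print(send(make_dog("Rex"), "speak"))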

> The closure-based alternative also obscures the relationship between product data types (structs/record types) and classes (structs with methods).

Classes obscure the relationship between subtyping, interfaces, inheritance, abstract datatypes and modules. A class is much more than a struct with methods. A struct with methods is a vtable.

> Closure-backed classes are a neat experiment, but just like manual virtual table management in C, they are not a realistic solution to the problems that classes address, if you can possibly avoid them.

What problems do classes address? Nobody seems to have an answer to this, which is why modern languages like Rust and Go don't have them. You can argue that traits and interfaces are "OO", but they're not classes. Rust doesn't let me do Iterable.new{...} at runtime, exposing its nature as a struct of functions, but nothing says it couldn't.


> I'm not arguing ergonomics, just implementation details, you can add all the syntax sugar you want :)

Oh, good. That sugar actually already exists in most OO languages, and usually it's called...

   class
> What problems do classes address?

The problem of getting sweet ergonomic syntax for the concepts of encapsulation, polymorphism, possibly inheritance (with additional sub-divisions mainly on whether multiple- or single-ancestor), interface, abstract class, type class, trait, structural typing, etc, etc. Currently implemented in varying combinations and to varying degrees in different languages.

So make up your mind: Do you have something against syntactical sugar for all those, or don't you?


A class is any construct that can:

- expose multiple functions (implementations) as a single unit

- be chosen at runtime as a representation of a contract (polymorphism)

- (optionally) encapsulate some state accessible only to the functions of the class

This construct comes up naturally in a wide variety of programming tasks, mainly related to the large-scale organization of code.

Here are some examples of language constructs that are classes (perhaps by other names):

- `class` in C++, Java, C#, OCaml etc.; polymorphism achieved through interfaces/pure-virtual classes

- `class` in Python, Ruby; polymorphism achieved through dynamic dispatch

- `defmethod`+`defclass` in Common Lisp CLOS; polymorphism achieved through `defgeneric` (a rare example where polymorphism is at the single function level, instead of a bundle of functions)

- `structure`&functors in SML; polymorphism is achieved through `signature`

- `module` in OCaml; polymorphism achieved through the use of `module types`

- `struct` in Go; polymorphism achieved through use of `interface`

- `struct` in Rust; polymorphism achieved through `trait`

- `data` in Haskell; polymorphism achieved through `class` and `instance`

- `typename` in the C++ template meta-language; polymorphism achieved through dynamic typing (SFINAE) or `concepts`

Now, some of these languages/constructs may have additional properties. Some support subtyping, some support multiple fields, etc. Some explicitly implement interfaces, some implement them implicitly. But they all have the initial three properties in common, and they recognizably solve the exact same problem in slightly different ways.

The implementation varies quite significantly, but I would be surprised if even one of these languages actually implements this construct using closures.

Now, to address some of the other points:

> If you're not using function pointers then you don't have runtime polymorphism, what do you think a vtable is?

I have often seen proposals to implement runtime dispatch via something like:

  struct Obj {
    foo: fn () -> int
    bar: fn int -> int
    field1: int
  }
This is wildly inefficient compared to the vtable approach, which is closer to

  struct ClassA { //vtable
    foo: fn () -> int
    bar: fn int -> int
  }
  struct Obj {
    vtable: &ClassA
    field1: int
  }
> But you don't need a specific concept of an interface, it's just a record of closures.

If your language doesn't have a concept like Interface to begin with, you can't implement it out of thin air. There are few languages that don't have this. C is perhaps the only mainstream example, and indeed in C (even if you had closures) you couldn't return an object from a closure that is not a specific concrete type, short of returning void* and expecting people to cast it to something else.

If your language does have interfaces, it will also have a better way to define classes corresponding to those interfaces than random closures returning interface types.

> I'm not arguing ergonomics, just implementation details, you can add all the syntax sugar you want

I've mentioned this before, but I don't know of a single language where the class mechanism is actually implemented in terms of closures. Typically there are other specialized constructs at the compiler level used specifically for this.


Yeah, your first approach to runtime dispatch is just dumb (don't listen to those silly people); you want the second in all cases, even with a closure implementation of classes. Objects should be implemented as in the second case, or more concretely, as:

    struct MyObj {
      vtable: MyInterface*
      data: void*
    }
Which is how dynamic traits are implemented in Rust. The C++ approach of a pointer to:

    struct Obj {
      vtable: ...
      field1: T1
      field2: T2
      field3: T3
    }
Pretty much means you need to define all the interfaces you implement at class declaration, which is not nearly as nice as doing it independently (as with traits, protocols, typeclasses, Go interfaces, etc.)

My point is that MyInterface is just a vtable, a class is a function that fills in that vtable, and an object is a pair of a filled-in vtable and data. These mechanisms are restricted to the compiler for no good reason, which is why you end up with "metaclasses" and "mixins" and all that other nonsense. Just expose the mechanism itself and give it some nice syntax.

Classes mix too much stuff together: inheritance, subtyping, access control, data specifications, interface specifications, etc. We've figured out that's not really a good thing which is why Go and Rust rightfully split them up and use separate mechanisms.

I'm saying go even further and don't hide the runtime representation, let me create these things at runtime if I so wish. That gives metaclasses and mixins and what not for free (with a constexpr-like mechanism they can even be done at compile time).

You can still keep traits as the way of defining bundles of methods, they're also useful as generic concepts, that goes into the ergonomics of it.

This is a similar discussion to Zig generics. Zig has no idea what generics are, a "generic" is just a function that given a type produces another type. Works pretty well.

> A class is any construct that can:

That's just your definition. Your points describe an abstract datatype, a concept that predates classes. We could call that a class these days, but that's not reflective of the programming language construct called "class" as used in languages like C++, Java, C# and friends.


>> Languages that mostly avoid mutable state don't tend to have object systems, for instance.

> They do, in the form of closures.

You can build an object system from closures, certainly, but I wouldn't say closures are objects by themselves. A closure lacks inbuilt messaging, which is usually considered a requirement for an object.


> You can build an object system from closures, certainly, but I wouldn't say closures are objects by themselves.

This raises 2 issues.

1. Is a closure an object? An "object" is conceptual, not based on a specific algorithm.

Then you mention a capability of messaging being a requirement for the Object concept.

This is where arguments start going off the rails in casual language discussion because people combine the idea of what a language can do (capability), what a language was designed to do (designed-for) with the supporting ecosystem, and what's possible in a turing machine by virtue of what other languages provide (in capability/designed-for).

You can message other closures via methods, just like the vast majority of languages currently support. This kind of messaging is not async, which seems like a cheap cheat on the idea of a message, but it makes sense once you start running lots of synchronous objects with independent lifecycles, locations, and identity characteristics... then you realize the quick cheat costs you in large distributed systems while being very efficient in smaller ones, imo.


Alan Kay defined object-orientated programming as having messaging, state hiding, and late binding. I think the ship has sailed on the last one, but the first two properties hold true for at least all the object orientated languages I'm aware of.

A closure lacks messaging by default, so it's not an object. You can certainly make an object from a closure, and in an object orientated language a closure may be itself represented by an object; but responding to messages is not an inherent property of a closure.

It's not useful to say that all languages have all features; the line has to be drawn somewhere. Being able to theoretically write an object system for a language doesn't make the language object orientated.


Right, that's true. I think structs are inevitable, and classes are the way to get structs in Python.

But yes, the entire OOP thing of classes with namespaced methods and "self", that's not inevitable.


In a very simplified way, classes are extensible modules.

That has always been the trend in most languages: first one gets modularity, then eventually that outgrows into generic modules, and eventually they land on something that is for all practical purposes OOP (there are many ways to skin a cat).


I'm not even sure structs are inevitable. You could write a language around open sets of key/value pairs, for example.


Type classes and ML functors are also a way to do polymorphism and dynamic dispatch on data structures.

So which languages are left without object systems?


Polymorphism and dynamic dispatch don't equate to an object system in and of themselves.

At minimum, I'd say an object system needs some form of messaging and state hiding. That is, an object holds hidden state, and responds to external messages.


Some would say that starting from the lambda calculus would be enough to get there.

I advise reading "The Art of the Metaobject Protocol", or similar literature.


Obviously any Turing complete language can implement an object system, but that's clearly different from the language having an object system built in.


We can start discussing semantics then: does a language support a feature only when it is hardcoded in the compiler, or also when the language offers enough lego blocks to do it via a library?


It's when it's part of the core language, as any Turing complete language can implement any feature of any other Turing complete language. It's not a useful statement to say all languages support all features.


Rather, any Universal Turing Machine can be programmed to calculate whatever any other Universal Turing Machine can be programmed to calculate. It's not necessarily done with the same features, or use of resources.

In some cases, the only way the features of some UTM A's architecture will be obtained through UTM B, is if UTM B is used to create a simulation of UTM A, and so then that provides a way to execute the UTM A instructions themselves. In that situation, UTM B itself has not acquired the features of UTM A.


There are two problems with making that distinction. First, by that definition, few languages have regular expressions as a feature. Second, in languages with support for syntax rewriting, that would mean the internal implementation, not the user-facing syntax, is what decides what counts as a "real" feature.


A TM is a machine which accepts a tape, together with a specific tape: it performs a very specific calculation.

A UTM can take an instructional input which turns it into a different machine, which can then process tapes designed for a different TM mechanism.

A fixed programming language (and its standard library) is a UTM. If there are regular expressions in the language syntax, or the library, it's a feature of the UTM.

Everything else is instructions: additional libraries, macros, whatever.

We can extend languages by pretending that the additional code we have written (libraries, syntax rewriting) comprise a new, extended language. So that's a new UTM.

There are limits in what syntactic rewriting can provide in a seamless, efficient way. Some features are simply not expressible in the target language of the rewriting, other than by using the syntax to create an inefficient, interpreted language. The use of the features it provides is walled off within the embedded syntax. (Not to deny that this is nevertheless valid and practically useful.)

The UTM theory tells us that we can always adapt a UTM to accept the tapes intended for any TM mechanism. However, it doesn't tell us how costly that is. For some TM mechanisms, the approach may be to translate a given tape up-front into the UTM's own instructions and then run it directly. For some TM mechanisms, the approach may be to simulate the mechanism: the UTM's instructions execute the simulator, which executes the tape. The UTM theory mostly talks about simulation.

What we know is given a Turing Complete language A (i.e. a UTM) we can always write an interpreter for language B. That's the baseline. We then have a new UTM equivalent to B, consisting of language A, plus the interpreter. That setup then accepts programs of language B, and their inputs.


> If there are regular expressions in the language syntax, or the library, it's a feature of the UTM.

By "the library" do you mean the core library of the language?

If a language didn't have regular expressions as part of its core library, does that mean regular expressions would not be a feature of that language?

> Some features are simply not expressible in the target language of the rewriting, other than by using the syntax to create an inefficient, interpreted language.

Why inefficient? Why interpreted? What if the syntax was efficiently compiled?


So I guess Common Lisp, Racket, Scheme, C++, C#, F#,... just lost a couple of features.


If those features are supplied by libraries outside the core language, I don't think I'd count them.

I use Clojure a fair bit, and that has a miniKanren implementation in the form of core.logic. Would I say that makes Clojure a relational programming language? No, I don't think so, because it's not part of the core language.

Similarly, there exists an optional static typing library for Clojure called core.typed - does that make Clojure a statically typed language? Again, I don't believe so.

You can build a library to implement any feature, particularly in Lisps or other languages with support for AST transformations. Those features may even be counter to the core design of the language - you could write a library in Haskell to support dynamically typed, imperative programming, for example.


To be more concise: classes != object systems != polymorphism and dynamic dispatch on data structures.

The differences between them are important to some people. And that's why they will argue for that.


One thing common to many people arguing here is that they never actually bothered to read SIGPLAN papers, or the CS literature on type systems.

So they argue with knowledge based on Internet facts instead of actual CS literature.


Natural language evolves by people saying what they want, and what the majority says becomes the language. It is not defined by some elite or some authority.

Classes bind data and functions together, and object systems support inheritance. These are some of the things that some people don't really like, but you may not care. People sometimes care about syntax as well; writing "Class" is also part of the experience of using the language.

We have different words in the language for a purpose, that's my opinion.


Modules also bind data and functions together.

Not all object systems use inheritance, and even those that do, we can speak about delegation, interface/trait/protocols/type classes inheritance, regular classes inheritance, or some weird stuff like object patterns from BETA.

That is why we have CS literature, to put the right words in place.


> Modules also bind data and functions together.

Yep, but a module is still not a Class. My example was referring to your mixing of the concepts of Classes with polymorphism and dynamic dispatch in your original reply.

As you said, there are a lot of object systems that are different, and the difference matters. Having inheritance or not matters. I was just making examples to show that things are different.

We all know that C++ concepts != go/typescript interfaces != Java/c# interfaces != Haskell type classes (c# concepts) != rust/scala traits.

Try to explain that using "CS literature"; maybe a lot of language designs don't have a word made for them yet. Yes, there are subtyping, bounded quantification, ad hoc polymorphism, row polymorphism, but that's still not enough to tell all of them from each other.

But anyone in the community knows that none of them are Classes, and that what makes each of them special is important. It is quite clear, so why don't you get that?

The concepts in the CS literature are quite trivial; people can spend a few weekends to learn them, maybe another few for type-rule syntax, or a few more for basic category theory to understand them in a more intuitive way. But the real problem of making a good language for some applications is still there, and it is much more complex and difficult. And maybe the current literature is not enough for that.

The elites will never dictate the words; the people will pick up whatever they want to express their knowledge anyway, as we know from what has happened throughout history.


Type classes are a way to do polymorphism, but... dynamic dispatch? I don't think so; which function is called is determined statically.

As for ML functors: they're neither polymorphism, nor dynamic dispatch. First: functions for different types can't share names (unless we're talking about shadowing of names with lexical scope), the names are either qualified with the name of the struct they come from or they're not qualified at all. In all cases there is exactly one implementation for every name. And again: no dynamic dispatch. MLton even defunctorizes all code at compile-time. The only polymorphism in SML is the one that's used in some built-ins (like +), but it's not available to users of the language. SML has something called "polymorphic functions", but it has nothing to do with writing multiple implementations of one function under one name and letting the compiler select which one is called; which is what is usually meant by "polymorphism".


There are many ways to skin a cat in how those declarations are written.

If I have time later I might post an example; apparently we are only right when we show code.


> Over and over, every piece of documentation for the Go language markets it as "simple". This is a lie. Or rather, it's a half-truth that conveniently covers up the fact that, when you make something simple, you move complexity elsewhere.

I'm not sure this is strictly true. This is true of essential complexity, but not true of accidental complexity. Some programming language abstractions introduce a lot of boilerplate and accidental complexity, so "simplifying" could just mean bringing the language more in line with a natural expression of a program.


> The problem with the example of Rex the Dog is that classes are not about being able to represent your dog. It's about something completely different. Encapsulation, modularization, abstraction - let's not pick nits about the precise meaning of these terms, they are all about the same thing, really: Managing complexity of your own code.

The author is also missing a big part of the explanation here.

The real problem is that you won't have one dog; you will have hundreds of dogs with different attributes, and you want to make them bark at each other.

When you have to implement a bunch of running dogs, then you hit encapsulation, modularization and abstraction. Those properties are not a given, especially in the example presented, but for me the question "how can I make hundreds of dogs" comes naturally.

So in the end the author really misses the Object vs Class distinction, and I can see it in the writing. The author equates Object and Class, where in reality an Object is an instance of a Class at run-time: you make the "Rex" and "Lassie" dogs, which are Objects in your program and can bark at each other. Classes cannot do anything, as they are simply templates for creating Objects; you cannot do operations on a Class.

So the sentence below is simply wrong:

> You have now defined a Dog class and can do operations with your dog by simply writing
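To make the distinction concrete, here is a hedged Python sketch (the names are illustrative, not taken from the article): Dog is only the template; rex and lassie are the Objects that actually do the barking.

    class Dog:
        def __init__(self, name, weight_kg):
            self.name = name
            self.weight_kg = weight_kg

        def bark_at(self, other):
            loudness = "WOOF!" if self.weight_kg > 25 else "Woof"
            print(f"{self.name} barks {loudness} at {other.name}")

    # Dog itself is only a template; rex and lassie are the Objects.
    rex = Dog("Rex", 35)
    lassie = Dog("Lassie", 20)
    rex.bark_at(lassie)
    lassie.bark_at(rex)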


There's a distinct difference between complexity and complication, which I think is what the author is looking to convey.

Complexity is just inherent to the system you are building. The only way to decrease complexity is to do less. Hiding it only moves it somewhere else. There's a whole school of thought around the idea of reduced functionality, and in fact it's baked into the UNIX ethos of "do one thing well".

Complication, on the other hand, is what I would call "unnecessary complexity" that gets added to a system as you bolt on more things to give it more potential, or to support legacy. We see this in a number of programming languages, most famously C++. Adding more capabilities to the language requires you to have more knowledge before you can effectively use it. Adding features to something not designed for it (cough Java) results in some very annoying complications. And THIS is where the call for simpler languages comes from.

The confusion of complexity with complication is what causes, for example, golang to be such an annoying language. They sought to reduce complication (a laudable goal), but conflated that with complexity and only managed to move it elsewhere (because that's all you can actually do with complexity).

We see similar problems in data formats. JSON has lax standards and limited type support, which forces the complexity onto the user in the form of string serialization and base64 and all sorts of incompatible custom secondary codecs. Is the format itself simple? Sure! The spec fits on one page! Is it simple to use? Absolutely not! It's a minefield and a security nightmare. Data is complex, and any decent data format will have to be just complex enough to support the majority of use cases in a consistent fashion.
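As a small illustration of that cost landing on the user (my own example, not one from any spec): JSON has no binary or datetime types, so both ends have to agree on ad hoc string encodings.

    import base64
    import json
    from datetime import datetime, timezone

    payload = {
        # JSON has no bytes type, so binary data gets base64-encoded by hand...
        "thumbnail": base64.b64encode(b"\x89PNG...").decode("ascii"),
        # ...and no datetime type, so timestamps become strings by convention.
        "created_at": datetime.now(timezone.utc).isoformat(),
    }
    encoded = json.dumps(payload)

    # The receiving side must know (out of band) how to decode each field.
    decoded = json.loads(encoded)
    thumbnail = base64.b64decode(decoded["thumbnail"])
    created_at = datetime.fromisoformat(decoded["created_at"])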


I generally agree with you (with the note that these are often referred to as essential complexity, which is non-reducible, and accidental complexity, i.e. complication in your words).

May I ask why you think Java is an example of adding features it was not designed to have? In my opinion, even with the original warts of the language, it is one of the few long-running languages that has managed to withstand adding every feature and the kitchen sink, and its additions are thoughtful and deliberately slow (a late-mover's advantage: it lets other languages make mistakes with a feature first, avoiding bad ideas like async).

Generics were added very cleanly to the language given the constraints of the time, and lambdas are also more than good in my opinion (with the only wart being around checked exceptions, unfortunately). The new direction with algebraic datatypes (records, sealed classes) and pattern matching is also an example of a feature that was known forever (ML), but that the general developer community got to like only recently.

Async is only bad in the context of managed languages that can do better; it is a "necessary evil" in the case of Rust, for example.


Generics were a necessary thing, but the type erasure is a huge problem, and unfortunately one that can't be fixed without a completely new bytecode format. Also, annotations, while a necessary thing to bring it up to speed with modern practices (such as nullability contracts), add to the bureaucracy of Java and make it more painful to work in.

Don't get me wrong; I do like Java, and have written hundreds of thousands of LOC in it over my career. But there's only so much it can do without breakage, and only so much that can be papered over and worked around. Even Kotlin suffers from the JVM it rests upon.


Why do you think type erasure is problematic? Haskell and basically every other language gets away with type erasure, so it really shouldn't pose a serious problem, other than for reflection usages, which are rare.


It was too long ago so I can't remember why it was problematic, only that it was, and required a bunch of workarounds in some cases to get what you take for granted in other languages.


    def bark(dog_dict):
        print("WOOF!" if dog_dict["weight_kg"] > 25 else "Woof")

    rex = {"name": "Rex", "weight_kg": 35}
This code is actually better, but it would be even better with structs and traits

https://gist.github.com/rust-play/ceb5a292a22e55e27d56a83253...

We separate the concept of a Dog, the concept of something that barks, and the actual implementation in this case

We also don't immediately print it, but just return a slice so you can print it later or pass it to another function

It's a bit longer than the class example, but conceptually cleaner because multiple structs can implement the trait, some of them without a weight

and of course I'm only going to print things in my main, the interface only returns a slice which can be passed to other functions


That was not the point I was trying to make. The point I was trying to make is that the need for encapsulation and abstraction does not go away, and that people will re-invent them if they are taken away.

More specifically, I think structs, or something like structs, are inevitable. And methods are also inevitable. Whether it be via OOP-style classes that contain both fields and methods, or in the Julia style where you have structs and global-scoped functions with struct-specific methods, or in the Rust style you linked to with structs that conform to a trait: which of these implements the method is not important for my specific point.

All three ways add complexity to the language, but such kind of complexity is unavoidable, because it will show up in the code anyway. Even if you use C, which does not have an explicit connection between structs and functions (i.e. methods), you will just write functions where you document that such and such argument should be a pointer to such and such type.
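For illustration only, here is a rough Python sketch of two of those styles side by side (Julia's multiple dispatch and Rust's traits don't map onto Python directly, so this only shows the shape of the idea; the names are mine):

    from dataclasses import dataclass

    # Style 1: OOP-style class, fields and method together.
    class DogClass:
        def __init__(self, weight_kg):
            self.weight_kg = weight_kg

        def bark(self):
            return "WOOF!" if self.weight_kg > 25 else "Woof"

    # Style 2: a plain struct plus a free function that documents
    # which type it expects (roughly the C / Julia-flavoured shape).
    @dataclass
    class DogStruct:
        weight_kg: float

    def bark(dog: DogStruct) -> str:
        return "WOOF!" if dog.weight_kg > 25 else "Woof"

    print(DogClass(35).bark())    # method style
    print(bark(DogStruct(35)))    # struct + function style

Either way, the association between the data and the behaviour exists; the language just differs in how explicitly it records it.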


> It's a bit longer than the class example, but conceptually cleaner because multiple structs can implement the trait, some of them without a weight

Sounds like premature abstraction. Instead of declaring a class, you have declared an interface.


There are no classes in Rust, and it's better for it. That's my point, some abstractions like classes are NOT necessary.

There were arguments before Rust 1.0 that people will simply try to emulate classes. But nobody actually bothered, because traits provide enough functionality that you don't actually need to have classes to actually write maintainable and understandable programs.

I'm attacking the exact point of the article that you need X. Maybe you actually don't need X, and your programs will be better for it.


> There were arguments before Rust 1.0 that people will simply try to emulate classes. But nobody actually bothered, because traits provide enough functionality that you don't actually need to have classes to actually write maintainable and understandable programs.

It's not only traits. Traits (well, trait objects) are used for dynamic dispatch, traits themselves are used as composable interfaces, and structs + impl blocks are used for basic classes.

My point was that your example, implementing a trait that may or may not use a weight for barking, is not making the program better. It's making it more complex for no good reason. Here's what your program could be: https://play.rust-lang.org/?version=stable&mode=debug&editio...

Adding traits doesn't make the code better. There's only dogs that are barking. You don't even need to have bark() be a method, a plain function will do.


Of course it can be as simple as that; that's beside the point.

But almost every time, you won't actually ship a program that just prints stuff to the console.

That's why I did it that way: the way you actually output won't be println, but probably a window or a socket, or whatever.

The core functionality of passing a slice will actually be useful in those cases; you can include this code in your final program.


> There's only dogs that are barking.

   class DogSoundFile
   constructor Create(MP3File, Weight);
   ...
Bit silly, innit? (Are bigger MP3 files heavier...? :-)


Missing the point.

If you read just a little farther, you get that the extra apparatus has a purpose that is opaque to beginners, and so to them it just adds confusion.


I'm responding to

> And so they would have accidentally re-introduced classes, only this time the existence of classes would have been implicit in the code, and their behaviour ad hoc defined, their invariants spread all over the source files, and with no language-level tooling or introspection to help the programmer. Classes would still exist, but as implicit patterns.

You don't need to do this with classes, my design is better than the class design because it's more flexible


Flexibility usually has a cost. It's not better or worse.


Correct! When the language provides apparatus to do the thing, everybody doing the thing is incentivized to do it the same way, making libraries implicitly compatible. Cobbling things together yourself (which you can always do: these languages are Turing-complete), you are unlikely to do it compatibly with what anybody else does.

This might be why Lisp never developed an ecosystem. Everybody can make their own pile of macros to customize the language for their use, with no standard interfaces that can be composed.


So would this argument not apply to the use of Templates in C++? Everyone can design their own--what is to keep these from slopping all over the landscape?


> So would this argument not apply to the use of Templates in C++?

Yes, AFAICS arguably it would.

> Everyone can design their own--what is to keep these from slopping all over the landscape?

Who says there is anything to do that?


I was referring to

> you are unlikely to do it compatibly with what anybody else does


Yes, and I was implying there's nothing to make sure it's compatible.


Example: GNU Emacs comes bundled with 1.5 million lines of Lisp code. How was that possible, given that Emacs Lisp has macros?


You have maybe noticed that Emacs is a single program? (OK, two or three, historically.) Emacs Lisp code that doesn't integrate with the rest of Emacs just doesn't work.


What I have noticed is that GNU Emacs is an application development platform for various programs.

Example: Org mode. 133 k lines of Emacs Lisp code.

https://orgmode.org

Org mode then is also used by other tools.

Somehow these 'Lisp macros' seem to integrate, even though they were written at different times (GNU Emacs started in the mid 80s) by many different authors, for different programs on top of GNU Emacs.

That GNU Emacs is a program itself is not different from, say, CPython. That's also a program itself, used to run other code. GNU Emacs has 'packages' and an online archive for that. Developers publish libraries (which other use) and whole solutions, like Org mode.

That 'Lisp' itself doesn't have an 'ecosystem' is no surprise, since Lisp is not a single language, but a family of dialects, developed over the last almost 64 years - either similar ones like AutoLISP and Emacs Lisp or different ones like Logo and Racket. Thus these dialects tend to have their own ecosystems. Like Java and C each have their own ecosystems.


I really don't care about Lisp.

The point is that if you are designing a language and want a lively ecosystem, you either build in features that establish conventions, or implement them in a Standard Library if the language is powerful enough to express them that way.

Failing to build common ground to enable libraries from different places to interact cleanly dooms your language to obscurity.


Not sure what that has to do with Lisp then, since macros establish conventions, but on a language level.

Example: the Common Lisp Object System was initially developed as a new library, based on two other earlier libraries. It was then standardized, and it is the standard way to develop object-oriented software in CL.

The library has three API layers: an object-oriented layer, a functional layer and a syntactic layer. The syntactic layer is implemented with macros. It provides the developer a convenient way to specify the elements of an object-oriented program: classes, generic functions, methods, control structures, ... As a developer, everyone prefers to use the syntactic layer.

There are many libraries and applications using it. It's also extensible, so that users can add new behaviour to the object system itself.

That macros enable new language extensions also on a syntactic level is nothing special. To use macros from other developers is not more difficult than using a class graph, or similar.

There is a lot of 'fear' of user-level syntax extensions from people who have never developed with them. Understanding the implementation of macros themselves is an added burden, since many developers are not used to writing and reading code that operates on a meta-level of incremental code generation. But there are practices which can be learned and applied. Macros themselves can make the code much more readable, since they allow one to support a problem domain on the syntax level and hide implementation details.


I disagree. It is maybe better for the author, or for you as a beginner, but not for everyone. One needs data to support claims like that.


Again, you miss the point.

Identically the same person, at different times, is confused by what seems like pointless extra apparatus, and, later, embraces it as solving a problem they have come, after experience, to recognize.

"One needs data" is, in this context, nothing better than meaningless bluster.


You can make a general claim like that and it sounds true. But I was referring to how your general claim fits the specific situation.

Conversation is meant to have a delivering end and a receiving end, and both sides can have problems. "You miss the point" kind of pushes all the responsibility onto the listener. But why do you have such strong confidence in your own expression?


I had a similar experience.

As I grew more experienced I also found it peculiar how sometimes classes worked really well, and other times they were a terrible fit. Eventually I realised that, as an abstraction, they are far closer to high-level patterns than to rudimentary functions - but the fact that they are a very common pattern built into so many languages obscures that attribute from most people.

> Classes would still exist, but as implicit patterns.

Once you realise this, you also start to notice how often overused and misused classes are, because they seem to be the first thing many people reach for. Classes are not an inherently bad pattern, but that doesn't mean they work well for everything either.

I think teaching them from the perspective of a "need" as the author suggests, would not only make their purpose clearer for beginners, but also prevent them being so misused.



I would take issue with the Zig vs Go comparison: Go tries to make everything simple. Zig tries to make the simple things simple, but very much exposes the system's complexity.

I would say Zig is uncomplicated, but can embrace complexity.


What makes it difficult for me, very often, is bad examples. The set of standard examples for introducing the concepts of class, object, inheritance etc. (animals, geometric figures, cars) makes things harder in some respects, in my opinion. That was at the beginning. Some time ago I was learning design patterns and again - after understanding the concept I usually understood how bad the examples were. I've never read a good explanation of the SOLID principles.


Python modules are basically singleton classes. That's somewhat unique among languages (JS has this too).

Makes it so you can "unit test" modules without actually using a Python Class. I see the Python Class as a way to get a "module within a module".

Nice side effect of Python modules as singletons: the "self." is implicit in the scope of the module. This makes them a joy to write.
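A minimal sketch of that pattern (the module and function names are made up for illustration):

    # counter.py - the module itself acts as the single instance.
    _count = 0  # module-level "instance" state

    def increment():
        global _count
        _count += 1
        return _count

    def value():
        return _count

Any code that does "import counter" shares the same module-level state, which is exactly the singleton behaviour, and there is no "self." anywhere.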

In most other languages, yeah, you'll want some form of encapsulation so you can write tests.

Type checking can cut down on the amount of tests you need to write, but you're continually paying for it with dev time. This is a decision that needs to be made depending on what the project is and who's working on it. If I'm on a team with a lot of new people who will try shoving data where it shouldn't go, then yes, I'm going to force type checking and take the velocity hit.


"Not having a borrow checker" seems like a noncentral example of the value of simplicity. "Not having a separate language for writing macros" seems pretty good.


> So, with disadvantages and advantages to the complexity of modern programming languages, what should one conclude?

> The only resolution is for us as professionals to use our judgement.

Or the third option: create the language with the simplest implementation of all necessary complexity.

Unfortunately, famously, tragically, perfect languages attempted so far are examples of what not to do. The best ones are wonderfully unique examples of what not to do!

I know because I have created more than several.


I cut my programming teeth on Fortran, C, and Smalltalk, all semi coincidentally in my early formative years of programming (well there was also a pile of Lotus123 macros where I discovered that 640K was not enough (thanks a lot Gates) and invented something that years later I would realize was a swap space).

Fortran seemed easy, C was similar, but my mentor wanted to impress me using strtok() in the middle of already existing strings to cut them into words. And like the author, I had my “uh… what?!?!” moments climbing the Smalltalk curve.

Smalltalk was transformative for me. Unlike the author’s experience, I learned an object oriented philosophy in a pretty thoroughbred environment.

The best way to learn a second spoken language is to immerse yourself in it. Wanna learn Norwegian well? Go live in Norway for two years.

Unfortunately, while the hype around Smalltalk and some of its other peers rose, it forced others to “add/borrow” OOP into their philosophies.

Classes were not the heart of OO in the Smalltalk-verse. The root proto class in many implementations was called Behavior. The whole gig in Smalltalk was binding behavior to data. Figure out where a good cohesive boundary lies for data that glumps together well, and dress it up with behavior. You then excited your objects to, well, uh, behave. Or in the lingua franca, you sent messages to them and they always responded with something. And that was pretty much the whole schtick. I found it a very useful system for modeling the things around me. More than any language since, I found I anthropomorphized these behaviors a lot. Algorithm design was often a sort of “if I assigned different people in a room to represent different pieces of state, what would they say to each other to get the job done.”

This is where my take deviates from the article. What was liberating about this was that they kept it simple and stuck to this. In Smalltalk you might ask, well, what’s a class? It’s an object. But in most other OO systems, something happens here. Usually, first-class classes don’t exist. Or they do, kinda. But often, it’s another area of the language you have to learn in order to specify behavior taxonomies. It’s like two different languages.

How do you create an instance of a class? In Smalltalk, you sent a message to it. Because that’s what you did. You sent friggin messages! But in just about every other OO system, you now get to learn about constructors. Are they like messages/functions? Kinda. But they’re different. Do you express and invoke them like messages? See “kinda.” Separate sub-language count now equals three.

What about math? Gotta do that the way you learned how to do it in school. Precedence rules. But in Smalltalk, you sent a message.

Meta programming? Send more messages. Control flow? Send messages. Looping? Send messages. Functional style collection traversal? Send messages.

This isn’t meant to be a Smalltalk love fest. I moved on 10 years ago. But what frustrates me is that people refer to Algol derived languages as object oriented. Most of them aren’t. They’re object-additive. And functional-additive. They’re 10s of language ideas and syntaxes balled up in one, little bits of syntax competing for contextual disambiguation. They’re complicated for all the same reasons that while Spanglish is appealing to a wide body of English and Spanish speakers, you’d go nuts trying to put down a formal definition of how to speak it.

OO isn’t why languages are complicated.


> Show locked doors before you show a key

This is a great pedagogical insight, expressed in a very nice way. Thank you for this gold nugget :)


I still don't get classes in python.

For now I wrap my data objects as JSON and use RabbitMQ to message the JSON to all the other running .py's.

I heard about async and want nothing to do with it, thank you.

I do understand the need to maintain a mental model of the code, so I have based everything on Sherlock Holmes novels.


>>"Focus on debugging your application rather than debugging your programming language knowledge."

>Why would I want that? The entire point of modern language complexity is to reduce the amount of debugging needed for your application, because its complexity is properly managed by the language.

The difference is that applications are always particular and specific to a problem. Moving mechanisms from the application towards the language, which, unless domain-specific, is a general-purpose tool, does not guarantee a reduction in complexity. If a language restricts me from writing a program, the language is saying "what you want to do is so bad, we must eliminate it for the entire userbase of the language!". This is a very tall order. Very few restrictions are so universal in their application that they apply to any program a user would want to write.

The author posits the hypothetical that Zig's manual memory management creates more runtime crashes, but it is just that, a hypothetical. It depends on the application and the use case, and on the individual developers. And taking that ability away comes, for example, at the cost of performance in a situation where the developer wants to manipulate memory manually and is willing to take the risk of introducing errors.

Strict language features restrict the ability of developers to deal with problems as they see fit. It's a form of top-down design that assumes general solutions are effective at addressing particular problems. One can argue, for example, that syntax-heavy languages are much clearer and more uniform, but they are also restrictive. Lisps give developers much more power directly, but that may come at the price of providing fewer guarantees.

Personally I'm not a fan of overbearing languages. Features and guarantees the languages provide have trade-offs. In some situations they may not be worth it and at the end of the day I have problems that a language designer cannot have anticipated because they're relevant to me, and I want the freedom to address them as I see fit. If a tool is inappropriately heavy-handed it turns from being useful into being restrictive, and it makes my job harder, not easier.

A practical example for me is program correctness. I much prefer an extensible tool like clojure.spec, which lets me add guarantees as needed, over a very harsh type system built into the language. If a type system eliminates valid programs that I deem appropriate, it adds complexity for me.

And a last point, I focus on restrictions in terms of 'complications' rather than features because one can always omit using a feature of a language, but has to abide by its restrictions.


I agree with almost everything about this article. I am also a known Go zealot. What gives?

Well, I think the author underestimates just how much needs differ across different programs. Go excels with services because they are often written with a shared-nothing mindset for horizontal scalability. If you’re using S3, Redis, PostgreSQL and RabbitMQ for all of your persistence and coordination, it leaves only small bits for your service to be concerned about w.r.t. local data races, and they are mostly encapsulated into libraries, sometimes provided by the language, that can be aided with static checking.

There is something in video games called minmaxing, which is where you maximize one attribute at the expense of all of the others. For me C++, and now Rust, feel like they are guilty of this with their weight on zero-cost abstractions. I am not against efficient code. It is super awesome to generate efficient JSON serialization code at compile time. In aggregate, even a few hundred cycles per request could be a big difference.

But as many have already stated, the secret to zero cost abstractions is that it’s zero runtime cost. Just as reducing programming language complexity doesn’t remove it, but only moves it around, removing runtime cost doesn’t remove it either. And the problem here isn’t that compile time isn’t a better time than runtime; it’s the hidden costs that suck. Sure, maybe the compile times in Rust will improve, and maybe it’s not even that big of a deal. But every little feature the language has puts some additional stress on the ecosystem. The language server has to bear this load, for example.

Everyone knows the downsides in Go’s approaches to code generation, but I actually think it’s kind of intriguing. go:generate is imperfect in almost every way, and yet it’s oddly practical. It doesn’t guarantee code is not stale, but clever codegen design can cause syntax errors when regenerating code is needed. But maybe most interestingly, gopls doesn’t need to care. The go compiler doesn’t need to care. The parser doesn’t need to care. As far as they are all concerned, the generated code is just more code. To me, this is clearly an approach that doesn’t suffer from minmaxing. Maybe it suffers from a dogmatic fetishized idea of simplicity, but I don’t think so: I think it was literally just meant to be the bare minimum to try to solve the most important parts of the problem, and at that it succeeds.

I am not saying that Rust isn’t awesome. In fact, what I am saying is that I think a lot of smart people have fallen into the trap of thinking that Rust is the one true programming language simply because it’s such an alluring story. But, I think there’s more room. Formally verified C, Go, Rust and Zig will probably all co-exist for the foreseeable future, along with many other permutations of programming languages. Nobody is going to write a web browser in Go, or at least not anything to the scale of Servo or Chromium. But, if you wanted a language to write some servers in, well, hell, you could certainly do a lot worse than Go.

I don’t want to go too far into it, but I also think there are some overly optimistic viewpoints here. It is true that reducing language complexity will often move it elsewhere, but that does not mean that language features do not cause complexity that doesn’t exist otherwise. For example, the Rust borrow checker. Yes, I love it. It is not perfect. It allows only a subset of valid, correct code, and it’s excessively hard to write, for example, a correct doubly linked list in Rust. Does that mean it’s useless? No. But, in general, it’s not very hard to write a correct linked list. (It’s easier to write a subtly flawed one, but that’s neither here nor there. :) Thankfully, most people do not need to write a linked list implementation, but I do still think this stands as an interesting illustration nonetheless.

And I am not suggesting that it’s a superior option at all, but if you are in a simple language like Go, it’s still possible to eat some cake and have it, too. gvisor’s checklock program is a pretty good example. Not perfect in almost any regard, but again, it represents a different set of tradeoffs.

Rust frontloads an awful lot of complexity for cases that are not necessarily common for everyone. And that complexity, both in the toolchain and the resulting code, is a cost that gets paid repeatedly forever. It needs to have a good payoff. Systems programming and game developers have a lot to gain. Everyone else? I think it’s up for debate.

May the best languages prosper. Ideally cooperatively.


> There is something in video games called minmaxing, which is where you maximize one attribute at the expense of all of the others. For me C++, and now Rust, feel like they are guilty of this with their weight on zero-cost abstractions.

> I think it was literally just meant to be the bare minimum to try to solve the most important parts of the problem, and at that it succeeds.

In other words, it's the cost effectiveness of a language feature. You want features to provide as much value for as little cost as possible. This is what Go's going for; to paraphrase Russ Cox - "If a feature is not clearly above the bar, it's below it." [1].

> Just as reducing programming language complexity doesn’t remove it, but only moves it around, removing runtime cost doesn’t remove it either. And the problem here isn’t that compile time isn’t a better time than runtime; it’s the hidden costs that suck. Sure, maybe the compile times in Rust will improve, and maybe it’s not even that big of a deal. But every little feature the language has puts some additional stress on the ecosystem. The language server has to bear this load, for example.

The worst hidden cost that's often missed is the impact on the programmer.

Take Python's controversial walrus operator [2]. If you look at the reasoning and the examples, the change seems reasonable. But consider that once the feature gets implemented, every single Python programmer will have to learn what it does in order to be able to read others' code. Does this feature provide enough to justify this weight? Undoubtedly not. Of course, it got implemented anyway, because the impact of making the code slightly more readable in some cases is clearly visible, but the cost incurred on every single programmer's mind is not.
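For reference, the kind of before/after PEP 572 has in mind (a minimal example of my own, not one taken from the PEP):

    import re

    text = "weight_kg = 35"

    # Without the walrus operator: bind on one line, test on the next.
    match = re.search(r"\d+", text)
    if match:
        print(match.group())

    # With the walrus operator (Python 3.8+): bind and test in one expression.
    if (match := re.search(r"\d+", text)) is not None:
        print(match.group())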

This is how languages reach untenable levels of complexity. By the time you realize your language is getting complex, you're 20 features in and it's too late to turn back. "A frog dropped in a pot of boiling water will jump out of the pot to save his own life. If the frog is put into cool water and slowly brought to a boil, he will remain there until he is cooked through."

[1]: https://github.com/golang/go/discussions/47203#discussioncom...

[2]: https://www.python.org/dev/peps/pep-0572/#abstract


Modern C# is way too complicated, and I really dislike that. On the other hand, as I've tested out the new features, I have to admit that most of them are good.

Part of what makes this work is that wherever possible, they made the features compiler-only, which means I can use them without having to update my application to use a newer runtime or newer libraries. This means I don't have to worry that updating my code will break compatibility for existing users.

The other key thing for me has been realizing that a significant portion of the new features enable me to replace old verbose/bad code with new concise code that has desirable properties - it's become easier to reduce duplication and easier to write efficient code that avoids allocations and better exploits the type system.

Some simple examples:

the addition of 'in' parameters which behave like 'ref' but are read-only, which enables the compiler to automatically pass values by-ref when appropriate - this means you can replace existing by-value arguments with by-ref arguments and not break any existing code. It's really great.

they added 'ref' locals and 'ref' return values, which means you can do a clean and efficient mutable version of 'list[i]' like in C++ but without any of the semantic issues (storing the ref is explicit and the compiler prevents you from accidentally introducing memory safety issues.)

generics were expanded to allow you to safely manipulate pointers, which means that high-performance code no longer has to use gross tricks and can now be type-safe. in modern releases you can now finally do arithmetic with generic types as well (though sadly this requires an updated runtime).

The addition of tuple types bothered me a lot too until I realized that the tuples were silently updated to value types, which means writing obvious tuple-based code is actually highly efficient and I don't need to hand-write record types.

the async/await features have some major downsides, but on the other hand the compiler and library design teams built in a bunch of really wise escape hatches to let you work around issues. The whole state machine can be customized to swap out all of the internals, you can define your own types with seamless 'await' support, and the compiler will (in release mode) aggressively turn the state machines into structs so there aren't any allocations. It's really nice and transitioning my code to it has been a huge improvement.

linq is notorious for bad performance, but they made it possible to provide your own implementation of all the query operators so it was possible for me to define my own methods and make my linq queries not allocate at runtime. really nice (though it comes with its own tradeoffs).


> they made it possible to provide your own implementation of all the query operators so it was possible for me to define my own methods and make my linq queries not allocate at runtime. really nice (though it comes with its own tradeoffs).

I would like to know more about this; it would be really useful. Do you have any recommended docs to read?


I don't have a bookmarked page about this, but essentially if you implement your own version of the extension methods from the Linq namespace, the compiler will prefer those over the IEnumerable<T> extension methods. If you want to debug this a great way to do it is to examine the generated binary in a decompiler and you should see which methods the compiler decided to use. The async/await support also works this way - it will just look for methods with an appropriate signature and try to use them if possible: https://devblogs.microsoft.com/pfxteam/await-anything/

For both of the above examples, it's sufficient to write extension methods that target a specific type, so you don't have to change the actual types you want to query or await - just provide extension methods. This means you can do it for types you didn't write, as well.

For more advanced scenarios, you can write a query provider that consumes expression trees, and the compiler will generate an expression tree for your library to consume instead of calling a bunch of methods: https://docs.microsoft.com/en-us/archive/blogs/mattwar/linq-...



