OCaml is Pretty Great (2019) (chewxy.com)
128 points by lelf on May 1, 2020 | 138 comments



I am happy that functions/subroutines were invented and in common use by programmers by the 1970s; otherwise we would have had arguments that function calls with their stack-based semantics are confusing, and that explicitly writing out what you want your code to do instead of hiding it behind a function call increases code clarity.

There would have been a group of people who criticized Go’s lack of functions, but other people would have pointed out that functions are not strictly necessary and besides they will be added in version N+1 once we are sure about the best way to do functions.


If you learned the difference between reentrant functions and non-reentrant subroutines the way I did you would never equate them again.

Back in the early '80s, when I was 14 or so, I wanted to replace a slow bubble sort in a BASIC program of mine with something faster. I had a book that had a quicksort implementation in Pascal and I thought it would be straightforward to translate it into BASIC.

Of course the naive translation didn't work. Luckily the chapter before the quicksort in the book was all about recursion so I did figure out what was wrong. I also had enough experience with assembly language to be familiar with the concept of passing parameters on a stack. I just hadn't really internalized why someone would do that. In the end, I had to use an array to create a parameter stack to work around the lack of reentrant functions, but I did get it working, and it was a lot faster than the old bubble sort.
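For anyone curious what that trick looks like, here's a rough sketch in OCaml (illustrative only, obviously not my original BASIC): an explicit stack of (lo, hi) ranges stands in for the call stack the language didn't give me.

  (* Iterative quicksort: an explicit stack of subranges replaces recursion. *)
  let quicksort (a : int array) =
    let stack = Stack.create () in
    Stack.push (0, Array.length a - 1) stack;
    while not (Stack.is_empty stack) do
      let (lo, hi) = Stack.pop stack in
      if lo < hi then begin
        (* Lomuto partition around a.(hi) *)
        let pivot = a.(hi) in
        let i = ref lo in
        for j = lo to hi - 1 do
          if a.(j) < pivot then begin
            let t = a.(!i) in a.(!i) <- a.(j); a.(j) <- t;
            incr i
          end
        done;
        let t = a.(!i) in a.(!i) <- a.(hi); a.(hi) <- t;
        (* Instead of two recursive calls, push two pending ranges. *)
        Stack.push (lo, !i - 1) stack;
        Stack.push (!i + 1, hi) stack
      end
    done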


I don't understand what it has to do with reentrancy?


A (non-tail) recursive function actually needs a reentrant mechanism to implement the semantics of calling itself; this mechanism is a stack.

Without recursive functions, we have a bounded call depth and thus don't actually need a growable stack; a fixed-size array suffices.

We take for granted that function calls are always implemented with a stack, because that's how processors work today, but it could have been done differently. We could imagine an alternative timeline with much more restricted functions and no implicit call stack.

And indeed, tail-recursive functions are usually implemented with tail call elimination, in which the callee reuses the caller's stack frame; the function call becomes a simple goto, which transforms recursion into loop iteration. If we didn't have an implicit stack we could still have tail-recursive functions (and a bounded number of non-tail calls), and manage the other recursion cases with explicit stacks.
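For example, in OCaml (a minimal illustrative sketch): the first version below needs a frame per call, because the addition happens after the recursive call returns; the second is a tail call that the compiler turns into a plain loop/goto.

  (* Not a tail call: n + ... runs after sum returns, so a frame is kept. *)
  let rec sum n = if n = 0 then 0 else n + sum (n - 1)

  (* Tail call: nothing left to do after the call, so the frame is reused. *)
  let rec sum_acc acc n =
    if n = 0 then acc else sum_acc (acc + n) (n - 1)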

See for example Pratt parsing https://matklad.github.io/2020/04/13/simple-but-powerful-pra... which uses the implicit call stack, and contrast with Dijkstra's shunting yard algorithm https://matklad.github.io/2020/04/15/from-pratt-to-dijkstra.... which is the same thing but with an explicit stack. If we didn't have a call stack, we could always program in Dijkstra's style and be none the wiser.


My guess is that:

1) the quicksort algorithm which was used was a recursive version, and relied on local variables to do its job.

2) in the discussed BASIC implementation, local variables are translated to global variables (probably with some form of prefixing based on the function using them, to avoid name clashes). This means that functions in that language are never fully reentrant, and that somehow an explicit variable stack has to be implemented to recover the ability to recurse without overwriting variables prematurely.

Note: I initially had a doubt wrt reentrancy, as I knew the concept from the context of multithreading (I think it actually originated there), and indeed having concurrent uses of the same function rely on some global variable is problematic, but here there isn't any kind of concurrency. However, global variables can be a problem in the situation outlined above. Another possible issue, although not present here, is when a function F uses a global variable and accepts a function pointer. If that pointer points to a function that somehow calls back into F, then lack of reentrancy could also be a problem.

To refresh my memory, I must admit that I had a quick glance at stackoverflow (https://stackoverflow.com/questions/3052393/reentrancy-and-r...).


Subroutines themselves weren't the innovation. Everyone was already calling out to subroutines in their assembly code. The innovation was structured programming.


Stack-based subroutine calls, associated with structured programming, came quite a while after people started to use subroutines in assembly or higher-level programming. Fortran, for example, gained them in the eighties. Prior to that, everything was statically allocated and recursive calls were not supported.


I'm not sure that "stack-based subroutine calls" are associated with structured programming. An activation stack is required to ensure reentrancy, e.g. if a subroutine might be called from an ISR or a cooperatively-executing task, in addition to supporting recursion.


Could you give us an example of a pre-structured-programming, high-level language that supported the things you're talking about? PDPs had instruction support for cooperatively executing tasks, but the return address was stored in a register and was therefore limited to a depth of one.

From what I know, stack-based subroutines really became common around Dijkstra's famous paper, which really ought to have been called "Use the Stack, Stupid" rather than "GOTO Considered Harmful". I am more than willing to be corrected with citations to the contrary.


And at the time, a lot of programmers did whine and moan about structured programming.

The famous joke should be "there are two hard problems with computer programmers: whining, moaning, and off-by-one errors."


MJD speculated on much the same thing: https://blog.plover.com/prog/design-patterns.html .


Have people actually looked at the code? You can’t treat the benchmarks game code as a simple black box. Most implementations look like C regardless of the language, e.g. https://benchmarksgame-team.pages.debian.net/benchmarksgame/... or https://benchmarksgame-team.pages.debian.net/benchmarksgame/...

There is very little to infer from code which is so far from being idiomatic, aside from the fact that you might as well use C if you ever find yourself in the benchmark situation.


Some other examples of what you are talking about: In the regular expression benchmarks, some entries implement their own idiomatic regex parsers or link to the PCRE2 library, rather than use the regex library that comes with the language implementation: https://benchmarksgame-team.pages.debian.net/benchmarksgame/..., https://benchmarksgame-team.pages.debian.net/benchmarksgame/...

This is, arguably, totally fine, because these are still valid programs that run (and run quickly). BUT, it makes the benchmark programs poor choices to compare the verbosity of languages. So statements like "For a language famed for its terseness, Haskell it turns out, isn’t as terse as expected" can't be supported when comparing benchmark programs that were written to maximize speed, rather than written to minimize developer time.

Fortunately, the Benchmark Game does publish all of its programs, including the ones that don't "win" the speed race, and it's possible to find nice, concise, idiomatic Haskell programs in there.


> BUT, it makes the benchmark programs poor choices to compare the verbosity of languages.

Because?

Because the way some link to an external library is more verbose than the way they link to an included library?


Yes, that is what I mean. By the way, on the benchmark game website, is it still possible to sort benchmark results not by speed, but by gzip'd source code size?


On the task pages, click the column header:

https://benchmarksgame-team.pages.debian.net/benchmarksgame/...


> You can’t treat the benchmarks game code as a simple black box.

Well you can and obviously people do, but the advice given is "Always look at the source code."

> Most implementations look like…

So you counted !

How many of the 33 Haskell programs "look like C" ?

https://benchmarksgame-team.pages.debian.net/benchmarksgame/...

Or did you see a couple that "look like C" and cherry-pick ? :-)


The actual fastest Haskell programs are here: https://benchmarksgame-team.pages.debian.net/benchmarksgame/... These are the ones which are used for benchmark results.

The only fairly idiomatic ones (no pointers, no malloc, no unsafe) I could find are pidigits, binary-trees, and k-nucleotide. That's 3 out of 10.


So you did not mean "Most implementations look like C …" you meant the fastest implementations look like C ?

You might find this interesting — "One can, with sufficient effort, essentially write C code in Haskell using various unsafe primitives. We would argue that this is not true to the spirit and goals of Haskell, and we have attempted in this paper to remain within the space of reasonably idiomatic Haskell. However, we have made abundant use of strictness annotations, explicit strictness, and unboxed vectors. We have, more controversially perhaps, used unsafe array subscripting in places. Are our choices reasonable?"

pdf "Measuring the Haskell Gap"

http://www.leafpetersen.com/leaf/publications/ifl2013/haskel...


I think you're misunderstanding the point of my comment. Keep in mind the point of the original article: "For a language famed for its terseness, Haskell it turns out, isn’t as terse as expected - it’s average size of source code is larger than the average Go source code size."

I'm responding to this. Languages with many escape hatches, like Haskell or Swift, will result in less idiomatic code (and potentially longer programs) when optimized for the benchmarks game. Languages with fewer escape hatches, like Go and OCaml, less so.

I don't have anything against any of these languages. Feel free to write C-like code in Haskell if you want.


If "idiomatic code" comes with some downside that "escape hatches" overcome, then I see that could result in "escape hatches" being used.


It's a tough thing to study, but I'm not sure the definition of verbosity is particularly good.

>size of the clean (i.e. no comments, normalized white space) GZip’d program source code, in bytes.

Some languages idiomatically use comments and white space to aid readability, whereas others use more verbose function and variable naming. Gzip will also encode patterns that could be very verbose (switch statements, unrolled loops, functions with the same prefixes...) into very few bytes.


My team's been using Go for the past five years to build an open source test automation framework. None of us knew Go before starting on the project.

Five years on, we have no regrets. The language semantics don't get in our way; it "feels" easier to solve problems. Personally, with OOP languages I spend a lot of time designing vs. solving. Design is subjective. Go takes that burden away.


I am doing the same with Rust. Fell in love with the language and it feels great :)

Yeah you have to battle the borrow checker sometimes but it's there to help you avoid shooting yourself in the foot.


Interesting, but I find myself confused. There is not really much discussion about what kinds of implications these metrics might have in practice and what the limitations of this analysis are. Obviously the goals of programming languages vary, and outside of esoteric languages few really strive to be very terse. It seems at least a bit odd to me, then, to make value judgements based on this.

Hate to be a wet towel, since it is interesting. But does it actually mean anything?


When it comes to picking criteria to compare languages with, performance and succinctness have the advantage of definitely having a large impact on developer productivity/happiness (same thing tbh) while being (relatively) easily measurable.

Having recently jumped ship to a language/runtime that gives up to several orders of magnitude of performance speedup while maintaining similar verbosity, I appreciate the many opportunities this opens. I can write "lazy" code that pretty much always runs (way) faster than the stuff I used to carefully optimize. Perhaps the last metric I'd be interested in is the number of bugs in the average codebase, and while I've seen language comparisons for that, I'm not sure whether to believe that they're accurate.


This chart shows the balance between the verbosity of a program written in a language, and the runtime performance of it. The ideal programming language would sit at the lower left quadrant.

This is a very odd way to rank programming languages. A language which said nothing and did nothing would be in the lower left too.

I don't mind things being verbose if they are easy to read - compactness is not a good measure of clarity. Code is read far more often than it is written, so this is important, and performance is part of a set of tradeoffs against resource usage etc, it's not a static thing for all situations.


>A language which said nothing and did nothing would be in the lower left too.

Huh? This is like if I were to argue that I'm one of the top runners in the world since I can't finish a marathon and thus my finish time is 0. A language that is unable to solve the problems would not be listed at all or have an infinite value on the performance axis.

>compactness is not a good measure of clarity

Solely taking compactness into account is definitely not a good idea, since it just leads to code-golfy situations. However, I completely disagree that verbosity isn't a major detractor from readability, since it introduces a lot of noise that distracts from what actually happens. People can only keep track of so many things at once, so the simplicity of a single part often enough comes at the cost of making it harder to understand the whole. It's not a coincidence that 'higher-level' languages are generally less verbose than lower-level ones.


A language that is unable to solve the problems would not be listed at all

I was pointing out the absurdity of using these two criteria in isolation to judge languages; it turns out there are other important things, like finishing correctly, number of bugs, and clarity, not compactness. I don't think the ideal language can be judged on two criteria (or that there is an ideal language for everyone).

However, I completely disagree that verbosity isn't a major detractor to readability, since it introduces a lot of noise that detracts from what actually happens

Sure I agree extreme verbosity can be terrible too. So also can extreme terseness, unless it closely matches the problem domain, because otherwise now you have two problem domains you have to be familiar with in order to get work done - the actual problem domain, and the jargon invented by the programmer/language in order to solve it. This depends partly on the person reading and their preference for abstraction.


I agree, but it can go too far too. Last week I had the misfortune to be writing some AppleScript. Never will I write a more readable program.

I never want to write in AppleScript again though!


AppleScript is a really interesting attempt to make something easy to write for people used to English (not sure about easy to read). I've written a fair amount too but dislike it intensely as well - I'm not sure it's easy to read or to write, because the grammar is not well defined and the space of possible keywords and uses is expanded all the time by application developers who define which terms you can use.

A very interesting experiment though.


Let's try an experiment and replace the putdowney first part of the title with its upliftey second part.

The experimental bit is this: since titles basically dominate discussion completely, this change will probably convert a dyspeptic Go thread (boring) into a vigorous OCaml one (hopefully less boring). Of course that assumes that my posting this comment doesn't pull a Heisenberg on the discussion.

Edit: I forgot to mark this off topic so of course the meta aspect took over. (This was the top subthread, probably overnight. I'm going to downweight it now.)


You put me through a real emotional roller coaster :-)

"Oh cool, I always love some OCaml appreciation!"

Click thru.

"Oh no! More Go bitching?!"

But it ended happily, as the article was actually interesting.


Every effort to improve the quality of the discussion is valuable. Thank you.


I don't know if it helped much though - like others have mentioned, this is a quasi-Go article doing a Schrödinger on OCaml.


I probably should have read the article before trying a late night experiment. Also, I should have marked my comment off topic.


Thank you so much. Please do this more often!


Either of them alone kind of misrepresents the article. Why not both, e.g. "Go is a Pretty Average Language (But OCaml is Pretty Great)"


Thanks. The last thing we need is another discussion about the problems with a programming language. Those are boring but they're also pointless.


dang, while I appreciate trying to foster a good discussion, I like the HN 'no editorializing' rule and if we break it here, anyone can justify breaking it for their own posts, no? And also, it didn't really convert this thread into OCaml discussion anyway. Most people ended up talking about Go/Rust/etc.


I like the 'no editorializing' rule too! But keep in mind the full rule: "Please use the original title, unless it is misleading or linkbait; don't editorialize." [1]

"Go is a Pretty Average Language" is linkbait, so following the rule means changing the title. That's the first thing to understand, and the next is just as important: when changing the title, what should one change it to? The answer is to look for representative language in the article itself [2]. Avoid making up new language if at all possible—that way the content gets to speak for itself, plus you avoid the errors that tend to creep in when the submitter writes a title themselves.

So the thing to do is to comb the article looking for a phrase that can serve as a better—i.e. more accurate and neutral—title. Subtitles are a great place to look, as are the URL, the HTML doc title, photo captions, the opening paragraph, and if necessary the bowels of the text. There have been cases where I've read a long article closely, only to fish out a perfect phrase that summarizes the entire article exactly, in the middle of the 26th paragraph. You can nearly always find something.

In this case, I used the subtitle, which wasn't exactly editorializing. It wasn't clear that it really summarized the article either, but I wanted to try it as an experiment. Otherwise I wouldn't have posted a long explanation about what I was doing. Certainly a one-off experiment, just to see what would happen, does not mean that "anyone can justify breaking it for their own posts". If someone has an interesting idea for an experiment, I'm not against it (though they should probably let us know at hn@ycombinator.com so we don't misunderstand and kill it). But it has zero implications for promotional licensing.

[1] https://news.ycombinator.com/newsguidelines.html

[2] https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...


I agree with your finer points, but

> In this case, I used the subtitle, which wasn't exactly editorializing.

Technically it is, in the sense that it's the editor's job in a publication to select exactly which citation best summarizes the content of the whole piece.

So I suggest perhaps changing the "don't editorialize" part of the rule to better reflect the nuance of the subject.


Thanks for explaining. I agree the article title can seem like clickbait, but to cut out part of it also seems a bit ... off. I would probably add a little context to clarify, e.g. '[On a chart of verbosity vs performance] Go is a Pretty Average Language (But OCaml is Pretty Great)'


This is called 'misrepresentation'.


The article's own subtitle is misrepresenting itself? You'll need to make a better case if you want people to buy that.

Btw, the change I made was because of HN's rule. The original title was linkbaity, so the rule called for it to be changed. More explanation here: https://news.ycombinator.com/item?id=23045233.

It's true that the subtitle was also somewhat baity, but less so—and in any case, the change was an experiment, as I explained. If I hadn't been deviating from regular practice, there would have been no experiment, no explanation, no complaints, and therefore no responses to complaints, and this comment wouldn't exist.


Furthermore, by completely changing the title, you're grossly violating your own guidelines:

"Otherwise please use the original title, unless it is misleading or linkbait; don't editorialize."


This is a mis-characterization of this particular title change. He changed it to the subtitle. The part

>unless it is misleading or linkbait

Is what applies here. The resulting title is likely to lead to a better discussion, which is part of what HN is about.


HN is a benevolent dictatorship, which is a big part of why it remains such a great place.


I'd look at various tech sub-reddits (like Rust) and ask why other communities can do it with more transparency and better volunteerism.


I'd be happy to learn from communities that get things better than we do, but one critical bit for such discussions is that size is always a dominant factor. At each order of magnitude, problems get qualitatively different. I'd guess that the Rust subreddit is much smaller than HN (though who knows? Rust is popular). If so, the lessons are not likely to translate automatically. Similarly, lessons from HN don't translate to much larger outfits than HN.


I wouldn't necessarily call the community/website "great" given that the discussion would devolve into a flame war just because a different half of the title was used.


Which is of course their prerogative. I have notified the author of this article about dang's deliberate misrepresentation as a courtesy.


The reason why OCaml and Haskell are relatively verbose is that in order to achieve maximum performance one must write code in a decidedly unidiomatic style.
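For a flavor of the difference, here's a contrived OCaml sketch (mine, not from any benchmark entry): the first version is what you'd naturally write; the second is the benchmark-flavored style with mutation and unsafe indexing.

  (* Idiomatic: *)
  let sum xs = Array.fold_left (+) 0 xs

  (* Benchmark-flavored: explicit loop, mutable accumulator,
     bounds checks skipped via unsafe_get. *)
  let sum' xs =
    let acc = ref 0 in
    for i = 0 to Array.length xs - 1 do
      acc := !acc + Array.unsafe_get xs i
    done;
    !acc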


This is true, but also not necessarily an objection. It gives you a choice of decent performance by default, or very good performance when you really need it.


Reminds me of some blog posts about JavaScript and Rust I've read.

The author was arguing that you can achieve maximum performance with Rust while still writing it in an idiomatic way, which isn't possible with JS.


That might be true, but the converse is that even idiomatic Rust is extremely verbose and painful to the eyes.

Ideally, you'd be able to write mostly OCaml (higher level, closures, polymorphism, GC) and Rust (safe manual memory management) only when needed.


Which will be possible when algebraic types get integrated into OCaml; part of the work has already been integrated due to the ongoing multi-core changes.


It's already possible, there's https://github.com/zshipko/ocaml-rs


You mean algebraic effects? Can you link to more about how they specifically enable better memory management? I always thought that would require linear types (at least for Rust-like memory management - I see how effects would enable memory-pool / stack-like memory management).



I can't find any details on memory management (except regarding the multicore GC, which is not a user-facing change AFAIK).


I think coming from JavaScript, the wins in writing Rust are much bigger than coming from OCaml.

So, as a JavaScript developer, I currently favor Rust instead of OCaml.


Haskell's higher gzipped code size might be due to assignment providing semantic compression for other languages -- compression that gzip isn't applying to Haskell.

I.e. gzip is very unlikely to model assignment in its compression algorithm, so it will compress other elements -- and those elements probably correlate with the ones Haskell already optimizes for.

So the gzip metric makes the other languages look better than a pure LOC count would.


Does this imply that anti-functionalism is better?

I.e. gzip and the human brain can probably compress repetition and local effects very easily -- and this is what purity gives. So with purity I can only represent the obvious (i.e. easily computable), and gzip suggests this is the part of the computation that least needs expressing -- since it is already the most clear

But allowing anything to affect anything else in any way, as a maxim, allows me to represent the least-obvious algorithms most succinctly -- and the least-obvious algorithms are the most important ones to represent succinctly!

I feel like there is some truth here, but it is not the whole story. What am I missing?


I suppose I assumed less-obvious algorithms always have semantically meaningful differences from more-obvious algorithms.

But we can hope that this is rarely the case

More-obvious always wins if there are never any semantic differences

Good programming is finding the most-obvious algorithm while keeping the same semantics -- and even sometimes changing the semantics to make them more obvious.

In that vein, a side question I often ask myself is: which should the algorithm represent clearly? 1) The problem, 2) the solution, or 3) the transition between the two.

All three have validity, and result in very different programming paradigms.

How can I choose one above the other??


I'd like an answer to that if anybody would care to ;)

1) The problem: maps well to macros. 2) The solution: maps well to Prolog. 3) The transition: maps well to imperative.

What basis can I use for choosing between those goals??

-- The given mappings are examples


Shouldn't two programs with the same function but written in different languages zip to (just about) equal size, for a sufficiently powerful version of zip?


Such a version of zip can't exist. If it did, you could decide whether a program calculates a certain thing, and according to Rice's theorem something like that can't exist without solving the halting problem.

https://en.wikipedia.org/wiki/Rice%27s_theorem


JPEG compression works better on photos; GIF compression works better for horizontal lines of repeating color.

One of the easiest things to compress is repetition. After that local effects are going to be easiest

Remote effects -- i.e. assignment -- are going to be intractable: if the best compression for x depends on everything else, then zip is going to have to look at everything else for every single bit it compresses!

So I don't know the details of zip's compression, but I'm sure it doesn't compress side-effects well


Pretty great until you come across a project with thousands of lines of code and 0 type signatures. At least Haskell warns you about top-level declarations without type signatures. This is the error of the developer, true, but I think compilers should steer people away from doing such things.


For OCaml, generate annot files by adding the -annot flag. Then in Emacs you can do C-c C-t over any expression and it will tell you the type.

Edit: But I do think you have a point. In our OCaml programs we enforce interface files (*.mli) for all modules (the rule is enforced by 'make check'). So at least between modules there is always documentation and an explicitly typed interface.


In ocaml, install merlin or the upcoming ocaml-lsp, you mean :) It's excellent tooling and provides completion, jump to def, etc. Annot files are very limited in comparison.


Are you talking about OCaml? It's widely accepted community practice to have type signatures in separate interface files. Those make it pretty easy to navigate the code.


My main issue was when making code modifications. If I wanted to add an argument to a function I would have to find and fix tens of calls to that function in order to convince the compiler to infer the type that I wanted so I could use its error messages to find and fix the rest of the calls. In hindsight, I should've written annotations by hand for functions I had to modify.


Indeed, if you have an interface file which contains the function's type, you'll only need to change that one point to get all consuming modules to see the correct type.
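A toy sketch of what that looks like (hypothetical module and function names):

  (* geometry.mli -- the single place the exported signature lives *)
  val area : width:float -> height:float -> float

  (* geometry.ml -- the implementation; other modules only see the .mli *)
  let area ~width ~height = width *. height

Change the signature in the .mli, and the compiler will point you at every consumer that needs updating.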


Aren't they always displayed in a virtual line above?


In what IDE? I used vim and had to setup merlin/ocaml-lsp to deal with that project.


I used VSCode.


I wish this post and comments had more OCaml :(


> You may have noticed I left one of my favourite languages, Python, out. This is because I had truncated the chart at the given maximum X and Ys. Such a plot for Python would show no red line.

Is this because Python is very slow? If so, it would be interesting to see a graph for PyPy.


Maybe a logarithmic performance axis would make the plots clearer.


The original title of this blog post is "Go is a Pretty Average Language" and the sub title is "But OCaml is Pretty Great".

Why has the title of this HN post been changed to something which isn't either?



Probably to try and prevent this from turning into a flamewar. A lot of people just read the title and don't read the article and then comment based on what they think the article is about.


Solely because dang didn't like the title. Here are several things to consider:

1. HN's own guidelines state: "Otherwise please use the original title, unless it is misleading or linkbait; don't editorialize."

2. dang edited the title of a submission from another user. Almost certainly without asking.

3. dang edited a title that reflected the title of the article verbatim into something completely different that conveyed completely different information. Almost certainly without asking the author of the article.

It's HN's prerogative to do whatever they want on their site. However, considering the after-the-fact manipulation of an HN user's post for no good reason, and the misrepresentation of an article that someone else wrote, I do not trust the HN mods at all. Since they're already changing users' contributions on a whim, it's reasonable to assume that this extends to users' comments, the vote scores, and anything else visible on the site.

I'm going to get permabanned for this, and I don't care at all. I now have zero trust in the mods of this site, and the information it presents.


You're describing something that has been standard practice on HN for over a decade, and is well understood by this community—so much so that entire websites have been dedicated to tracking title changes on HN. If you want to know more about HN, it's easy to learn by using the search box that's at the bottom of every page, as well as the links (guidelines, FAQ, etc.) at the bottom of every page. Also we're happy to answer questions at hn@ycombinator.com, and HN users are happy to explain things to new users.

It's usually a good idea to take a little while to learn about the conventions of a community you've just joined before jumping into high drama. On the other hand, most new accounts that show up to complain about HN and its moderation are not really new users at all, but concern trolls—sturgeon of the second freshness: https://www.google.com/search?q=%22second+freshness%22&oq=%2....


What a silly metric. Is there any point other than feeding language-war trolls?


Go and Rust are harder to pick up.

With weird operators and syntax.

Maybe I am getting old, losing my edge, or Python has spoilt me.

Even C# is elegant compared to those two.

I tried to learn both of them, but they just didn't excite me as much as C# or Python.


Rust has a similar "problem" as C++: it wants to be high level (for productivity) and close to the metal (for performance). Such languages are needed, but they are not going to be the easiest to learn. It is not their fault.


There has to be a simpler solution than Rust though. Rust seems to have started with some clear objectives but evolved into a nightmare.

We should be putting the effort into the compilers and keeping the language free of cruft and any unsafe concerns. Much like Delphi did for example.


For most average userspace programs, garbage collectors work very well. (Of course there are exceptional cases, as I'm sure people will be quick to point out.) However, if you deny garbage collection for all programs, you have to replace it with something very complex to reason about, like RAII in C++ or borrow checking in Rust. Or you end up reimplementing GC inefficiently (C++ reference counting). Also, certain algorithms -- persistent data structures with shared subgraphs and therefore cheap updates -- become very difficult to write, because there's no local ownership of nodes; only something with a global program view (a GC) knows when an object is unreferenced and can be freed.
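The persistent-structure point is easy to see even with plain OCaml lists (a toy example): two lists share one tail, so neither "owns" those nodes, and only the GC's global view can tell when the shared cells become garbage.

  let shared = [3; 4; 5]
  let a = 1 :: shared   (* [1; 3; 4; 5], reuses shared's cells *)
  let b = 2 :: shared   (* [2; 3; 4; 5], reuses them too *)
  (* Updates are cheap because nothing is copied; freeing the shared
     tail is only safe once both a and b are unreachable -- which
     only a collector with a global view can determine. *)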


It's funny that one of the key selling points of Rust is "fearless concurrency", but IMO if you want "real" concurrency - i.e. not just "split work between threads" which is what Rust can do easily, but different threads doing the same work concurrently (e.g. accessing the same data structures, sending messages, etc.) you need lock-free data structures, and for those you need GC (or something different, e.g. epoch-based reclamation - but from what I've read (no practical experience though), it's just as tricky to implement and performs worse than good GCs).

I think the future is combined GC (for concurrency & unpredictable lifetime patterns) / reference counting (for resources - files, sockets, large primitive arrays) / safe manual (e.g. memory pools). I just need to figure out all the details :D


The crossbeam crate implements epoch-based GC for concurrency in Rust. It's widely used and seems to work fairly well.


This. The Rust community seems to be pushing Rust as a more-or-less general purpose programming language. But giving up garbage collection never makes sense unless you have to. The applications where you really can't use GC are a tiny niche relative to the ones where you can.


>> The applications where you really can't use GC are a tiny niche relative to the ones where you can.

And this is one space Rust is targeting explicitly. Why would people replace Go with Rust for performance reasons otherwise?

https://blog.discord.com/why-discord-is-switching-from-go-to...


Especially if you can do limited-scope GC like Erlang, and let the programmers pick when GC happens. Sort of like malloc/free, but without the fear of introducing bugs like double free.


I'm not really seeing the whole "nightmare" part of Rust. If the borrow checker ever becomes an issue (and it really shouldn't be one in most cases, if you're familiar with the semantics) you can easily opt into increased flexibility. It just takes a little more boilerplate than in other languages.


Sometimes when I look at the strange constructs of Rust, and all the hoops that you must jump through to do certain things, I end up yearning for C++. Or even plain old C. And I tell myself, that I’ll be more careful with memory handling.


I have the opposite: when I see C or C++, I'm terrified of touching it for fear of messing up the memory handling. Rust allows me to relax.


Exactly, C seems saner in my view, and in its own way, elegant too.


There is. It's called Dlang or D.


And the core team went down the C++ path in many cases.


What weird operators and syntax do you have in Go? I'm asking because I feel the exact opposite. Someone who already knows a bit of a language with C-like syntax can pick it up very quickly.


Flipped variable declaration, duck typing, and the idiomatic use of := instead of explicit declaration. Also the use of _ and multiple return values (which I really like). They're all a little tricky to begin with.

That use of := is my only very small hang up.


Whether variable declaration is 'flipped' or not depends on which languages you're used to. Duck typing has nothing to do with syntax or operators. ':=' is an explicit declaration (with simple type inference). The use of '_' to indicate an unused binding is common to many languages. Multiple return values in Go have about the simplest and most intuitive syntax imaginable. You separate the values with a comma both when returning them and when binding the returned values.


For Go:

1. Exports from packages are defined with a capital letter. To me, that is very hard to read. An explicit export statement is way better, if verbose.

2. Type declarations after variable (x int, y int), to me personally, is hard to read. Now I know a lot of people are fine with it, so this is a personal thing for me.

3. Multiple return values syntax is terrible.

  func split(sum int) (x, y int) {
    x = sum * 4 / 9
    y = sum - x
    return
  }
In this function, sum is the input and x & y are outputs. The syntax seems weird to me. This also is a personal gripe.

4. Variable declarations can be "var i, j int = 1, 2" and also "k := 3" with implicit typing. To me this is bad design, with various programmers choosing different methods to declare and initialize. It leads to a lot of cognitive overhead when reading code, especially on large open source projects. Why not make type declarations explicit? Sure, it's a bit more work for the programmer, but in the long run that bit of initial work will save a lot of downstream confusion and frustration.

5. Constants cannot be declared using the := syntax. I am sure there is a valid reason for it, but the language now has both = and :=, and each can and cannot be used in various places. That is confusing.

6. The for loop has no parentheses for the loop initialization but needs {} for the loop body. This is really stupid. Personally, for (i := 0; i < 100; i++) {} is a concise syntax and looks pleasing and readable.

7. To complicate it further, the initialization and post statements are optional.

  // This is a valid for loop in Go
  sum := 1
  for ; sum < 1000; {
    sum = sum + 1
  }
Isn't that confusing in itself? Sure, we don't have to write a loop that way, but some junior developer is going to write it anyway, and it muddles up the code for sure.

8. From the official Go Docs "At that point you can drop the semicolons: C's while is spelled for in Go"

  // This is ALSO a valid for loop in Go
  sum := 1
  for sum < 1000 {
    sum = sum + 1
  }
yay.

9. And then there is this gem in the "if" statement.

  if v := math.Pow(x, n); v < lim {
    return v
  }
Yes, you can write an optional initial statement before the if condition. All this for what? So you save one extra line in the code? Sorry, the beards got too long and too tangled up for whoever came up with this idea.

10. Also, with that initial statement before the if condition, the variable "v" is available in the scope of the if and else statement blocks, but not outside the if/else scope.

Yay, another way to confuse code readers.

11. What is up with the defer keyword to defer function calls until the "surrounding function returns"?

There are many more, but I think I have made it clear why, personally, Go is off-putting.

edit : Changed format a bit.


Uppercase exports are great once you get used to them. You can know whether an identifier is exported instantly without having to look up its declaration.

Agree that the multiple declaration/assignment syntaxes are unfortunate. Another example of this is that := can't be used for top-level declarations. Once you learn the rules you stop having to think about it, but if the rules were more consistent, they'd be easier to learn.

The other big issue with Go's variables is shadowing. Again, one of those things that you pick up fairly quickly, but will bite newbies a few times.

If-declarations are nice because they restrict the scope of a variable. I like knowing that I can see instantly that a particular variable is only used in three lines.

Defer is a great way to clean up resources without the complexity of destructors and exceptions. Defer is also ubiquitous in code that needs to wrap accessor functions in a mutex. It's been adopted by a few other languages, which shows that it has merit. My only gripe with defer is that it only works on function scope; it'd be nice to have a defer scoped to a for loop.
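(Since the thread is nominally about OCaml: a rough analogue there is Fun.protect. A minimal sketch, with a made-up function name:)

  let read_first_line path =
    let ic = open_in path in
    (* finally runs whether input_line returns or raises. *)
    Fun.protect ~finally:(fun () -> close_in ic)
      (fun () -> input_line ic)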

The other stuff is your personal preference, which I disagree with, but can't argue against :)

I'll add another personal gripe of mine, though: typecasting is indistinguishable from a function call, e.g. Foo(bar) might be calling Foo on bar, or casting bar to type Foo. Those should really have different syntax.


Don't know Go but I'm going to give it a shot:

1. Casing is a personal preference, so that's fair, but not a valid criticism.

2. Type decls after the variable are much easier to parse, and allow for (easier) type inference.

3. Yeah? References exist... C/C#/C++ have them too.

4. This one's a fair criticism.

5. That's because constants have a completely different syntax from variable assignment.

6. The parentheses are syntactically unnecessary and add nothing; the curly braces are required for an unambiguous parser (barring significant whitespace).

7. This is true in almost every language with for loops. Try it in the JavaScript console in your browser.

8. Yeah? So?

9. The scoping is a feature, not a bug, e.g.:

   if v := getResult(); v == Success {
     Print("Success!")
   } else {
     Print("Failed with error code ", v)
   }
   // v doesn't pollute namespace out here
10. That one I'm not educated enough to make a reply to.


Try prefixing code with two spaces instead of a ">"

  sum := 1
  for sum < 1000 {
    sum = sum + 1
  }


Have you tried C++?

Rust is an alternative to C++ or C, not to Go, C# or Python.

Sure, you can write any program on any language, but if you try to write bare metal C code from C# or Python, your frustration will be quite large as well.

Use the right tool for the job.


Just a minor correction: Rust is trying to be a replacement for C++. It is unlikely to replace C, at least for the next 2-3 decades. C will serve as the low-level language and Rust will be a higher-level language.

Today Rust needs LLVM (written in C++), to generate binaries and it may be 2-3 decades before Rust will be used to write anything like LLVM with supporting platforms.

Today, Rust libraries that do anything useful rely on unsafe integration with underlying C code.

Rust is still on the fringes and quite a complicated language compared to C. I still doubt it will be a replacement for either; maybe it will co-exist with C/C++ on the fringes, and maybe in the next 30 years, if it survives the way Python has, it will have significant code and be useful in place of C++.


No language will replace C on its turf, FOSS UNIX clones and embedded POSIX RTOS.

Everywhere else, C has been being replaced for a good part of the last two decades; even all major C compilers are written in C++.

As for Rust, yes a co-existence is more likely.


> Everywhere else C has been being replaced a good part of the last two decades, even all major C compilers are written in C++.

You keep saying this, but I'd recommend looking at a few random "C++" files in https://github.com/gcc-mirror/ for example.

For many of these projects, "moving to C++" often means to change the file extension. Or not even that, sometimes it's just changing the build setting to "Compile as C++".

There's no reason for holy wars. Fact is that regular C code is still ruling the world. Maybe not in the financial industry. Maybe it's embedded in a few C++ syntactic constructs. Maybe some people even found a way to use RAII in a way that doesn't suck in a large project. But the vast majority of the stuff that does something is still dangerous, evil, ugly C.

And now excuse me while I go watch another talk named "C++ Features to avoid" or "Taming build times in a large C++ codebase" or "Why we went back to plain C"


Yeah, the old USENET argument of comp.lang.c, kind of missed that.

Here are my reading recommendations,

https://gcc.gnu.org/wiki/gcc-in-cxx

"This page is meant to eventually help document the ongoing effort in the "gcc-in-cxx" branch to make gcc compile in C++ mode, i.e. as C++ source code. So, the goal of this branch is to facilitate switching GCC's implementation language to C++."

https://devblogs.microsoft.com/cppblog/the-great-c-runtime-c...

> "We have converted most of the CRT sources to compile as C++, enabling us to replace many ugly C idioms with simpler and more advanced C++ constructs."

https://llvm.org/docs/Proposals/LLVMLibC.html

"Provide C symbols as specified by the standards, but take advantage and use C++ language facilities for the core implementation."

You are excused, write a blog post afterwards.


There is a fallacy in taking enterprises that might be done "eventually" as being already completed and ultimately proven.

What you quote is not disagreeing with what I was saying at all.

Quick look at the GCC link you posted:

    Project Status (last updated 06/2009)
        Phase 1 of gcc-in-cxx now complete (06/2009)
        gcc-in-cxx completes bootstrap as of 03/2009
Seriously.

> "Provide C symbols as specified by the standards, but take advantage and use C++ language facilities for the core implementation."

Sure, now let's look at the code

https://github.com/llvm/llvm-project/blob/master/libc/src/

Oh, it's not even a completed project. Here is stdio

https://github.com/llvm/llvm-project/blob/master/libc/src/st...

stdio/ for example has just 3, mostly empty files: file.h, fwrite.cpp, fwrite.h. Last commit was 8 days ago. It's not a project to be taken as serious evidence for any argument.

You need to stop selling things that might never exist, or have long been dead in the water.

Apart from that, I've looked at 10 files and the only use of C++ I've seen was some namespacing (that just makes the code more verbose IMO; YMMV).

It's essentially C, the thing you're leading a religious war against.

In this day and age, where it's common knowledge that we're all living in opinion-reinforcement filter bubbles, it should be possible to expect people to at least quickly skim the citations they provide.


Ah so we are back into the comp.lang.c days. Reply will come in due time.


Just back to times where we at least try not to back up claims with citations that can't withstand a one-minute check.


Going over that LLVM-libc website again, this is absolutely hilarious. It reads like a bucket list of grandiose features, done by a junior developer with lots of motivation, but with no idea that these add costs and complications, and no idea what's really important.

The Git repo matches that impression. There are neatly arranged directories containing CMakeLists and fuzzing stuff and READMEs and what not. Is there a good reason why strcpy() needs to be fuzzed? Some directories contain a lot of header files with 20 lines of comment junk and namespace junk, surrounding a single line that contains a function signature.

And looking at the code, it seems not even 1% of the actual functionality is implemented. I will be extremely surprised if this project ever gets completed.


Sometimes I do make the mistake of firing off a hasty reply; I should know better.

Trying to reduce the amount of C code in the world since 1992, sometimes I win, sometimes I lose; in what concerns me, the outcome looks pretty positive outside embedded development and FOSS UNIX clones.

As for the whole C compilers written in C++ thread, a proper reply will follow up rest assured, even if takes a couple of weeks like last time.


> For many of these projects, "moving to C++" often means to change the file extension. Or not even that, sometimes it's just changing the build setting to "Compile as C++".

It's a bit more than that in the case of GCC. Not a lot more, but GCC does use templated containers. I'm not sure it uses any other C++ features, a kind of "C with templates" might have been sufficient for its purposes.

But then again, what would you expect? GCC is huge. What would a huge refactoring buy the developers?


All I've been saying is that they don't REALLY seem to have "moved to C++". FWIW


Moving to C++ doesn't imply having classes everywhere, anything described in ISO C++, requiring compilation with a C++ compiler counts as moving into C++.


But if projects are basically C compiled as C++, that's absolutely meaningless, and if you take that as an indication of anything then you're just fighting a pointless religious war.


On the contrary, that has been the argument from the C side since the comp.lang.c days trying to prove a point that C is still relevant without understanding that was C++'s Trojan horse.

Also just because it looks like C the semantics aren't the same, e.g. implicit conversions from void* aren't allowed in C++ compilers, ?: has a different precedence and a couple of other differences, this still in C89, let alone with everything that came afterwards.


These are pretty minor things that don't matter semantically or with regards to safety. For example I never knew that there was a difference in the precedence of the ?: ternary operator (which I don't use anyway), but I'd wager that it's more the byproduct of different language formalization methods, and not an intended change. There surely can't be a significant difference because it would hinder the Trojan effect.

void* not implicitly converting to the target pointer type in C++ is an annoyance to me at best. I don't see any advantage at all there.

Things that matter are for example that C++ has vectors or other container types with the possibility to enable bounds checking with a compile flag. Well, you'd think they matter, because it is a security improvement at first sight, and I certainly thought so. I do now think that it makes programs significantly harder to maintain, to the point where superficial advantages cancel out or turn negative.

If that effect wasn't real at least to some extent, I wonder why I see SO MANY function signatures in C++ projects where data is passed in the classic pointer + length style.

Furthermore tools like valgrind can be helpful in a way that gives some of the benefits of language-level methods for bounds-checking.


Why do you see them? Because developers should know better, but there is a community that keeps writing C-style code in C++, unfortunately, as their teachers are stuck in the 90s.

"CppCon 2015: Kate Gregory “Stop Teaching C"

https://www.youtube.com/watch?v=YnWhqhNdYyk

But what to expect when schools keep using Turbo C++ as teaching tool, https://galdin.dev/blog/why-you-shouldnt-be-using-turbo-c/


Oh my, another day, another random unrelated link fest. I do happen to have watched that talk, and I'm not sure what point you want to make by linking to it. That talk is someone else's opinion on how to get students to write some C++ that compiles - students with no prior understanding of C++ or of programming. Students that do not aspire to be great programmers or even to understand a little bit of what's happening here. That will not be able to help themselves when there's a 200-line error spewed from templated containers, for example. That don't even want to understand (or have trouble understanding) the concept of pointers.

It's NOT a talk suggesting that C is bad or that the use of it is bad (despite the title).

It's a talk that suggests that C++ needn't be hard. If we were about actually understanding the language and making significant use of its features, Bjarne Stroustrup himself begs to disagree with the idea that C++ is a language that an individual can fully understand.

It's a talk suggesting how to approach teaching when the goal is to get quick (and surely limited) results instead of understanding.

But that's totally uninteresting to me. If it's interesting to you, we'll just end up trading more comments, and I should try not to be annoyed when I read your evangelism and your ton of comments where you often hop around and quickly change topic when someone disagrees, and where you instead throw more links to pages that may have a title that suits your agenda, but that are often low on content or dead-in-the-water stuff, or that are completely unrelated to the discussion.

I'm tired of playing whack-a-mole with you.


> Things that matter are for example that C++ has vectors or other container types with the possibility to enable bounds checking with a compile flag.

As I mentioned above, templated containers is the one significant C++ feature that GCC uses. So maybe you now agree that it's not just C compiled as C++ after all?


Well, I was never going into a "technically" argument; I thought that was pretty clear from the start. I had already known that those containers are used there in some places (although I didn't find any when I quickly skimmed the codebase 2 days ago). So let's just say "yes, technically I agree" and maybe I won't seem like a totally boneheaded evangelist.

My actual intention and the reason for my comments was to put things into relation. Compared to what you want to argue about here, pjmlp has a totally different idea of how things are or should be, and which I think is rose tinted and skewed.

Using a templated container once maybe every few dozen or hundred lines is not a justification for unfounded and unbalanced hating, and it is no indication for taking this codebase as an argument that C is so bad that they couldn't deal with it anymore, so they had to switch to C++ which solved all their alleged problems. That's just not the way it is.

Not taking away from that technical point of agreement, and as an aside because I can't resist: I personally prefer to just stay with C completely/"technically", and I believe I get better results by improving the architecture of my projects to work around C's "shortcoming" of not having templated containers. One can even make a point that templated dynamic vectors are easily available in C, since C does have polymorphism for arrays, supported by its pointer syntax, and using valgrind and the occasional assert() you can get lots of the same benefits in terms of security. The way I go about this achieves the same level of convenience, minus RAII (which I found to be problematic at larger scale. Templated containers and maps are nice, though, for short throwaway code and temporary variables that never leave the function scope.)

I also actually make heavy use of a C99 feature that is still not in C++, which is designated initializers, to make static arrays (often readonly data) that map from enums to additional data. I also found that I've wasted a lot of my time trying to build arbitrary "nice" things in C++, always finding myself in a place some time later where I've totally locked myself in, so that's why I prefer just not to use a C++ compiler at all, even if there are some little nice-to-haves that you can get with it.

So there you go, my rant that is totally out of place here, why I don't think C++ is a better choice even if used very lightly.


Rust can absolutely replace C in everything except small embedded platforms where only a C compiler is available. With the unsafe subset, it can do anything that C can do. It's actually harder to seamlessly replace C++, because C++ has its own semantics that are not very easily replicated in Rust.


Hopefully something like zig will replace C. Much less complex than Rust, but with niceties like modules and arrays.


> Rust is trying to be replacement of C++.

Rust is not trying to replace any language.

> it may be 2-3 decades before Rust will be used to write anything like LLVM with supporting platforms.

Depending on what you mean by "like LLVM," this already exists partially in Cranelift. We'll see if it takes three decades or not.


> Rust is trying to be replacement of C++.

Citation needed.

Rust lacks so many fundamental C++ "features" (variadics, thin pointers, HKTs, const generics, specialization, overloading, inheritance, good FFI with C++, C++ exceptions, ...), that trying to sell it as a C++ replacement is quite hard.

OTOH, Rust is pretty much at feature parity with C, having excellent interoperability with it, and only lacking somewhat in inline assembly, allocas/VLAs, etc.

You can take any C project, and implement a new feature or refactor some part of it in Rust without issues.

You can't do the same for C++ code easily: integrating with C++ templates is hard (e.g. try to use a C++ Boost library, or the C++ standard library, from Rust), integrating with C++ error handling is hard (try to write Rust code that can error using C++ exceptions, or that can be unwound by one), calling C++ ABIs is hard from Rust, and interfacing C++ types with Rust is also hard (inheritance, virtual functions, etc.).

Any successful C++ project that uses Rust (e.g. Firefox) uses Rust as a better C to implement a C library that can be called from C++. That's not "being a better C++" in any sense of the word.


>> Rust will be a higher-level language.

It does lots of low-level things that are closer to C though.


I have used C++. I understand about using the right tool for the job. But Rust and Go are not good tools solely due to their syntax.


Similar opinion. The only two things I consider worthy at this point are C and Python. Everything else seems to have lost the balance of compromise: stability, low API and language volatility, the ability to effectively keep the language and its API in mind, and tool flexibility.

On the Go front I have to say I like the language but not the rigidity which quite frankly gets in the way a lot of the time and turns minor experiments and refactoring into a time consuming nightmare.

On the C# front, I've been writing C# since before the compile toolchain was even RTM and I dislike every moment now. It has become a bloated mess of a language and the supporting frameworks are impossible to keep track of and up to date.

Edit: I prefer SBCL over everything really but that's another universe :)


> On the Go front I have to say I like the language but not the rigidity which quite frankly gets in the way a lot of the time and turns minor experiments and refactoring into a time consuming nightmare.

Can you give an example of this? Only thing I can think of is renaming struct fields when making them public/private.


Have you looked at Nim?



