Interestingly enough, Microsoft arrived at the same number. I think it really stresses how hard it is to reason about memory manually. I'm still surprised how much stuff is being written in non-safe languages even when people could get away with a managed language.
One thing that I just recently noticed when scrolling through some GitHub repos is how much new software in the Linux ecosystem is still written in C despite that probably being avoidable, like flatpak.
You know, I totally buy that 70% of the vulnerabilities in complex C++ code relate to memory safety, especially in something like Chromium, which is incredibly complex and includes a lot of third-party code that wasn't originally designed to be robust to untrusted input, and which is also fast-moving, with a ton of code added or churning every week. But I'm not sure I buy that memory-safe languages will consequently be as beneficial for safety (especially in other kinds of software) as that fact would suggest.
For one, not all software is like Chromium. If you look at something like OpenSSH, the vast majority of their security holes have nothing to do with memory safety and are just logic bugs (often caused by features that somebody added that aren't core to the basic SSH experience, e.g. code that interfaces with X11 or something) or protocol weaknesses. (http://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=openssh)
The other effect is that in practice, memory-safe languages can come with security baggage of their own. If you look at the zillions of security holes in something like Rails or Wordpress or Django, a fair portion of them relate to an attacker's ability to invoke sophisticated-but-unintended behaviors that are more likely to be hiding in these managed languages (and their support libraries) than in something like C++. E.g. CVE-2013-0156 or CVE-2013-0277 or apparently any current Python or Ruby program that, even today, calls yaml.load on untrusted input. That kind of "security hole from unwanted latent functionality" is less likely to exist in C++. (I realize this is a contrarian view not shared by the vast majority of PL/security experts, but the ones I hang out with seem to interpret "memory-safe language" to mean "expert-written Haskell or, if you want to slum it, Rust" and are not thinking "random Ruby/Python/JavaScript/enterprise Java".)
Not to mention the countless high-profile security holes that have nothing to do with memory safety, e.g. Shellshock, "goto fail", Lucky Thirteen, BEAST, CRIME, POODLE, FREAK, Logjam, etc. Or bugs that are very relevant to Chromium but aren't really Chromium's fault and probably don't appear on their own list, e.g. Meltdown/Spectre etc.
I don't understand your argument. You're basically saying "it's possible to write safe code in unsafe languages and vice-versa". Obviously that's true, but I'm not really sure what that tells us.
I mean, supposing that Chromium was rewritten in a safer language, have we any reason to believe that these memory issues would be replaced by a similar number of non-memory-safety issues?
>That kind of "security hole from unwanted latent functionality" is less likely to exist in C++.
Why? That seems like a non sequitur to me. For me the real difference between the projects you list is that Chromium and OpenSSH are not, as you put it, "random Ruby/Python/JavaScript/enterprise Java"; they are heavily audited and have a lot of resources allocated to preventing the very issues you mention. Comparing OpenSSH to Wordpress and attributing their respective security track records mainly to the languages they use is fairly absurd IMO. If you're clueless enough not to properly sanitize untrusted input, can I really believe that you'd manage to write safe C code?
This thread is oddly reminiscent of discussions around gun control. Because something is not 100% effective doesn't mean that it's not valuable. Although I guess we do need these unsafe languages in case we ever need to overthrow a tyrannical government... no wait, I'm getting confused.
> Although I guess we do need these unsafe languages in case we ever need to overthrow a tyrannical government... no wait, I'm getting confused.
I have heard people say "we rely on exploits to be able to exercise software freedom on locked-down platforms" and "if games were written in Rust, a lot of the glitches speedrunners use would go away and we'd lose our hobby" so... you're not always off.
Wow. To equate logic errors with being clueless? You do not need to be clueless to not sanitize input. You just need to forget about it, or incorrectly assume the input is safe, or, basically have a bad day. I make bad assumptions and stupid logic mistakes all the time. I guess everybody does because I see logic bugs everywhere.
Also, you assert it is absurd to compare WordPress and OpenSSH in the context of this argument. Why? They are both widely used software with very high stakes on their reliability and security. They show very well the contrast: even though one is memory-managed and the other is not, that does not stop both from having serious bugs. Actually, it shows memory management is a don't-care variable.
On the other hand, if Chromium finds their project has many more memory-management issues than other issues, then yes: memory management, for the functionality of Chromium, seems to be a relevant factor where improvements can be made.
I think the previous poster was referring to SQL injection vulnerabilities from not sanitizing DB query parameters. And that's something you only f* up if you're clueless (like I was back when I used to do that).
I meant that input sanitization is generally an easier problem to solve than writing safe C. If you can't do the former reliably, I'm certain you wouldn't be able to do the latter.
Input sanitization should generally be handled very close to the interface; once you've gone through that step you should expect to only deal with safe data. Memory safety covers the entire application: if some non-critical debugging function 10 levels deep in the call stack messes up when it computes the size of a message buffer, you can have a remote code execution vulnerability.
>Also, you assert it is absurd to compare WordPress and OpenSSH in the context of this argument. Why? They are both widely used software with very high stakes on their reliability and security.
WordPress is a complicated ecosystem with multiple plugins that can each introduce security problems. Its attack surface is also incredibly large, since it's generally public facing and anybody can interact with a lot of the features by default. That, coupled with the fact that it's extremely popular, makes it a very good target for attackers. It's also a relatively fast-moving target, since the web changes fairly fast and new features have to be implemented regularly.
OpenSSH does mostly one thing and does it well. Its attack surface is a lot smaller and it's vastly easier to audit and test. Its development is also overseen by the OpenBSD developers, who are famously uncompromising regarding safety. The feature set is very stable and changes very slowly.
>Actually, it shows memory management is a don't-care variable.
The absence of evidence is not the evidence of absence. How much development effort went into making sure that these issues didn't exist? How many thousands of hours auditing the code and making sure it did what people wanted? In a safer language this time could've been spent auditing other parts of the codebase or implementing other features.
I'd also like to point out that one of the biggest OpenSSH vulnerabilities ever found (in the Debian version of OpenSSH where the maintainer heavily reduced the generated key entropy by mistake) was indirectly caused by a memory issue since the reason for the patch was a false-positive returned by Valgrind regarding use of uninitialized memory.
I think the point you're replying to is a fairly good one, so don't shoot the messenger.
There are tons of comments, including here on HN on a regular basis, which one could use to conclude that WordPress should be more secure than anything written in C no matter the quality. And yet, it isn't true in that instance. So it's fine to advocate rust or whatever, but let's not make that other, more dangerous conclusion.
Yes, it is important to note that switching to a memory safe language does not magically solve all your bugs. My own personal anecdote: when I found multiple bypasses in the sandboxing mechanism on macOS, it was not because the relevant code was written in C++, but because the dynamic linker is a really complex system; the things I found clearly flew under the engineer’s radar when they were designing it and they didn’t think beforehand about how those components interacted together. Still, being able to reduce the attack surface area to solely be logic errors is much better than having to deal with memory safety thrown in as well. (Similarly, languages with stricter type systems magically get rid of issues like “you gave me a string and I wanted an int”, but the problems of “you forgot to check the password in this one case” still remain.)
I think reduction in the number of errors is a worthwhile goal. Switching from C++ to a language without memory safety issues while being just as fast would reduce the number of things the developer and code reviewer have to worry about. They could spend more time searching for bugs in the logic, potentially eliminating further issues.
In practice what I’ve found is that people prefer to deal in absolutes. A large reduction in a category of bugs isn’t enough, it needs to be eliminated altogether, they say. If it’s still possible in a contrived example, what’s the point of investing in switching?
It's a miracle that people who think like that use computers at all - after all, computers only reduce the time it takes to perform a clerical task or a calculation, they do not eliminate it.
No, it's a reflection of a very important heuristic used in programming (that's arguably much more fundamental on a philosophical level, but the correct words to categorize it escape me): the zero-one-many principle. If there's more than one of something, you have to treat it as if there may be an unbounded number of them. You can only get it out of your head if you can put reliable bounds on it; best of all if you can prove there's less than two of something.
A thing can happen zero times (never). An example might be 1 + 1 = 0. You could get really unlucky with cosmic rays or something, but when adding two registers that both contain 1, the result isn't going to be zero unless there's some sort of hardware failure.
A thing can happen once. An example might be: you can delete a file once. There are ways to get unlucky, of course, but once it's unlinked it's gone.
If it happens more than once, you really should think about an unbounded number of times. Nowadays, that sorta means 2^64 times. There will be bugs when something overflows int64, but I hope you get the gist.
The parent comment is talking about invariants you can use in an algorithm.
I think you might be worried about Python vs. C or something along those lines. Really, it should all work with a pencil and paper, which is obviously going to be slower than pushing around electrons. But if you can find those invariants, 0-1-many, you can make a better algorithm. If you're stuck with a pencil, it'll still make that faster.
It's not a category error if you're doing an initial guess. Not to mention, particularly in information security, most of "risk calculation" is intuition and guesses, maaaybe sometimes plugged into a probabilistic framework with something resembling rigor.
I'd say binary considerations are incompatible with risk calculations because no person and no procedure is perfect nor perfectly followed, nothing is completely bug free, etc. Some small part of a calculation might appear binary, but other terms usually dominate.
Risk calculations are far broader than infosec and don't deserve the dismissiveness you seem to be casting towards them. Risk calculations are the core of business. Almost every decision a business makes is a risk calculation; every action has an opportunity cost if it isn't intrinsically risky, and actions with certainty are very rare.
(For the avoidance of doubt, I believe that use of memory-unsafe languages should be avoided if reasonably possible, but there are still plenty of reasonable reasons to use C, C++ etc. instead.)
I'm not trying to dismiss risk calculations. I appreciate them and the challenges involved, having worked on tools supporting risk calculations in the corporate space.
I feel this thread is getting out of hand. I initially replied to explain why, in general, the kind of thinking that leaves you unsatisfied with reductions (but not eliminations) of concerns is common among programmers: because it's a sound heuristic. Reducing is good, but eliminating is better.
In many ways yes but not always. After writing lots of C++ and then going back to managed world I often find cases where the opposite is true. Questions like "will this be mutated?" or "will the library shallow copy, deep copy, or hold a reference to this object?" are very hard to answer in most GC languages.
You're exceedingly lucky if the developers you work with these days are aware of the differences between shallow and deep copy, or how object references work.
This is where I make the case for having knowledge of asm/C/C++. The JavaScript developers that started on JavaScript or, maybe, Ruby have no clue what the hell is going on in the language they are working in. I see this all the time. From senior and lead developers, too.
They have no mental model for how pointers work. Which means they have no concept of what an object is, on a memory-organization level. They aren't always aware of when mutation will occur, and they are entirely too reliant on operations that are inefficient. In Lisp, this would be the developer that uses CONS everywhere. There is no awareness of the allocations behind the mechanisms. There is premature optimization and then there is not driving into the damn pothole in the first place! One of these is F1 racing and the other is basic driving skills.
I believe this may also explain why so many developers are so bad at using git. Git is entirely based on pointers. There is an elegance and simplicity to git that is lost on many people.
I'm not sure the lack of knowledge about pointers is the underlying problem. I think it's more that programming is one of the most common jobs today, and that alone lowers the general skill level.
In the old days people usually chose to become a programmer for a reason, today it is like any other job.
Most devs learn what they need to do their jobs. And you can be a very effective Ruby developer without needing to know much of this information.
Just like you can be a very effective C or C++ developer without knowing anything about CSS (which in my opinion is far more difficult to understand than when and how stack vs heap allocations are made).
I've never found programming so relaxing as when working in languages where almost everything is immutable. I think we're living beneath our privileges here.
I can agree with that when comparing with plain C code. But reference-counting in C++ and Objective-C leads to exactly the same problem as with managed code. What saves C is that reference-counting there leads to a lot of boilerplate and bugs. To avoid that one needs a very clear ownership model.
I saw an essay by someone discussing going back to C and eschewing leet pointer tricks and decentralized memory allocation in favor of centralized memory allocation and passing handles[1] instead of pointers, which appeared to make most of the memory-safety problems go away.
[1] Handles could be unboxed to expose a pointer, but unboxing had machinery to detect and trap on bad handles.
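For concreteness, here's a minimal Rust sketch of the handle idea (my own illustration, not the essay's exact scheme): a centralized store hands out index-plus-generation handles, and "unboxing" a stale handle is detected instead of turning into a dangling pointer. A real arena would also keep a free list and reuse empty slots, which is omitted here.

    // Hypothetical handle/arena sketch; names are illustrative only.
    struct Handle { index: usize, gen: u32 }

    struct Slot<T> { gen: u32, value: Option<T> }

    struct Arena<T> { slots: Vec<Slot<T>> }

    impl<T> Arena<T> {
        fn new() -> Self { Arena { slots: Vec::new() } }

        fn alloc(&mut self, value: T) -> Handle {
            self.slots.push(Slot { gen: 0, value: Some(value) });
            Handle { index: self.slots.len() - 1, gen: 0 }
        }

        // "Unboxing": returns None (i.e. traps gracefully) when the handle
        // is stale, instead of handing back a pointer into freed storage.
        fn get(&self, h: &Handle) -> Option<&T> {
            self.slots
                .get(h.index)
                .filter(|s| s.gen == h.gen)
                .and_then(|s| s.value.as_ref())
        }

        fn free(&mut self, h: &Handle) {
            if let Some(slot) = self.slots.get_mut(h.index) {
                if slot.gen == h.gen {
                    slot.value = None;
                    slot.gen += 1; // invalidates all outstanding handles to this slot
                }
            }
        }
    }

    fn main() {
        let mut arena = Arena::new();
        let h = arena.alloc("hello");
        assert_eq!(arena.get(&h), Some(&"hello"));
        arena.free(&h);
        assert_eq!(arena.get(&h), None); // stale handle is caught, not dereferenced
    }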
That reminds me of early Fortran, where everything was statically allocated, including local variables and the function return address. To simulate data structures one used arrays of various dimensions; effectively, array indexes were handles. This was not memory safe, as there were no bounds checks. But add those where the compiler cannot eliminate them, and one gets a fast and memory-safe language.
Edit: that static allocation was essentially a centralized store. Everything repeats itself.
That's the essay. I think it's interesting because he's mimicking the way I've been writing firmware. I've found most of the issues with memory safety disappear when you use static and centralized resource allocation, avoid monkeying with pointer arithmetic, and avoid making temporary copies of pointers 'for later'.
Sure, you can shared_ptr everything and pretend you are writing C#, which as you say circles back into the same problems of spaghetti ownership. The difference with C++ or Rust is that you have options: plain &-references for non-persistent pointers, const &-references for read-only non-persistent pointers, copies for true copies, move/unique_ptr for transfer of ownership, and shared_ptr for shared ownership. All this information is readily accessible to the caller through the function signature.
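To make the "readable from the signature" point concrete, here's a hedged Rust sketch (the `Config` type and function names are hypothetical, my own illustration) of the same menu of options:

    use std::sync::Arc;

    struct Config { name: String }

    // Read-only, non-persistent borrow: the callee cannot keep or mutate it.
    fn inspect(cfg: &Config) -> usize { cfg.name.len() }

    // Exclusive mutable borrow: the callee may mutate, but still cannot keep it.
    fn rename(cfg: &mut Config, new_name: &str) { cfg.name = new_name.to_string(); }

    // Transfer of ownership (move): the callee now owns the value.
    fn consume(cfg: Config) -> String { cfg.name }

    // Shared, reference-counted ownership: the closest analogue to shared_ptr.
    fn share(cfg: Arc<Config>) -> usize { cfg.name.len() }

    fn main() {
        let mut cfg = Config { name: "prod".to_string() };
        inspect(&cfg);
        rename(&mut cfg, "staging");
        let shared = Arc::new(Config { name: "shared".to_string() });
        share(Arc::clone(&shared));
        consume(cfg); // cfg is moved here and can no longer be used
    }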
> I can agree with that when comparing with plain C code. But reference-counting in C++ and Objective-C leads to exactly the same problem as with managed code. What saves C is that reference-counting there leads to a lot of boilerplate and bugs.
You've already been greyed out, but I agree 100% with what you said.
That's my main beef with this kind of bi-weekly discussion where people always put C and C++ in the same bag.
I rarely, if ever, make memory management errors in C. The model couldn't be simpler. If you don't explicitly allocate, you don't have to care about it (in the general case it is on the stack and will disappear when exiting the function; anyway, you also don't have to care about those implementation details, just treat it as if it disappeared with its scope). If you allocate something (on the heap), then you free it later. malloc(), free(), end of story. Okay, don't keep pointers to areas you will free or realloc, of course. Any function which returns pointers to memory chunks is always to be treated as returning malloc'ed memory; those objects can never be on the stack and never have any fancy automatic management of any sort.
Now almost every piece of code I wrote in Objective-C contained memory errors/leaks. Because (1) I didn't grasp the more complicated, less deterministic (so to speak) models and the jargon well; (2) you (well, I) never know right away which type of management was used for the object that some function returned; you have to check the doc and pray it is clearly written; (2b) you have to keep that information in mind until the moment when you'd like to release the said object, in order to release what should be released and not release what is/will be/has been automatically released now/later/sooner/who knows.
And I don't even talk about the abomination that is Objective-C++, where I drowned in muddy waters.
Correct, that is true for any language where you deal with raw threads, managed or unmanaged. That tells us that we should avoid using raw threads and instead use some other API to achieve parallelism.
It's not true in Rust. In Rust, objects cannot be concurrently accessed from multiple threads unless they are explicitly marked as thread-safe (with the Sync trait), or wrapped in a Mutex, which ensures that the safety invariants are upheld.
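A small illustration of that (a minimal sketch, not tied to any particular codebase): the shared counter below only compiles because it's wrapped in Arc<Mutex<_>>; handing a bare &mut across threads would be a compile error rather than a data race discovered at runtime.

    use std::sync::{Arc, Mutex};
    use std::thread;

    fn main() {
        // Shared mutable state has to be wrapped before it may cross threads.
        let counter = Arc::new(Mutex::new(0u32));

        let handles: Vec<_> = (0..4)
            .map(|_| {
                let counter = Arc::clone(&counter);
                thread::spawn(move || {
                    // The lock guarantees exclusive access while incrementing.
                    *counter.lock().unwrap() += 1;
                })
            })
            .collect();

        for h in handles {
            h.join().unwrap();
        }
        assert_eq!(*counter.lock().unwrap(), 4);
    }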
Of course, you can bypass this by using 'unsafe' code, where you are explicitly telling the compiler that you have manually verified Rust's safety guarantees.
This is why a significant portion of Rust's community gets agitated with (possibly unnecessary) use of 'unsafe' in popular libraries.
Then I stand corrected and applaud the creators of Rust for having taken these measures for writing thread-safe code, because it is super easy to get threading wrong, even for the most experienced developer.
Interestingly enough, this was the raison d'être of the Rust project at the beginning, while zero-cost abstraction and memory safety without GC (and the borrow checker) weren't part of the initial goal. When Rust was first released as a research project it had green threads and a GC (interestingly close to Go, actually), but it was already designed with data-race freedom in mind.
And in the last quadrant, you can use non-raw threads like the Task or Parallel APIs and still have threading issues in C#.
I'm trying to learn Android and some libraries give zero indication of what thread your callbacks will be called on.
Maybe there's a convention somewhere that says "On Android, you can't assume anything about callbacks, so always assume you're on some anonymous worker thread and lock everything or dispatch to a thread you own"
You cannot assume anything about process or thread lifetimes on Android, as the simple act of rotating the screen will restart your application, and it can be killed at any moment and restarted later, due to memory pressure or because the user has switched applications.
So whatever was the state of your application can be completely different when the callback is supposed to be invoked later on.
Memory safety and performance are not a contradiction; they are largely independent. Yes, some strategies like GC induce an overhead. But in the age of multiprocessor machines, this overhead gets very much reduced by parallel collection. And practice has shown that in programs with a lot of dynamic memory allocation, garbage-collected languages often perform even better.
There are different approaches, like memory safety by the compiler as Rust does it.
And with "performance" in mind, you shouldn't forget, that most large C/C++ programs tend to use libraries to provide some amount of safety, but those libraries are of course adding some overhead too.
Finally, the less time you spend on questions of memory management, the more time you have to write an actually fast program.
Sorry, this is hand wavy nonsense. One counter example, we switched from Cassandra (JVM) to Scylla (C++). It was a win in terms of both query latency and infrastructure costs as we required fewer machines to handle the same load.
As for having more time to write a fast program... that's funny. If you want a fast program on something JVM based you're pretty much going to be spending the majority of your time writing things in a way where the GC plays as little role as possible.
> Sorry, this is hand wavy nonsense. One counter example, we switched from Cassandra (JVM) to Scylla (C++). It was a win in terms of both query latency and infrastructure costs as we required fewer machines to handle the same load.
Sorry, this is not hand wavy nonsense. And what you are providing is called anecdotal evidence.
Also, your universe seems to consist only of the JVM as a memory-safe alternative to C++. Yes, there are a lot of bloated, badly performing programs implemented in Java. However, this isn't a given. Yes, some design decisions in Java introduce the risk of bloat, but you can avoid them with much less effort (and risk) than memory corruptions, and new features like value classes are reducing the bloat quite a bit. But still, the JVM is extremely high performance, so in surprisingly many cases it beats C++ for speed. Virtual method calls are simply easier to optimize at run time, the Java JIT creates excellent code, and Java has some of the best garbage collectors, so at really dynamic memory loads it beats any manual management by a wide margin.
And of course, there is a whole world beyond Java as alternatives. Rust has been explicitly designed to excel at the tasks C++ traditionally shines at, while giving you full safety.
There is an open-source project by LMAX (a forex trading company) called Disruptor[1] that squeezes as much as possible out of the JVM. It's awesome. I ported it to C++ years ago when I wanted to learn about low-latency techniques. However, if you look at the code they need to actually break out of the JVM's safety net to get the performance they need[2]. I couldn't help but ask myself why they didn't just use C++, and when asked one of the devs did admit that their own C++ ports had an approx 10% performance increase (although this was ~7 years ago maybe)
Rust is certainly interesting and it's on my radar. I wonder, though, when it comes to using it in anger, whether its guarantees turn out to be oversold, just like the JVM's safety claims were. Time will tell.
Edit: I tend to focus on comparing against the JVM because pretty much any framework you use on The Cloud is JVM based. I'm of the opinion that there are cost savings to be had if these were ported to more appropriate languages, hence the Cassandra vs Scylla comparison. The money saved was 'noticeable'.
Which is why such projects use the JVM: they save money on developer salaries, the developer pool, bug fixes due to security exploits, and the available set of tooling and libraries, while taking care to hand-optimize a tiny set of libraries for specialized use cases.
Java 15 just accepted the JEP for native memory management, yet another stepping stone for having value types support.
If the cadence continues, Java will eventually have all the features that it should have had in 1996, had Sun properly taken into consideration languages like Modula-3 and Eiffel.
Which you can get today in a language like Swift, C#, Nim or D: the productivity of a GC and type safety, while having the language features to do C++-like resource management.
There are notable examples where garbage collection has both better throughput and latency than manual memory management: persistent data structures come to mind, because in the absence of garbage collection, you have giant awful cascades of reference-count manipulation any time a structure would get freed.
You'll never actually see anyone tout the benefits of using GC for this, though, because the performance characteristics of persistent data structures are so horrendous compared to mutable ones, no one actually uses them in C++.
You can use GC to implement the persistent structure in C++/Rust just fine. But then you pay the GC cost for only that structure, not for all the other things.
I wouldn't be so sure. There is no easy answer. It typically wins in trivial benchmarks, where the total heap size is small. However once you have a big heap of other long lived stuff, there is a significant indirect cost caused by scanning the heap, memory barriers and evicting good data from caches. This cost is negligible only if your total heap size is much larger than the live memory set. Also modern manual memory allocators are not as slow as they used to be a few decades ago and they actually allocate in low tens of nanoseconds.
If your problem demands many heap allocations, GC also wins vs. manual memory management. Heap allocation in a generational GC is as fast as stack allocation, and you don't get fragmented heaps.
It is nowhere near stack allocation, please stop this nonsense. There was a paper claiming that, but the requirement was to set heap size 7x bigger than needed.
Stack is also very hot in cache. Memory that GC is handing allocations from is not.
I recently ported some of Cassandra code from Java to stack-only Rust and I got ~25x performance improvement, most from avoiding heap allocations and GC.
That depends on the allocator used. The default libc one works but isn't great performance-wise. It is possible to intercept calls to malloc/free by using LD_PRELOAD on Linux. That will allow you to use allocators such as jemalloc or tcmalloc instead.
Of course, repeatedly allocating and freeing is poor for performance. Cache/pre-allocate when you can. This goes for managed languages too.
It's usually a trade-off, since C++ may be more time-consuming and therefore more costly to write. It's also a lot harder to get bug-free because of the manual memory management.
Both Java and C# may be somewhat slower, but the maintainability and freedom from memory management issues more than makes up for this.
Any engineer worth his salt will take this into account.
> Yes, some strategies like GC induce an overhead. But in the age of multiprocessor machines, this overhead gets very much reduced by parallel collection.
If you're unlucky and the GC is not well optimized for a given workload, the memory-usage overhead can be huge. Just a few days ago I had such a problem with Go (HeapIdle grew until OOM).
Go's GC is less mature than Java's, but Java's GC is not free either - sometimes you need to spend a lot of time tuning GC settings or optimizing code to avoid GC problems.
In my case it would be faster to use malloc/free than to spend time fighting with the GC (looking for workarounds).
On average GC saves development time and allows you to avoid memory-management bugs, but in some cases the overhead is big and developers have to spend more time, not less.
I have not claimed that a GC is always faster. Indeed, having a GC enables some people to write a badly performing program. However, this doesn't contradict the fact that in many cases a GC not only doesn't mean a slower program, but sometimes means a faster one. Usually, you don't have to "fight" the GC. In situations where manual management is vastly better, you use object pools and preallocated arrays even in a language with a GC.
One of the Java vendors acquired by PTC, Aonix, used to sell real time Java implementations for military deployments, including weapons controls and targeting systems.
You don't want a couple of ms pause when playing with such "toys".
Both C and C++ have zero built in knowledge about parallelism, so in a modern world where most things are parallel and asynchronous I don’t agree at all with that statement.
> Both C and C++ have zero built in knowledge about parallelism, so in a modern world where most things are parallel and asynchronous I don’t agree at all with that statement.
And?
C (and even Fortran) had threads and was used to create high-performance programs with a high degree of parallelism before any "concurrent" modern language was even born.
You can disagree if you want. But the facts are there: 98% of the programs running on the biggest "parallel" machines nowadays (supercomputers) are C, C++ or Fortran.
You don't need to be "designed" concurrent to be efficient at it. The same way you do not need to be designed "Cloud-native (bullshit)" to run on a virtual machine.
I think that it sets a different mentality if it is part of the tools you use; you solve problems differently. It is of course possible to write highly concurrent code in traditional sequential languages, and your examples prove that, but supercomputers are a special case with budgets for that. I'm talking about the general case, and I think the post I replied to also assumed that.
We have to give programmers, with different backgrounds and training, tools to write highly performant code in their everyday job. Many of the tools we use today are not designed for that. We are stuck in a mental model 50 years old that is no longer true.
Here are some interesting stackoverflow answers. You are of course free to dismiss these answers as anecdotal.
Both C and C++ have had a concurrent memory model and threads in the standard library for a decade now. And POSIX threads which are pretty much the same thing have been there for a quarter century on *NIX.
What do you mean by language constructs? As in, a `synchronized` keyword? Or are you thinking along the lines of `async` `await`?
If you mean the `synchronized` keyword, then correct, they don't have that. Most languages do not have that concept. C++ does have mutexes and has had them since C++11 (nearly a decade). C++ also has as part of the language spec the concept of threads, again, there since C++11.
If you meant co-routines, then C++ just added them with C++20.
Or do you mean something different like green threads (ala go)?
C++ certainly has concurrency constructs and has been expanding them since C++11.
Like Go's goroutines or Erlang's processes. And more message passing. C++ has added many of these things later on in the standard library, bolted on afterwards.
If you think that using libraries to do things in C++ means they are "bolted on", then you don't understand C++.
The entire design of C++ is to enable efficient libraries for these kinds of things to be built. And it does, and has, and will continue to for decades more.
The real point of using C and (less so, but still) C++ is not to have the ultimate performance, but to have ultimate control over the program's execution. Garbage collection and real-time code are not compatible, because GC introduces random, unexpected stalls in the program flow; even simple reference counting can be harmful, if the last reference gets eliminated at the wrong time and creates a cascade of deallocations.
My personal experience from reading code that uses Chromium's C++ garbage collector is that that's often not true. While there might no longer be use-after-free errors, it's also no longer possible to make assertions like "object X must outlive Y" because object Y could be referenced arbitrarily and kept alive longer than expected. To get around that, objects might have an explicit shutdown step. But shutting down a large graph of objects is often fraught with peril, especially when that work can span multiple processes.
Some things, namely array indexing and RefCell borrowing, have unavoidable extra runtime overhead in the default safe usage, but it's unlikely that it is significant (often array bounds checks are either essential and thus needed in C++ as well or optimized out by LLVM in Rust, and RefCell usage is generally rare), and you can use unsafe unchecked operations as well.
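A tiny sketch of what that trade-off looks like in practice (illustrative only, my own example): the safe indexing below is bounds-checked, but in loops like this LLVM usually eliminates the check, and the unchecked form stays available behind `unsafe` if a profile shows it matters.

    fn sum(values: &[u64]) -> u64 {
        let mut total = 0;
        for i in 0..values.len() {
            // Safe: bounds-checked indexing; since `i < values.len()` is known,
            // the check is typically optimized away.
            total += values[i];
        }
        total
    }

    fn sum_unchecked(values: &[u64]) -> u64 {
        let mut total = 0;
        for i in 0..values.len() {
            // Opt-out: no bounds check, at the cost of an `unsafe` block the
            // author must justify (the loop bound makes it sound here).
            total += unsafe { *values.get_unchecked(i) };
        }
        total
    }

    fn main() {
        let v = [1u64, 2, 3, 4];
        assert_eq!(sum(&v), 10);
        assert_eq!(sum_unchecked(&v), 10);
    }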
Yes, they do exist. The safety of Rust comes from the compiler (which takes its time), not from run-time checks. For certain styles of code, the JVM is actually faster than C++. You also have to consider that a lot of the things you do in a C++ program to make it safer add performance costs.
It depends on the nature of the code. Static code is very fast in C++, but dynamic code is less so. Virtual method dispatch in C++ is comparatively slow. Hotspot can eliminate much of the cost by using runtime type information. Using runtime information is generally a way for Hotspot to perform optimizations a static compiler cannot do, because they would be a bad tradeoff. Heap allocation in the JVM is as fast as stack allocation, which gives another boost vs. heap allocation in C++. And even GCing the youngest generation is basically cost free, as long as you don't have many surviving objects.
I don't claim that Java is always faster than C++, that would be silly, and there are plenty of Java programs in the wild which prove that it isn't. But there are quite some tasks at which Java is indeed faster.
In the general case, virtual dispatch has an overhead in both C++ and Java (which btw is only <5 machine instructions), but it's true that Java can sometimes eliminate it with runtime information, which C++ can not use.
However, in my experience, using virtual dispatch is a relatively rare occurrence in C++ (compared to the vast majority of method calls). On the other hand, in Java, most of _everything else_ is indirect and has overhead: all objects are allocated on the heap, primitives (int) are often objects (Integer) where they ought not to be, all objects have 16 bytes of overhead, etc.
But the JVM will convert those heap allocations to stack allocations! And it will realize those Integers are used as int and remove the overhead! And it will realize you're not using the information on the header of every object!
Perhaps in synthetic benchmarks, but in real programs, where there is an almost infinite number of code paths, dumb 'data transfer' objects are common and things need to be modularized, the JVM is forced to assume the worst case can happen (even if you as a human can prove that it won't) and inhibit those optimizations. And now you have indirect accesses everywhere, memory overhead (= cache thrashing) everywhere, the runtime can't vectorize that tightly due to Integers, etc.
In fact, I can't think of any domain where there is heavy competition and where high performance is a determining factor where Java has beaten C/C++. In browsers, it certainly has not.
The number of instructions is much less important than what they are doing. In the case of virtual dispatch, it's doing a memory lookup. If that memory is in cache it could be relatively inexpensive but not guaranteed. However, if you have to hit main memory then things are much slower.
> But the JVM will convert those heap allocations to stack allocations!
I was surprised recently to learn this is not the case (at least, not with HotSpot). The JVM will try to "scalarize" things (pull the fields out of the object, which may push them onto the stack) but it won't actually allocate a full object on the stack (OpenJ9 will, but I don't see people using that very often).
It is also somewhat bad at doing the Integer-to-int conversion. That is mainly because the Integer::valueOf method will break things (escape analysis has a hard time realizing this is a non-escaped value). Simple code like
Integer a = 1;
a++;
can screw up the current analysis and end up in heap allocations.
There is current work to try and make these things better, but it hasn't landed yet (AFAIK).
> In fact, I can't think of any domain where there is heavy competition and where high performance is a determining factor where Java has beaten C/C++. In browsers, it certainly has not.
I think the realm where the JVM can potentially beat C++ is, funnily, work that requires a lot of memory. The thing that the JVM memory model has going for it is that heap allocations are relatively cheap compared to heap allocations in C++. If you have a ton of tiny short lived object allocations then the JVM will do a great job at managing them for you.
> I think the realm where the JVM can potentially beat C++ is, funnily, work that requires a lot of memory. The thing that the JVM memory model has going for it is that heap allocations are relatively cheap compared to heap allocations in C++. If you have a ton of tiny short lived object allocations then the JVM will do a great job at managing them for you.
Except C++ also has many other options besides malloc()ing each of these individual objects.
Allocation-heavy C++ can be pretty slow. It's also true that unoptimizable (pure) virtual methods performing little work have quite a bit of overhead that the JVM can avoid.
Of course you normally try to avoid writing C++ like that.
The claim is probably that idiomatic code in one case is faster than idiomatic code in the other, not that you can't write C++ code that is equally fast.
For example, if you use `std::vector` or `std::unique_ptr` you would be deallocating memory when those go out of scope. The JVM might actually never do that, e.g., if the program terminates before sufficient memory pressure arises.
Writing Rust code that leaks a `Vec` is trivial, but doing the same with a C++ `std::vector` actually requires some skill.
I disagree in that it requires no skill: it requires the user to avoid doing the obvious thing:
vector<int> foo {...};
and instead heap-allocate the vector with `new` (without using a smart pointer), and then avoid your linter's warnings about this (e.g. clang-tidy).
If this is common in your place of work, I truly pity you.
Also, trading a call to `free` for a second call to `malloc`, and a second pointer indirection to the vector elements, isn't a very effective way of improving performance to beat the JVM. The whole idea behind leaking memory is doing fewer operations, not more :D
In Rust, leaking "the right way" is trivial (mem::forget is safe), but in C++, leaking the stack-allocated vector probably requires putting it behind a union or aligned_storage or similar.
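For the record, the "trivial" Rust side looks roughly like this (a minimal sketch): `mem::forget` is a safe function, so deliberately leaking the whole vector needs no `unsafe` at all.

    fn main() {
        let v: Vec<u64> = (0..1_000_000).collect();
        // Safe API: ownership is taken and the destructor never runs, so the
        // heap allocation is intentionally leaked. Box::leak is another safe option.
        std::mem::forget(v);
    }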
> the problems of “you forgot to check the password in this one case” still remain.
Maybe type systems could, in some cases, prevent this type of problem. You could construct a system where performing a password authentication returns an object of type Authorized, and performing operations requires passing in objects of type Authorized. I don't know how often this is useful in practice though.
If your type system can handle the case where I can divert code flow along an unsandboxed path by replacing functions from the system libraries even before you can achieve code execution (and certainly long before you thought I had code execution), I would like to hear it. But expressed in a less brusque way: yes, encoding state in your type system is generally great, and there is a lot of work being done in languages where you can formally “prove” guarantees about your code. The issue is that your proofs only hold true if the system they run on is correct, you chose exactly the right thing to check, you didn’t just leave out part of the program because you couldn’t fit it into the type system…it’s a nontrivial problem.
Yes, I replied to the one I meant to. The reason we use type systems is that they let us prove things about our code: simple things like “this is a string, not an int”, more complicated things like “this variable cannot be null”, and advanced things like “I have written up a proof in a theorem-proving language that this program will not deviate from the published specifications in these ways”. My point is that the thing people really want to prove, which is obviously “does this program work and do so correctly”, is fairly impossible in practice due to some of the problems I mentioned. You can only approximate it along certain domains.
I do this all the time in Scala. Define a marker for "required security level", slap it in a free monad or add it to an existing effect stack, then it just naturally gets propagated right the way up, and you can then put the checking where it makes sense (e.g. right at the outermost request handling level).
A language with a decent type system can easily take care of issues like forgetting to check the password along one code path, or at least significantly reduce the effort involved.
It doesn't necessarily solve this completely, but it can be used to significantly reduce the amount of code you have to manually audit. For example, you could use tokens to represent permissions, and arrange your code so that only a small portion of your code can create them. Then you can rely on the presence of that token indicating that a check has been performed in the rest of your code.
The type system is a tool to reduce the space of invalid inputs.
This is also easy. There are different ways to do it.
Here's an example: create a type, such as "RoleAuthorization". Make it a required parameter for access-controlled methods, either through the constructor of the parent object or explicitly in the method.
In order to create a RoleAuthorization, you must first pass the user's Role object to an authorization method.
If a method requires authorization, the compiler will complain if you don't first check the current user's role through the authorization method.
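Sketching that in Rust (hypothetical names; only RoleAuthorization comes from the comment above, the rest is my own illustration): the token type's only constructor lives behind the authorization check, so privileged functions can demand proof in their signature.

    mod auth {
        pub struct Role { pub name: String }

        // The private field means this can only be constructed inside this
        // module, i.e. only via `authorize` below.
        pub struct RoleAuthorization { _private: () }

        pub fn authorize(role: &Role, required: &str) -> Option<RoleAuthorization> {
            if role.name == required {
                Some(RoleAuthorization { _private: () })
            } else {
                None
            }
        }
    }

    // The compiler forces callers to obtain a token before calling this.
    fn delete_everything(_proof: &auth::RoleAuthorization) {
        // ... privileged work ...
    }

    fn main() {
        let admin = auth::Role { name: "admin".into() };
        if let Some(token) = auth::authorize(&admin, "admin") {
            delete_everything(&token);
        }
    }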
> I'll wait.
If you're sincerely interested in discussing something, antagonizing someone this way (and implying that you are infallible) really doesn't help.
If you just wanted to make what you believe is a statement of fact, just make it.
If you don't want to discuss the issue at all and would instead prefer to be unchallenged on this topic, why write about it in a comments section?
That assumes you know all the roles ahead of time. In many systems they will consist of a large number of variants, both in how they authenticate and in what they're allowed to do. It may even depend on environmental state which cannot be represented in types (like: user X can issue commands Y, Z during times when they're on call, where "on-call" is a custom operator's rule, not something known earlier).
> Not to mention the countless high-profile security holes that have nothing to do with memory safety
I'm not sure this is a sensible way to compare - Heartbleed is a memory safety bug and is a bigger bug than the rest of the ones you've listed combined. The various terrifying nameless iOS exploits that have had actual real-world, documented usage are not clever crypto protocol bugs with catchy names, either.
OpenSSH focuses on security and correctness, not performance. That allows it to use simple, idiomatic C. Still, as with Chromium, it does use process isolation to mitigate memory-safety bugs.
Meltdown & Spectre are actually a consequence of C: Intel adapted their CPUs to emulate the architecture that C assumes, so that you as a developer can still believe you are working close to the metal. However, modern CPUs are very complex, so the abstraction was leaky, thus the bugs.
This is why we have to abandon C: not only is the language unsafe, it has also driven hardware manufacturing to a bad state with a reinforcing feedback loop. The more C we write, the more hardware manufacturers want to make us believe that we are still working on a PDP-11.
A very odd article. Everything it says applies to any other compiled programming language, not only C. Even if Pascal or Ada had become more popular than C/C++, it would still have led to the same architectural decisions we see in x86. In fact the biggest offender, SMT, is actually the opposite of the instruction-level parallelism the authors are blaming the x86 for.
Chromium, Mozilla, and Microsoft have posted studies that attribute ~70% of their security vulnerabilities to memory safety.
I'm not sure which point you are trying to make, but yes, 100%-70% = 30%, i.e., there are many security vulnerabilities in Firefox, Chrome and Windows that are not attributed to memory safety by these projects, and preventing memory unsafety wouldn't remove all of them (at most "only" 70% of them).
Nobody is claiming here that fixing memory unsafety in software fixes hardware bugs, nor anything about OpenSSH or other projects, and your anecdotes do not show how many security vulnerabilities are caused in those projects due to memory unsafety.
The "Rule of 2" link in the article actually says this : "A recent study by Matt Miller from Microsoft Security states that “~70% of the vulnerabilities addressed through a security update each year continue to be memory safety issues”.
C on Unix-like systems isn't so surprising. The K&R book devotes a large section to the Unix system interface for example, the Linux kernel is one of the biggest C projects in the world, and even when using a higher level language to write your programs you're ultimately calling into libc or wrapping other C libraries or syscalls.
It's a case of "When in Rome, do as the Romans do".
As Rust shows, it’s not really impossible from a technical standpoint in many areas. The issues are more to do with convention and legacy and familiarity, for the most part. (Technically, C does have a couple of advantages: it’s stable, familiar, and extremely well supported in most places, and has basically had the world revolve around it for a couple decades at least. But there are a number of domains where these are surmountable.)
C is also much better at dynamic linking, and results in much smaller binaries. Still not a good justification for flatpak, or for crun (the reimplementation of runc in C) for that matter. The Red Hat crew has a problem with their reliance on C, Python, and Go. They need something like Rust (or even Vala, which does ARC from what I can tell) in their arsenal.
What exactly is the issue with Rust and dynamic linking? Is it that it can't dynamically load Rust interfaces? There's nothing wrong with linking plain old symbols from shared objects in Rust.
Rust has no stable ABI, contrary to C, and also supports much higher-level concepts like polymorphism or higher-order functions, which are much harder to represent in a simple ABI.
Real dynamic linking with those (dlopen-style) can be a real nightmare, especially between compiler versions.
You can also add that Rust does not have a runtime, which is in many cases an advantage, but also tends to make libraries bigger.
Rust can create dynamic libraries that import and export symbols using the C ABI, so it is not worse than C in such a comparison, since it can do exactly the same thing as C.
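For example (a minimal sketch; the crate-type line is an assumption about how you'd build it, and `add` is just an arbitrary function), exporting a C-ABI symbol from Rust looks like this:

    // In Cargo.toml (assumed build setup):
    //   [lib]
    //   crate-type = ["cdylib"]

    // `extern "C"` fixes the calling convention and `#[no_mangle]` keeps the
    // symbol name predictable, so C, Python (ctypes), Ruby (FFI), etc. can
    // load this shared object and call the function.
    #[no_mangle]
    pub extern "C" fn add(a: i32, b: i32) -> i32 {
        a + b
    }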
Similar to C++ though, e.g. explosions of template expansions, you don't have to use those higher-level and harder-to-keep-concise features. Other fancy languages have stable ABIs; it's not impossible, nor does a stable ABI have to be complex to use.
Though the average one is probably harder, yes. I'd wager that it's mostly lack of care or motivation though, made a bit worse by common language features.
> Similar to C++ though, e.g. explosions of template expansions, you don't have to use those higher-level and harder-to-keep-concise features.
C++ has a stable ABI. Fragile, but stable. Or doing things like updating libQt on your Linux distribution without recompiling half of the world would be close to impossible :)
This is a relatively new feature and there have been one or two ABI breaks since they thought it was done that required recompiling half the world. I think it's been solid for at least most of the last decade though so... stable enough? I'm not sure if there is an actual spec somewhere though, certainly not a proper one from a standards body. It's just "whatever GCC or MSVC do on that OS/CPU combo" and those two happen to have stopped changing the ABI every compiler release at some point. They're both at least based on the relatively well specified Itanium C++ ABI though, so that helps.
C++ is also older than Rust by, oh... a few years.
I'm not aware of any plan to not stabilize Rust's ABI, it just hasn't happened yet. It's completely fair to label that a deal-breaker for using Rust, but trying to draw a hypothetical box around it with the label "its ABI will be difficult to create and/or use" seems a bit unwarranted.
It's not, people are just incapable of making realistic assessments of what level of performance they require.
If you're using C or C++ "for performance" and not using a profiler in day-to-day development, you don't need to be using C or C++ and should be using a memory-safe language instead.
"Wirth's law is an adage on computer performance which states that software is getting slower more rapidly than hardware is becoming faster.
The adage is named after Niklaus Wirth, who discussed it in his 1995 article "A Plea for Lean Software".[1][2]"
We could make better languages. But it just takes insane resources to compete with the other ecosystems: the IDEs, the Pandas, the format-on-save.
I've been a professional C++ developer in the past, and one of the great things is the tooling and the sheer knowledge of my past C++ colleagues. They know what happens in the kernel, they know the performance-optimisation tricks, they know a lot, because the knowledge they gather has a longer lifespan.
Ask a JS-developer and they will tell you all the web frameworks that came before React and their quirks...
Software is getting slower due to poor algorithm choice, not due to the one-off factor of maybe 2 that you pay by switching to a compiled but memory-safe language. If C++ was a sensible choice in 1995, 18 months later you would have got the same performance out of a memory-safe language; we're now 10 cycles of Moore's law down the road, our hardware is 1024x faster and our languages are certainly not 1024x slower (unless you're using a scripting language). It's more about stuff that's accidentally quadratic, and that kind of error is if anything easier to make in C++ than in a more concise language where it's easier to see what you're doing.
C++ developers may be smart people because you have to be smart to do anything in C++ - imagine if the same brainpower that's being spent tracking memory usage and pointer/reference distinctions could be put into your actual application logic instead.
I'd argue that the focus on performance is what drove the developers to C++ in the first place. So they will know both the algorithms and the low-level details.
If performance is on your mind constantly, why wouldn't you choose the one with the least restrictions on what you can achieve?
It's not like those people would find joy in being locked into the JVM instruction-set.
I agree with you that algorithm choice is what's most relevant.
However: I'd argue that accidentally quadratic algorithms are easier to hide in a concise language. Writing out a quadratic loop explicitly takes space, and that space alone might make people pay more attention than some subtle implicit language construct. Either way, the most common source of unintended quadratic (or higher) behavior is helper functions and library calls.
The other thing to keep in mind when it comes to algorithms is that cache behavior and therefore memory layout matters a lot for performance on modern hardware. Managed languages really stand in the way of optimizing memory layout, which can be a systematic performance disadvantage compared to C++.
I do hope we get some more innovation in the design space occupied by Rust, where you get fairly explicit control over memory layout, but still have statically checked memory safety guarantees.
> I'd argue that accidentally quadratic algorithms are easier to hide in a concise language. Writing out a quadratic loop explicitly takes space, and that space alone might make people pay more attention than some subtle implicit language construct. Either way, the most common source of unintended quadratic (or higher) behavior is helper functions and library calls.
I disagree. When every loop is full of cruft around setting up the iterators, it's easy to drift past what's actually happening. In a language where looping over a list takes a single syntactic token, it's a lot more obvious when you've nested several such loops.
> The other thing to keep in mind when it comes to algorithms is that cache behavior and therefore memory layout matters a lot for performance on modern hardware. Managed languages really stand in the way of optimizing memory layout, which can be a systematic performance disadvantage compared to C++.
C++ doesn't really make cache behaviour clear either though. I agree that we need better tooling for handling those aspects of high-performance code, but they actually need to come from somewhere lower-level than C++.
Nested loops are obvious in most languages, including C++ -- unless you happen to work with people who don't indent their code properly, but then you have bigger problems than the choice of language.
The real problems tend to come from where the quadratic behaviour doesn't come from nested loops, but from library calls. The canonical example of this is building up a string with successive string concatenation in C.
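To make the string-concatenation example concrete (sketched here in Rust rather than C, with hypothetical helper names): the slow version copies the whole accumulated string on every iteration, while the fast one appends in place.

    // Accidentally quadratic: each format! allocates a fresh String and copies
    // everything accumulated so far, so n parts cost O(n^2) bytes copied.
    fn join_slow(parts: &[&str]) -> String {
        let mut s = String::new();
        for p in parts {
            s = format!("{}{}", s, p);
        }
        s
    }

    // Linear: push_str appends into the existing buffer (amortized reallocation).
    fn join_fast(parts: &[&str]) -> String {
        let mut s = String::new();
        for p in parts {
            s.push_str(p);
        }
        s
    }

    fn main() {
        let parts = ["a", "b", "c"];
        assert_eq!(join_slow(&parts), join_fast(&parts));
    }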
As for cache behaviour, C++ allows you to control memory layout, which is really what's required there, while most managed languages don't give you that control at all.
> Nested loops are obvious in most languages, including C++ -- unless you happen to work with people who don't indent their code properly, but then you have bigger problems than the choice of language.
We live in a fallen world. In a large enterprise codebase there will almost certainly be parts that aren't indented correctly. And even if everything is indented perfectly, the sheer amount of stuff in a C++ codebase makes everything far, far less obvious.
Drivel. You can't have it both ways. It's easy to see what you're doing in C++ because you have to do it! It's the whole point of the language, and apparently the source of bugs.
Correct me if I'm wrong, but I doubt you've ever written a program in modern C++?
"Modern" C++ is the No True Scotsman of programming languages, so you define it clearly and then I'll tell you whether I've written any. But I've written C++, including professionally. I expect to write some at work tomorrow, in fact.
It's not easy to see what algorithms you're using in a C++ codebase, because most of the lines of code are taken up micromanaging details that are broadly irrelevant. Yes, C++ makes it easy to tell whether you're using 8 bytes or 16 in this one datastructure. But you drown in a sea of those details and lose track of whether you're creating 10 or 10,000 instances of it.
I'd define modern as using RAII extensively and using C++11 at least.
As for algorithms, I honestly don't know what you mean. They're all documented online with respective big-O running times[1]. If you're talking about making unintended copies of things, then yes, C++ does expect you to know what's going on... it's the whole point of the language. If that's too much for you then don't use it, but that doesn't make it a bad language (I'm not denying it has some hair-pulling moments). Use std::move() when appropriate.
In modern C++ the reference count is always 1 or 0 for 80% of the code, so there is no need to actually maintain a count. Another 15% needs a count, which I agree is slower than GC done right. The final 5% has cycles and cannot be handled by reference counting.
Not really. Only shared_ptr uses ref counting; unique_ptr doesn't, and looking at our code base (highly networking oriented) we only use shared_ptr once. You could, in theory, use shared_ptr everywhere, but then you're not using the language properly and may as well resort to Java or similar.
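A minimal sketch of the distinction (the Connection type is just a placeholder): unique_ptr is a plain pointer plus a destructor with no count anywhere, while shared_ptr keeps an atomic count in a separate control block.

#include <memory>

struct Connection { /* placeholder */ };

void unique_owner() {
    auto c = std::make_unique<Connection>();  // no reference count at all;
                                              // sizeof(c) == sizeof(Connection*)
}                                             // destroyed deterministically here

void shared_owner() {
    auto c = std::make_shared<Connection>();  // object + control block with an atomic count
    auto alias = c;                           // atomic increment
}                                             // two atomic decrements, then delete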
Handling large numbers of connections is more easily done in Java, in my experience, since Java's async support is relatively straightforward.
I would lay money that the language isn't your real bottleneck. Switching languages might save you a factor of 2, using better algorithms or datastructures can save you a factor of 1000 or more. How much profiling do you do?
> If you're using C or C++ "for performance" and not using a profiler in day-to-day development, you don't need to be using C or C++ and should be using a memory-safe language instead.
That's in my mind wrong for at least two reasons:
- If you are making building blocks (libraries) for other languages, you have to use C or C++ (maybe Rust soon).
They are currently the only languages that can be bridged to the rest of the world (Python, Ruby, JS) without losing your mind.
The main reason is that they have no GC, which means deterministic destruction of objects, which in turn makes them easy to interface with languages that do have a GC.
- When you aim for high performance you don't profile on day 1; that's a complete waste of time. You will never turn every one of your functions into a critical kernel. You profile when you hit a performance bottleneck and actually need it.
> If you are making building blocks (libraries) for other languages, you have to use C or C++ (maybe Rust soon).
Sure (though there are often other ways to achieve the thing you actually want to do). But in that case you're not doing it "for performance".
> When you aim for high performance you don't profile on day 1; that's a complete waste of time. You will never turn every one of your functions into a critical kernel. You profile when you hit a performance bottleneck and actually need it.
In that case your performance requirements are not extreme enough to justify using C/C++ for your whole application. Write it in a safer language, when you hit performance issues profile and optimise, and maybe drop into C/C++ for those few "critical kernels" in the unlikely event that it turns out you actually need to.
> In that case your performance requirements are not extreme enough to justify using C/C++ for your whole application.
Again: no. That is, in my experience, both wrong and over-idealistic.
For many applications, the overhead in dev time of writing bindings for every one of your compute kernels, plus the pleasure of debugging the problems associated with them and with heterogeneous build chains, is generally orders of magnitude higher in man-hours than just writing your program entirely in C/C++/Rust.
There are other aspects that are generally ignored:
- Theory says that 80% of the compute time is often consumed by 20% of the code. That's often wrong: many HPC simulators have no kernel taking more than 3-7% of total run time. Consequently, everything might one day need to be optimised.
- Many performance-critical algorithms are state of the art and keep evolving, meaning your innocent little function in a "memory managed" language might become tomorrow's new bottleneck. And you do not want to have to rewrite it all the time.
A lot of new devs are wrongly scared of manual memory management, when it has become a non-problem with RAII in C++11 and later, or with the borrow checker in Rust.
And generally the ones who are scared are the ones who do not use it.
The mental overhead of memory in C++ does not come so much from object lifetimes; it comes mainly from HOW to use YOUR memory efficiently: object alignment, cache effects, indirection, the cost of polymorphism, allocation, etc.
All these aspects, you do not think about them in a memory-managed language because you cannot: you have no control over them.
And that's also why they bite you in the face in terms of performance, generally much more than the 2x you quoted before. Just a reminder: one cache miss and that's ~200 cycles lost.
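A minimal C++ sketch of the kind of control meant here, assuming 64 bytes as the cache-line size: hot data is kept contiguous for a linear, prefetch-friendly walk, and a shared counter is padded to its own cache line to avoid false sharing.

#include <vector>

struct Particle {        // array-of-structs: each particle's fields sit together in memory
    float x, y, z;
    float vx, vy, vz;
};

struct alignas(64) PaddedCounter {  // 64 is a typical cache-line size (assumption)
    long value = 0;
};

void step(std::vector<Particle>& particles, float dt) {
    for (auto& p : particles) {      // linear walk over contiguous memory
        p.x += p.vx * dt;
        p.y += p.vy * dt;
        p.z += p.vz * dt;
    }
}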
> - Theory says that 80% of the compute time is often consumed by 20% of the code. That's often wrong: many HPC simulators have no kernel taking more than 3-7% of total run time. Consequently, everything might one day need to be optimised.
> - Many performance-critical algorithms are state of the art and keep evolving, meaning your innocent little function in a "memory managed" language might become tomorrow's new bottleneck. And you do not want to have to rewrite it all the time.
You can't have it both ways. If it's really common for everything to become a performance bottleneck, it's worth profiling from the start so that you avoid having major pitfalls anywhere. If it's rare and exceptional, FFI for those cases is fine.
> The mental overhead of memory in C++ does not come so much from object lifetimes; it comes mainly from HOW to use YOUR memory efficiently: object alignment, cache effects, indirection, the cost of polymorphism, allocation, etc.
> All these aspects, you do not think about them in a memory-managed language because you cannot: you have no control over them.
> And that's also why they bite you in the face in terms of performance, generally much more than the 2x you quoted before. Just a reminder: one cache miss and that's ~200 cycles lost.
On the contrary. Plenty of people do those kind of things in, say, Java. They require knowing about compiler internals and using unsupported hints, or even bypassing parts of the compiler. But so does controlling these things in C++.
> If it's really common for everything to become a performance bottleneck, it's worth profiling from the start so that you avoid having major pitfalls anywhere. If it's rare and exceptional, FFI for those cases is fine.
You can have it both ways; code evolves. It is pretty common in performance-critical code that a minor, almost-never-called function in one scenario becomes a performance-critical bottleneck in another. If you have ever played with large-scale simulation software, this happens almost every week, depending on your inputs and what you are interested in simulating.
> On the contrary. Plenty of people do those kind of things in, say, Java. They require knowing about compiler internals and using unsupported hints, or even bypassing parts of the compiler
No, again, it's not. I have been developing in C++ for 15 years, including in the HPC world, and I have (almost) never had to touch a compiler internal.
The language gives you what you need for performance; you do not need to play with that.
On the other hand, JIT compilers like V8 or the JVM are monsters of complexity, very sensitive to side effects [1], and controlling things like "does this data fit in my L2 cache?" in them is close to impossible, because even "where is my data and what is its size?" is a hard question.
Once again, there is theory and there is practice.
Theory is what you say. Practice is that 98% of performance-critical software in HPC, the game industry, physics and high-frequency trading is written in C++/C (maybe Rust soon). And this is why.
> You can have it both ways; code evolves. It is pretty common in performance-critical code that a minor, almost-never-called function in one scenario becomes a performance-critical bottleneck in another. If you have ever played with large-scale simulation software, this happens almost every week, depending on your inputs and what you are interested in simulating.
In which case you're in the "worth profiling from day 1" world. It's much easier to work on the performance of code when you're already working on it and have it in your head - particularly in a verbose language like C++ where it takes a relatively long time to comprehend existing code - so if there's a decent chance that the performance of this code is going to be important in the future, profiling as you write saves you time overall.
> No, again, it's not. I have been developing in C++ for 15 years, including in the HPC world, and I have (almost) never had to touch a compiler internal. The language gives you what you need for performance; you do not need to play with that.
I said be aware of, not touch. If you weren't doing things like memory alignment pragmas then I guess your performance requirements were never so stringent. Fact is that a Java program that's fitting its data into L2 or avoiding cache line aliasing will blow a C++ program that isn't out of the water.
> Theory is what you say. Practice is that 98% of performance-critical software in HPC, the game industry, physics and high-frequency trading is written in C++/C (maybe Rust soon). And this is why.
HPC/physics follow questionable development practices in a lot of areas, and the games industry follows questionable everything practices. HFT uses a lot of Java and even higher level languages. C++ survives because people are rewarded for being seen to put a lot of effort into performance, and are not rewarded for avoiding bugs.
> C++ survives because people are rewarded for being seen to put a lot of effort into performance, and are not rewarded for avoiding bugs.
That's pure bullshit.
C++ survives because, even in 2020, it does the job.
Most people criticizing C++ are still stuck in their minds on C++98 and its quirks.
C++ evolved and modern C++ is at least as productive as Java or C# when used correctly.
That's why it is still actively used and continues to grow.
This message just translates, at best, your feelings (as a Java/Scala developer?). And you allow yourself to insult both the HPC industry and the game industry based on your "feelings", without even providing metrics.
> C++ evolved and modern C++ is at least as productive as Java or C# when used correctly.
You're claiming this in a thread about how a flagship project from one of the biggest names in the industry found that 70% of their security bugs were things that wouldn't have happened in Java or C#. Your statement may be true, but only for a kind of "correctly" that doesn't actually exist in practice.
> That's why it is still actively used and continues to grow.
Where are you getting those stats?
> This message just translates, at best, your feelings (as a Java/Scala developer?). And you allow yourself to insult both the HPC industry and the game industry based on your "feelings", without even providing metrics.
Would you defend either of those industries as a haven of good coding practices? Do you believe that they have fewer bugs, make better use of up-to-date tools, make more data-driven decisions, than other parts of the industry? I'm repeating a reputation rather than a specific metric, sure, but does anyone actually dispute that reputation?
> You're claiming this in a thread about how a flagship project from one of the biggest names in the industry found that 70% of their security bugs were things that wouldn't have happened in Java or C#.
Which is a project that was born in 2008 and still ships code from the 90s. It also includes one of the most optimized (meaning complex) pieces of code worldwide: V8.
You have nothing that comes even close to the complexity, usability and popularity of Chrome in either the Java or the C# world. Ironically, even Microsoft uses a C++ core in its software, including MS Office and Edge. Maybe you should reflect on that.
> Would you defend either of those industries as a haven of good coding practices?
Every industry has domain-driven standards in terms of coding practice. They all have their reasons, based on deadlines, usage, iteration cycles, developer backgrounds and safety.
Pretending that one culture is superior to the others is both pretentious and reveals a bad misunderstanding of the world we are in.
Now this is my last comment on this thread. I do not think you are open to any discussion.
> Which is a project that was born in 2008 and still ships code from the 90s.
Back in 2008 C++ advocates were saying the same thing: all those errors are only in old codebases, modern C++ doesn't have those problems. At what point should we stop believing it?
> You have nothing that comes even close to the complexity, usability and popularity of Chrome in either the Java or the C# world.
Nonsense. There are dozens of more complex, more usable, and more popular systems written in Java and C#.
> Ironically, even Microsoft uses a C++ core in its software, including MS Office and Edge.
In the older projects that they're most conservative about, yes. Large companies change slowly. Doesn't mean what they're doing today is wise.
> Every industry has domain-driven standards in terms of coding practice. They all have their reasons, based on deadlines, usage, iteration cycles, developer backgrounds and safety.
Which is to say that good development practices will be a lesser or greater priority level in different industries.
There's performance-critical software in HFT written in Java and Erlang too. People like to think C++ has a monopoly, but it doesn't.
And game programming isn't normally "performance critical" so much as OS-less and embedded. Sometimes there are real-time constraints, but very few parts of a AAA game are in the inner rendering loop that has real-time requirements. Most of it is boring stuff that could be (and often is!) written in Python or, more often, Lua.
I will agree in part, if you aren’t using C or C++ in performance critical software you probably could use another language. I think your profiler-every-day requirement is, maybe, a tiny bit too restrictive, but not by much.
There is certainly a need for the industry (and hobbyists) to take stock of both what is necessary and what is just desired. I just don’t want to see the bar for using a language that doesn’t handle all aspects of memory usage and performance to be restricted to soft real-time applications and infrastructure projects. Digging into low level programming can be fun and rewarding.
My problem is essentially that even if I am not absolutely concerned with maximum performance across memory usage, binary size, CPU efficiency, bandwidth, etc. I still truly enjoy the options for control (or the illusion of it) that is provided by C. I enjoy memory layout design, allocator design, cache-efficiency considerations and the like; while at the same time, I don’t have a huge love of malloc and free or having to track down segfaults. I think ‘performance-by-default’ is a viable language design goal and want more of it in my tools.
I keep posting comments mentioning my pet language project, mostly in hopes that when I see it on my threads display I can continue to shame myself into getting it released on time, and this is one of those. These kinds of concerns motivated me to design a language for personal use that gives me what I want, but doesn’t require (but can) allow me to deal with other things. I enjoy using Coq, Haskell, Idris, Lisp, ATS, and Clean. But those language deprive me of things I really do enjoy. So I’m going for a low level language (in the Perlis sense) that has an experimental type theory. This will certainly allow for a memory safe subset and is the kind of thing I want to see more of from others.
The problem with memory management is a problem with lifetime management, which Rust reasons about in terms of ownership management, which it attempts to reason about statically, with help from the programmer. GC attempts to do the same thing, dynamically, with less help from the programmer.
Both of those methods still allow leaks, which is why Rust encourages RAII. [1] Are there other structured lifetimes we can get compilers to enforce for us, like they enforce certain invariants about flow control using control structures?
> I'm still surprised how much stuff is being written in non-safe languages even if people could get away with a managed language.
The distinction between unsafe and managed languages does not make any sense. You can for example have a safe C implementation. No need to move to another language.
Either way, safety comes from the implementation. It just so happens that so-called "safe" languages tend to have a single safe implementation, while "unsafe" ones tend to have multiple unsafe implementations and one or two safe ones.
Just consider that SQLite is written in C and the majority of its bugs are logic/optimization ones, not related to memory management.
And when talking about C++, since 2011 there have been smart pointers to help developers manage memory more or less automatically.
I believe most of the problems are due to the people at those keyboards typing bad code, especially when it comes to "smart" code.
In 2020 we have plenty of tools supporting the developer's job (memory sanitizers, static analysis tools, linting, profilers, etc.): the big problem is failing or refusing to use such tools.
Ah, the "bad code" fallacy. If we hadn't had enough experiences already, the article we are disucssing here clearly shows, how difficult it is, to get C/C++ code reasonably error free. Yes, there exist a free applications, which are pretty good in that respect, but if your goal is to be error-free, the development time balloons. And you are often limited to very simple memory allocation schemas.
If you think that "smart pointers" should be used, why not just use a memory-safe language?
I would argue that most of the memory errors actually pertain to C programs, not C++ ones (provided they use proper encapsulation and libraries such as the STL).
Syntactically, C is (more or less) a subset of C++, and I've seen many programs which claim to be C++ but are actually written using C methodologies.
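As a minimal sketch of the difference being claimed (the names are made up for illustration): the C-style version leaves the size arithmetic, bounds, and free() to the caller, while the C++ version delegates all of that to std::vector.

#include <cstdlib>
#include <vector>

// C methodology: the caller must remember the length, stay within it,
// check for NULL, and eventually call std::free().
int* make_buffer_c(std::size_t n) {
    return static_cast<int*>(std::malloc(n * sizeof(int)));
}

// C++ methodology: size and lifetime are encapsulated; memory is released
// automatically when the vector goes out of scope.
std::vector<int> make_buffer_cpp(std::size_t n) {
    return std::vector<int>(n);
}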
Actually, sqlite has its share of memory safety issues. https://bugs.chromium.org/p/chromium/issues/list?q=Type%3DBu... (and to be clear, this is just a coarse search, and the bugs there aren't necessarily all bugs in sqlite, but there are definitely some)
No, the issue is that the incentives are wrong (features, features, features for promotion) and indeed the engineers (on average) are not that talented.
This closely matches the number of memory-related security issues Firefox had in their CSS parser before the rewrite in Rust [1]
> Over the course of its lifetime, there have been 69 security bugs in Firefox’s style component. If we’d had a time machine and could have written this component in Rust from the start, 51 (73.9%) of these bugs would not have been possible
In this context I believe that the rewrite itself benefited Firefox more than Rust did.
At Firefox they already knew the problem they were trying to solve with Rust: they had already written that software and had already discovered many of the overlooked complications involved in writing a modern browser, so, in conclusion, even a rewrite in plain C would have fixed many of the bugs.
The simple act of rewriting the same software with prior knowledge of how it works usually leads to simpler code (at the cost of developer time).
Firefox is also the entity that invented Rust, so it's in their best interest to publicize it as "the final weapon" against bugs, but "if we had used Rust from the beginning these bugs would not have been possible" is just wishful thinking.
Rust itself would not exist without the browser wars and the pressure that the contemporary web puts on the software that runs it.
> At Firefox they already knew the problem they were trying to solve with Rust: they had already written that software and had already discovered many of the overlooked complications involved in writing a modern browser, so, in conclusion, even a rewrite in plain C would have fixed many of the bugs.
The Rust re-write was the third attempt; the first two were in C++ and failed.
The posts linked at the beginning of the thread talk about 51 memory safety bugs over the course of many years, from 2002 to 2018.
Stylo has been the default CSS parser since the beginning of 2018.
It's good that Rust could have avoided them, but is it a fair comparison?
I think that when Firefox started thinking about a new architecture to better enable parallelism, things began improving considerably; Rust is only a part of that.
and ported them to Elixir and still use them in my programming lessons
> Rust was key to the success.
For them
It's important to specify that Rust, built by Firefox, led to a Firefox success.
Just like Dart, created by Google, is the language of choice for Flutter, also created by Google.
I know you've been working at Mozilla on Rust, and I believe Rust is very good, but I also think Mozilla could have used other languages; there were a few that could have led them to success. But they understood that these are times where the "means of production" aren't the machines but the engineers' tools, and creating a programming language is the best way to control part of that world.
Which ones? If they tried 2 or 3 times with C++ and the Rust one succeeded, what other information would you need to have to convince you that Rust was the differentiating factor? It seems like you just don't want to admit that Rust was the key to their success in the project, even when you have someone who was there telling you that it was.
We aren't going to get research study levels of replication on large projects like this, so I don't know what standard you're looking for here.
> what other information would you need to have to convince you that Rust was the differentiating factor?
The fact that Chrome is doing just fine without it?
> It seems like you just don't want to admit that Rust was the key to their success in the project
It seems like you are attempting a classic argumentum ad personam. I agree that Rust was one of the contributing factors, as I already wrote, but just for Firefox, not in general.
Which is the original point of this sub-thread.
> We aren't going to get research study levels of replication on large projects like this
I don't think Firefox is the only large project out there, nor the largest.
Anyway, I wasn't implying anything bad, just that you worked for years at Mozilla on Rust, and it's like asking Anders Hejlsberg if C# enabled Microsoft to do things that had failed before with C++, or if TypeScript is better than vanilla JavaScript.
Mozilla developed Rust specifically for this kind of rewrite. That was the entire point of it. (As relayed to me by one of the designers in 2011 or so.)
They didn't think just a rewrite in C would be enough, they didn't think any other existing language would be sufficient, and then they went off to design Rust. So the statement "Firefox is also the entity that invented Rust" kind of misses the point.
Some people seem to think Mozilla gets royalties every time you invoke rustc.
And that if I just took off my rose-tinted glasses, I'd realize my Rust code is buggy, unsafe, slow, and hard to maintain, and the only reason I'm using Rust is because of hype.
It was pretty hyped when I started using it, in 2015.
> And that if I just took off my rose-tinted glasses, I'd realize my Rust code is buggy, unsafe, slow, and hard to maintain
I didn't get that impression.
My understanding is that Firefox talks about the success of its rewrite in Rust because it's their language: they control it and are its major sponsor and user.
I don't think Google or MS or any other company heavily involved in crafting programming languages for their own purposes will ever go that route for some of their core software, because they can't control the language, and if they tried they would get the blame for trying.
There is a branch in the repository right now trying it out. Rust is also used in ChromeOS.
> Mozilla controls Rust because it's the largest Rust user.
Mozilla is not the largest Rust user, nor does the largest user control the language. Governance is consensus-based, and anyone is eligible to join.
> You said it yourself "the only real way to get a job working on Rust was to work at Mozilla"
I may have said that a long, long time ago, but it's not true today. The Rust team at Mozilla has been shrinking, and other companies have been letting folks work on Rust as part of their job.
And volunteers are like, 10x-25x more numerous than people who are paid to do so.
You're not even acknowledging the fact that there could be different opinions on the matter; if I were you I wouldn't play the "you're dismissing evidence" card.
You work on Rust, that's a fact, I'm giving you credit for it.
Can you say it doesn't affect your judgement at all?
Rust is now an official UWP / WinUI binding, part of Project Reunion, core Windows platform, and is shipping on Visual Studio Code and Azure IoT products.
Microsoft is now an OpenJDK contributor and has bought jClarity; Java has had several talks at Build and has parity with .NET on Azure SDKs; Office doesn't dictate all business lines.
And I bet you weren't reverse engineering Office to discover which ActiveX controls were implemented in J++ instead of VB 6.
> Java has had several talks at Build and has parity with .NET on Azure SDKs; Office doesn't dictate all business lines.
And Linux is the most installed OS on Azure...
I can buy milk from my butcher, but his core product is still meat.
You still fail to see the difference between what they offer to potential clients and what they use internally.
They are expanding their offering, but they are still a software house in the end.
It also means that MS is throwing its weight around in free (as in free speech) technologies, like Google has done before with other OSS projects, and we all know how that ended.
> Office doesn't dictate all business lines
Obviously, it doesn't.
It only generates 33% of the revenue and 39% of the operating margin.
It is the second-largest segment by revenue, behind computing (mainly hardware), and the first by margin.
Cloud comes third - and last - in revenue and second in profit, with pretty strong growth - less than in 2018, but still strong - but keep in mind that they include the Office 365 online offering and cloud gaming in that segment.
> And I bet you weren't reverse engineering Office to discover which ActiveX controls were implemented in J++ instead of VB 6.
I think Rust is pretty much a meta-rewrite. The same way you describe a team learning from an existing product (CSS parser) what needed to be addressed, they applied the same logic but one level higher: on the very tool they were using.
I think it's clever, it was risky but it seems to be paying off.
Not to be snarky, but my initial response to this was "Large project that uses language known for memory safety issues discovers that most of their security bugs stem from memory safety issues". On a more serious note, I hope this motivates the Chromium team to invest more in Rust. While the other options sound like good "in the meantime" solutions, switching to a language that, at its core, is designed to prevent these sorts of issues would be a huge benefit to society as a whole considering that Chromium is the most popular browser engine in the world.
But "large project that uses language known for memory safety issues" describes a lot of important software today. It's not like there are tons of practical language options for memory-safe high-performance code, and there were essentially none when many such projects started.
So while this may be unsurprising to those paying attention to Rust and memory safety, it's still relevant to a lot of software and a great confirmation of Rust's importance.
The previous posters were talking about languages available when chromium was started. So while Rust probably would be the language of choice today (see Firefox), it wasn't an option, but Java was already around and mature.
I don’t think Java is an option either for writing a web browser specifically for reasons of overheard. Rust is probably an option now but bringing up Java in this context was weird.
Do you really think at any time in Chrome's lifespan that its developers would accept a decision that is so foundational (aka hard to reverse) and slows everything down by a factor of 2? I don't. If they had, I don't think it would be the most popular browser today, and arguably it wouldn't even still be around. IIRC, speed was a major selling point when it was released.
(I also don't really think Java is that much slower but substitute a more precise guess and my statement stands. And I think Java and other GCed languages really do use ~2X as much RAM which is also unacceptable.)
The speed of Chrome's from-scratch JavaScript engine was a major selling point when released, yes. But that speed came from a lot of places - after all, competing JavaScript engines were also written in C++, and in particular areas V8 was thousands of times faster than them.
Using Java over C++ would have meant a factor of 2 up-front performance cost. But it would also have meant significantly less time debugging, easier testing, faster iteration, better automated refactoring support... all of which would have added up to being able to spend more development effort finding the kind of algorithmic improvements that give you those 1000x speedups. I'm not at all convinced that the end result would have been slower.
In particular areas (short timespans, specific tasks), Java can be thousands of times slower than C++/Rust. (Warmup/pre-JIT function evaluation, GC pauses, etc.) I have these kinds of performance rough spots, but I'm surprised to hear you do. It doesn't seem consistent with an argument for Java.
I don't think you'll find a realistic case where Java is thousands of times slower than C++/Rust. You're even less likely to find a case where it's a worse order of asymptotic complexity. Whereas you'll find plenty of cases where (at least 2008-era) SpiderMonkey is in a worse complexity order than V8.
You don't have to look far at all to find such cases. Startup takes forever. It's not that there's a specific small operation that programs in both languages do and is O(log n) in C++ and O(n^2) in Java. It's more that the Java programs do work that is unthinkable with C++. They run an optimizing compiler (once they decide its worthwhile on a given function). That's not something C++ programs do. [1] Until that happens, they may be running completely unoptimized interpreted code. They might go through the whole thing again if they fill their perm gen (or whatever it's called) and evict the optimized version of the code. They run a garbage collector which sometimes stops the world. Your tiny function might have to wait for gigabytes of heap to be scanned.
Over sufficiently long runs, throughput is easily within the factor of 2 you mentioned. But over short timescales, thousands of times slower is completely plausible. Some C++ CLI application might run in 5 ms where the equivalent Java program takes 5 seconds.
IIRC there was some article recently challenging the assumption that Java programs commonly reach the optimized steady state at all. I can't find it though.
[1] Except V8 of course on the user-supplied JavaScript. I'd call that quite different than optimizing all Chrome's own code.
Sorry, I had a longer reply to your previous message that I lost somehow.
In the context of something like Chrome that's already willing to implement a custom process model, I don't think that there are business requirements where that kind of large overhead is unavoidable. Running an individual unix-style JVM process can indeed perform very poorly on the default settings. But that's not the only way to build a web browser.
It’s probably doable but I don’t know with what kind of performance envelope. Java famously has very high runtime costs in pathological cases, and it’s very easy to shoot oneself in the foot. I completely agree that memory safety is more important, but it’s not like they wrote a new rendering engine from whole cloth, they used WebKit.
Client warmup times are not great. Java is incredibly fast but attempts to write an HTML engine in it in the past didn’t really pan out. Also, WebKit was already written in C++.
Java when written like C is within a factor of 2 at best. Java when written like Java isn't anywhere close to C/C++.
And the gap is getting _wider_ as Java is an increasingly bad fit for modern CPUs due to the heavy pointer chasing nature of it. It desperately needs value types to stay competitive in a performance battle.
To do high-throughput transaction processing involving a mix of strings and numerical operations I'm pretty sure nothing actually beats Java, not even C++.
Numerical computing is a different thing. Low-latency audio as well, etc.
It's really, really easy to beat Java at string processing. UTF-16 + immutable strings is not a recipe for performance. The UTF-16 in particular is quite killer - it basically means you have ~half the memory bandwidth in many, many workloads, and the JIT & GC can't do anything about it (other than use up even more of your already limited memory bandwidth, that is)
As for "numerical operations" - what about Java makes you think it's at all particularly good at it? Especially once boxing enters the picture because you're using an ArrayList or whatever, and simple numbers become crazy heap allocations that triple their size.
Yes, if you want to write a loop doing a single type of processing, numerical or string, you can typically write better code in C++, or a shader language.
Most business applications don't follow that pattern at all. It's a mix of many types of operations. Look at the Techempower benchmarks, for example; the Java frameworks tend to do better than the C++ ones, at least when it comes to throughput (C++ is more competitive when it comes to latency).
I believe it's due to the HotSpot compiler doing aggressive, profile-guided optimization (you can do AOT profile-guided optimization but few people actually do it; in Java you get that automatically) as well as the garbage collector. Memory allocation is as fast as incrementing a pointer. Whereas C++ will tend to do way more housekeeping; in Java that housekeeping is amortized when you call the GC. That hurts latency but helps throughput.
If your workload is heavy memory allocation churn then yes Java will absolutely be a better fit, or really any language with a good GC. But that's not "string & numerical" processing.
As for Techempower, the top framework by a landslide is in Rust, not Java. That's a matter of whether the framework is optimized to win at Techempower or not; wasn't that the lesson actix taught everyone?
Judging from your lol, you must surely remember the Debian alioth comparison pages, or various coding competition websites, ranking the times for Java solutions very close to C++ solutions back in the 2000s.
The C and C++ versions of those benchmarks are manually vectorized to death using vector intrinsics. You don't have those in Java, nor in the standard versions of C and C++. So yes, those speedups are real, if you invest a lot of work. But if you don't, there is no magical 5x speedup of C++ over Java.
And the Java ones don't look anything at all like typical Java, either. But no, the C/C++ ones have not been all manually vectorized to death. The binary-trees one, for example, is a fairly clean C++ implementation, and runs in less than half the time using less than half the memory of the Java version. It looks like 'only' 4 of the C++ ones use any vectorization intrinsics.
> But if you don't, there is no magical 5x speedup of C++ over Java.
Nobody said anything about "magic." AOTs are really good. Value types are really good. Return types consistently being on the stack without requiring escape analysis is really good.
Java's performance is impressive for how crippled it is by the language, but HotSpot is definitely far from magic. It can't recover from the limitations of the language. You're fully paying for the "high level" & simple nature of Java.
> And the Java ones don't look anything at all like typical Java, either.
Which ones do you mean?
> But no, the C/C++ ones have not been all manually vectorized to death.
True.
> The binary-trees one, for example, is a fairly clean C++ implementation, and runs in less than half the time using less than half the memory of the Java version.
Unfortunately I can't benchmark this myself because it uses some library I've never heard of. The numbers on the Debian site look pretty outdated, Java binary-trees takes about 1800 ms on my machine. And about 1450 ms after letting it warm up.
> AOTs are really good.
So are JITs, but we are both really deep in hand-waving territory here.
> Return types consistently being on the stack without requiring escape analysis is really good.
The fast path of object allocation is bumping a pointer into an allocation buffer and checking it against the buffer's limit. Objects that are short-lived enough that you would want to return them by value in C++ will usually never leave the allocation buffer. So it's not the exact same thing as allocating on the call stack, but it's not far.
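A rough C++ sketch of that fast path, purely for illustration (the names are made up, and align is assumed to be a power of two):

#include <cstddef>
#include <cstdint>

struct BumpBuffer {
    std::byte* next;   // next free byte in the thread-local buffer
    std::byte* limit;  // end of the buffer

    // Fast path: round up for alignment, bump the pointer, check the limit.
    void* allocate(std::size_t size, std::size_t align) {
        auto p = (reinterpret_cast<std::uintptr_t>(next) + align - 1) & ~(align - 1);
        std::byte* end = reinterpret_cast<std::byte*>(p) + size;
        if (end > limit)
            return nullptr;               // slow path: refill the buffer or collect
        next = end;
        return reinterpret_cast<void*>(p);
    }
};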
> So it's not the exact same thing as allocating on the call stack, but it's not far.
Unfortunately it is far. Creating an address is quick, yes. But that address consistently won't be in L1/L2. Likely won't even be in L3. And if it's a return value containing more than one allocation that's easily more than one cache miss along with dependent reads.
That is, if you write your program for the benchmark, i.e. write C in Java using procedural code with packed primitives, or juggle all the magic hacks just to make the code fall onto the right JIT path.
Otherwise, if you write normal OOP Java, the performance gap is more like 10x.
Are you saying that the Java benchmarks linked above are written using "magic hacks"? None of them are even close to a 10x gap, despite the fact that some of the C++ ones do use magic hacks (vector intrinsics).
10x gaps occur whenever you have to interface Java with the real world—disks, memory, CPUs, virtual memory subsystems, networking stacks, etc. to get high-performance.
That's why Cassandra has so much C++ in it, and why ScyllaDB is so much faster still.
It's not that C++ per se is "faster" than Java; it's what C++ lets you easily do that Java doesn't.
Other comments have said—well, Java is more maintainable. That's also highly-dependent on the context. ScyllaDB has a much better developer velocity than Cassandra, too. (Anyone can easily verify this.) I use the Cassandra/ScyllaDB example frequently because they implement the same spec, and do so in a compatible way.
It's also really easy to put C++ on the fast path, by adding Python to the mix. For "business logic"-like situations (supposedly the bread and butter of Java), that's what actual companies do, here on Earth, with people: use Python for the easy stuff.
I like Java, and there are certainly some very high-performance Java projects (LMAX, Aeron) and it's very productive, has great tooling, and tons of libraries. There's nothing wrong with it. You can even layer more productive languages on top of the JVM. Win.
What I have a problem with are claims that "all this C++ code can be replaced with Java" at some hand-wavey minor cost. That's…not true.
People are not stupid, they use C++ today because it can do the job when nothing else really can—Java included.
P.s. It's not even true that Java allows you to "forget about memory management". I don't know why people keep saying that, but it's objectively false. If you care about performance, you have to be aware of memory allocations. The GC is not some magic "make my code run fast" card.
Furthermore, there are so many kinds of resources beyond memory! And the GC is an impediment in many cases to using those kinds of resources effectively. C++ has an extremely good story when it comes to managing every kind of computing resource in a large, maintainable codebase.
> 10x gaps occur whenever you have to interface Java with the real world—disks, memory, CPUs, virtual memory subsystems, networking stacks, etc. to get high-performance.
As noted before, the CPU- and memory-intensive benchmarks upthread don't even show a factor of 10x. Despite the fact that the C++ is heavily hand-optimized in ways that are not accessible to Java programs, and the benchmarks being very short running, so heavily penalizing Java's JIT compilation. Please come off the 10x horse, it makes you look like you are arguing from prejudice, not from data. Even for hyperbole 10x is way too much.
I have experience working on high-performance compilers for both Java and C++, and I can promise you that Java compilers don't generate so very different code for using the CPU or memory. Yes, Java has some overheads, but it also has some tricks up its sleeve.
If you give gcc the appropriate 'march' parameter it can use vectorisation automatically if it figures out it can do it.
The main argument here is that native languages let you get the most out of your CPU if you want, whereas with the JVM you're probably stuck unless you use JNI, in which case why not just go straight to native?
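For example, a loop like the following is a typical candidate for auto-vectorisation by g++ at -O3 with an appropriate -march; whether it actually vectorises depends on the compiler version and target, and __restrict is a common compiler extension rather than standard C++.

// Compiled with e.g. `g++ -O3 -march=native`, this loop is usually
// auto-vectorised, no intrinsics required.
void saxpy(float a, const float* __restrict x, float* __restrict y, int n) {
    for (int i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}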
> If you give gcc the appropriate 'march' parameter it can use vectorisation automatically if it figures out it can do it.
Yes, I'm aware of loop vectorization in C compilers. I'm also aware of loop vectorization in Java JIT compilers. Do you think there is anything specific to C or C++ that makes this easier than in Java? There isn't.
> The main argument here is that native languages let you get the most out of your CPU if you want
Most applications, including probably more than 99% of Chromium, are not the kind of code that would benefit from manually written vector intrinsics. Does the "main argument" claim that these 99% would also be 5x or 10x slower if they were written in Java?
> It's not like there are tons of practical language options for memory-safe high-performance code, and there were essentially none when many such projects started.
Chromium started in 2008, at which point there were plenty of mature cross-platform high-performance memory-safe languages that would have been suitable (e.g. OCaml or Haskell).
I am genuinely curious about this, do you really think Haskell has performance characteristics that are suitable for implementing Chromium? Haskell is wonderful for many uses, but it is difficult for me to imagine it as the basis for a modern web browser.
I will concede that a branch of GHC being utilized specifically for such a task could have been modified enough in the intervening years to enable a viable browser by 2020. But I really don’t know if it could be done with standard GHC.
> I am genuinely curious about this, do you really think Haskell has performance characteristics that are suitable for implementing Chromium? Haskell is wonderful for many uses, but it is difficult for me to imagine it as the basis for a modern web browser.
Short answer: yes. Long answer: profiling and diagnosing performance issues is still kind of a black art, but for large codebases I've seen Haskell rewrites outperform C++ significantly. You need talented Haskell developers, but surely Google should be able to find those. Most of the tasks that I can think of in a web browser seem like things that Haskell is ideally suited to - parsing, data transformation, rule-based logic - what is it that makes you think it would be unsuitable?
The memory consumption, especially the semi-non-determinism relating to lazy evaluation, was the first thing that popped into my mind. But, as I said in a comment below this, I think that Haskell as the common case language, with some FFI accessible code would probably fit the bill. I can not think of any large Haskell projects in the application space where browsers sit, but it doesn’t mean it couldn’t be done.
Chromium took parts from WebKit, but e.g. the whole JavaScript engine was completely new. Even if they needed to depend on some libraries written in C++, that doesn't justify writing the whole application in that language. And once they forked WebCore as Blink they could have gradually pushed the borderline down.
I wish I had read this comment before I replied to your parent; if I had thought about a primarily Haskell or OCaml implementation with FFI, gradually shifting over time, I might have just kept my mouth/keyboard shut.
The "large project" part pretty much explains why the don't just rewrite it in Rust. It might be easier to audit the source code and rewrite the offending code, write C++ libraries to mitigate the issue or both. Also see:
> It might be easier to audit the source code and rewrite the offending code
What do you think Chromium developers have been doing for the last 10 years, sitting on their hands? My understanding is that the main reason why Google OSS-Fuzz exists is Chromium. The problem with this approach is that it evidently (see OP) doesn't work to find all the bugs before release.
Has there been any work on how to leverage Rust's strengths for a GC language implementation? For example, Rust assumes unique ownership for mutable values; how do you express that the value is also reachable by the GC?
If by GC you’re referring to its reference counting, either by Rc or Arc, there is no GC reachability beyond the reference counting. The responsibility for counting the contained object and dropping (cleanup before dealloc) is up to the type, either Rc or Arc (a stands for atomic, not automatic, as both are “automatic” in the sense that you never need to manually increment or decrement a reference). the owned, mutable object would contain the type information necessary to generate the right Rc calls ay the right time, but a distinct type is necessary to encapsulate the behavior.
There are also escape hatches, between unsafe leaking and the RefCell type, to provide “interior mutability”, i.e. mutable aliasing via an without the memory safety issues which come with this. Note I have only had to use the unsafe bits when passing memory up to a c stack.
I don’t believe there is a tracing GC implementation in the standard library, although I’ve heard plenty of discussion and you can find some simple implementations on crates.io. However, I would assume there would need to be some language changes to properly account for assumptions on how the Drop (free) functionality fires, not to mention significant backend work for tracing to work.
Let's put it this way: if you have a &mut to a heap allocated object (Box, whatever), you pretend it's unique, BUT its contents are also reachable by the global allocator, say, malloc. So it's not really a unique reference.
If you allocate something new, Rust pretends it is independent, assuming that malloc won't muck with the other objects it knows about.
What if we relax that assumption? Allocations may perturb other allocations - that's how moving GC heaps work! Can that be modeled usefully in Rust?
&mut T means that there is only one pointer (the &mut T) that will be used to access some memory. You can have as many other pointers to that memory as you want, as long as you don't invalidate the first one.
This code is safe:
let mut foo = 13_i32;
let x: &mut i32 = &mut foo; // First &mut
{
    let y: &mut i32 = unsafe { std::mem::transmute(x as *mut i32) }; // Second &mut T (copy)
    *y = 42; // Write through second
}
dbg!(*x); // Read through first
The allocator API hands out `*mut u8` so... not really. If you changed the allocator API and everything that uses it and makes assumptions about how it works then sure, you could make it hand out a pointer to a pointer so you could come in later and update the inner pointer after you've done a GC pass. You couldn't do something like handing out a normal `Box` and keeping track of where it lives so you can patch it later though. Rust doesn't have move or copy constructors so as it gets passed around it'll just be `memcpy`'d and you'll lose track of it.
Ah, I see what you’re saying, and I am afraid I am out of my depth because this touches on a lot of maintainer intentions I can’t speak to. You are correct in implementation assumptions at least.
Edit: I left out a phrase apparently; it should be “...mutable aliasing via what is, apparently to the lifetime checker in each function, an immutable reference”.
This isn't really directly related to your question, but I recall hearing that in Servo and Firefox, the usage of Rust is limited as it gets close to the DOM specifically because of interactions with the JS engine and its separate garbage collector, which Rust does not understand.
Without any more specific knowledge, I bet the way a generational garbage collector moves data around really messes with Rust's view of the world...
That's an interesting question, and there has been some work in that area.
Getting shared mutability as a user of GC is basically the same as getting shared mutability in any other scenario in Rust. You use `UnsafeCell<T>` or a safe wrapper around it, like `Mutex` or `Cell`. You can get a feel for this even in idiomatic Rust using `Rc` instead of a tracing GC.
When you start looking at other methods of GC things get trickier. Even with `UnsafeCell<T>` the GC can't, in general, know whether something is borrowed (mutably or immutably) by the mutator, so the problem to be solved is finding ways to enforce that there are no references into the heap at all, when collection needs to happen.
But when you think about it, this is actually not so different from the usual problem of GC safepoints. The problem is the same- enforce that there are no un-rooted pointers into the heap, when collection needs to happen.
And it turns out there are some tricks you can do with lifetimes to get borrowck to enforce safepoints for you, and it could even be made pretty ergonomic in the future.
I think gc-arena is fairly similar to josephine- both treat the GC heap itself as a container-like object, requiring `&` or `&mut` access to dereference a GC pointer. Gc-arena also suggests that we could use generator yield points as GC safepoints- they have exactly the required lifetime properties. So you might (to borrow Python syntax) `yield from fn_that_allocates_gc_memory()`, with the compiler ensuring you don't hold any references into the heap across that call.
Both gc-arena and josephine appear to take the "unidirectional" approach, where there is no recursive re-entrance into the GC heap. For example gc-arena will not collect while the heap is being mutated, which is Rust-friendly but also impractical.
GC'd languages typically require reentrancy into managed code (managed -> native -> managed), and this has historically been the source of type confusion and other security vulnerabilities.
> gc-arena will not collect while the heap is being mutated, which is Rust-friendly but also impractical.
This is where the generator stuff I mentioned comes in- I think the approach is much more practical than it seems.
At the end of the day, any GC language needs to ensure there are no live un-rooted GC pointers at points where collection takes place. So any realistic Rust/GC integration will need to solve the same problem, if it is to retain memory safety.
And the best way (so far) to ensure all references into the heap have disappeared is to make sure they are all derived from a function parameter with an unconstrained lifetime, because then the mutator knows nothing except that it might go away when it returns. (This is "generative lifetimes" from the links.)
The real trick then is to make it look like the mutator is returning, without actually forcing it to exit. A yield point in a generator can do this, even without ever actually yielding. So you can manually insert collection points in the middle of a Rust mutator, and you can have a "managed -> native -> managed" stack where the native frame thinks its managed callee can yield across it, and borrowck will ensure anything derived from that reference parameter is gone.
Of course this is a long way off from Rust today, where generators aren't available on stable (though perhaps you could hack something together with async/await), but does look like it should eventually be possible to have reentrant heap access and memory safety across Rust and a GC language.
Surely the way most GCs achieve this is by asking you to tell the GC when you acquire a reference to an object, and to tell it again when you no longer need it. You could do that with a custom implementation of Drop in Rust.
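Roughly that idea, as a C++ analogue purely for illustration (GcHeap, Object and Root are made-up names; a real collector would also need thread safety and actual tracing): an RAII handle registers itself with the root set when created and unregisters when destroyed.

#include <unordered_set>

struct Object;  // some garbage-collected object type (placeholder)

struct GcHeap {
    std::unordered_set<Object**> roots;  // slots the collector treats as reachable
};

class Root {
public:
    Root(GcHeap& heap, Object* obj) : heap_(heap), obj_(obj) {
        heap_.roots.insert(&obj_);         // "tell the GC when you acquire a reference"
    }
    ~Root() { heap_.roots.erase(&obj_); }  // "...and tell it again when you no longer need it"
    Root(const Root&) = delete;
    Root& operator=(const Root&) = delete;

    Object* get() const { return obj_; }

private:
    GcHeap& heap_;
    Object* obj_;  // a moving collector could update this slot during collection
};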
Ah. I am not a Rust expert but I think the language might actually support this sort of construct (though it might be inefficient to implement), as IIRC if you want something to stay at a fixed place in memory you need to “pin” it there.
>> 70% of security defects in memory-safe languages are probably also memory safety problems.
> That doesn't make any sense at all
I'm sure it's not true—70% is way too high—but the real number isn't 0% as you might expect.
In particular, (most? all?) non-Rust languages that claim or appear to be memory-safe can have data races. In Go for example, those can be exploitable. [1]
And Rust is memory-safe...in safe code, with a bug-free compiler. Real programs have some unsafe code in their transitive dependencies and are compiled with the real, buggy compiler. [2] The percentage of security bugs in Rust code that are due to memory safety problems is more than 0%.
Not to say they don’t exist or haven’t been reported, but many of the unsound issues on github for Rust don’t seem to hold water.
They all seem to misbehave only on a single nightly build, affect only a very narrow and unsupported target group, be related to a bug in an unsupported release of LLVM, or simply aren’t reproducible. I think the percentage of security bugs in Rust related to memory safety, based off that list, is _effectively_ 0%. I can’t find an issue in the list that seems like it would impact Rust programs that people write today, or the targets people deploy Rust code to where memory safety matters.
That probably reflects the higher standards in the Rust ecosystem. I remember when there was a big issue about “Pin is unsound”. Judging by how seriously people were taking it, it seemed like it was the next Heartbleed. Turned out that it was unsound only in a very contrived example.
I don't think soundness/security flaws due to compiler bugs are common, but they qualitatively can happen. I think that might be getting forgotten when folks are puzzled about the idea of memory safety bugs in memory-safe languages.
And there are memory-safety bugs at the CPU level, as we had to learn. However, "safe" languages cut the risk of a memory-safety-related bug down by orders of magnitude, in most cases to practically zero, as the Go example requires quite an effort to create the situation. You don't end up in such a situation just through a programming or logic error, and that's what the memory bugs are about.
I would not even say that Rust (or any other language) is or can be perfectly safe in "safe" code even with a bug-free compiler.
It is possible to access unallocated memory in 100% "safe" Rust, because what memory is allocated and what isn't depends on the programmer's intent.
Allocated memory is memory that you intend to use.
If I make an array of 3 integers that are not supposed to be used yet but I accidentally access one of them, then that is a memory-safety bug because I accessed unallocated memory.
The functions that languages provide to "allocate" memory don't define what allocating memory means.
They just provide a tool that you can use to keep track of what memory you are using.
What memory is allocated or not is relative and there are multiple levels of allocation.
If we zoom out a bit we can even call C a memory-safe language because all memory accesses in C must be to allocated memory and the ones that are not will kill the process
(allocated here as in allocated to the process by the operating system).
No, I'm just saying that what memory is allocated (as in the language) is not necessarily the same as the memory that is allocated as in your intentions.
It's possible to allocate (as in language) an array that is longer than what you really need,
and when you are done using some of the memory in that array you might not want to call any free/realloc functions in order to make the program faster.
So instead you might want to just remember that the memory is unallocated (as in intentions) even if it's still allocated (as in language).
And now there is a disagreement between you and the language on what memory is allocated.
No "memory-safe" language can make sure you don't accidently use memory unintentionally.
With that said, languages like Rust really give you helpful tools to minimize the unintentional memory accesses.
My guess is that the commenter is trying to say that security issues in safe languages are probably a result of calling into unsafe code, or of bugs in a runtime written in an unsafe language.
Chromium’s initial design was a very simple main process and a few untrusted, relatively fat processes for web pages, each holding a set of tabs. That could have worked with C++. However, the need for site isolation, performance considerations and hardware vulnerabilities later put an end to it.
For example, initially Chromium had no support for isolating iframes from the main document. Everything was in Blink. But isolating iframes required extremely complex code in the main process to properly forward all DOM events between processes. Then that code had to be made asynchronous so web pages could not block the main UI.
Then the OS interfaces to the GPU and its architecture rule out having GPU-accelerated graphics and decoders in a per-site process. So that must be put in a single GPU process. Sure, it is heavily sandboxed. But, as with the recent Networking process, it is still a shared component with very complex C++ code and fat platform libraries in C/Objective-C for interfacing with the GPU.
So in the end, the original idea of writing everything in C++ and using sandboxed processes to make memory-safety bugs harmless just does not work. One needs a memory-safe language.
The article mentioned various library-based approaches. But those, with their foreign C++ usage patterns, essentially become domain-specific languages that one needs to learn. That leads to boilerplate, as the host language is not well suited for it.
The idea of writing a constantly changing application and feeding it only untrusted content over the network while exposing more and more of the OS to that application does not work.
It doesn't work in Firefox with its small block of Rust code, and it doesn't work in Chrome or Safari. Because it's the dumbest fucking idea ever.
But sure, at this point we don't have anything to lose by rewriting it in Rust/Swift/whatever.
And the alternative is what? You run no software on your computer? Native apps have way more access to your system than a webpage, so I'm glad there are 100s or 1000s of web apps I can run rather than being forced to install native apps to get the same functionality.
Web apps are nonsense anyway, both from a privacy PoV (zero privacy) and a security PoV (single high-value target, foreign jurisdiction, crappy browsers, byzantine tools, hopelessly overcomplicated architectures, etc).
Every time someone tells me something along the lines of "you don't need Rust, just be a better programmer", I look up CVEs for the projects they've worked on, and sure enough there are memory safety issues.
Just because you're a good driver doesn't mean you should forgo a seat belt!
> What if with "dangerous languages" like C you could turn off pointers
You can't. C is a fundamentally defective language because it doesn't have the "memory slice"/"array" construct.
Of course you can work around it, but if you eliminate pointers then you can't do much with C. Heck you can't even print "Hello World" with it (well, ok, technically you can but not in the usual way).
Yes, the wiki has some recommendations (and I remember the C++ JSF standard) but if you follow the link to the actual document it seems to be behind a paywall
C# does that with unsafe blocks. Honestly, C# covers the gamut of needs, from classes and closures at the top down to marshaling and pointers at the bottom. If someone were to make a GC-less C# (cough Vala), I could see it replacing C++ in a matter of single-digit years.
Since dotnet core 2.1 the garbage collector has been easily pluggable, just by setting an env var, `COMPlus_GCName`, to point to your implementation.
That doesn't mean it will operate without a GC mechanism per se, but it can operate with a "zero" GC that never collects garbage. I know this isn't what you meant, but thought it was an interesting point.
GC-less C# wouldn't even replace the Visual C++ codebase in 10 years, let alone the entire C++ codebase. And if you are going to use unsafe C#, you might as well just stick to Visual C++. Microsoft really shouldn't have included the unsafe code/context feature in C#. It's the one glaring flaw in an otherwise decent language.
The problem is that C without pointers is extremely limiting. At least in C++ you'd have references left.
More generally I don't think you can really "patch" a language to make it safe. It's going to leak unsafety all over the place. Consider a simple `printf("%s", 12);` which is obviously broken but doesn't really trigger any kind of memory unsafety at the call site. You might argue that "%s" is a pointer in disguise, but if you ban C strings and arrays you won't have a lot left to work with (besides in this case the string pointer is perfectly valid, it's the format string that's incorrect).
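To illustrate where the type information lives, here is a hedged C++ sketch assuming a C++20 compiler with <format> support: printf trusts the format string and will happily treat 12 as a pointer, while std::format derives formatting from the argument's actual type and rejects a mismatched format specification at compile time.

#include <format>    // C++20
#include <iostream>

int main() {
    // printf("%s", 12);  // compiles (perhaps with a -Wformat warning), then
    //                    // dereferences the integer 12 as a char* at run time
    // With std::format the argument type drives the formatting, so there is
    // no way to claim that 12 is a string; a format spec that is invalid for
    // the argument type is a compile-time error on recent compilers.
    std::cout << std::format("{}", 12) << '\n';
}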
You have to be very careful that the looser language doesn't break the rules that the stricter language relies on. Have a look at the Noether design for a serious effort at doing this kind of stratified language.
C++ also has an ownership/borrowing system (i.e. smart pointers) now too, right? I'm a fan of D, but isn't the problem more about 1) defaults (or even required behavior) and 2) the existence of libraries following good paradigms, rather than just support for static memory management?
To me it seems likely the percentage is even higher in other programs - a web browser has a pretty complicated privacy/isolation model compared with most other programs so that increases the 23.9% "Other" here. Normal "logic" bugs would be more likely to have security implications.
Chromium's base library. Includes basic utilities useful across the codebase, e.g. hash maps, containers, pointer wrappers, logging, metaprogramming utilities, etc.
Inside Google there's a C++ library of many common utilities called "base". The base in Chromium is an ancient fork of it that's evolved within the Chromium codebase over many years. Abseil is another descendant of the Google internal base library.
WTF (Web Template Framework) is the similar library from WebKit, that also exists in Blink, though the two are quite different these days.
> Question to C++ pros: are there not idioms that can be used and ruthlessly enforced to avoid such problems entirely?
"In theory, yes - in practice, no."
You can use static analysis, vendor-specific annotations, runtime tools like clang's various sanitizers or valgrind, thorough code audits, use smart pointers instead of raw, have clear ownership semantics, follow MISRA C guidelines, NASA guidelines, etc etc etc... but eventually your project will grow to the point where you'll botch an edge case, and the botched edge cases you and all your coworkers manage to add will pile up into a statistic.
Even if I and all my coworkers were godlings incapable of making mistakes, we'd still end up debugging plenty of memory safety issues - in second/third party code, sometimes without source code - and writing bug reports / workarounds as a result.
Also, you can assume Google to be using all of the tools at their disposal. The stakes are high for them and they've been dealing with security issues for a long time. It helps, but it's obviously not enough. This seems to be true for most widely used C/C++ code bases; including those with decades of scrutiny. At best some projects have a good track record when it comes to dealing with these issues in a diligent way. Chromium seems to be one of those projects.
There are idioms that you can use to help. Bounds checking solves the classic indexing past arrays problem and you can use tools to prove that you never use e.g. the basic operator[]s on types for which it is not safe.
Some of these idioms are not zero-cost though. My understanding is that to prevent use-after-free you basically can't use bare pointers or references, at all, ever. You need shared ownership everywhere (std::shared_ptr, not 0 cost), a garbage collector (In c++ that's probably another smart pointer type, not 0 cost), or additional metadata like the lifetimes fed into Rust's borrow checker.
Based on my reading here on HN, I think Chrome has a reputation of using modern C++ features extensively to try to improve its memory safety, but it's really hard to do in C++.
Edit: I realized this answer sounds like "yes idiom will fix it". To directly answer the question of "can you use idioms to fix this", no you can't. Evidence shows we screw it up. It's broken and you can't bolt on fixes to make it work.
Humans don't do "ruthless" enforcement; machines are good at it, which is what Rust is about. Humans will say stuff like "well, I'm pretty sure this is safe and I need every CPU cycle I can get". This is a security consideration, so the bad guys only have to be lucky once; it's like the IRA told Thatcher (the British Prime Minister, after a failed assassination attempt):
"Today we were unlucky, but remember we only have to be lucky once. You will have to be lucky always."
If you try to strictly enforce memory safety in C++, you end up with code that looks a lot like Rust. But the Rust compiler will fail to compile code that has memory-safety issues (outside of unsafe blocks), while your C++ compiler will happily ignore them.
The challenge is that the Rust compiler will fail to compile a lot of code that is strictly and provably memory safe but falls outside Rust’s narrow model of “memory safe”, and which is required for performance reasons. You’d have to wrap most of that code in unsafe blocks. In C++17 you can directly write type infrastructure that transparently enforces memory safety models that Rust doesn’t grok, which is a pretty nice feature.
Many people underestimate what is possible using C++ these days. Complex for sure, but also very powerful.
> Many people underestimate what is possible using C++ these days. Complex for sure, but also very powerful.
And that's the problem. There are very few true C++ experts, and most people will run into these issues as a matter of course.
This just feels like a reduction to the usual: if you are perfect, you will write perfect code, and never have memory-safety bugs. Thanks, but I'm not perfect, and I'd rather write in a language with a compiler that rejects programs with memory-safety bugs.
And yes, that does mean sometimes it'll reject some programs that are perfectly ok, but the Rust borrow checker is getting better all the time, and sometimes you just have to accept being hamstrung a little for the greater good.
I have long imagined a much weaker form of your desire: a C++ runtime that detects all undefined behavior in the code that executes. Note that this is nowhere near the goal we want, which is that the program has no memory safety issues at all. Even though there are many static and dynamic analysis tools that exist today, with runtime penalties ranging from almost zero to extremely high overhead, I have yet to see a system that can do this. So I would guess that no, this is not possible. There might be a magic set of idioms, but my guess is that finding that set is going to be so impractical that we will never do it, or someone is going to write a proof that the set is empty and we will just go on living our lives slightly more depressed at the state of computing.
Containers and smart pointers will pretty much do the trick. There's really no need for manual memory management anymore, can't remember the last time I used new/delete. Only problem is most large C++ apps, Chromium included, are older than a lot of the language features that have made C++ much safer these days.
Even though Chromium was started before C++11 was standardized, it still used a smart pointer type with move semantics that was very similar to std::unique_ptr for lifetime management.
However, while lifetime management with smart pointers is certainly important, it doesn't help with non-owning pointers between objects.
The object graph in Chromium is extremely complex. Even if an object's lifetime is managed with a smart pointer, there are often raw pointer back references from other objects. And if some of these objects also have lifetimes coupled to IPC or work happening on other threads… well, you can imagine the result :)
Of course, one could make all the back references weak, but that turns out to also significantly hinder understanding of the code. It ends up being really easy to lose all assertions about when any given weakly referenced object is actually supposed to be live, and you end up with code that's littered with checks everywhere "just in case".
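As a rough illustration of that trade-off (names invented, std::weak_ptr standing in for whatever Chromium actually uses for weak references), note how every use of the back reference has to re-check liveness:

#include <iostream>
#include <memory>

struct Document;

struct Child {
    std::weak_ptr<Document> owner;  // weak back reference instead of a raw Document*
    void doWork();
};

struct Document {
    std::shared_ptr<Child> child = std::make_shared<Child>();
};

void Child::doWork() {
    // The "just in case" check: every use of the back reference has to
    // re-establish that the owner is still alive.
    if (auto doc = owner.lock()) {
        std::cout << "owner still alive\n";
    } else {
        std::cout << "owner already gone\n";
    }
}

int main() {
    std::shared_ptr<Child> child;
    {
        auto doc = std::make_shared<Document>();
        doc->child->owner = doc;
        doc->child->doWork();   // prints "owner still alive"
        child = doc->child;
    }                           // the Document is destroyed here
    child->doWork();            // prints "owner already gone" -- a check, not a use-after-free
}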
I believe Chrome makes heavy use of both. The issue is that they are not precise and they are inherently coverage-guided, so you may still never hit certain issues.
The biggest problem is we are using systems programming languages to develop applications. We need to use application programming languages to develop applications. C and C++ are systems programming languages, to be used for developing operating systems, device drivers and other low level stuff.
The problem is, we haven't had a real application programming language since the demise of Pascal / Delphi. And Java / C# have been able to pick-up some of the slack, but not nearly enough as there are still tons of software being written in unsafe languages.
Does Chromium test with valgrind, various sanitizers and static checkers? Since when? This is 5 years of bugs. How many would be caught by modern tools? Could those tools be improved to catch those bugs? I'm surprised this isn't mentioned in what they're trying, but maybe they're at the practical limit.
Sure, all we need to do is rewrite billions of lines of code in some of the most popular software on the planet.
And then replace it with... a language that is exactly as performant, is available on every known platform and has replacements for all the existing libraries (including all the ones written in C!).
I'll pay 15% performance for safety[1]. It's just that we have the wrong priorities, and you betrayed that default attitude again with the requirement that everything be "exactly as fast" as before. I still do not understand this focus on performance above everything else. It is absolutely unquestioned. We can't even talk about it. And it's constantly an argument that is undone by time. Machines are still getting faster. In the heydays of Moore's law, when performance was doubling literally every 12 months, if we were arguing about 15% of CPU performance, that's 2--TWO--months of performance. Now, it might be 1-2 years to get 15% single-core performance. But almost no one is limited by single-core performance anymore.
We got stuck with the worst of all possible worlds--long compile times, long debugging times, long bugtails, horrible security, rickety and hard to refactor, ugly code. Oh, but it's fast. Fast and broken. Wonderful. Wonderful. Grandma is so much happier with her 15% faster something or other that spends 95% of its time idle waiting for user input. Oh, but that 15% of 5%...man that 0.8% is keeping me up at night! We forgot to zoom out to see that people need and depend on reliable software, and we utterly failed at that. It's as if we removed seatbelts, airbags, windshields, and mounted knives in random places in cars because we thought everyone wanted to go fast. No other discipline in all of engineering is so wrong in its priorities. God, society should be hopping mad at us for being so hostile to them.
[1] Obviously, performance differences between "fast" languages and "safe" languages are so variable that we might as well be talking about comparing Fords to Ferraris--without specifying whether we are talking racecars (Ford makes some fast ones)! But 15% is a number we see often in the JVM world. After years in this field, I think the fastest safe language is not more than 15% off the fastest unsafe language, across the wide spectrum. Sometimes you can get absolutely the same performance for the innermost hot loop out of a safe language vs. an unsafe language; in other situations, you can end up 2x slower. But people get terrified that they'll never figure out why, that there is some kind of hidden, unknowable "language cost" they'll never get rid of. Which is of course hogwash. Every program can be tuned and improved. Offer tools to find and remove bottlenecks and stop being sloppy with allocating memory.
There’s one other big question I’d ask: how much performance has been left on the floor because people are afraid to substantially refactor old C/C++ code or, especially, make it concurrent? I would bet that the algorithmic and parallelism improvements would in many cases be far more impactful, especially on user-visible latency, than what’s hypothetically lost by switching to a safe language with modern features.
I think you missed the main point, which was the "billions of lines of code" part.
It's not a question of simply retiring C++.. it has to be replaced with something. Even safe languages frequently rely on libraries implemented in C or C++ as part of their runtime.
Even if you were willing to pay a 100% performance penalty (and there are plenty of places where I'd be fine with that), it's still a massive undertaking.
And usage of C++ is still growing exponentially, so the number of lines of code is growing faster than our ability to rewrite/replace them. I know, it's a losing battle. We'll never do it. We are stuck with C++ until the end of time because of the sunk cost fallacy. If everyone who currently works on C++ switched to rewriting instead of adding, we'd be able to replace everything in 5 years.
If C++ is only kept around because of a few libraries needed to implement the runtime of safe languages, then I would consider that some kind of victory.
And that's new code, not legacy. Embarrassing. I would have expected better from something written in this century. C++ does sort of have a safe subset. If you're passing raw pointers all over the place today, you're doing it wrong.
Anecdote: I had to track down a bug in one of my programs once because a vexing parse caused the compiler to emit a call to the wrong specialization of a constructor, leading to certain fields remaining uninitialized. This was in a small, greenfield project that I wrote using the full force of modern C++17, using the safest, most idiomatic constructs I could find. The fact that no compiler or sanitizers made so much as a peep makes me feel that memory unsafety is embedded so deep into the core of the language that extracting out “the good parts” is probably of similar complexity to just detecting all the problems ahead of time. (That is: impossible.)
Not claiming that C++ can be made safe easily, but this particular error should be caught by clang-tidy. That is, assuming you use it and have configured it properly; picking the right analyzers is as much witchcraft as writing memory-safe code in the first place. https://clang.llvm.org/extra/clang-tidy/checks/cppcoreguidel...
I will have to admit that I haven’t tried clang-tidy on it (I will once I get back to a computer), but I am unsure that it would be able to diagnose the issue. I linked to the code in another comment (https://news.ycombinator.com/item?id=23289614), but the issue was that I have multiple template specializations of a class, each of which has a constructor that fully initializes that specific specialization’s members. The specific problem was that I used parentheses in a context where I was letting the language deduce which constructor it needed to call based on template arguments, but (as far as I understand) this made it deduce the type incorrectly and always call the unspecialized version that had none of the members; the type system then let me coerce that into one of the specialized ones (also not sure how, but presumably some magic along the lines of “they were the same class”), and the non-shared members at that point were not initialized. Oh, and this was in a constexpr context which essentially generated a switch statement on an instruction opcode at compile time. So I’m not confident that an automated tool could pick this up, although if, when I try it, it finds the issue, I will be extremely impressed.
This can be avoided to some degree by the practice of giving all primitive members a dummy in-class initialiser. E.g. int64_t mSomeIndex=-1;
Sadly I don't think the tooling exists to enforce something like that automatically, because C++ is so hard to parse (a separate issue to memory safety). But when/if metaclasses finally land, it should be possible to write a "fixed" class/struct keyword that default initialises all fields.
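A minimal sketch of that practice (everything except mSomeIndex is invented): every primitive member gets an in-class initialiser, so a missed or mis-selected constructor leaves obviously-invalid sentinels instead of indeterminate bits.

#include <cstdint>

struct Cursor {
    std::int64_t mSomeIndex = -1;   // "not pointing anywhere yet"
    std::uint32_t mGeneration = 0;
    bool mValid = false;
};

int main() {
    Cursor c;                 // no constructor written, yet members are deterministic
    return c.mValid ? 1 : 0;  // always 0 here
}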
Ah, that sucks. I don't think there's a solution to that apart from just adding an "Invalid" field to every enum class and initialising to that, but checking for that (asserting not invalid) adds ugly boilerplate to every usage of every enum.
All the usual warnings in both GCC and Clang, if I remember correctly, although you are free to try it yourself: https://github.com/regular-vm/libencoding/blob/1e12adf04d1d4.... On that line, if you replace the brackets with parentheses it will call the wrong constructor and leave parts of the object uninitialized; if you can get something to warn about it I would be very glad to hear more details on what I should be doing :)
(Oh, some background before I share that monstrosity with you: the project is a virtual machine I designed for a class I taught recently, and I experimented in implementing it in modern C++ and had a bit too much fun trying to see how much of it I could encode in the type system, which is why it will probably take a really long time to compile. Here is some code that actually uses that header, for reference: https://github.com/regular-vm/emulator, and I should note that I have internally changed some of the architecture slightly do accommodate an assembler that I have put aside for now.)
Ah, I don't think that's a vexing parse issue; it's something else. It's a really great example though, because it illustrates a lot of different issues at once, and I can tell you exactly how you could've avoided it (though some of the blame does lie with C++). In fact, it's possible you haven't realized what really went wrong yet. What you have (or had, until you used braces) is a heavily-templated version of what simplifies to the following:
struct Instruction { explicit Instruction(int encoding) { } };
using T = Instruction &&;
int encoding = 0;
auto &&result = T(encoding);
There are a few things that went wrong here, but the final red flag to notice is that you should never call a constructor directly with a single argument, because it's just a different syntax for a C-style cast, which we know is dangerous due to its bypassing of safety checks—and this is true regardless of whether we're dealing with C++ constructs (like classes and constructors and such), which I think might be what you're realizing now. (This is poor C++ design, but it's old and people know to avoid it syntactically just like C-style casts. It might be nice to have a warning for it too.) Rather, when you're passing a single argument, you want to write one of these syntaxes:
T result(encoding); // option 1
auto &&result = static_cast<T>(encoding); // option 2
With these, you receive an error, e.g.:
error: invalid static_cast from type 'int' to type 'T' {aka 'Instruction&&'}
I believe this is because the code requires 2 conversions to occur at once, which is an error because (surprise!) it's a generally unsafe thing to do: (a) conversion of int to Instruction, and (b) conversion of Instruction to Instruction&&.
Now you bypassed this by using the uniform initialization syntax (i.e. braces). I'm going to go out on a limb here and say that was another mistake, even though it "solved" your problem here: despite the widespread use, brace initializers are, in my experience, not a good thing, and it's unfortunate that people embraced them (ha) with open arms; the same goes for emplace_back() and some other things which I'll address below. The syntactic convenience they provide is just too minor compared to the issues they introduce or obscure. And in this case, they actually introduce a new issue if you use them like 'T result{encoding}': they allow 2 casts to occur at once. I find that incredibly dangerous, and I think it should be at least a warning if not an outright error like before. It seems like a C++ design flaw to me, but in any case, maybe someone should get compiler writers to add a warning for this.
Anyway, let's move on. If you're reading this, you're probably noticing that you ended up with Instruction&& in the first place—that's probably not what you wanted, or at least not what you should've wanted.
And that's where we get to the heart of the issue: your real problem is that you used decltype. If I saw that during code review, I would force you to change it—and using it on declval is just adding more fuel to the ember.
The reality—which unfortunately you do not see people acknowledging—is that decltype, auto, uniform initialization syntax, emplace_back, etc. are all dangerous, and harder to reason about than they look. I think it's unfortunate that the C++ committee encouraged people to use them so much, and I think they're overused to an insane degree. People who love them for their nice syntax don't go out of their way to figure out their pitfalls, but I almost never use any of them unless I absolutely need to. Most problems that they solve (one notable exception being 'auto' with lambdas) were quite elegantly solved in C++03 using typedefs. The only caveat was that you had to give up on the idea of minimizing keystrokes. It's quite a realization when you see how many of your problems that trade-off solves.
Anyway, I'm not trying to blame these on you. Obviously these are blamable on C++, and we could (and should) have more warnings for them. Rather, my main message is that you can avoid these problems (even if they're other people's faults) syntactically—and locally—if you don't try to embrace the absolute "latest and greatest" in C++. IMHO you should only use the newer features if they solve an actual semantic problem for you, not merely because they minimize your typing.
In fact, I think this is a huge mistake people make with software in general, and here, with C++ in particular. They feel that if something is old then it must be bad and you have to do everything in a new way. But if you stick with what works and start caring less about people looking down on you for using "old" syntax just because it's old, you'll find a lot of the old C++03 patterns (typename Pair::first_type, etc.) are actually robust to the problems that the newer ones introduce. People just don't realize this because those patterns are more verbose than they would like, and we're in an era where doing things old-style looks bad for no good reason.
Oh also, one last thing: aside from avoiding decltype and using the equivalent of a C-style cast, one more thing that helps you avoid this is to avoid overusing templates. They also obscure what's going on, like here. Not to mention the slow compilation speed and lack of independent compilability. Those are also overused (and I see their appeal) but they're often unnecessary and make code statically difficult to reason about.
> In fact, it's possible you haven't realized what really went wrong yet.
Not only is it possible, I think it is very likely, although your explanation is something I can follow along with and very much appreciated.
> There are a few things that went wrong here, but the final red flag to notice is that you should never call a constructor directly with a single argument, because it's just a different syntax for a C-style cast, which we know is dangerous due to its bypassing of safety checks—and this is true regardless of whether we're dealing with C++ constructs (like classes and constructors and such), which I think might be what you're realizing now.
Huh, interesting, I actually did not realize this. Is there a way to do this safely without creating an extra lvalue? I take it that there is no “extra explicit” keyword I can add to prevent this kind of accidental call, is there?
> I'm going to go out on a limb here and say that was another mistake, even though it "solved" your problem here: despite the widespread use, brace initializers are, in my experience, not a good thing
Yeah, I am not really a fan of them either :( Even I know of a bunch of caveats about them and C++ initialization is an extremely complicated topic…
> If you're reading this, you're probably noticing that you ended up with Instruction&& in the first place—that's probably not what you wanted, or at least not what you should've wanted.
No, but as you observed, it “works out” at some point in the pipeline, so I obviously did not care to really figure out whether this was what I wanted or not.
> And that's where we get to the heart of the issue: your real problem is that you used decltype. If I saw that during code review, I would force you to change it—and using it on declval is just adding more fuel to the ember.
Somewhat strangely, C++ seems like the only language where I would even consider to use such a construct. I think every other language just erases their types or simplifies them so you can be comfortable writing something like “Iterator i = collection.start” or “int size = collection.count” whereas in C++ you have some generic distance_type and it feels dirty to just work with a size_t or whatever you know the thing to be.
> IMHO you should only use the newer features if they solve an actual semantic problem for you, not merely because they minimize your typing.
A good point, but I would like to just mention that this was clearly an experiment in trying out the “latest and greatest” ;)
> Not to mention the slow compilation speed and lack of independent compilability.
Wait, you’re telling me my 100 line program shouldn’t take a dozen seconds to compile?!
> Is there a way to do this safely without creating an extra lvalue?
The static_cast<T>(arg) syntax I used does exactly this! It's what you should use pretty much everywhere instead of T(arg). If it's too much typing, yeah unfortunately it is, though life is a lot easier if you can e.g. bind 'sc' to expand to it in your editor.
> No, but as you observed that it “works out” at some point in the pipeline so I obviously did not care to really figure out if this was what I wanted or not.
Yeah... sadly C++ is just about the second-to-last language you should deal with like that. The last probably being C. :-) Pro tip that might make it easier to avoid this: use typedefs very liberally. They help you avoid auto/decltype/etc. and are quite robust. (At least if your reviewers let you. If they don't, they probably haven't learned it the hard way yet.)
> Somewhat strangely, C++ seems like the only language where I would even consider to use such a construct. I think every other language just erases their types or simplifies them so you can be comfortable writing something like “Iterator i = collection.start” or “int size = collection.count” whereas in C++ you have some generic distance_type and it feels dirty to just work with a size_t or whatever you know the thing to be.
Those languages break too actually. Go Google "binary search bug" (with quotes). For example in C# there's Length and LongLength, which is dirty. When what they really need is just a native int. Another C++ tip: almost every 'int' or 'unsigned int' you ever deal with should be size_t or ptrdiff_t, because at some point or another it's probably an array index. It's very rare for that not to be the case; the only case I can think of off the top of my head is a logarithm (i.e. the shift amount in a bit-shift expression) or a timestamp (long long). Unless you're writing a generic STL-like container or allocator type (in which case, best of luck...), you won't need to care about difference_type or size_type.
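For reference, here is the classic bug being alluded to, sketched in C++ (the function is invented for illustration): with int indices, (lo + hi) can overflow on very large arrays, which is undefined behavior for signed integers; size_t indices and the subtraction form avoid both problems.

#include <cstddef>
#include <vector>

// int mid = (lo + hi) / 2;   // the classic buggy line: can overflow

std::size_t lower_bound_index(const std::vector<int>& v, int key) {
    std::size_t lo = 0, hi = v.size();
    while (lo < hi) {
        std::size_t mid = lo + (hi - lo) / 2;  // cannot overflow
        if (v[mid] < key) lo = mid + 1; else hi = mid;
    }
    return lo;  // first index whose element is >= key, or v.size()
}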
> A good point, but I would like to just mention that this was clearly an experiment in trying out the “latest and greatest” ;)
At some point, the idea that a tool is fine, it's just that everyone is using it wrong starts to become a reason to look for alternatives to the tool. Chromium is a large cpp project, with lots of resources behind it, but people still fail to write safe code for it. There are small bore efforts that can be done to tangibly improve things, but in the end it seems like the real solution is to start writing components in a language that offers better memory safety and maintains high performance. That is a huge, risky effort as well, of course.
Every C++ fan claims that this safe subset exists, but none of them can tell you what it is, or how to tell whether any given library is written in it or not.
That's a really good point. Is it possible yet to define a set of enforceable restrictions for new C++ code that makes it memory-safe?
There are some non-lint type things that would help in a safe mode. These all need type information; they're not just syntax.
- Can't keep a raw pointer. If you create one, it has to have local scope and cannot be copied to an outer scope. This is like a borrow in Rust, and limits the lifetime of the pointer. Most uses of raw pointers involve calling legacy code, and don't need much lifetime. Most trouble with pointers involves them outliving the thing to which they point.
- Can't read into or memcopy into any type that is not fully mapped. That is, all bit values have to be valid. Char OK, int OK, enum not OK, pointer not OK. This is better than prohibiting binary reads or memcopy, because programmers will not be tempted to bypass it.
- Casts into non fully mapped types are prohibited. If you need to convert something to a non fully mapped type, it requires a constructor, with checking.
That gives a sense of the general idea. Do enough analysis to see if something iffy is safe, and prohibit the cases which are not easy to show safe.
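As a rough sketch of the second restriction, here is one way a codebase might gate raw reads on an opt-in "fully mapped" trait (the trait and wrapper are invented; the standard has no trait for "every bit pattern is a valid value"):

#include <cstring>
#include <type_traits>

template <typename T>
struct is_fully_mapped : std::false_type {};

template <> struct is_fully_mapped<char>          : std::true_type {};
template <> struct is_fully_mapped<unsigned char> : std::true_type {};
template <> struct is_fully_mapped<int>           : std::true_type {};
// enums and pointers deliberately stay false: not every bit pattern is valid

template <typename T>
void read_raw(T& out, const void* src) {
    static_assert(std::is_trivially_copyable_v<T> && is_fully_mapped<T>::value,
                  "raw reads are only allowed into fully mapped types");
    std::memcpy(&out, src, sizeof(T));
}

int main() {
    int buffer = 42, value = 0;
    read_raw(value, &buffer);   // OK: int is fully mapped
    // enum class Color { Red }; Color c;
    // read_raw(c, &buffer);    // would fail the static_assert
    return value == 42 ? 0 : 1;
}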
So-called "modern C++", using std::string, RAII, unique_ptr (and shared_ptr).
It's not perfect but it's certainly a giant leap from old-school C.
It's something I don't see Rust proponents address. It's easy to build a straw man argument of Rust vs. old-school C, but it's a more natural path to go from C to modern C++ than to go from C to Rust. You get to keep your compiler, build system, tools, libraries and indeed existing code.
> So-called "modern C++", using std::string, RAII, unique_ptr (and shared_ptr).
And yet we see the same memory-safety bugs come up time and time again in supposedly-modern C++ programs. So either this modern way is not enough to avoid these classes of bugs, or people very quickly fall to the temptation to use unsafe constructs due to performance or just because it's easier to write.
We're humans. If there's an easier way to do something, even if it's less safe, we'll invariably do it sometimes. I like that Rust makes it harder to do so, and makes you explicitly say that you want to do something unsafe, which I imagine deters a lot of people from going down those paths. And when someone writes a memory-safety bug in unsafe Rust, they get much more egg on their face than if they were to write the same bug in C++.
The memory management issues in my C++ programs (recently, real-time audio stuff) are where I explicitly decide not to use modern C++ / automated memory management, and write my own allocators that rely on malloc/free or some variant (aligned_alloc, etc.)
In Rust I suppose I would just use an unsafe block and have the exact same issues.
Note that good unit testing, assertions and sanitizers generally take care of the issue.
Allocators are not the only source of memory-safety bugs, and RAII can't fix all problems. In particular, aliasing probably caused most of the safety problems Chrome has had to deal with (I admit that I haven't gone and counted them). This is especially true for use-after-free bugs, where some part of the code has a pointer to some memory that has been deallocated; there's not much way for that to happen without aliasing. In Rust, memory can only be deallocated if there are no outstanding references to it. That can be determined at compile time (specifically by the borrow checker), or at run time using a reference count.
Unit tests, assertions, and sanitizers are nice, but they demonstrably don't work; Chrome and Firefox use all three. They have hundreds of thousands or millions of tests, assertions on every other line, and all kinds of compile-time sanitizers but they still have hundreds of memory safety problems a year.
We built a new project in all "modern C++". It is 100% shared_ptr, unique_ptr, std::string, RAII, etc. It initially targeted C++17 specifically to get all the "modern C++" goodness.
It segfaults. It segfaults all the time. It is entirely routine for us to run a new build through the CI process and find segfaults. We fuzz it and find dozens of segfaults. Segfaults because of uninitialized memory. Segfaults because of dereferencing bad pointers. Segfaults because of running off the end of arrays. Segfaults because of trusting input from the outside world ("the length of this payload is X bytes").
This is where the "modern C++" people tell me we must be doing it wrong. But the reality is that "modern C++" isn't as safe or as foolproof as the advocates say it is. But don't take my word for it - this whole thread is about Google people coming to the same conclusion.
Meanwhile I can throw a new dev at Rust and watch them go from zero to works in a week or so, and their code doesn't segfault, doesn't panic, and actually does what it is supposed to do the first time. Code reviews are easy because I don't have to ponder the memory safety and correctness of every line of code. Reasoning about unwrap() is trivial. Finding unsafe {} is trivial (and removing it is also usually easy).
I too used to program in C++. Every Monday morning, it was the same routine: as I enter the office, the stench of decaying bodies is overwhelming. Yet I must gather my strength to collect and identify the people killed over the week-end by C++ memory safety issues gone unchecked. Once that's done, I start my daily Scrum standup at 11 and I start coding a bit. First build at 11.30, first segfault at 11.35. Then it's pretty much the same routine, after lunch I read the Valgrind and ASan reports, spotting which one of the hundreds of new safety issues it identified might be an easy fix. I go back home riding my bicycle around 7pm, making sure to avoid the cars trying to crash against me due to segfaults. Sometimes I cry at night thinking about all that.
And then one day I found Rust, and all those problems went away. I can now write fearless code, and I don't have to endure the stench of rotting bodies anymore.
I have to say, Rust is looking more mature lately. I wrote a little RSS reader in Rust two years ago, and it was a pain to get all the library version dependencies lined up. Yesterday I recompiled it. No more need for version pinning or Github references; it just worked with a default cargo.toml file. Two years ago there was too much "only works in nightly" or "you need to use this version of that library". Progress.
Any progress on a C++ to Rust converter? Not a "transpiler". Something with enough smarts to figure out when to use native Rust arrays, not "offsets" to imitate pointer arithmetic. I'm surprised that one of the big C++ users, like Google, doesn't have a group doing that.
Something like the cxx crate[1]? You specify your shared objects between C++ and Rust, and it spits out code for both sides.
The guy who maintains it said in the reddit thread[2] about this same topic that the Google people have been sending him good PRs, which is presumably related to integrating Rust into Chrome.
I see this claim all the time, but can you give some examples of large C++ projects that don't constantly struggle with memory safety issues? (And are looking for them, of course.)
I would argue that the closer you stay to C while using the good features of C++, the safer the code is.
And obviously well written and debugged C code is nearly always more robust than C++ code. But for ideological reasons most people here are unable to acknowledge that, perhaps because they cannot do it.
Manageable in C. Integer conversions are only a tiny fraction of all the other implicitness in C++.
> decays from enums to integers
Compiler warns. Recent real compilers like gcc even have exhaustiveness checks like OCaml.
> implicit conversions from integers to unexisting enums.
Compiler warns.
> no bounds checking
Reason about that and implement your own scheme. Or prove.
> implicit conversions between pointers and arrays
Have not seen a single bug due to that in more than 1000000 lines of C.
> null terminated strings, that occasionally aren't terminated
Have not seen a bug due to that, this is the canonical example of an overblown hypothetical threat.
> abusing null terminated strings with clever algorithms, e.g. strtok()
Prove the algorithm or don't use it. Hint: As far as proofs are concerned, NUL terminated strings are like Lisp lists terminated with NIL, hence a well-founded data structure that is easily amenable to proofs (unlike C++ constructs).
> the preprocessor
Rarely introduces any problems, and is still required for C++ anyway, especially in sane test suites.
I don't think all that came from C, things like typedef being an alias rather than a separate type seem much older.
You are again just throwing dirt at C, mocking all people who write actually robust buzzword free software.
You are ignoring that C code is much easier to use for formal proofs than C++ code (the kind that you advocate).
You are ignoring 32 years of proven C exploits since the Morris worm in 1988: UNIX kernel exploits, including those that Linux collects every year, which led to Microsoft (Azure Sphere), Google (Android 11), Apple (iPhone X) and Oracle (Solaris SPARC) enforcing hardware memory tagging.
Which happen to be written in C with several layers of code review and static analysis.
So by your reasoning those 32 years have not happened, in spite of being so easy to prevent exploits in C code.
Here is a little tip for you, Microsoft has been acknowledging security issues with C and C++ since the XP SP2 days.
Which is why Windows happens to have plenty of mitigations that FOSS UNIX clones are only recently catching up to.
Yet they have publicly stated that this hasn't been enough, hence the migration effort away from C, enforcing programming guidelines for C++, and coming up with plans to migrate to safer systems programming languages.
Guess which OS vendor is now having first party support for writing GUIs in Rust?
But I can also rephrase what Oracle, Apple and Google have stated in the same vein regarding OS security.
Or maybe you prefer the statements of a UNIX hero instead?
> It only works when everyone plays balls and doesn't do C style coding, ever.
If that’s the problem, and solving it would solve all of C++’s memory issues, why has no one made a compiler option to simply make that code illegal? A -EUNSAFE or whatever?
Because what got C++ adopted in first place was its copy-paste compatibility with C.
C++ was also born at Bell Labs, and due to that, all major C compiler vendors quickly started shipping C++ on their boxes as well.
If you take away copy-paste compatibility you might be better off doing D, C# or whatever safe variant already exists.
Which is what many of us have done, to move to type safe languages, and only use C and C++ at the boundaries, in small pockets of unsafe code.
In fact if you look at mobile OSes, that is the reality for app developers, C and C++ are no longer the full stack languages they were 20 years ago, rather used for the kernel, drivers, compositor and shading languages, but everything else happens in safer languages.
And the SDKs only allow you to write libraries, not full applications.
Naturally there are clever developers who subvert the workflow and turn the libraries into the actual application.
Because this would refuse to compile 100% of real-world C++ code, so nobody would use it.
One problem is that C++ can directly include C headers of the operating system, while languages that aren't copy-paste compatible with C have to create some kind of wrapper library for them... this is a major reason for the success of C++, but it also makes improvements of this kind far harder to deploy in practice.
I think you'd have to do the same thing that you'd do in non-C++ languages: write an abstraction layer that encapsulates the low-level OS access (or hardware access, in embedded software) and whose implementation is unsafe, but whose interface is "safe", whatever that means.
Once you have that to build on top of, such a compiler flag could make sense... if it were possible in C++, which I'm not sure about.
So in about half a decade the C++ camp has made no improvements wrt tooling and statically verified safe code; it’s been all talk and no show. And in the meantime Rust has improved massively.
It’s obviously still too early to declare a winner, but to me this sounds like a turtle slowly but surely overtaking a rabbit.
I design high-scale database kernels, large code bases, written in modern C++. I can’t remember the last time we had a memory safety issue. It isn’t as though we don’t have plenty of bugs during development, just not those kinds of bugs. Competent idiomatic code simply doesn’t leave much room for those kinds of bugs to occur. If you “constantly struggle with memory safety issues”, you are doing something fundamentally wrong. That’s not something you can blame on the language.
I am always baffled by the people that supposedly write modern C++ professionally and constantly have memory safety issues. Most serious projects won’t hire you if you aren’t capable of writing memory safe code in your sleep, it is a basic skill.
Is it possible that you haven't found memory safety issues because you haven't been looking for them, haven't been fuzz testing, or haven't had adversarial users of your codebases?
These are database kernels used for mission-critical things with extreme workload profiles. It has to survive longevity testing at saturating workloads with continuous fault injection. If there were memory safety issues, I think someone would have noticed by now.
The reality is that there isn’t much opportunity for memory safety issues to occur anyway, the type system and scheduler do most of the heavy lifting. Similarly, concurrency safety isn’t much of an issue because threads barely interact. Most high-performance server software looks this way these days.
Bugs tend to be of the boring logic variety that can happen in any programming language.
> These are database kernels used for mission-critical things with extreme workload profiles.
> Similarly, concurrency safety isn’t much of an issue because threads barely interact.
I think the domain you're working in isn't as susceptible to memory safety issues, but that doesn't mean they're not present. It also sounds like the domain you're working in is trivially parallelized if threads "barely interact", which limits your exposure to those memory and data race issues.
If you gave an adversary access to the API of your kernel however, how long do you think before they found a use-after-free, double-free, stack or heap overflow, etc? Days, weeks, or hours?
If you haven't run a fuzzer yet, I wouldn't be so confident.
If you are able to structure your code so there is only ever one construction order for all objects your program uses, and destruction is the reverse order, you're pretty safe for most memory safety bugs.
Sadly as codebases get more complex, that vision gets further and further from reality.
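A small sketch of how plain C++ already gives you that discipline when dependencies are expressed as members or stack objects (types invented): non-static members are constructed in declaration order and destroyed in the reverse order, so "outlives" relationships hold by construction.

#include <iostream>

struct Logger {
    Logger()  { std::cout << "logger up\n"; }
    ~Logger() { std::cout << "logger down\n"; }
};

struct Network {
    Logger& log;  // non-owning, but guaranteed to outlive this object below
    explicit Network(Logger& l) : log(l) { std::cout << "network up\n"; }
    ~Network() { std::cout << "network down\n"; }
};

struct App {
    // Constructed in declaration order, destroyed in reverse, so `net` can
    // safely hold a reference to `log` for its whole lifetime.
    Logger  log;
    Network net{log};
};

int main() { App app; }  // up: logger, network; down: network, logger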
Do you fuzz your code or use something like asan/msan (https://github.com/google/sanitizers)? If not, how do you know that your code is really bug free?
Far more exhaustive testing than that, these are database kernels. Formal verification of the design and code where it makes sense. Extensive use of the C++ type infrastructure to verify several aspects of correctness at compile-time that would be impractical in any other systems language. The Linux kernel is bypassed for almost everything, most of the schedulers, I/O memory management, etc is in user space like most modern high-performance databases, which actually reduces the bug surface and increases verifiability. It doesn’t have to just be functionally correct, the performance has to be consistent under diverse and adversarial workloads.
No one can know if it is bug free, that is impractical. But it also isn’t like this is a weekend hobby project either. Most of the bugs that get out are in unimportant peripheral code and integrations.
Ok then a followup question: do you use pointers in your code and allocate objects on the heap?
I'm quite curious what tools you're using to formally verify your C++ code if you are. My understanding is that in general you can't, which is why msan/asan exist, to get a first approximation of verifying things that can't be formally verified for most C++ code.
(In general, I'm dubious of these claims that "Most serious projects won’t hire you if you aren’t capable of writing memory safe code in your sleep", because I think if you asked the majority of the members of the C++ committee if they could do that, they'd say no).
Database engines typically operate within a giant block of memory allocated at bootstrap which is used quite differently than a heap. Mutable references to the address space necessarily exist outside the address space, which presents problems for conventional memory safety models. There are also no pointers per se because almost all memory is directly paged -- objects have no fixed memory address over their lifetime. Most shops use a thin wrapper that implements a pointer-like interface (in the style of std::unique_ptr) that hides the paging and scheduling mechanics. The scheduler design, which is where most of the safety and guarantees that resources can't leak happens, is often verified with a model checker. It is perfectly safe to hold an arbitrary number of concurrent mutable references to an object because safe execution will be resolved dynamically at runtime (which is cheaper than it sounds). There are container libraries designed to work seamlessly in this type of memory model. To a developer writing code against this abstraction, it feels similar to a garbage-collected concurrency-safe language. No locking, no resource management. The abstraction isn't general purpose but it works very well for the kinds of data infrastructure applications where people use C++.
In many of these systems it is standard practice to generate arithmetically limited types pervasively. This is almost transparent in C++17. While it is possible to verify much of this at compile-time in theory, it almost never is because it isn't worth the effort (C++20 may start to change this) and testing at runtime has proven to be nearly as good. People underestimate what is possible with the C++ type infrastructure in this regard.
This type of software design was originally done because it allows for exceptional performance but has become popular for safety reasons. It uniquely allows you to make guarantees about runtime behavior under diverse adversarial workloads that would otherwise be difficult to make.
Bugs in practice tend to occur at the interface with third-party code, which requires dropping out of any internal type system, or in the form of performance anomalies due to unexpected hardware behaviors interacting with the scheduler design. Logic bugs in the core bits tend to be found in testing.
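A guess at what a minimal "arithmetically limited type" might look like in C++17 (the Bounded template and PageId alias are invented; a real database kernel's version would be far more elaborate): the valid range is part of the type and is re-checked on construction and arithmetic.

#include <cstdint>
#include <stdexcept>

template <std::int64_t Min, std::int64_t Max>
class Bounded {
public:
    constexpr explicit Bounded(std::int64_t v) : value_(check(v)) {}
    constexpr std::int64_t get() const { return value_; }

    friend constexpr Bounded operator+(Bounded a, Bounded b) {
        return Bounded(a.value_ + b.value_);  // result re-checked against [Min, Max]
    }

private:
    static constexpr std::int64_t check(std::int64_t v) {
        return (v < Min || v > Max) ? throw std::out_of_range("Bounded") : v;
    }
    std::int64_t value_;
};

using PageId = Bounded<0, 1'000'000>;  // hypothetical: valid page ids only

int main() {
    PageId a(10), b(20);
    PageId c = a + b;        // fine, 30 is in range
    // PageId d(2'000'000);  // would throw at run time
    return c.get() == 30 ? 0 : 1;
}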
Right, so you're working in a very particular style of application. The style of programming you're using doesn't work for most software and dodges the source of some memory safety issues.
Presuming that this style works equally well for all software seems presumptuous and perhaps naive, does it not?
Do you know of any open-source code architected in the way you explain above? Have you looked at the ScyllaDB codebase? Do you know if it is architected this way?
"sort of" is the key. If the language doesn't prevent you from shooting yourself in the foot, on any large project bad things will happen.
Safer languages are better tools because they make obvious when you're stepping out of the safe zone, which is not the case with C++, even modern releases.
Non-owning pointers are absolutely a problem if the object graph is large and complex enough. Many objects in Chromium have lifetimes managed by smart pointers, but unfortunately, that doesn't do anything to protect against code that mistakenly violates an invariant like "object X must always outlive Y due to these non-owning pointers". The problem is C++ turns a violation of lifetime invariants into undefined behavior rather than safely crashing.
Chromium's object graph, for better or worse, has a lot of nodes and edges. Operations like tearing down a document that's navigating away are full of complexity. Executing JS is fraught with peril, since it's possible that objects currently on the C++ stack will get destroyed by an operation triggered by user JS. The modern web is just really complicated.
There's actually been quite a bit of work to bounds check accesses for containers implemented inside Chromium, such as span and optional, but it's harder to get these checks into upstream libc++.
GC is one way to reduce the number of memory safety issues, but there are often tricky interactions between GC and non-GC code. Another issue with GC is it's often much harder to reason about object lifetimes.
Do you know of any fully-featured Web browsers that have been implemented with "safe" language, such as C# or Java? Would be a really interesting project to see!
Very early on, Sun had a web browser implemented in Java. Unfortunately, this was dropped before Java got Hotspot and with that the decisive speed up.
(They also had a web server implemented in Java, which sounds like an excellent idea too)
Neither automatic bounds checking nor GC really adds slowness. Any reasonably safe program will check the bounds explicitly, so there is no speed gained vs. guaranteed bounds checking via the compiler. Actually, any good compiler can remove most bounds checks if it can prove that the access is in bounds. Also, GC can be quite a bit faster than manual dynamic memory management.
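Roughly, the claim is that the two loops below (a sketch; function names invented) do the same bounds-checking work, so an optimizer that can prove the index is in range is free to drop the check in either one; how often that actually happens in practice varies by compiler and code shape.

#include <cstddef>
#include <vector>

long sum_checked_by_hand(const std::vector<int>& v) {
    long total = 0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        if (i < v.size()) total += v[i];  // explicit check, written by the programmer
    }
    return total;
}

long sum_checked_by_library(const std::vector<int>& v) {
    long total = 0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        total += v.at(i);  // same check, inserted by the library/language
    }
    return total;
}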
Switching to a safe language is, obviously, not an option.
However, it is possible, especially with the resources of Google and other C++ shops, to develop an advanced lint, which could be just a ripoff of compile-time borrow checking from Rust (taking into account the complexity of the C++ memory model).
Flagging simple things like use after free would already be a big deal.
Nowadays it probably should be a plugin for clang or something similar instead of lint.
> Switching to a safe language is, obviously, not an option.
Forever, never, ever, ever?
Switching right now is not a good idea, but NEVER switching is the COBOL curse.
And don't forget:
C/C++ have caused BILLIONS in damage and costs in this industry. It has been WELL KNOWN for DECADES that they are the wrong tool for the job.
Yet the persistent myth of "let's not rewrite to something better" holds everything back.
Imagine how stupid any of us would look if a customer asked us to build a new version of their software (or port it to the web, or to mobile, or to the cloud) and we said "nope, sorry. It's not good. No rewrite, sir!"
Or if the software WE build were caught, OFTEN, with severe security and reliability bugs and we answered the same way: "we can't do anything better, it's not an option".
Dumb, right?
Why is a rewrite OK, normal and good for others, but not for us?
And why the excuse that the MOST PROFITABLE COMPANIES IN THE WORLD (Apple, Google, MS, ...) can't do a rewrite because it's too costly... while the rest of us PAY the cost anyway, with no solution in sight...
> We should have more courage and attempt big things.
> Like writing the program we already wrote....again.
That's not called courage, that's called stupidity. Especially when the program you're talking about is several million lines of code and the result of years of engineering by entire teams.
> We take this option off the table. Cannot rewrite. Definitely not in a safe language! Surely we have ceased to dream?
There are people who dream, and there are people who code.
Evolution is (almost) always preferable to Revolution.
In this case, Evolution can mean:
- Progressively rewrite the security-sensitive parts in Rust, mixed with the legacy code (aka the Mozilla way)
- Develop the tooling to make C++ safer, or even better, safe. That has already been done for a subset in the aeronautics industry
- Work with the committee to make C++ itself safer, which Google is already doing.
Indeed, however Google's usage of C and C++ sometimes is quite interesting.
For example, Android source code is also not a very good example of modern usage of C++ security features.
After 10 years, the NDK still has a C-only API surface, with all the memory corruption issues that entails, even though the actual implementations are in C++ or Java (via JNI).
So their workaround, starting with Android 11, is requiring hardware memory tagging in all ARM devices, while having kernel fuzzing support for other platforms lacking such hardware capabilities.
The C-only interface is an absolute necessity. Only C++ can properly interface with C++, while C can interface with the rest of the world. And system libraries are there for the whole world to use: if the system interfaces were converted to C++, all applications would need to be written in C++ as well. You can wrap some features in other languages, but definitely not all of them.
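To make the shape of that constraint concrete (a generic sketch with invented names, not actual NDK code): the usual pattern is to hide the C++ implementation behind a flat C facade, an opaque handle plus free functions, which any language with a C FFI can call.

    #include <string>

    class WidgetImpl {                        // the real C++ implementation
    public:
        int process(const char* input) {
            return input ? static_cast<int>(std::string(input).size()) : -1;
        }
    };

    extern "C" {
        typedef struct Widget Widget;         // opaque handle for non-C++ callers

        Widget* widget_create() {
            return reinterpret_cast<Widget*>(new WidgetImpl());
        }
        int widget_process(Widget* w, const char* input) {
            return reinterpret_cast<WidgetImpl*>(w)->process(input);
        }
        void widget_destroy(Widget* w) {      // manual lifetime management: the usual C-ABI cost
            delete reinterpret_cast<WidgetImpl*>(w);
        }
    }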
Right, but in 10 years they surely would have had the time to also offer a type-safe C++ API, just like other platforms do.
Instead, they have been postponing it since the NDK was introduced in Android 2.0 until after the introduction of native packages, which will only arrive with Android 11.
That's why, in the early '90s, people stepped back and created Java. Lisp was in its AI winter, but memory safety was still a top priority for the industry.
Not at all surprising, but it's not like Rust (and others like it) are going to be a silver bullet against this.
Any sufficiently complex program running on a stored-program computer (where code is data and data is code) will be vulnerable.
You can reduce the attack surface, but never eliminate it.
I gave up C++ 10 years ago and switched to more modern languages. I figured the language would die off, so I'm a bit shocked it's still so prevalent. I really don't think there is any reason to use C or C++ anymore, except perhaps for some low-level layer that interacts with the hardware.
Even if there are zero lines of C written from now on, the existing codebase (which is huge) still has to be maintained. C won't be dead in our lifetime.
Rewriting everything is also not an option. Imagine rebuilding your house with modern materials because the old ones may have some shortcomings. Who will pay for that?
Even if C++ is used less than before, it won't die off, and C++ experts can demand more money, especially because there are fewer of them as most programmers move to more modern languages.
Yep, same situation. Only I've read that those old COBOL programs are a huge mess of spaghetti code, so you're well paid, but you have to put up with ugly, hard-to-maintain code.
I'd rather work on something nicer for less money.
C++ can be used safely, without memory safety issues, but it requires expertise with the language, which is a very steep barrier for most programmers, even ones on the Chrome team, to clear. In a project as large as Chrome, the vast majority of programmers will not have the necessary expertise. And sadly, in this case, what you don't know will hurt you. Is that a reason for not using C++? Maybe. But you would be hard pressed to find a more performant language that is as mature and stable.
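For what it's worth, here is roughly the discipline that claim presumes (a generic sketch, not Chrome's actual style guide): ownership expressed through RAII types instead of raw new/delete, and checked container access instead of raw pointer arithmetic.

    #include <memory>
    #include <string>
    #include <vector>

    struct Session {
        std::string user;
    };

    // Ownership is explicit: the vector owns the unique_ptrs, each unique_ptr
    // owns its Session, and nothing here needs a manual delete.
    std::unique_ptr<Session> make_session(std::string user) {
        return std::make_unique<Session>(Session{std::move(user)});
    }

    int main() {
        std::vector<std::unique_ptr<Session>> sessions;
        sessions.push_back(make_session("alice"));
        sessions.push_back(make_session("bob"));
        return static_cast<int>(sessions.at(1)->user.size());  // bounds-checked access
    }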
We need to let go of this idea that there are godlike engineers out there who can write perfect code. Everyone makes mistakes. As long as the compiler allows it, such mistakes will continue to happen.
> in a project as large as Chrome the vast majority of programmers will not have the necessary expertise.
Then how do we staff such projects? Hire only people with a track record of never writing such bugs? It’s not possible.
No, even world-class C++ experts can and will write subtle memory safety bugs across abstraction boundaries. Think about millions of lines of C++ code, each module with its own assumptions about invariants and contracts. Experts' time is a limited resource, and in many cases memory safety is not the most valuable thing for them to spend it on.
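A classic example of the kind of subtle bug that bites even experts at an abstraction boundary (a generic illustration, not a Chrome bug): a std::string_view that silently outlives the temporary std::string it refers to.

    #include <iostream>
    #include <string>
    #include <string_view>

    std::string make_greeting(const std::string& name) {
        return "hello, " + name;
    }

    int main() {
        // Looks innocent: the caller just wants a cheap, non-owning view.
        // But the temporary string returned by make_greeting() is destroyed
        // at the end of this full expression, so `view` dangles immediately.
        std::string_view view = make_greeting("world");
        std::cout << view << '\n';  // use-after-free (undefined behavior)
        return 0;
    }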
> In a project as large as Chrome the vast majority of programmers will not have the necessary expertise.
So … you’re talking about Google not being able to pay enough or have a good enough reputation to hire developers who want to work on one of the highest-impact codebases in the world, not to mention contributing to the standards process and popular tools used to improve security for the entire community. Who realistically should look at that and say “no problem, we’ll do better!”?