The Rule of 2

tyingq · on Aug 10, 2019

"If you can be sure that the input comes from a trustworthy source"

Perl's "taint"[1] capability is pretty interesting in this space. Do other languages have something similar?

[1] https://perldoc.perl.org/perlsec.html#Taint-mode

"You may not use data derived from outside your program to affect something else outside your program--at least, not by accident. All command line arguments, environment variables, locale information (see perllocale), results of certain system calls (readdir(), readlink(), the variable of shmread(), the messages returned by msgrcv(), the password, gcos and shell fields returned by the getpwxxx() calls), and all file input are marked as "tainted"."
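To make the idea concrete, here is a minimal Python sketch of dynamic tainting. It is not how Perl implements it: `Tainted`, `run_command`, and `untaint` are hypothetical names, taint only propagates through `+` here, and the "sink" just returns a string instead of executing anything. The regex-based `untaint` loosely mirrors Perl's convention that data extracted through a pattern match is considered clean.

```python
import re


class Tainted(str):
    """A str subclass marking data that came from outside the program."""

    def __add__(self, other):
        # tainted + anything stays tainted
        return Tainted(str.__add__(self, other))

    def __radd__(self, other):
        # anything + tainted stays tainted
        return Tainted(str.__add__(other, self))


def run_command(command: str) -> str:
    """Hypothetical sink: refuses tainted input instead of executing it."""
    if isinstance(command, Tainted):
        raise PermissionError("refusing to use tainted data in a command")
    return f"would run: {command}"


def untaint(value: str, allowed: "re.Pattern[str]") -> str:
    """Return a plain str only if the whole value matches the allowed pattern."""
    match = allowed.fullmatch(value)
    if match is None:
        raise ValueError("value does not match the allowed pattern")
    return str(match.group(0))  # plain str, no longer marked as tainted


filename = Tainted("notes.txt")            # pretend this came from argv
command = "cat " + filename                # still Tainted after concatenation
clean = untaint(filename, re.compile(r"[\w.]+"))
print(run_command("cat " + clean))
```

Other string operations (slicing, `join`, formatting) would silently drop the mark in this sketch, which hints at why a complete implementation has to live inside the language runtime.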

Unrelated rant: Sometime recently mobile chrome omits any part of a url after a # when you copy/share the url. Grrr.

realharo · on Aug 10, 2019

I'm not aware of a similar feature built into other languages, but most of it could be easily achieved with almost any type system.

Just have two separate types, e.g. UnsafeString and regular String, and some kind of `convert` function that takes a validation function as an argument. You'd get compile-time checking that way.
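A rough sketch of that two-type idea in Python, assuming a static checker such as mypy stands in for the compiler. `UnsafeString`, `convert`, and `is_username` are illustrative names, not an existing library API.

```python
from typing import Callable


class UnsafeString:
    """Wraps untrusted text; deliberately not a subclass of str."""

    def __init__(self, raw: str) -> None:
        self._raw = raw

    def __repr__(self) -> str:
        # Avoid leaking the raw value into logs by accident.
        return f"UnsafeString(<{len(self._raw)} chars>)"


def convert(value: UnsafeString, validate: Callable[[str], bool]) -> str:
    """Return the plain str only if the validator accepts it."""
    if not validate(value._raw):
        raise ValueError("validation failed")
    return value._raw


def is_username(s: str) -> bool:
    # Hypothetical validator: short, ASCII letters and digits only.
    return 0 < len(s) <= 32 and s.isascii() and s.isalnum()


name = convert(UnsafeString("alice42"), is_username)
print(name)
```

Any function annotated to take `str` would then be flagged by the type checker if it were handed an `UnsafeString` directly, so the only route to a regular string goes through `convert`.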

People don't tend to use such things in practice though, and you would also have to ban a portion of most languages' standard libraries to enforce it (because they already return regular strings for inputs).

aidos · on Aug 10, 2019

This can be a useful pattern - though I’ve never seen it used for this use case. A similar one is templating languages (like Jinja) where you need to wrap strings if you want to send HTML to your templates without them being escaped on render.

We use something similar where we have a BadNumber class in our code (python). Any operation with another number will also create a BadNumber. It allows us to make sure that these tainted numbers are always obvious.
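A guess at what such a class could look like, as a small Python sketch. This is not the code described above: the class body, the set of operators covered, and the example values are all assumptions.

```python
class BadNumber:
    """Marks a numeric value as suspect; arithmetic with it stays suspect."""

    def __init__(self, value: float) -> None:
        self.value = value

    @staticmethod
    def _unwrap(other):
        # Accept both plain numbers and other BadNumber instances.
        return other.value if isinstance(other, BadNumber) else other

    def __add__(self, other):
        return BadNumber(self.value + self._unwrap(other))

    def __radd__(self, other):
        return BadNumber(self._unwrap(other) + self.value)

    def __sub__(self, other):
        return BadNumber(self.value - self._unwrap(other))

    def __rsub__(self, other):
        return BadNumber(self._unwrap(other) - self.value)

    def __mul__(self, other):
        return BadNumber(self.value * self._unwrap(other))

    def __rmul__(self, other):
        return BadNumber(self._unwrap(other) * self.value)

    def __repr__(self) -> str:
        return f"BadNumber({self.value!r})"


total = 10 + BadNumber(2.5) * 2
print(total)
```

Because the reflected methods (`__radd__` and friends) are defined too, the marker survives whichever side of the operator the plain number is on.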

heavenlyblue · on Aug 11, 2019

You could use an import hook in Python, then create a whitelist of APIs that are wrapped so that they return UnsafeString and accept only SafeString as an argument.

13of40 · on Aug 10, 2019

How about implementing it as an access modifier like "trusted" and enforce that only values from other trusted members can be assigned to a trusted member?

staticassertion · on Aug 10, 2019

I think this is just refinement types. You wouldn't so much ban the stdlib as need to wrap it.

chubot · on Aug 10, 2019

Wow that's interesting. I've never heard of this in Perl or Ruby, and I always thought of taint analysis as static rather than dynamic.

Though I don't have experience with it, maybe one reason it isn't used is because of false positives?

For efficiency reasons, Perl takes a conservative view of whether data is tainted. If an expression contains tainted data, any subexpression may be considered tainted, even if the value of the subexpression is not itself affected by the tainted data.

Has anyone used this? Is the runtime overhead always there, or only when you turn the taint mode on? It seems like it would have to occupy some extra space in the string objects all the time? (Although I guess if it's literally a single bit, it can come for "free" because of padding)

-----

FWIW here are some references on static taint analysis:

https://cacm.acm.org/magazines/2019/8/238344-scaling-static-...

cites https://www.usenix.org/legacy/event/sec06/tech/full_papers/x...

tyingq · on Aug 10, 2019

"Has anyone used this?"

Perhaps not much on purpose, but it kicks in automatically if the Perl script is setuid. So you'll find questions about it where people are struggling with it.

UncleMeat · on Aug 11, 2019

Dynamic taint analysis is a really common technique in academic work, but largely has unacceptable performance costs for interesting applications. Typical costs range from 10% to 100% overhead or more. The other problem is that the entire system needs to track it. If you just own part of a system, instrumenting to add dynamic taint tracking can be really difficult.

jandrese · on Aug 10, 2019

IIRC all CGI scripts were run with the taint checker on some systems.

SquareWheel · on Aug 10, 2019

>Unrelated rant: Sometime recently mobile chrome omits any part of a url after a # when you copy/share the url. Grrr.

It's probably sharing the canonical link. That's the "correct" behaviour as defined by browsers. Definitely not ideal in your example, though.

steveklabnik · on Aug 10, 2019

It’s in Ruby as well, inherited from Perl. It doesn’t get much use though.

jordoh · on Aug 10, 2019

It's used in Rails to reduce the likelihood of un-sanitized user input in SQL fragments [1]. I think it would see a lot more use if additional input sources were marked as tainted [2].

[1] https://api.rubyonrails.org/classes/ActiveRecord/Base.html#c...

[2] http://www.jkfill.com/2012/03/10/preventing-mass-assignment-...

hau · on Aug 10, 2019

Nim has something similar in the works, barely implemented.

https://nim-lang.org/docs/manual_experimental.html#taint-mod...

skybrian · on Aug 10, 2019

This can be done with a special type representing "trusted" data. For example, Go has template.HTML representing data that's safe to render without escaping. Everything else gets escaped.
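The same shape can be sketched in Python with only the standard library. This is an analogue for illustration, not Go's API: `TrustedHTML` and `render` are made-up names, and `str.format` stands in for a real template engine.

```python
import html


class TrustedHTML:
    """Marker type: the author vouches that this markup is safe to emit as-is."""

    def __init__(self, markup: str) -> None:
        self.markup = markup


def render(template: str, **values: object) -> str:
    """Fill the template, escaping every value that is not TrustedHTML."""
    prepared = {
        key: value.markup if isinstance(value, TrustedHTML) else html.escape(str(value))
        for key, value in values.items()
    }
    return template.format(**prepared)


page = render(
    "<p>{name}</p>{badge}",
    name="<script>x</script>",          # untrusted, gets escaped
    badge=TrustedHTML("<b>admin</b>"),  # explicitly trusted, passed through
)
print(page)
```

Escaping is the default and trust has to be asked for explicitly, so forgetting the wrapper fails safe rather than open.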

est31 · on Aug 10, 2019

> The Rule Of 2 is: Pick no more than 2 of

> * untrustworthy inputs;

> * unsafe implementation language; and

> * high privilege.
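Read as a predicate, the rule only rejects the combination of all three properties. A tiny Python sketch (the function name and the example scenarios are assumptions for illustration):

```python
def violates_rule_of_2(
    untrustworthy_input: bool,
    unsafe_language: bool,
    high_privilege: bool,
) -> bool:
    """True when a component has all three risky properties at once."""
    return sum([untrustworthy_input, unsafe_language, high_privilege]) > 2


# A C++ parser for network data running in a sandboxed process: allowed.
print(violates_rule_of_2(True, True, False))
# The same parser running in a fully privileged process: not allowed.
print(violates_rule_of_2(True, True, True))
```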

> Security engineers in general, very much including Chrome Security Team, would like to advance the state of engineering to where memory safety issues are much more rare. Then, we could focus more attention on the application-semantic vulnerabilities. That would be a big improvement.

> Unsafe implementation languages are languages that lack memory safety, including at least C, C++, and assembly language. Memory-safe languages include Go, Rust, Python, Java, JavaScript, Kotlin, and Swift

Very nice. At the end of the process, Google might adopt Rust in Chromium. As much as I use and love Firefox, it's only realistic to say that Chrome has higher chances of being around in 10 years.

I wonder why the list doesn't include their wuffs language.

tyingq · on Aug 10, 2019

"Memory-safe languages include Go, Rust, Python, Java, JavaScript, Kotlin, and Swift"

An interesting list. They left out C# and PHP if it's supposed to be the most popular languages.

pjmlp · on Aug 10, 2019

Selective view based on in house languages.

jchw · on Aug 10, 2019

I work at Google and as far as I know Rust is not really “in house” at Google, at least not any more than C#. Both languages exist in some form, Google does have some plugins for Unity and I believe Fuchsia has some Rust code. It is indeed unclear what criteria were used to select languages, though it’s really not very relevant to the primary point anyway.

(Legal line noise: my opinions are not those of my employer.)

sitkack · on Aug 10, 2019

Rust has been pulled into the Android tree.

jchw · on Aug 10, 2019

That's genuinely exciting to hear. Thanks for letting me know.

lisper · on Aug 10, 2019

And the original memory-safe language: Lisp.

amoitnga · on Aug 10, 2019

Is Ruby memory-safe?

tyingq · on Aug 10, 2019

No (direct) pointers or malloc, and garbage collected, so yeah.

jillesvangurp · on Aug 11, 2019

Lots of native extensions though. So indirectly plenty of possibility for abuse. Some implementations are better at this than others.

rectang · on Aug 10, 2019

No-thank-you to the gratuitous Firefox FUD.

est31 · on Aug 10, 2019

https://gs.statcounter.com/browser-market-share/desktop/worl...

https://data.firefox.com/dashboard/user-activity

If you check where Firefox was 10 years ago to where it's now you can see the trend. It still continues. In the last year, Firefox lost more than 10% of its market share. A component of this is probably Firefox not being able to capture growth of the entire market, but the trend also holds for the absolute number of users: 890 million YAUs in Jul 2018 vs 809 million YAUs in Jul 2019. In the long term view, Firefox is dying.

eikenberry · on Aug 10, 2019

Neither of those charts goes back far enough because if they did you'd see this has all happened before, and even worse at one point. When IE was taking over the world FF fell to <5% of the market, yet it survived. It's not dead until it's dead and with Chrome killing off ad-blockers I bet we'll see some reversal of the current trends when that ships.

dtech · on Aug 10, 2019

When IE had taken over the world (~2002) Firefox did not exist yet, unless you include Netscape or Mozilla Suite

jillesvangurp · on Aug 11, 2019

By the same reasoning, Internet Explorer should have killed off Firefox ages ago. Except that never happened and instead it is IE that died. Firefox has plenty of supporters and a rich development community; it won't go away any time soon.

There are enough of them to keep Mozilla going indefinitely and they are doing some truly amazing stuff like using Rust to get massive performance boosts. Mozilla and Firefox have set the agenda technically for close to two decades. Everybody does tabs now. I remember when that was a Mozilla-only thing. Extensions were a Mozilla-only thing for a long time. Even Safari has extensions now. The new focus on security and privacy started at Mozilla and is now being copied by others (Brave, Edge, Safari) while Google is moving to kill ad blockers and continues to sell users out to their advertisers.

rectang · on Aug 11, 2019

So what? Is it your mission to threadjack every story and turn it into a conversation about Firefox market share?

When you observe someone using Firefox do you interject, "You know that Firefox is dying, right?"

When you participate in meetings, do you open with "I'd like to state for the record that Firefox is doomed, doomed I say"?

At the coffee shop in the morning, do you order "Grande Mocha, hold the Firefox because all hope is lost"?

ufmace · on Aug 10, 2019

Well it isn't meant to be an exhaustive list of languages that are memory-safe. I could complain that Ruby isn't in there, but it isn't very popular in the Google world right now. I think we all get the idea though.

oconnor663 · on Aug 10, 2019

> But if you transform the image into a format that doesn't have PNG's complexity (in a low-privilege process, of course), the malicious nature of the PNG 'should' be eliminated and then safe for parsing at a higher privilege level. Even if the attacker manages to compromise the low-privilege process with a malicious PNG, the high-privilege process will only parse the compromised process' output with a simple, plausibly-safe parser.
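For a sense of what a "simple, plausibly-safe parser" on the high-privilege side might look like, here is a Python sketch. The format is invented for illustration (a 12-byte header of magic, width, and height, followed by raw RGBA pixels) and is not anything Chrome actually uses. The point is only that every field is checked against the data actually received.

```python
import struct
from typing import Tuple

HEADER = struct.Struct("<4sII")  # magic, width, height (little-endian, 12 bytes)
MAGIC = b"RGBA"                  # hypothetical format tag
MAX_DIMENSION = 16384            # arbitrary sanity limit for this sketch


def parse_raw_image(data: bytes) -> Tuple[int, int, bytes]:
    """Parse the made-up raw format, rejecting anything inconsistent."""
    if len(data) < HEADER.size:
        raise ValueError("truncated header")
    magic, width, height = HEADER.unpack_from(data, 0)
    if magic != MAGIC:
        raise ValueError("bad magic")
    if not (0 < width <= MAX_DIMENSION and 0 < height <= MAX_DIMENSION):
        raise ValueError("dimensions out of range")
    pixels = data[HEADER.size:]
    # The pixel data must be exactly as long as the header claims.
    if len(pixels) != width * height * 4:
        raise ValueError("pixel data does not match declared size")
    return width, height, pixels


blob = HEADER.pack(MAGIC, 2, 1) + bytes(8)  # a 2x1 image, 4 bytes per pixel
print(parse_raw_image(blob)[:2])
```

There is no compression, no optional chunks, and no nested structure, so the whole parser fits in a few checks that a reviewer can verify by eye.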

It's interesting to get a sense of how deeply unrealistic they think it is, to write a safe parser for a typical data format in an unsafe language.

pjmlp · on Aug 10, 2019

Because as of 2019 the same errors as in the early 1980s keep being repeated, regardless of how many tools have been developed to tame C and its derivatives.

It is so unrealistic that Android is following in Solaris's footsteps.

Google has announced that ARM memory tagging extensions will be required in future Android versions.

MrMorden · on Aug 10, 2019

When is Google going to require monthly security patch delivery?

pjmlp · on Aug 10, 2019

Soon it seems.

According to an interview they gave to Ars in 2017 about changing the Play Store contract.

And the new Project announced at IO to require support for GSI images.

mxcrossb · on Aug 10, 2019

The PSP was hacked by pirates when a bug was discovered in its image app (I think for tiff files). It makes me wonder if security should be baked in to a data format.

zetafunction · on Aug 10, 2019

https://bugs.chromium.org/p/chromium/issues/list?q=Type%3DBu...

It's not clear that all these bugs can be turned into an attack, but that sure is a lot of bugs.

johnday · on Aug 10, 2019

Two actually seems like a lot here. Why would you angle for two and not one? It seems like the latter two (unsafe implementation language and high privilege) are both within the purview of developers. Is it just a case of resource management?

skybrian · on Aug 10, 2019

It's practical advice for Chrome developers wanting to get a patch accepted. Deciding to rewrite the high-privilege parts of Chrome in Rust (say) is too big a project to be in scope.

zetafunction · on Aug 10, 2019

1. The only option for avoiding an unsafe implementation language is currently Java, which is limited to Android.

2. Avoiding high privilege often means going out of process, which is challenging on resource-constrained devices.

muterad_murilax · on Aug 10, 2019

"Always two there are; no more, no less. A master and an apprentice."

kazinator · on Aug 10, 2019

How about rule of 3:

- untrustworthy inputs

- privilege

- unsafe language

- big, ball-of-mud codebase

We can do the first three, if the thing is small and simple.

Pretty much every OS kernel out there in wide deployment has all four of the above, though.

abalaji · on Aug 10, 2019

Does anyone have a working link to the diagram at the top? It seems to be a Google Drawing that requires permission to view.

emilfihlman · on Aug 10, 2019

Image is broken.

jandrese · on Aug 10, 2019

It's hosted off of a private google drive.

Vogtinator · on Aug 10, 2019

"unsafe implementation language" is a pretty moot point.

Looking at all the trivial exploits against web applications which are basically never written in memory-unsafe languages (Ruby, Python, PHP, ...) shows that it doesn't really matter much. While having the same implementation in a memory unsafe language would be slightly less safe, it's very unlikely that a heap corruption could be exploited remotely.

pjmlp · on Aug 10, 2019

Morris Worm.

About 70% of CVE reported exploits are due to memory corruption.

Living with the remaining 30% would already be a huge security improvement.

Vogtinator · on Aug 10, 2019

The Morris Worm happened back when security was not really a big concern.

Memory corruption is much harder (and in most cases realistically not at all) to exploit beyond a DoS, and that's what you would get with "safe" languages such as Rust or Python as well.

Heartbleed, Shellshock, Dirty COW etc. would all happen exactly the same way in different programming languages.

Yes, there is clearly a benefit in using something which makes it much harder introducing memory safety issues, but it's not nearly as big as many here on HN think.

missblit · on Aug 10, 2019

Didn't Heartbleed rely on reading uninitialized data? I'd assume memory safety includes preventing code from reading uninitialized data.

staticassertion · on Aug 10, 2019

Yes, and correct.

staticassertion · on Aug 10, 2019

The most recent (and the first public) Chrome ITW attack used memory corruption: one bug to exploit the renderer, another to exploit the underlying kernel to escape the sandbox.

I believe the same is true (roughly) with the Coinbase attack that went after Firefox.

In short, memory safety is not only responsible for the majority of reported vulns, but also the exploited ones, at least in the case of browsers.

> Heartbleed, Shellshock, Dirty COW etc. would all happen exactly the same way in different programming languages.

Heartbleed is impossible in a memory safe language, at the least. Same with cloudbleed for that matter.
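A simplified illustration in Python, assuming the usual description of Heartbleed: the peer sends a payload plus a claimed length, and the server echoes back that many bytes without checking the claim. The function below is hypothetical and omits the length check on purpose. In C the copy would read past the buffer into adjacent memory, whereas here the runtime's bounds check stops it.

```python
def naive_heartbeat(payload: bytes, claimed_length: int) -> bytes:
    """Echo `claimed_length` bytes of payload with no explicit length check."""
    out = bytearray()
    for i in range(claimed_length):
        out.append(payload[i])  # raises IndexError once i runs past the payload
    return bytes(out)


print(naive_heartbeat(b"bird", 4))

try:
    naive_heartbeat(b"bird", 64000)  # attacker lies about the length
except IndexError:
    print("over-read rejected by the runtime")
```

The logic bug is still there, but it turns into an exception (at worst a denial of service) instead of leaking whatever happened to sit next to the buffer.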

DirtyCow and shellshock, sure.

It's a bit of a moot point though - human energy is finite, consider if we could spend energy on problems like DirtyCow and shellshock instead of memory safety issues that simply don't exist in many languages.

pjmlp · on Aug 10, 2019

Yet we keep having those yearly 70%, go figure.

staticassertion · on Aug 10, 2019

> Security engineers in general, very much including Chrome Security Team, would like to advance the state of engineering to where memory safety issues are much more rare. Then, we could focus more attention on the application-semantic vulnerabilities. That would be a big improvement.