
"The image files tricked the iPhone into giving access to its entire memory, bypassing security and allowing the installation of spyware that would steal a user's messages."

Seems like there is a lot behind this summary of the hack.

How does an image file trick the iPhone into all that?





Wow, so basically:

1. iMessage has a feature to send and receive GIFs

2. These GIFs are copied to a specific path early in the message processing pipeline (even before the message is displayed)

3. But the copy code doesn't just copy the GIF. It uses the CoreGraphics APIs to _render_ the image to a new GIF file at the destination path.

4. The code uses the ImageIO lib to guess the image format, ignoring the .gif file extension, so you can trick it into accepting a non-GIF file.

5. You can use the above to invoke one of over 20 image codecs that were not intended to be invoked in this code, including the CoreGraphics PDF parser.

6. CoreGraphics PDF parser has a very specific vulnerability in its JBIG2 image codec.

7. JBIG2 takes an image of text, identifies repeating glyphs and uses that fact for better compression. To avoid confusing slightly differing glyphs in things like images of poor quality prints (think e and é, or 3 and 8), it has a way of applying a diff over each instance of an identified repeating glyph.

8. This logic has an integer overflow bug: the 'number of symbols' variable is a 32-bit integer, which can be overflowed using a carefully crafted file. The attacker can thereby make the buffer allocated for symbols much smaller than the real symbol count (see the sketch after this list).

9. Making a long story short, this allows overwriting heap memory, setting arbitrary values in the objects used in the JBIG2 logic.

10. The JBIG2 logic uses AND, OR, XOR and XNOR operations when iterating through these objects (to apply the 'diff' on glyphs). The attacker can craft a file that strings together these logic operations so that it basically forms a software logic circuit.

11. So this exploit basically emulates a computer architecture inside an image codec, which can be used to operate on arbitrary memory!
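
To make step 8 concrete, here's a minimal sketch in C of that overflow pattern (hypothetical names and structure, not the actual CoreGraphics code):

    /* Hypothetical sketch of the step-8 bug class: a 32-bit symbol
       count is summed across segments and can wrap past 2^32, so the
       table is allocated for the small wrapped total while later code
       still stores one entry per symbol actually parsed. */
    #include <stdint.h>
    #include <stdlib.h>

    typedef struct { uint32_t num_syms; /* ... */ } Segment;

    void *alloc_symbol_table(const Segment *segs, int n) {
        uint32_t total = 0;                    /* 32-bit accumulator */
        for (int i = 0; i < n; i++)
            total += segs[i].num_syms;         /* attacker-chosen, wraps */
        return malloc(total * sizeof(void *)); /* undersized buffer */
        /* Parsing then writes the un-wrapped number of entries:
           a controlled heap overflow. */
    }

And for steps 10-11, the key insight is that AND/OR/XOR/XNOR operations compose into arbitrary logic, e.g. a one-bit full adder:

    /* Steps 10-11 in miniature: bitwise glyph operations composed
       into a one-bit full adder; chain enough of these together and
       you have a general logic circuit operating on memory. */
    void full_adder(unsigned a, unsigned b, unsigned cin,
                    unsigned *sum, unsigned *cout) {
        unsigned p = a ^ b;                    /* XOR */
        *sum  = p ^ cin;                       /* XOR */
        *cout = (a & b) | (p & cin);           /* AND, OR */
    }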

Is that right? If so, this is mind-blowing.


If a hack can be called beautiful, this fits the bill. How do people come up with these?


Being trained in the Israeli intelligence corps, moving to civilian life, retaining your spook skills, and being funded by a Saudi billionaire prince who hates human rights activism and criticism.


Or maybe by running steps 1 and 2, and then working for the iOS team, as you can see if you mine LinkedIn a little bit.


You actually only have to check 2/5 of those boxes.


The biggest issue here is that this image parsing was done by such a high-privileged process. What happened to all the sandboxes and stuff?


From the original article [0], last line: "In a future post (currently being finished), we'll take a look at exactly how they escape the IMTranscoderAgent sandbox."

[0]: https://googleprojectzero.blogspot.com/2021/12/a-deep-dive-i...


Is the CoreGraphics ImageIO stuff privileged?


I think the most critical part of the flow is the integer overflow bug, and it is totally avoidable. I am a software engineer at Microsoft. Half of my time was spent on security and compliance. We have the right tools and the right policies to prevent such things from happening. However, I'm not saying Microsoft software is free of integer overflow bugs. I don't intend to advertise Microsoft C/C++ development tools here, but they are the ones I know best.

Let's get to the technical part. If you were asked to implement the binary search algorithm in your favorite programming language, how would you verify your code? Unit tests. How many test cases would you need? More than 10. Binary search implementations are prone to integer overflow bugs (remember the one in the JDK?); as long as you have enough tests, you don't need to worry too much. But how much is enough? The reason people couldn't implement binary search correctly for decades is not that we don't understand the algorithm or lack excellent software engineers; it is that we don't know how to test our code thoroughly. Any non-trivial C/C++ function may need tens of thousands of test cases. You simply can't write them by hand.
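
To illustrate, the JDK bug referenced above was the midpoint computation: mid = (low + high) / 2 overflows once low + high exceeds INT_MAX. A sketch in C of the standard fix:

    /* Overflow-safe binary search: computing the midpoint as
       low + (high - low) / 2 avoids the (low + high) overflow from
       the JDK bug mentioned above. */
    int binary_search(const int *a, int n, int key) {
        int low = 0, high = n - 1;
        while (low <= high) {
            int mid = low + (high - low) / 2;  /* never overflows */
            if (a[mid] < key)      low = mid + 1;
            else if (a[mid] > key) high = mid - 1;
            else                   return mid; /* found */
        }
        return -1;                             /* not found */
    }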

You need the right tools: fuzzing and static analysis.

At Microsoft, every file parser goes through fuzzing, which at its most basic means generating random inputs and running your tests on them. Nothing fancy. But there is another kind of fuzzing: symbolic execution, which tries to find all the possible execution paths through your code. If you run symbolic execution on your binary search code, you can get 100% test coverage, and then it is guaranteed bug-free, like a mathematical proof. Note that this is only practical because of the surprisingly great advances in SAT solvers over the last 20 years. And you often need to make compromises between your business goals and security: most functions can't reach 100% test coverage, so you need to simplify them. See https://github.com/klee/klee for a quickstart. Though C/C++ is often considered unsafe, it has the best fuzzers.
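
To give a flavor of it, a minimal KLEE harness for the binary search above might look like this (a sketch; compile to LLVM bitcode with clang -emit-llvm and run klee on the result):

    /* Mark the inputs symbolic and let the solver enumerate every
       feasible path through the function under test. */
    #include <klee/klee.h>

    int binary_search(const int *a, int n, int key);

    int main(void) {
        int a[8], key;
        klee_make_symbolic(a, sizeof(a), "a");
        klee_make_symbolic(&key, sizeof(key), "key");
        binary_search(a, 8, key);  /* KLEE reports any faulting path */
        return 0;
    }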

Then there are SAL annotations and the static analyzer. In C, whenever you pass a pointer to an array into another function, you should also pass its length with it, and the callee should check that length. If you forget, your static code analyzer will give you a warning. Done this way, failing to allocate enough memory only results in an error code being returned instead of undefined behavior.
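
For example (a sketch using real SAL annotations; the function itself is made up):

    /* The annotation ties the pointer to its length, so the analyzer
       can warn when a caller passes a buffer smaller than 'len'. */
    #include <sal.h>
    #include <stddef.h>
    #include <stdint.h>

    int parse_header(_In_reads_bytes_(len) const uint8_t *buf, size_t len);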

The last thing: use SafeInt to wrap the size calculations feeding your malloc calls. https://docs.microsoft.com/en-us/cpp/safeint/safeint-library...
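
SafeInt itself is C++, but the underlying idea is easy to sketch in plain C: check the size arithmetic before it can wrap.

    /* Refuse the allocation when count * size would overflow,
       instead of silently wrapping to a tiny buffer. */
    #include <stdint.h>
    #include <stdlib.h>

    void *malloc_array(size_t count, size_t size) {
        if (size != 0 && count > SIZE_MAX / size)
            return NULL;             /* would overflow: fail loudly */
        return malloc(count * size);
    }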

When we move from the binary search toy example to a real code base, you can clearly see how much extra effort is needed to make the code safe. Please pardon me, but most OSS libraries don't have the resources. Many famous OSS projects are "mom-and-pop" shops: they don't have any compliance rules and invest very little in fuzzing. The big companies really should help them. Now you see an integer overflow bug was found in Apple's image renderer, but was the code written by Apple? Not necessarily. We all see the importance of the Open Source movement; it's time to think about how to harden its security. For example, even if I wanted to spend my free time adding SAL annotations to an OSS project I love, would the maintainers accept it?


Why aren’t you using higher-level memory-safe languages for that? In C#, the runtime checks for integer overflow can be enabled with a single compiler switch. The switch is not set by default for some reason, but it's easy enough to enable manually: a single line in the *.csproj file.
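
For reference, that single line is the <CheckForOverflowUnderflow> property (also named in a reply below):

    <PropertyGroup>
      <CheckForOverflowUnderflow>true</CheckForOverflowUnderflow>
    </PropertyGroup>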

If you think GC performance is not good enough, see this proof of concept: https://github.com/Const-me/Vrmac/tree/master/VrmacVideo/Con... That C# code implements a parser for the Mpeg4 format, which is way more complicated than GIF or even PDF, yet it runs fine even on very slow computers (Raspberry Pi 4). There’s another similar one in that project for the MKV format.


I'd prefer to catch such errors at compile-time. The more static the language, the more optimization and analysis can be done. Sometimes the problem can be simplified: when your CPU is 64-bit capable but you limit array sizes to 2GB, you can use 64-bit math to calculate memory sizes and avoid integer overflow. Java and Google protobuf are two such examples. Sometimes the 2GB limit is acceptable, sometimes it is not. Did you know protobuf even tries to limit string sizes to tens of MB for safety? The simplification cannot be accepted as a general solution.
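
A sketch of that 64-bit trick in C (hypothetical function, assuming the 2GB cap is enforced elsewhere):

    /* With array sizes capped at 2GB (both operands <= 2^31 - 1),
       the 64-bit product fits in 62 bits and cannot overflow. */
    #include <stdint.h>

    int64_t buffer_bytes(int32_t count, int32_t elem_size) {
        return (int64_t)count * (int64_t)elem_size;
    }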

Back to your Raspberry Pi 4 example: the CPU is 64-bit, but most users only run a 32-bit OS on it, whereas today most Linux installations are 64-bit. I believe Google doesn't care much about protobuf's security on 32-bit systems, and neither does other OSS software. So if you take it seriously, it works but it is not safe (when we are talking integer overflow).


> I'd prefer to catch such errors at compile-time.

I don't believe it's possible. These integers often come from user input, disk, or network. The compiler can't validate them simply because it doesn't have the data.

Even when it is possible, it's insanely complicated and computationally expensive to catch at compile time, yet very simple at runtime.

The runtime performance overhead is very small because branch prediction is quite efficient on modern CPUs: these branches are almost never taken, the JIT compiler knows that, and it emits code which will be predicted correctly even when uncached.
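
The same cheap runtime check is available outside C# too; a sketch using the GCC/Clang builtin for comparison:

    /* The moral equivalent of C#'s checked arithmetic: one
       well-predicted, almost-never-taken branch per operation. */
    #include <stdint.h>
    #include <stdlib.h>

    int32_t checked_add(int32_t a, int32_t b) {
        int32_t r;
        if (__builtin_add_overflow(a, b, &r))
            abort();                 /* overflow: trap, don't wrap */
        return r;
    }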

> if you take it seriously, it works but it is not safe

Not sure I follow. Let's pretend I am taking it reasonably seriously, despite this being an old, unpaid hobby project.

Why is it not safe? The Mpeg4 and MKV parsers are written in C# and compiled with that <CheckForOverflowUnderflow> option set to True.


Holy shit. My relatives have asked me in the past "could this [image|video|other supposedly innocuous file format] be a virus or hack my phone?". I've always told them not to worry. Can't do that anymore.


https://en.wikipedia.org/wiki/Windows_Metafile_vulnerability

Long story short: Windows library routines for handling an obscure, obsolete image format had a parser flaw. Simply rendering an appropriately crafted image via the standard Windows APIs -- whether in a web browser, file explorer, file preview, word processor, anywhere -- resulted in kernel-level arbitrary code execution.

Now, we've gotten a bit smarter about this sort of thing since. Both at a high level architecturally (don't process image files in the kernel) and at a lower level (use a language that takes measures to constrain its buffers). But the basic scenario hasn't been entirely eliminated. There could be a parser bug somewhere in your web browser for example that allows a properly crafted input to hijack the browser process.


> There could be a parser bug somewhere in your web browser for example that allows a properly crafted input to hijack the browser process.

Bit of a caveat: Chromium and Firefox are probably some of the most hardened software programs in the world (for other browsers, all bets are off).

Chromium distributes its logic over multiple processes per tab, so that even if you e.g. find a zero-day in V8, you still can't use it to get arbitrary file access without a sandbox escape. Last I checked, Firefox was getting there. Also, Firefox compiles some parsing and image processing libraries to WebAssembly for another layer of sandboxing (and to protect against ROP exploits), and increasingly uses Rust for sensitive tasks.

That's not to say they're safe, but I don't think they're the biggest source of exploits.


> Bit of a caveat: Chromium and Firefox are probably some of the most hardened software programs in the world (for other browsers, all bets are off).

There is a certain irony in the idea that people should rely on a Google product to avoid having their privacy compromised.


I guess with Google, at least you know and give consent to the privacy "invasion" when you use their products.


> I guess with Google, at least you know and give consent to the privacy "invasion" when you use their products.

Google tracks you and adds things to your profile when you explicitly choose incognito mode to avoid the privacy invasion. This doesn't seem like informed consent to me.


Which is why the only safe way to operate is to assume anything that is susceptible to outside data is already compromised - and so run it in a sandbox.


You should read the writeup. This was run in a sandbox. Sandboxes are not silver bullets, and they too can have bugs.


The tech is the easy part: iOS/Android have the best security teams in the world, and an unlimited budget, and sandboxing is an old, proven tech.

I guess that the politics here are the real barrier.


It's not only politics: the reason some languages and OSes rule is that real progress only happens one generation at a time, to put it in friendlier wording.


"Science progresses, one funeral at a time."

https://en.wikipedia.org/wiki/Planck%27s_principle

As a general principle in life, it's quite frightening considering ever increasing life spans.


That's just a mitigation for tens of millions of lines of code written in C / C++.


That's not a solution. You're just piping the outside data into your sandbox; it can have bugs too.


This is why I run a 1-task only Windows VM inside a Linux VM on a Mac. Ain’t nobody ripping through x3 0-days for my chats.


If you're a targeted journalist, they'll go through more than three to get you. Full chains are fairly long these days.


lmao bro does all that in front of his IoT Samsung toaster that has a speaker... speakers are microphones.


Macwinux


There's no such thing as a perfect solution, only solutions that improve a bad situation.


I'm not so much saying it's a bad idea as that what my parent comment described was a logical contradiction. It isn't possible to run "anything that is susceptible to outside data" in sandboxes, because that makes the sandbox susceptible to outside data. If you're genuinely assuming that anything susceptible is already compromised, then the sandbox is accomplishing literally nothing.


I always remember a passage from a sci-fi book I read, about the "multi-planet Internet" of its universe.

It was layer upon layer upon layer of protocols and software.

Because it wasn't possible to remove old layers (because some satellites or wormholes or whatever would stop working.)

So, it was super easy to hack... and to send spam. Well, you would get killed for that, though.


Would be interested in the name of the book?


It sounds like Vernor Vinge’s A Fire Upon The Deep (& sequels), well worth finding.


It depends on which sandbox you are using. In Qubes OS on desktop, you rely on hardware virtualization, which is virtually unbreakable.


I thought Spectre and Meltdown also allowed host data leakage from a compromised guest?


Yes, hardware vulnerabilities like those are a problem indeed. Hopefully Qubes Air (the next version, 5.0) will compartmentalize even that by using separate physical devices as qubes: https://www.qubes-os.org/news/2018/01/22/qubes-air/.


I wish OSes would just drop support for legacy stuff nobody uses and make it an explicit install for the 1% of people who need it.


Each feature has a different 1% who use it


These exploits are only really an issue for your grandparents and the like if some large-scale mass hack is happening [2]. As long as they stay up to date, anyone not targeted by nation-state actors and not holding millions in cryptocurrency [0] likely has nothing to worry about, as these exploits are better used hacking journalists trying to expose corruption or political opponents running against the incumbent [1].

0: https://news.ycombinator.com/item?id=30322715

1: https://www.seattletimes.com/business/rights-group-verifies-...

2: For instance, the Coinbase Super Bowl ad that was only a bouncing QR code would have been a very interesting way to start WW3 if it had been Russia hacking millions upon millions of Americans' phones, exfiltrating any potentially sensitive information (company emails, etc.) in an instant, and/or destroying the device via some exploit chain that destroys the OS and requires a full firmware factory reset to recover.


I think it’s very possible to be in the grey zone. For example, I have worked as an activist, and a campaign I led was widely lauded by the mainstream media as the key factor in a powerful government minister losing. Now, this is a western liberal democracy that probably doesn’t need to buy tools from NSO Group. But still, am I a fair target? Or what about a friend who is a journalist whose articles have recently attracted negative attention from the government of a fairly corrupt, despotic nation. Are they a target? I’m not talking about Snowden or Assange or executives at trillion-dollar companies. In my circle I know a bunch of folk who are basically just normal people with jobs, but whose jobs mean that someone working for some government somewhere might like to read their texts. How wide is the net? How can we protect ourselves?


You're really cavalier about whether widespread hacks happen. See any of the text message attacks from the past decade.


Vulnerabilities on iOS are getting really scarce. People spend truckloads of money finding them and you need to pay twice that for the permission to burn said exploit.

That stuff isn't burned on mass hacks on random phone users, it's way too valuable.

BUT. There is a small sliver of time between the exploit being used on a high value target and Apple patching the hole. That's the spot where Joe Schmoe should be cautious.


You are probably right, but this attack only became visible because it had a bug. How many others are currently invisible? Well, that's what I'm asking myself :)


A lot, but they're still only used for high-value targets. They're way too valuable to waste on some random person who happens to click a link.


That's a bad argument for defense.

If it can be used on one random person, then it can be used on the hundreds of millions of random people who use iPhones and Android.

And getting even 1% of those massive user bases to click on a link, stealing their money or private information, would be incredibly lucrative even for the short period until the patch rolls out, especially with the wealthier iOS userbase as a target.

In my EU country, I'm still getting regular spam SMS with links to what I presume is some older Android malware that wreaked havoc last year. So if attackers are still at it, months after a patch was rolled out, they must still be getting returns on their "investment".


Except we don't live in the past decade anymore. Even though people are still sometimes reluctant to update ("it only made my device slow!"), we've made significant progress on patch distribution.

In the past, a bug in the SMS stack could be mass-exploited and still not get fixed anytime soon. Not anymore. These bugs cost $10k~$100k now, and once you mass-exploit them, they are gone.


> Except we don't live in the past decade anymore.

You do know that is a terrible attitude for a real-world security posture meant to protect non-theoretical people's property and information against actual exploits?

> In the past, a bug in the SMS stack could be mass-exploited and still not get fixed anytime soon. Not anymore.

While you may wish for patches to always take care of exploits before any phones are compromised, that's not much more than wishful thinking. You assume that all 0day exploits are both known and fixed immediately. That is 100% false.


> once you mass-exploit them, they are gone

That is only true of exploits that have obvious and visible impacts, right? If an attacker found an exploit and used it to put a rootkit on millions of phones, but did nothing with that rootkit and it had no outward markers, would anyone know?


Yes, probably the backdoors that security companies implement on the phones to exfiltrate and sell data would reveal that.


I wonder whether even as many as half of Android phones are less than, say, six months behind on security updates. Vendors are often quite slow to release them for any given model, and that's while the phone even gets updates.


One of the earlier iPhone jailbreaks was a TIFF image... complicated decompression/rendering algorithms leave room for implementation errors, which can be taken advantage of.

https://en.wikipedia.org/wiki/JailbreakMe#JailbreakMe_1.0_(i...


At some point it was also used as a way to get a custom firmware onto a PSP.

Then modders somehow managed to update the battery firmware (cf. "Pandora battery") and use that. Sony couldn’t patch it, and it was basically game over for them until they released a new generation of hardware, with motherboards immune to the trick.

Fun times.


There's been buffer overflows/RCE exploits in all sorts of software that can parse images since, well, forever. I remember more than 20 years ago seeing a notice about the embedded Internet Explorer rendering engine in Microsoft Outlook Express having an RCE zero day which could be exploited by simply loading an image in the body of an email.

Rich multimedia parsing display systems in messaging apps are a very tempting attack surface for entities such as NSO.


Why does a messenger app need a picture viewer?


Because people send each other pictures?


> Why does a messenger app need a picture viewer?

A picture is worth a thousand words.


There are two different types of attacks.

One is drive-by attacks by random viruses and ransomware. For those cases, I would not worry about pictures.

The other is when you are targeted by regimes with an essentially unlimited budget. In that case, yes, the picture can be spyware.


> The other is when you are targeted by regimes with an essentially unlimited budget. In that case, yes, the picture can be spyware.

If this was the case, exploits would never be published or abused, and jailbreaks wouldn't exist because this logic says that those who find exploits will either disclose them "responsibly" or sell them to a nation-state.

If the idea of non-state hackers doesn't bother you, recognize that organized crime is a billion dollar industry and fraud rings would love root access on tons of normal people's devices, including your own.


That's terrible advice, among the worst that could be given. There are many other types of attacks that are not viral, are not ransomware, and do not originate from state actors.


How does one know which category they are in?


Think about who would want to spy on you, what they'd want to know, and how much they'd be willing to spend to know it.

If the most they could get out of you was a few thousand bucks from your bank account and maybe your email password, you're probably in the first category.

On the other hand, if you have access to highly confidential information (think classified government info or you're literally working on the next iPhone) or are the type of person who makes enemies of spoiled rich oligarchs in despotic nations then you're probably in the second.


The problem is, everyone is in the second category over a long enough time frame. Hong Kongers probably thought the same, but suddenly there were crackdowns, and state actors would probably have loved to have unrestricted access to people's phones to see if citizens were exercising their “free speech” correctly.

Think about Ukraine today, the Russian government would probably love to have a way to compromise millions of Ukrainian citizens’ phones.

These people all use iPhones.


> These people all use iPhones.

Quick research tells me that pretty much all stats show Android use to be around 80% in Ukraine. Or did you mean Hong Kong? For the latter I see a 50/50 split.

Just curious about that sentence. I don't think the stats take anything away from your general argument.


Who might be my enemy in the future? Well, maybe anyone who thinks I have something of worth to them. Let's say a social media account with a double-letter username. Or anything I don't think has any worth now but that can be turned into a handsome buck tomorrow. People have been doxxed and SWATted over less.


I don’t know. If I was going to bust a move on, say, Taiwan, it might be handy to have root access to as many computing devices as possible so that I could wreak havoc on my enemy’s communication and banking systems.


Who will target you if you are working on the next iPhone?


everyone


China.

Xiaomi, Huawei, Oppo, Honor... there are quite a few Chinese phone brands that would benefit from knowing what Apple is working on.


You never really know.

But nobody is going to burn zero days on mass surveillance. It’s just for specifically targeted people.


Are you or someone you associate with interesting?

Do you negotiate big contracts? Work in aerospace or defense? Have access to inside information about a public company? Are you a high-level political official, or do you have access to one?


If you are asking, you are probably in the first category, along with myself and the vast majority of people.



If you are a feminist activist in Saudi Arabia, I guess you know the deal.



Exploits via image libraries have been a perennial threat. A lot of jailbreaks on various devices and consoles over the last 20 years owe their existence to such exploits.


In the late 1990s there were a ton of hoaxes about image files supposedly being viruses. Most famously:

https://en.wikipedia.org/wiki/Goodtimes_virus

I remember telling lots of people at the time that this was impossible, because images weren't executable code, and viruses spread through running programs, not through viewing images.

Unfortunately, this elegant, straightforward distinction didn't hold up over time. :-(

https://en.wikipedia.org/wiki/Weird_machine


> Unfortunately, this elegant, straightforward distinction didn't hold up over time. :-(

I think it was more that it was never true, rather than it not holding up over time. ;)

The earliest I can find is a vulnerability in Netscape 3.0 (1996), not found until four years later:

https://www.openwall.com/articles/JPEG-COM-Marker-Vulnerabil...


You just need a buffer overflow in a file format parser.

Thus the distinction has never existed. There has never been such thing as a “safe” format.


>because images weren't executable code

I believe that is what the creators of this virus must have been relying on. All I hope is that creating this image virus doesn't become common knowledge (because that would fundamentally reshape how we interact on social media).


> All I hope is that creating this image virus doesn't become common knowledge

All I hope is that devs start replacing parsers with ones written in a safe language.


The problem with CPUs is they don't know which instructions are supposed to run in what order. The pipeline goes a little way towards getting the instructions in order, but ultimately a CPU does not know which instructions it has to run for a group of instructions to not be malicious. Think of a CPU like an old human telephone exchange, where the operator is plugging different cables into different sockets, and hopefully you get the idea.

I'm amazed that the tech giants, with all their funding, still can't build secure operating systems or muster the resources to reduce attack vectors within their own OSes.


We've gone from 'every OS and device is easily exploitable' to mass market devices/OS pairings where drive-by exploits cost a million dollars.


You should look up Rice's theorem, because what you are suggesting is intractable and has nothing to do with the design of CPUs.


I wouldn't consider Rice's theorem relevant to what I was thinking. Sure, all programs have common repeatable elements (open a file, read/write, close the file), so you wouldn't have an instruction or a few suddenly being run out of the blue, in effect out of context. But that's what's happening here: the normal instructions that would be required to do a task suddenly start being mixed with instructions that are not required in most cases, before reverting back to the rest of the instructions for the original task.

It's a bit like saying: would you expect instructions for virtualisation to run if you load a JPG to display on screen? I wouldn't expect virtualisation functionality to be running in that example.

Or would I expect instructions for encryption to run if I were to load a sound file to play over the speakers? No, I wouldn't expect that to happen, but that's the sort of thing happening here: some instructions not normally associated with a task are occurring. So how do you detect, alert on, and maybe halt those instructions?

There isn't anything in the CPU, AFAIK, that would pick this up; it would need the OS to act as a co-party to perhaps halt it, and I don't know if the OS or even AV software goes to this extent. At best, you'd have something like a dmesg feed or Intel Processor Trace (https://news.ycombinator.com/item?id=30110088) to get the output of instructions being called (possibly independently of the OS), but like I say, I don't know of any OS or AV product that goes to this level of monitoring.

That's where I am coming from.


"Cook's egg" is a recommended reading. Summary here (spoilers ahead): https://icdt.osu.edu/cuckoos-egg


That summary sparks interest indeed, thanks for recommending! Just ordered a copy.


Was "Cuckoo's Egg" autocorrected to "Cook's egg"?


Yes, indeed. I don't see an edit option.


Can't edit after two hours, but it's present before that cutoff.


Image parsers are complicated and often exposed to untrusted data; they’ve always been a big vector for exploits.


They are safe, unless they're being targeted by state actors.


As of Windows 7, the parsing of some font files was handled inside the Windows kernel.





