CVE-2019-1347: When a mouse over a file is enough to crash your system (tetrane.com)
142 points by danso on Nov 16, 2019 | 58 comments



One more bit of proof that if you want proper isolation you should not bloat your kernel. All this stuff belongs in userland. At least there the damage will be limited.


The crash is in parsing an executable file. Linux also performs this parsing in the kernel. Moving it to userland would be a microkernel design.


> The crash is in parsing an executable file.

In a context that doesn't involve executing it. On a Linux system, this functionality would probably be implemented with libelf or libbfd, neither of which depends on the kernel.


Why does it matter if I'm gonna execute it? The parsing doesn't need to be done in kernel space in either case.

Then again, the kernel already has such a gigantic attack surface that it doesn't matter if the ELF parser is part of it.


A UNIX-style process spawning interface, like exec() or posix_spawn(), requires the kernel to have some awareness of the executable format so that it can create a process from an executable image. It doesn't have to support every feature of the format, though.

A model in which this happens in userspace is conceivable, but it'd be pretty weird. Besides, it introduces a chicken-and-egg problem -- if you need a process to launch a process, you need some sort of special case for init.


Your bootstrapping process and init can have their own, hardcoded format.


Yeah, I thought the same thing. init is special.


IIRC, the binfmt elf code in the Linux kernel basically only does enough elf parsing to map everything into memory and kick off the linker. Then the linker does the rest, in userland. It matters because the binfmt code can be very hardened and minimal, whereas the userspace can go and do whatever it wants.
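To make "only enough ELF parsing to map everything and kick off the linker" concrete, here is a rough user-space sketch in Python. It is not the kernel's code: `summarize_elf` is a hypothetical helper that handles only little-endian ELF64, reads the program headers, and reports the PT_LOAD segments plus the PT_INTERP path (the dynamic linker), rejecting anything that points outside the file.

```python
import struct

# Layouts for little-endian ELF64, per the ELF specification.
ELF_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")   # 64 bytes
PROGRAM_HEADER = struct.Struct("<IIQQQQQQ")       # 56 bytes
PT_LOAD, PT_INTERP = 1, 3


def summarize_elf(data: bytes) -> dict:
    """Hypothetical helper: list loadable segments and the interpreter path."""
    if len(data) < ELF_HEADER.size:
        raise ValueError("truncated ELF header")
    (ident, _type, _machine, _version, entry, phoff, _shoff, _flags,
     _ehsize, phentsize, phnum, _shentsize, _shnum, _shstrndx) = \
        ELF_HEADER.unpack_from(data, 0)
    # ident[4] == 2 means 64-bit, ident[5] == 1 means little-endian.
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("not a little-endian ELF64 file")
    if phentsize != PROGRAM_HEADER.size:
        raise ValueError("unexpected program header size")

    loads, interp = [], None
    for i in range(phnum):
        off = phoff + i * phentsize
        if off + phentsize > len(data):
            raise ValueError("program header table out of bounds")
        (p_type, _p_flags, p_offset, p_vaddr, _p_paddr,
         p_filesz, p_memsz, _p_align) = PROGRAM_HEADER.unpack_from(data, off)
        if p_offset + p_filesz > len(data):
            raise ValueError("segment data out of bounds")
        if p_type == PT_LOAD:
            # A real loader would map this range; the sketch only records it.
            loads.append((p_vaddr, p_memsz))
        elif p_type == PT_INTERP:
            raw = data[p_offset:p_offset + p_filesz]
            interp = raw.rstrip(b"\0").decode("ascii", "replace")
    return {"entry": entry, "loads": loads, "interp": interp}
```

Everything beyond this (relocations, symbol resolution, shared libraries) is left to whatever program the interpreter path names, which runs in userland.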

Of course the ELF parsing doesn't literally have to be done by the kernel, but there's no obvious good place to put it otherwise, considering it's a critical bit. If security is the real concern, then why wouldn't you just drop privileges while parsing ELF instead? I don't see the benefit to moving this code outside of the kernel. Maybe somehow moving it to a different protection ring would be helpful; but same for the rest of the binfmts.

I believe the Windows loader that loads PE files is in fact in the kernel, along with a lot of other stuff. At the end of the day, when the Windows NT kernel was architected, speed was really important and security practices had not matured. I mean, the kernel handles, or at one point handled, a lot of stuff, especially related to the GUI and even network RPCs and network services.


> Maybe somehow moving it to a different protection ring would be helpful; but same for the rest of the binfmts

And the same goes for a lot of other kernel code.


> IIRC, the binfmt elf code in the Linux kernel basically only does enough elf parsing to map everything into memory and kick off the linker.

Even for statically linked executables?


For a statically linked executable, the minimal parsing is enough to kick off the executable directly. After all, the linker is "just" a statically linked executable...


One could use eBPF to do the parsing safely in the kernel ;-)


But here's a question - supposing that this functionality was moved into a hypothetical library "pe.dll"/"pe.exe", how would the kernel load this library or executable? Wouldn't one run into a catch-22 situation?


I think they're saying that userland should handle whatever the mouseover is trying to do (another comment says it displays author and version info embedded in the exe). Things that actually require kernel privileges would remain in the kernel.


The kernel would probably statically link it.


…wouldn’t that have the effect of bringing this code back into the kernel’s address space?


The code itself would be in the kernel, but the "parsing while mousing over" invocation would be on userspace code.

That is, the parse logic could be a library linked by multiple applications.


Wouldn't you load/call it during boot or when the mouse driver loads? If you have a mouse, you are almost certainly going to generate an on-mouse-over event at some point. The advantage is that it isn't running in ring 0.


> Moving it to userland would be a microkernel design.

Yes, please.


It's crazy that the arguments against microkernels were basically 100% "it's slower". Which these days is completely batshit insane because we are running software in VMs eating 10%+ system performance, written in languages like Java/JavaScript/etc. which eat another 50%, before the cowboy coder comes along and imports 2GB of libraries which effectively trash the caches/TLB/etc. at every opportunity due to the difficulty of creating caches with massive associativity. Which is how we get chat apps written in Electron eating up a couple of 4+GHz cores and multiple GB of memory just to display what was basically possible with mIRC and 64k of RAM 20 years ago.

NT tends to be conceptually about the right trade-off (micro-kernel design, but no context switching) but then they go do stupid crap like this.


Some computers have that software.

Others don't, for example tons of embedded devices.

If you have a small, fast kernel it could be used on a variety of devices, from Raspberry Pi to Mac Pro.


mIRC could not do inline gifs. You had to draw everything in ASCII. Is that not reason enough to spend several $k on a device to handle this?


LOL, just post a link to the rich content, that is how it's mostly done with hexchat these days. Of course you could just DCC the content to someone too.

But it serves to discourage anything that isn't actual conversation, which might not be a bad plan in the end.


Having been on chat from Slack to Discord to Reddit chat to Telegram... I keep coming back to IRC for its simplicity and lack of video and image spam.

In rare instances pictures add to the conversation. But 99 times out of 100, that pic is some reaction gif (spongebob, wat, etc). And I'd argue that these reaction images are completely worthless.

In a way, it's also the comparison between FB or Reddit vs. HN or lobste.rs.


Parsing and displaying a gif is not such a complicated matter that it comes anywhere close to excusing the resource gap between the two.


> It's crazy that the arguments against microkernels were basically 100% "it's slower". Which these days is completely batshit insane because we are running software in VMs eating 10%+ system performance, ...

The difference is I can avoid most of that crap when I have to write performance-sensitive code; I cannot avoid a slow system without dropping down to the hardware.


Slow is very relative here. The performance penalties are on the order of 5 to 10% or so in absolute throughput terms. Latency is usually much better in a microkernel-based system because it can use the cores more effectively, and as long as you stay on the same physical machine, quite a bit of the message-passing overhead can be recovered by using the paging mechanism.


Linux uses libelf to parse the executable, it isn't in the kernel. It's a sort of weird process but binfmt will load libelf, which is then responsible for loading the ELF file.


Nope: The Linux kernel has to parse and support all of the ELF stuff the INTERP of the executable needs.

fs/binfmt_elf.c is over 2k lines.



Dissenting opinion: if this weren't in the kernel, it would be unlikely to have been fixed as quickly --- explorer.exe crashes are unfortunately still common even without third-party extensions. If they were causing BSODs they would certainly be given more attention and fixed more quickly.


Poor prioritisation doesn't excuse poor systems design.


Well, that would mean no more kernel bugs, and that would reduce the impact of these issues substantially. Whether that translates into the bug getting fixed is a supplier issue, but no longer a major one.


What would the behavior be if this happened in user space?


File browser / user session would crash.


Is that less bad? Wouldn’t that nuke the experience too?


When explorer (the shell and file browser, don't know if those are fully separate under Win 10) crashes, it is automatically restarted and would not close any other running programs and would not log you out.
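A rough Python sketch of that containment idea, not of how Explorer actually works: run the risky step in a throwaway child process, so a crash kills only the child and the caller carries on. `run_isolated` is a hypothetical helper, and the snippet passed to it stands in for the parser.

```python
import subprocess
import sys
from typing import Tuple


def run_isolated(snippet: str, timeout: float = 5.0) -> Tuple[bool, str]:
    """Hypothetical helper: run Python code in a child process.

    Returns (succeeded, captured stdout). A crash or hang in the child
    is reported as a failure instead of taking the caller down.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", snippet],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # The child hung; subprocess.run kills it before raising.
        return False, ""
    return proc.returncode == 0, proc.stdout.strip()
```

The caller can then retry, skip the file, or show a placeholder. The same failure inside kernel-mode code has no such fallback.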


...and that's probably the reason why they don't seem to get fixed. I've had clean installs of Win10 and 8.1 crash it often, just by browsing through a lot of files.


That's a regular occurrence for me and people I know. My boss inserted a USB drive and Explorer crashed. It restarted itself, but crashed again. Until the drive was unplugged, at which time Windows made a bunch of error noises and popped up a "Catastrophic failure" dialog. It then worked OK after the USB drive was reinserted. The laptop cost ~3500€ and was installed and set up by the IT department with extra care because it's for the boss. It didn't matter, Windows 10 doesn't pick its victims.


I assume the audience of a post like this is other security researchers?

That's a radical piece of software, though. I've never seen anything like that before.

The text and technical details supporting the screen caps are very dense and hard to approach as a complete outsider. I would love to have a glossary with all the terms and initialisms/acronyms in use. PTE (apparently means page table entry), for example, is something I've never heard of before.


The format wasn’t particularly great even to someone who’s interested in the field and familiar with dynamic taint analysis (but not an expert :) ). The issue was that it focused a bit too heavily on showing what their product could do, so it had a lot of videos instead of a more tractable “our taint analysis tool shows that the lower four bits of this byte come from byte three of the PE header after it’s passed through x, y, and z”. (Page table entries are used to implement virtual memory, though expanding the acronym at least once would have been helpful. That being said, the code mentions page tables multiple times.)


I'm in the security field and do work similar to this professionally. This article is poorly written and I found the videos shown absolutely useless. If you want to see how technical articles should read, take a look at j00ru's blog (research that found this). That is how you write about this stuff.


Thank goodness!

I looked at the article and it made me mad because it seemed to be a bunch of nonsense.

I mean the second video paragraph is: "A closer look at the memcpy arguments shows that this address is built as 0xfffff8035b2a0000 + 0xe7ff, so we taint the value 0xe7ff to find where it comes from." But there's absolutely zero explanation of what we're looking at.

So it made me mad because it made me feel stupid for not understanding it, when in reality it's just poorly written. Poor communication is theft.


> I assume the audience of a post like this is other security researchers?

Or "low level/back end" developers; people knowledgeable with operating systems, assembly, and hardware.


This issue apparently affected Windows 8.1, Windows 10 and their corresponding Server editions (32-bit and 64-bit) and has been patched since October 8th.


The real question is: why the fuck does the kernel parse the PE on just a mouse over... it's like Windows is asking for vulns with even more impact :/


It displays an overlay when you hover the mouse over a PE executable with information like the author/company/version/date created.

The fact that it does this in the kernel is foolish, but there's functionality directly tied to this, not an indirect chain of things that results in parsing this for little reason.
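For illustration, here is a minimal Python sketch of reading basic PE metadata from ordinary user-space code. It is an assumption-laden example, not what Explorer or the kernel does: `read_pe_summary` is a hypothetical helper that reads only the fixed COFF file header (machine, section count, link timestamp). The author/company/version strings live in the version resource, which this sketch does not parse.

```python
import struct

# COFF file header that follows the "PE\0\0" signature (20 bytes):
# Machine, NumberOfSections, TimeDateStamp, PointerToSymbolTable,
# NumberOfSymbols, SizeOfOptionalHeader, Characteristics.
COFF_HEADER = struct.Struct("<HHIIIHH")


def read_pe_summary(data: bytes) -> dict:
    """Hypothetical helper: pull a few fields out of a PE file header."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("missing DOS header")
    # e_lfanew at offset 0x3C holds the file offset of the PE signature.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if e_lfanew + 4 + COFF_HEADER.size > len(data):
        raise ValueError("PE header offset out of bounds")
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    (machine, sections, timestamp, _sym_ptr, _sym_count,
     _opt_size, characteristics) = COFF_HEADER.unpack_from(data, e_lfanew + 4)
    return {
        "machine": machine,
        "sections": sections,
        "timestamp": timestamp,  # raw TimeDateStamp value, not interpreted
        "characteristics": characteristics,
    }
```

A malformed file makes this raise `ValueError` in the calling process, which the file browser can catch and ignore.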


Nobody is contesting that there’s a chain of events that leads to this, but doing this in the kernel is not “foolish” - it’s batshit insane, amateurish and frankly laughable.


Typically these things happen because people want to reuse code. There's some code that parses executables, there's a bit of shell that wants to display information from an executable, and someone joins the two together without understanding the ramifications.


I would argue that the stronger reason is that Windows typically doesn’t enforce, allow, or contain any distinction between kernel code and user-space code. This happens because there is a gigantic gravity well of pre-existing code that predates and/or precludes real separation, and the cost of concretely losing that support was nearly always judged higher than the abstract benefit of adding separation.

One old example: I have an old friend who used to do sysadmin work in the Windows 2k days, and he would often get gigs by speeding up services by 5-15% within a day. His go-to trick was to log into the headless win2k server and disable the screensaver process, then wait.


Windows NT (and as a result win2k, xp, 7, 8, and 10) has always had strict user and kernel space separation. Modern Windows also has separate 'integrity levels' within user space - for example, Chrome and Firefox run most of their functionality at 'Low' integrity to prevent compromised browser code from being used to exploit other processes. I'm not sure what the basis is for the claims you're making about the kernel. The idea that distinguishing between kernel and user code is Not Allowed is very confusing to me. How would you do that without running both user and kernel code at the same privilege level in a shared address space?

The CVE here literally has nothing to do with privilege separation or being unable to distinguish between user mode and kernel mode. It's this simple:

* Someone in user-space wanted to parse an executable to get information on it

* A syscall is exposed to parse an executable

* They used the syscall to parse the executable, and that code path had a bug in it which triggered a BSOD (since the failure was in kernel mode).

In the end, this is just "a syscall had a bug in it that caused a kernel crash, and a user-mode application issued the syscall on mouse-over".


What is foolish about it? The system already has to know how to map executables for execution, which is strictly harder, and that code better not crash.


If we trusted code to be reliable solely on the basis of how bad it would be if the code broke, we'd probably be inundated with CVEs.

I'm not sure what I'd suggest instead, though - this specific case would be nice if we had kept the parser in userspace, but that just turns "hovering triggers kernel-mode crash" into "attempting to execute causes kernel-mode crash", presumably.

[Rant about wanting more trusted codepaths having static analysis and/or managed languages goes here.]


Where should the OS's executable parser/loader be if not in the kernel? Are you proposing some sort of tiered bootstrapping scheme, where there's a baby loader that's used at boot to load an executable containing the Real Executable Loader, and that lives in a long-running single process with root permissions responsible for loading any and every dll or exe file ever touched on the entire system? I assume you'd issue requests to it via RPC or something, and it'd need to run on multiple threads so that loading executables wouldn't bottleneck. To get back info on the .exe file like the name of the author, you'd probably need to fire off an async request to the Executable Loader Service to ask it to load foo.exe, then fire another request to ask it to RPC the author name back to you, then after all that copying and RPC has happened you gotta send another call back over to free the loaded executable handle. Oh and I guess the executable loader broker service also needs to verify that the process asking for an executable to be loaded has permissions to access the file, since the I/O is being done out of process now, can't forget that. And it needs to make sure that it doesn't let handles leak if the client process crashes in the middle of manipulating the .exe information, otherwise the executable loader broker service could run out of memory and take down the whole system. Seems good.

Then you ask the broker to load this corrupt .exe file, and it crashes, and now your system (which hasn't bluescreened! hooray, we win!) can't run executables anymore. Under those circumstances, I would just take the OS down and generate a dump that someone can open up in a kernel debugger instead of letting it stumble along half-broken in a way that makes the problem hard to diagnose.


Perhaps I am missing something here, but aren’t we talking about the ability for a file browser to read a little bit of metadata out of a file?

I don’t understand why reading the metadata of a file (even of an executable file) should need heightened privileges. What am I missing?


Most OSes parse things via third-party code even without mouse over. macOS has Finder plugins, which is how third-party apps can have thumbnails in the Finder.


> macOS has Finder plugins, which is how third-party apps can have thumbnails in the Finder.

QuickLook is implemented via a horrific process where multiple plug-ins share the same address space :(


Without JavaScript, the images in the article will not load and stay a blurry mess. Please don't do this. Use progressive images.

EDIT: Apparently they are videos, but you can still add thumbnail images to native HTML5 videos without loading 40 script files (that uMatrix reported).



