CTF Writeup: Abusing select() to factor RSA (threadreaderapp.com)
135 points by moyix on Nov 11, 2023 | 26 comments



Apple's libc has a pretty wild feature (guarded by _DARWIN_UNLIMITED_SELECT, on by default) which allows fds above FD_SETSIZE. It works by checking the address of the fd_set: if it's within the current thread's stack, then the call will fail, under the assumption that it's a stack-allocated fd_set.

But if the address is NOT within the current thread's stack, select() assumes you know what you're doing and will allow the call, trusting you have allocated sufficient memory for the high fds in the fd_set.
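
Roughly, such a check could look like the sketch below; it uses Darwin's non-portable pthread stack queries and is only an illustration, not Apple's actual implementation:

  /* Sketch: is a pointer within the calling thread's stack? */
  #include <pthread.h>
  #include <stdbool.h>
  #include <stdint.h>

  static bool on_current_thread_stack(const void *p)
  {
      pthread_t self = pthread_self();
      /* On Darwin the returned address is the high end; the stack grows down. */
      uintptr_t base = (uintptr_t)pthread_get_stackaddr_np(self);
      size_t    size = pthread_get_stacksize_np(self);
      uintptr_t addr = (uintptr_t)p;
      return addr >= base - size && addr < base;
  }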

No opinion if this is a good decision or not, I just think it's interesting!


select() ought to just fail with EINVAL if nfds is too large. There's probably an argument for the macros as well to fail by default if the file descriptor given is too large.


As one safety measure, compiling with -D_FORTIFY_SOURCE set to a value greater than 0 enables checks on the FD_SET etc. macros:

https://github.com/bminor/glibc/commit/a0f33f996f7986dbf3763...
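
For example, building something like the snippet below with -O2 and -D_FORTIFY_SOURCE=2 on a glibc system should make the out-of-range FD_SET abort at runtime instead of silently scribbling past the fd_set (a sketch; exact behavior depends on the glibc version and optimization level):

  #include <sys/select.h>

  int main(void)
  {
      fd_set set;
      FD_ZERO(&set);
      FD_SET(FD_SETSIZE + 10, &set);  /* deliberately out of range */
      return 0;
  }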

The kernel side interface probably won't change because apparently legitimate apps have been allocating fd_sets on the heap to monitor fds > 1024 and they don't want to break those:

https://sourceware.org/bugzilla/show_bug.cgi?id=10352#c7

The underlying problem was also discussed in an article by Lennart Poettering (of systemd fame) and posted to HN back in 2021:

https://news.ycombinator.com/item?id=27215690

https://0pointer.net/blog/file-descriptor-limits.html


> The kernel side interface probably won't change because apparently legitimate apps have been allocating fd_sets on the heap to monitor fds > 1024 and they don't want to break those

Rightly so. Before libev, AIO or whatever were a thing, I used to run network servers 10 or 15 years or so ago with a redefined __FD_SETSIZE set to 16384, without any problems on Linux (plus appropriate proc and ulimit settings). The whole stack supported it properly, even if not officially.
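
The trick was roughly the following; whether the headers actually honor such a redefinition (and whether the knob is FD_SETSIZE or __FD_SETSIZE) varies by libc and era, and modern glibc hard-codes the size, so treat this purely as an illustration:

  /* Must precede any system header, on a libc that honors it. */
  #define FD_SETSIZE 16384
  #include <sys/select.h>

  /* On such a libc, fd_set (and thus FD_SET/FD_ISSET) now covers 16384 fds. */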

The real problem nowadays is that people can easily receive an fd >= 1024 (you do not control the descriptor numbers you are handed), put it into an fd_set that only supports values up to 1023, and then you have a security problem. Plus, of course, the later APIs also simply scale better beyond 16k connections.


I'm guessing you didn't/couldn't use poll(2) because of the performance hit (both user and kernel side) of parsing/checking the less compact data structure?

I both miss and don't miss the days before epoll(2) et al.


Originally, we did not use it because it did not exist. That service already existed at a time before poll(2) was a thing on Linux, when we still had only 256 fds. The increase to 1024, and then to basically unlimited, came just in time for us back then, and we were glad a simple recompile with a #define was all we needed to scale it.

If my memory serves me right, we did try poll(2) a while after it became available (by which point we were already running the 16k selects), but it was simply less performant.

Later on, when java.nio came around and Java could therefore finally "compete" for network services, we switched that service completely from C to Java.


Hmm, I wonder if there are any Linux distros that are built with this flag enabled?


_FORTIFY_SOURCE has been enabled for package builds in many distros, including popular ones like Ubuntu [1], for quite a while.

[1] https://wiki.ubuntu.com/ToolChain/CompilerFlags#A-D_FORTIFY_...


CTF writeups are so fun. Here's another I enjoyed, written by a teammate: https://zackorndorff.com/2022/08/06/blazin-etudes-hack-a-sat...


I really think that this is a quality of implementation issue.

Even though most implementations do so, there is no requirement to implement fd_set as a bitmap. It could also be an array of integers. Though this still won’t allow you to select() against an infinite number of file descriptors, it at least allows file descriptor numbers to span the full range of int.

Furthermore, there’s also no requirement that FD_*() corrupt your memory. I get it that these macros can’t return errors back to the caller, but they can always set some kind of flag in the fd_set to indicate that insertion was unsuccessful. select() could check that flag and bail out if set.
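
A sketch combining both suggestions, with all names (xfd_set, XFD_MAX, xfd_set_add) invented for illustration: descriptors are stored as an array of ints, so values can span the full range of int, and a failed insertion sets a flag instead of corrupting memory:

  #include <stdbool.h>

  #define XFD_MAX 1024                /* capacity of the set, not a cap on fd values */

  typedef struct {
      int  fds[XFD_MAX];
      int  count;
      bool overflow;                  /* set when an insertion did not fit */
  } xfd_set;

  static void xfd_set_add(int fd, xfd_set *s)
  {
      if (fd < 0 || s->count >= XFD_MAX) {
          s->overflow = true;         /* a select() wrapper would check this and fail */
          return;
      }
      s->fds[s->count++] = fd;
  }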



They could simply abort.


Excellent writeup. Thank you.


I love well designed POSIX APIs, such as "this will silently corrupt memory if you use an FD above FD_SETSIZE, which you have no control over and have no sane way of remapping if it does happen".


…right?

The API design is patently insane, but why can't there be a simple

  if(nfds > FD_SETSIZE) {
    errno = EINVAL;
    return -1;
  }
… or something to prevent "the API is garbage" from escalating all the way into "and now your memory is corrupt and the hackers are in"…?


That's not really what the problem is. The actual code is fine.

The issue is that the definition of `fd_set` has a constant size [1]. If you allocate the memory yourself, the select() system call will work with as many file descriptors as you care to pass to it. You can see that both glibc [2] and the kernel [3] support arbitrarily large arrays (well, in the kernel case you'll run into other limitations... but no memory corruption).

[1] https://github.com/bminor/glibc/blob/master/misc/sys/select....

[2] https://github.com/bminor/glibc/blob/master/sysdeps/unix/sys...

[3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...
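
A sketch of that pattern: build an oversized bitmap on the heap and pass it to select() with nfds > FD_SETSIZE. The word size matches what glibc/Linux use for fd_set bits, and the cast to fd_set * is exactly the kind of type-punning complained about elsewhere in the thread:

  #include <limits.h>
  #include <stdlib.h>
  #include <sys/select.h>

  static int wait_readable(int fd)               /* fd may be >= FD_SETSIZE */
  {
      const size_t nbits  = sizeof(unsigned long) * CHAR_BIT;
      const size_t nwords = ((size_t)fd + nbits) / nbits;   /* room for fd+1 bits */

      unsigned long *bits = calloc(nwords, sizeof(unsigned long));
      if (bits == NULL)
          return -1;

      bits[fd / nbits] |= 1UL << (fd % nbits);   /* mark our one descriptor */
      int r = select(fd + 1, (fd_set *)bits, NULL, NULL, NULL);
      free(bits);
      return r;
  }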


> The issue is that the definition of `fd_set` has a constant size

I'm well aware.

> If you allocate the memory yourself, the select() system call will work with as many file descriptors as you care to pass to it.

I'm aware of this, as well.

That the kernel interface is more flexible is fine. But the glibc wrapper could sanity-check its arguments.

If you're saying the glibc wrapper also is flexible (assuming one allocs their own fd_sets and … somehow that doesn't run afoul of strict aliasing…?) … and thus, to allow that backbreaking exercise of "you can still use it for all its other bugs" cannot sanity check its arguments (because nfds > the FD set size is permitted) … well … good grief.


> I'm well aware

> I'm aware of this, as well.

I'm really curious: when you typed those sentences, what possible purpose did you imagine them serving?

> If you're saying the glibc wrapper also is flexible

Yes, it obviously is. There's a long and storied history around this, which you're clearly ignorant of: programmers have been abusing this interface in exactly the way I'm describing for decades. Suddenly breaking a bunch of old code in the name of making an antiquated interface nobody uses anymore "secure" is just bad policy IMHO.


Because 1024 files ought to be enough for anyone? People wanted to write software to handle more files than that.


Because the code was written before unit tests were a thing, and nobody is willing to take the risk / do the work to fix it, especially when "it's been shipping for years and nobody has ever complained".


It’s been shipping for decades and people have been complaining the whole time.


The funny thing is that select's APIs are compatible with a non-broken implementation, such as something that heap allocates if above a constant size. My recollection is that some implementations (winsock?) even do this.

Of course, that's never going to actually happen on implementations people care about, between ABI breakage on the one hand and the existence of poll/epoll on the other.

(My biggest concern in practice is random shitty libraries using select behind the scenes and then silently corrupting memory in processes that have more than a few file descriptors.)


> which you have no control over

It's not quite that bad: UNIX has always guaranteed open() will return the lowest unused file descriptor. So in practice, it just limits you to 1024 total open files in the process, which in all fairness probably seemed like an absurdly large number at the time it was designed.


And of course that guarantee has its own problem, namely that (especially in a multithreaded process) a use-after-close error is vastly more likely to cause corruption via a write to a newly opened file through the old descriptor number.

And in all fairness, nobody was thinking of multithreading when these APIs were designed. We're lucky enough that errno mostly works as a thread local rather than a global.


Use after close is, of course, tons of fun. But the guaranteed order also means open and accept require a per-process lock.


Memory corruption might die out as we secure our software stacks but I’m glad to see that weebs are eternal.



