CTF Writeup: Abusing select() to factor RSA (threadreaderapp.com)
135 points by moyix on Nov 11, 2023 | 26 comments



Apple's libc has a pretty wild feature (guarded by _DARWIN_UNLIMITED_SELECT, on by default) which allows fds above FD_SETSIZE. It works by checking the address of the fd_set: if it's within the current thread's stack, then the call will fail, under the assumption that it's a stack-allocated fd_set.

But if the address is NOT within the current thread's stack, select() assumes you know what you're doing and will allow the call, trusting you have allocated sufficient memory for the high fds in the fd_set.
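
Roughly, such a check could look like the sketch below; it uses Darwin's non-portable pthread stack queries and is only an illustration, not Apple's actual implementation:

  /* Sketch: is a pointer within the calling thread's stack? */
  #include <pthread.h>
  #include <stdbool.h>
  #include <stdint.h>

  static bool on_current_thread_stack(const void *p)
  {
      pthread_t self = pthread_self();
      /* On Darwin the returned address is the high end; the stack grows down. */
      uintptr_t base = (uintptr_t)pthread_get_stackaddr_np(self);
      size_t    size = pthread_get_stacksize_np(self);
      uintptr_t addr = (uintptr_t)p;
      return addr >= base - size && addr < base;
  }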

No opinion if this is a good decision or not, I just think it's interesting!


select() ought to just fail with EINVAL if nfds is too large. There's probably an argument for the macros as well to fail by default if the file descriptor given is too large.


As one safety measure, compiling with -D_FORTIFY_SOURCE set to a value greater than 0 enables checks on the FD_SET etc. macros:

https://github.com/bminor/glibc/commit/a0f33f996f7986dbf3763...
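
For example, building something like the snippet below with -O2 and -D_FORTIFY_SOURCE=2 on a glibc system should make the out-of-range FD_SET abort at runtime instead of silently scribbling past the fd_set (a sketch; exact behavior depends on the glibc version and optimization level):

  #include <sys/select.h>

  int main(void)
  {
      fd_set set;
      FD_ZERO(&set);
      FD_SET(FD_SETSIZE + 10, &set);  /* deliberately out of range */
      return 0;
  }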

The kernel side interface probably won't change because apparently legitimate apps have been allocating fd_sets on the heap to monitor fds > 1024 and they don't want to break those:

https://sourceware.org/bugzilla/show_bug.cgi?id=10352#c7

The underlying problem was also discussed in an article by Lennart Poettering (of systemd fame) and posted to HN back in 2021:

https://news.ycombinator.com/item?id=27215690

https://0pointer.net/blog/file-descriptor-limits.html


> The kernel side interface probably won't change because apparently legitimate apps have been allocating fd_sets on the heap to monitor fds > 1024 and they don't want to break those

Rightly so. Before libev, AIO or whatever were a thing, I used to run network servers 10 or 15 years or so ago with a redefined __FD_SETSIZE set to 16384, without any problems on Linux (plus appropriate proc and ulimit settings). The whole stack supported it properly, even if not officially.
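
The trick was roughly the following; whether the headers actually honor such a redefinition (and whether the knob is FD_SETSIZE or __FD_SETSIZE) varies by libc and era, and modern glibc hard-codes the size, so treat this purely as an illustration:

  /* Must precede any system header, on a libc that honors it. */
  #define FD_SETSIZE 16384
  #include <sys/select.h>

  /* On such a libc, fd_set (and thus FD_SET/FD_ISSET) now covers 16384 fds. */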

The real problem nowadays is that people can easily receive an fd >= 1024 (you do not control the descriptor numbers you are handed), put it into an fd_set that only supports values up to 1023, and then you have a security problem. Plus, of course, the later APIs also simply scale better beyond 16k connections.


I'm guessing you didn't/couldn't use poll(2) because of the performance hit (both user and kernel side) of parsing/checking the less compact data structure?

I both miss and don't miss the days before epoll(2) et al.


Originally, we did not use it because it did not exist. That service already existed at a time before poll(2) was a thing on Linux, when we still had only 256 fds. The increase to 1024, and then to basically unlimited, came just in time for us back then, and we were glad a simple recompile with a #define was all we needed to scale it.

If my memory serves me right, we did try poll(2) a while after it became available (by which point we were already running the 16k selects), but it was simply less performant.

Later on, when java.nio came around and Java could therefore finally "compete" for network services, we switched that service completely from C to Java.


Hmm, I wonder if there are any Linux distros that are built with this flag enabled?


_FORTIFY_SOURCE has been enabled for package builds in many distros, including popular ones like Ubuntu [1], for quite a while.

[1] https://wiki.ubuntu.com/ToolChain/CompilerFlags#A-D_FORTIFY_...


CTF writeups are so fun. Here's another I enjoyed, written by a teammate: https://zackorndorff.com/2022/08/06/blazin-etudes-hack-a-sat...


I really think that this is a quality of implementation issue.

Even though most implementations do so, there is no requirement to implement fd_set as a bitmap. It could also be an array of integers. Though this still won’t allow you to select() against an infinite number of file descriptors, it at least allows file descriptor numbers to span the full range of int.

Furthermore, there’s also no requirement that FD_*() corrupt your memory. I get it that these macros can’t return errors back to the caller, but they can always set some kind of flag in the fd_set to indicate that insertion was unsuccessful. select() could check that flag and bail out if set.
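
A sketch combining both suggestions, with all names (xfd_set, XFD_MAX, xfd_set_add) invented for illustration: descriptors are stored as an array of ints, so values can span the full range of int, and a failed insertion sets a flag instead of corrupting memory:

  #include <stdbool.h>

  #define XFD_MAX 1024                /* capacity of the set, not a cap on fd values */

  typedef struct {
      int  fds[XFD_MAX];
      int  count;
      bool overflow;                  /* set when an insertion did not fit */
  } xfd_set;

  static void xfd_set_add(int fd, xfd_set *s)
  {
      if (fd < 0 || s->count >= XFD_MAX) {
          s->overflow = true;         /* a select() wrapper would check this and fail */
          return;
      }
      s->fds[s->count++] = fd;
  }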



They could simply abort.


Excellent writeup. Thank you.


I love well designed POSIX APIs, such as "this will silently corrupt memory if you use an FD above FD_SETSIZE, which you have no control over and have no sane way of remapping if it does happen".


…right?

The API design is patently insane, but why can't there be a simple

  if(nfds > FD_SETSIZE) {
    errno = EINVAL;
    return -1;
  }
… or something to prevent "the API is garbage" from escalating all the way into "and now your memory is corrupt and the hackers are in"…?


That's not really what the problem is. The actual code is fine.

The issue is that the definition of `fd_set` has a constant size [1]. If you allocate the memory yourself, the select() system call will work with as many file descriptors as you care to pass to it. You can see that both glibc [2] and the kernel [3] support arbitrarily large arrays (well, in the kernel case you'll run into other limitations... but no memory corruption).

[1] https://github.com/bminor/glibc/blob/master/misc/sys/select....

[2] https://github.com/bminor/glibc/blob/master/sysdeps/unix/sys...

[3] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...
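
A sketch of that pattern: build an oversized bitmap on the heap and pass it to select() with nfds > FD_SETSIZE. The word size matches what glibc/Linux use for fd_set bits, and the cast to fd_set * is exactly the kind of type-punning complained about elsewhere in the thread:

  #include <limits.h>
  #include <stdlib.h>
  #include <sys/select.h>

  static int wait_readable(int fd)               /* fd may be >= FD_SETSIZE */
  {
      const size_t nbits  = sizeof(unsigned long) * CHAR_BIT;
      const size_t nwords = ((size_t)fd + nbits) / nbits;   /* room for fd+1 bits */

      unsigned long *bits = calloc(nwords, sizeof(unsigned long));
      if (bits == NULL)
          return -1;

      bits[fd / nbits] |= 1UL << (fd % nbits);   /* mark our one descriptor */
      int r = select(fd + 1, (fd_set *)bits, NULL, NULL, NULL);
      free(bits);
      return r;
  }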


> The issue is that the definition of `fd_set` has a constant size

I'm well aware.

> If you allocate the memory yourself, the select() system call will work with as many file descriptors as you care to pass to it.

I'm aware of this, as well.

That the kernel interface is more flexible is fine. But the glibc wrapper could sanity-check its arguments.

If you're saying the glibc wrapper also is flexible (assuming one allocs their own fd_sets and … somehow that doesn't run afoul of strict aliasing…?) … and thus, to allow that backbreaking exercise of "you can still use it for all its other bugs" cannot sanity check its arguments (because nfds > the FD set size is permitted) … well … good grief.


> I'm well aware

> I'm aware of this, as well.

I'm really curious: when you typed those sentences, what possible purpose did you imagine them serving?

> If you're saying the glibc wrapper also is flexible

Yes, it obviously is. There's a long and storied history around this, which you're clearly ignorant of: programmers have been abusing this interface in exactly the way I'm describing for decades. Suddenly breaking a bunch of old code in the name of making an antiquated interface nobody uses anymore "secure" is just bad policy IMHO.


Because 1024 files ought to be enough for anyone? People wanted to write software to handle more files than that.


Because the code was written before unit tests were a thing, and nobody is willing to take the risk / do the work to fix it, especially when "it's been shipping for years and nobody has ever complained".


It’s been shipping for decades and people have been complaining the whole time.


The funny thing is that select's APIs are compatible with a non-broken implementation, such as something that heap allocates if above a constant size. My recollection is that some implementations (winsock?) even do this.

Of course, that's never going to actually happen on implementations people care about, between ABI breakage on the one hand and the existence of poll/epoll on the other.

(My biggest concern in practice is random shitty libraries using select behind the scenes and then silently corrupting memory in processes that have more than a few file descriptors.)


> which you have no control over

It's not quite that bad: UNIX has always guaranteed open() will return the lowest unused file descriptor. So in practice, it just limits you to 1024 total open files in the process, which in all fairness probably seemed like an absurdly large number at the time it was designed.


And of course that guarantee has its own problem, namely that (especially in a multithreaded process) a use-after-close error is vastly more likely to cause corruption via a write to a newly opened file through the old descriptor number.

And in all fairness, nobody was thinking of multithreading when these APIs were designed. We're lucky enough that errno mostly works as a thread local rather than a global.


Use after close is, of course, tons of fun. But the guaranteed order also means open and accept require a per-process lock.


Memory corruption might die out as we secure our software stacks but I’m glad to see that weebs are eternal.



