Windows X86-64 System Call Table (vexillium.org)
179 points by badrabbit on Sept 1, 2019 | 137 comments



It's important to note that unlike Linux, NT system call numbers are not stable. That's a very good thing --- it effectively forces all system calls to go through ntdll, which can coordinate closely with userspace in interesting ways. It's as if NT's "vdso" covered all interaction with the kernel.

For example, NT has no native 32-bit ABI like Linux does. Instead, userspace translates 32-bit system calls into 64-bit ones, handling the switch into long mode and back transparently. This elegant approach is possible because applications never make system calls directly.


Then why are there both a "Program Files" and a "Program Files (x86)" folder? I believe the second one is for 32-bit apps, for some special reason?


IIRC it's purely aesthetic, so users can easily tell them apart, or just some sort of convention.


Raymond Chen's big answer video on the subject: https://www.youtube.com/watch?v=qRb6otsHG5c


Now explain the aesthetic of syswow64.


How does that work with regard to privilege separation? It sounds like bypassing nt.dll might be an exploit vector that is easily forgotten.


All of the verification is done on the kernel side of the boundary. All entry points into the kernel via syscalls are marked using static code analysis so that the kernel never trusts a pointer provided from user mode. So even if you sysenter yourself, you can't do anything special.
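(To make that concrete: a minimal sketch of the kernel-side pattern, assuming the WDK headers. The function name and buffer handling are hypothetical; ProbeForRead and the SEH pattern are the real mechanism:)

    #include <ntddk.h>

    /* Hypothetical handler for a syscall/IOCTL that takes a user-mode buffer.
       ProbeForRead raises an exception unless [UserBuffer, UserBuffer+Length)
       is valid user-mode memory, and the __try/__except catches a pointer
       that goes bad mid-copy, so the kernel never blindly trusts the caller. */
    NTSTATUS CopyFromCaller(PVOID UserBuffer, ULONG Length, PVOID KernelBuffer)
    {
        __try {
            ProbeForRead(UserBuffer, Length, 1 /* alignment */);
            RtlCopyMemory(KernelBuffer, UserBuffer, Length);
        } __except (EXCEPTION_EXECUTE_HANDLER) {
            return GetExceptionCode();  /* e.g. STATUS_ACCESS_VIOLATION */
        }
        return STATUS_SUCCESS;
    }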


Permission checks are still done in the kernel.

>It sounds like bypassing nt.dll might be an exploit vector that is easily forgotten.

MSFT might be bad, but not that bad.


Just a note, it's not nt.dll, it's ntdll.dll.


Heh, interesting. Any specific reason for the stuttering? Or just an arbitrary, inconsistent historical decision?


I'll give you a better one. On 64-bit Windows:

- the 64-bit binaries are stored in C:\Windows\System32

- the 32-bit binaries are stored in C:\Windows\SysWOW64


Haha, I think WOW64 means "Windows [32] on Windows 64" which kind of makes sense if you squint at it just right...


That's a bingo.


I'm guessing it's because module names are sometimes used without the extension and "ntdll" looks better in such a list than "nt".


I think it’s because the kernel’s module name is “nt”. It’ll show up like that in stack traces on occasion.


Not sure! My guess is the name would've just felt too short otherwise.


Note that 64-bit kernel-mode code does still have to implement a fair amount of 32-bit support, e.g. for DeviceIoControl (see e.g. IoIs32bitProcess()).
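(A rough sketch of what that looks like in a driver's IOCTL path; the request structs here are hypothetical, IoIs32bitProcess() is the real WDK call:)

    #include <ntddk.h>

    /* Hypothetical IOCTL payload whose layout differs between 32- and
       64-bit callers because it carries a pointer. */
    typedef struct _REQ64 { PVOID Buffer; ULONG Length; } REQ64;
    typedef struct _REQ32 { ULONG Buffer; ULONG Length; } REQ32; /* 32-bit ptr */

    NTSTATUS ParseRequest(PIRP Irp, PVOID Input, REQ64 *Out)
    {
        if (IoIs32bitProcess(Irp)) {        /* caller is a WOW64 process */
            REQ32 *r = (REQ32 *)Input;      /* thunk the 32-bit layout */
            Out->Buffer = (PVOID)(ULONG_PTR)r->Buffer;
            Out->Length = r->Length;
        } else {
            *Out = *(REQ64 *)Input;
        }
        return STATUS_SUCCESS;
    }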


Note that this page doesn't show the full set of syscalls, just a subset of them (seems to be the ones ntdll calls). The Win32k stuff, for one, doesn't seem to be there (edit: it's on this page instead: https://j00ru.vexillium.org/syscalls/win32k/64/), let alone more obscure ones that I expect exist but that I don't know about (like WSL stuff).


The WSL stuff doesn't have system calls AFAIK. All the work is either kernel-mode indirection to the kernel driver or IPC (ALPC, I think, on WSL1?).


Well for starters WSL implements Linux system calls, so it definitely has its own system calls!

But even beyond that, I was thinking of one thing they added for performance reasons (see 3rd bullet): https://github.com/Microsoft/WSL/issues/873#issuecomment-425...

However I'm not entirely sure if it's a concrete "syscall" or if it's implemented as a new option through an existing one. (It's not clear to me how they could reasonably fit this into anything that already exists, but it might.) Though I would be mildly surprised if they didn't implement at least one syscall specifically for WSL...


> Well for starters WSL implements Linux system calls, so it definitely has its own system calls!

Sure, but not in the NT system call table.

> But even beyond that, I was thinking of one thing they added for performance reasons (see 3rd bullet): https://github.com/Microsoft/WSL/issues/873#issuecomment-425....

> However I'm not entirely sure if it's a concrete "syscall" or if it's implemented as a new option through an existing one. (It's not clear to me how they could reasonably fit this into anything that already exists, but it might.) Though I would be mildly surprised if they didn't implement at least one syscall specifically for WSL...

It's neither; it's an API inside kernel space. They didn't expose it through the NT syscall table; instead the Linux translation layer calls directly into NT kernel functions.


Ah I see, okay thanks.


https://devblogs.microsoft.com/commandline/announcing-wsl-2/

WSL is now a VM, so it's really the hypervisor that runs the Linux syscalls on its virtual processor.


I'm aware of WSL2 but we've been talking about WSL1.


How do they deal with the syscall numbers being different from release to release, while still maintaining binary compatibility? From what I can tell, (in Linux at least) the syscall numbers are the thing that needs to stay constant for the kernel to not break user space.


> How do they deal with the syscall numbers being different from release to release, while still maintaining binary compatibility

By not calling syscalls based on their numbers on Windows. That might sound crazy, but actually to me coming from Windows, calling them by ordinals is what sounds crazy. :-) You're supposed to call exported functions in shared libraries (ntdll.dll etc.) that call the syscalls for you.
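(Concretely, that looks something like this in C — a sketch that resolves the export at runtime for clarity, though normally you'd just link against ntdll's import library:)

    #include <windows.h>
    #include <winternl.h>
    #include <stdio.h>

    /* Call an NT system service through ntdll's export table instead of
       hardcoding a syscall number; ntdll is mapped into every process. */
    typedef NTSTATUS (NTAPI *NtQuerySystemInformation_t)(
        SYSTEM_INFORMATION_CLASS, PVOID, ULONG, PULONG);

    int main(void)
    {
        NtQuerySystemInformation_t NtQSI = (NtQuerySystemInformation_t)
            GetProcAddress(GetModuleHandleW(L"ntdll.dll"),
                           "NtQuerySystemInformation");

        SYSTEM_BASIC_INFORMATION sbi;
        if (NtQSI && NtQSI(SystemBasicInformation, &sbi, sizeof sbi, NULL) == 0)
            printf("processors: %u\n", (unsigned)sbi.NumberOfProcessors);
        return 0;
    }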


Ntdll being the de facto syscall ABI is the right answer. But ntdll is also technically subject to change, much of it not publicly documented by Microsoft. So typically a third-party application calls into things like kernel32 or advapi (though Win7 and later split those into different DLLs internally), which wrap ntdll, which does the actual calls by number.

But generally speaking, yeah, the stable interfaces are the C ones, exposed through system DLLs. Linux is unique in developing libc and the kernel separately and in treating the division between the two as a public ABI boundary.


I mean, you could say anything is subject to change. The "documented" <-> "stable" mapping isn't a bijection. CreateFileTransacted was documented for kernel32, but that didn't stop it from being declared deprecated. And CallNtPowerInformation was documented and yet some of it simply doesn't work anymore.

Also, Microsoft didn't necessarily document everything, but they did provide .lib files (and often headers) even for "undocumented" APIs in these libraries. Breaking these APIs would break programs they previously provided SDKs for.

The reality is, so many third-party applications depend directly on ntdll APIs (even Chrome) and some of them literally cannot work with kernel32 stuff (boot-time partition managers, for one), so as far as I'm concerned, it's about as much set in stone as any library could be.


Yeah. Direct dependence on ntdll occurs in the wild and it would break stuff to change it. It can also break MS's own code and potentially create massive engineering hurdles for them that are completely unnecessary. But they have always warned people in documentation that they don't consider it a stable interface to the same extent as Win32.

In my experience a lot of the NT APIs are nicer, better thought out, more direct.


Using the NTDLL calls breaks stuff because NTDLL doesn't understand anything about the overlaid subsystems. NT was originally designed to support multiple subsystems in addition to Win32, like the Interix (POSIX-compliant) subsystem and, I believe, the OS/2 subsystem. I think (I could be wrong on this) that this was made possible by switching the system call table depending upon which process was currently active (doable by switching the system call vector).


IIRC the syscall table doesn't get swapped out. Each subsystem translates its calls to NT API calls. For example, user32.dll and kernel32.dll are part of the Win32 subsystem and eventually end up calling NT APIs in ntdll.dll. It's possible for a process to have no subsystem; these are called native NT processes, and the only DLL loaded by default into their address space is ntdll. csrss.exe is an example of this.


The layer underneath that they build subsystems on is always the system call interface (ntdll), not the syscall numbers themselves.


It's a political thing. Microsoft isn't going to break themselves. It's more a question of: if you go against all the warnings and use NTDLL in the non-recommended way (i.e. having it be a hard dependency w/o fallback), Microsoft can and will fuck you if they don't like what you're doing. Just as recently as a few months ago, I saw a startup in Sunnyvale called Crossmeta basically throw in the towel, because they built their whole product around this one NTDLL fork() hack, which had been around for nearly two decades. Then MS rolled out a security update, and poof, it's gone.


Wow, cool story. The part I find surprising is -- there was a fork() hack on NTDLL that actually worked until recently? I know tons of people had tried to get it to work (including the Cygwin folks and also including me) and faced failure after failure!


Technically the Windows native layer supports fork()-like abilities, but the problem is that the Win32 subsystem and dependent layers (GUI, etc.) don't have fork() copy-on-write capability. So even if you were able to fork a process, you'd still have to deal with duplicating all of the stuff built on top. In fact, the old POSIX-compatible subsystem layer utilized this functionality to actually provide a compliant fork().

Of course this all ignores the fact that Win32 processes are heavier-weight than Linux processes (though I don't know if that's due to Win32 subsystem overhead or not). Look at any benchmark and you see an order of magnitude difference in process creation times. You're much better off creating threads instead.


WSL1 also uses that functionality


Huh? WSL1 doesn't use NTDLL, it implements its own stuff.


Let's consider looking at this in terms of data structures, and ignoring the APIs. Your CPU has this:

    register char (*(*(*(*ram)[512])[512])[512])[4096] asm("cr3");
Your CPU resolves every pointer memory access through those tables under the hood. It's an extremely powerful data structure. It can let you allocate linear memory like a central banker prints money. But if you're a Windows user, then only Microsoft is authorized to access it, and they don't want you having the central banker privileges that are needed in order to implement fork(). That's the way the cookie crumbles.


It seems you replied to the wrong comment?


I wasn't specifically talking about any ntdll call, but in one of their "deep dive" videos on Channel 9 they state WSL1 supports fork() because the NT kernel natively supports it; it just isn't exposed on the normal API surface.


Pretty much all CPUs with MMUs since MULTICS inherently support fork(). So for NT it's a question of prohibiting the behavior. Microsoft's research department even writes papers about how they disagree with fork(). I was truly impressed by the work they did implementing Linux ABI in the NT executive for WSL 1.0. Big change in policy. It's going to be sad to watch it go, but might be for the best. A really well-written ghetto for Linux code is still a ghetto.


This is a weird statement. The MMU doesn't natively do a fork(). The kernel needs to implement the copy-on-write fault handling, i.e. all pages are marked read-only, and the write-fault handler needs to realize a given address is COW and perform a lazy copy.
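(You can watch that lazy copy happen from userspace; a small Linux demo:)

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/wait.h>
    #include <unistd.h>

    /* After fork(), parent and child share physical pages marked read-only;
       the kernel copies a page only when one side writes to it. */
    int main(void)
    {
        char *buf = malloc(1 << 20);
        memset(buf, 'A', 1 << 20);           /* fault the pages in */

        if (fork() == 0) {                   /* child */
            buf[0] = 'B';                    /* write fault -> private copy */
            printf("child sees:  %c\n", buf[0]);
            _exit(0);
        }
        wait(NULL);
        printf("parent sees: %c\n", buf[0]); /* still 'A' */
        return 0;
    }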


I think you got confused. The "native layer" referred to NTDLL in the comment you replied to. I think(?) that's what the earlier POSIX subsystem was built on. That's why I said WSL doesn't use that.


Looked up Crossmeta just now and it appears that they "threw in the towel" just on their fork() emulation, not on the whole project, which does quite a bit more.

http://crossmeta.org


To be fair, that is how most of Linux tends to work, with glib wrappers around syscalls. It's just that the numbers are considered part of the stable interface, hence people can write their own wrappers.

It's part of Linux saying 'we don't break userspace', with the syscall numbers being how userspace talks to kernelspace.


Nitpick:

glibc is the GNU C library and wraps syscalls, among other things.

glib is a bunch of cross platform interfaces and data structures that grew out of Gtk+.


Given that glibc isn't part of the OS, and hence you can statically link to it, it can't be part of the stable interface. glibc isn't a thing as far as the OS is concerned.

It seems to me this also means Linux is more limited in what it can do in breaking syscalls too. On Windows, if you load any shared library (including ntdll) and an exported function isn't there, the loader can produce an error for you telling you that a function you need isn't there. On Linux... you're calling the syscall directly, so there's nothing that can stand between you and the syscalls to perform a check or anything. Overall it seems to me like coding against direct syscall numbers is a disadvantage on almost every front.


So where this affects things is when you want to take advantage of a newly-added syscall, or a newly-added feature of an existing syscall.

In the second case, both systems behave the same - if your program runs on an older kernel, the function entry point exists (either the syscall or ntdll symbol) but when you call the function to request the new feature, it returns a runtime error.

In the first case, Linux behaves identically to the second case - you get a runtime error (ENOSYS) - whereas Windows will produce an error at load time.

Typically what is done is that whatever is wrapping the syscall falls back to an alternate implementation for older kernels, if that's possible. If it's not, the program can either continue without that feature if it wants, or fail with a message that a newer kernel is required.

I don't feel that there's really a significant difference in the two approaches - wherever you draw the backwards-compatibility line you're going to have to do the same sort of work to maintain it.
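(A typical shape for that fallback on Linux, using getrandom(2) as the "new syscall" — assumes headers that define SYS_getrandom:)

    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    /* Try the newer syscall first; a kernel that predates it returns
       ENOSYS, and we fall back to the old interface. */
    static int get_random_bytes(void *buf, size_t len)
    {
        long n = syscall(SYS_getrandom, buf, len, 0);
        if (n == (long)len)
            return 0;
        if (n >= 0 || errno != ENOSYS)
            return -1;                            /* real failure */

        int fd = open("/dev/urandom", O_RDONLY);  /* pre-3.17 fallback */
        if (fd < 0)
            return -1;
        ssize_t r = read(fd, buf, len);
        close(fd);
        return r == (ssize_t)len ? 0 : -1;
    }

    int main(void)
    {
        unsigned char b[16];
        if (get_random_bytes(b, sizeof b) == 0)
            printf("got %zu random bytes\n", sizeof b);
        return 0;
    }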


> Given glibc isn't part of the OS, and that hence you can statically link to it

Unless you want to use any other library on the OS, for example 3D graphics drivers which tend to have hardware-specific parts in userspace.

> Overall it seems to me like coding against direct syscall numbers is a disadvantage on almost every front.

Which is why no one does it unless they have a very good reason.


> Unless you want to use any other library on the OS, for example 3D graphics drivers which tend to have hardware-specific parts in userspace.

Not sure what you mean regarding this and glibc.

> Which is why no one does it unless they have a very good reason.

I think you changed the meaning of my sentence to be able to make this reply. I was referring to what Linux does re: syscalls vs. library exports, not hard-coding syscall numbers in the source code.


> By not calling syscalls based on their numbers on Windows. That might sound crazy

Not really. Only on Linux are direct syscalls an official ABI. On most Unices (e.g. Solaris, the BSDs, OS X) you must dynamically link and go through the platform's libc.


I don't know about Solaris, and I know macOS had kernel changes which broke Go, but FreeBSD has a stable kernel interface and unstable userspace. (So does Linux, for that matter--ever try running a program linked to a newer version of glibc?)


> So does Linux, for that matter--ever try running a program linked to a newer version of glibc?

That doesn't mean the interface isn't stable, or at least that it's no less stable than the kernel interface or literally any interface of something that is still being developed.

It would be nice if glibc provided an easy way to target the ABI of an older version (which is compatible with all newer versions), but you can do that the manual way by compiling and linking against that old version.
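(On GNU toolchains you can also pin individual symbols to old versions with .symver, without building against an old glibc — a sketch, assuming an x86-64 glibc where GLIBC_2.2.5 is the baseline version of memcpy:)

    #include <stdio.h>
    #include <string.h>

    /* Bind memcpy to the old baseline version at link time, so the
       resulting binary also runs on glibc builds older than the one
       we compiled against (memcpy got a new version in GLIBC_2.14). */
    __asm__(".symver memcpy, memcpy@GLIBC_2.2.5");

    int main(void)
    {
        char dst[6];
        memcpy(dst, "hello", 6);
        puts(dst);
        return 0;
    }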


So that means ntdll.dll and family can never be statically linked? I guess that makes sense if the only responsibility of the dll is to expose syscalls, but does ntdll expose any other functionality? Functionality that may want to be overridden by a different implementation, etc?

In the Linux world, since syscalls are exposed by libc, other non-C runtimes (for example, Go) end up needing to make system calls on their own without the libc wrappers. Is such a system just not a thing in Windows?


> In the Linux world, since syscalls are exposed by libc, other non-C runtimes (for example, Go) end up needing to make system calls on their own without the libc wrappers.

They didn't need to, they wanted to, and it can break at any moment on any platform other than Linux. In fact a few versions back they finally back-pedalled and started going through libSystem on OSX.

> Is such a system just not a thing in Windows?

It's not a thing anywhere other than Linux. Go's developers were told time and again that they were supposed to dynamically link to and go through the platform's libc on OSX and BSDs.

They finally relented on OSX after Go broke multiple times during the Sierra beta, but IIRC that's not the case yet for other BSDs (I was thinking it was also the reason why Go 1.11 is required on OpenBSD 6.4, it is but not entirely: OpenBSD requires that stack memory be mapped with MAP_STACK or syscalls will terminate the calling process, so that's an artefact of Go doing userland stacks / threads rather than doing raw syscalls).


> In the Linux world, since syscalls are exposed by libc, other non-C runtimes (for example, Go) end up needing to make system calls on their own without the libc wrappers. Is such a system just not a thing in Windows?

Linux is pretty much the only operating system that considers the actual system call ABI to be stable. The BSDs and OS X all want you to use the platform libc instead to access system calls, while Windows uses kernel32.dll et al. as its stable interface.


> Linux is pretty much the only operating system that considers the actual system call ABI to be stable.

What about MS-DOS? :P


> In the Linux world, since syscalls are exposed by libc, other non-C runtimes (for example, Go) end up needing to make system calls on their own without the libc wrappers. Is such a system just not a thing in Windows?

Yeah, it's not a thing. You just import functions from ntdll the same way you import from anything else. The OS program loader reads your program's import table and loads them for you.

> does ntdll expose any other functionality? Functionality that may want to be overridden by a different implementation, etc?

It exposes lots of other stuff (DLL loading, memory heap, etc.) that you may want to do differently, but you can hook those at runtime if you really want to. It's not as convenient as just linking another library, but on the other hand, it makes it easier to replace them at runtime instead of at compile-time -- say, if you want a shared library to override a syscall.


Go doesn't need to bypass libc on Linux. There's no technical reason to do that. Go doesn't bypass libc on other operating systems. Go going straight to the kernel is just a fashion statement, a symbolic repudiation of C, not something that makes actual life better for users.


> There's no technical reason to do that.

AFAIK, there is: Go likes using tiny stacks, while libc expects normal-sized stacks. Switching to a reasonably-sized stack and back has a cost, which the Go developers want to avoid.


Hm, I thought there was some issue with glibc not wanting to provide wrappers for certain Linux syscalls. What if the Go runtime wanted to use one of those syscalls for some reason? It seems like then it'd be forced to go straight to the kernel.


Yes, it'd have to bypass libc for those system calls, just like any C program would. How is that an excuse for bypassing libc for things like read(2)?

No, making system call interposition harder isn't a security feature. A user who can LD_PRELOAD you can already do arbitrary things to your program.


Sometimes when folks talk to the kernel, they want to know they're talking to the actual kernel. glibc by design lets random stuff on the system MiTM symbols like read (even security critical ones like getrandom) and the LGPL effectively forbids many projects from using static linking as an escape hatch. The Chrome guys had to write a whole system call library from scratch because of it. On Windows it tends to get much more intense, where dynamic symbol interposition is brought to its logical conclusion, and you've layers upon layers of antivirus hooks and spyware sitting between you and the system. There, pretty much the only thing you can do is just SSL the heck out of everything.
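(For reference, the interposition being described is about this easy — a minimal read() shim, built with something like gcc -shared -fPIC -o shim.so shim.c -ldl and activated via LD_PRELOAD=./shim.so:)

    #define _GNU_SOURCE
    #include <dlfcn.h>
    #include <stdio.h>
    #include <unistd.h>

    /* Every call to read() in the target program now lands here first. */
    ssize_t read(int fd, void *buf, size_t count)
    {
        static ssize_t (*real_read)(int, void *, size_t);
        if (!real_read)   /* look up the "real" libc read behind us */
            real_read = (ssize_t (*)(int, void *, size_t))
                            dlsym(RTLD_NEXT, "read");

        fprintf(stderr, "[shim] read(fd=%d, %zu bytes)\n", fd, count);
        return real_read(fd, buf, count);
    }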


> glibc by design lets random stuff on the system MiTM symbols like read

Can you elaborate on this? I'm aware of things like LD_PRELOAD, but if an adversary controls the environment, he could just change PATH to point to a rooted version of chrome anyway. That also has to do with the capabilities of the system linker, not glibc.

>the LGPL effectively forbids many projects from using static linking as an escape hatch

The LGPL explicitly allows static linking, that's the main difference versus the GPL.


Are you in control of your network connection? Packets aren't that much different from calls between DSOs. It's just a big onion ring. If you make a conscious decision to trust Chrome with your data, then does Chrome have a moral obligation to ensure that choice, in reality, ends up being You<->Chrome, rather than You<->SysAdmin<->Comcast<->NSA<->Hacker<->Chrome? Or maybe they just want to protect IP holders. Or maybe they just don't want folks filing bugs about performance when the root cause turns out to be some poorly written system library. At the end of the day, it's all about minimizing unknowns.

LGPL allows dynamic linking. The only way it'll allow static linking is if your releases are accompanied by tools for decompiling and recompiling your binaries with the LGPL bits interchanged. But that actually might not be allowed either, since GCC 4.3+ headers and runtimes (e.g. libstdc++) kind of prohibit you from changing binaries on your own, after they've been compiled.


> Are you in control of your network connection? Packets aren't that much different from calls between DSOs.

The difference is that calls between dynamic libraries are on the same side of the "airtight hatchway" (as explained by Raymond Chen at https://devblogs.microsoft.com/oldnewthing/20060508-22/?p=31...), while there's a security boundary at the network connection.


Imagine the modern PC as being just another substrate of this great battlefield where big corporations duke it out with each other and the individual owners are becoming less and less relevant to the security story. It's unpleasant. It seems to get raunchier each year. I've even seen household names retool things like graphics drivers to inject data harvesting code into command-line programs at runtime. Then the big companies who don't play dirty like that, usually react by demanding greater authoritarianism in tools, until the rest of us devs find ourselves smothered with -fpic aslr retpoline code-signed bubblewrapping, which isn't so much keeping us safe, but rather keeping big company A safe from big company B. Everything that the guy you linked said is correct. But locks aren't that relevant anymore, once the wolves have been invited in to live.


You've used a lot of rhetoric, but you haven't described a threat model. Both this comment and your previous one, the GP, use language of moral duty when talking about a specific software security measure. That's a category error.

What specific threat is bypassing libc supposed to protect against? I don't think you have an argument here. As the poster to whom you're replying mentioned, anyone who can do symbol interposition already has full control over your program.


Perhaps you could share specific details on the use cases that dynamic symbol interposition is intended to serve? Because the only one I think I've seen so far is setbuf(1) and I can imagine more than a few alternatives that could have been considered.


Enumerating specific interposition use cases has nothing to do with the point I made, which is about threat modeling. You still have not articulated any specific threat model; you've instead tried to distract the conversation and move it in a different direction.


You're welcome to complain about how I talk, but this isn't about me. This is about serving the user and making sure tech behaves the way they expect it to. We need to raise awareness of infrastructural weaknesses that folks may need to consider, in order to stay safe.

I consider symbolic interposition to be one of those things. Linux users might not feel comfortable about the fact that glibc currently makes it so easy to intercept system calls to Linux Kernel's RNG, that your upstream dependencies might actually compromise your key generator unintentionally.

Don't you think that's worth discussing? We could also talk about the concerns surrounding Layered Service Providers, which is another great example of userspace libraries misrepresenting APIs that are generally believed to be talking to the operating system.


It's not "how [you] talk". It's that you're not making an actual point. There is no "safety" threat here, nor is there an "infrastructural weakness[]". Fretting about "safety" without identifying a threat model is vacuous. What is, specifically, your threat model? What security guarantees can a program make only through bypassing libc? Why aren't the vast majority of programs --- the ones that call libc --- vulnerable to this ineffable threat?

Users get to control how programs execute. They can interpose symbols. They can disassemble programs. They can run programs under a debugger. They can modify the kernel. They can run programs in a VM. A program can't detect this intermediation; nor does it have any right to do so. A program has no business breaking random OS visibility and control features. We generally call the ones that try malware.

Bypassing libc when making system calls doesn't give a program any insurance against environmental changes. It just inconveniences users while providing no "safety" guarantees. If you want full control over a program's execution environment, ship an appliance.


> But that actually might not be allowed either, since GCC 4.3+ headers and runtimes (e.g. libstdc++) kind of prohibit you from changing binaries on your own, after they've been compiled.

Can you elaborate on this?


Read the GCC Runtime Exception v3.


glibc has a fallback in the form of the syscall() function, which can be used to invoke any syscall without the need for a wrapper, e.g.

    syscall(SYS_getcpu, &cpu, &node, NULL);
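A complete, runnable version of that example (assuming headers that define SYS_getcpu):

    #include <stdio.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    /* syscall(2) is glibc's escape hatch: it marshals the arguments into
       the kernel's calling convention for whatever architecture this is. */
    int main(void)
    {
        unsigned cpu, node;
        if (syscall(SYS_getcpu, &cpu, &node, NULL) == 0)
            printf("running on cpu %u, numa node %u\n", cpu, node);
        return 0;
    }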


How is that different from just calling it yourself?


It means you don't have to deal with the kernel's calling conventions, which are architecture-specific and require inline assembly.
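(For comparison, here's roughly what "calling it yourself" looks like on x86-64 Linux — note the register constraints and clobbers you'd otherwise have to know about:)

    #include <sys/syscall.h>
    #include <unistd.h>

    /* Raw x86-64 Linux syscall: number in rax, args in rdi/rsi/rdx;
       the syscall instruction itself clobbers rcx and r11. */
    static long raw_write(int fd, const void *buf, unsigned long len)
    {
        long ret;
        __asm__ volatile ("syscall"
                          : "=a"(ret)
                          : "a"(SYS_write), "D"((long)fd), "S"(buf), "d"(len)
                          : "rcx", "r11", "memory");
        return ret;
    }

    int main(void)
    {
        raw_write(1, "hello\n", 6);
        return 0;
    }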


Go actually stopped making syscalls directly on macOS, since the interface is subject to change: https://golang.org/doc/go1.12


It's exactly the same deal on OS X. There is no stable system call interface other than the C library, and it's one of the main reasons they don't allow static linking there either.


Windows, like most other operating systems, doesn't have a stable user-kernel boundary. Instead the stable interface exists at the C level in libraries used by every process. This is no different from the situation in macOS (libSystem.dylib) or the various BSDs. Linux is the outlier because Linux is just a kernel; without an integrated C library, it by necessity has to keep the syscall numbers constant. Most other operating systems are free to change syscall numbers from release to release.


Application code does not use those syscalls directly, but functions from system libraries wrapping them, which have a stable interface.


Over here in Linux world we're all super thankful that the kernel community is big on interface stability and compatibility (and not just backwards), because pretty much nobody else is.

Lots of folks don't realize how important that is. If the Linux kernel space were to break as often as the Linux desktop, Linux's market share on the server and in the embedded space would likely barely exceed that of TempleOS.


The difference really doesn't have to do with backward compatibility (which Windows does to arguably an even greater extent than Linux) but with where the stable interface boundary is. Windows puts the boundary in ntdll, Linux puts it at syscalls.


Well, the stable kernel interface matters on Linux because the user space doesn't give a damn about backwards compatibility. You might be fine if you only rely on the system libc. But every other library that you don't ship with your binary is essentially out to break your stuff. These libraries may even differ between distributions even though they have the same version numbers just because they were built using different build configurations. This is why things like the Steam Runtime exist.


> But every other library that you don't ship with your binary is essentially out to break your stuff.

There are many more lower-level libraries that provide binary backwards compatibility: libasound, libX11, libGL, ...

Sure, you're not going to be able to assume that /usr/lib/libpng.so.1 will provide the functionality that you want, but that isn't any different from a png1.dll some program installed in system32.


It's funny that you mention libpng. That is one of those libraries that is developed so conservatively that I'd actually trust it to provide the features a binary that I ship relies on. I rarely (if ever) needed the more obscure things like 16 bit per channels or ICC profiles.


That has nothing to do with kernel space vs user space and everything to do with different projects having different attitudes. The comparison is to the case where Linux instead ships a special "linuxso.so" that all processes and libraries need to (dynamically) link against in order to get a stable interface.


That attitude that is so prevalent in Linux user space is what this thread was getting at. This is why the syscall interface being maintained as a stable interface is so important on Linux compared to other operating systems.


Backward compatibility is important. It's a weird Linuxism to assert that it's the kernel that has to provide the compatibility interface. There's no reason the ring-zero transition has to be the stable part.

The real reason Linux's ABI boundary is the kernel is just that the kernel and glibc are separate and frequently antagonistic projects. Linux ships the org chart.


Even if they were the same project it wouldn't matter, because the promise is that your completely statically-linked binary will continue to work without recompilation on tomorrow's kernel if it works on today's.


Otherwise stated as "Linux supports completely statically linked binaries". Apart from aesthetics, I don't see the limitation in not supporting static linking to libc.


Huh? What you should care about is whether your program can run on future systems. Whether you call your program static or not is irrelevant.


It's relevant because if your program is statically linked then it doesn't matter whether or not future changes at the kernel boundary are made backwards-compatible by compensating changes in libc (as is possible if you release them together), because your statically-linked binary can't pick up those libc changes.


[flagged]


What? That makes no sense. The question here was why static linking is relevant to the point about whether or not having libc release together with the kernel means you could move the compatibility line to the libc interface, and I was answering that question.

Forbidding static linking libc is fine if you do that from the beginning. Linux didn't, so - whether or not you approve - there's statically linked binaries out there that must be supported. It's not about calling users stupid, or a discussion about what users should or shouldn't do - it's a discussion about what the kernel can do, today, given the backwards compatibility constraint and the reality on the ground. (Personally I am completely in favour of dynamic linking, but that's not the point).


> What? That makes no sense. The question here was why static linking is relevant

No it is not.

1. ninkendo asked how syscalls worked

2. detaro replies that NT software uses ntdll and never does raw syscalls

3. alxlaz non-sequiturs that "the kernel community is big on interface stability and compatibility"

4. quotemstr points out that backwards compatibility does not in any way, shape or form require a stable kernel ABI, that's just a weird linuxism (because it ships as a kernel alone rather than a system)

5. you apparently decide to bring the entire thing on an irrelevant tangent completely missing the point


I wasn't responding to ninkendo, detaro or alxlaz. I was responding precisely to quotemstr's:

The real reason Linux's ABI boundary is the kernel is just that the kernel and glibc are separate [projects]

...by pointing out the very real issue that even if they were the same project today, they'd still have to maintain compatibility at the system call layer, because that ship has sailed.


To be more precise, then, the Linux kernel is the ABI boundary because during Linux's fluid and formative period, the kernel and libc were separate projects. I agree that it's mostly epoxied in place now, modulo my lkml libsyscall proposal from a while ago.


But Windows has excellent back-compatibility!


The C stubs for NT system calls are not included as part of libc. They are implemented in ntdll.dll, which is implicitly mapped into every process (except for the new picoprocesses as used by WSL).


One odd bit of trivia: the Windows system call mechanism supports up to four different tables. The first two are what you'll usually see - the first is the basic kernel services, and the second is for Win32k, the graphical subsystem. The final two slots are up for grabs and historically could be used by drivers to implement custom system calls.

The system call number determines what table is used. Calls in the 0x0-0xFFF range are handled by the first table, 0x1000-0x1FFF by the second, and so on.

As far as I know, the ability to add another system call table was only ever used by the IIS web server kernel-mode component (spud.sys).
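(In other words, the dispatcher's decoding amounts to something like this sketch:)

    #include <stdio.h>

    /* How an NT system service number splits into a table index and an
       index within that table (per the ranges described above). */
    int main(void)
    {
        unsigned num   = 0x1002;             /* e.g. a Win32k service */
        unsigned table = (num >> 12) & 0x3;  /* 0=core, 1=win32k, 2-3=free */
        unsigned index = num & 0xFFF;        /* low 12 bits */
        printf("table %u, index %u\n", table, index);
        return 0;
    }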


I don't doubt your interesting bit of trivia, but I couldn't find anything online about it. Do you have a source for this? I would certainly love to read more about it.


There's a bit more information in the indispensable Windows Internals book (quoting from the 4th edition, which is a little old now):

> As you’ll see in Chapter 6, each thread has a pointer to its system service table. Windows has two built-in system service tables, but up to four are supported. The system service dispatcher determines which table contains the requested service by interpreting a 2-bit field in the 32-bit system service number as a table index. The low 12 bits of the system service number serve as the index into the table specified by the table index.

[...]

> A primary default array table, KeServiceDescriptorTable, defines the core executive system services implemented in Ntoskrnl.exe. The other table array, KeServiceDescriptorTableShadow, includes the Windows USER and GDI services implemented in the kernel-mode part of the Windows subsystem, Win32k.sys. The first time a Windows thread calls a Windows USER or GDI service, the address of the thread’s system service table is changed to point to a table that includes the Windows USER and GDI services. The KeAddSystemServiceTable function allows Win32k.sys and other device drivers to add system service tables. If you install Internet Information Services (IIS) on Windows 2000, its support driver (Spud.sys) upon loading defines an additional service table, leaving only one left for definition by third parties. With the exception of the Win32k.sys service table, a service table added with KeAddSystemServiceTable is copied into both the KeServiceDescriptorTable array and the KeServiceDescriptorTableShadow array. Windows supports the addition of only two system service tables beyond the core and Win32 tables.


Perhaps here [0] ?

Starting with IIS 4.0, Microsoft added a kernel-mode support driver (SPUD.SYS). This driver also calls the KeAddSystemServiceTable function to add its own system services. This fills an entry in the third array element of KeServiceDescriptorTableShadow. Hence, its services will start from 0x3000.

[0] https://community.osr.com/discussion/20626/system-service-di...


Sorry if this is a dumb question I could easily find on Google (been a webdev for most of my career), but I'm curious why some syscalls exist on Vista SP1, for example, but don't exist on the versions before and after it.

Edit: I meant SP0, not SP1, sorry. It was this syscall: NtListTransactions


It isn't a dumb question. I believe the reason is the following:

Leading up to any Windows release, Microsoft has a whole bunch of teams working on different features. Some of those features make it into the release; other features don't make it and get cut. What sometimes happens is that a feature makes it in, but then problems are discovered in testing, and it gets pulled out again at the last minute.

When a feature is removed late in the game like that, you want the lowest-risk removal mechanism possible. One way of doing that is to actually leave the feature in the code, but leave it undocumented, and hope nobody discovers it and uses it (an approach avoided nowadays due to the risk of exploitable security bugs in the buggy API, but in the past it was more common). Another way is to leave its APIs in place, and either stub out their implementations (to always return an error), or even hide the actual implementation behind a #define that is turned off in the shipping copy. That way, you avoid making any changes to DLL export tables, the system call table, etc., which you worry (even just out of an abundance of caution) might have some downstream negative effect, but also don't have to worry about anyone discovering the broken feature and trying to use it.

Then, in the next release, you have a choice – sometimes they will fix the issues with the feature and put it back in; other times priorities have changed and the remnants of the feature (such as API stubs) get removed altogether. That is why Windows DLLs sometimes export undocumented APIs that don't do anything.

(I've never worked for Microsoft, so this is not based on any internal info, just an inference from observing how Microsoft and other vendors do things.)
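(Purely as a hypothetical illustration, a pulled feature might survive as nothing more than a stub like this — every name here is invented:)

    #include <windows.h>

    typedef LONG NTSTATUS;                   /* normally from ntdef.h */
    #define STATUS_NOT_IMPLEMENTED ((NTSTATUS)0xC0000002L)
    #define ENABLE_PULLED_FEATURE 0          /* flipped off late in the release */

    /* The export stays in the DLL so no export/syscall tables shift,
       but the body is stubbed to fail until the feature ships for real. */
    NTSTATUS NTAPI NtSomePulledFeature(PVOID Args)
    {
    #if ENABLE_PULLED_FEATURE
        return NtSomePulledFeatureWorker(Args);
    #else
        UNREFERENCED_PARAMETER(Args);
        return STATUS_NOT_IMPLEMENTED;
    #endif
    }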


Could you give an example? I don't see anything that's on Vista SP1 but not on SP2.


Sorry, in my case I saw this on SP0, not SP1: NtListTransactions


That's probably part of TxF (aka Transactional NTFS), which got deprecated basically as soon as it was released (though I think it's still used internally, e.g. for System Restore), so it's likely it got moved around somehow, or the TxF API got reimplemented in userspace.


I don't believe their deprecation claims on that. It's too deeply ingrained into the OS for them to ever remove it, and too useful and difficult to replace. Really, it just doesn't have enough users, is all. And there's more to transactions than just the file system (TxF) so I'm not even sure it's related to this either.

Also you can't implement TxF in userspace. It has to detect conflicts with other applications and roll back in the case of an unsuccessful commit (power loss etc.) before the file system is used again. Any userspace implementation would leave stuff in a corrupted state until it's re-run.


> I don't believe their deprecation claims on that.

I don't either, although the API is kind of overkill for most use cases so I'm not too surprised they discourage people from using it.

> And there's more to transactions than just the file system (TxF) so I'm not even sure it's related to this either.

True, I just assumed it was related to the txfs_list_transactions ioctl.

> Also you can't implement TxF in userspace. It has to detect conflicts with other applications and […]

I think the Kernel Transaction Manager already takes care of that. I think??? TxF could be implemented as a userspace library on top of KTM, but I'm not particularly familiar with either facility. Though if it was possible perhaps they would've done it that way in the first place, since TxF uses KTM regardless.

I wonder what did happen to it between SP0 and SP1.


I don't know what it does (seems it was never documented? possibly so it could be removed), but for example, I do see NtEnumerateTransactionObject, which may have overlapped with its functionality? Or maybe NtQuerySystemInformation or another function does similarly.


I would suspect a brand new idea [thus not appearing earlier] lost traction or relevance and didn't make the cut to migrate into the next revision.


Interesting that on a Linux machine

  man syscalls \
  | grep -Po '\w+(?=\(2\))' \
  | sort \
  | uniq \
  | wc -l
gives the same number, 481, as the number of rows in that table. Though, the grep includes man pages that don't correspond to a syscall, like intro(2). It's just a curious coincidence.


Could someone provide examples of what to type in the field, with perhaps the relevance? (I'm not a Windows user, but this looks cool enough to test)


Has anyone ever done a mapping of which regular API functions map to which syscalls?


Huh. Why is this so slow? According to Chrome, it took 704 milliseconds from clicking the "show" button on "Windows XP" to displaying the results. No network calls at all. A whole 525ms of script evaluation. Modern computer, thousand row table, wouldn't expect it to take that long.


Not sure, but innerHTML seems to be taking a long time, in a call that loops over the results of getElementsByName. Given that getElementsByName doesn't return a real array, is it possible internally it's doing bookkeeping to make sure the iteration is still over the same set of nodes? Or alternatively maybe modifying innerHTML is just slow.


Not an answer exactly, but I did some work with large HTML tables for an analytics project, and I did learn browsers struggle with larger tables.

After doing a number of optimizations, our 2000-row table took 1.2 seconds for Chrome to insert the elements (purely the appendChild call). That was down from around 2.6 seconds before trimming down the HTML.


Works fine for me on an average laptop with latest stable Firefox.


Try "Show All".


Seems instantaneous on Firefox for me.


Really?! It isn't instantaneous for me, even on Firefox 62 (Windows). Takes some half a second or so.


I'm on Firefox 68.0.2 on Windows 10.

I can click "Show All" and "Hide All" over and over and there is hardly any delay. Maybe you should upgrade to the latest and see if it makes a difference.


Interesting, thanks for mentioning this! It seems to be something with my profile, because a fresh profile is also definitely faster (maybe 200-300ms just eyeballing). The latest unbranded build I got [1] also seems about the same if not faster (maybe around 200ms). Definitely not "instantaneous", but I'll definitely track this down.

UPDATE: At least 3 of the culprits seem to be MutationObservers from my own extensions (as in, from extensions that I have written for myself) that have {childList: true, subtree: true}! I guess I will have to optimize them. :-)

[1] https://queue.taskcluster.net/v1/task/Uht_zkkPThu384OAHNPwdA...


200ms on Firefox clicking show all.


Table layout is generally complex and slow; only hiding the text with CSS feels very fast.

Once layout is involved, even without looping to set innerHTML, it becomes a lot slower.


Windows NT has been around since what, 1993 or 94?

Why is this even news? Doesn't Microsoft document this stuff? How does anybody use a system that doesn't have this documented?

PS http://stratigery.com/nt.sekrits.html


Because developers aren't supposed to use them. The documented interface is ntdll or whatever.


I also ask myself this question. How can anyone use an OS where the creators deliberately hide some of the documentation?

I would love to read up on Windows internals, but there are no real resources out there. There are a few books that at first glance appear to cover it, but then just deal with the application layer and up, to point-and-click.

This is why I don't use windows anymore. I don't ever again want a job where I have to debug why old program P stopped working after an update. I prefer to be able to learn stuff.


Oh come on!

https://www.amazon.co.uk/Windows-Internals-Part-architecture...

Certainly all systems have undocumented stuff that is left undocumented because users are not supposed to use it. If they did, it couldn't be changed.


A book... that costs money.

Compare that with the man pages of some decent BSD. Or even Linux.

But yes, I have contemplated buying one of those Windows Internals books. I will probably buy it just because of your comment. It's not that expensive.


System calls seem like the very definition of what should be documented for users. Particularly if you want someone to write compilers for your operating system.


The stable program-kernel interface is not the kernel syscalls, but the library API. That's exactly the same as every other non-Linux OS out there, including other Unixes and other free OSes.


Pretty sure that isn't true of FreeBSD, NetBSD or OpenBSD. It most certainly wasn't true of any Unix.

They all have files that end up as /usr/include/sys/syscall.h and define well-known numerical constants for each (documented!) system call.

https://github.com/openbsd/src/blob/master/sys/sys/syscall.h

https://github.com/lattera/freebsd/blob/master/sys/sys/sysca...

http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/sys/syscall.h?re...

Oh, look! They even share the exact same numerical values for the old school system calls. #1 is exit(), #2 is fork(), so on and so forth.

Gosh, on my linux laptop, here in /usr/include/asm/unistd_64.h is pretty much the same numbers, but objectively hidden a little bit more than in the direct Unix descendants. I interpret this as "less documented than in Unixes".

Wow, even Minix source code has the same numbers: https://minixnitc.github.io/posix.html

I wonder why that is? Maybe it's because the system call numbers are in fact documented, part of the POSIX API, and descend from Bell Labs Unix source code. Here's V7's list of system call numbers as proof:

https://minnie.tuhs.org/cgi-bin/utree.pl?file=V7/usr/include...


Documenting syscalls does not mean a guarantee of stability. I can't recall any Unixes except Linux which have committed to maintaining syscalls, which means this interface cannot be relied on.

e.g.

OpenBSD deletes "relic from the past" syscall:

https://cvsweb.openbsd.org/src/sys/sys/syscall.h?rev=1.199&c...

Solaris documenting deleted syscalls (note however the libc interface is guaranteed):

https://docs.oracle.com/cd/E23824_01/html/E22973/gkzlf.html

MacOS X is so unstable that even go developers had to go through libSystem:

https://golang.org/doc/go1.12


> MacOS X is so unstable that even go developers had to go through libSystem

Just to be clear to anyone else, the Mac OS X syscall interface is guaranteed to be unstable by Apple. Going through libSystem is the only way.


How exactly does the Windows design here prevent writing compilers? The API is stable, it just isn't on the syscall level.

Please educate yourself about the Windows architecture.



