
You can repeat "C is a minimal abstraction over assembly" as many times as you want, it doesn't make it true.


I’m just old enough to remember when C and Pascal were called high-level languages.

“C is actually assembly” was quite a plot twist from there.


You can thank Rob Pike for that, about a decade ago he wrote a Google Plus post arguing that one of the things C brought to the table was that it essentially was portable assembly compared to most other high level language projects at the time. I guess the tiny sliver of nuance that the "essentially" added was quickly lost.


Technically that's true because of the "C virtual machine", but pragmatically, C is still the "lowest-level high-level programming language" at least among the popular programming languages (arguably only Forth is lower level, but Forth isn't exactly mainstream).

(and I'd argue that C is closer to assembly than assembly is to what's actually happening inside the CPU - assembly itself is a high-level abstraction layer that's still pretty close to C, which isn't all that surprising because the two probably developed in symbiosis over time - especially when you look at all the non-standard language extensions in various C compilers)


You're thinking of the C abstract machine, not a virtual machine. Abstract art is when a painted blue circle is really about the feeling of sadness at losing somebody close to you - virtual art is when somebody persuades you a crappy GIF of a monkey is worth a million dollars.

And no, it's just not usefully true to model things this way. The C abstract machine is pretty weird even compared to a PDP-11, and your modern computer is nothing like a PDP-11.

C was intended to be efficiently implementable, so that's nice, but it has numerous defects in practice in this regard, because it pre-dates a lot of discoveries about how to implement programming languages.

The machine doesn't have types. At all. They're just not a thing. C has types. They're not very good types, and they're poorly implemented, but they are definitely types. Several other languages from that era don't bother, C does because it's a "high level language" and you'll do better embracing that understanding than trying to pretend it's assembler.


Assembler has types: bytes, words, floats, addresses, even strings and "functions". They are easily worked around by design though, in similar ways in assembler and C.


C is an abstraction over assembly. It is also minimal compared to prolog.


So is JOVIAL, FORTRAN 66, PL/S, BLISS...


May I suggest you give arguments why it is not (and why we should care about those arguments from a practical standpoint), and what language should better earn the title?


Others have already made a few in response, but for reference, this ACM queue article [1] is a good place to start. It has been widely discussed both on HN and elsewhere, and a simple search of the title can bring up several counter-arguments etc. if you're interested and want to know more.

> what language should better earn the title?

None.

1: https://queue.acm.org/detail.cfm?id=3212479


This article states that "C is not a Low-Level Language", not that it is not a thin abstraction over assembly. The arguments in the article could as well be used to make a point that "Assembly is not a Low-Level Language".


Could you specify the main ways it isn't? I have limited experience in both but I haven't seen much that suggests otherwise


Many features of C do not directly correspond with most modern assembly languages. You cannot predict the exact assembly it will generate without knowing a lot of details about your compiler and platform, and even then it's often iffy. It seems like a bit of a leap to call something "a minimal abstraction" if you can't even correctly describe how an operation in the abstraction corresponds to operations in the lower level.


That's mainly the result of optimizer passes, C itself doesn't have much to do with it.

Assembly languages actually haven't changed all that much since the 70's, but compilers have improved a lot. The output of early C compilers did indeed match the source code quite closely (and not just on the PDP-11), but even today that's true if you disable optimizations (and even with optimizations is usually pretty straightforward to map the C source to the assembly listing - if you're somewhat aware what optimizer passes in modern compilers are doing).

Of course CPU ISAs are already human-friendly abstractions over what's actually happening down in the hardware.


C has everything to do with optimizer passes, the language definition is what allows them to happen! The fact that you get about what you'd expect on -O0 is merely incidental, the specification does not afford you this.


It's true and false at the same time. The operations C gives you map 1-to-1 with assembly. Given some C code you can quite accurately predict which loads/stores will be elided by the compiler and what the resulting assembly will be.

I can't name another language for which this is true.

I get that you're hinting at the insane level of undefined behaviour enforcement by compilers, but I don't think it matters all that much once you understand how optimizing compilers work.


> The operations C gives you map 1-to-1

It does not. Implicit casting, inlining, volatile, args passed via registers vs stack etc can significantly change what you expect to be generated.


And conversely you don’t have access to a few things assembly can do: arbitrary stack access, PC register, flags. Some operations like bit rotation, zero bit counting, or fancier addressing modes have informal code patterns with a hope they’ll optimize right, but nothing guaranteed in the standard.


Pretty much a matter of optimization, isn't it? Try disabling them. But I take it that this was never the point anyway. I think the point is that the representation of language objects and the runtime are rather straightforward compared to many other languages.


Only one of those (inlining) is an optimization. Two are language features (implicit casts and volatile) and the other is a calling convention (passing arguments on registers vs. stack).


Calling conventions aren't (mainly) a C feature either though, but are defined by the ABI specified for a specific OS/CPU combination (and all languages which want to talk to system APIs need to implement that ABI, not just C).


> It's true and false at the same time. The operations C gives you map 1-to-1 with assembly. Given some C code you can quite accurately predict which loads/stores will be elided by the compiler and what the resulting assembly will be.

I mean, can you though? After all you won't even have the same output depending on compiler and flags. And of course not all architectures have the same capabilities, so the same code can compile to a various number of instructions depending on the target architecture.

Not to mention things like bitfield access that can result in non-atomic load/stores for a simple `foo->bar = 1;`

I'm not sure in what sense you could say that C operations map 1-to-1 with assembly any more than Rust, C++ or basically any compiled language.


> The operations C gives you map 1-to-1 with assembly

    int test(int x, int y) {
        return x % y;
    }

    test:                                   // @test
        sdiv    w8, w0, w1
        msub    w0, w8, w1, w0
        ret
Surprisingly, this assembly doesn't have a remainder instruction; instead it has a multiply-then-subtract instruction, which doesn't correspond 1-to-1 to anything in C.


x86 doesn't have remainder. Other instruction sets do. But yeah I agree with your point. Some targets might not even have multiply.


x86 has remainder: IDIV calculates both the quotient and the remainder at the same time and places them in different registers just like many other ISAs before it did. In fact, DIV instruction of PDP-11 worked that way too:

    Description:    The 32-bit two's complement integer in R and Rv1 is divided
                    by the source operand. The quotient is left in R; the
                    remainder in Rv1. Division will be performed so that the
                    remainder is of the same sign as the dividend. R must be even.

    Example:        CLR R0
                    MOV #20001,R1
                    DIV #2,R0

                       Before          After
                    (R0) = 000000   (R0) = 010000   Quotient
                    (R1) = 020001   (R1) = 000001   Remainder
PDP-11 also had ADC and SBC (add/subtract with carry) instructions which weren't (and still aren't) exposed in C either.


> x86 doesn't have remainder

The code in the comment you replied to is 64-bit ARM assembly.


Oops of course.


In C, unlike pretty much every other language, you can't even know how big an integer is.


Of course you can, but this is implementation defined. So you need to create code that does different things based on the underlying implementation. Every reasonably large C code base checks the current implementation to define what type of integers, pointers, etc. they're dealing with.

This is by design, because C was created to deal with disparate processors and OSs. When it was created it would have been difficult and unwise to assume that an integer has a size of 2 bytes, for example, since each machine would define its own preferred length.


> When it was created it would have been difficult and unwise to assume that an integer has a size of 2 bytes

It was still a bad design. Of course this is in hindsight, but history has taught us that it would have been much better if the default were specific-sized integers, with an option to use `uint_fast32_t` or whatever.


Then we would be stuck with 16-bit integers. We are lucky to be at a point in time where integer types have been stable for a while, but it wasn't always so.


We wouldn't. People would update their code eventually or opt in to variable sized integers.


History says otherwise: C was so successful it is still widely used.


Just because something is popular doesn't mean it's good. Tobacco smoking is also popular.


Some of the best software we have to date was written in C. Whether you like it or not, it is a magnificent tool.


It would still be good if it was written in a different language. Probably even better because the developers would have more time to improve the software instead of reinventing wheels, writing boilerplate code and chasing down segfaults.


> It would still be good if it was written in a different language.

But it wasn't, C made all these software tools possible.


No it didn't. C happened to be the most popular language for a long time, but that doesn't mean people couldn't have written software without it, or that a better language wouldn't have arisen if it hadn't been so popular.

English is very popular but you wouldn't say "English is what made all those books possible" would you?


Yes it did and continues making it possible. Several of the largest codebases in the world are written in C.


To reiterate, C didn't make it possible to write those codebases just because they happen to be written in C.


Yes, it did. Just check how they created UNIX.


Sure you can, it's called sizeof(int). Now read the other comment to understand why.



