The author translates 1 line of C mess to 7 lines of Zig mess which are then manually simplified through transforms. Why not just apply transforms to the original C code? Or is this only for people who know Zig syntax but not C?
> Or is this only for people who know Zig syntax but not C?
It's a blog post hosted on zig.news (so intended for an audience that knows Zig). I think we can safely assume that it's not meant to be a universal guide for understanding C code, but rather an account of a (somewhat minor) experience the author had. It's even stated upfront:
> @rep_stosq_void on Twitter posted this strange sample of C code, and I wanted to show my process of understanding this contrived C code.
"I was presented with this code, and this is what I did to understand it"
Author here, I'm just showing how I did it. It took a few minutes to do it for myself - figured it would be worth the 15 minutes or so it'd take to write it down. Just sharing my process, there's nothing too serious about this post.
I happen to have written a lot of C, but in some areas (like this one) I'm not as confident. This way of casting and working with types is very poor syntax in my opinion (I mean below in this thread you have people arguing about the spiral rule and such, it's obviously a common confusion).
Zig's way of expressing types and pointers is far superior, these transformations I made were done quickly and I didn't feel like there was any ambiguity or confusion in anything. Just a series of simple reductions, there are no "tricks" or easy mistakes to make in the code. It feels like a trivial proof.
Obviously I am biased, this was posted to zig.news. I thought it was a neat showcase of translate-c, and how zig does some of these things nicer. I'm not telling everyone that they should do what I did, but this works for me and I'm happy to share.
You do understand that zig started out by removing things from C and then "adding rich constexprs" and that's it? It's arguably simpler than C, C is actually two languages (C and C preprocessor), three if you add make, five if you add cmake and autoconf.
In practice you can't be productive in a large project in C without make, increasingly cmake (at the very least for reading someone else's C project). And it's part of the zig agenda to unify build tools, too. We are now in the 2nd decade of the 21st century, it is not strange for programming languages to be opinionated about build tools, ruby has rake, rust has cargo, js has, well, js is a mess, don't copy js, elixir has mix, etc (and those are just the languages I know).
That doesn’t make those tools part of the C standard, though. The standard does not mandate the use of any particular build tool for C projects beyond a conformant C compiler. There is nothing stopping me from using meson, ninja, bazel, a Frankenstein shell script, etc. to build a C project. Hell, there is nothing stopping me from using make to build non-C projects - it was designed as a generic dependency resolution tool.
> There is nothing stopping me from using meson, ninja, bazel, a Frankenstein shell script, etc. to build a C project
There is if you intend to use libraries as those libraries may well have chosen a build system other than the one you have chosen meaning you'd have to rewrite their build scripts. This is far from the experience of using libraries in languages with unified build systems which is typically as simple as adding a line with the version of the library you want to a manifest file.
You somehow managed to criticize Zig for being something that C isn't, lacking macros, preprocessor and whatnot, and yet praise Go and Python for being opinionated about their features, which don't have preprocessors or macros.
Having few features that are small and knowable is pretty much the idea behind Zig though. It's supposed to be more of a C replacement than another C++/Rust.
If you know C you can just read through the Zig language spec and see how stuff correlates and you just learned zig.
Removing a whole secondary language for macros actually just makes the language simpler
Honestly, I’m not sure about Nim. I’d probably reach for Go before Nim because I hate layers in-between things and too many language paradigms.
Main thing is that all these hip new languages will be dead in a couple years because they don’t do anything all that innovative. Making “another C” or “another C target” versus Go which focuses on channels, slices, and go routines. These other languages that “make a better X” simply will be sidelined in production deployments because no matter the language there will always be warts.
Really folks keep reinventing the wart in the name of something “better”.
Maybe it's still not the language for you or for your projects, but for anyone else stumbling across this I don't think it's a fair characterization. Would you mind elaborating on where you think I'm off-base in the following?
> head-strong nerds inventing things
Let people have their fun.
> preprocessor allows you to add some logic before compilation
Zig's comptime feature does address that need. For most use cases where comptime and the preprocessor would both suffice, comptime has tons of strong advantages (type checked, errors traced back to the right lines, ability to write arbitrary turing-complete code without unholy syntactic black magic, no double evaluation, no need to wrap everything in extra parentheses, ....). It explicitly does not address use cases where you would actually want a textual preprocessor (like making the language look like Fortran), the argument being that the burden on reading unfamiliar code would be too high. The ease with which you can write arbitrary comptime code also enables you to do things like embed lookup tables in the binary, which in C you would ordinarily do by copy pasting from another tool (or writing constants by hand) or adding yet another preprocessor to the mix.
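For anyone who hasn't been bitten by the two macro problems mentioned above (double evaluation and missing parentheses), here is a minimal C sketch. The macro and function names are made up for illustration and are not from the article.

```c
#include <assert.h>

/* Hypothetical macros, written only to show the pitfalls named above. */
#define SQUARE_NAIVE(x) x * x          /* no parentheses around x or the body */
#define SQUARE_PAREN(x) ((x) * (x))    /* parenthesized, but x is still evaluated twice */

static int calls = 0;

/* Counts how often it is evaluated so double evaluation becomes visible. */
static int next_value(void)
{
    calls++;
    return 3;
}

/* An ordinary function evaluates its argument exactly once. */
static int square_fn(int x)
{
    return x * x;
}

static void demo_macro_pitfalls(void)
{
    /* Expands to 1 + 2 * 1 + 2, which is 5 rather than 9. */
    assert(SQUARE_NAIVE(1 + 2) == 5);
    assert(SQUARE_PAREN(1 + 2) == 9);

    calls = 0;
    assert(SQUARE_PAREN(next_value()) == 9);
    assert(calls == 2);   /* the argument expression ran twice */

    calls = 0;
    assert(square_fn(next_value()) == 9);
    assert(calls == 1);   /* a real function call evaluates it once */
}
```

Calling demo_macro_pitfalls() from main runs the checks. The function version is the behavior a type-checked compile-time facility is meant to give you without the textual substitution.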
Comptime isn't a clear win over preprocessing in all cases, but having written a lot of C and a little Zig, if I had to choose one for a hypothetical new project I'd be tempted to use Zig just for access to that one feature.
> Syntax
Zig does feel a little clunky to write to me right now, but I think that's mainly due to how often I'm using @someBuiltin() in Zig when I would be using an operator or an additional language feature in something like Python.
That said, the syntax itself is incredibly simple. You can check that yourself for Zig [0], Python [1], and Go [2]. Zig has far fewer syntactic quirks and abilities than either of the other two. You have functions, operators, code blocks, types, reserved characters/keywords for builtin stuff, a few kinds of literals, and a little bit of syntactic sugar for working with structs.
> Don't reinvent C++...for instance, I even consider C better than Zig just because the features are small and knowable
To each their own. FWIW, Zig is explicitly and actively avoiding becoming a kitchen sink language like Rust or C++. They've added a few new features, and I personally find it easier to keep track of those than of all the different kinds of undefined behavior I might stumble across in C, especially since most of those (defer, errdefer, labeled break, ...) behave exactly as your intuition suggests they would.
The only particularly messy part in the C code there is the (int*(*)[]) type cast.
My intuition (because I don't usually have to deal with this kind of nonsense) is "cast to a pointer to an array of int pointers". cdecl confirms that: https://cdecl.org/?q=int*%28*x%29%5B%5D
So we cast a (pointer to pointer to int) to (pointer to array of int pointers) [we can ignore the detour through void*] and then immediately dereference through all three layers. Which gives us back the only int in the program.
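A small sketch of that, for reference. This is an assumed reconstruction from the descriptions in this thread, not the verbatim snippet from the tweet; store_one and demo_store_one are names I made up.

```c
#include <assert.h>

/* Assumed reconstruction: an int * parameter whose address goes through void *. */
static void store_one(int *a)
{
    void *p = &a;              /* really an int **, stored as a void * */
    ***(int *(*)[])p = 1;      /* cast, then dereference through all three layers */
}

/* Returns the value that ends up in the only int involved. */
static int demo_store_one(void)
{
    int target = 0;
    store_one(&target);
    return target;
}
```

If the reading above is right, demo_store_one() should return 1, because the write lands in target.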
Any time the spiral rule comes up, I like to point out that it's wrong. It is instructive in a way because one learns more about C declaration syntax, but it is even more instructive to recognize why it is wrong.
The spiral rule works only if there is no pointer to pointer or array of array in the type. But take this for example:
The type of xxx is a [1-element] array of [2-element] array of [3-element] array of pointer to pointer to ints. I drew a spiral that passes through each specifier in the correct order.
Notice that to make the spiral correct it has to skip the pointer specifiers in the first three loops. This is marked by ¦. This is not mentioned in the original spiral rules and one could be forgiven to parse the expression as xxx -> [1] -> pointer -> [2] -> etc. following a spiral that doesn't skip the pointers.
The spiral rule can be modified to process all array specifiers before all pointer specifiers, but then you'd have to specify that the order to do so is right and then left. At that point it's just the Right-Left Rule.
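The drawing isn't reproduced here, but a declaration matching the type described above would presumably be int **xxx[1][2][3]. A rough way to check the "arrays before pointers" reading is to let the compiler confirm the sizes (a sketch, with demo_xxx as a made-up helper):

```c
#include <assert.h>

/* Declaration matching the described type (reconstructed, not quoted). */
static int **xxx[1][2][3];

/* These sizes only hold if all array specifiers bind before the pointers. */
_Static_assert(sizeof xxx == 1 * 2 * 3 * sizeof(int **), "six int ** elements");
_Static_assert(sizeof xxx[0] == 2 * 3 * sizeof(int **), "xxx[0] is int **[2][3]");
_Static_assert(sizeof xxx[0][0] == 3 * sizeof(int **), "xxx[0][0] is int **[3]");
_Static_assert(sizeof xxx[0][0][0] == sizeof(int **), "one element is an int **");

/* Usage mirrors the declaration: three subscripts first, then two dereferences. */
static int demo_xxx(void)
{
    int value = 7;
    int *inner = &value;
    xxx[0][1][2] = &inner;
    return **xxx[0][1][2];
}
```

Under the naive spiral reading (xxx -> [1] -> pointer -> [2] -> ...), sizeof xxx would be the size of a single pointer, so the first assertion would fail to compile.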
You're right of course. Every language has warts, and declaration, especially involving pointers, is C's. But once you internalize that declaration mirrors usage, together with the spiral rule, it will all immediately become clear. There is some method to the madness.
That's a fair point. In C, you always start at the identifier. In case there is none, type declarations can contain parentheses*, and just like in math, parens resolve first, so it's from inner to outer. So in this case one starts with the `(*)`.
* the tricky part is that `()` are also used to denote functions. So yeah, it's not always readable. `(*)()` would be a pointer to a function returning int (the default type) and taking an unspecified amount of arguments.
Do the parens in `int * (*) []` do anything? When I saw them I immediately assumed that function pointers would be involved, but it doesn't seem to be the case and now I'm confused.
EDIT: Uh, apparently you can add parens to casts but not declarations.
Yes they do something. They are used to override precedence, just as you would in a math expression. Array indexing has higher precedence than pointer dereferencing, and declarations follow the same precedence that you have in expressions.
int **p[123]; // p is array(123) of pointer to pointer to int
int *(*p)[123]; // p is a pointer to array(123) of pointer to int
Sometimes people find casts confusing because there is no identifier inside. But you can easily read it if you know where the identifier would be in an equivalent declaration.
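To make that concrete, here is a sketch that puts the second declaration above next to the equivalent cast. The variable names are mine; the only difference between the two types is whether the identifier is present.

```c
#include <assert.h>

static int demo_cast_reading(void)
{
    int value = 42;
    int *slots[123] = { &value };          /* array(123) of pointer to int */
    void *raw = slots;

    int *(*p)[123] = raw;                  /* declaration: identifier inside the parens */
    int *(*q)[123] = (int *(*)[123])raw;   /* cast: same type with the identifier removed */

    /* Dereferencing once yields the whole array, not a single element. */
    _Static_assert(sizeof *p == 123 * sizeof(int *), "*p is the whole array");
    assert(p == q);

    return *(*q)[0];   /* (*q) is the array, [0] its first int *, * the int */
}
```

Without the inner parentheses the cast would name int **[123], an array type, which is not something you can cast to.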
I think this is just about where the identifier has to go in a declaration. Otherwise the spiral rule [or rather right-left-rule, as pointed out elsethread] doesn't start at the right place (to over-simplify it).
You can in fact have parentheses in declarations, but the identifier must be on the inside, not just to the right of everything: https://godbolt.org/z/vKzcYMdvK
I've always thought that the "obfuscated C competition" is proof that it is quite hard to obfuscate C. There'd be no challenge in most other languages. (I haven't got enough knowledge of Zig to know if it is as easy to understand as C in general).
I feel lately every IOCCC submission just has an obligatory "replace some part of the code with arbitrary identifiers which the preprocessor will search and replace back". Honestly a cheap gimmick, but it leads to a bigger "whut" factor when first seeing the code.
Yes, the IOCCC is always so interesting. Because the language is so spartan, there's often only one or two ways to do something, because to solve problems on a higher level you have to build the scaffolding yourself. Whereas in the usual dynamic scripting language you can reuse variables as different types, and conjure up a really convoluted approach in just a few statements.
The spiral rule is incorrect in the general case as other comments have already pointed out (the correct way is to read in precedence order) but I agree that it was trivial to decipher. I guess the opposite could be true for an experienced Zig programmer.
The cast says "treat this int* as an array of int*", then the first dereference says "give me the first element of that array", which gets you back the original int*.
See, this would be a better, shorter, and more insightful explanation than an article full of mechanical transformations performed by hand.
"We start with a pointer-to-int called "a", take pointer to it and name it "p" (it's a pointer-to-pointer-to-int, although we store it in a pointer-to-whatever variable), then cast it to some weird type, dereference it thrice and store 1 into the resulting target. Two questions remain: a) what is that weird type? b) we have two levels of indirection but three dereferences, how does it work? The answer to the first question is that weird type is "pointer-to-array-of-pointers-to-int", and it helps us to answer the second question: dereferencing a value of that type is a no-op in arithmetical sense (but has a type-casting effect)."
Unless I'm mistaken, this is almost, but not quite correct. (Firstly, what's being cast is `p` which is a `void-pointer`.) More importantly, the latter cast says to treat p as a `pointer to an array of int-pointers`, which means that the first dereference actually gives you the array. Expressions with an array type in most contexts (including this one) decay into pointers to their first elements – thus the second dereference gets you the first element of the array, an int-pointer, and the third one gets you the integer being pointed to.
Also, because this array type is incomplete (it has no size), I don't think you could use it in any context where it didn't "decay" into a pointer to its first element (you can test what your compiler says if you try to measure the array's size with the sizeof operator when only using one dereference).
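The three dereferences can be spelled out one at a time with named intermediates. This is only a sketch of the reasoning above, with made-up variable names, assuming the original snippet looks the way this thread describes it.

```c
#include <assert.h>

static int demo_decay_steps(void)
{
    int value = 0;
    int *a = &value;
    void *p = &a;

    int *(*as_array_ptr)[] = (int *(*)[])p;  /* pointer to array of unknown size */
    int **first = *as_array_ptr;             /* 1st deref: the array, which decays to &array[0] */
    int *elem = *first;                      /* 2nd deref: the first element, an int * */
    *elem = 1;                               /* 3rd deref: the int itself */

    assert((void *)first == p);   /* the first dereference did not change the address */
    assert(elem == a);            /* the "first element" is the original int * */

    /* sizeof *as_array_ptr would not compile here: the array type is incomplete. */
    return value;
}
```

So the first dereference changes only the type, and the remaining two are the ordinary pointer-to-pointer-to-int walk.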
Am I the only one who thinks that Zig's syntax looks incredibly ugly? I've never seen a snippet that didn't make my eyes glaze over from the syntax. What am I not getting here?
This particular example is machine-generated code created by translate-c, it's meant to be semantically equivalent to the C code and even uses language features that you're normally not even supposed to use (c pointers).
That said I think it's fine if you don't like the syntax. I think that some complaints are honestly too superficial to be legitimate (like complaining about builtins being prefixed with @), but at the same time Zig is often times prioritizing explicitness over "good looking".
I personally consider Swift a very good looking language, but then I look at all the new features that got added since I used it last, remember that I value simplicity over aesthetics, and go back to Zig.
Ooh that looks a bit like Rust after quickly scrolling by. I have been keeping an eye on Zig but I'm waiting till the package manager stuff has been finalized and implemented.
> Am I the only one who thinks that Zig's syntax looks incredibly ugly?
It's very subjective what is beautiful or ugly, of course. It'd be more interesting if you can offer specific critique rather than just calling it ugly.
You're probably not the only one, I for one would call any non-lisp "ugly", but again, highly subjective, as many others find some C-like code beautiful but other C-like code ugly.
What? Are you arguing that because a phobia (that not everyone had) exists, beauty is not subjective but absolute? I'd love to see how you measure beauty if so, including the "beauty of code".
Objective / subjective does not mean that things cannot be experienced relatively. Someone colourblind isn't able to distinguish all colours, yet it does not mean that colours as a frequency of electromagnetic radiation are subjective, only their experience is.
It is common to make stuff you don't generally want in your code (but still need to be able to do because the language is sufficiently powerful) look ugly.
Yeah. I think I'm so used to C by now that I just can't handle anything that doesn't look like C. It's like my brain just ignores text when it can't recognize the C code patterns.
OOP might have helped popularize usage of the dot notation, but namespaces are a different thing.
Zig has no inheritance but everything is namespaced, including declaring functions inside struct definitions so that you can use them as if they were methods.
Yeah, me neither. Names must belong to some namespace, it really bothers me when code starts binding common nouns in a global context. C lacks namespaces so I use prefixes instead. At least this solution doesn't screw up the ABI like in C++.
I hate global variables so much it's one of the reasons I got rid of libc. Freestanding C turned out to be a superior language just because it lacks all the libc cruft.
I ultimately dropped Ruby because of global state. It's such a wonderful language but it has one fatal flaw: lack of proper modules. The require method just executes Ruby source files, modifying the global state of the interpreter. It ceased to be a beautiful language once I realized this. Python's modules are superior, and the Javascript approach is the best one: just a normal function that returns a normal object containing exported data and functions. Javascript modules are completely reified.
Yeah, the syntax doesn't really excite me too much. Which is a shame, because I would like to see a modernized "better C" that isn't more verbose than C.
Out of curiosity I've checked what c2rust.com thinks about it:
**(*(p as *mut [*mut libc::c_int; 0])).as_mut_ptr() = 1 as libc::c_int;
which is still needlessly complicated, and not even quite accurate due to giving the array a 0 size (the as_mut_ptr() converts the array back to a C pointer).
> and not even quite accurate due to giving the array a 0 size (the as_mut_ptr() converts the array back to a C pointer).
It doesn't seem inaccurate to me, more like the best choice at hand. If the C array has a known length, the Rust code has it too. Only if the C code has an array of unknown length does the Rust code use a 0-length array. Furthermore, if the C code indexes the array of unknown length, the Rust code uses .as_mut_ptr().offset(...) instead of directly indexing the array. So the fact that it represents C arrays of unknown length with Rust arrays of 0 length does not cause any problem, because the generated code is consistent.
6.5.3.2 Address and indirection operators
Constraints
1 The operand of the unary & operator shall be either a function designator, the result of a [] or unary * operator, or an lvalue that designates an object that is not a bit-field and is not declared with the register storage-class specifier.
So as the parameter is an lvalue it is guaranteed to work with the & operator.