That’s not what I’m talking about. I’m talking about software that people are obligated to distribute under a GPL-compatible license. For example, some random company’s private fork of the Linux kernel.
> The second and third books leave something to be desired
Also got this feeling on the first read... but now I remember them very fondly! I like to think that this trilogy happens in the same universe as Dune, being a prequel to the events of Dune. The author's homage to the Dune universe is obvious (the names of the books, the notion of "other memories", etc). But many notions fit together, with a bit of imagination. The second book of the trilogy provides a mechanism to explain the other memories in the form of nodal biology. The octopi FTL technique is reminiscent of the guild navigators. The third book hints subtly at a reason why the Butlerian Jihad could have happened.
As somebody who is afraid of types (and also, who hates types, because we all hate what we fear), may my point of view serve as balance: you don't need a type system if everything is of the same type. Programming in a type-less style is an exhilarating and liberating experience:
assembler : everything is a word
C : everything is an array of bytes
fortran/APL/matlab/octave : everything is a multi-dimensional array of floats
lua : everything is a table
tcl : everything is a string
unix : everything is a file
In some of these languages there are other types, OK, but it helps to treat these objects as awkward deviations from the appropriate thing, and to feel a bit guilty when you use them (e.g., strings in fortran).
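To make the "everything is an array of bytes" style concrete, here's a small sketch in C (my own toy example, not from anyone above): any object can be viewed through an `unsigned char *`, and what those bytes mean is entirely up to you.

```c
#include <stdio.h>
#include <string.h>

int main(void) {
    /* A float is "really" just sizeof(float) bytes; inspecting the
       object representation through unsigned char is legal C. */
    float f = 3.14f;
    unsigned char bytes[sizeof f];
    memcpy(bytes, &f, sizeof f);

    for (size_t i = 0; i < sizeof f; i++)
        printf("%02x ", (unsigned)bytes[i]);
    printf("\n");
    return 0;
}
```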
I feel the need to issue a correction: while I'm programming in assembly, I very well have types. This word over here (screen position) represents a positive number only, but this one over here (character acceleration) can be negative. When adding one to the other, I need to check the arithmetic flags like so to implement a speed cap...
The types certainly exist. They're in my mind and, increasingly through naming conventions, embedded within some of the comments of my assembler code. But nothing is there to check me. Nothing can catch if I have made an error, and accessed a pointer to a data structure which contains a different type than I thought it did. Without a type system, that error is silent. It may even appear to work! Until 6 months later, when I rearrange my code and the types are arranged differently in memory, and only THEN does it crash.
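A small C sketch of that failure mode (the struct names are invented for illustration): the wrong belief about what the pointer points to compiles cleanly, runs, and quietly produces garbage.

```c
#include <stdio.h>

struct position { unsigned x, y; };  /* screen position: never negative */
struct velocity { int dx, dy; };     /* velocity: may be negative */

int main(void) {
    struct velocity v = { -3, 7 };
    void *p = &v;                    /* the "type" now lives only in my head */

    /* Wrong belief about what p points to. No compiler error, no warning;
       formally undefined behavior, which is exactly the problem: nothing
       is there to catch it. */
    struct position *pos = p;
    printf("x = %u\n", pos->x);      /* typically prints 4294967293, not -3 */
    return 0;
}
```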
The original goal of Hungarian notation :) But Simonyi's paper used 'type' ambiguously and we ended up with llpcmstrzVariableName instead of int mmWidth vs int pixelWidth, which was what they were doing in Office and frankly makes a lot of sense.
But once you get down to the unit data values inside any of those aggregates, you're still dealing with characters, ints, floats, strings, or arrays, and they each have their own individual access patterns and, more importantly, modification functions.
You can't add a number to a string, only to another number.
If you are dealing with a float, you better be careful how you check it for equality.
If it's pure binary, what kind of byte is it? Ascii, unicode code point, unsigned byte, signed multi-byte int, ... whatever.
There's no escaping the details, friend.
And your saying "everything is a word" for assembler is just plain wrong.
Ok, sure. But I doubt that's a good practice. In fact, I can't possibly imagine it not being a horrible idea.
So, I ask: what size and signedness of int? 1, 2, 4, 8? What if the string is of length 3, 2, 1, 0?
Why bother with all those corner cases? Everything has a memory layout and appropriate semantics of representation and modification. Pushing past those definitions is a recipe for problems.
I like to keep it simple, keeping the semantics simple in how I code specific kinds of transforms.
The fewer kinds of techniques you use, the fewer kinds of patterns you have to develop, test, and ensure consistent application of across a codebase.
Especially down in C land, which is effectively assembler.
Gone are the days of Carmack having to save bytes in Doom, unless you're doing embedded work, in which case that's all the more reason to be very careful how you handle those bytes.
That's entirely how "string" indexing works in C. Strings in C are just pointers to `char` with some valid allocation size. As long as the integer used for the pointer offset results in a pointer into the allocation after the addition, it's valid to dereference the result. Remember, `array[index]` is syntactic sugar for `*(array + index)` in C. Lots of the C stdlib string functions use this, e.g. `char *strchr(const char *str, int character)` has a naive implementation as a simple loop comparing `char`s[1]. Glibc does it one `unsigned long int` at a time, as an optimization, with some extra work at the start going one `char` at a time to ensure alignment.
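For illustration, roughly what that naive loop looks like (a sketch, not glibc's actual source):

```c
#include <stddef.h>

/* Naive strchr: walk the string one char at a time. If the character
   is '\0', this correctly returns a pointer to the terminator. */
char *my_strchr(const char *str, int character) {
    char c = (char)character;
    for (;; str++) {
        if (*str == c)
            return (char *)str;   /* cast away const, as the real strchr does */
        if (*str == '\0')
            return NULL;
    }
}
```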
> So, I ask: what size and signedness of int? 1, 2, 4, 8?
Doesn't matter, as long as the result of the addition points to within the pointer's allocation. Otherwise you get UB as usual.
> What if the string is of length 3, 2, 1, 0?
Doesn't matter, as long as the result of the addition points to within the pointer's allocation. Otherwise you get UB as usual. For a 0-length string (pointer to '\0'), the only valid value to add is 0.
> The fewer kinds of techniques you use, the fewer kinds of patterns you have to develop, test, and ensure consistent application of across a codebase.
100% agreed. The less C you use for string handling the better. C strings are fundamentally fragile.
Any time the microprocessor accesses memory for use as an int, it's a specific kind of int, meaning size and signedness, and the flags are adjusted properly as per the operation performed.
> Strings in C are just pointers to `char` with
I'm gonna end this here. I taught myself C programming by reading K&R in the late 80s, and then proceeded to do so professionally for YEARS and YEARS.
There are people that know, and there are people that act like they know. You ever read the first two chapters of Windows Internals? You ever write C code that could make Windows system calls from the same program that could be 32- or 64-bit with a simple compiler flag?
I have.
> C strings are fundamentally fragile.
Not if you know what you're doing. You're almost certainly using a C program to type this response in an operating system largely written in C. You get any segfaults lately? I don't EVER on either my Ubuntu or Debian systems.
Thanks for playing.
> Any time the microprocessor accesses memory for use as an int, it's a specific kind of int, meaning size and signedness, and the flags are adjusted properly as per the operation performed.
Sure. But the C standard specifies how addition of a pointer to an integer works in section 6.5.7, particularly paragraph 9. The specifics of what flags get set & the width of integer used are up to the implementation & the programmer, but
> For addition, either both operands shall have arithmetic type, or one operand shall be a pointer to a complete object type and the other shall have integer type. (Incrementing is equivalent to adding 1.)
should be a pretty clear statement that pointer + integer is valid!
> > Strings in C are just pointers to `char` with
> I'm gonna end this here. I taught myself C programming by reading K&R in the late 80s, and then proceeded to do so professionally for YEARS and YEARS.
> There are people that know, and there are people that act like they know. You ever read the first two chapters of Windows Internals? You ever write C code that could make Windows system calls from the same program that could be 32- or 64-bit with a simple compiler flag?
> I have.
I'm an embedded C developer. I've been writing C for decades, but not for windows. But I do write code that can work on both 8-bit and 32-bit systems with just a compiler flag. Strings are arrays of character type with a null terminator, and array-to-pointer decay works as usual with them.
>> C strings are fundamentally fragile.
> Not if you know what you're doing. You're almost certainly using a C program to type this response in an operating system largely written in C. You get any segfaults lately? I don't EVER on either my Ubuntu or Debian systems.
> Thanks for playing.
C strings are arrays of character type with a null terminator. That is fundamentally fragile, since it includes no information about the encoding or length of the string, and thus allows invalid states to be represented. That doesn't mean you will get segfaults, only that it's possible for someone to screw up & interpret your UTF-8 data as ASCII or write a `\0` in the middle of a string or other such mistake, and you'll get no protection from the type system.
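A toy example of that fragility (mine, just to make the point): one stray byte and every length-unaware consumer silently sees a different string.

```c
#include <stdio.h>
#include <string.h>

int main(void) {
    char msg[] = "hello world";
    msg[5] = '\0';                        /* one stray write into the middle */

    printf("strlen: %zu\n", strlen(msg)); /* 5, not 11 */
    printf("text:   %s\n", msg);          /* "hello" - the rest silently vanishes */
    return 0;
}
```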
Every C compiler I've worked with could output the code as assembler, so C is really a thin layer of abstraction that wraps assembler. Having programmed in pure assembler before, I understand the benefits of C's abstractions, which began with its minimal, but helpful, type system.
Should I not be taking you seriously?
We are not just talking with each other but sharing our expertise with those who may be reading.
Sometimes I forget that other people can just be unpleasant on purpose. I find no other explanation for your response.
Or C. It just turns into pointer math. Godbolt example here[1], just make sure the `int` is an offset within the bounds of the char* and it's well-defined.
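I can't paste the Godbolt link's contents here, but the idea is along these lines (my own sketch, not the linked example):

```c
#include <stdio.h>

int main(void) {
    const char *s = "abcdef";   /* 7 bytes including the terminator */
    int i = 3;

    /* Indexing is pointer math; both expressions print 'd'. */
    printf("%c %c\n", s[i], *(s + i));

    /* s + 7 is the one-past-the-end pointer (OK to form, not to
       dereference); s + 8 or s - 1 would be undefined behavior. */
    return 0;
}
```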
I've programmed in typeless languages and they are great for small programs - less than 10,000 lines of code and 5 developers (these numbers are somewhat arbitrary, but close enough for discussion). As you get over those numbers you start to run into issues, because fundamentally your word / array of bytes / multi-dimensional array of floats / ... has deeper meaning, and when you get it wrong the code might parse and give a result, but the result is wrong.
Types give me a language enforced way to track what the data really means. For small programs I don't care, but types are one of the most powerful tricks needed for thousands of developers to work on a program with millions of lines of code.
I experience it often even on teams of one or two developers and projects under 2,000 lines of code (that's still 50 pages, btw). It boils down to being able to load everything into your mind, and that heavily depends on the type of project, the data/code models, the IDE, etc., and also on various factors unrelated to coding.
A human mind is a cache -- if you overload it, something will fly out and you won't even notice. Anyone who claims that types have no use probably doesn't experience overloads. If it works for them, good, but it doesn't generalize.
Sure, in many languages we have the notation of thing.do_thing(arg1, arg2).
I suggest this is a good notation for data structures, like stack.push(10) or heap.pop().
I'm suggesting we don't use this notation for things like rules to validate a file, so I suggest we write validate(file, rules) instead of rules.validate(file).
Then we can express the rules as a data structure, and keep the IMO unrelated behavior separate.
Note that then we don't need to worry about whether it should perhaps be file.validate(rules). Who does the validation belong to? The rules or the file? The abstractions created by non-obvious answers to "who does this behavior belong to" are generally problems for future changes.
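A rough C sketch of what I mean (all the types and names here are invented for illustration): the rules stay plain data, and validation is a free function over both.

```c
#include <stdbool.h>
#include <stddef.h>

/* Rules are just data... */
struct rule { size_t max_size; bool require_utf8; };

/* ...and so is the thing being validated... */
struct file_info { const char *name; size_t size; bool is_utf8; };

/* ...while the behavior lives in a free function that takes both,
   instead of "belonging" to either one. */
bool validate(const struct file_info *file,
              const struct rule *rules, size_t nrules) {
    for (size_t i = 0; i < nrules; i++) {
        if (file->size > rules[i].max_size) return false;
        if (rules[i].require_utf8 && !file->is_utf8) return false;
    }
    return true;
}
```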
The filename suffix isn't much more than part of the filename (a simple variable name in that analogy) - it's more convention than constraint. Nobody is stopping you from giving your file the name you want (and the OS allows). You'd need literal magic[0] to infer an actual type.
"Everything is a file" rather refers to the fact that every resource in UNIX(-like) operating systems is accessible through a file descriptor: devices, processes, everything, even files =)
You can put them in the same repository, if that is your thing.
If you put the build files in a .builds/ folder at the root of your repository, they will be run upon each commit. Just like in github or gitlab. You are just not forced into this way of life.
If you prefer, you can store the build files separately, and run them independently of your commits. Moreover, the build files don't need to be associated with any repository, inside or outside sourcehut.
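For anyone curious, a build manifest in .builds/ is a small YAML file along these lines (the image, packages and repo URL below are placeholders from memory, so double-check against the builds.sr.ht docs):

```yaml
# .builds/ci.yml - picked up and run on each push
image: alpine/edge
packages:
  - gcc
  - make
sources:
  - https://git.sr.ht/~someone/someproject
tasks:
  - build: |
      cd someproject
      make
  - test: |
      cd someproject
      make check
```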
Thanks! Yes, I handcraft all my HTML and CSS. I'm glad you noticed the HTML and liked it. I find great joy in crafting my website by hand. It's like digital gardening. I grow all my HTML and CSS myself. It's all 100% organic and locally sourced!
I rarely ^U nowadays, but your site was so clean that I couldn't resist!
Just as a side note: when writing html5 by hand, you can use the full power of the language, most notably optional tags (no need to write html, body, etc) and auto-closing tags (no need to close p, li, td, etc). You may get something even crispier!
> Notice that, when writing html5 by hand, you can use the full power of the language, most notably optional tags (no need to write html, body, etc) and auto-closing tags (no need to close p, li, td, etc). You may get something even crispier!
Yes! In fact, sometime back I wrote a little demo page to show the minimal (but not code-golfed) HTML we can write such that it passes validation both with the Nu HTML Checker and HTML Tidy.
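I can't reproduce that exact demo page here, but the general shape is something like this (my own sketch; I haven't run this particular snippet through the validators):

```html
<!DOCTYPE html>
<html lang=en>
<meta charset=utf-8>
<title>Minimal</title>
<p>No head, no body, no closing tags,
<p>and it is still a complete document.
```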
That said, when writing my own posts, I prefer keeping optional and closing tags intact. Since I use Emacs, I can insert and indent closing tags effortlessly with C-c /. It's a bit like how some people write:
10 PRINT"HELLO
But I've always preferred:
10 PRINT "HELLO"
I find the extra structure more aesthetically pleasing.
In my time we just wrote very small on scraps of paper or tore pictures from magazines. We folded them tight into a compact rectangle, then we folded that rectangle across a rubber band between our thumb and index finger and with the other hand we stretched that rubber band enough so that when released, it would propel the folded rectangle of paper across the intervening distance rapidly enough that the adversarial player in the room could not detect the source or recipient of the rectangle without querying the entire congregation.
This method eliminated potentially adversarial middlemen in transit who might, if you chose to pass it through multiple players (servers if you will) - read it in transit though the message was not intended for them, and then use the contents against you later.
It had the disadvantage that one needed to ensure that the sender and recipient were in sync in case the aim was off and the message bounced to an unintended recipient.
I once had the misfortune of sending a tightly folded, secure message that was part of a war game being played during English class, and having that poorly aimed message hit the largest mass of muscle in the class right squarely in the ear because the recipient was busy gloating over the success of their previous move and wasn't able to secure the reply in transit.
We all heard the light snapping sound of the rubber band followed by an uncharacteristically loud profanity from the unintended recipient, my own barely stifled gasp of horror, lots of giggles and laughter from the audience, and as they turned - the beginning of the next round of the Inquisition by the adversarial instructor who mistakenly thought we were all watching the English lesson on the board in real time instead of conducting paper war games in the background.
At first I really liked this idea, but then I realised the size of stack frames is quite limited, isn't it? So this would work for small data but perhaps not big data.
In theory, this is a compiler implementation detail. The compiler may choose to put large stack allocations on the heap, or to not even use a stack/heap system at all. The semantics of the language are independent of that.
In practice, stack sizes used to be quite limited and system-dependent. A modern linux system will give you several megabytes of stack by default (128MB in my case, just checked on my linux mint 22 wilma). You can check it using "ulimit -a", and you can change it for your child processes using "ulimit -s SIZE_IN_KB". This is useful for your personal usage, but may pose problems when distributing your program, as you'll need to set up the environment where your program runs, which may be difficult or impossible. There's no ergonomic way to do that from inside your C program, that I know of.
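For completeness, reading the limit from inside the program is easy enough with getrlimit; it's raising it for the already-running main thread that has no good answer. A sketch:

```c
#include <stdio.h>
#include <sys/resource.h>

int main(void) {
    struct rlimit rl;
    if (getrlimit(RLIMIT_STACK, &rl) == 0) {
        if (rl.rlim_cur == RLIM_INFINITY)
            printf("stack soft limit: unlimited\n");
        else
            printf("stack soft limit: %llu KB\n",
                   (unsigned long long)(rl.rlim_cur / 1024));
    }
    return 0;
}
```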
It's a giant peeve of mine that automatic memory management, in the C language sense of the resource being freed at the end of its lexical scope, is tied to the allocation being on the machine stack, which in practice may have incredibly limited size. Gar! Why!?
Ackshually, it has nothing to do with the C language. It's an implementation choice by some compilers. A conforming implementation could give you the whole RAM and swap to your stack.
Yes, but does any implementation actually do that?
AFAIK Ada is typically more flexible, but that has to do with the language actually giving you enough facilities to avoid heap allocations in more cases - e.g. you can not only pass VLAs into a function in Ada, but also return one from a function. So it becomes idiomatic, and compilers then have to support this (usually by maintaining a second "large" stack).
Yea, usually the stack ulimit is only a few MiB for non-root processes by default on linux.
It is easy enough to increase, but it does add friction to using the software as it violates the default stack size limit on most linux installs. Not even sure why stack ulimit is a thing anymore, who cares if the data is on the stack vs the heap?
It isn't a practical pattern for anything beyond the most trivial applications. Consider what this would look like if you tried to write a text editor, for instance - if a user types a new line of text, where is the memory for that allocated?
The problem is that, no matter how you approach it, it does not have an answer for an event-loop based program with unbounded run time, other than "allocate all of memory into a buffer at startup and implement your own memory manager inside that".
Which just punts the problem from a mature and tested runtime library to some code you just make up on the spot.
Heap was invented for a reason, and some tasks are naturally easier to model with it.
The problem is that once it's there, people start using it as the proverbial hammer, and everything looks like a nail even if it isn't.
Note though that "allocate all of memory into a buffer at startup" is a lot more viable if you scope it not to the start of the app, but to the entrypoint of some code that needs to make a complicated calculation. It's actually not all that uncommon to need something heap-like to store temporary data as you compute - e.g. a list or map to cache intermediary results - but which shouldn't outlive the computation. Ada access types give you exactly that - declare them inside the top-level function that's your entrypoint, allocate as needed in nested functions as they get called, and know that it'll all be cleaned up once the top-level function returns.
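For those of us not in Ada land, the same shape can be sketched in C with a throwaway arena (everything below is invented for illustration; it's not the Ada mechanism itself):

```c
#include <stdlib.h>
#include <string.h>

/* A tiny bump allocator: one malloc at the entrypoint, cheap
   allocations inside, one free on the way out. (Alignment handling
   is omitted for brevity.) */
struct arena { unsigned char *base; size_t used, cap; };

static void *arena_alloc(struct arena *a, size_t n) {
    if (a->used + n > a->cap) return NULL;  /* out of local memory */
    void *p = a->base + a->used;
    a->used += n;
    return p;
}

/* Entrypoint of the complicated calculation: everything allocated
   through the arena dies when this function returns. */
int compute(void) {
    struct arena a = { malloc(1 << 20), 0, 1 << 20 };
    if (!a.base) return -1;

    /* ...nested helpers allocate from the arena as they get called... */
    int *scratch = arena_alloc(&a, 1000 * sizeof *scratch);
    int ok = (scratch != NULL);
    if (ok) memset(scratch, 0, 1000 * sizeof *scratch);

    free(a.base);  /* the whole local "heap" is cleaned up at once */
    return ok ? 0 : -1;
}
```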
That works for something where the events being handled are like "serve a web page" or "compile a C function". It doesn't work for a spreadsheet or word processor or a web browser.
It would be more accurate to say that it doesn't work for some of the allocations in a spreadsheet or word processor app. Which is why you still have the global heap, but the point is that not everything needs to be on the same heap that has the same overall lifetime. That spreadsheet might still be running some algorithm that can do what it needs to do with a local heap.
And that aside, there are still many apps that are more like "serve a web page". Most console apps are like that. Many daemons are, too.
I'm not convinced it even works very well for either of those cases. It's common in many applications to return the result of a computation as an object in memory, like an array or string of arbitrary length or a treelike structure. Without the ability to allocate memory which exists after a function exits, I'm not sure how you'd do that (short of solutions which create arbitrary limits, like writing to a fixed-size buffer).
Well, yes, but I'm trying to be generous to the PoV.
My preferred solution is definitely to use the GC. With some help if you want. You can GC the nursery each time around the event loop. You can create and destroy arenas.
It's a fairly common usage in numeric computing. If you read, for example, the Wikipedia entries for "computational fluid dynamics", you'll see that they consistently speak of "codes" when referring to programs.