Good question, and one I don't fully know the answer to. The rest of the bytecode (apart from the primitives that enable looping) already allows expressing a lot. From memory (it has been almost half a year since I last worked on this), you can specify things like "for this 32-bit value, the first two bytes can be found in the middle of RAX, the third byte is found by following this chain of pointers, and the final byte is on the stack" without even touching the TC parts.
Basically, my impression was that the format was flexible enough that I couldn't see why you would need the TC parts in practice. The compilers seemed to agree and not use them in practice (at least gcc and llvm).
This was of interest to me since I was generating BPF code from these (for user-space trace points), and BPF is famously and intentionally not TC. I could translate many patterns that do show up in real-world code, but not the general case.
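For anyone unfamiliar with what these location descriptions look like: a composite location like the one described above can be sketched with DWARF expression opcodes. The opcode names below are real DWARF (DW_OP_piece, DW_OP_deref, etc.), but the operands are made up for illustration, and strictly speaking "the middle of RAX" would need DW_OP_bit_piece rather than DW_OP_piece:

```
DW_OP_reg0            ; piece source: register RAX
DW_OP_piece 2         ;   ...the first two bytes live there
DW_OP_breg6 -8        ; push the value at [rbp - 8] as an address
DW_OP_deref           ;   follow the pointer stored there
DW_OP_piece 1         ;   ...the third byte is at that address
DW_OP_fbreg -16       ; push frame_base - 16 (a stack slot)
DW_OP_piece 1         ;   ...the final byte lives there
```

None of this requires the Turing-complete parts of the expression language: it's a straight-line program over a tiny stack machine.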
Define "worse". I absolutely hated this formal essay style even before LLMs were a thing. All these "on the other hand", "in conclusion" patterns, with loads of generalities that don't convey anything useful. And they make it really hard to tell if the writer is pretending to know something, or actually knows their shit but doesn't know how to write in a way that doesn't sound like an essay assignment. Good riddance.
On a side note: the fixed-pattern essay thing seems to be an American invention, or at least popularized by the American education system.
I do. And I get paid because I write it to fulfill a customer's need that's not covered by an existing solution, not because some law prevents using what already exists.
It wouldn't impact me at all. Without our hardware it is useless. And customers need certifications, support, and availability guarantees. Customers aren't paying shit for code. They don't need code. They need solutions to their problems, of which my code is a very small (if critical) part.
And yes, that's totally irrelevant. Because the mere fact that some people depend on some evil for their living doesn't justify its existence.
That’s good for you, but for the rest of us, we live in economies of scale. Copyright is the legal underpinning of most software engineers’ ability to earn a living and feed their families.
Not really. It elegantly solves the "create a process, letting it inherit these settings and reset these other settings" problem, where "settings" is an ever-changing and expanding list of things that you wouldn't want to bake into the API. Thus (omitting error checks and simplifying many details):
int fd[2]; pipe (fd); // create a pipe to share with the child
if (fork () == 0) { // child
close (...); // close some stuff
setrlimit (...); // add a ulimit to the child
sigaction (...); // change signal handling
// also: clean the environment, set cgroups
execvp (...); // run the child
}
It's also enormously flexible. I don't know of any other API that, in addition to the above, also lets you change the relationship between parent and child, or create duplicate worker processes.
Comparing it to Windows is hilarious because Linux can create processes vastly more efficiently and quickly than Windows.
> It elegantly solves the "create a process, letting it inherit these settings and reset these other settings", where "settings" is an ever changing and expanding list of things that you wouldn't want to bake into the API.
Or, to quote a paper on deficiencies of fork, "fork() tremendously simplifies the task of writing a shell. But most programs are not shells".
Next. A first solution is trivial: make (almost) all syscalls accept the target process's pidfd as an argument (and introduce a new syscall to create an empty process in a suspended state) — which Windows almost (but not quite) can do already. A second solution would be to push all the insides of the "if (fork () == 0) { ... }" into an eBPF program and pass that to fork() — that would also tremendously cut the syscall cost of setting up the new process's state, compared to Windows (which has a posix_spawn()-like API).
> create duplicate worker processes.
We have threads for this. Of course, Linux (and POSIX) threads are quite a sad sight, especially with all the unavoidable signalling nonsense and O_CLOFORK/O_CLOEXEC shenanigans.
Yes, but at what cost? 99% of fork calls are immediately followed by exec(), but now every kernel object needs to handle being forked. And a great deal of memory-management housekeeping is done only to be discarded afterward. And it doesn't work at all for AMP systems (which we will have to deal with, sooner or later).
In 1970 it might have been the only way to provide a flexible API, but nowadays we have a great variety of extensible serialization formats better than "struct".
> In 1970 it might have been the only way to provide a flexible API, but nowadays we have a great variety of extensible serialization formats better than "struct".
Actually, fork(2) was very inefficient in the 1970s and for another decade after, but that changed with the entirely new VMM that shipped in 4.3BSD-Reno in 1990, which subsequently allowed a CoW fork(2) to come into existence in 4.4BSD in 1993.
Those two changes sped fork(2) up dramatically; before then, it entailed copying not just the process's structs but the entire memory space on every fork.
AFAIR it was quite efficient (basically free) on pre-VM PDP-11 where the kernel swapped the whole address space on a context switch. It only involved swapping to a new disk area.
I used MINIX on 8086 which was similar and it definitely was not efficient. It had to make a copy of the whole address space on fork. It was the introduction of paging and copy-on-write that made fork efficient.
Oh, is that how MINIX did it? AIUI, the original UNIX could only hold one process in memory at a time, so its fork() would dump the process's current working space to disk, then rename it with a new PID, and return to user space — essentially, the parent process literally turned into the child process. That's also where the misconception "after fork(), the child gets to run before the parent" comes from.
..which is a huge thing because AFAIK most Cortex-A SoCs on the market are full of undocumented peripherals. Cortex-Ms are usually sufficiently documented that you can bring them up from scratch. Once you want an MMU it's either "use our mainlined Linux drivers full of dark magic" (if you are very lucky - perfect if you want Linux, less so if you are developing your own OS), "use our Linux kernel full of binary blobs" (if you are less lucky) or, as a rule, "sign an NDA and don't even bother us if you aren't a billion-dollar corporation".
(Started it as a minor edit, then decided to elaborate so moved it to a new comment).
I think the article is wrong in its core premise. While electrons get added to or removed from the floating gate, the total number of electrons in the SSD chip stays the same. Gates are capacitors: in order to add electrons to one capacitor plate, you have to remove an equal number of electrons from the other plate, i.e. from the transistor channel. The net charge of an SSD chip is always zero. Otherwise it would just go bang. <s>2.43×10^-15</s> [my bad 1] 2.67×10^15 electrons is about 430 µC - that's a lot of charge to separate macroscopically.
Therefore the mass (weight is a different thing, though it is proportional to mass at a given constant gravitational potential) of the data on an SSD isn't fundamentally different from an HDD - both are caused by a change of internal energy without any change in the number of fermions. I'd expect data on an SSD to have a larger mass change, because a charged capacitor always stores more energy than a discharged one, while the energy of magnetic domains is less directional and depends mostly on the state of neighboring domains - but I'm not sure about this part.
> So, assuming the source material is correct and electrons indeed have mass, SSDs do get heavier with more data.
That is definitely wrong! No way the source material has more electrons. The only way it could do that is by being charged.
Richard Feynman, The Feynman Lectures:
"If you were standing at arm's length from someone and each of you had one percent more electrons than protons, the repelling force would be incredible. How great? Enough to lift the Empire State Building? No! To lift Mount Everest? No! The repulsion would be enough to lift a "weight" equal to that of the entire earth!"
See, now, if this was Reddit...this is the opportunity for a yo momma joke. But here we are on HN, so I'll just point out that this is the opportunity for a yo momma joke.
I believe TFA reads 2.43×10^-15 kg, not electrons. Unless SSDs are creating new and exciting physics, one can't have less than one electron, as it's an elementary particle.
> energy of magnetic domains is less directional and depends mostly on the state of neighbor domains
Yes, but it's the same thing. The flux changes on the drive define the bits. It's probably true that a drive storing all 1s or all 0s would be quantitatively (but surely immeasurably) lighter. But in practice a drive storing properly compressed high-entropy data is going to see a flux change every other bit on average. And all of those are regions of high magnetic field with calculable energy density. Same deal as charge in a capacitor, which also stores energy in the field.
Exactly. On top of that, most managed flash (which is equivalent to SSD controllers) passes all writes through a modified cyclic XOR pad in order to keep the /bit/ entropy high. I don’t think the article holds up across multiple abstraction layers.
Which is the same reason storing data to a HDD doesn't add weight. You can pack the data tighter if you are writing basically balanced 1s and 0s. Thus you can pack more bytes into a given area by encoding them into patterns with even distributions even though that means you need to write more bits.
But SSD erasing must write a constant (either one or zero). So an erased ready-to-write SSD block will have consistently different energy than one written with a random scrambled pattern. Same for SMR HDDs - but not for CMR.
What happens when you XOR a 0 with a byte in a one-time pad? You get the corresponding byte from the pad, right? And a block full of 1s or 0s is exactly the case where the scrambler is needed.
It's not for compression or anything like that, it is used because the cells are packed so tightly that sharp changes in energy gradients over the space will even themselves out by donating electrons to their neighbors (flipping bits). It's kind of like rowhammer.
Instead of just a one-time pad, a more complicated and often proprietary scrambler is used. But it's toward the same end: a deterministic, pseudo-random bit sequence.
If you can bypass the controller (read directly from NAND) you will see the true values that it returns, and it will likely be scrambled even if the flash controller reports otherwise. This also allows you to recover the scrambler, since you know that the other side of the XOR operation is 0.
Trim might be an illusion, but block erase is very real, even though it's not exposed as a device command. What I'm trying to say is: writes are scrambled, but erases (not trim, which marks a logical block as collectable, but the physical erase that makes a physical block writable again) aren't, as they happen over several megabytes at once with no way to control individual bits.
Another bit I’m surprised seems to have gotten completely glossed over: there is a deep relationship between _entropy_ and mass which puts bounds on the amount of information you can place in a given volume.
TLDR: a given region of space can’t hold more entropy than a black hole of the same size (the bound actually scales with the bounding surface area, not the volume). Rearranging terms, you find that N bits of information (for large N) has an equivalent black hole size, which in turn has a mass…
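Sketching that rearrangement with the standard Bekenstein–Hawking formulas (k is Boltzmann's constant, the rest as usual):

```latex
S_{\mathrm{BH}} = \frac{k c^3 A}{4 G \hbar},\qquad
A = 4\pi R_s^2,\qquad
R_s = \frac{2GM}{c^2}
\;\Rightarrow\;
S_{\mathrm{BH}} = \frac{4\pi k G M^2}{\hbar c}
```

With N = S/(k ln 2) bits, solving for the mass gives

```latex
M = \sqrt{\frac{N\,\hbar c \ln 2}{4\pi G}}
```

so the limiting mass grows only as the square root of the bit count — utterly negligible at SSD scales, but a hard bound in principle.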
I'd really hope we do live with 4K pages forever. Variable page sizes would make many remapping optimizations (e.g. contiguous ring buffers) much harder to do, so we would need more abstraction layers, and more abstraction layers would eat away all the performance gains while also making everything more fragile and harder to understand. Hardware people really love those "performance hacks" that make life more painful for the upper layers in exchange for a few 0.1%s of speed. You could also probably gain some speed by dropping byte access and saying the minimum addressable unit is now 32 bits. Please don't. If you need a larger L1 cache - just increase associativity.
The extra L1 cache from a 64K page is on its own a ~5-10% perf improvement (and it decreases power use by reducing the number of times you go out to L2).
Funny, most of what you described sums up the Alpha architecture. 8KB pages + huge pages and, initially, only word-addressable memory, no byte access.
(Of course, it only took a few years for this to be rectified with the byte-word extension, which became required by ~all "real software" that supported Alpha.)
It's also one of the only architectures Windows NT supported that didn't have 4KB pages, along with Itanium. I've wondered how (or if?) it handled programs that expect 4KB pages, especially in the x86 translation subsystem.
> There is no one to turn to and bully for compliance
> These attempts at curbing the freedom to write and distribute software are pathetic and will fail.
For Linux it will be way more problematic because:
- A lot of corporate contributions come from SV.
- Linux Foundation is incorporated in CA.
- Linus himself is a CA resident, AFAIR.
So there is zero chance of claiming no jurisdiction. The only hope is that whoever enforces this batshit won't go after what is, for the purposes of the bill, essentially not an OS but an internal component (it would be like going after a vendor of bolts and nuts for a toaster's noncompliance).
It's more likely to be an issue for distributions like Debian, Ubuntu, Red Hat, etc.
Although, if I'm understanding this correctly, I think all they would have to do to comply is ask for the age category during installation and write it to a file that is world-readable but only writable by root, which applications can then read.
That is already way too much as far as I'm concerned. It's not that it's difficult, it's that it's arbitrary and a form of commanded speech or action. Smallness and easiness isn't an excuse.
If you write a story, there must be a character in it somewhere that reminds kids not to smoke. That's all. It's very easy.
I actually don't mind mandating the market take reasonable actions. The EU mandating USB C was an excellent move that materially improved things.
However, I think mandated actions should, to the greatest extent possible, be minimal, privacy-preserving, and have an unambiguous goal that is clearly accomplished. This legislation fails in that regard, because it mandates sharing personal information with third parties where it could instead have mandated queries that are strictly local to the device.
Under no circumstances should we be “mandating” how hobbyists write their software. If you want to scope this to commercial OSes, be my guest. That’s not what was done here.
I'm not sure where the line between "hobby" and "professional" lies when it comes to linux distributions. Many of them are nonprofit but not really hobbyist at this point. Debian sure feels like a professional product to me (I daily drive it).
We regulate how a hobbyist constructs and uses a radio. We regulate how a hobbyist constructs a shed in his yard or makes modifications to the electrical wiring in his house.
I think mandating the implementation of strictly device local filtering based on a standardized HTTP header (or in the case of apps an attached metadata field) would be reasonably non-invasive and of benefit to society (similar to mandating USB C).
> I'm not sure where the line between "hobby" and "professional" lies when it comes to linux distributions. Many of them are nonprofit but not really hobbyist at this point. Debian sure feels like a professional product to me (I daily drive it).
"Professional" means you're being paid for the work. Debian is free (gratis), contributors are volunteers, and that makes it not professional.
What about Ubuntu? It's a combination of work by volunteers and paid employees, it is distributed by a commercial company, and said company sells support contracts, but the OS itself is free.
And there are developers who are paid to work on various components of Linux, from the kernel to Gnome; does that make it professional?
Is Android not professional, because you don't pay for the OS itself, and it is primarily supported by ad revenue?
I would argue they're not, because they're not fully under the responsibility of a commercial entity, since they're open source. Companies can volunteer employees to the project, even a project they started themselves, but the companies and employees can come and go. Open source projects exist independently as public goods. Ultimately, it just takes anyone in the world forking a project to exclude everybody else from its development.
Mint started off as Ubuntu. Same project, with none of the support contracts, no involvement from Canonical needed at the end of the day, etc.
On a practical level, it doesn't make sense to put thousands of dollars per user in liabilities on non-compensated volunteers, whatever the case may be with regard to the employment of other contributors.
At some point it seems to devolve from a meaningful discussion about how things should be done into a semantic argument (which are almost always pointless).
> it doesn't make sense to put thousands of dollars per user in liabilities to non-compensated volunteers
I agree when it comes to individuals. But it probably does make sense to hold formally recognized groups (such as nonprofits) accountable to various consumer laws. I find the idea odd that Windows, RHEL, Ubuntu, and Debian should all be regulated differently within a single jurisdiction, given that they seem to me largely equivalent in purpose.
You've confused and confabulated like 11 different things there. None of what you said has anything to do with either what I said or what the law says.
The way this currently exists is basically unenforceable, because the critical terms are not even defined. It's not even ultimately intelligible, which is a prerequisite to enforcing it, or even to telling where it does and does not apply, and whether some covered entity is or is not in compliance.
And then another state will pass a law mandating scanning of all local images, and another state will want automated scanning of text, and a different country will want a backdoor for law enforcement. We have to stop this here and now.
"Linux" is just the source code to the kernel, pure free speech, and it can't run by itself in order to ask anybody anything. Underage programmers will benefit from the education of reading it.
Stop spreading disinformation. Linus and others did most of the work in the kernel. GNU project on the kernel side was architecture astronaut vaporware aka "Hurd". They were much more successful in userland (coreutils, gcc and the toolchain, gdb, Emacs, to name a few).
I meant the userland specifically. Calling what is fundamentally a GNU system running on a different kernel just "Linux" makes people think Linus and his crew made all of the userland, in part because saying a college student made "an entire operating system" is far more profitable for news agencies than acknowledging his important but overall relatively small role in what they call "Linux".
Because the kernel is the irreplaceable piece. None of what GNU did is: there are numerous implementations of coreutils and shells, at least one non-GNU production-quality compiler toolchain (clang/llvm), and a few alternative libcs. And many distributions do actively use the non-GNU parts. But none of this is useful without a kernel that is compatible with the computers people have. And the only usable kernel we have is Linux (the BSDs are out there too, but they take a much different, tightly-integrated approach to userspace).
To add to this: I can appreciate the significance of GNU, especially in early Linux distributions, but the position of "GNU was the real OS, Linux was just the kernel" is also deceptive, IMO.
Sure, a lot of the userspace was GNU, but a lot of it ... wasn't. Things like PAM, the init system, and the network config tools, off the top of my head. A lot of system-specific tools come from "not-GNU", too.
You can't discount how much of early Linux was "GNU", and how big a deal GCC and GNU libc (and the rest!) were, but it's disingenuous in my opinion to call GNU an "operating system" that you just plugged Linux, the kernel, into. Even today, as far as I can tell, there is still not a true GNU system. Guix comes close, in terms of being "GNU-ish", but the most usable Hurd distro (AFAIK!) is Debian, where, again, a lot of components come from Debian, rather than GNU.
And, as you say, modern systems have drifted even further from being GNU. They have lots of GNU components, but so did, say, the Sprite OS, or a lot of 4.4BSD derivatives.
On that note, one of these days I want to make the GNU system as it was imagined a reality. Perhaps with the Linux kernel as its kernel, maybe with co-official status alongside the Hurd.
Oh, I don't discount that! That's why I find it important to specify GNU/Linux. Not only is it respectful, but it makes the very important distinction that it is a Linux system running a GNU userland instead of a plan9 one or a busybox one. Usually when people speak of "linux" they're referring to GNU/Linux though
> figuring out where in memory or registers a given high level variable
Isn't the task itself Turing-hard? Or at least complex enough that coming up with a non-Turing-complete solution would be impractical?