> We studied 198 randomly sampled, real world failures reported on five popular distributed data-analytic and storage systems, including HDFS, a distributed file system [27]; Hadoop MapReduce, a distributed data-analytic framework [28]; HBase and Cassandra, two NoSQL distributed databases [2, 3]; and Redis, an in-memory key-value store supporting master/slave replication [54]
So, analyzing databases, we picked Java, Java, Java, Java, and one in C. This does not seem very random. I suppose this may provide insight into failure modes in Java codebases in particular, but I'm not sure I'd be in a hurry to generalize.
The best answer to your question is some variant of "write more assembly".
When someone tells me they want to learn programming, for example, I ask them how many programs they've written. The answer is usually zero, and in fact I've never heard an answer greater than 10. Nobody answers with a larger number, because having written that many selects out the people who would even ask the question. If you write 1000 programs that solve real problems, you'll be at least okay. 10k and you'll be pretty damn good. 100k and you might be better than the guy who wrote the assembly manual.
For a fun answer, this is a $20 nand2tetris-esque game that holds your hand through creating multiple CPU architectures from scratch with verification (similar to Prolog/VHDL), plus your own assembly language. I admittedly always end up writing an assembler outside of the game that copies to my clipboard, but I'm pretty fussy about UX and prefer my normal tools.
For anything written in assembly, lack of portability is a given. The only exceptions would presumably be high-level entry points called from C, etc. If you wanted to support multiple targets, you'd need completely separate assembly modules for each architecture at the very least. You'd even need to bifurcate further for each SIMD generation (within x64, for example).
No, they are not. I'm not sure who started this whole "a container is just a process" thing, but it's not a good analogy. Quite a lot of things you spin up containers for have multiple processes (databases, web servers, etc).
Containers are inherently difficult to sum up in a sentence. Perhaps the most reasonable comparison is to liken them to a "lightweight" VM, but the reasons people use them are drastically different from VMs at this point. The most common use case for containers is having a decent toolchain for simple, somewhat reproducible software environments. Containers are mostly a hack to get around the mess we've made in software.
Having multiple processes under one user in an operating system is more akin to having multiple threads in one process than you think. The processes don't share a virtual memory space or file descriptor tables, but by default they do share the same kernel namespaces, PID namespace included, and that's pretty much all process isolation gives you (malware works because process isolation is relatively weak). The container adds a layer that goes around multiple processes: a fresh set of namespaces plus a cgroup for resource control, and the cgroup scheduling/isolation mechanism is very similar to ordinary process scheduling, just with limits attached. The mount namespace additionally gives the container a new root filesystem, and since nearly everything Linux does happens through FDs, a new root filesystem is a very powerful thing to have. That new root filesystem can have a whole new set of libraries and programs in it compared to the host, and that's all it takes to get a completely new-looking computing environment (from the perspective of Python or JavaScript).
A VM, in contrast, fakes the existence of an entire computer, hardware and all. That fake hardware comes with a fake disk on which you put a new root filesystem, but it also comes with a whole lot of other virtualization. In a VM, CPU instructions (e.g. CPUID) can get trapped and emulated by the hypervisor to fake the existence of a different processor, and things like network drivers are completely synthetic. None of that happens with containers. A VM, in turn, needs to run its own OS to manage all this fake hardware, while a container gets to piggyback on the management functions of the host and can then include a very minimal amount of stuff in its synthetic root.
> Having multiple processes under one user in an operating system is more akin to having multiple threads in one process than you think.
Not than I think. I'm well aware of how "tasks" work in Linux specifically, and am pretty comfortable working directly with clone.
Your explanation is great, but I intentionally went out of my way to not explain it and instead give a simple analogy. The entire point was that it's difficult to summarize.
> I'm not sure who started this whole container is just a process thing, but it's not a good analogy. Quite a lot of things you spin up containers for have multiple processes (databases, web servers, etc).
It came from how Docker works: when you start a new container, it runs a single process, as defined in the Dockerfile.
It's a simplification of what containers are capable of and how they do what they do, but that simplification is how it got popular.
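For anyone who hasn't seen it, this is the whole mechanism in a minimal (hypothetical) Dockerfile sketch: one `CMD`, one process.

```dockerfile
FROM alpine:3.19
# `docker run` on this image starts exactly one process: the CMD below
# becomes PID 1 inside the container's pid namespace. Anything else the
# container runs has to be forked from it.
CMD ["sh", "-c", "echo hello from PID $$"]
```

Hence the "a container is a process" shorthand: the image format itself nudges you toward one entrypoint process, even though nothing stops that process from spawning a whole tree.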
Not just the kernel and PID 1: we also tend to refer to the rest of the system as "Linux" as well, even though that's not technically correct. It's very much the same kind of simplification.
> Containers are inherently difficult to sum up in a sentence.
Super easy if we talk about Linux. It's a process tree being spawned inside its own set of kernel namespaces, security measures, and a cgroup, to provide isolation from the rest of the system.
If someone doesn't understand "container", I'm supposed to expect them to understand all the namespaces and their uses, cgroups, and the nitty gritty of the wimpy security isolation? You are proving my point that it's tough to summarize by using a bunch more terms that are difficult to summarize.
Once you recursively expand all the concepts, you will have multiple dense paragraphs, which don't "summarize" anything, but instead provide full explanations.
I'm responding to the implication that "mechanical shit" is local and thus less damaging.
Since they mentioned nukes it seemed like an obvious example where local things can be catastrophic.
The theoretical risk they mention, of electronic systems malfunctioning in some global way, has never resulted in any nuclear weapons being deployed, but we've actually seen the local, mechanical approach they disregard be devastating.
I disagree. The reason humans anthropomorphize "AI" is that we apply our own meta-models of intelligence to LLMs, etc., where they simply don't apply. The model can spit out something that seems extremely intelligent and well thought out, something that would truly be shocking if a monkey said it, and our meta-model of intelligence might be valid in the monkey's case if we determined the output wasn't simply memorized. His argument could certainly be more fleshed out, but the point he's making is correct: we can't treat the output of a machine designed to replicate human input as though it contains the requisite intelligence/"feeling"/etc. to produce that output on its own.
I agree that with current LLMs the error goes the other way; they appear more conscious than they are, compared to, say, crows or octopuses which appear less conscious than they actually are.
My point is that "appears conscious" is really the only test there is. In what way is a human who says "that hurts" really feeling pain? What about Stephen Hawking "saying it"? What if he could only communicate through printed paper, and so on? You can always play this dial-down-the-consciousness game.
People used to say fish don't feel pain, they are "merely responding to stimulus".
The only actual difference, in my view, is that we feel we are so uber special. Beyond that, there seems to be no reason to believe we are anything more than chemical signals. But that strong "feeling" that we are special keeps us from admitting it. I feel like I'm special, I feel like I exist. That's the only argument for being more than something else.
As someone who's been following this since the beginning, the most striking difference is the assumption that Ross was in fact the DPR ordering hits, which he repeatedly denied. Obviously, he could be lying, but that's the main question for me. Since people now assume he was the one and only DPR (I wonder if people didn't get the concept from The Princess Bride), they assume DPR chat logs where murder-for-hire occurred must have been him as well.