Hacker News | nneonneo's comments

A good family friend caught abdominal TB while abroad, came back to Canada, and was terribly misdiagnosed. TB is very rare here; abdominal TB even more so, and early symptoms look like a lot of other diseases. It took several hospital visits before the doctors realized the actual problem. He very nearly died, and spent nearly a year in the hospital recovering from a parade of complications.

I mean, can’t they just train on some huge codebases? There’s lots of 100KLOC codebases out there which would probably get close to 1M tokens.

That's brilliant!

I mean, arithmetic is the same way, right? Nobody should do the arithmetic by hand, as you say. Kindergarten teachers really ought to just hand their kids calculators, tell them they should push these buttons like this, and write down the answers. No need to teach them how to do routine arithmetic like 3+4 when a calculator can do it for them.


I'm not sure you aren't being a little bit sarcastic but essentially that's true.

If kids don't go through the struggle of understanding arithmetic, higher math will be very very difficult. Just because you can use a calculator, doesn't mean that's the best way to learn. Likewise for using LLMs to program.

I have no anecdata to counter your thesis. I do agree that immersion in the doing of a thing is the best way to learn. I am not fully convinced that doing a lot of arithmetic hand calculation precludes learning the science of patterns that is mathematics. They should still be doing something mathematical, but why not go right into using a calculator? I have no experience as an educator and I bet it's hard to get good data on this topic of debate. I could be very wrong.

I'm not an educator but I know from teaching my own children that you don't introduce math using symbols and abstract representations. You grab 5 of some small object and show them how a pile of 2 objects combined with a pile of 3 objects creates a pile of 5 objects.

Remember, language is a natural skill all humans have. So is counting (a skill that may not even be unique to humans).

However, writing is an artificial technology invented by humans. Writing is not natural in the sense that language itself is. There is no part of the brain we're born with that comes ready to write. Instead, when we learn to write, other parts of our brain that are associated with language, hearing, and vision are co-opted into the "writing and reading parts".

Teaching kids math using writing and symbolism is unnatural and often an abstraction too far for them (initially). Introducing written math is easier and makes more sense once kids are also learning to read and write - their brains are being rewired by that process. However, even a toddler can look at a pile of 3 objects and a pile of 5 objects and know which one is more, even if they can't explicitly count them using language - let alone read and write.


There's a wealth of research on how children learn to do math, and one of the most crucial things is having experiences manipulating numbers directly. Children don't understand how the symbols we use map to different numbers and the operations themselves take time to learn. If you just have them use a black-box to generate answers, they won't understand how the underlying procedures conceptually work and so they'll be super limited in their mathematical ability later on.

Can you explain further why you think nobody has tried teaching first graders math exclusively using calculator in the 30 years they've been dirt cheap?

That's after all the implication from your assessment that there would be no good data.


That was sarcastic, because that's wrong. And I cannot conceive how one can think this is a good approach to learning.

And doesn't everyone have smartphones? So why not just use OCR to read things. No need to learn to read. Just use speech recognition and OCR.

FYI: a kernel patch to run exes isn’t needed. binfmt_misc can handle this, and wine-binfmt already exists to automatically run PE files through Wine.


This exploit is just wild. There are just so many little tricks connected together - using multiple image files with unexpected formats, aligning heap chunks to sit on easily-predicted and manipulable addresses, deserializing a huge object graph from image metadata, the usual NSExpression insanity, PAC bypass via unsigned pointers to function-pointer-containing structures, etc. etc. I thought the last exploit (where they built an entire virtual CPU out of image decompression commands: https://googleprojectzero.blogspot.com/2021/12/a-deep-dive-i...) was crazy, but that involved a lot fewer "tricks" than this exploit.

Many of these tricks are non-public, meaning that NSO would have had to spend a huge amount of time and effort researching every single one of these. They probably have many more tricks they know about and haven't used. And, Apple could patch every one of them in a future update and roll back all of that work.

There's a good reason why these exploits are expensive and only sent to a limited number of high-value targets. NSO this time around also worked to "protect their IP" using encryption to hide part of their exploit chain, presumably in a bid to avoid losing yet more of their precious zero-days to researchers.

What they're doing is pretty gross (particularly the whole spying-on-journalists bit), but you have to admit the level of technological sophistication and persistence here is pretty impressive.


A strange choice of words. It's like saying “cannibalism is pretty gross, but the chef outdid himself on those slices”.

Moreover, even if it's complex from the technical point of view, morally it's dead simple: a hired programmer is the same as a dirty grunt with a gun, and the leader delivering speeches, and the rocket engine scientist, and the data processing clerk, and everyone in between. They all serve the Order they believe in, the king of this world.


> It's like saying [...]

"Proof by analogy is fraud" - Bjarne Stroustrup


people do like true crime shows so that's that


“True crime” is just fashionable slang for “pulp fiction” and “tabloid journalism”, so there's nothing new here.


> using multiple image files with unexpected formats,

Unexpected? You mean your JPEG file is not a JPEG? Why not throw an error message then? Why does it (iMessage) have to open every byte thrown at it?


It’s not being processed in the kernel - BlastDoor is a heavily-sandboxed user process. This attack chains together a bunch of exploits - including an encrypted BlastDoor sandbox bypass - in order to gain full control over a device.


And this is one of many reasons why analyses like this provide so much value. It's fascinating that the actual sandbox bypass in this case is so carefully protected. If that sandbox bypass code could be decrypted and reverse-engineered, it could point to the actual vulnerability being exploited, which could then be fixed.


I can already guess without looking that the sandbox is excellent except that oops there's a few edge cases that were MUST FIX, so, we had to cut a small hole in it. The people who understand how difficult this is to do properly built the actual sandbox which will be fine, the ragged hole cut in it (and exploited) was implemented by somebody for whom the sandbox is just another annoying thing they need to work around to get their job done and they don't care that they made it worthless.


I’m sure they care that they spent all that time working around an annoying thing — they don’t want to waste time finding a new workaround when this one is patched. And they’d like to re-use it in other exploit chains, so why not obfuscate it to the best of their ability?

You do highlight an important difference between the mindset of an attacker and a defender. The attacker only needs to be right once, and then they can move onto the next phase of the attack. That’s why they don’t bother “understanding it properly” beyond the perspective of attacking it. A medieval siege unit understands the weak points of castle walls, and how to operate a catapult, but they don’t care about the intricacies of stone masonry beyond what’s necessary for knocking down a wall.


Maybe I was unclear, I think the exploited hole was cut by the vendor.

You hire team A of excellent security people to build you an impenetrable barrier. They do a sterling job and are appropriately rewarded.

Then you realise oops, the new feature we're very proud of can't work with that impenetrable barrier. Hey, Jim, we need the feature to work ASAP, figure out a way to do that without removing the expensive barrier. And so of course Jim cuts a hole in it.


What makes you think this? Typically this has not been the case.


FWIW: this type of bug in Chrome is exploitable to create out-of-bounds array accesses in JIT-compiled JavaScript code.

The JIT compiler contains passes that will eliminate unnecessary bounds checks. For example, if you write “var x = Math.abs(y); if(x >= 0) arr[x] = 0xdeadbeef;”, the JIT compiler will probably delete the if statement and the internal nonnegative array index check inside the [] operator, as it can assume that x is nonnegative.

However, if Math.abs is then “optimized” such that it can produce a negative number, then the lack of bounds checks means that the code will immediately access a negative array index - which can be abused to rewrite the array’s length and enable further shenanigans.
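To make the "abs that can go negative" idea concrete, here is a rough sketch of one way it can happen: an abs that works purely in signed 32-bit integers. The `int32Abs` helper is hypothetical and written only for illustration; it is not taken from any engine's source, and the real bug may have a different root cause.

```javascript
// Hypothetical int32-specialized abs, NOT any engine's real code.
function int32Abs(n) {
  n = n | 0;                     // force the input into the signed 32-bit range
  return (n < 0 ? -n : n) | 0;   // -(-2^31) is 2^31, which wraps back to -2^31
}

const INT32_MIN = -2147483648;
const normal = int32Abs(-5);           // ordinary case: 5
const wrapped = int32Abs(INT32_MIN);   // wraps around and stays negative
const spec = Math.abs(INT32_MIN);      // what the language requires: 2147483648
```

If a compiler swapped in something like `int32Abs` while still assuming the result is never negative, a later `if (x >= 0)` check would be removed even though `x` can be `INT32_MIN`.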

Further reading about a CVE pretty much exactly in this mold (this particular one is in WebKit's JavaScriptCore rather than Chrome): https://shxdow.me/cve-2020-9802/


> which can be abused to rewrite the array’s length and enable further shenanigans.

I followed all of this up until here. JavaScript lets you modify the length of an array by assigning to indexes that are negative? I'm familiar with the paradigm of negative indexing being used to access things from the end of the array (like -1 being the last element), but I don't understand what operation someone could do that would somehow modify the length of the array rather than modifying a specific element in-place. Does JIT-compiled JavaScript not follow the usual JavaScript semantics that would normally happen when using a negative index, or are you describing something that would be used in combination with some other compiler bug (which honestly sounds a lot more severe even in the absence of an unusual Math.abs implementation)?


Normally, there would be a bounds check to ensure that the index was actually non-negative; negative indices get treated as property accesses instead of array accesses (unlike e.g. Python where they would wrap around).

However, if the JIT compiler has "proven" that the index is never negative (because it came from Math.abs), it may omit such checks. In that case, the resulting access to e.g. arr[-1] may directly access the memory that sits one position before the array elements - which could, for example, be part of the array metadata, such as the length of the array.
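A toy model may help here. The sketch below fakes raw memory with an `Int32Array` and puts a made-up length header right before the elements. The layout and the names `heap`, `LENGTH_SLOT`, `ELEMENTS`, `checkedStore` and `uncheckedStore` are all assumptions for illustration; real engines lay out arrays differently.

```javascript
// Toy "raw memory": an Int32Array standing in for the engine's heap.
const heap = new Int32Array(16);
const LENGTH_SLOT = 3;             // header word holding the array length
const ELEMENTS = LENGTH_SLOT + 1;  // first element sits right after the header
heap[LENGTH_SLOT] = 4;             // a 4-element array

// What the engine does when it keeps the bounds check.
function checkedStore(index, value) {
  if (index < 0 || index >= heap[LENGTH_SLOT]) return false;
  heap[ELEMENTS + index] = value;
  return true;
}

// What JIT-compiled code effectively does once the check is "proven" unnecessary.
function uncheckedStore(index, value) {
  heap[ELEMENTS + index] = value;
}

const rejected = checkedStore(-1, 0x7fffffff);   // false: the check catches it
uncheckedStore(-1, 0x7fffffff);                  // lands on the length header
const corruptedLength = heap[LENGTH_SLOT];       // now 2147483647
const farWrite = checkedStore(10, 42);           // passes the check against the bogus length
```

Once the length is corrupted, even the "safe" path accepts indices far past the original four elements, which is the starting point for the further shenanigans.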

You can read the comments on the sample CVE's proof-of-concept to see what the JS engine "thinks" is happening, vs. what actually happens when the code is executed: https://github.com/shxdow/exploits/blob/master/CVE-2020-9802.... This exploit is a bit more complicated than my description, but uses a similar core idea.


I understand the idea of the lack of a bounds check allowing access to early memory with a negative index, but I'm mostly struggling with wrapping my head around why the underlying memory layout is accessible in JavaScript in the first place. I hadn't considered the fact that the same syntax could be used for accessing arbitrary properties rather than just array indexes; that might be the nuance I was missing.


>I followed all of this up until here. JavaScript lets you modify the length of an array by assigning to indexes that are negative?

This is my no doubt dumb understanding of what you can do, based on some funky stuff I did one time to mess with people's heads

Do the following: const arr = []; arr[-1] = "hi"; console.log(arr); and this gives you "-1": "hi"

length: 0

which I figured is because really an array is just a special type of object. (my interpretation, probably wrong)
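For what it's worth, that interpretation matches how the language behaves outside of any JIT bug. A small sketch (variable names are just for illustration):

```javascript
const arr = [];
arr[-1] = "hi";                          // -1 is not a valid array index

const lengthAfter = arr.length;          // still 0: no element was added
const keys = Object.keys(arr);           // the value lives under the string key "-1"
const viaString = arr["-1"];             // same property, reached with a string
const stillArray = Array.isArray(arr);   // it is still a real array
```

So `arr[-1]` is an ordinary property on the array object, and the element storage and `length` are untouched.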

now we can see that the JavaScript Array length is 0, but since the value is findable in there I would expect there is some length representation in the lower level language that JavaScript is implemented in, in the browser, and I would then think that there could even be exploits available by somehow taking advantage of the difference between this lower level representation of length and the JS array length. (again all this is silly stuff I thought and have never investigated, and is probably laughably wrong in some ways)

I remember seeing some additions to array a few years back that made it so you could protect against the possibility of negative indexes storing data in arrays - but that memory may be faulty as I have not had any reason to worry about it.


You raise a good point that JavaScript arrays are "just" objects that let you assign to arbitrary properties through the same syntax as array indexing. I could totally imagine some sort of optimization where a compiler utilizes this to be able to map arrays directly to their underlying memory layout (presumably with a length prefix), and that would end up potentially providing access to it in the case of a mistaken assumption about omitting a bounds check.


yeah you know what you said made me think about these funny experiments that I haven't done in a long time and I remember now yeah, you can do

const arr = []; arr[false] = "hi";

which console.log(arr); - in FF at least - gives

Array []

false: "hi"

length: 0

which means

console.log(arr[Boolean(arr.length)]); returns

hi

which is funny, I just feel there must be an exploit somewhere among this area of things, but maybe not because it would be well covered.

on edit: for example since the index could be achieved - for some reason - from numeric operation that output NaN, you would then have NaN: "hi", or since the arr[-1] gives you "-1": "hi" but arr[0 -1] returns that "hi" there are obviously type conversions going on in the indexing...which just always struck me as a place you don't expect the type conversions to be going on the way you do with a == b;
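The conversions described above can be checked directly. In this sketch (names are illustrative), every key that is not a canonical non-negative integer string ends up as a plain string property, while a key like "1" is treated as a real element:

```javascript
const arr = [];
arr[false] = "hi";           // key becomes the string "false"
arr[0 - 1] = "minus one";    // key becomes "-1"
arr[0 / 0] = "nan";          // key becomes "NaN"

const keys = Object.keys(arr);                 // string keys in insertion order
const length = arr.length;                     // 0: none of these are elements
const viaBoolean = arr[Boolean(arr.length)];   // Boolean(0) is false, so this reads arr["false"]

const other = [];
other["1"] = "x";                              // "1" is a canonical index string
const otherLength = other.length;              // 2: this one is a real element
```

In other words, the conversion happens on the property key: everything is turned into a string (or symbol) first, and only then does the array decide whether that string counts as an index.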

Maybe I am just easily freaked out by things as I get older.


Javascript is the new Macromedia/Adobe Flash.

You can do more and more in it and it's so fun, until it suddenly isn't anymore and dies.


This is after the jit.

I.e., don't think of fancy language shenanigans that do negative indexing, but of a memory access at a negative offset from the beginning of the array.

When there's some inlining, there will be no function call into some index operator function.


For example, if arrays were implemented like this (they're not):

    struct js_array {
        uint64_t length;     /* element count, stored right before the elements */
        js_value *values[];  /* the elements themselves */
    };

This matters because, once the bounds checks have been taken care of, loading an element of a JS array probably compiles to a simple assembly-level load like mov. If you bypass the bounds checks, that mov can read or write any mapped address.


Yeah, I understand all of that. I think my surprise was that you can access arbitrary parts of this struct from within JavaScript at all; I guess I really just haven't delved deeply enough into what JIT compiling actually is doing at runtime, because I wouldn't have expected that to be possible.


Nitpick: “Public” is misspelled as “pubic” in several of the captions on that page.


Maybe realizing those things is the actual test?


Oof it's still there... but yeah typos happen lol


Works right up until you have another incident, go to delete the file, and realize you already deleted it the first time…


You're on the right track!

The length of the cycles mod 10^k is simply Euler's phi function of 5^k: 5^(k-1) * 4 (or a factor of phi(5^k); AFAIK it is always exactly phi(5^k), although I don't have a proof of this handy).

The length of the even subset grows roughly as 2.5^k * 1.6. To see why, consider that the length of the cycle grows by a factor of 5 when incrementing k. Each all-even-digit power mod 10^k leads to 5 numbers mod 10^{k+1} which all share the same last k digits - i.e. their last k digits are even. We can model the k+1'th digit as being random, in which case we expect half of all those new numbers to consist entirely of even digits (one new digit, which is either odd or even, and k digits from the previous round that are all even). Thus, when incrementing k, the number of all-even-digit powers in the cycle will grow by approximately a factor of 2.5.
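If you want to check these numbers for small k, a brute-force sketch like the following should do it. The function name `evenDigitCycleStats` is made up for this example; it walks one full cycle of 2^n mod 10^k starting at n = k (where the sequence becomes periodic) and counts residues whose k digits, zero-padded, are all even.

```javascript
// Brute-force check of the cycle length and the all-even-digit count mod 10^k.
function evenDigitCycleStats(k) {
  const mod = 10n ** BigInt(k);
  const start = (2n ** BigInt(k)) % mod;  // 2^k is the first value on the cycle
  let value = start;
  let period = 0;
  let allEven = 0;
  do {
    // Pad to k digits so leading zeros count as (even) digits.
    const digits = value.toString().padStart(k, "0");
    if (/^[02468]+$/.test(digits)) allEven++;
    value = (value * 2n) % mod;
    period++;
  } while (value !== start);
  return { period, allEven };
}

const stats = [1, 2, 3].map((k) => ({ k, ...evenDigitCycleStats(k) }));
```

For k = 1, 2, 3 the period should come out as 4, 20, 100 (that is, 4 * 5^(k-1)) and the all-even count as 4, 10, 25, which agrees with 1.6 * 2.5^k for these values. For larger k, 1.6 * 2.5^k is no longer an integer, so the count can only track it approximately.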

