
Competent companies tend to put a lot of effort into building data analysis tools. There will often be A/B or QRT frameworks in place that allow deployment of two models side by side, for example, a new deep learning model and the old rule-based system. Model performance is tracked through many offline and online metrics, and by combining those metrics with experiment results one can begin to make statements about the impact of model performance on revenue. So people can and do say things like "if this model is x% more accurate, that translates to $y million in monthly revenue" with great confidence.
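
To make that back-of-the-envelope arithmetic concrete, here is a minimal sketch (every number and name below is made up for illustration, not taken from any real company):

    # Hypothetical sketch: translating an A/B experiment on a new ranking model
    # into an estimated monthly revenue impact. All numbers are invented.
    control_rev_per_1k = 120.00       # $ per 1k requests, old rule-based system
    treatment_rev_per_1k = 121.80     # $ per 1k requests, new deep learning model
    monthly_requests = 2_000_000_000  # traffic the model serves each month

    lift = (treatment_rev_per_1k - control_rev_per_1k) / control_rev_per_1k
    monthly_impact = (treatment_rev_per_1k - control_rev_per_1k) * monthly_requests / 1_000

    print(f"Relative lift: {lift:.2%}")                         # ~1.50%
    print(f"Estimated monthly impact: ${monthly_impact:,.0f}")  # ~$3,600,000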

Let's call someone working at such a company Bob.

A restatement of your claim is that Bob decided to launch a model to production because of hype rather than because he could justify his promotion by pointing to the millions of dollars in increased revenue his switch produced. Bob of course did not make his decision based on hype. He made it because there were evaluation criteria in place for the launch. He was literally not allowed to launch things that didn't improve the system according to those criteria. Since Bob didn't want to be fired for not accomplishing anything, he used the tool that actually improved the evaluation under the criteria that were specified. Hype might provide motivation to experiment, but it doesn't justify a launch.

I say this as someone who has literally seen transitions from decision trees to deep learning models, on models with fewer than 100 features, that had multi-million-dollar monthly revenue impacts.


The article claims, as part of its argument, that AI has not had algorithmic advances since the 80s. That premise is plainly false and a common misconception among the ignorant. It would actually be fairer to say that every aspect of neural network training has seen algorithmic advances than to say that no advances have been made.

Here is a quote from OpenAI's "AI and Efficiency" research, which is directly related to this subject:

> Compared to 2012, it now takes 44 times less compute to train a neural network to the level of AlexNet (by contrast, Moore’s Law would yield an 11x cost improvement over this period). Our results suggest that for AI tasks with high levels of recent investment, algorithmic progress has yielded more gains than classical hardware efficiency.
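
As a quick sanity check of those figures (assuming, on my part, a two-year doubling time over the 2012-2019 window; the quote doesn't state its exact assumptions):

    # Rough check of the quoted numbers. Assumptions: AlexNet in 2012,
    # measurement around 2019, Moore's Law modeled as a doubling every ~2 years.
    years = 2019 - 2012
    moores_law_gain = 2 ** (years / 2)  # ~11.3x expected from hardware trends alone
    algorithmic_gain = 44               # quoted reduction in compute to match AlexNet

    print(f"Moore's Law counterfactual: ~{moores_law_gain:.1f}x")
    print(f"Quoted algorithmic gain: {algorithmic_gain}x, "
          f"~{algorithmic_gain / moores_law_gain:.1f}x beyond the hardware trend")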

When you apply the principle of charity you can make their claim increasingly vacuous until it is eventually true: we're still doing optimization, we're still within the same general structure. The thing is, the claim becomes absurd at that point, and it's not appropriate to take such a premise seriously. It would be like taking seriously the argument that there has been no advancement in software engineering since bubble sort, because when we sort numbers we're still in the business of sorting numbers.

It's like, okay, sure, we're still sorting numbers, but that doesn't make the wider point the article wants to make, and it's false even under the framing it wants to make the point under.

This isn't even the only problem with the premise. For one, AI research in the 80s wasn't centered on neural networks. Hell, even if you move forward to the 90s, PAIP puts more emphasis on rule systems, with programs like Eliza and Student, than on learning from data. So it isn't as if we're stagnating without advances; we moved off other techniques and onto the ones that worked. For another, the article myopically narrows AI research progress down to particular instances of deep learning, when in reality there are a huge number of relevant advances that just don't happen to ship in publicly available chatbots but are already in the literature and force a broader view. These matter to LLMs too, because you can take the output of a game solver and use it as conditioning data for an LLM. This was done in the Cicero paper, and the resulting AI outperformed humans at a conversational game as a consequence. All of those advances are therefore relevant to the discussion, yet they are myopically removed from the context despite being counterexamples. And among them we find algorithmic improvements even greater than 44x; in some cases improvements so great they might as well be infinite, since the previous techniques could never have worked no matter how long they ran, and now approximations can be computed practically.


From the Hacker News guidelines:

> Be kind. Don't be snarky. Converse curiously; don't cross-examine. Edit out swipes.

> Comments should get more thoughtful and substantive, not less, as a topic gets more divisive.

> When disagreeing, please reply to the argument instead of calling names. "That is idiotic; 1 + 1 is 2, not 3" can be shortened to "1 + 1 is 2, not 3."

> Please don't fulminate. Please don't sneer, including at the rest of the community.

> Please don't post shallow dismissals, especially of other people's work. A good critical comment teaches us something.

> Throwaway accounts are ok for sensitive information, but please don't create accounts routinely. HN is a community—users should have an identity that others can relate to.

You can read the full guidelines here: https://news.ycombinator.com/newsguidelines.html


As laymen's definitions are incoherent nonsense derived from fiction, the popular-culture definition of AI isn't a reasonable substitute for the theory-laden definitions. The four definitions given in Artificial Intelligence: A Modern Approach all substantiate the claim that LLMs are AI. So not only are we not done calling LLMs AI; it would be incorrect to claim that LLMs are not AI.


> LLMs do not constitute "AI" let alone the more rigorous AGI.

I have a textbook, "Artificial Intelligence: A Modern Approach," which covers Language Models in Chapter 23 (page 824) and the Transformer architecture in the following chapter. In any field technical terms emerge to avoid ambiguity. Laymen often adopt less accurate definitions from popular culture. LLMs do qualify as AI, even if not according to the oversimplified "AI" some laymen refer to.

For the last several decades it has been argued that every advance which counted as an AI advance according to AI researchers and AI textbooks was not in fact AI. This is because laymen have a stupid definition of what constitutes AI. It isn't that the field hasn't made any progress; it's that people outside the field lack the vocabulary to make coherent statements about it, because their definitions are incoherent nonsense derived from fiction.

> They are a GREAT statistical parlor trick for people that don't understand statistics though.

The people who believe that LLMs constitute AI in a formal sense of the word aren't statistically illiterate. AIMA covers statistics extensively: chapter 12 is on Quantifying Uncertainty, 13 on Probabilistic Reasoning, 14 on Probabilistic Reasoning Over Time, 15 on Probabilistic Programming, and 20 on Learning Probabilistic Models.

Notably, in some of these chapters probability is proven to be optimal and sensible; far from being a parlor trick, it can be shown with mathematical rigor that failing to abide by its strictures is suboptimal. The ontological commitments of probability theory are quite reasonable; they're the same commitments logic makes. That we model accordingly isn't a parlor trick, but a reasonable and rational choice, with Dutch book arguments proving that failing to do so leads to guaranteed regret.
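
To give a flavor of that kind of "sure loss" argument, here is a toy Dutch book sketch (my own illustrative numbers, not from AIMA): an agent whose credences violate the axioms will accept a set of bets it considers fair and yet is guaranteed to lose money.

    # Toy Dutch book: an agent assigns P(A) = 0.6 and P(not A) = 0.6,
    # which sum to 1.2 and violate the probability axioms. At those odds
    # the agent regards both of the following $1 tickets as fairly priced.
    p_A, p_not_A = 0.6, 0.6
    stake = 1.0                           # each ticket pays $1 if its event occurs
    cost = p_A * stake + p_not_A * stake  # agent pays $1.20 for the pair

    for A_happens in (True, False):
        payout = stake                    # exactly one ticket pays out either way
        print(f"A = {A_happens}: paid {cost:.2f}, received {payout:.2f}, "
              f"net {payout - cost:+.2f}")
    # Net is -0.20 in both cases: a guaranteed loss that coherent
    # (axiom-satisfying) credences would never permit.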


I've been in a self-driving Tesla. After hours on the interstate, the driver ahead of me suddenly slammed on their brakes. I was caught off guard and might have crashed by not reacting in time. The Tesla braked. So I have anecdotal experience suggesting that the person you're asking isn't well informed about how Teslas respond to this type of situation.

Of course, anecdotal evidence isn't a very high standard. Thankfully, statistics on this sort of thing are tracked. Statistically, Tesla's self-driving features reduce accidents per mile. They have for years now, and the reduction has grown as the technology has matured. So the statistical evidence also indicates that the person you're asking is uninformed.

What is probably happening is that it makes for good clickbait to pull Elon and Tesla into discussions. Moreover, successful content online often provokes emotion. The resulting preponderance of negativity, especially around each accident a Tesla was involved in or caused, probably tricked them into misunderstanding the reality of Tesla's safety record.


> Statistically, the Tesla self-driving features reduce accidents per mile

While that is the claim, I've never seen an independent analysis of the data. There are reasons to believe that Tesla drivers are not average. I don't know which claims are true, which is why I want independent analysis of the data, so that factors I didn't think of can be controlled for.


My Subaru from 2018 can do this. It's not rocket science, and most cars nowadays have a collision detection system. This is not a self-driving capability by any means.


90% of new cars can do this, it's called AEB. It's not a Tesla self driving feature.


He is talking about the long technical talk at CVPR.

https://www.youtube.com/watch?v=g6bOwQdCJrc

For example look at this section: https://youtu.be/g6bOwQdCJrc?t=1370

Also, if you look into it, you will find that Google's self-driving car project ran into similar issues. They upgraded the radar in response rather than removing it, but the problems are real.

Vision-only was better than radar plus vision in their stack. For real-world situations that frequently come up in driving, radar wasn't worth it. The theoretical accuracy of sensor fusion didn't materialize in practice, because in practice you needed vision to tell you when the fusion wasn't going to be accurate; it isn't the naive sensor fusion of toy math models. So when there was a disagreement you needed to go with vision, yet vision was already able to predict both the failure of radar and radar's actual result. In theory, you could have done the more complicated sensor fusion. In practice, the relative advantage of improving other parts of the vision stack far exceeded that of improving the fusion. Over time you would expect that relative advantage to shift as other improvements were made, until this was no longer true. That doesn't mean the decision wasn't correct at the time; it just means people who evaluate it with future changes already in place are suffering from hindsight bias and committing an anachronistic fallacy.
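
To make the gating idea concrete, here is a toy sketch of the principle as I understand it (this is my illustration, not a description of Tesla's actual stack; the function and numbers are hypothetical):

    # Toy "vision decides when to trust the fused estimate" sketch.
    def fused_range(vision_range_m: float,
                    radar_range_m: float,
                    radar_reliable: bool,
                    radar_weight: float = 0.5) -> float:
        """Estimate distance to the lead vehicle.

        radar_reliable is a vision-derived judgment (e.g. "that radar return
        is an overpass, not a stopped car"); when False, ignore radar.
        """
        if not radar_reliable:
            return vision_range_m  # disagreement: go with vision
        return (1 - radar_weight) * vision_range_m + radar_weight * radar_range_m

    # Radar reports a strong return at 40 m from an overpass while vision sees
    # open road to 120 m and flags the return as spurious:
    print(fused_range(vision_range_m=120.0, radar_range_m=40.0, radar_reliable=False))  # 120.0
    print(fused_range(vision_range_m=118.0, radar_range_m=120.0, radar_reliable=True))  # 119.0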


Yeah, Google actually made the right call. Thinking that Tesla made a stupid and dangerous call isn't "hindsight bias", their vision-only approach has always been inferior and reckless.


The practical impact of the change you call stupid and reckless was that the car stopped slamming on its brakes when passing under overpasses. So when it comes down to it, what you are claiming is that it was stupid and reckless to decide not to slam on the brakes.

Look - I get it. I too thought radar sensor fusion would improve the results. Guess who else thought this? The people you consider stupid and reckless thought this. Do you know why they changed their mind? According to you, because they are stupid and reckless.

I can't stress this enough: you are telling yourself inaccurate stories about what happened. If you just /look/ at the /measured impact/ instead of /wrongly guessing what Karpathy thought/ and /wrongly guessing what Karpathy has said/ and /wrongly guessing what other people were referencing when they talked about what Karpathy said/ then you would realize this.


Haha, it’s funny that you circled back a day later to leave another comment lambasting me.

It’s so naive to just take Tesla’s word for all of this, and to believe that they took this approach because they decided it was the safest and most effective. Meanwhile, other credible teams out there have not gone with vision-only, and as a result their systems aren’t a pile of dangerous dogshit that are killing customers. But Tesla is incredibly arrogant and reckless, and they just keep doubling down.

Vision-only is the wrong choice, it won’t get us to L4, and that should have been pretty easy to see from the start.


Here is my point through the filter of ChatGPT asked to state it nicely. I'm sorry if it comes off as rude.

If you propose a theory that overconfident individuals are taking reckless actions and compromising safety, it's crucial to treat that theory as a hypothesis and derive testable predictions from it. For instance, if your theory is that adopting vision-only technology leads to more accidents, this should show up in the accident rate per mile. However, current data indicates a decrease in accidents per mile, which, while not conclusively disproving your theory, is strong evidence against it.

Dismissing such reasoning by claiming naivety on the part of Tesla supporters is unconvincing. As a Tesla owner, I have experienced the car's safety features firsthand. For example, I had noticed the vehicle slow down when passing under overpasses, and this improved after the update, making it safer.

Furthermore, it's essential to consider the full range of evidence available, not just a specific instance. One such piece of evidence is Karpathy's CVPR talk, which demonstrates accuracy improvements with video evidence. It's challenging for your argument when you accept Karpathy's comments only when they align with your intuition but dismiss them when they contradict it, especially when the contradicting comments are supported by video and metrics while the supposed alignment with your intuition rested on a misunderstanding of Karpathy on your part.

Additionally, you're overlooking evidence from sources not affiliated with Tesla. For instance, safety assessments by NHTSA and regulatory agencies in other countries consistently rank Teslas among the safest cars. While the electric design contributes to this, it remains a problem for your argument, as Tesla's top safety rankings are confirmed by multiple independent bodies. When theorizing about reckless overconfidence, receiving accolades for safety doesn't support the idea that these individuals are taking actions that endanger others.


It is hindsight bias.

You are explaining Tesla's top safety record by appealing to them being inferior and reckless. You are explaining Tesla producing a greater reduction in accident deaths than Waymo by their being more inferior and reckless than Waymo.

You are not explaining the evidence with your theory.

Your theory doesn't even reflect reality. Google invested in Tesla and in Waymo. Google did both things. Google didn't make the right call. They made both calls.


Yep - this is the one I was talking about, thanks for linking!


There are millions of lines of code in existence, and thousands more added. Your prior for the code being Elon's addition should be something like 10,000/1,000,000, or roughly 1/100. The prior that it wasn't Elon's change is then something like 99/100.

When you add the additional information that Elon wants the code removed while existing Twitter engineers think it appropriate to keep, this actually increases the probability that it was added by the existing Twitter engineers and decreases the probability that it was added due to Elon.

Obviously, these are rough numbers, but hopefully seeing any numbers at all helps you to get an intuition for the math.
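
Spelling that intuition out as a toy Bayes update (every number below is made up; H is the hypothesis "the special-casing was added at Elon's direction"):

    p_H = 10_000 / 1_000_000  # prior: his directed changes are ~1% of the code, as a rough guess

    # Evidence E: Elon asks for the code to be removed while existing engineers
    # argue to keep it. Assume (invented likelihoods) this is much more likely
    # if the engineers added it on their own than if he directed it.
    p_E_given_H = 0.10
    p_E_given_not_H = 0.60

    posterior = (p_E_given_H * p_H) / (p_E_given_H * p_H + p_E_given_not_H * (1 - p_H))
    print(f"Prior P(H) = {p_H:.3f}")              # 0.010
    print(f"Posterior P(H|E) = {posterior:.4f}")  # ~0.0017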


Why is lines of code the appropriate input here? Here's a different computation that is at least as plausible:

There are hundreds of millions of users. Let's say 300M. Only a single one is special-cased in this code: the narcissistic CEO who reportedly went ballistic when his engagement metrics went down. The prior that it's a change done in response to his demands is 299,999,999/300,000,000.

(But of course it was added by existing Twitter engineers. The odds of Musk being able to actually commit code to their repository are zero. Even if he had the permissions, the man simply does not have the technical acumen to make even a trivial change.)


I think you are right, my estimate is much much too low.

I'll explain my mistake and why I made it.

I think I chose the estimate I did because Elon claimed he didn't know about the code. The prior probability of not knowing about something in a code base with millions of lines is very high, but conditional on his involvement in the change, the probability that he is aware of it is much higher. So I started from the estimate that I thought better predicted the production of evidence claiming he didn't know.

Your point does raise my estimate substantially, but I think it probably raises it less than you would expect. I don't agree with your 1/300M prior, because I'm aware that hot users get special treatment. I've seen Elon's account used in interview-style questions about hot users before, as an example of a hot account that needs special treatment. This is something I've witnessed, and it happened before Twitter was acquired, so it wasn't contingent on the acquisition.

I also don't assign particularly high odds to him wanting it, given that he implicitly claimed not to want it by asking for it to be removed. It doesn't seem appropriate to arrive at near-certainty that he wanted it when the evidence is that he stated he didn't. In my view there isn't a compelling reason for him to lie about this. He owns Twitter, so if he wanted his account monitored, that would be a reasonable thing well within his authority. If he wanted it, he doesn't need to pretend not to want it in order to appease someone.

It does seem to me that the odds that the change was added in response to someone thinking he wanted it are much higher than 1%.


> I don't agree with your 1/300M prior, because I'm aware that hot users get special treatment.

That's absolutely fair, and 1/300M was a reductio ad absurdum rather than a serious proposal. Not all users are equal, just like not all lines of code are equal :)

I have a few issues with the "hot user" theory, but they all boil down to the same point: no matter what the use case, you'd never want to do this with a single static user.

Does your infra require special-casing for accounts with more than 100M followers? That should be a flag in the account properties that gets flipped manually or automatically: if these users cause infra problems, you really don't want to be making code changes + full rollouts whenever a new user becomes hot.

Is this just a guard-rail metric, to make sure there's not some bug specifically affecting hot users that tanks their engagement? You'd want a much larger static set than a single account just to ensure there's a large enough number/variety of tweets to compute metrics from. A single user might take a break for a week, or might only be posting very specific kind of content for an extended period of time.

In any case, even if you chose to do this with a single user rather than a set of users, why would Musk be the obvious single choice? He wasn't the most followed Twitter account until two days ago. A year ago there must have been at least a couple of dozen accounts roughly as notable as Musk. The odds of him having been chosen as the special case still would not be very high.

> In my view there isn't a compelling reason for him to lie about this.

The reason to lie about this is that it makes him appear weak, needy, and a target of even more mockery. Given the purchase of Twitter seems to have been a vanity project, having this be exposed and leaving it in goes directly against his apparent goal.


> The reason to lie about this is that it makes him appear weak, needy, and a target of even more mockery. Given the purchase of Twitter seems to have been a vanity project, having this be exposed and leaving it in goes directly against his apparent goal.

I think it only makes sense to reason the way you are if you've adopted equilibrium assumptions; if you haven't, then I find this sort of reasoning to be a conjunction fallacy that leads to epistemic closure.


This is a weird nitpick that doesn’t attack the argument at all. It’s a dense codebase, but the point holds under whatever measure you personally prefer.


JoshCole is doing a computation to arrive at the conclusion that there's a <1% chance that this code was added after Musk bought Twitter. I'm using the same methodology with at least equally plausible inputs to arrive at there being a >>99% chance of it.

How is that a nitpick? They're diametrically opposite results.


I'm JoshCole and I didn't find your reply to be a nitpick; you are right that the probability ought to be higher than 1%. My calculation was simplistic, and I felt it was prudent to arrive at a low probability, because I think the probability that someone wanted something, given that they asked for it to be removed, shouldn't be anywhere near certain. My estimate isn't 1%, though; it was just a short sketch meant to give an intuition for why it might be reasonable to assume he didn't know about or want it.

In my opinion if you really care about this topic the right thing to do is ask someone at Twitter when the change was made. Getting more information would make us converge on the true estimate faster than arguing the odds IMO. Feel free to update me with the results if you do end up doing that so I can adjust my beliefs accordingly. I'm not going to try to gain this information, because I don't think the question matters much.


Are car parts car parts? Not according to an auto mechanic, but yes according to the layman. A radiator is not a battery or an engine. Are games games? Not according to a game theorist, but yes according to the layman. A game is not a play or a history.

This isn't an accident of language. An example of an actual accident of language would be giving tanks instead of giving thanks.

Are runners runners? Yes, according to you. A walker is a runner is a missile is a bowling ball rolling between places is light moving through a medium. No, according to a fitness coach, because a runner is not a tank is not a plane. When they say that a person should take up running they don't mean the person should melt down their body in a furnace and sprinkle their atoms into metal which is then pressed into iron plates that are attached to a tank which will then go running.

Sometimes we need to be careful in language. For example, we probably don't want to confuse the process of being incinerated and pressed into iron plates with the process of a human exercising their muscles. The choice to be careful in this way is not an accident of language. It is a very deliberate thing when, for example, John von Neumann carefully explains why he thinks the lay use of the word "game" has a perilous impact on our ability to think about game theory, the field he founded in his book on the subject.

I think you should make your point so as to disprove Neumann, not pick on the straw man of running. Or you should argue against the use of the term radiator instead of car parts. That would better highlight your fallacy, because with running I have to make your position seem much more farcical than it is. We do gain something from thinking imprecisely: we gain speed. That can really get our thoughts running, so long as we don't trip up. But it calls attention to the fact that when someone chooses to stop running because they claim the terrain isn't runnable, the correct response is not to tell them that running is an accidental property; it is to be careful as you move over the more complicated terrain. Otherwise you might be incinerating yourself without noticing your error.


>This isn't an accident of language. An example of an actual accident of language would be giving tanks instead of giving thanks.

By "Accident of language" I don't mean "slip of the tongue" or "mistake when speaking".

I mean that the word we use to describe someone who runs, "runner", is an accidental, not essential, property of English, and can be different in other languages. It doesn't represent some deeper truth beyond being a reflection of the historical development of English vocabulary. I mean it's contingent in the sense the term is used in philosophy: "not logically necessary".

Not just in its sounds (which are obviously accidental, different languages can have different sounds for a word of the same meaning), but also in its semantics and use, e.g. how we don't call a car a "runner".

That we don't call it that doesn't express some fundamental truth, it's just how English ended up. Other languages can very well call both a car and a running man the same thing, and even if they don't for this particular case, they do have such differences between them for all kinds of terms.

> I think you should make your point so as to disprove Neumann, not pick on the straw man of running.

I'm not here to disprove Neumann. I'm here to point out that Lanier's argument based on the use of "runner" doesn't contribute anything.


> I'm not here to disprove Neumann.

You are arguing, on the basis of the possibility of imprecision in language, that the choice to be more precise does not contribute anything. That structure, whether you want it to or not, applies as a direct consequence of logic to every thinker who ever argued for precision because of the possibility of ambiguity. It is an argument against formal systems, programming languages, measurement, and more. Some of the time your conclusion will turn out to be true; other times it will not. So the argument structure itself is invalid: your conclusion does not follow from your premises.

Try your blade - your argument structure - against steel rather than straw. I saw you slice through straw with it. So I picked up the blade after you set it down and tried to slice it through steel. The blade failed to do so. The blade is cheap, prone to shattering, and unsuited for use in a serious contest between ideas.

For what it is worth, I do happen to agree with you that Lanier is making a mistake here; I think it lies in a logical equivalence mismatch. He wants intelligence to be comparable to running, not to motion more generally. But since intelligence is actually more comparable to compression, we can talk about different implementations of the process using terms like artificial or natural intelligence without being fallacious, for much the same reason we can talk about different compression algorithms and still be talking about compression. So instead of arguing from his distinction between motion in general and motion in humans, I would point to the existence of cheetah runners versus human runners for the contradiction. Directly contradicting his insinuation is the fact that we actually do say cheetahs are faster runners than humans.


It might be worth brushing up on computationally reducible and computationally irreducible phenomena. Breaking deterministic systems into these constituent parts lets you form conjectures about intelligent agents. Critically, it shows that arguments from an observed inability to successfully model the self are evidence for, not against, the presence of an intelligent agent. The enlightenment view misattributes the evidence for agents as evidence against agents.

An anti-enlightenment koan could be: The student came to the master and asked, “Why are tigers green?” The master responded, “The deer they are hunting can’t see orange.” The student then asked, “It is not therefore it is? How mysterious and inscrutable your answers!” But from that moment onwards the master was de-enlightened.

