travisjungroth's comments | Hacker News

> You can literally model this.

Show me.


What would you like to see?

At this point, people are even modeling figures on Ancient Greek pottery to determine the biomechanical merit of their fighting stances: https://www.mdpi.com/2075-4663/12/12/317

The same or similar techniques, of course, can apply to any combination of fighters (or dancers, or swimmers, etc.) at any particular moment. At the highest levels of sport, biomechanics analysts are employed, e.g.: https://pubmed.ncbi.nlm.nih.gov/34402417/

In any case, I don't think that I made any extraordinary claims. There are a lot of unknowns, though, as the most valuable analyses tend to be extremely computationally demanding.


> What would you like to see?

A model that shows the optimal move for a fighter at any point in time.

You don’t actually have this. It can maybe be theoretically done, but not in practice.


It can assuredly be done in practice, with currently available technology. It would, however, be very expensive and time-consuming.

I'm thinking of putting together a set of general biomechanical models for foil or kendo fencing. Both forms feature a highly constrained ruleset, which simplifies things. Hobby project, though, so maybe one of these days...


I don't know if this was your intention, but I read this in the voice of Morpheus.


Completely agree. The sign up flow for your startup does not need the same rigor as medical research. You don’t need transportation engineering standards for your product packaging, either. They’re just totally different levels of risk.

I could write pages on this (I’ve certainly spoken for hours) but the adoption of a scientific research mindset is very limiting for A/B testing. You don’t need all the status quo bias of null hypothesis testing.

At the same time, it’s quite impressive how people are able to adapt. An organization experienced with A/B testing will start doing things like multivariate correction in their heads.

For anyone spinning this stuff up, go Bayesian from the start. You’ll end up there, whether you realize it or not. (People will look at p-values in consideration of prior evidence).

0.05 (or any Bayesian equivalent) is not a magic number. It’s really quite high for a default. Harder sciences (the ones not in replication crisis) use much stricter values by default.

Adjust the confidence required to the cost of the change and the risk of harm. If you’re at the point of testing, the cost of change may be zero (content). It may be really high, it may be net negative!

But in most cases, at a startup, you should be going after wins that are way more impactful and end up having p-values lower than 0.05, anyway. This is easy to say, but don’t waste your time coming up with methods to squeeze out more signal. Just (just lol) make better changes to your product so that the methods don’t matter. If p=0.00001, that’s going to be a better signal than p=0.05 with every correction in this article.

If you’re going to pick any fanciness from the start (besides Bayes), make it anytime-valid methods. You’re certainly already going to be peeking (as you should), so have your data reflect that.
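Going Bayesian from the start can be as simple as Beta-Binomial conjugacy. A minimal sketch of comparing two variants this way (the function name and the conversion counts are invented for illustration):

```python
import random

def prob_b_beats_a(conv_a, n_a, conv_b, n_b, samples=100_000, seed=0):
    """Monte Carlo estimate of P(rate_B > rate_A) under uniform Beta(1, 1) priors."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(samples):
        # Each arm's posterior is Beta(conversions + 1, non-conversions + 1)
        a = rng.betavariate(conv_a + 1, n_a - conv_a + 1)
        b = rng.betavariate(conv_b + 1, n_b - conv_b + 1)
        wins += b > a
    return wins / samples

# Illustrative numbers: 120/1000 vs 150/1000 conversions
print(prob_b_beats_a(120, 1000, 150, 1000))
```

The output reads directly as "probability B is better," which is the statement stakeholders actually want, rather than a p-value that gets misread as one.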


> You don’t need all the status quo bias of null hypothesis testing.

You don't have to make the status quo be the null hypothesis. If you make a change, you probably already think that your change is better or at least neutral, so make that the null. If you get a strong signal that your change is actually worse, rejecting the null, revert the change.

Not "only keep changes that are clearly good" but "don't keep changes that are clearly bad."
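That flipped-null policy can be sketched with a one-sided two-proportion z-test; the function name, counts, and threshold here are illustrative, not a standard API:

```python
from math import sqrt
from statistics import NormalDist

def should_revert(conv_old, n_old, conv_new, n_new, alpha=0.05):
    """Null hypothesis: the new variant is at least as good.
    Revert only on strong evidence that it is actually worse."""
    p_old = conv_old / n_old
    p_new = conv_new / n_new
    pooled = (conv_old + conv_new) / (n_old + n_new)
    se = sqrt(pooled * (1 - pooled) * (1 / n_old + 1 / n_new))
    z = (p_new - p_old) / se
    p_value = NormalDist().cdf(z)  # left tail: "new is worse"
    return p_value < alpha

# Revert on clear harm, not on mere absence of a win
print(should_revert(150, 1000, 110, 1000))
```

The asymmetry is the point: a flat result keeps the change, and only a clearly bad result rolls it back.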


This is a reasonable approach, particularly when you’re looking at moving towards a bigger redesign that might not pay off right away. I’ve seen it called “non-inferiority test,” if you’re curious.


Especially for startups with a small user base.

Not many users means that reaching statistical significance will take longer (if it happens at all).

Sometimes you just need to trust your design/product sense, assert that some change you’re making is better, and push it without an experiment. Too often people use experimentation for CYA reasons, so they can never be blamed for making a misstep.


100% this. I’ve seen people get too excited to A/B test everything even when it’s not appropriate. For us, changing prices was a common A/B test when the relatively low number of conversions meant the tests took 3 months to run! I believe we’ve moved away from that, now.

The company has a large user base, it’s just SaaS doesn’t have the same conversion # as, say, e-commerce.


The idea that you should be going after bigger wins than 0.05 misses the point. The p-value is a function of the effect size and the sample size. If you have a big effect you’ll see it even with small data.

Completely agree on the Bayesian point though, and the importance of defining the loss function. Getting people used to talking about the strength of the evidence rather than statistical significance is a massive win most of the time.


> If you have a big effect you’ll see it even with small data.

That’s in line with what I was saying so I’m not sure where I missed the point.

The p-value is a function of effect size, variance, and sample size. Bigger wins would be those that have a larger and more consistent effect, scaled to the number of users (or just get more users).
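That relationship is easy to make concrete with a z-test for a mean shift (the numbers are illustrative):

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p(effect, sd, n):
    """Two-sided p-value for a mean shift: z grows with effect and n, shrinks with sd."""
    z = effect / (sd / sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Same effect size and variance, growing sample: p shrinks
for n in (25, 100, 400):
    print(n, two_sided_p(0.5, 2.0, n))
```

Holding the sample fixed and doubling the effect has the same direction of impact, which is why "bigger wins" and "smaller p-values" go together.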


> But in most cases, at a startup, you should be going after wins that are way more impactful and end up having p-values lower than 0.05, anyway.

This was the part I was quibbling with. The size of the p-value is pretty much irrelevant unless you know how much data you are collecting. The p-values might always be around 0.05 if you know the effects are likely large and powered the study appropriately.


There are multi armed bandit algorithms for this. I don’t know the names of the public tools.

This is especially useful for something where the value of the choice is front loaded, like headlines.
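Thompson sampling is the textbook example of such a bandit; a self-contained sketch (the click-through rates are invented):

```python
import random

def thompson_choose(successes, failures, rng):
    """Pick the arm with the highest draw from its Beta posterior."""
    draws = [rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)]
    return max(range(len(draws)), key=draws.__getitem__)

def run_bandit(true_rates, rounds=5000, seed=0):
    """Simulate clicks and let the bandit allocate traffic as evidence accumulates."""
    rng = random.Random(seed)
    k = len(true_rates)
    successes, failures = [0] * k, [0] * k
    for _ in range(rounds):
        arm = thompson_choose(successes, failures, rng)
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures

# Invented rates; the best arm should end up absorbing most of the traffic
s, f = run_bandit([0.10, 0.20, 0.40])
print([si + fi for si, fi in zip(s, f)])
```

Because allocation shifts toward the winner while the experiment runs, most of the value is captured early, which is exactly what you want for short-lived content like headlines.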


It changes every time. You can also just have it at night, which I have. Prevents drunk wrong riders.


Copied from a past comment of mine:

I just made up this scenario and these words, so I'm sure it wasn't in the training data.

Kwomps can zark but they can't plimf. Ghirns are a lot like Kwomps, but better zarkers. Plyzers have the skills the Ghirns lack.

Quoning, a type of plimfing, was developed in 3985. Zhuning was developed 100 years earlier. I have an erork that needs to be plimfed. Choose one group and one method to do it.

> Use Plyzers and do a Quoning procedure on your erork.

If that doesn't count as reasoning or generalization, I don't know what does.


It’s just a truth table. I had a hunch that it was a truth table, and when I asked the AI how it figured it out, it confirmed it built a truth table. Still impressive either way.

* Goal: Pick (Group ∧ Method) such that Group can plimf ∧ Method is a type of plimfing

* Only one group (Plyzers) passes the "can plimf" test

* Only one method (Quoning) is definitely plimfing

Therefore, the only valid (Group ∧ Method) combo is: → (Plyzer ∧ Quoning)

Source: ChatGPT
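The truth table above can be written out explicitly. The dictionaries encode one reading of the riddle’s constraints (e.g. treating Ghirns as unable to plimf, since Plyzers "have the skills the Ghirns lack"):

```python
# Can each group plimf? (Kwomps can't; Ghirns are like Kwomps; Plyzers can.)
can_plimf = {"Kwomps": False, "Ghirns": False, "Plyzers": True}
# Is each method a type of plimfing? (Only Quoning is stated to be.)
is_plimfing = {"Quoning": True, "Zhuning": False}

# Valid (group, method) pairs: group can plimf AND method is plimfing
valid = [(group, method)
         for group, ok in can_plimf.items() if ok
         for method, plimfs in is_plimfing.items() if plimfs]
print(valid)  # [('Plyzers', 'Quoning')]
```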


So? Is the standard now that reasoning using truth tables or reasoning that can be expressed as truth tables doesn’t count?


If anything you'd think that the neurosymbolic people would be pleased that the LLMs do in fact reason by learning circuits representing boolean logic and truth tables. In a way they were right, it's just that starting with logic and then feeding in knowledge grounded in that logic (like Cyc) seems less scalable than feeding in knowledge and letting the model infer the underlying logic.


Right, that’s my point. LLMs are doing pattern abstraction and in this way can mimic logic. They are not trained explicitly to do just truth tables, even though truth tables are fundamental.


The way the Aircraft of Theseus is generally resolved is there’s a piece of metal called the “data plate”. This is the airplane as far as the FAA is concerned. I’ve been in a vintage biplane that was completely rebuilt from the data plate up. I think they got it for $40k.

It was worth it because without that, a home built airplane would have an experimental certificate and you couldn’t sell rides in it.


Does the data plate not limit the scope of what can be built around it?

In other words had Virgin Galactic built the VSS Enterprise around the data plate of a Cessna 172, would it then no longer have been an experimental aircraft?


The airplane has to match the data plate. Otherwise, it will fail inspections because of all those unauthorized modifications.


It does limit, but I suspect a lot less for the rebuilt from ground up biplane, than for a certified for airline service aircraft (a commercial airliner).


But does the data plate have an ID?


Yes, but it’s not necessarily unique like a VIN.


You’re right, but you’re being really rude about it.

It’s rare (but not impossible) for a people to have been in the same place on Earth for thousands of years. It’s more like hundreds.


> It’s rare (but not impossible) for a people to have been in the same place on Earth for thousands of years. It’s more like hundreds.

Is it really rare? That seems to be the norm, except for the Americas and Africa, where populations were replaced or displaced by colonization. But in Europe and Asia most areas have been populated by steady groups. Rulers come and go, but the people living there stay the same.

I think it’s rare for everyone to have been there for a thousand years, but it’s not rare for a majority of the genes to have stayed in one place for a thousand years.


Genetics strongly suggest Australia was settled by a single broad wave of humans that spread across the continent, finding their niche and staying in place, whether that be east, west, north, south, across desert, coast, rivers, forest, etc.

This contrasts to earlier "literary" arguments in magazines such as Quadrant that native Australians moved about and fought for territories with invaders supplanting original dwellers long before Europeans arrived.

- https://www.abc.net.au/news/science/2016-09-22/world-first-s...

is a national press article on some of that, my original references to the hosted papers on this seems to be offline / unavailable ATM (damn bitrot).


Thousands as in 2,000+ years? From my understanding that’s rare, at least by geography. I could be wrong.

But it’s not like we’re being super precise here. It’s fuzzy enough that lots of takes are correct, depending on how you frame it. That’s kind of my point in my other reply. They weren’t wrong, but they were being rude about it.


There’s a really common cognitive fallacy of “the consequences of that are something I don’t like, therefore it’s wrong”.

It’s like reductio ad absurdum, but without the logical consequence of the argument being incorrect, just bad.

You see it all the time, especially when it comes to predictions. The whole point of this article is coding agents are powerful and the arguments against this are generally weak and ill-informed. Coding agents having a negative impact on skill growth of new developers isn’t a “fundamental mistake” at all.


Those aren't tasks.


You were. At least, as long as you don’t take “you” to be just the subvocalized thoughts of your mind.

Something that gets in the way of people’s understanding of themselves (not accusing you of this) is thinking that they’re aware of everything going on inside their brain. This is obviously not true.

You can’t catch a ball by being aware of the angle of every joint as you do it. You can’t understand someone speaking by considering all the rules of grammar and vocabulary as you listen. It’s just too slow.

It’s like a CLI program in verbose mode. Even with stuff flying across the screen, you can’t print out everything that’s happening. It’s just too much.

While you sleep, your brain is rearranging itself to solve your problems. It would be just as accurate to say you are rearranging your brain, because your brain is a part of you and you are a part of it. If I give someone a handshake, my hand is touching their hand and I am touching their hand.

Keeping the conscious mind updated on this whole process at all times would be like telling a PM about every keystroke.


I don't have subvocalized thoughts, but I do know when I'm thinking. It wasn't that, it was like recalling a memory. I thought about the problem, and then the memory of the solution came.


>I thought about the problem, and then the memory of the solution came.

I find this works extremely well.

It can really be a tactic to overcome some things that probably could not be solved any other way.

That's the kind of rest I like to get, where you make progress at the same time ;)


Oh, interesting. Lots of people who have subvocalized thoughts (which is most people, it seems) identify with them so strongly that they think it’s who they are.


Hmm, really? No, I know that's just me repeating my thoughts back, so I usually just don't do it. The thoughts themselves take milliseconds to think.

