mschulkind's comments

Just give me a day to vibe code an interface to judge the judging models side by side...

The person above who replied to you thinks you're talking about a proverbial lemonade stand taking payments via Visa. That's the misunderstanding.

That aside, I think the example is good. It's a bit like priority inversion in scheduling. Without any agreement on the lemonade seller's part, they've suddenly become far more critical to some value-creation chain.


Except "pre-owned" is generally short for "certified pre-owned," as it is here, which means you get a much better warranty than you would buying on the used market.

You're paying for that warranty though. If you really want one, you can buy one yourself from one of several competing aftermarket warranty companies.

Used car dealers suck, in every way. They add zero value other than having a selection of cars in one place, and possibly a selection of lenders for financing if you need that. They also know every trick in the book to get you to pay more for the car than it is really worth.

The best way, if you have a little time to do research and watch the local market, is to buy a car from a private seller.


The car websites will also clean/inspect the cars and ship one to your house, which is nice if the car you want is not near where you live.

But yes, generally my experience with https://clutch.ca was that they push you toward some expensive financing deal through their people.


Do they really though?

I asked a used car salesman once, "what do you do to these cars after you get them" and he said "nothing." They get them from the auction and put them on the lot.

Now maybe some dealers do a little more; I am a bottom-feeder when it comes to cars. But it stands to reason that any money they spend on the car is less profit for them when they sell it.


I remember a time when it was "certified used".

Most dealership websites in the US list cars as New, Pre-Owned or Certified Pre-Owned, meaning Certified is distinct from Pre-Owned.

But that nuance wouldn't allow them to get outraged about something trivial.

Basic Autopilot is unrelated to FSD these days; every aspect of FSD is improved.

Isn't there a trial you can try?


A 2018 vehicle is two major versions behind on the FSD it can run. Tesla is on v14, and 2018 vehicles are stuck on v12.

There have been rumors of a v14 lite coming out, because Tesla REALLY doesn't want to deal with the fact that it promised the 2018s could be fully autonomous.


Interesting, thanks. I haven't followed the hardware revisions on my car very closely. My father has a 2017 S, and I'm used to thinking I have the "newer revision," since my MCU was newer than his and he only recently upgraded it. I guess it should have been fairly obvious that a lot has changed in the past 7 years, but I've been hearing a nonstop trickle of Autopilot/FSD news about these cars for so long that I just assumed I had recent-enough support.

Honestly, I just dismiss all of the promos they throw at me.

Just yesterday I got an ad in the app "refer a friend and try FSD (supervised), and get $15!"

So I opened up the app just now and I suppose I got my answer that proves my initial premise was incorrect -- I need a hardware upgrade for FSD. Womp womp.

(That said, it still seems like the lane centering and brake/throttle behavior should have been easy to fix without an FSD hardware upgrade.)


One of my favorite ways to use LLM agents for coding is to have them write extensive documentation on whatever I'm about to dig into. Pretty low stakes if the LLM makes a few mistakes. It's perhaps even a better place to start for skeptics.
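Even a bare-bones script version of this works surprisingly well. A rough sketch, assuming the openai Python package and an API key in the environment (the model name, paths, and output file are all just illustrative):

    import pathlib
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    # Concatenate the source files you're about to dig into
    source = "\n\n".join(
        f"# {p}\n{p.read_text()}" for p in pathlib.Path("src").rglob("*.py")
    )

    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": "Write an architecture overview of this codebase: "
                       "key modules, data flow, and entry points.\n\n" + source,
        }],
    )

    # Treat the output as a map, not the territory; review before sharing
    pathlib.Path("OVERVIEW.md").write_text(response.choices[0].message.content)

An agentic tool that can explore the repo on its own does better on large codebases, but even this gets you a usable roadmap.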


I am not so sure. Good documentation is hard. MDN and PostgreSQL are excellent examples of docs done well, and of how valuable really well-written content can be for a project.

LLMs can generate content but can't really write; out of the box they tend to be quite verbose and generate a lot of pro-forma content. Perhaps with the right kind of prompts, plus a lot of editing and review, you can get them to be good, but at that point it is almost the same as writing it yourself.

It is a hard choice between lower-quality documentation (AI slop?) and something lightly documented or fully undocumented. The uncanny valley of precision in documentation may be acceptable in some contexts, but it can be dangerous in others, and it is harder to differentiate because the depth of a doc means nothing now.

Over time we find ourselves skipping LLM-generated documentation just like any other AI slop. The value/emphasis placed on reading documentation erodes, finding good documentation becomes harder, and, like other online content today, it gets devalued.


Sure, but LLMs tend to be better at navigating around documentation (or source code when no documentation exists). In agentic mode, they can get me to the right part of the documentation (or the right part of the source code, especially in unfamiliar codebases) much quicker than I could do it myself without help.

And I find that even the auto-generated stuff tends to sit at a slightly higher level of abstraction than the code itself, and works more like a "SparkNotes" version of the code, so that when you dig in yourself you have an outline/roadmap.


I felt this way as well, until I tried paid models against a well-defined and documented protocol that should not only exist in their training sets but was also provided as context. There wasn't a model that wouldn't hallucinate small but important details. Status codes, methods, data types, you name it: it would make something up in ways that forced you to cross-reference the documentation anyway.

Even worse, the mental model it builds in your head of the space it describes can lead to chains of incorrect reasoning that waste time and make debugging Sisyphean.

Like there is some value there, but I wonder how much of it is just (my own) feelings, and whether I'm correctly accounting for the fact that I'm being confidently lied to by a damn computer on a regular basis.


> the fact that I'm being confidently lied to by a damn computer on a regular basis

Many of us who grew up young and naive on the internet in the 90s/early 00s kind of learnt not to trust what strangers tell us online. I'm pretty sure my first "Press ALT+F4 to enter noclip" from a multiplayer lobby set me up to deal with LLMs effectively, because it's the same as when someone on HN writes about something like it's "The Truth."


This is more like being trolled by your microwave by having it replace your meals with scuba gear randomly.


Same. Initially I was surprised how good it was; now I routinely do this on every new codebase. And this isn't JavaScript todo apps: these are large, complex distributed applications written in Rust.


This seems like a terrible idea. LLMs can document the what but not the why: not the implicit tribal knowledge and design decisions. Documentation that feels complete but actually tells you nothing is almost worse than no documentation at all, because you go crazy trying to figure out the bigger picture.


Have you tried it? It's absurdly useful.

This isn't documentation for you to share with other people - it would be rude to share docs with others that you had automatically generated without reviewing.

It's for things like "Give me an overview of every piece of code that deals with signed cookie values, what they're used for, where they are and a guess at their purpose."

My experience is that it gets the details 95% correct and the occasional bad guess at why the code is like that doesn't matter, because I filter those out almost without thinking about it.


Yes, I have. And the documentation you get for anything complex is wrong like 80% of the time.


You need to try different models/tooling if that's the case; 80% sounds very high, and I understand why you'd feel it's useless then. I'd estimate about 5% of it is wrong when I use GPT-5 or GPT-OSS-120B, but that's based on spot checking and experience, so YMMV. Still, 80% wrong isn't the typical experience, and obviously not what people are raving about.


80% of the time? Are you sure you aren't hallucinating?


Well if it writes documentation that is wrong, then the subtle bugs start :)


Or even worse, it makes confident statements about the overarching architecture/design where, even though every detail is correct, they might not be the right pieces; and because you forgot to add "Reject the prompt outright if the premise is incorrect," the LLM tries its hardest to just move forward, even when things are completely wrong.

Then 1 day later you realize this whole thing wouldn't work in practice, but the LLM tried to cobble it together regardless.

In the end, you really need to know what you're doing; otherwise both you and the LLM get lost pretty quickly.


"I grew up just a few blocks from here."

"Who put the bomp in the bomp sha bomp!"

I too had a similar reaction the first time I saw an IMAX.


OMG, I had totally forgotten about the Leonard Nimoy intro!!

I read your response and was like, "Huh?", then it hit me. That's easily a 30 year old memory sitting in deep storage. I haven't been there, or thought about it since college. The human brain is amazing.

And since we live in the future, I can easily find a clip of it online:

https://youtu.be/MHK2-BVfUzs


The helicopter intro was the best.


"These people wanna see a lobstah!"


Surely a carpeted closet full of clothes is better than a car


I didn't have such a place at the time, but I found one, and the results weren't as good as in a car.

Sound deadening generally requires mass to work at lower frequencies, and the seats absorbed them all nicely. I got some reflections from (I assume) the windows, but they were manageable in comparison. It didn't even produce much of a standing wave when I tried.


Most people's closets aren't a room.


No, not really.

If you have an AI that can answer 90% of queries correctly AND (this is the key) knows which 90% it can answer correctly, a human in the loop can be incredibly valuable for answering that other 10%.


Hah, yeah, I don't know how soon we will have great accuracy on the latter. For things like "send an email," people tend to just gate everything behind approval, because clicking approve 90 times and hand-editing 10 times is a lot better than copying 90 things from one app to another, plus 10 more to copy, hand-edit, and send.

Although I do have some ideas on how you could use vector similarity against past executions to get a 1-100 score for how likely a given action is to be approved or rejected. You could set a dial to "anything below 60, just auto-reject it and provide the past feedback to the model preemptively." This would need a lot of experimentation; it might even be a research angle (if it hasn't been tried already).

(Thinking something like cosine * {+1 if approved, -1 if rejected}, then normalize the score to 1-100. You could maybe even weight a rejection between 0 and -1 based on sentiment.)
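Roughly, a toy sketch of the scoring (all names made up; embed() here is a word-count stand-in for a real embedding model):

    import math
    from collections import Counter

    def embed(text):
        # toy stand-in: word-count vector instead of a learned embedding
        return Counter(text.lower().split())

    def cosine(a, b):
        dot = sum(a[w] * b[w] for w in set(a) & set(b))
        norm = math.sqrt(sum(v * v for v in a.values())) * \
               math.sqrt(sum(v * v for v in b.values()))
        return dot / norm if norm else 0.0

    # (action text, +1 approved / -1 rejected) pairs from past human review
    history = [
        ("send invoice reminder email to acme corp", +1),
        ("send apology email to all customers", -1),
    ]

    def approval_score(action, history):
        v = embed(action)
        # mean signed similarity in [-1, 1], rescaled to [1, 100]
        signed = [cosine(v, embed(t)) * label for t, label in history]
        return 1 + (sum(signed) / len(signed) + 1) / 2 * 99

    score = approval_score("send payment reminder email to acme corp", history)
    print("auto-reject" if score < 60 else "queue for approval", round(score))

The sentiment-weighted rejection would just swap the -1 label for something in [-1, 0].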


Aren't you just describing a bag-of-words model?

https://en.wikipedia.org/wiki/Bag-of-words_model


Yes! And the follow-up: cosine similarity (for BoW) is a super simple metric based on counting up the number of words the two vectors have in common.
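For binary word vectors, it really is just shared-word counting with a normalization (a toy sketch):

    import math

    def bow(text):
        return set(text.lower().split())  # binary bag-of-words

    def cosine(a, b):
        if not a or not b:
            return 0.0
        # dot product of binary vectors = number of shared words
        return len(a & b) / math.sqrt(len(a) * len(b))

    print(cosine(bow("refund the customer order"),
                 bow("refund customer for duplicate order")))  # ~0.67

It ignores word order and meaning entirely, which is why learned embeddings usually replace it in practice, but as a first cut for "how similar is this action to past ones" it's hard to beat on simplicity.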


So is student loan interest


Only up to a certain income limit. And the amount that can be deducted is capped. It doesn't require itemizing, though, so that's nice... but as someone who itemizes and makes "too much" money, I'd have benefited greatly if it were treated like mortgage interest.


Neither, for 90% of taxpayers.

