I will not be surprised if there is, but that is not a problem that cannot be fixed with some effort. The point is that if we can produce deployable, full-stack apps that are manageable, that changes what it means for software and for startups.
I live in a remote village in the Himalayas, in West Bengal, India, that I am sure no one on HN has heard of. I got flaky 5G-based broadband only a few weeks ago. By the end of this year, I am sure I will be able to attempt 4-5 products and market them more than I have ever done in my 16 years of professional life.
Sounds like the first decade or two of aviation, back when pilots were mostly looking at gauges and tweaking knobs to keep the engine running, and flying the plane was more of an afterthought.
Go to ChatGPT.com and summon a ghost. It's real. It's not a particularly smart ghost, but it gets a lot of useful work done. Try it with simpler tasks to reduce the chances of holding it wrong.
That list of "things LLM apologists say" upthread? That's applicable when you try to make the ghost do work that's closer to the limits of its current capabilities.
The capabilities of LLMs have been qualitatively the same since the first ChatGPT. This is _precisely_ a hype post claiming that a future where LLMs have superhuman capabilities is inevitable.
They've definitely improved in many areas. And not just the easily-gamed public metrics; I've got a few private tests of my own, asking them certain questions to see how they respond, and even on the questions where all versions make mistakes in their answers, they make fewer mistakes than they used to.
I can also see this live, as I'm on a free plan and currently using ChatGPT heavily, and I can watch the answers degrade as I burn through the free allowance of high-tier models and end up on the cheap models.
Now, don't get me wrong, I won't rank even the good models higher than a recent graduate, but that's in comparison to ChatGPT-3.5's responses feeling more like those of a first or second year university student.
And likewise with the economics of them: I think we're in a period where you have to multiply training costs to get incremental performance gains, so there's an investment bubble, and it will burst. I don't think the current approach will get to in-general-superhuman skills, because it will cost too much to get there. AI in general already demonstrates specific superhuman skills, but the more general models are mostly only superhuman by being a "fresh grad" at a very broad range of things; if any LLM is superhuman at even one skill, then I've missed the news.
There remains a significant challenge with LLM-generated code. It can give the illusion of progress while producing code with many bugs, even if you craft your prompt to test for such edge cases. I have had many instances where the LLM confidently states that those edge cases and unit tests are passing while they are failing.
Three years ago, would you have hired me as a developer if I had told you I was going to copy and paste code from Stack Overflow and a variety of developer blogs, and glue it together in a spaghetti-style manner? And that I would comment out failing unit tests, as Stack Overflow can't be wrong?
LLMs will change Software Engineering, but not in the way that we are envisaging it right now, and not in the way companies like OpenAI want us to believe.
Proper coding agents can already be set up with hooks or other means of forcing linting and tests to run, and of preventing the LLM from bypassing them. Adding extra checks to the workflow does a lot to improve quality. Use the tools properly, and while you still need to take some care, these issues are rapidly diminishing, separately from improvements to the models themselves.
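To make the "can't bypass the checks" part concrete: a hook can be as simple as a script that the agent's workflow runs after every edit and that refuses to pass until lint and tests are green. A minimal sketch, assuming a Python project with ruff and pytest; the commands and hook shape are illustrative, not any particular agent's API:

    #!/usr/bin/env python3
    # Illustrative post-edit hook: run the linter and the test suite, and exit
    # non-zero if either fails, so the agent can't mark the task done with
    # broken code. Assumes a Python project using ruff and pytest; substitute
    # whatever linter/test runner your project actually uses.
    import subprocess
    import sys

    CHECKS = [
        ["ruff", "check", "."],  # lint
        ["pytest", "-q"],        # unit tests
    ]

    def main() -> int:
        for cmd in CHECKS:
            result = subprocess.run(cmd)
            if result.returncode != 0:
                print(f"check failed: {' '.join(cmd)}", file=sys.stderr)
                return result.returncode  # non-zero exit blocks the step
        return 0

    if __name__ == "__main__":
        sys.exit(main())

Wire that into whatever hook mechanism your agent supports and the "tests are passing" claim gets verified by the tool rather than taken on the model's word.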
> (from upthread) I was being sold a "self driving car" equivalent where you didn't even need a steering wheel for this thing, but I've slowly learned that I need to treat it like automatic cruise control with a little bit of lane switching.
This is, I think, the core of a lot of people's frustrations with the narrative around AI tooling. It gets hyped up as this magnificent wondrous miraculous _intelligence_ that works right-out-of-the-box; then when people use it and (correctly!) identify that that's not the case, they get told that it's their own fault for holding it wrong. So which is it - a miracle that "just works", or a tool that people need to learn to use correctly? You (impersonal "you", here, not you-`vidarh`) don't get to claim the former and then retreat to the latter. If this was just presented as a good useful tool to have in your toolbelt, without all the hype and marketing, I think a lot of folks (who've already been jaded by the scamminess of Web3 and NFTs and Crypto in recent memory) would be a lot less hostile.
1) Unbounded claims of miraculous intelligence don't come from people actually using it;
2) The LLMs really are a "miraculous intelligence that works right out-of-the-box" for simple cases of a very large class of problems that previously was not trivial (or possible) to solve with computers.
3) Once you move past simple cases, they require an increasing amount of expertise and hand-holding to get good results from. Most of the "holding it wrong" responses happen around the limits of what current LLMs can reliably do.
4) But still, that they can do any of that at all is not far from a miraculous wonder in itself - and they keep getting better.
With the exception of 1) being "No True Scotsman"-ish, this is all very fair - and if the technology was presented with this kind of grounded and realistic evaluation, there'd be a lot less hostility (IMO)!
That would be true if I were making an argument criticizing a certain person or class of people, but I'm not. I'm describing my observations about the frustrations that AI-skeptics feel when they are bombarded with contradictory messages from (what they perceive as) "the pro-AI crowd". The fact that there are internal divisions within that group (between those making absurd claims and those pointing out how important correct tool use is) does mean that the tool-advisers are being consistent and non-hypocritical, but _it doesn't lessen the frustration of the people hearing it_.
That is - I'm not saying that "tool advisers" are behaving badly, I'm observing why their (good!) advice is met with frustration (due to circumstances outside their control).
EDIT: OK, on rereading my previous comment, there is some assumption that the comments are being made by the same people (or group of people) - so your response makes sense as a defence against that. I think the observations about the sources of frustration are still accurate, but I do agree that "tool advisers" shouldn't be accused of inconsistency or hypocrisy when they're not the ones making outlandish claims.
You don't fancy Cyanide Custard? Radioactive Slime Jello made with actual radioactive waste? Sweet Tooth Delight made with real human teeth? Or a few other desserts with NSFW names that would go down a treat for a family dinner. It is hard to please some people /s.
But yes, websites will now be filled with these low-quality recipes, and some might be outright dangerous. Cyanide custard should ring alarm bells, but using the wrong type of mushroom is equally dangerous and much more challenging to spot.
The problem isn't that Boeing is less safe; it is that the company's culture shifted to the extent that technical staff could no longer report perceived safety issues.
It is part of the regular economic cycle. Most companies try to do more with less, like in any downturn. This cycle is different, as senior management now has AI to justify hiring fewer knowledge workers. Things will turn around eventually when the AI bubble bursts. Companies will then scramble for graduates who have likely moved to work in different industries and complain about a talent shortage.
> senior management now has AI to justify hiring fewer knowledge workers
Justify to whom? Shareholders just care about metrics like earnings and revenue. If they don't need workers to optimize those metrics, they have the right to hire fewer workers.
The number of employees you hire is seen as a proxy for growth and future earnings and revenue. With AI, the argument is that you can grow with fewer staff.
Mercedes has gained approval to test Level 4 autonomous driving. Level 4 is considered fully autonomous driving, although the vehicle retains a traditional cockpit, and the driver can request control at any time. If a Level 4 system fails or cannot proceed, it is required to pull the vehicle over and bring it to a complete stop under its own control.
I would argue that it is getting very close to what people think autopilot can do. A car that, under certain circumstances, can drive for you and doesn't kill you if you don't pay constant attention.
The one which needs a leading vehicle to follow below 40 mph on some stretches of freeway? Try looking up videos of it from owners who are not press or Mercedes reps.
> Raising the bar in autonomous driving technology, Mercedes-Benz is the first automobile manufacturer in the US to achieve a Level 3 certification based on a 0-5 scale from the Society of Automotive Engineers (SAE). Under specific conditions, our technology allows drivers to take their hands off the steering wheel, eyes off the road — and take in their surroundings
(Specific conditions being the lead car and speed limits you noted, but that’s not what the person you’re replying to is talking about)
In addition to their Level 3 system, they've been granted permission to test Level 4 systems, not for public use/availability, on a prototype vehicle:
Not sure about other countries, but in Germany L4 is only active on freeways up to certain speeds. The video you linked does not show this, at least not in the parts I skipped through.
Then those same humans won't be able to reason about code, or the problem spaces they're working in, regardless, since it's all fundamentally about precise specifics.
Languages used for day-to-day communication between humans do not have the specificity needed for detailed instructions... even to other humans. We rely on out-of-band context (body language, social norms, tradition, knowledge of a person) quite a bit more than you would think.
Programming languages, which are human languages, are purpose-built for this. Anyone working in the domain of precise specifications uses them, or something very similar (for example, in engineering, or when writing contracts), often daily. ;)
They usually boil down to a subset of English, because near-caveman speak is enough to define things with precision.
Yes, let's devise a more precise way to give AI instructions. Let's call it pAIthon. This will allow the powers that be, like Zuckerberg, to save face and claim that AI has replaced mid-level developers, and enable developers to rebrand themselves as pAIthon programmers.
Joking aside, this is likely where we will end up, just with a slightly higher-level programming interface, making developers more productive.
Fivetran works perfectly fine for syncing Postgres databases into Snowflake. My company syncs dozens of them without problems. I can only assume their Postgres database has a non-standard setup.
Yeah, if you have 1 million dollars to spend every time you run a data migration or anything else that touches many rows.
I've seen some new libraries crop up for writing your own replication slot clients. I wouldn't use Fivetran for PG.
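For what it's worth, the moving parts of a DIY replication-slot client are small. Here's a minimal sketch using psycopg2's logical replication support; the DSN, slot name, and what you do with each message are assumptions about your own setup, not anything Fivetran-specific:

    # Minimal DIY logical-replication client sketch using psycopg2.
    # Assumes the server has wal_level=logical; uses the built-in
    # test_decoding output plugin (wal2json gives JSON payloads instead).
    import psycopg2
    from psycopg2.extras import LogicalReplicationConnection

    conn = psycopg2.connect(
        "dbname=mydb user=repl_user",  # hypothetical DSN
        connection_factory=LogicalReplicationConnection,
    )
    cur = conn.cursor()

    # Create the slot once; comment this out on subsequent runs.
    cur.create_replication_slot("my_slot", output_plugin="test_decoding")
    cur.start_replication(slot_name="my_slot", decode=True)

    def handle(msg):
        # msg.payload is the decoded change; ship it to your warehouse here.
        print(msg.payload)
        # Acknowledge so the server can recycle WAL behind us.
        msg.cursor.send_feedback(flush_lsn=msg.data_start)

    cur.consume_stream(handle)  # blocks, streaming changes as they arrive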
Either you have a lot of data and Fivetran will be too expensive, or you don't, and you're better off just using a Postgres OLAP plugin/extension.
Maybe it was because it was in beta, but I had a nightmare of a time with Fivetran's API, trying to coordinate connectors and destinations and git access.
Looking at the code, there is a good chance this codebase is vulnerable to SQL injection.
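For readers unfamiliar with the pattern being flagged: the giveaway is usually queries built by splicing user input straight into the SQL string instead of passing it as a parameter. A self-contained illustration using the stdlib sqlite3 module (table and values invented for the example):

    # The classic SQL-injection pattern and its fix, for illustration only.
    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
    conn.execute("INSERT INTO users VALUES ('alice', 'admin')")

    user_input = "x' OR '1'='1"  # attacker-controlled value

    # Vulnerable: the input is spliced into the SQL string,
    # so the OR '1'='1' clause matches every row.
    vulnerable = conn.execute(
        f"SELECT * FROM users WHERE name = '{user_input}'"
    ).fetchall()

    # Safe: a parameterized query treats the input as data, not SQL.
    safe = conn.execute(
        "SELECT * FROM users WHERE name = ?", (user_input,)
    ).fetchall()

    print(vulnerable)  # [('alice', 'admin')] -- injection succeeded
    print(safe)        # [] -- no user literally named "x' OR '1'='1"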