nichochar's comments on Hacker News

Even if what you said were true, it will be false within months or years.

What then?

This is the whole premise of the article. Just extrapolate and imagine that it can think and write poetry better than you (it will, and likely soon). What then?

It's a very important question. A cultural one.


A reckoning of sorts, causing us to confront exactly who and what we are.

Reasoning:

.com is the best TLD by a long shot, but it's really saturated, so as a startup you have no chance.

As you say, the hope is to make it and be able to buy the X in getX.com, where hopefully you've checked that X belongs to a squatter and not an existing company (both are bad, but the latter is worse).


We're building our startup infra on Cloudflare over the other major hyperscalers, and it turned out to be an amazing decision...

Generous free tiers, pricing scales very competitively after that, and their interface is not nearly as bad as GCP's or AWS's.

I highly recommend this stack.


> their interface is not nearly as bad as GCP / AWS

Underrated.

Until recently, all the features were grouped in a very clear manner within the dashboard. Now, even Cloudflare is complicating its management interface, but they still have a long way to go before reaching the level of confusion of AWS and GCP.


Definitely.

I managed to get R2 with their CDN in front of it up and working in under an hour. The same experience with S3 fronted by CloudFront took two very long days. Due to my misunderstanding, yes, but AWS provided (1) incomprehensible docs; (2) an extremely complex UI; (3) stale help all over the internet; and (4) incredibly unclear error messages.


Honestly, I feel like Cloudflare's interface is quite complicated for the number of features they have. All their stuff seems to be only slightly integrated.


I appreciate the fact it's just connected enough to work. AWS does what feels like everything in its power to entrench you. I avoid AWS as much as possible, but one example that comes to mind is the fact you basically need to use SQS for SES.


I feel like my page ranking on Google is lower after switching to Cloudflare. Like Google is secretly punishing Cloudflare-hosted pages or something.

I have zero evidence to prove anything. Just gut feeling. Anyone else notice this?


Make sure you're not blocking Googlebot; check https://support.google.com/webmasters/answer/9012289
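For a quick offline self-check, you can eyeball your robots.txt for a rule that disallows everything for Googlebot. Here's a rough sketch of that check; note this is a deliberate simplification of the real robots.txt rules (no `Allow` lines, no path matching), not a full parser, and the function name is my own:

```typescript
// Rough self-check: does this robots.txt block Googlebot entirely?
// Deliberate simplification: only detects a blanket "Disallow: /"
// in a group whose User-agent lines match "*" or "googlebot".
function googlebotBlocked(robotsTxt: string): boolean {
  let applies = false;      // do the current group's rules apply to Googlebot?
  let inAgentBlock = false; // are we still reading a group's User-agent lines?
  for (const raw of robotsTxt.split("\n")) {
    const line = raw.trim().toLowerCase();
    if (line.startsWith("user-agent:")) {
      const agent = line.slice("user-agent:".length).trim();
      if (!inAgentBlock) applies = false; // a new group starts here
      inAgentBlock = true;
      if (agent === "*" || agent.includes("googlebot")) applies = true;
    } else if (line.length > 0) {
      inAgentBlock = false;
      if (applies && line === "disallow: /") return true; // everything disallowed
    }
  }
  return false;
}
```

For the real thing, Google's documentation (linked above) plus the URL Inspection tool in Search Console is the authoritative check.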


It's hard to say because Google regularly releases updates that affect rankings.

I've had sites that don't use CF drop positions in Google Search even though nothing changed on my end. Why? No idea.


You run compute workloads on there?


Yes: see https://developers.cloudflare.com/ and look at Cloudflare Workers and Cloudflare Workers AI.
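For context, a Worker is essentially a small fetch handler deployed at the edge. A minimal sketch of that shape (the route and message are illustrative; Workers also pass `env`/`ctx` arguments, omitted here for brevity):

```typescript
// Minimal Cloudflare Worker sketch: an object with a fetch handler
// that maps incoming requests to responses.
const worker = {
  async fetch(request: Request): Promise<Response> {
    const url = new URL(request.url);
    if (url.pathname === "/hello") {
      return new Response("Hello from the edge!", { status: 200 });
    }
    return new Response("Not found", { status: 404 });
  },
};

// In a real Worker you'd expose it to the runtime with:
// export default worker;
```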


Cool. What are you building?


srcbook.com (not OP, just trawling through their profile).


Same here!


Yeah, I don't agree. I'm building a product in the space, and the number one problem is correctness, not latency.

People are very happy to sit there for minutes if the correctness is high and the quality is high. It's still 100x or 1000x faster than finding third-party developers to work for you.

I wish the models were getting better, but recently they've felt very stuck, and this is it, so agent architectures will be the answer in the short term. That's what's working for us at Srcbook right now.


I think the logic behind faster inference is that the LLM is unlikely to get it right the first time regardless of its intelligence, simply due to the inherent ambiguity of human language. The faster it spits out a semi-functional bit of code, the faster the iteration loop, and the faster the user gets what they want.

Plus, if you're dealing with things like syntax errors, a really, really fast LLM + interpreter could report and fix the error in less than a minute with no user input.
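That loop could be sketched roughly like this. Everything here is hypothetical scaffolding, not a real API: `Model`/`callModel` stand in for an actual LLM call, and a syntax check via `new Function` stands in for a full interpreter run:

```typescript
// Sketch of the iteration loop: generate code, try it, and feed any
// error back to the model for a retry. `Model` is a hypothetical
// stand-in for a real LLM API call.
type Model = (prompt: string) => Promise<string>;

async function generateWorkingCode(
  model: Model,
  task: string,
  maxAttempts = 3,
): Promise<string> {
  let prompt = task;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const code = await model(prompt);
    try {
      // Compile the candidate; new Function throws on syntax errors.
      // A real system would run it in a sandboxed interpreter instead.
      new Function(code);
      return code;
    } catch (err) {
      // Report the error back so the next attempt can fix it.
      prompt = `${task}\n\nPrevious attempt:\n${code}\nError: ${err}`;
    }
  }
  throw new Error("No working code after retries");
}
```

The faster each `model` call returns, the tighter this loop gets, which is the whole argument for fast inference.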


Also building something in this space. I think it’s a mistake to compare the speed of LLMs to humans. People don’t like to sit and wait. The more context you can give the better but at some point (>30 seconds) people grow tired of waiting.


Yes, people are used to clicks/actions taking < 200ms; when something takes 20s+, it feels broken even if the results are good.


I have a master's degree in math/physics and 10+ years as a SWE in strong tech companies. I have come to rely on these models (Claude > OAI, though) daily.

It is insane how helpful it is: it can answer some questions at PhD level and most questions at a basic level. It can write code better than most devs I know when prompted correctly...

I'm not saying it's AGI, but diminishing it to a simple "chat bot" seems foolish to me. It's at least worth studying, and we should be happy they care rather than just ship it?


Interesting that the results can be so different for different people. I have yet to get a single good response (in my research area) for anything slightly more complicated than what a quick google search would reveal. I agree that it’s great for generating quick functioning code though.


> I have yet to get a single good response (in my research area) for anything slightly more complicated than what a quick google search would reveal.

Even then, with search enabled it's way quicker than a "quick" Google search, and you don't have to manually skip all the blog-spam.


Google search was great when it came out too. I wonder what 25 years of enshittification will do to LLM services.


Enshittification happened, but look at how life has changed since 1999 (25 years, as you mentioned). Songs in your palm, search in your palm, maps in your palm or car dashboard, live traffic rerouting, tracking your kid's plane from home before leaving for the airport, booking tickets without calling someone. WhatsApp connected more people than anything.

Of course there are scams and online indoctrination; I'm not denying that.

Maybe each service degraded from its original nice form, but there is an overall enhancement of our ability to do things.

Hopefully the same happens over next 25 years. A few bad things but a lot of good things.


I think I had most or all of that functionality in 2009, with Android 2.0 on the OG Motorola Droid.

What has Google done for me lately?


But also what new tools will emerge to supplant LLMs as they are supplanting Google? And how good will open source (weights) LLMs be?


Absurd, the claim that Google search was better 25 years ago than today. That vastly trivializes the volume and scale that Google needs to process.


I'm using it to aid in writing PyTorch code, and God, it's awful except for the basic things. It's a bit more useful in discussing how to do things rather than actually doing them, though, I'll give you that.


Claude is much better at coding and generally smarter; try it instead.

o1-preview was less intelligent than 4o when I tried it, better at multi-step reasoning but worse at "intuition". Don't know about o1.


o1 seems to have some crazy context length / awareness going on compared to the current 3.5 Sonnet, from playing around with it just now. I'm not having to 'remind' it of initial requirements etc. nearly as much.


I gave it a try, and o1 is better than I was expecting. In particular, the writing style is a lot lighter on "GPTisms". It's not very willing to show you its thought process, though; the summaries of it seem to skip a lot more than in the preview.


I think the human variable is that you need to know enough to ask the right questions about a subject, while not knowing so much about it that the answers can't teach you anything.

Because of this, I would assume it is better for people whose interests have more breadth than depth, and less impressive to those whose interests are narrow but very deep.

It seems obvious to me the polymath gains much more from language models than the single-minded subject expert trying to dig the deepest hole.

Also, summed over all use, the single-minded subject expert is much more at the mercy of what happens to be in the training data than the polymath is.


I have the $20 version. I fed it code from a personal project, and it did a commendable job of critiquing it, giving me alternate solutions, and then iterating on those solutions. Not something you can do with Google.

For example: "OK, I like your code, but can you change this part to do this?" And it says "ok, boss" and does it.

But over multiple days, it loses context.

I am hoping to use the $200 version to complete my personal project over the Christmas holidays. Instead of spending a week, I will maybe spend two days with ChatGPT and get a better version than I initially hoped for.


For code review maybe, it's pretty useful.

Even with the $20 version I've lost days of work because it's given me ideas/solutions that are flat-out wrong or misleading but sound reasonable, so I don't know if they're really that effective.


Have you used the best models (i.e. ones you paid for)? And what area?

I've found they struggle with obscure stuff so I'm not doubting you just trying to understand the current limitations.


Try turning search on in ChatGPT and see if it picks up the online references. I've seen it hit a few references and then get back to me with info summarized from multiple. That's pretty useful. Obviously your case might be different, if it's not as smart at retrieval.


My guess is that it has more to do with the person than the AI.


It has a huge amount to do with the subject you're asking it about. His research area could be something very niche with very little info on the open web. Not surprising it would give bad answers.

It does far better on subjects that are very present on the web, like common programming tasks.


How do you get Google search to give useful results? Often, for me, the first 20 results have absolutely nothing to do with the search query.


The comments in this thread all seem so short sighted. I'm having a hard time understanding this aspect of it. Maybe these are not real people acting in good faith?

People are dismissive and don't understand that we very much plan to "hook these things up" and give them access to terminals and APIs. These very much seem like valid questions to be asking.


Not only do we very much plan to, we already do!


HN is honestly pretty poor on AI commentary, and this post is a new low.

Here, at least, I think there must be a large contributing factor of confusion about what a "system card" shows.

The general factors I think contribute, after some months being surprised repeatedly:

- It's tech, so people commenting here generally assume they understand it, and in day-to-day conversation outside their job, they are considered an expert on it.

- It's a hot topic, so people commenting here have thought a lot about it, and thus aren't likely to question their premises when faced with a contradiction. (cf. the oddly negative responses, which have only gotten more histrionic with time)

- The vast majority of people either can't use it at work, or, if they can, it's some IT-procured thing that's much more likely to be thrown-together, second-class AWS/GCloud APIs than cutting edge.

- Tech line workers have strong antibodies to tech BS being sold by a company as a game-changing advancement, after the last few years of crypto.

- Probably by far the most important: general tech stubbornness. About 1/3 to 1/2 of us believe we know the exact requirements for Good Code, and observing AI doing anything other than that just confirms it's bad.

- Writing meta-commentary like this, or trying to find a way to politely communicate "you don't actually know what you're talking about just because you know what an API is and you tried ChatGPT.app for 5 minutes", is confrontational, déclassé, and arguably deservedly downvoted. So you don't have any rhetorical devices that can disrupt any of the above factors.


Personally I am cynical because in my experience @ FAANG, "AI safety" is mainly about mitigating PR risk for the company, rather than any actual harm.


I lived through that era at Google and I'd gently suggest there's something south of Timnit that's still AI safety, and also point out the controversy was her leaving.


I am curious if you have played with Claude-based agent tools like Windsurf IDE at all, and if you find that interesting.

I am a product-ish guy with a basic understanding of SQL, Django, React, TypeScript, etc., and suddenly I'm shipping something like an MVP v0.1 a week, all by myself.

Do folks at your level find things like Cline, Cursor, and Windsurf useful at all?

Windsurf IDE (Sonnet) blows my mind.


I am building https://srcbook.com which is in this category but focused on webapps.

It's unreal what the AI can do tbh.


It's impressive. I want to be a doubter and say it just shows what you can do with Tailwind/TypeScript in getting a nice UI out, but it really would be genuinely useful for some cases.

The problem I have with it is: how do you get out of a corner with it? Once I start having problems with the application, it gets stuck. I asked it to generate a search engine for different websites, but one of the API endpoints wasn't working. It kept trying over and over again but failed to get it working.

It's, like other attempts I've had with AI, more frustrating to work with. It'll say "I fixed this" and then create more problems with what it's fixing. I thought it finally worked 100%, but it just made things look better by breaking something else without actually fixing the issue.

Admittedly, it turned what might have been a day's work into a couple of hours, but now I have a chunk of code that I don't understand and that will be decidedly harder to understand than if I had written it myself.

It still feels like I'm trying to coax an intern into working a project, rather than having an application that actually does the work for me.


Why Windsurf as opposed to something mainstream like VS Code or Cursor? Unless there's some conflict of interest.


Nah, I am open to all of it. Windsurf is just the one that caught my attention at the right time. I mentioned two comparables in the OP, but Windsurf just happens to be the one that got me.

I have not done a comparison of all of them. I am on an old ThinkPad, so Cursor is out right there, for now.


(this comment was originally a reply to https://news.ycombinator.com/item?id=42331323)


Can you give an example of a prompt and response you find impressive?


Try the thing I'm building; it will build a website for you from a simple prompt: https://srcbook.com


As someone building a client which needs to sync with a local filesystem (repo) and database, I cannot overstate how wonderful it is that there is a push to standardize. We're going to implement this for https://srcbook.com


Notebooks are hot these days! We also shipped our own version of a TypeScript notebook[1], but it takes a quite different side of the tradeoff: we want to run backend Node code, so unlike this or Observable we're not looking to run in the browser environment. Still, for many applications, this idea is a better take!

Kudos to the author.

[1] https://github.com/srcbookdev/srcbook


Tailwind is amazing for LLMs. You can't beat it:

- concise

- inline with the rest of the code

I am willing to bet it's going to become a standard because of its existing popularity + the insane tailwinds that codegen gives it.


Unless my designs are predictable and cookie-cutter, why would I want copilot to generate them? How would it know exactly what aesthetic each element should have? I love unique, expressive designs that capture a brand, so Tailwind only serves to make my markup harder to read.


We love feedback, let me know what's missing / what you wish was better. We have a discord if that's easier, or just email me at nicholas <at> srcbook <dot> com


Same category. Differences that I can spot (I'm not very familiar with Marblism):

Currently, Srcbook is:

- open source

- local

- focused on a different stack

- also offers a notebook product

