Hacker News

> I wonder if TPUs, like Google's Tensor chip, will beat out GPUs when it comes to image/video based training?

One of the OpenAI guys was talking about this. He said the specific chip technology doesn't matter; it's just a cost line item. They don't need the best chip tech available as long as they have enough money.

That said, I'm curious whether anyone else can really comment on this. It seems like, as models get very large and expensive to train, we'll produce more and more specialized hardware for them.




Whether or not cost matters much depends on your perspective.

If you’re OpenAI and GPT-4 is just a step on the way to AGI, and you can amortize that huge cost over the hundreds of millions in revenue you’re gonna pull in from subscriptions and API use, then sure, you’re probably not very cost sensitive. It could be 20% cheaper or 50% more expensive; whatever, it’s so good your customers will use it at a wide range of prices. And you have truckloads of money from Microsoft anyway.

If you’re a company or a developer trying to build a feature, a whole new product, or an entire company on top of GPT, then that cost matters a whole lot. The difference between $0.06 and $0.006 per turn could be the difference between infeasible and shippable.
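A quick back-of-the-envelope sketch of that point; the two per-turn prices come from the comment above, while the monthly traffic figure is a made-up assumption for illustration:

```python
# Back-of-the-envelope: how per-turn API price affects a product's monthly bill.
# Prices ($0.06 vs. $0.006/turn) are from the comment; the volume is hypothetical.
turns_per_month = 1_000_000  # assumed traffic for a modestly popular product

for price_per_turn in (0.06, 0.006):
    monthly_cost = turns_per_month * price_per_turn
    print(f"${price_per_turn}/turn -> ${monthly_cost:,.0f}/month")
# -> $60,000/month vs. $6,000/month
```

At a million turns a month, the 10x price gap is the gap between a line item and a budget crisis for a small team.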

If you’re trying to compete with OpenAI then you’re probably doing everything possible to reduce that training cost.

So whether or not it matters really depends on where you sit.


Totally true.


> They don't need to have the best chip tech available as long as they have enough money.

That sounds like someone who is "Blitzscaling." Costs don't matter in those cases; all that matters is acquiring customers and market share. But the rest of us, who will see benefits but aren't trying to win a $100B market, will cost-optimize.


Yes, agreed. I would like to run large models at home without serious expense.


Maybe it's just a line item to them, but it's pretty relevant to anyone operating with a less-than-gargantuan budget. If a superior/affordable chip is widely available, OpenAI's competitive advantage recedes rapidly because suddenly everyone else can do what they can. To some extent that's exactly what happened with DALL-E/StableDiffusion.

That's assuming it's not horizontally scalable, because otherwise they would just outspend everyone else anyway, like they've already done. It's a big "if", though.





