mustyoshi's comments | Hacker News

Yeah, this is the thing people miss a lot. 7B and 32B models work perfectly fine for a lot of things, and run on previously high-end consumer hardware.

But we're still in the hype phase; people will come to their senses once large-model performance starts to plateau.


I expect people to come to their senses when LLM companies stop subsidizing cost and start charging customers what it actually costs them to train and run these models.

I mean, there is no reason for an inference provider of open models to subsidize you. And pricing there is usually cheaper than Claude API pricing.

It's still a market, though; there is always an incentive to subsidize if the competition is keeping prices artificially low.

People don't want to guess which size of model is right for a task, and current systems are neither good nor efficient at estimating that automatically. I see only the power users tweaking more and more as performance plateaus; the average user will only change when it's automatic.

> 7B and 32B models work perfectly fine for a lot of things

Like what? People always talk about how amazing it is that they can run models on their own devices, but rarely mention what they actually use them for. For most use cases, small local models will always perform significantly worse than even the most inexpensive cloud models like Gemini Flash.


Gemma 3n E4B has been crazy good for me - a fine-tune running on Google Cloud Run via Ollama, completely avoiding token-based pricing at the cost of throughput limitations.
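For context, a minimal sketch of what calling a model served this way looks like, assuming Ollama's standard HTTP `/api/generate` endpoint; the base URL, model tag, and prompt are placeholders, not the commenter's actual setup:

```python
import json
import urllib.request

def build_generate_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming request against Ollama's /api/generate endpoint."""
    payload = json.dumps({
        "model": model,    # e.g. a Gemma 3n fine-tune tag; placeholder here
        "prompt": prompt,
        "stream": False,   # one JSON response instead of a token stream
    }).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Against a Cloud Run deployment, base_url would be the service URL;
# locally, Ollama listens on http://localhost:11434 by default.
req = build_generate_request("http://localhost:11434", "gemma3n:e4b", "Summarise this ticket: ...")
```

Billing-wise the difference is exactly what the comment says: you pay for the container's CPU/GPU time, not per token, so throughput is bounded by the instance rather than a rate card.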

What kind of applications are you using it for?

Doesn't this just fall apart if a video is re-encoded? That's fairly common on all video platforms.

Take a computer screen with a full wash of R, G, or B. Sync the RGB display with your 2FA token, but run it at 15FPS instead of one code per minute.

Point the monitor at the wall, or desk, or whatever. Notice the radiosity and diffuse light scattering on the wall (and on the desk, and on the reflection on the pen cap, and on their pupils).

Now you can take a video that was purported to be taken at 1:23pm at $LOCATION and validate/reconstruct the expected "excess" RGB data and then compare to the observed excess RGB data.

What they say they've done as well is to embed not just a "trace" of expected RGB values over time but also a data stream (eg: a 1FPS PNG) which kind of self-authenticates the previous second of video.

Obviously it's not literally RGB, but "noise" in the white channels, and not a PNG, but whatever other image compression they've found works well for the purpose.

In the R, G, B case you can imagine that it's resistant to (or durable through) most edits (eg: cuts, reordering), and it's interesting that they're talking about detecting whether someone has photoshopped a vase full of flowers into the video (because they're also encoding a reference video/image in the "noise stream").
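A minimal sketch of the time-synced colour idea above: derive each frame's expected R, G, B wash from an HMAC over a shared secret and the current time step (TOTP-style, but at 15 steps per second instead of one code per minute). The secret, step rate, and byte-to-colour mapping are all illustrative assumptions, not the scheme's actual construction:

```python
import hmac
import hashlib

STEPS_PER_SECOND = 15  # one colour per frame at 15FPS (assumed rate)

def frame_colour(secret: bytes, t_seconds: float) -> tuple[int, int, int]:
    """Derive the expected (R, G, B) wash for the frame covering time t.

    Anyone holding the secret can recompute this sequence later and
    compare it against the "excess" colour observed bouncing off walls,
    desks, and pupils in the footage.
    """
    step = int(t_seconds * STEPS_PER_SECOND)  # which 1/15s window we're in
    digest = hmac.new(secret, step.to_bytes(8, "big"), hashlib.sha256).digest()
    return digest[0], digest[1], digest[2]    # first three bytes as R, G, B
```

Because the sequence is deterministic given the secret and a claimed timestamp, a verifier can reconstruct what the ambient light "should" have been at 1:23pm and flag footage whose diffuse colour cast doesn't match.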


No, unsafe-yt

I'm not part of Oxide, but I think you're assuming everyone is okay with not controlling the hardware they run on.

There is plenty of addressable market where a public cloud, or even colocated hardware, doesn't make sense for whatever bespoke reason.

I don't think their target customer is startups, and that's okay. You likely aren't their target customer. But they've identified a customer profile, and want to provide a hardware + software experience that hasn't been available to that customer before.


An entire hotel probably wastes less from the mini fridges specifically than a family of 4 does in a year.


It's in the ballpark if you include all energy sources for the family.

100 rooms times, say, 50W each (5kW total), running year-round, is about 43,800kWh/yr. That's over 10 UK families of 4-5 (4,100kWh/yr each) for electricity, or about 2 if you include gas usage. So for Americans, it's probably much closer to parity.
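The arithmetic above, written out (the 50W average draw per minibar is the comment's assumption, as is the 4,100kWh/yr UK household electricity figure):

```python
ROOMS = 100
WATTS_PER_FRIDGE = 50        # assumed average draw per minibar
HOURS_PER_YEAR = 24 * 365

# Continuous 5kW total load over a year, converted to kWh.
hotel_kwh = ROOMS * WATTS_PER_FRIDGE * HOURS_PER_YEAR / 1000

UK_FAMILY_ELEC_KWH = 4_100   # typical UK household electricity use per year
families_equiv = hotel_kwh / UK_FAMILY_ELEC_KWH

print(round(hotel_kwh), round(families_equiv, 1))  # → 43800 10.7
```

Real minibars cycle their compressors rather than drawing constantly, so 50W is a time-averaged guess; halving it halves the family-equivalents proportionally.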

The fridge does dump heat into the room, so it carries a small extra penalty for the air conditioning in hot countries, but provides a small heating offset in cold countries (though an inefficient one compared to a heat pump).


As an American, I'd rather a single well-established player get a large contract and actually deliver than 20 disjointed companies each get 1/20th of the problem, have to work in concert, and possibly not deliver at all.


This is no different than spending $0.98 per roll to buy 32 rolls of toilet paper vs $1.33 per roll to buy only 12.
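The unit-price gap above, worked out:

```python
bulk_total = 0.98 * 32    # bulk pack: 32 rolls at $0.98/roll
small_total = 1.33 * 12   # smaller pack: 12 rolls at $1.33/roll

per_roll_saving = 1.33 - 0.98
print(f"${bulk_total:.2f} vs ${small_total:.2f}, saving ${per_roll_saving:.2f}/roll")
# → $31.36 vs $15.96, saving $0.35/roll
```

Per roll, the bulk pack is about 26% cheaper, which is the whole warehouse-club pitch.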

We have a Sam's Club membership because buying in bulk is cheaper.

Edit: checked prices Sam's vs Meijer


Arguably, CUDA is the current best-in-class software for its market.


Next week's volume will be down, but the week after that is back up to last year's volume...?


That was an incredibly long interview for them to just say that COGS is only $100 higher, but they feel the margin should be $1,200.


There's margin (on BOM cost), and then there's profit margin (above the design cost).

Design costs are probably an order of magnitude higher in USA than in China, and can't be spread over hundreds of thousands of units. I'd bet the profit isn't that great.


"Paul Newman alcohol" is just showing you results where those words are all present, it's not really implying how widely known it is.


What are you, an LLM? Look at the results of the first twenty hits and come back, then tell me that they don't speak to that specific issue.


Widely reported does not imply widely known.


How else does an LLM distinguish what is widely known, given there are no statistics collected on the general population's awareness of any given celebrity's vices? Robo-apologetics in full force here.

