Why do these video generation ones never become usable to the public. Is it just...

zoogeny · on Oct 4, 2024

There are a few available to the public. runway.ai and kling are a couple that I see heavily used on Twitter.

I pay for runway right now for experiments and it works. The problem is that maybe 1 out of 10 prompts results in something useable. And when I say useable I have pretty low standards. Since the model pumps out 5 or 10 second clips you have to be pretty creative since the models still struggle with keeping any kind of consistency between shots. Things like lighting, locations, characters can all morph within/between clips.

The issue isn't quality exactly, it is like 80% there. When it works, it is capable of blowing your mind. You can get something that looks like it is a bona fide Hollywood shot. But that is a single 5 second or 10 second clip. So far there is no easy way to reliably piece those together to make even a 1 minute long TikTok.

The real problem is the cost. Since you have to sometimes do 10 prompts to get a single acceptable shot it is like a 10x multiplier on the cost per second of video. That can get very expensive for even short experiments.
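To put rough numbers on that multiplier, here is a minimal sketch, assuming every attempt is billed the same and each prompt independently has the same chance of producing an acceptable shot. The function name and the $1.00 per generated second price are made up for illustration; they are not Runway's actual pricing.

```python
def effective_cost_per_usable_second(price_per_generated_second, success_rate):
    """Expected spend per second of usable footage.

    Assumes every attempt is billed and each attempt succeeds independently
    with probability success_rate, so 1 / success_rate attempts are needed
    on average per acceptable shot.
    """
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return price_per_generated_second / success_rate


# Hypothetical price of $1.00 per generated second, 1 in 10 prompts usable.
PRICE = 1.00
print(effective_cost_per_usable_second(PRICE, 0.1))

# A 60 second video built from usable footage at that rate.
print(60 * effective_cost_per_usable_second(PRICE, 0.1))
```

With a 1 in 10 hit rate the effective price is about ten times the listed price, which is the 10x multiplier described above.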

yurylifshits · on Oct 4, 2024

Hi zoogeny (and anyone else here) — you can try our new app Nim to address the Runway problems you describe https://alpha.nim.video

We offer both image-to-video (same situation as Runway, need a few attempts to make something awesome) and video-to-video (under the name "Restyle 2.0") - this is our newest tool and is highly reliable, i.e. you can get complex motion (kissing, handshakes, boxing, skateboarding, etc) with controllable changes to input video (changing outfits, characters, backgrounds, styles).

Unlike Runway and Kling, we currently offer a simple UNLIMITED plan for just $10/mo. Check it out! https://alpha.nim.video

zoogeny · on Oct 5, 2024

Thanks - will look into this more deeply once I am ready to start integrating generation into my tool.

Do you have an API that can be called? Are you interested in reselling your technology through 3rd party tools?

j0ej0ej0e · on Oct 5, 2024

What's the maximum video dimensions your service can output? with a 1024x1024 image it exports 512x512 on the free plan.

jonplackett · on Oct 4, 2024

Kling’s new 1.5 model is WAY better than anything else I’ve tried. Makes runway look terrible. Really good temporal consistency and even gets hair and clothes and stuff right.

They also just added the ability to do lip sync to a moving head and it gets the lighting right too - runway’s lip sync breaks if there’s any movement at all.

I’m gonna stop pumping Kling on this comment thread now - until they start paying me to advertise!

dvngnt_ · on Oct 4, 2024

GTA IV Real Life - Runway Gen 3 AI shows the potential to turn low-fidelity source into something life-like https://youtu.be/FGBSzSO8k6A it would be really cool to get this to work locally at playable rates

hackernewds · on Oct 4, 2024

How much do you pay? Imagine if they could charge premium prices to studios, like $100k/user

that's probably where the quality is, but not the billions

altairprime · on Oct 4, 2024

At “the public” Internet scale, if a hundred million people click Generate, imagine if Meta ends up paying a million dollars instantaneously.

- How many clicks of Generate are budgeted for?

- How many clicks should each user’s quota be?

- How much advertising revenue will be earned per click?

- Why should they give away a million dollars?

Right now, AI costs for this are so high that offering this feature ‘for free’ would bankrupt a small country in a matter of days, if everyone on Meta used it once. It doesn’t particularly matter what the exact cost is: it’s simply not tolerable to anyone who owes payment for the services provided.

This is also why the AI industry is trying to figure out how to shift as much AI processing as possible to devices without letting users copy their models to profit off of the training research spend.
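The budgeting questions above can be sketched as simple arithmetic. This is only an illustration using the hypothetical figures from the comment (a hundred million clicks costing a million dollars in total); the $50,000 budget, the one million user count, and the function names are assumptions added for the example, not real Meta numbers.

```python
from fractions import Fraction


def cost_per_click(total_cost_usd, clicks):
    """Average cost of one Generate click, kept exact with Fraction."""
    return Fraction(total_cost_usd, clicks)


def clicks_affordable(budget_usd, per_click):
    """How many clicks a fixed budget covers, rounded down."""
    return int(Fraction(budget_usd) // per_click)


def quota_per_user(total_clicks, users):
    """Even split of the budgeted clicks across users, rounded down."""
    return total_clicks // users


# Figures from the comment: 100 million clicks for 1 million dollars.
per_click = cost_per_click(1_000_000, 100_000_000)
print(float(per_click))

# Assumed budget of $50,000 shared by an assumed 1 million users.
budgeted = clicks_affordable(50_000, per_click)
print(budgeted, quota_per_user(budgeted, 1_000_000))
```

Under these assumptions, advertising revenue per click would need to exceed the per-click cost for the feature to pay for itself.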

tqi · on Oct 4, 2024

Meta owns their data centers, so I don't think that framing is quite right. Increased traffic might cost marginally more in terms of electricity usage, but I think mostly what would happen is the service would degrade.

yojo · on Oct 4, 2024

The hardware serving web requests on Facebook is very different from the hardware used to generate these videos. It’s different kit, that is currently quite expensive and power intensive.

Facebook absolutely does not have a fleet of GPUs idling that could suddenly spring into action to generate a billion of these videos, nor do they have power stations on standby ready to handle the electricity load.

tqi · on Oct 4, 2024

Right, my point is that "paying a million dollars instantaneously" isn't something that Meta would face the way a company with a public cloud infra would, and as a result their motivations / concerns are probably more along the lines of bad user experiences (due to performance bottlenecks) hurting public perception rather than runaway costs bankrupting the company.

altairprime · on Oct 4, 2024

Having recently seen cost analysis for hosted enterprise generative AI, we’ll continue to disagree on this point. You certainly are describing valid concerns but Meta never struck me as being particularly worried about how people think of them; and, I am certain this doesn’t have the ’degrade’ capability at the billion users scale — it would have work queue lengths measured in weeks or more, which is useless for social media.

afh1 · on Oct 4, 2024

Just release the model and anyone can run locally, there is no cost except for the end user. Meta has the cash flow to do this if they wanted.

roywiggins · on Oct 4, 2024

Meta probably doesn't want people generating porn (and worse) with their models or derivations of their models, for obvious reputational reasons.

ipaddr · on Oct 4, 2024

They are in the wrong business if that's the main concern and will get overshadowed by others as time goes on.

layer8 · on Oct 4, 2024

Consistency and continuity are the main problem. Take a look at the “Super Panavision” AI videos on YouTube.

Those videos are a good measure for monitoring AI video improvement.

causal · on Oct 4, 2024

I'd guess 1 in 10 model demos turn out to be useful product, at best.

This and Sora are particularly annoying, though, for how they put together these huge flashy showcases like they're announcing some kind of product launch and then... nothing. Apparently there's value in just flexing your AI-making muscle now and then.

93po · on Oct 4, 2024

to be fair, Sora was one of the most mind blowing technology showcases of my life, and openai is successful at raising tons of money

ActionHank · on Oct 4, 2024

Cost vs profitability is a big factor and those that don't have a product on the market are heavily cherry picking their demos.

FileSorter · on Oct 4, 2024

There are usable ones

runwayml.com

pika.art

hailuoai.com

grumbel · on Oct 4, 2024

klingai.com

lumalabs.ai

Apocryphon · on Oct 4, 2024

When I see lists of URLs like that I can only wonder what a future post archeologist, coming upon this long dusty thread half a decade from now, will find when they try to go to those sites.

ddtaylor · on Oct 4, 2024

I'm confused, the demo let me press a button and generate a video, was it not supposed to?

mitthrowaway2 · on Oct 4, 2024

I didn't see a button for that. Just "download paper". Did I miss it?

jonplackett · on Oct 4, 2024

KlingAI is pretty good - but only 5 second clips for their v 1.5 model which is much better than 1.0

I made this with it (after training a Flux Lora on myself)

https://vm.tiktok.com/ZGdJ6uSh1/

Also interesting - blog post from someone who actually got to use Sora https://www.fxguide.com/fxfeatured/actually-using-sora/

TLDR; it’s still quite frustrating to use

chankstein38 · on Oct 4, 2024

Came here to say this... These companies all want patted on the back for how cool their video models are but we're still waiting on Sora since like last year. More and more publish these "look at us" papers but don't publish the models or even give us access to them.

They do exist, Luma AI DreamMachine is pretty cool. As well as Kling, Minimax, etc. But they aren't anything like what Sora or this appear to be. They work but these, while likely cherry-picked, are still a whole new breed of video generation. But who knows if we'll ever actually get to use them or if we're just supposed to reflect on them and think about how cool and impressive Facebook and OpenAI are.