deepmind.google

salamo · 2024-05-15T02:18:34 1715739514

The first thing I will do when I get access to this is ask it to generate a realistic chess board. I have never gotten a decent-looking chessboard from any image generator: one with no deformed pieces, the correct number of squares, squares properly in a checkerboard pattern, pieces placed in the correct positions, the board oriented properly (white on the right!), and a position that isn't otherwise illegal. It seems to be an "AI complete" problem.
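
Some of the criteria in that checklist are mechanical enough to write down. Here is a minimal sketch in Python, assuming a hypothetical board representation (eight strings, rank 8 first, FEN-style piece letters, `.` for empty) and hypothetical helper names. It covers only the board size, the light square at h1, and a few piece-count rules, not full legality.

```python
def square_is_light(file_idx: int, rank_idx: int) -> bool:
    """file_idx 0..7 maps to files a..h, rank_idx 0..7 maps to ranks 1..8.

    a1 is dark, so h1 (the square on White's right) comes out light.
    """
    return (file_idx + rank_idx) % 2 == 1


def check_board(rows: list[str]) -> list[str]:
    """Return a list of problems found; an empty list means no problem detected."""
    if len(rows) != 8 or any(len(row) != 8 for row in rows):
        return ["board is not 8x8"]

    problems = []
    flat = "".join(rows)
    if flat.count("K") != 1 or flat.count("k") != 1:
        problems.append("each side needs exactly one king")
    if flat.count("P") > 8 or flat.count("p") > 8:
        problems.append("too many pawns")
    # rows[0] is rank 8 and rows[7] is rank 1; pawns can never stand on either.
    if any(piece in "Pp" for piece in rows[0] + rows[7]):
        problems.append("pawn on first or last rank")
    return problems


START = [
    "rnbqkbnr",
    "pppppppp",
    "........",
    "........",
    "........",
    "........",
    "PPPPPPPP",
    "RNBQKBNR",
]
```

Checking a generated image would still need a vision step to recover this grid from pixels, which is the hard part and is not shown here.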

arcticbull · 2024-05-15T02:23:33 1715739813

Similarly the Veo example of the northern lights is a really interesting one. That's not what the northern lights look like to the naked eye - they're actually pretty grey. The really bright greens and even the reds really only come out when you take a photo of them with a camera. Of course the model couldn't know that because, well, it only gets trained on photos. Gets really existential - simulacra energy - maybe another good AI Turing test, for now.

porphyra · 2024-05-15T05:03:54 1715749434

Human eyes are basically black and white in low light since rod cells can't detect color. But when the northern lights are bright enough you can definitely see the colors.

The fact that some things are too dark to be seen by humans but can be captured accurately with cameras doesn't mean that the camera, or the AI, is "making things up" or whatever.

Finally, nobody wants to see a video or a photo of a dark, gray, and barely visible aurora.

exodust · 2024-05-15T05:53:27 1715752407

> nobody wants to see a video or a photo of a dark, gray, and barely visible aurora

Except those who want to see an accurate representation of what it looks like to the naked eye.

stkhlm · 2024-05-15T06:14:03 1715753643

Living in northern Sweden I see the northern lights multiple times a year. I have never seen them pale or otherwise not colorful. Green and reds always. That is to my naked eye. Photographs do look more saturated, but the difference isn't as large as this comment thread makes it out to be.

peanut_merchant · 2024-05-15T09:09:24 1715764164

Even in Northern Scotland (further south than northern Sweden) this is the case. The latest aurora showing was vividly colourful to the naked eye.

shwaj · 2024-05-15T07:08:36 1715756916

That mirrors my experience from when I used to live in northern Canada

jabits · 2024-05-15T07:51:54 1715759514

Even in Upper Michigan near Lake Superior we sometimes had stunning, colorful Northern Lights. Sometimes it seemed like they were flying overhead within your grasp.

DaSHacka · 2024-05-15T08:32:00 1715761920

Most definitely, it's quite common to find people hanging around outside up towards Calumet whenever there's a night with a high KP Index.

I highly recommend checking them out if you're nearby; the recent auroras have been quite astonishing.

exodust · 2024-05-15T12:07:55 1715774875

I'm in Australia where the southern lights are known to be not as intense as northern lights. That's where my remark comes from. Those who have never seen the aurora with their own eyes may like to see an accurate photo. A rare find among the collective celebration of saturation.

fzzzy · 2024-05-15T11:11:05 1715771465

In the upper peninsula of michigan I have only seen grey.

Jensson · 2024-05-15T15:49:41 1715788181

That is the same latitude as Paris though, not very north at all.

freedomben · 2024-05-15T12:29:54 1715776194

Exactly. I went through major gaslighting trying to see the Aurora. I just wasn't sure whether I was actually seeing it, because it always looked so different from the photos. It is absolutely maddening trying to find a realistic photo of what it looks like to the naked eye, so that you can know if what you are seeing is actually the Aurora and not just clouds.

paxys · 2024-05-15T03:08:27 1715742507

That's not true at all. I have seen northern lights with my own eyes that were more neon green and bright purple than any mainstream photo.

mapt · 2024-05-16T02:30:29 1715826629

"With my own eyes"

But what sort of eyes are those?

Priming the opsins in your retina is a continuous process, and primed opsins are depleted rapidly by light. Fully adapting your eye to darkness takes a great deal of darkness and a great deal of time - on the order of an hour should set you up.

Most human beings in arctic regions live in places and engage in lifestyles where it's impossible to even come close to attaining the full light sensitivity of the human retina in perfect darkness. The sky never gets dark enough in a city or even a small town to get the full experience, and if you saw your smart watch five minutes ago you still haven't fully recovered your night vision. Even a sliver of moon makes remote dark-sky-sites dramatically brighter.

Everybody is going to have different degrees of the experience because they'll have eyes with different degrees of dark adaptation. And their brains are going to shift around the ~10^3x dynamic range of the eye up or down the light intensity scale by a factor ~10^6, without making it obvious to them.
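
To put those figures in photographic terms, here is a small sketch that converts the ratios quoted above into stops (doublings of light). The ratios are the ones from this comment, not measured values, and `stops` is a hypothetical helper name.

```python
import math


def stops(ratio: float) -> float:
    """Number of doublings (photographic stops) in a brightness ratio."""
    return math.log2(ratio)


instantaneous = stops(1e3)      # range the eye covers at one adaptation level
adaptation_shift = stops(1e6)   # how far that window can slide up or down
total = stops(1e3 * 1e6)        # combined span across adaptation states

print(f"{instantaneous:.1f} stops at once, shifted over {adaptation_shift:.1f} stops, "
      f"{total:.1f} stops in total")
```

So two observers under the same sky can be sitting roughly twenty stops apart in sensitivity without either of them noticing.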

cryptoz · 2024-05-15T04:17:22 1715746642

There's a middle ground here. I saw the northern lights with my own eyes just days ago and it was mostly grey. I saw some color. But when I took a photo with a phone camera, the color absolutely popped. So it may be that you've seen more color than any photo, but the average viewer in Seattle this past weekend saw grey-er with their eyes and huge color in their phone photos.

(Edit: it was still super-cool even if grey-ish, and there were absolutely beautiful colors in there if you could find your way out of the direct city lights)

goostavos · 2024-05-15T04:48:45 1715748525

The hubris of suggesting that your single experience of vaguely seeing the northern lights one time in Seattle has now led to a deep understanding of their true "color" and that the other person (perhaps all other people?) must be fooling themselves is... part of what makes HN so delightful to read.

I've also seen the northern lights with my own eyes. Way up in the arctic circle in Sweden. Their color changes along with activity. Grey looking sometimes? Sure. But also colors that are so vivid that it feels like it envelops your body.

lpapez · 2024-05-15T08:19:19 1715761159

> The hubris of suggesting that your single experience of vaguely seeing the northern lights one time in Seattle has now led to a deep understanding of their true "color" and that the other person (perhaps all other people?) must be fooling themselves is... part of what makes HN so delightful to read.

The H in HN stands for Hubris.

stavros · 2024-05-15T07:37:33 1715758653

They did say "the average viewer in Seattle this past weekend", not "all other viewers".

Then again, the average viewer in Seattle this past weekend is hardly representative of what the northern lights look like.

freedomben · 2024-05-15T12:34:22 1715776462

The person they were responding to was saying that the people reporting grays were wrong, and that they had seen it and it was colorful. If anything, you should be accusing that person of hubris, not GP. GP's only point was that it can differ in different situations. They used the example of Seattle to show that the person they were responding to is not correct that it is never gray and dull.

mitthrowaway2 · 2024-05-15T20:41:17 1715805677

The human retina effectively combines a color sensor with a monochrome sensor. The monochrome channel is more light-sensitive. When the lights are dim, we'll dilate our pupils, but there's only so much we can do to increase exposure. So in dim light we see mostly in grayscale, even if that light is strongly colored in spectral terms.

Phone cameras have a Bayer filter which means they only have RGB color-sensing. The Bayer filter cuts out some incoming light and dims the received image, compared with what a monochrome camera would see. But that's how you get color photos.

To compensate for a lack of light, the phone boosts the gain and exposure time until it gets enough signal to make an image. When it eventually does get an image, it's getting a color image. This comes at the cost of some noise and motion-blur, but it's that or no image at all.

If phone cameras had a mix of RGB and monochrome sensors like the human eye does, low-light aurora photos might end up closer to matching our own perception.
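
A toy model of the difference described above, as a sketch only. The cone threshold, the sample pixel value, and the function names are made-up assumptions for illustration; the luminance weights are the standard Rec. 709 coefficients. Real eyes and real camera pipelines are far more complicated.

```python
def eye_view(rgb, cone_threshold=0.05):
    """Toy eye: below the (assumed) cone threshold only brightness is perceived."""
    r, g, b = rgb
    luma = 0.2126 * r + 0.7152 * g + 0.0722 * b  # Rec. 709 luminance weights
    if luma < cone_threshold:
        return (luma, luma, luma)  # monochrome channel only: a gray pixel
    return rgb


def camera_view(rgb, exposure_s, gain):
    """Toy camera: every colour channel is scaled by exposure time and gain, then clipped."""
    return tuple(min(1.0, channel * exposure_s * gain) for channel in rgb)


FAINT_AURORA = (0.002, 0.02, 0.004)  # hypothetical dim, green-dominant light

print(eye_view(FAINT_AURORA))                             # three equal values: gray
print(camera_view(FAINT_AURORA, exposure_s=4.0, gain=8.0))  # green channel dominates
```

With a brighter input such as `(0.1, 0.6, 0.1)` the toy eye returns the colour unchanged, which matches the reports elsewhere in this thread that strong displays look vividly green to the naked eye.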

hoyd · 2024-05-15T05:25:01 1715750701

I can see what you mean, and that the video is somewhat not what it would be like in real life. I have lived in northern Norway most of my life, and watched Auroras a lot. It certainly looks green and pink most of the time. Fainter, it would perhaps look gray, I guess? Red, when viewed from a more southern viewpoint.

I work at Andøya Space where perhaps most of the space research on Aurora has been done by sending scientific rockets into space for the last 60 yrs.

pmlarocque · 2024-05-15T02:42:01 1715740921

That's not true, they look grey when they aren't bright enough, but they can look green or red to the naked eye if they are bright. I have seen it myself and yes I was disappointed to see only grey ones last week.

see: https://theconversation.com/what-causes-the-different-colour...

arcticbull · 2024-05-15T02:45:26 1715741126

> "[Aurora] only appear to us in shades of gray because the light is too faint to be sensed by our color-detecting cone cells."

> Thus, the human eye primarily views the Northern Lights in faint colors and shades of gray and white. DSLR camera sensors don't have that limitation. Couple that fact with the long exposure times and high ISO settings of modern cameras and it becomes clear that the camera sensor has a much higher dynamic range of vision in the dark than people do.

https://www.space.com/23707-only-photos-reveal-aurora-true-c...

This aligns with my experiences.

The brightest ones I saw in Northern Canada I even saw hints of reds - but no real greens - until I looked at it through my phone, and it looked just like the simulated video.

If I looked up and saw them the way they appear in the simulation, in real life, I'd run for a pair of leaded undies.

Kiro · 2024-05-15T05:26:39 1715750799

That is totally incorrect, which anyone who has seen real northern lights can attest to. I'm sorry that you haven't gotten the chance to experience it and now think all northern lights are that lackluster.

Tronno · 2024-05-15T02:58:28 1715741908

I've seen it bright green with the naked eye. It definitely happens. That article is inaccurate.

Maxion · 2024-05-15T05:37:39 1715751459

Greens are the more common colors, reds and blues occur in higher energy solar storms.

And yes, they can be as green to the naked eye in that AI video. I've seen aurora shows that fill the entire night sky from horizon to horizon, way more impressive than that AI video with my own eyes.

kortilla · 2024-05-15T05:34:34 1715751274

This is such an arrogant pile of bullshit. I’ve seen very obvious colors on many different occasions in the northern part of the lower 48, up in southern Canada, and in Alaska.

blhack · 2024-05-15T04:49:07 1715748547

Have you ever seen the Northern Lights with your eyes? If so I'm curious where you saw them.

I echo what some other posters here have said: they're certainly not gray.

simonjgreen · 2024-05-15T06:02:03 1715752923

To be fair, the prompt isn’t asking for a realistic interpretation it’s asking for a timelapse. What it’s generated is absolutely what most timelapses look like.

> Prompt: Timelapse of the northern lights dancing across the Arctic sky, stars twinkling, snow-covered landscape

sdenton4 · 2024-05-15T02:41:50 1715740910

That doesn't seem in any way useful, though... To use a very blunt analogy, are color blind people intelligent/sentient/whatever? Obviously, yes: differences in perceptual apparatus aren't useful indicators of intelligence.

shermantanktop · 2024-05-15T02:59:46 1715741986

As a colorblind person…I could see the northern lights way better than all the full-color-vision people around me squinting at their phones.

Wider bandwidth isn’t always better.

Ferret7446 · 2024-05-15T06:04:48 1715753088

> I could see the northern lights way better than all the full-color-vision people around me

How would you know?

squeaky-clean · 2024-05-15T08:18:14 1715761094

Quote the entire sentence, not just a portion of it.

Ferret7446 · 2024-05-16T06:15:02 1715840102

I don't see how that's relevant, unless you're able to possess people looking at their phones to experience what they're experiencing.

shermantanktop · 2024-05-19T02:56:31 1716087391

To add a bit of color (ha) I was with my color-sighted spouse at a spot well known for panoramic views. 50ish people there. Many conversations happening around me.

“I can’t see anything” “Maybe that’s something over there?” “What’s everyone looking at?”

Someone shows their phone.

“Ooh!” “How do you turn on night mode?” “Wow it’s so much clearer on the phone!”

So I can’t know what their eyes see or what they really think, I could hear what came out of their mouths.

I don’t think this is an instance that warrants deep philosophical skepticism about the nature of truth or the impossibility of knowledge.

22c · 2024-05-15T02:40:47 1715740847

I've only ever seen photos of the northern lights and I also didn't know that.

laserbeam · 2024-05-15T03:40:45 1715744445

For decades, game engines have been working on realistic rendering. Bumping quality here and there.

The golden standard for rendering has always been cameras. It’s always photo-realistic rendering. Maybe this won’t be true for VR, but so far most effort is to be as good as video, not as good as the human eye.

Any sort of video generation AI is likely to have the same goal. Be as good as top notch cameras, not as eyes.

darkstar_16 · 2024-05-15T07:55:58 1715759758

Northern lights are actually pretty colourful, even to the naked eye. I've never seen them pale or b/w

Kiro · 2024-05-15T05:20:23 1715750423

Shouldn't the model reflect how it looks on video rather than our naked eye?

skypanther · 2024-05-15T13:23:56 1715779436

What struck me about the northern lights video was that it showed the Milky Way crossing the sky behind the northern lights. That bright part of the Milky Way is visible in the southern sky but the aurora hugging the horizon like that indicates the viewer is looking north. (Swap directions for the southern hemisphere and the aurora australis.)

garyrob · 2024-05-15T04:40:07 1715748007

Even in NY State, Hudson River Valley, I've seen them with real color. They're different each time.

poulpy123 · 2024-05-15T09:29:34 1715765374

that's a bad example since the only images of aurora borealis are brightly colored. What I expect of an image generator is to output what is expected from it

mikeocool · 2024-05-15T11:50:41 1715773841

Ha, wow, I’d never seen this one before. The failures are pretty great. Even repeatedly trying to correct ChatGPT/Dall-e with the proper number of squares and pieces, it somehow makes it worse.

This is what dall-e came up with after trying to correct many previous iterations: https://imgur.com/Ss4TwNC

Etherlord87 · 2024-05-16T14:54:13 1715871253

As someone who criticizes AI a lot: this actually looks pretty cool! AI is not better at surrealism than a good artist, but at least its work is enjoyable as surreal art. Justifies the name Dall-e pretty well too.

sdenton4 · 2024-05-15T02:43:53 1715741033

This strikes me as equally "AI complete" as drawing hands, which is now essentially a solved problem... No one test is sufficient, because you can add enough training data to address it.

dongping · 2024-05-15T20:23:15 1715804595

Not sure about better models, but DALL-E3 still seems to be having problems with hands:

https://www.reddit.com/r/dalle2/comments/1afhemf/is_it_possi...

https://www.reddit.com/r/dalle2/comments/1cdks71/a_hand_with...

Etherlord87 · 2024-05-16T14:56:13 1715871373

As opposed to legs, eyes, construction elements? ;)

salamo · 2024-05-15T02:58:58 1715741938

Yeah "AI complete" is a bit tongue-in-cheek but it is a fairly spectacular failure mode of every model I've tried.

swyx · 2024-05-15T10:41:24 1715769684

ive been using “agi-hard” https://latent.space/p/agi-hard as a term

because completeness isnt really what we are going for

smusamashah · 2024-05-15T08:47:52 1715762872

Ideogram and dalle do hands pretty well

sabellito · 2024-05-15T07:29:12 1715758152

Per usual the top comment on anything AI related is snark on "it can't do [random specific thing] well yet".

kmacdough · 2024-05-15T09:55:02 1715766902

Tiring, but so is the relentless over-marketing. Each new demo implies new use cases and flexible performance. But the reality is they're very brittle and blunder most seemingly simple tasks. I would personally love an ongoing breakdown of the key weaknesses. I often wonder "can it X?" The answer is almost always "almost, but not a useful almost".

perbu · 2024-05-15T12:44:47 1715777087

Most generative AI will struggle when given a task that requires something more or less exact. They're probably pretty good at making something "chessish".

creatonez · 2024-05-18T01:45:17 1715996717

> It seems to be an "AI complete" problem.

Conventionally this term means the opposite -- problems that AI unlocks that conventional computing could not do. Conventional computing can render a very wide range of different stylized chess boards, but when an ML technique like diffusion is applied to this mundane problem, it falls apart.

Trixter · 2024-05-16T20:52:43 1715892763

Mine is generation of any actual IBM PC/XT computer. All of the training sets either didn't include actual IBM PCs in them, or they labeled all PC compatibles "IBM PC". Whatever the reason, no generative AI today, whether commercial or open-source, can generate any picture of an IBM PC 5150. Once that situation improves, I'll start taking notice.

svag · 2024-05-14T20:34:01 1715718841

An interesting thing that Google does is to watermark the AI generated videos using the [SynthID technology](https://deepmind.google/technologies/synthid/).

It seems that the SynthID is not only for AI generated video but for image, text and audio.

bardak · 2024-05-15T01:02:50 1715734970

I would like a bit more convincing that the text watermark will not be noticeable. AI text already has issues with using certain words too frequently. Messing with the weights seems like it might make the issue worse.

Tostino · 2024-05-15T01:12:47 1715735567

Not to mention, when does it get applied? If I am asking an llm to transform some data from one format to another, I don't expect any changes other than the format.

padolsey · 2024-05-15T04:33:19 1715747599

It seems really clever, especially the encoding of a signature into LLM token probability selections. I wonder if synthid will trigger some standardization in the industry. I don't think there's much incentive to tho. Open-source gen AI will still exist. What does google expect to occur? I guess they're just trying to present themselves as 'ethically pursuing AI'.
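
For anyone wondering what "a signature in token probability selections" could look like, here is a minimal sketch of a generic keyed "green list" watermark. This is an assumption for illustration and is not claimed to be how SynthID works; the vocabulary, the key, the uniform stand-in "model", and all function names are hypothetical.

```python
import hashlib
import random

VOCAB = [f"tok{i}" for i in range(50)]  # hypothetical toy vocabulary
KEY = b"demo-key"                       # hypothetical secret key


def green_list(prev_token: str) -> set[str]:
    """Keyed pseudo-random half of the vocabulary, determined by the previous token."""
    digest = hashlib.sha256(KEY + prev_token.encode()).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return set(rng.sample(VOCAB, len(VOCAB) // 2))


def generate(n: int, bias: float, seed: int = 0) -> list[str]:
    """Stand-in for an LLM: uniform token weights, with green tokens upweighted by `bias`."""
    rng = random.Random(seed)
    out = ["tok0"]
    for _ in range(n):
        greens = green_list(out[-1])
        weights = [1.0 + bias if tok in greens else 1.0 for tok in VOCAB]
        out.append(rng.choices(VOCAB, weights=weights, k=1)[0])
    return out


def green_fraction(tokens: list[str]) -> float:
    """Detector: share of tokens that fall in the green list of their predecessor."""
    hits = sum(tokens[i] in green_list(tokens[i - 1]) for i in range(1, len(tokens)))
    return hits / (len(tokens) - 1)


print(green_fraction(generate(400, bias=4.0)))  # well above one half
print(green_fraction(generate(400, bias=0.0)))  # close to one half
```

In a scheme like this, a larger `bias` makes detection easier but distorts word choice more, and a task with only one acceptable output (such as a format conversion) leaves little room to embed anything. Both concerns raised upthread apply to this toy version at least.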

ugh123 · 2024-05-14T20:40:45 1715719245

From a filmmaking standpoint I still don't think this is impactful.

For that it needs a "director" to say: "turn the horse's head 90˚ the other way, trot 20 feet, and dismount the rider" and "give me additional camera angles" of the same scene. Otherwise this is mostly b-roll content.

I'm sure this is coming.

qingcharles · 2024-05-14T21:56:18 1715723778

I can see using these video generators to create video storyboards. Especially if you can drop in a scribbled sketch and a prompt for each tile.

ancientworldnow · 2024-05-15T00:08:32 1715731712

That sounds actively harmful. Often we want story boards to be less specific so as not to have some non artist decision maker ask why it doesn't look like the storyboard.

And when we want it to match exactly in an animatic or whatever, it needs to be far more precise than this, matching real locations etc.

gregmac · 2024-05-15T03:13:58 1715742838

I hadn't thought about that in movie context before, but it totally makes sense.

I've worked with other developers that want to build high fidelity wire frames, sometimes in the actual UI framework, probably because they can (and it's "easy"). I always push back against that, in favor of using whiteboard or Sharpies. The low-fidelity brings better feedback and discussion: focused on layout and flow, not spacing and colors. Psychologically it also feels temporary, giving permission for others to suggest a completely different approach without thinking they're tossing out more than a few minutes of work.

I think in the artistic context it extends further, too: if you show something too detailed it can anchor it in people's minds and stifle their creativity. Most people experience this in an ironically similar way: consider how you picture the characters of a book differently depending on if you watched the movie first or not.

sbarre · 2024-05-15T00:24:34 1715732674

I know you weren't implying this, but not every storyboard is for sharing with (or seeking approval from) decision makers.

I could see this being really useful for exploring tone, movement, shot sequences or cut timing, etc..

Right now you scrape together "kinda close enough" stock footage for this kind of exploration, and this could get you "much closer enough" footage..

shermantanktop · 2024-05-15T03:15:13 1715742913

I think of it in terms of the anchoring bias. Imagine that your most important decisions are anchored for you by what a 10 year old kid heard and understood. Your ideas don’t come to life without first being rendered as a terrible approximation that is convincing to others but deeply wrong to you, and now you get to react to that instead of going through your own method.

So if it’s an optional tool, great, but some people would be fine with it, some would not.

sbarre · 2024-05-15T12:09:31 1715774971

Absolutely. Everyone's creative process is different (and valid).

cpill · 2024-05-15T09:33:52 1715765632

I guess this will give birth to a new kind of film making. Start with a rough sketch, generate 100 higher quality versions with an image generator, select one to tweak, use that as input to a video generator which generates 10 versions, choose one to refine, etc.

larodi · 2024-05-15T09:11:44 1715764304

Perhaps the only industry which immediately benefits from this is the short ads and perhaps TikTok. But still it is very dubious, as people seem to actually enjoy being themselves the directors of their thing, not somebody else.

Maybe this works for ads for a döner place or a shisha bar in some developing country. I’ve seen generated images used for menus in such places.

But I doubt a serious filmography can be done this way. And if it can - it’d be again thanks to some smart concept on behalf of humans.

imachine1980_ · 2024-05-15T01:50:06 1715737806

Stock videos are indeed crucial, especially now that we can easily search for precisely what we need. Take, for instance, the scene at the end of 'Look Up' featuring a native American dance in Peru. The dancer's movements were captured from a stock video, and the comet falling was seamlessly edited in. now imagine having near infinite stock videos tailored to the situation.

rzmmm · 2024-05-15T03:44:33 1715744673

Stock photographers are already having issues with piracy due to very powerful AI watermark removal tools. And I suspect the companies are using content of these people to train these models too.

Shocka1 · 2024-05-16T13:12:41 1715865161

Unlimited possibilities. And more is coming - we're only in the beginning stages of this tech. Truly exciting stuff.

chacham15 · 2024-05-15T01:30:52 1715736652

I dont think "turn the horse's head 90˚" is the right path forward. What I think is more likely and more useful is: here is a start keyframe and here is a stop keyframe (generated by text to image using other things like controlnet to control positioning etc.) and then having the AI generate the frames in between. Dont like the way it generated the in between? Choose a keyframe, adjust it, and rerun with the segment before and segment after.

GenerocUsername · 2024-05-15T02:16:44 1715739404

This appeals to me because it feels auditable and controllable... But at the pace these things have been progressing the last 3 years, I could imagine the tech leapfrogs all conventional understanding real soon. Likely outputting gaussian splat style outputs where the scene is separate from the camera and all pieces can be independently tweaked via a VR director chair.

8note · 2024-05-15T03:41:10 1715744470

So a declarative keyframe of "the horses head is pointed forward" and a second one of "the horse is looking left"

And let the robot tween?

Vs an imperative for "tween this by turning the horse's head left"
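
A minimal sketch of the declarative version, assuming a hypothetical `Keyframe` type with a single pose parameter. A video model would generate the in-between frames rather than linearly interpolate a number, so this only illustrates the "state two poses, let the machine fill the gap" idea.

```python
from dataclasses import dataclass


@dataclass
class Keyframe:
    frame: int
    head_angle_deg: float  # hypothetical pose parameter: 0 = forward, 90 = left


def tween(start: Keyframe, stop: Keyframe) -> list[Keyframe]:
    """Fill in every frame from start to stop by linear interpolation."""
    span = stop.frame - start.frame
    if span <= 0:
        raise ValueError("stop keyframe must come after start keyframe")
    delta = stop.head_angle_deg - start.head_angle_deg
    return [
        Keyframe(start.frame + i, start.head_angle_deg + delta * i / span)
        for i in range(span + 1)
    ]


# Declarative: state the two poses and let the tween decide how to get there.
frames = tween(Keyframe(0, 0.0), Keyframe(4, 90.0))
print([k.head_angle_deg for k in frames])
```

Adjusting a result then means editing one keyframe and rerunning the two neighbouring segments, as suggested upthread.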

evantbyrne · 2024-05-14T21:45:08 1715723108

They claim it can accept an "input video and editing command" to produce a new video output. Also, "In addition, it supports masked editing, enabling changes to specific areas of the video when you add a mask area to your video and text prompt." Not sure if that specific example would work or not.

sailfast · 2024-05-14T23:04:43 1715727883

For most things I view on the internet B-roll is great content, so I'm sure this will enable a new kind of storytelling via YouTube Shorts / Instagram, etc at minimum.

kmacdough · 2024-05-15T10:45:07 1715769907

I wouldn't be so sure it's coming. NNs currently dont have the structures for long term memory and development. These are almost certainly necessary for creating longer works with real purpose and meaning. It's possible we're on the cusp with some of the work to tame RNNs, but it's taken us years to really harness the power of transformers.

Eji1700 · 2024-05-15T01:12:44 1715735564

There's also the whole "oh you have no actual model/rigging/lighting/set to manipulate" for detail work issue.

That said, I personally think the solution will not be coming that soon, but at the same time, we'll be seeing a LOT more content that can be done using current tools, even if that means a dip in quality (severely) due to the cost it might save.

SJC_Hacker · 2024-05-15T02:52:34 1715741554

This leads me to the question of why there hasn't been an effort to do this with 3D content (that I know of).

Because camera angles/lighting/collision detection/etc. at that point would be almost trivial.

I guess with the "2D only" approach that is based on actual, acquired video you get way more impressive shots.

But the obvious application is for games. Content generation in the form of modeling and animation is actually one of the biggest cost centers for most studios these days.

gedy · 2024-05-15T01:58:51 1715738331

I think with AI content, we'd need to not treat it like expecting fine grained control. E.g. instead like "dramatic scene of rider coming down path, and dismounting horse, then looking into distance", etc. (Or even less detail eventually once a cohesive story can be generated.)

thehappypm · 2024-05-15T13:15:58 1715778958

If you or I don’t see the potential here, I think that just means someone more creative is going to do amazing things with it

Shocka1 · 2024-05-16T13:17:17 1715865437

HN has always been notoriously negative, and wrong a lot of the time. One of my personal favorites is Brian Armstrong's post about an exciting new company he was starting around cryptocurrency and needing a co-founder... Always a good one to go back and read when I've been staying up late working on side projects and need a mental boost.

https://news.ycombinator.com/item?id=3754664

thehappypm · 2024-05-16T19:26:37 1715887597

Wow, that is a really negative thread. To be fair it’s not the best post either, but it shows that people jump to negativity really fast.

teaearlgraycold · 2024-05-15T01:38:06 1715737086

Everything I’ve heard from professionals backs that up. Great for B roll. Great for stock footage. That’s it.

aetherson · 2024-05-15T01:37:59 1715737079

Yeah, I've made a lot of images, and it sure is amazing if all you're interested in is, like, "Any basically good image," but if you start needing something very particular, rather than "anything that is on a general topic and is aesthetically pleasing," it gets a lot harder.

And there are a lot more degrees of freedom to get something wrong in film than in a single still image.

lofaszvanitt · 2024-05-15T05:52:43 1715752363

I can't wait to see what the big video camera makers are going to do with tech similar to this. Since Google clearly has zero idea what to do with this, and they lack the creativity, it's up to ARRI, Canon, Panasonic etc. to create their own solutions for this tech. I can't wait to see what Canon has up its sleeves with their new offerings that come in a few months.

loudmax · 2024-05-14T18:14:59 1715710499

The videos in this demo are pretty neat. If this had been announced just four months ago we'd all be very impressed by the capabilities.

The problem is that these video clips are very unimpressive compared to the Sora demonstration which came out three months ago. If this demo was announced by some scrappy startup it would be worth taking note. Coming from Google, the inventor of the Transformer and owner of the largest collection of videos in the world, these sample videos are underwhelming.

Having said that, Sora isn't publicly available yet, and maybe Veo will have more to offer than what we see in those short clips when it gets a full release.

alex_duf · 2024-05-15T08:36:51 1715762211

>these sample videos are underwhelming

wow the speed at which we can be blasé is terrifying. 6 months ago this was not possible, and felt this was years away!

They're not underwhelming to me, they're beyond anything I thought would ever be possible.

are you genuinely unimpressed? or maybe trying to play it cool?

steamer25 · 2024-05-15T17:48:11 1715795291

They didn't really do a very good job of selecting marketing examples. The only good one, that shows off creative possibilities, is the knit elephant. Everything else looks like the results of a (granted fairly advanced) search through a catalog of stock footage.

Even search, in and of itself, is incredibly amazing but fairly commoditized at this point. They should've highlighted more unique footage.

danielbln · 2024-05-15T10:27:22 1715768842

The faster the tech cycle, the faster we become accustomed to it. Look at your phone, an absolute, wondrous marvel of technology that would have been utterly and totally scifi just 25 years ago. Yet we take it for granted, as we do with all technology eventually. The time frames just compress is all, for better or for worse.

newswasboring · 2024-05-15T11:22:22 1715772142

Yeah man but there has to be some thresholds. We take phones for granted after years of active availability. I personally remember days when "what if your phone dies" was a valid concern for even short periods, and I'm not that old. Sora isn't even available publicly. At some point it crosses over from being jaded to just being a cynic.

loudmax · 2024-05-15T12:28:49 1715776129

On some level, it's healthy to retain a sense of humility at the technological marvels around us. Everything about our daily lives is impressive.

Just a few years ago, I would have been absolutely blown away by these demo videos. Six months ago, I would have been very impressed. Today, Google is rolling a product that seems second best. They're playing catch-up in a game where they should be leading.

I will still be very impressed to see videos of that quality generated on consumer grade hardware. I'll also be extremely impressed if Google manages to roll out public access to this capability without major gaffes or embarrassments.

This is very cool tech, and the developers and engineers that produced it should be proud of what they've achieved. But Google's management needs to be asking itself how they've allowed themselves to be surpassed.

fakedang · 2024-05-14T18:37:39 1715711859

Honestly, if Veo becomes public faster than Sora, they could win the video AI race. But what am I wishfully thinking - it's Google we're talking about!

Jensson · 2024-05-14T19:34:57 1715715297

> But what am I wishfully thinking - it's Google we're talking about!

Google, the company known to launch way too many products? What other big company launches more stuff early than them? What people complain about with Google is that they launch too much and then shut things down, not that they don't launch things.

fakedang · 2024-05-15T22:22:29 1715811749

Google lost first place in AI precisely because they've been walking around imaginary eggshells regarding AI's effect on the public. That led to the whole Gemini fiasco and the catch up game they've had to play with OpenAI-MSFT.

spaceman_2020 · 2024-05-15T06:37:49 1715755069

The cost to switch to new models is negligible. People will switch to Sora instantly if it's better.

I’ve switched to Opus from GPT-4 for coding and it was non-trivially easy

ndls · 2024-05-15T06:54:40 1715756080

I think you used non-trivially wrong there, bud.

spaceman_2020 · 2024-05-22T06:13:35 1716358415

hah, I did :)

SilverSlash · 2024-05-15T09:30:36 1715765436

Except your single experience doesn't mean it's generally true, bud. For instance I have not switched to Opus despite claims that it is better because I don't want to go through the effort of cancelling my ChatGPT subscription and subbing to Claude. Plus I like getting new stuff early that OpenAI occasionally gives out and the same could apply for Google's AI.

fakedang · 2024-05-15T22:24:22 1715811862

Sorry, but lock-in effects are real. End users, solo devs and startups might find it trivially easy, but enterprise clients would jump through hoops before a decision is made. And enterprise clients would rather not go through with that, hence they'll stick to whoever came first, unless there's a massive differentiator between the two.

xnx · 2024-05-14T20:31:23 1715718683

60 second example video: https://www.youtube.com/watch?v=diqmZs1aD1g

candiddevmike · 2024-05-15T01:39:22 1715737162

For some reason this video reminds me of dreaming--details just kind of pop in and out and the entire thing seems very surreal and fractal.

jprete · 2024-05-15T02:53:47 1715741627

Same impression here. The scene changes very abruptly from a sky view to following the car. The cars meld with the ground frequently, and I think I saw one car drive through another at one point.

londons_explore · 2024-05-15T04:05:53 1715745953

Looks like in places this has learned video compression artifacts...

exodust · 2024-05-15T06:43:06 1715755386

Funny if true. Perhaps in some generated video it will suddenly interrupt the sequence with pretend unskippable ads for phone cases & VPNs.

nixpulvis · 2024-05-15T02:38:08 1715740688

So… much… bloom. I like it, but still holy shit. I hate that I like it because I don’t want this art form to be reduced by overuse. Sadly, it’s too late.

I’ll just go back to living under a rock.

antifa · 2024-05-26T19:49:58 1716752998

datashaman · 2024-05-15T10:27:17 1715768837

1080p but it has pixelated artifacts...

mccraveiro · 2024-05-14T18:15:01 1715710501

They didn't show any human videos, which could indicate that the technology struggles with generating them.

chubot · 2024-05-14T19:54:25 1715716465

It's also probably that it's easier to spot fake humans than to spot fake cats or camels. We are more attuned to the faces of our own species

That is, AI humans can look "creepy" whereas AI animals may not. The cowboy looks pretty good precisely because it's all shadow.

CGI animators can probably explain this better than I can ... they have to spend way more time on certain areas and certain motions, and all the other times it makes sense to "cheat" ...

It explains why CGI characters look a certain way too -- they have to be economical to animate

himinlomax · 2024-05-14T21:40:48 1715722848

They're probably still wary of their latest PR disaster, the inclusive and diverse WW2 Germans from Gemini.

revscat · 2024-05-14T18:37:35 1715711855

I’m sure part of the reason, beyond those given already, is that they want to avoid the debate around nudity.

karmasimida · 2024-05-14T18:22:33 1715710953

Actually, there is one in the last demo. It isn't a standalone clip, but one shot shows a team using the model to create a scene with a human in it: an image of a black woman, though only her head is in frame.

I would generally agree though, it is not normal that they didn’t show more humans.

ants_everywhere · 2024-05-16T02:01:59 1715824919

Gemini still won't generate images of humans or even other hominids. They're missing here probably for the same reason. Namely that they're trying to figure out how to balance diverse representation with all the various other factors.

dyauspitr · 2024-05-14T19:22:40 1715714560

You know why and it’s not that their technology struggles with it.

lewispollard · 2024-05-15T06:57:37 1715756257

Please elaborate, because I certainly don't.

blinky88 · 2024-05-15T07:36:32 1715758592

I think he's talking about the diversity controversy

dyauspitr · 2024-05-15T21:20:24 1715808024

That might be a factor too but I was referring more to the nudity and objectification issue.

mjfl · 2024-05-14T21:21:54 1715721714

thank goodness.

popcar2 · 2024-05-14T18:27:41 1715711261

Not nearly as impressive as Sora. Sora was impressive because the clips were long and had lots of rapid movement since video models tend to fall apart when the movement isn't easy to predict.

By comparison, the shots here are only a few seconds long and almost all look like slow motion or slow panning shots cherrypicked because they don't have that much movement. Compare that to Sora's videos of people walking in real speed.

The only shot they had that can compare was the cyberpunk video they linked to, and it looks crazy inconsistent. Real shame.

latexr · 2024-05-14T19:53:42 1715716422

> Not nearly as impressive as Sora. Sora was impressive because the clips were long and had lots of rapid movement

The most impressive Sora demo was heavily edited.

https://www.fxguide.com/fxfeatured/actually-using-sora/

jsheard · 2024-05-14T20:16:40 1715717800

To Shy Kids' credit, they made it clear the Sora footage was heavily edited, but OpenAI's site still presents Air Head without that context.

https://www.youtube.com/watch?v=KFzXwBZgB88 (posted the day after the short debuted)

https://openai.com/index/sora-first-impressions (no mention of editing, nor do they link to the above making-of video)

seoulmetro · 2024-05-14T22:49:21 1715726961

There is now on that second link:

>The videos below were edited by the artists, who creatively integrated Sora into their work, and had the freedom to modify the content Sora generated.

jsheard · 2024-05-14T22:54:21 1715727261

Ha, here's an archive from yesterday for posterity.

https://web.archive.org/web/20240513050023/https://openai.co...

They also just added a link to the making-of video.

Aeolun · 2024-05-14T23:45:31 1715730331

If you modified something because it got some attention on HN, at least have the guts to own up to it :/

seoulmetro · 2024-05-15T01:02:35 1715734955

That's hilarious. Your comment clearly got seen by someone.

rvz · 2024-05-14T20:07:47 1715717267

Interesting to see that OpenAI was successful in creating their own reality distortion spells, just like Apple's reality distortion field which has fooled many of these commenters here.

It's quite early to race to the conclusion that one is better than the other when not only are they both unreleased, but the demos can be edited, faked or altered to look great for optics and distortion.

EDIT: It appears there is at least one commenter who replied below that is upset with this fact above.

It is OK to cope, but the truth really doesn't care especially when the competition (Google) came out much stronger than expected with their announcements.

ijidak · 2024-05-14T22:17:23 1715725043

Well, as a counterpoint, Apple did become a $2 trillion company...

Distortion is easiest when the products really work. :)

adventured · 2024-05-14T22:25:30 1715725530

Apple got up to $3 trillion back in 2023.

turnsout · 2024-05-15T00:48:17 1715734097

Indeed, and they’re at 2.87T today… Built largely on differentiated high-margin products, which is not how I would describe OpenAI. I should clarify that I’m a fan of both companies, but the reality is that OpenAI’s business model depends on how well it can commoditize itself.

fkyoureadthedoc · 2024-05-14T20:14:51 1715717691

[flagged]

latexr · 2024-05-14T20:23:13 1715718193

HN guidelines ask commenters to be kind and for the discussion to get more thoughtful and substantive as it progresses.

If you believe a comment is so bad as to warrant shame and embarrassment, please explain why you think so, rather than being dismissive and spewing insults.

On a related note, that is likely why you’re being downvoted. I wouldn’t be surprised if the comment is soon flagged.

hanspeter · 2024-05-15T06:32:24 1715754744

I believe it was clear that Air Head was an edited video.

The intention wasn't to show "This is what Sora can generate from start to end" but rather "This is what a video production team can do with Sora instead of shooting their own raw footage."

Maybe not so obvious to others, but for me it was clear from how the other demo videos looked.

Jensson · 2024-05-14T18:59:46 1715713186

> Sora was impressive because the clips were long and had lots of rapid movement

Sora videos ran at 1 beat per second, so everything in the image moved at the same beat and often too slow or too fast to keep the pace.

It is very obvious when you inspect the images and notice that there are keyframes at every whole second mark and everything on the screen suddenly goes in their next animation step.

That really limits the kind of videos you can generate.

lupire · 2024-05-14T19:36:27 1715715387

So it needs to learn how far each object can travel in 1sec at its natural speed?

Jensson · 2024-05-14T19:40:21 1715715621

It also needs to separate animation steps for different objects so that objects can keep different speeds. It isn't trivial at all to go from having a keyframe for the whole picture to having separate keyframes for separate parts; you need to retrain the whole thing from the ground up, and the results will be way worse until you figure out a way to train that.

My point is that it isn't obvious at all that Sora's way actually is closer to the end goal. It might look better today to have those 1 second beats for every video, but where do you go from there?

Aerroon · 2024-05-14T23:19:45 1715728785

The best case scenario would probably be being able to generate "layers" one at a time. That would give more creative control over the outcome, but I have no idea how you would do it.

TIPSIO · 2024-05-14T19:08:52 1715713732

Objectively speaking (if people would be honest with themselves), both are just decent at best.

I think comparing them now is probably not that useful outside of this AI hype train. Like comparing two children. A lot can happen.

The bigger message I am getting from this is it's clear OpenAI won't have a super AI monopoly.

TaylorAlexander · 2024-05-14T19:19:07 1715714347

Comparing two children is a good one. My girlfriend has taken to pointing out when I’m engaging in “punditry”. They're an engineer like I am and we talk about tech all the time, but sometimes I talk about which company is beating which company like it’s a football game, and they call me out for it.

Video models are interesting, and to some extent trying to imagine which company is gonna eat the other’s lunch is kind of interesting, but sometimes that’s all people are interested in and I can see my girlfriend's reasoning for being disinterested in such discussion.

Jonanin · 2024-05-15T03:00:57 1715742057

Except that many of the people involved do think of it like a football game, and thus it actually is like one. Of course the researchers and engineers at both OpenAI and Google DeepMind have a sense of rivalry and strive to one up another. They definitely feel like they are in a competition.

TaylorAlexander · 2024-05-15T06:33:21 1715754801

> They definitely feel like they are in a competition.

Citation needed?

Although I did not work in AI, I did work at Google X robotics on a robot they often use for AI research.

Maybe some people felt like it was a competition, but I don’t have much reason to believe that feeling is common. AI researchers are literally in collaboration with other people in the field, publishing papers and reading the work of others to learn and build upon it.

Jensson · 2024-05-15T08:34:31 1715762071

> AI researchers are literally in collaboration with other people in the field, publishing papers and reading the work of others to learn and build upon it.

When OpenAI suddenly stopped publishing their stuff, I bet many researchers started feeling like it had become a competition.

OpenAI is no longer cooperating, they are just competing. They still haven't said anything about how gpt-4 works.

Aeolun · 2024-05-14T23:47:17 1715730437

I’m fairly certain Google just has a big stack of these in storage but never released, or the moment someone pulls ahead it’s all hands on deck to make the same thing.

motoxpro · 2024-05-14T21:24:50 1715721890

What would make this "Good?"

ein0p · 2024-05-14T18:40:05 1715712005

Also Sora demos had some really impressive generations featuring _people_. Here we hardly see any people which likely means exactly what you’d guess.

data-ottawa · 2024-05-14T19:20:56 1715714456

Has Gemini started generating images of people again? My trial has ended and I haven’t been following the issue.

nuz · 2024-05-14T18:32:04 1715711524

Sora is also movement limited to a certain range if you look at the clips closely. Probably something like filtering by some function of optical flow in both cases.
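
A rough sketch of what such a filter could look like, for illustration only. Nothing here is known about how either model's training clips were actually selected. The sketch swaps in mean absolute frame difference as a crude stand-in for optical flow magnitude, and `mean_motion`, `keep_clip`, and the thresholds are all hypothetical names and values.

```python
def mean_motion(frames):
    """Mean absolute per-pixel change between consecutive frames.

    Each frame is a flat list of grayscale values. This is a crude proxy
    for motion, not real optical flow.
    """
    diffs = [
        sum(abs(b - a) for a, b in zip(prev, cur)) / len(cur)
        for prev, cur in zip(frames, frames[1:])
    ]
    return sum(diffs) / len(diffs) if diffs else 0.0


def keep_clip(frames, low, high):
    """Keep a clip only if its motion score falls inside [low, high]."""
    return low <= mean_motion(frames) <= high


static_clip = [[0, 0], [0, 0]]   # nothing moves
moving_clip = [[0, 0], [4, 2]]   # moderate change between frames
print(keep_clip(static_clip, low=1, high=5))  # rejected: too little motion
print(keep_clip(moving_clip, low=1, high=5))  # kept: inside the range
```

A filter like this would drop both near-static clips and very chaotic ones, which would be consistent with generated motion staying inside a narrow range.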

arcastroe · 2024-05-14T19:44:00 1715715840

> The shots here [..] almost all look like slow motion or slow panning shots.

I think this is arguably better than the alternative. With slow-mo generated videos, you can always speed them up in editing. It's much harder to take a fast-paced video and slow it down without terrible loss in quality.
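
A minimal sketch of why the two directions are asymmetric, assuming frames are just flat lists of pixel values. `speed_up` and `slow_down` are hypothetical helpers, not any editor's API. Speeding up only discards frames that already exist, while slowing down has to invent in-between frames; the naive linear blend used here is where ghosting and quality loss come from.

```python
def speed_up(frames, factor):
    """Speed footage up by an integer factor by keeping every factor-th frame."""
    if factor < 1:
        raise ValueError("factor must be >= 1")
    return frames[::factor]


def slow_down(frames, factor):
    """Slow footage down by inserting linearly blended in-between frames.

    Every inserted frame is synthesized, not captured, so fast motion
    turns into a cross-fade rather than real intermediate positions.
    """
    if len(frames) < 2:
        return [list(f) for f in frames]
    out = []
    for a, b in zip(frames, frames[1:]):
        for step in range(factor):
            t = step / factor
            out.append([(1 - t) * pa + t * pb for pa, pb in zip(a, b)])
    out.append(list(frames[-1]))
    return out


clip = [[0.0], [10.0], [20.0], [30.0]]
print(speed_up(clip, 2))       # only real frames survive
print(slow_down(clip[:2], 2))  # the middle frame is made up by blending
```

Real editors use motion-compensated interpolation rather than a plain blend, but the point stands: the missing frames have to be guessed.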

btown · 2024-05-14T23:54:54 1715730894

A commercially available tool that can turn still images into depth-conscious panning shots is still tremendously impactful across all sorts of industries, especially tourism and hospitality. I’m really excited to see what this can do.

pheatherlite · 2024-05-15T00:27:21 1715732841

Not just that, but anything with a subject in it felt uncanny valleyish... like that cowboy clip: the gait of the horse stood out as odd, and then I gave it some attention. It seems like a camel's gait. And the whole thing seems to be hovering, gliding rather than walking. Sora indeed seems to have an advantage.

__float · 2024-05-15T01:09:08 1715735348

I thought a camel's gait is much closer to two legs moving almost at the same time. Granted, I don't see camels often. Out of curiosity can you explain that more?

spiderfarmer · 2024-05-14T18:41:31 1715712091

Also the horse just looks weird, just like the buildings and peppers.

It's impressive as hell though. Even if it would only be used to extrapolate existing video.

dyauspitr · 2024-05-14T23:12:46 1715728366

They’re not showing people because that can get hairy quickly.

dangoodmanUT · 2024-05-14T19:42:44 1715715764

[flagged]

ipaddr · 2024-05-14T23:34:42 1715729682

I can't wait to see any weights.

LZ_Khan · 2024-05-14T18:46:22 1715712382

I imagine that's just a function of how much training data you throw at it.

totaldude87 · 2024-05-14T19:48:34 1715716114

Could also be Google's doing. If Veo screws up, the weight falls on Alphabet stock, while OpenAI is not public and doesn't have to worry about anything. Even if OpenAI faked some of their AI videos (not saying they did), it wouldn't affect them the way it would affect Veo --> Google --> Alphabet.

being cautious often puts a dent in innovation

soulofmischief · 2024-05-14T19:51:03 1715716263

You mean like how they faked some Gemini stuff?

https://www.bbc.com/news/technology-67650807

inasio · 2024-05-14T18:21:56 1715710916

From a 2014 Wired article [0]: "The average shot length of English language films has declined from about 12 seconds in 1930 to about 2.5 seconds today"

I can see more real-world impact from this (and/or Sora) than most other AI tools

[0] https://www.wired.com/2014/09/cinema-is-evolving/

mattgreenrocks · 2024-05-14T18:36:18 1715711778

This is very noticeable. Watching movies from the 1970s is positively serene for me, whereas the shot time on modern films often leaves me wondering, "wait, what just happened there?"

And I'm someone who is fine playing fast action video games. Can't imagine what it's like if you're older or have sensory processing issues.

psbp · 2024-05-14T20:51:48 1715719908

My brain processes too slow for modern action movies.

I can tell what's going on, but I always end up feeling agitated.

MarcScott · 2024-05-15T06:14:50 1715753690

I'm okay with watching the majority of action movies, but I distinctly remember watching this fight scene in a Bourne movie and not having a clue what was going on. The constant camera changes, short shot length, and shaky cam, just confused the hell out of me.

https://youtu.be/uLt7lXDCHQ0?si=JnVMjmu0WgN5Jr5e&t=70

earthnail · 2024-05-15T08:39:24 1715762364

I thought it was brilliant. Notice there’s no music. It’s one of the most brutal action scenes I know. Brutal in the sense of how honest it felt about direct combat.

JohnMakin · 2024-05-15T18:29:14 1715797754

I'm glad we're finally getting away from the 00's shaky cam era.

ryandrake · 2024-05-14T18:44:55 1715712295

Obligatory: Liam Neeson jumps over a fence in 6 seconds, with 14 cuts[1].

1: https://www.youtube.com/watch?v=gCKhktcbfQM

aidenn0 · 2024-05-14T19:20:54 1715714454

I'd like to fact check this amazing comment on that video, but it would require watching Taken 3:

> Some of y'all may find how awful this editing gets pretty interesting: I did an Average Shot Length (ASL) for many movies for a recent project, and just to illustrate bad overediting in action movies, I looked at Taken 3 (2014) in its extended cut.

> The longest shot in the movie is the last shot, an aerial shot of a pier at sunset ending the movie as the end credits start rolling over them. It clocks in at a runtime of 41 seconds and is, BY FAR, the longest shot in the movie.

> The next longest is a helicopter establishing shot of the daughter's college after the "action scene" there a little over an hour in, at 5 seconds.

> Otherwise, the ASL for Taken 3 (minus the end credits/opening logos), which has a runtime of 1:49:40, 4,561 shots in all (!!!), is 1.38 SECONDS . For comparison, Zack Snyder's Justice League (2021) (minus end credits/opening logos) is 3:50:59, with 3163 shots overall, giving it an ASL of 4.40 seconds, and this movie, at 1 hour 50 minutes, has north of 4,561 for an ASL of 1.38 seconds?!?! Taken 3 has more shots in it than Zack Snyder's Justice League, a movie more than double its length...

> To further illustrate how ridiculous this editing gets, the ASL for Taken 3's non-action scenes is 2.27 seconds. To reiterate, this is the non-action scenes. The "slow scenes." The character stuff. Dialogue scenes. The stuff where any other movie would know to slow down. 2.27 SECONDS For comparison, Mad Max: Fury Road (minus end credits/opening logos) has a runtime of 1:51:58, with 2646 shots overall, for an ASL of 2.54 seconds. TAKEN 3'S "SLOW SCENES" ARE EDITED MORE AGGRESSIVELY THAN MAD MAX: FURY ROAD!

> And Taken 3's action scenes? Their ASL is 0.68 seconds!

> If it weren't for the sound people on the movie, Taken 3 wouldn't be an "action movie". It'd be abstract art.
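
Short of rewatching the film, the arithmetic in the quote can at least be sanity-checked. The sketch below takes the runtimes and shot counts exactly as stated in the quoted comment (they are not independently verified) and computes average shot length as runtime divided by shot count. The helper names are made up for this example.

```python
def to_seconds(runtime: str) -> int:
    """Convert an 'H:MM:SS' runtime string to whole seconds."""
    hours, minutes, seconds = (int(part) for part in runtime.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def average_shot_length(runtime: str, shots: int) -> float:
    """Average shot length in seconds: total runtime divided by shot count."""
    return to_seconds(runtime) / shots


# Runtimes and shot counts as given in the quoted comment.
films = {
    "Taken 3": ("1:49:40", 4561),
    "Zack Snyder's Justice League": ("3:50:59", 3163),
    "Mad Max: Fury Road": ("1:51:58", 2646),
}

for title, (runtime, shots) in films.items():
    asl = average_shot_length(runtime, shots)
    print(f"{title}: {asl:.2f} s per shot")
```

With these inputs, Mad Max: Fury Road comes out at the quoted 2.54 seconds, while Taken 3 and Justice League land at about 1.44 and 4.38 seconds rather than the quoted 1.38 and 4.40. That may just mean the commenter measured a slightly different span than the runtimes listed; the overall ranking is unchanged either way.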

throwup238 · 2024-05-14T19:32:01 1715715121

It's worth noting that Taken 3 has a 13% rating on Rotten Tomatoes, which is well into "it's so bad it's good" territory. I don't think the rapid cuts went unnoticed.

nimithryn · 2024-05-14T20:20:40 1715718040

Yeah, this sequence is a meme commonly cited to show "choppy modern editing"

llmblockchain · 2024-05-14T20:15:50 1715717750

More chops than an MF DOOM track.

kristofferR · 2024-05-14T19:34:31 1715715271

The top comment makes a really good point though:

"He's 68. I'm guessing they stitched it together like this because "geriatric spends 30 seconds scaling chainlink fence then breaks a hip" doesn't exactly make for riveting action flick fare."

Lingering shots are horrible for obscuring things.

troupo · 2024-05-14T21:38:36 1715722716

Keanu Reeves was 57-8 when he shot the last John Wick. IIRC Bob Odenkirk was 58 in Nobody. Neeson was 60 in Taken 3.

There are ways to shoot an action scene with an aging star that don't involve 14 cuts in 4 seconds. You just have to care about your craft.

lupire · 2024-05-14T19:40:15 1715715615

Movies have stunt performers.

And Neeson was only 60 when filming Taken 3.

nineteen999 · 2024-05-14T22:06:14 1715724374

Is it Liam Neeson, or his stunt double?

Shocka1 · 2024-05-16T13:38:36 1715866716

The first time I watched The Rise of Skywalker it was just too much being thrown at my brain. The second and third watches were much easier to process, of course. I'm a big fan of older movies and have noticed the shot length difference anecdotally - Lawrence of Arabia and Ben Hur are two of my favorites. So I suppose it all makes sense to me now that there is actually a comparison measurement that has been completed.

kemitchell · 2024-05-14T22:07:51 1715724471

Enjoy some Tarkovsky.

jsheard · 2024-05-14T18:25:19 1715711119

Even if the shots are very short you still need coherency between shots, and they don't seem to have tackled that problem yet.

lobochrome · 2024-05-14T23:56:09 1715730969

Shot length, yes - but the scene stays the same. Getting continuity with just prompts seems not yet figured out.

Maybe it's easy, and you feed continuity stills into the prompt. Maybe it's not, and this will always remain just a more advanced storyboarding technique.

But then again, storyboards are always less about details and more about mood, dialog, and framing.

chipweinberger · 2024-05-15T00:22:08 1715732528

In 1930 they often literally had a single camera.

Just worth keeping that in mind. You could not just switch between multiple shots like you can today.

joshuahedlund · 2024-05-14T20:51:53 1715719913

How many of those 2.5 second "shots" are back-and-forths between two perspectives (ex. of two characters talking to one another) where each perspective is consistent with itself? This would be extremely relevant for how many seconds of consistent footage are actually needed for an AI-generated "shot" at film-level quality.

indy · 2024-05-14T18:09:07 1715710147

As someone who doesn't live in the US this year's Google IO feels like I'm outside looking in at all the cool kids who get to play with the latest toys.

roynasser · 2024-05-14T18:16:23 1715710583

VPN'd right into that playground, turns out the toys were pretty blah

numbers · 2024-05-14T19:50:01 1715716201

don't feel left out, we're all on the wait lists

mrcwinn · 2024-05-15T03:18:32 1715743112

OpenAI has the model advantage.

Google and Apple have the ecosystem advantage.

Apple in particular has the deeper stack integration advantage.

Both Apple and Google have a somewhat poor software innovation reputation.

How does it all net out? I suspect ecosystem play wins in this case because they can personalize more deeply.

xNeil · 2024-05-15T04:15:15 1715746515

>Google and Apple have a somewhat poor software innovation reputation.

I'm assuming you mean reputation as in general opinion among developers? Because Google's probably been the most innovative company of the 21st century so far.

bugbuddy · 2024-05-15T05:29:29 1715750969

Yes, I miss Stadia so much. It was the most innovative streaming platform I had ever used. I wished I could still use it. Please, Google, bring Stadia back.

teaearlgraycold · 2024-05-15T07:08:26 1715756906

They’re renting out the tech to 3rd parties

miki123211 · 2024-05-15T03:48:10 1715744890

Google and Apple also have an "API access" advantage. It is similar to the ecosystem advantage but goes beyond it; Google and Apple restrict third-party app makers from access to crucial APIs like receiving and reading texts or interacting with onscreen content from other apps. I think that may turn out to be the most important advantage of them all. This should be a far bigger concern for antitrust regulators than petty squabbles over in-app purchases. Spotify and Netflix are possible (if slightly inconvenient) to use on iOS, a fully-featured AI assistant coming from somebody who isn't Apple is not.

Google (and to a lesser extent also Microsoft and Meta) also have a data advantage, they've been building search engines for years, and presumably have a lot more in-house expertise on crawling the web and filtering the scraped content. Google can also require websites which wish to appear in Google search to also consent to appearing in their LLM datasets. That decision would even make sense from a technical perspective, it's easier and cheaper to scrape once and maintain one dataset than to have two separate scrapers for different purposes.

Then there's the bias problem, all of the major AI companies (except for Mistral) are based in California and have mostly left-leaning employees, some of them quite radical and many of them very passionate about identity politics. That worldview is inconsistent with a half of all Americans and the large majority of people in other countries. This particularly applies to the identity politics part, which just isn't a concern outside of the English-speaking world. That might also have some impact on which AI companies people choose, although I suspect far less so than the previous two points.

lowkey · 2024-05-15T03:24:56 1715743496

Google has a deep addiction to AdWords revenue which makes for a significant disadvantage. No matter how good their technology, they will struggle internally with deploying it at scale because that would risk their cash cow. Innovator’s dilemma.

frankacter · 2024-05-15T03:37:15 1715744235

Google Cloud and cloud services generated almost $9.57 billion. That's up 28% from prior:

https://www.crn.com/news/networking/2024/google-cloud-posts-...

They are embedding their models not only widely across their platforms suite of internal products and devices, but also computationally via API for 3rd party development.

Those are all free from any perceived golden handcuffs that AdWords would impose.

damsalor · 2024-05-15T06:32:59 1715754779

Yea, well. I still think there is a conflict of interest if you sell propaganda

lowkey · 2024-05-15T12:41:39 1715776899

As of 2020, AdWords represented over 80% of all Google revenue [1] while in 2021 7% of Google’s revenue came from cloud [2].

[1] https://www.cnbc.com/2021/05/18/how-does-google-make-money-a...?

[2] https://aag-it.com/the-latest-cloud-computing-statistics/?t

mirekrusin · 2024-05-15T03:56:02 1715745362

Not mentioning Meta, the good guy now, is scandalous.

X is not going to sit quietly as well.

There is also the rest of us.

riffraff · 2024-05-15T06:07:38 1715753258

X is tiny compared to Apple/Meta/Google, both in engineering size and in "fingerprint" in people's life.

Also engineering wise, currently every tweet is followed by a reply "my nudes in profile" and X seems unable to detect it as trivial spam, I doubt they have the chops to compete in this arena, especially after the mass layoffs they experienced.

mirekrusin · 2024-05-15T07:52:46 1715759566

By X I mean one guy with big pocket who won't sit quietly - I wouldn't underestimate him.

hwbunny · 2024-05-15T05:55:45 1715752545

ahem...zzzzzzzz

SoftTalker · 2024-05-14T19:36:15 1715715375

Vaguely unsettling that the thumbnail for the first example prompt "A lone cowboy rides his horse across an open plain at beautiful sunset, soft light, warm colors" looks something like the pixelated vision of The Gunslinger android (Yul Brynner's character) from the 1973 version of Westworld.

See 1:11 in this video https://www.youtube.com/watch?v=MAvid5fzWnY

Incidentally that was one of the early uses of computer graphics in a movie, supposedly those short scenes took many hours to render and had to be done three times to achieve a colorized image.

AceJohnny2 · 2024-05-14T19:39:35 1715715575

Can't say I see a visual similarity. In any case, "Cowboy silhouette in the sunset" is a pretty classic American visual.

But the parallel you made between android Brynner's vision and the generated imagery is fun to consider!

solatic · 2024-05-15T16:58:22 1715792302

> It's critical to bring technologies like Veo to the world responsibly. Videos created by Veo are watermarked using SynthID, our cutting-edge tool for watermarking and identifying AI-generated content

And we're supposed to believe that this is resilient against prompt injection?

How do you prevent state actors from creating "proof" that their enemies engaged in acts of war, and they are only engaging in "self-defense"?

dmix · 2024-05-15T17:15:14 1715793314

Nation states can run their own models, if not now, very soon. This isn't something you're going to control via AI-safety woo woo.