I’ve Been Generating AI Videos Using Google’s Veo 2 — Here Are 5 Of My Favorites

Last week, Google started rolling out its Veo 2 video generation model to Gemini Advanced subscribers. I’ve been playing around with it since, so much so that I’ve apparently already bumped up against Google’s monthly limit on video generations.

In the run-up to Veo 2’s wider availability, Google naturally highlighted clips generated by the model that were hard to distinguish from human-made video, whether it was prompted to imitate lifelike footage or a cutesy animation. The results I’ve seen in my time with Veo 2 haven’t been quite so impressive — but I have to say, they’re closer than I expected, and even the worse ones are still interesting. Here are five of my favorite early results from Veo 2.

5 A shark party in the woods

For this video, I engaged in the time-honored tradition of prompting generative AI to create a nonsensical, stream-of-consciousness-style scene. I asked Veo 2 for a video of human-shark hybrids having a bonfire in the woods, holding red cups. Check and check. I also prompted it to include “vans” in the scene, assuming it would know from context I meant the type of van someone might go camping in. Instead, it gave all the sharks recognizable Vans sneakers. Not exactly what I was after, but definitely funnier.

The eight-second clip of shark-people dancing around a fire is lifelike at a glance, with the fire burning realistically, the background convincingly blurred, and the sharks’ skin showing some realistic texture. Finer details aren’t so clean: the sharks each seem to have one regular flipper and one humanoid hand. The cups of the ones in the background are also kind of floating near their hands rather than held in them. Still, an admirable effort for something so pointless.

4 An etched golden skull

To get a feel for how Veo 2 manages complex textures, I asked it for a gold skull with finely etched details in the style of a calavera, rotating under a bright light. The result feels incomplete, with the skull making a partial turn, pausing, then continuing, but both the anatomy of the skull and the way the light plays on its various textures and details look convincing.

3 Gen-AI newscasters

Thinking of ways video generation could potentially be used to deceive people, I prompted Veo 2 to simulate a cable news broadcast, with anchors sitting at a desk and speaking to camera. For the most part, the results are convincing — one anchor speaks as the other nods along. They even have realistic reflections in the surface of the desk.

Veo 2 fell down on text in this one, though: the chyron at the bottom was meant to read “AI VIDEO GENERATION IS HERE. WHAT IS IT FOR?” Close, but not quite. Small details in the footage are a little off, too, like one anchor’s pen appearing and disappearing, and the other wearing two microphones. The graphic behind the anchors feels appropriately cheesy for a TV news segment about AI video, though, featuring a film reel overlaid on a bunch of ones and zeroes.

2 The Legend of Zelda — kind of?

I was curious whether Veo 2 was trained on footage of video games, so to find out, I prompted it to create scenes from a few specific titles. In this one, I described the opening moments of The Legend of Zelda: Breath of the Wild, in which Link runs out of a cave to look out over the landscape from a cliff.

Veo 2 couldn’t quite make that specifically, but it’s absolutely been trained on footage of the game. A Link-like character runs out of a cave and to a cliff, with a blob of items on his back that looks vaguely like a sword and a shield if you squint. Interestingly, the game’s UI is almost intact — elements are all in the right places, and the map in the corner rotates realistically as the camera pans.

1 Unambiguous Cyberpunk 2077 gameplay

I prompted Veo 2 for footage from a few more games after Breath of the Wild, but the simple prompt “Cyberpunk 2077 gameplay” yielded what seemed like the most accurate result. The rainy city street, the UI, the small aircraft — it all looks very much like Cyberpunk. There’s even a billboard advertising what look like cybernetic implants.

Finer details are a mess; text and iconography are squiggly and vague, and Veo 2 seemed to throw in a gun-swaying running animation despite the player character not actually moving through the scene. Still, Veo 2 knows what Cyberpunk 2077 looks like, and it’s not afraid to recreate it.

Affordable AI video generation is here

Image of the Google Gemini assistant overlay on an Android phone.

My first week with Veo 2 in Gemini felt a lot like my early experience with AI image generation apps. The novelty of plugging in a short prompt to get a short video in a minute or two means that even when results are less than stellar, they’re still interesting to see. It’s new, and it’s weird, and it’s fun.

But I’m not sure what exactly regular users are supposed to do with Veo 2 other than goof around. Considering how resource-intensive AI video generation is, offering Veo 2 as part of a $20-per-month subscription doesn’t seem exactly sustainable for Google. What’s more, it looks like Gemini may eventually offer a “freemium” video generation feature that doesn’t cost money at all. Gemini Advanced cut me off for the month after I’d generated about 50 clips, and a potential free version of the feature will be more limited still.

Whatever Google’s long-term video generation ambitions may be, Veo 2 is rolling out broadly for Gemini Advanced subscribers right now in both the mobile app and the Gemini web interface.