When I first opened up WAN2.2, my intention was simple:
Test the tool. Generate a few clips. Experiment with cinematic AI visuals.
But as I kept creating, something felt hollow.
The outputs were beautiful, yes, but they didn't add up to anything meaningful. They lacked structure, rhythm, and emotional resonance. So I paused and asked myself:
What if I tried to make something that actually meant something?
Not just random beauty, but a short film that told a message-driven story, even within the constraints of AI.
That's what set me off on the journey of building "Global Warming: Our Planet's Story" with the help of Google Gemini for research and story ideation, and here's the complete process I went through to make it happen.
Download the exact prompts I used in each scene of the video
Want to know more about the WAN2.2 model?
🧪 Phase 1: Testing WAN2.2 and Hitting Creative Walls
I started with themes that excited me visually:
A racing short
A sci-fi odyssey
A symbolic love story
But I hit the same issue over and over:
WAN2.2 can't maintain consistency.
No persistent faces, objects, or locations across prompts.
Every 5-second video was isolated. A person in prompt one would look completely different in prompt two. The same "car" would vary in color, model, even physics.
So I changed course. I stopped trying to connect shots directly and began designing something that didnβt need visual consistency.
🧠 Phase 2: Rethinking Storytelling – Concept as the Main Character
Instead of plot, I leaned into visual progression.
Rather than follow a person or place, what if the viewer followed an idea?
Early drafts explored:
The evolution of energy
Urban decay and regrowth
Light overtaking shadow
But through research and thematic exploration with Google Gemini, one narrative emerged as urgent, powerful, and well-suited to abstract visual transitions:
Climate change.
Told in four emotional chapters:
1. Balance → 2. Expansion → 3. Collapse → 4. Hope
📌 Google Gemini played a key role here. It helped me structure the narrative flow, sharpen the thematic transitions, and find crisp language for scene framing and emotional anchoring. Gemini did a fantastic job distilling complex environmental messaging into a compact, cinematic structure.
🧩 Phase 3: Structuring the Flow – Building Emotional Chapters
We broke the film into four chapters:
Before Humans: pristine beauty, untouched nature
Human Impact: expansion, industrialization, destruction
The Crisis: heat, floods, devastation
A New Chapter: reflection and fragile hope
Each chapter was split into 2.5-second visual prompts, keeping segments snappy and distinct while still allowing us to shape a rising and falling emotional arc.
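If you want to sanity-check the pacing, a plan like this can be sketched as plain data before any prompt is written. This is a minimal illustrative sketch in Python; the shot counts and field names are my own assumptions, not anything WAN2.2 requires:

```python
# Illustrative sketch of the four-chapter plan as data.
# Shot counts are hypothetical; WAN2.2 only ever sees one prompt
# at a time, so this plan lives entirely outside the tool.

CLIP_SECONDS = 2.5  # each visual prompt maps to a 2.5-second segment

chapters = [
    {"title": "Before Humans", "mood": "reverence", "shots": 6},
    {"title": "Human Impact",  "mood": "energy",    "shots": 6},
    {"title": "The Crisis",    "mood": "alarm",     "shots": 6},
    {"title": "A New Chapter", "mood": "hope",      "shots": 6},
]

runtime = sum(c["shots"] for c in chapters) * CLIP_SECONDS
print(f"{len(chapters)} chapters, planned runtime ~{runtime:.0f}s")
```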
✍️ Phase 4: Prompt Engineering – The Real Directing
Prompt writing turned out to be the most important part.
Every single visual was engineered to be:
Highly specific ("Himalayan glacial valley, ethereal mist, ibex on a cliff")
Time-bound and emotionally aligned
Dynamic, with verbs like melts, drifts, chokes, and crashes shaping movement
Layered with sensory details: fog, light, movement, sounds
We paired each prompt with:
🎶 Background music tone (e.g. melancholic cello, urgent percussive)
🌬️ Ambient sound cues (e.g. thunder, wind, water, industrial hum)
All of this helped each isolated clip feel like part of a bigger rhythm.
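To make that concrete, here's one way a single shot's "prompt card" could be written down. This is a hedged sketch of workflow notes, not a WAN2.2 input format; only the prompt string goes to the model, while the music and ambient fields guide the edit later:

```python
# One shot's "card": prompt text for WAN2.2 plus editing cues.
# Field names are illustrative; WAN2.2 consumes only the prompt string.

shot = {
    "chapter": "Before Humans",
    "duration_s": 2.5,
    "prompt": (
        "Himalayan glacial valley, ethereal mist, ibex on a cliff, "
        "dawn light, fog drifts slowly across the ridge"
    ),
    "music_tone": "melancholic cello",
    "ambient": ["wind", "distant water"],
}
```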
🎙️ Phase 5: Voiceover Tools – When AI Let Me Down
Next came narration, and the big disappointment.
Note: WAN2.2 currently has no dialogue or voiceover capability.
I wanted each chapter to sound different.
To move from reverence → energy → alarm → reflection → hope.
But across all tools:
…the delivery was flat.
One voice. One tone. No emotional range. Even after detailed instructions like:
"Grave tone with urgency", "Hopeful upward inflection", or "Melancholic delivery with slow pace"
They'd still read everything like a neutral weather report.
Eventually, I gave up on AI voiceover.
The tone-shifting this story needed wasnβt there.
I kept the Gemini clips for timing, but I plan to overlay human VO later.
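For reference, here's roughly what I was asking those tools to do, written out as data. The tone directions echo the instructions quoted above; the script text is left as a placeholder rather than invented here:

```python
# The per-chapter tone shifts the AI voiceover tools couldn't deliver.
# Directions echo the instructions quoted earlier; "..." marks where
# the actual narration lines would go (omitted here).

narration_plan = [
    {"chapter": "Before Humans", "direction": "melancholic delivery, slow pace", "text": "..."},
    {"chapter": "Human Impact",  "direction": "rising energy",                   "text": "..."},
    {"chapter": "The Crisis",    "direction": "grave tone with urgency",         "text": "..."},
    {"chapter": "A New Chapter", "direction": "hopeful upward inflection",       "text": "..."},
]
```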
🧵 Phase 6: The Stitch – Pulling It All Together in Canva
I used Canva for the final assembly:
Synced visuals and transitions
Chapter titles and text animations
Even without voiceover, the flow felt cinematic and emotionally connected.
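Canva is a GUI workflow, so there's nothing to script there. If you'd rather stitch the clips from the command line, ffmpeg's concat demuxer (a swapped-in alternative, not what I used) does the same job. This sketch assumes ffmpeg is installed, the filenames are hypothetical, and every clip shares the same codec and resolution:

```python
# Alternative to Canva for the stitch: concatenate rendered clips with
# ffmpeg's concat demuxer. Lossless stream copy ("-c copy") requires
# that all clips share codec and resolution.
import subprocess

clips = ["ch1_shot1.mp4", "ch1_shot2.mp4", "ch2_shot1.mp4"]  # hypothetical names, in chapter order

with open("clips.txt", "w") as f:
    f.writelines(f"file '{name}'\n" for name in clips)

subprocess.run(
    ["ffmpeg", "-f", "concat", "-safe", "0", "-i", "clips.txt",
     "-c", "copy", "film_draft.mp4"],
    check=True,
)
```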
💡 Final Reflection: AI Is Powerful, but Direction Is Everything
Tools like WAN2.2 are powerful, but they don't replace storytelling.
You can generate cool visuals, sure.
But to build something cohesive and moving, you have to design it.
That means:
Understanding tool constraints
Thinking like a director and editor
Writing like a screenwriter
Layering visuals, sound, and narration with intention
This wasn't a film created by AI.
It was a film crafted by a human, with AI as the collaborator.
Let me know if you have any questions.
Happy to assist further!