When I first opened up WAN2.2, my intention was simple:
Test the tool. Generate a few clips. Experiment with cinematic AI visuals.
But as I kept creating, something felt hollow.
The outputs were beautiful, yes, but they didn't add up to anything meaningful. They lacked structure, rhythm, and emotional resonance. So I paused and asked myself:
What if I tried to make something that actually meant something?
Not just random beauty, but a short film that told a message-driven story, even within the constraints of AI.
That's what set me off on the journey of building "Global Warming: Our Planet's Story" with the help of Google Gemini for research and story ideation, and here's the complete process I went through to make it happen.
Download the exact prompts I used in each scene of the video
Want to know more about the WAN2.2 model?
🧪 Phase 1: Testing WAN2.2 and Hitting Creative Walls
I started with themes that excited me visually:
A racing short
A sci-fi odyssey
A symbolic love story
But I hit the same issue over and over:
WAN2.2 can't maintain consistency.
No persistent faces, objects, or locations across prompts.
Every 5-second video was isolated. A person in prompt one would look completely different in prompt two. The same "car" would vary in color, model, even physics.
So I changed course. I stopped trying to connect shots directly and began designing something that didnβt need visual consistency.
🧠 Phase 2: Rethinking Storytelling – Concept as the Main Character
Instead of plot, I leaned into visual progression.
Rather than follow a person or place, what if the viewer followed an idea?
Early drafts explored:
The evolution of energy
Urban decay and regrowth
Light overtaking shadow
But through research and thematic exploration with Google Gemini, one narrative emerged as urgent, powerful, and well-suited to abstract visual transitions:
Climate change.
Told in four emotional chapters:
1. Balance → 2. Expansion → 3. Collapse → 4. Hope
📌 Google Gemini played a key role here. It helped me structure the narrative flow, sharpen the thematic transitions, and find crisp language for scene framing and emotional anchoring. Gemini did a fantastic job distilling complex environmental messaging into a compact, cinematic structure.
🧩 Phase 3: Structuring the Flow – Building Emotional Chapters
We broke the film into four chapters:
Before Humans: pristine beauty, untouched nature
Human Impact: expansion, industrialization, destruction
The Crisis: heat, floods, devastation
A New Chapter: reflection and fragile hope
Each chapter was split into 2.5-second visual prompts, keeping segments snappy and distinct while still allowing us to shape a rising and falling emotional arc.
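If you want to sanity-check the pacing, a plan like this can be sketched as plain data before any prompt is written. This is a minimal illustrative sketch in Python; the shot counts and field names are my own assumptions, not anything WAN2.2 requires:

```python
# Illustrative sketch of the four-chapter plan as data.
# Shot counts are hypothetical; WAN2.2 only ever sees one prompt
# at a time, so this plan lives entirely outside the tool.

CLIP_SECONDS = 2.5  # each visual prompt maps to a 2.5-second segment

chapters = [
    {"title": "Before Humans", "mood": "reverence", "shots": 6},
    {"title": "Human Impact",  "mood": "energy",    "shots": 6},
    {"title": "The Crisis",    "mood": "alarm",     "shots": 6},
    {"title": "A New Chapter", "mood": "hope",      "shots": 6},
]

runtime = sum(c["shots"] for c in chapters) * CLIP_SECONDS
print(f"{len(chapters)} chapters, planned runtime ~{runtime:.0f}s")
```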
✍️ Phase 4: Prompt Engineering – The Real Directing
Prompt writing turned out to be the most important part.
Every single visual was engineered to be:
Highly specific ("Himalayan glacial valley, ethereal mist, ibex on a cliff")
Time-bound and emotionally aligned
Dynamic, with verbs like melts, drifts, chokes, and crashes shaping movement
Layered with sensory details: fog, light, movement, sounds
We paired each prompt with:
🎶 Background music tone (e.g. melancholic cello, urgent percussive)
🌬️ Ambient sound cues (e.g. thunder, wind, water, industrial hum)
All of this helped each isolated clip feel like part of a bigger rhythm.
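To make that concrete, here's one way a single shot's "prompt card" could be written down. This is a hedged sketch of workflow notes, not a WAN2.2 input format; only the prompt string goes to the model, while the music and ambient fields guide the edit later:

```python
# One shot's "card": prompt text for WAN2.2 plus editing cues.
# Field names are illustrative; WAN2.2 consumes only the prompt string.

shot = {
    "chapter": "Before Humans",
    "duration_s": 2.5,
    "prompt": (
        "Himalayan glacial valley, ethereal mist, ibex on a cliff, "
        "dawn light, fog drifts slowly across the ridge"
    ),
    "music_tone": "melancholic cello",
    "ambient": ["wind", "distant water"],
}
```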
🎙️ Phase 5: Voiceover Tools – When AI Let Me Down
Next came narration, and the big disappointment.
Note: WAN2.2 currently has no dialogue or voiceover capability.
I wanted each chapter to sound different.
To move from reverence → energy → alarm → reflection → hope.
But across all tools:
…the delivery was flat.
One voice. One tone. No emotional range. Even after detailed instructions like:
"Grave tone with urgency", "Hopeful upward inflection", or "Melancholic delivery with slow pace"
They'd still read everything like a neutral weather report.
Eventually, I gave up on AI voiceover.
The tone-shifting this story needed wasnβt there.
I kept the Gemini clips for timing, but I plan to overlay human VO later.
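For reference, here's roughly what I was asking those tools to do, written out as data. The tone directions echo the instructions quoted above; the script text is left as a placeholder rather than invented here:

```python
# The per-chapter tone shifts the AI voiceover tools couldn't deliver.
# Directions echo the instructions quoted earlier; "..." marks where
# the actual narration lines would go (omitted here).

narration_plan = [
    {"chapter": "Before Humans", "direction": "melancholic delivery, slow pace", "text": "..."},
    {"chapter": "Human Impact",  "direction": "rising energy",                   "text": "..."},
    {"chapter": "The Crisis",    "direction": "grave tone with urgency",         "text": "..."},
    {"chapter": "A New Chapter", "direction": "hopeful upward inflection",       "text": "..."},
]
```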
🧵 Phase 6: The Stitch – Pulling It All Together in Canva
I used Canva for the final assembly:
Synced visuals and transitions
Chapter titles and text animations
Even without voiceover, the flow felt cinematic and emotionally connected.
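Canva is a GUI workflow, so there's nothing to script there. If you'd rather stitch the clips from the command line, ffmpeg's concat demuxer (a swapped-in alternative, not what I used) does the same job. This sketch assumes ffmpeg is installed, the filenames are hypothetical, and every clip shares the same codec and resolution:

```python
# Alternative to Canva for the stitch: concatenate rendered clips with
# ffmpeg's concat demuxer. Lossless stream copy ("-c copy") requires
# that all clips share codec and resolution.
import subprocess

clips = ["ch1_shot1.mp4", "ch1_shot2.mp4", "ch2_shot1.mp4"]  # hypothetical names, in chapter order

with open("clips.txt", "w") as f:
    f.writelines(f"file '{name}'\n" for name in clips)

subprocess.run(
    ["ffmpeg", "-f", "concat", "-safe", "0", "-i", "clips.txt",
     "-c", "copy", "film_draft.mp4"],
    check=True,
)
```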
💡 Final Reflection: AI Is Powerful, but Direction Is Everything
Tools like WAN2.2 are powerful, but they don't replace storytelling.
You can generate cool visuals, sure.
But to build something cohesive and moving, you have to design it.
That means:
Understanding tool constraints
Thinking like a director and editor
Writing like a screenwriter
Layering visuals, sound, and narration with intention
This wasn't a film created by AI.
It was a film crafted by a human, with AI as the collaborator.
Let me know if you have any questions.
Happy to assist further!