Runway Gen-3 is one of the most cinematic AI video models. Good results hinge on three things: precise camera vocabulary, a single concrete action, and an explicit stability cue such as "no shake".
Camera vocabulary that Gen-3 likes
| Motion | Recommended phrasing |
|---|---|
| Static | camera holds static / locked-off shot |
| Push in | slow steady dolly in / push in toward [subject] |
| Pull out | slow dolly out / pull back to reveal |
| Track | camera tracks horizontally to the right |
| Follow | tracks [subject] at the same pace |
| Tilt | slow tilt up from feet to face |
| Orbit | slow orbit around the subject, 90 degrees |
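
If prompts are assembled in a script, the table above can be kept as a small lookup so the same wording is reused every time. The sketch below is only an illustration: the dictionary keys and the `camera_phrase` helper are made-up names, and where the table offers two alternatives one of them has been picked.

```python
# Phrasings taken from the table above; key names and the helper are illustrative.
CAMERA_PHRASES = {
    "static": "camera holds static",
    "push_in": "push in toward {subject}",
    "pull_out": "slow dolly out",
    "track": "camera tracks horizontally to the right",
    "follow": "tracks {subject} at the same pace",
    "tilt": "slow tilt up from feet to face",
    "orbit": "slow orbit around the subject, 90 degrees",
}


def camera_phrase(motion: str, subject: str = "the subject") -> str:
    """Return the phrasing for a motion, filling in the subject where needed."""
    try:
        template = CAMERA_PHRASES[motion]
    except KeyError:
        raise ValueError(f"unknown camera motion: {motion!r}") from None
    # Templates without a {subject} slot are returned unchanged.
    return template.format(subject=subject)


print(camera_phrase("follow", "the chef"))
```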
Single concrete action
{scene}, camera holds static, single action: a chef slowly pours olive oil onto the pan, soft side light, 5-second cinematic clip, no shake
The phrase "single action:" works as a strong isolator on Gen-3: it tells the model to do exactly one thing.
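
The template above can be filled in programmatically. Here is a minimal sketch, assuming the comma-separated order shown in the example prompt; the function name `build_prompt` and its parameters are hypothetical and not part of any Runway tooling.

```python
def build_prompt(
    scene: str,
    action: str,
    camera: str = "camera holds static",
    lighting: str = "soft side light",
    seconds: int = 5,
) -> str:
    """Join the prompt parts in the order used by the example template."""
    parts = [
        scene,
        camera,
        f"single action: {action}",   # the isolator phrase from the text
        lighting,
        f"{seconds}-second cinematic clip",
        "no shake",                   # explicit stability cue
    ]
    # Skip empty parts so an omitted field does not leave a stray comma.
    return ", ".join(part.strip() for part in parts if part.strip())


print(build_prompt("a rustic kitchen", "a chef slowly pours olive oil onto the pan"))
```

Keeping the action as a single required argument makes it harder to slip a second action into the prompt by accident.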
Image-to-video tips
- Provide a high-quality first frame
- The text prompt should only describe motion + camera, not re-describe the image
- Motion Strength 2–4 is the comfortable range
- Stay under 5 verbs total
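
Two of these tips are numeric and can be checked before submitting a job. The following is a rough sketch under loud assumptions: the verb list is a tiny hand-made set for illustration (a real check would need a part-of-speech tagger), camera verbs are assumed to count toward the total, and `check_image_to_video` is a hypothetical helper, not a Runway API.

```python
import re

# Hypothetical verb list for illustration only.
KNOWN_VERBS = frozenset({
    "holds", "tracks", "tilts", "orbits", "pushes", "pulls",
    "pours", "walks", "sits", "picks", "drinks", "turns",
})
MOTION_MIN, MOTION_MAX = 2, 4  # the comfortable Motion Strength range from the tips
MAX_VERBS = 4                  # "under 5 verbs total"


def check_image_to_video(prompt, motion_strength, verbs=KNOWN_VERBS):
    """Return a list of warnings; an empty list means both checks passed."""
    warnings = []
    if not MOTION_MIN <= motion_strength <= MOTION_MAX:
        warnings.append(
            f"motion strength {motion_strength} is outside {MOTION_MIN}-{MOTION_MAX}"
        )
    # Count only words that appear in the supplied verb list.
    words = re.findall(r"[a-z]+", prompt.lower())
    verb_count = sum(1 for word in words if word in verbs)
    if verb_count > MAX_VERBS:
        warnings.append(f"{verb_count} verbs found; keep it under {MAX_VERBS + 1}")
    return warnings


print(check_image_to_video("camera holds static, the chef pours oil", 3))
```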
Pitfalls
Too many verbs
A chain of actions such as walks + sits + picks + drinks always collapses. Keep to one core verb.
No duration
Stating the duration, for example "5-second clip", keeps pacing consistent.