When to use this mode
Text to Video fits when:
- You want short motion clips (6 s or 10 s) rendered from scratch, with no reference image needed.
- You’re prototyping b-roll, hooks, or background loops where one prompt = one clip.
- You’re building a storyboard where each scene is described independently. (For sequences where shot 2 should continue from shot 1, see Chain prompts.)
If you have a still image and want motion added to it, use Image to Video instead.
Set up the run
- Click the Text to Video mode tile.
- Paste your prompt list in the Prompts textarea (blank line between prompts).
- Open the Refine section and configure the three video-specific knobs:
  - Length: 6 s or 10 s. Longer clips use up your quota faster.
  - Quality: 480p or 720p. Pick 720p for finished pieces, 480p for ideation.
  - Aspect: 9:16 for shorts, 16:9 for YouTube/landing pages, 1:1 for feed posts. These are the same options as in Text to Image.
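The "blank line between prompts" rule can be pictured with a small Python sketch. It is not the tool's actual parser: the function name `split_prompts` and the choice to join a multi-line prompt into a single line are assumptions made for illustration.

```python
import re


def split_prompts(textarea: str) -> list[str]:
    """Split pasted text into prompts separated by one or more blank lines."""
    # Normalise Windows line endings so the blank-line pattern matches.
    text = textarea.replace("\r\n", "\n")
    # A "blank line" here is a line that holds only whitespace.
    chunks = re.split(r"\n\s*\n", text)
    # Assumption: line breaks inside one prompt are collapsed to spaces.
    return [" ".join(chunk.split()) for chunk in chunks if chunk.strip()]


pasted = (
    "A drone shot over a foggy forest\n"
    "\n"
    "Neon city street at night,\n"
    "rain on the pavement\n"
)
prompts = split_prompts(pasted)
print(len(prompts), "prompts")
for p in prompts:
    print("-", p)
```

Running a list through a check like this before pasting is a quick way to confirm that the number of prompts matches the number of clips you expect.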
The 480p → 720p upscale trick
If you pick 480p in the Quality dropdown, a fourth control appears below: Upscale after generation.
Toggling it on flips the saved quality to 480p-upscale. Every clip is generated at 480p (fast), then re-submitted to Grok for a 720p upscale pass, then downloaded. The total time is slightly longer than picking 720p outright, but the failure rate is lower: a 480p generation is less likely to be rejected than a direct 720p generation, which has to pass Grok’s stricter 720p quality gate.
For batches over ~20 clips, this trick is often the difference between a 90% success rate and a 70% one.
Heads up: upscale uses one extra Grok generation per clip. If your daily Grok quota is tight, factor that in.
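To budget for the extra pass, a rough count of generations can be worked out as below. This is a minimal sketch that assumes each clip costs one generation, that upscale adds exactly one more, and that nothing is retried. The function name and the quota figure are hypothetical, and clip length may also affect how quota is charged.

```python
def generations_needed(clips: int, upscale: bool) -> int:
    """Count Grok generations for a batch, ignoring retries of failed clips."""
    if clips < 0:
        raise ValueError("clips must be non-negative")
    # One generation per clip, plus one extra per clip for the upscale pass.
    per_clip = 2 if upscale else 1
    return clips * per_clip


daily_quota = 50  # hypothetical figure; check your own Grok plan
batch = 30
needed = generations_needed(batch, upscale=True)
print(f"{batch} clips with upscale need {needed} generations")
print("fits in quota" if needed <= daily_quota else "exceeds quota")
```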
Run and watch progress
Click Run →. The progress UI is the same as Text to Image — per-row status badges (queued → generating → done) and live percent bars. Video runs are slower, so plan for ~1–2 minutes per clip at 720p.
Cancelling mid-batch (the Cancel button on the Current run card) stops new submissions but lets in-flight clips finish; you can later click Continue · N unfinished to pick up the misses without restarting.
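The resume behaviour amounts to re-running only the rows that never reached done. The sketch below illustrates that idea and is not the tool's code: the `unfinished_rows` helper, the row numbering, and the "failed" status label are assumptions (the page only names queued, generating, and done).

```python
def unfinished_rows(statuses: dict[int, str]) -> list[int]:
    """Return the row numbers whose clip did not reach 'done', in order."""
    return sorted(row for row, status in statuses.items() if status != "done")


# Hypothetical snapshot after cancelling: in-flight clips have finished,
# one of them failed, and the rest were never submitted.
snapshot = {0: "done", 1: "done", 2: "failed", 3: "queued", 4: "queued"}
todo = unfinished_rows(snapshot)
print(f"Continue · {len(todo)} unfinished: rows {todo}")
```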
Chain mode for short storyboards
The Chain prompts checkbox sits below the prompt list and is only visible in video and image-to-image modes. When it is on:
#0 → #1 → #2 → #3 → ...
Each prompt’s output becomes the input frame for the next prompt. A 10-clip chain produces a single continuous-feeling sequence instead of 10 disconnected fragments. See Chain prompts for the full walkthrough.
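The chaining idea can be sketched as a loop that feeds each result into the next call. This is an illustration only: `run_chain`, `fake_generate`, and the `(prompt, input_frame)` signature are hypothetical, and how the tool actually derives the input frame from the previous clip is not described here.

```python
from typing import Callable, Optional

# Hypothetical signature: (prompt, input_frame) -> clip identifier.
Generate = Callable[[str, Optional[str]], str]


def run_chain(prompts: list[str], generate: Generate) -> list[str]:
    """Generate clips in order, passing each output on as the next input."""
    outputs: list[str] = []
    previous: Optional[str] = None  # assumed: prompt #0 starts from text alone
    for prompt in prompts:
        clip = generate(prompt, previous)
        outputs.append(clip)
        previous = clip  # this output seeds the next prompt
    return outputs


def fake_generate(prompt: str, input_frame: Optional[str]) -> str:
    # Stand-in for a real generation call; it only records what it was given.
    source = input_frame if input_frame is not None else "text only"
    return f"[{prompt} <- {source}]"


clips = run_chain(["sunrise", "birds take off", "camera pans up"], fake_generate)
for clip in clips:
    print(clip)
```

Because every step depends on the one before it, a chain cannot be parallelised, and a failure at one step blocks everything after it.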
Where files land
~/Downloads/<your folder>/
├── 1_<prompt text or index>.mp4
├── 2_<...>.mp4
└── ...
The naming rules are the same as for images: the Folder, Filename prefix, and Use the prompt text as the filename settings in the Downloads section apply identically.
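As a rough picture of the `<number>_<prompt text or index>.mp4` pattern, here is a hedged Python sketch. The sanitising rule, the length cap, and the `clip_filename` helper are all assumptions, the Filename prefix setting is left out, and the real tool may format names differently.

```python
import re


def clip_filename(index: int, prompt: str, use_prompt_text: bool,
                  max_len: int = 40) -> str:
    """Build a name shaped like '<index>_<label>.mp4'."""
    label = ""
    if use_prompt_text:
        # Assumed rule: keep letters and digits, replace other runs with "-".
        cleaned = re.sub(r"[^A-Za-z0-9]+", "-", prompt).strip("-")
        label = cleaned[:max_len].rstrip("-")
    if not label:
        # Fall back to the index when prompt text is off or nothing usable remains.
        label = str(index)
    return f"{index}_{label}.mp4"


print(clip_filename(1, "Neon city street at night", use_prompt_text=True))
print(clip_filename(2, "Foggy forest", use_prompt_text=False))
```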
Things that will save you a re-run
- Start with 480p + upscale on, not raw 720p. Better total success rate on long queues.
- Keep first-pass batches under 20 clips. Video failures cost more than image failures (in time and quota), so de-risk with a small run before committing to a 100-clip sprint.
- Aspect ratio is sticky between runs. If you ran a 9:16 batch yesterday, the dropdown still says 9:16 today. Double-check before clicking Run.