Grok Video

Grok Video in Synclip — Cinematic AI Video in 6 to 15 Seconds
Three aspect ratios. One reference image. Linear, transparent pricing.

xAI's Grok Video model lands in Synclip. Write a prompt, pick your aspect ratio (3:2 cinematic / 2:3 portrait / 1:1 square), add an optional reference image for character or scene continuity, choose 6, 10, or 15 seconds — and get 720p output with a thumbnail baked in. Pricing is simple: 3 coins per second, always.

What is Grok Video?

Grok Video is xAI's text-to-video model, now available inside Synclip. It generates 720p video at three aspect ratios, produces a thumbnail automatically alongside the clip, and supports a single reference image to anchor identity or scene style.

The model is designed for short-form cinematic output in three fixed lengths: 6 seconds (a punchy loop or teaser), 10 seconds, or 15 seconds (a full narrative beat). Unlike flat-rate models, Grok Video bills linearly per second, so a shorter clip always costs fewer coins.

  • 720p output with auto-generated thumbnail
  • 3:2 landscape, 2:3 portrait, 1:1 square aspect ratios
  • Optional single reference image for character or scene consistency
  • 6 s / 10 s / 15 s duration options
  • Linear pricing: 3 coins / second (18 → 30 → 45 coins)

Aspect Ratios — Pick the Right Frame for Your Platform

Grok Video gives you three ratios, each optimised for a distinct distribution channel:

| Ratio | Format | Best for |
| --- | --- | --- |
| 3:2 | Landscape / Cinematic | YouTube, film reels, desktop viewers |
| 2:3 | Portrait / Short-form | Reels, TikTok, Shorts, mobile-first feeds |
| 1:1 | Square / Social | Instagram posts, product ads, cross-platform reposts |

Pick your ratio before writing your prompt — the composition language of the prompt changes depending on orientation. For portrait, describe vertical movement; for landscape, use horizontal staging.
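If you plan clips for several channels, the ratio table can be kept as a small lookup. The sketch below is only an illustration: the platform keys and the `pick_ratio` helper are hypothetical names, not Synclip identifiers, and the staging hints are the two given on this page (none is given for 1:1).

```python
# Ratios and platforms as listed on this page; the keys are illustrative labels.
RATIO_FOR_PLATFORM = {
    "youtube": "3:2",
    "reels": "2:3",
    "tiktok": "2:3",
    "shorts": "2:3",
    "instagram_post": "1:1",
    "product_ad": "1:1",
}

# Composition advice from this page; no hint is given for square output.
STAGING_HINT = {
    "3:2": "use horizontal staging",
    "2:3": "describe vertical movement",
}


def pick_ratio(platform: str) -> tuple:
    """Return (ratio, staging hint or None) for a platform label."""
    key = platform.strip().lower().replace(" ", "_")
    if key not in RATIO_FOR_PLATFORM:
        raise ValueError(f"no ratio listed for platform {platform!r}")
    ratio = RATIO_FOR_PLATFORM[key]
    return ratio, STAGING_HINT.get(ratio)


ratio, hint = pick_ratio("TikTok")
print(ratio, "-", hint)
```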

Duration & Pricing — Transparent, Linear Billing

Grok Video costs exactly 3 coins per second. No hidden tiers, no capacity surcharges:

| Duration | Grok Video | Veo 3.1 Fast | Sora 2 |
| --- | --- | --- | --- |
| 6 s | 18 coins | 18 coins (any length) | 8 coins (10 s) |
| 10 s | 30 coins | 18 coins (any length) | 8 coins (10 s) |
| 15 s | 45 coins | 18 coins (any length) | 12 coins (15 s) |

Veo 3.1 Fast is a flat-rate model: the coin cost is the same regardless of duration. At 6 s the two models cost the same 18 coins; at 10 s and 15 s Grok Video costs 12 and 27 coins more. If you need the longest clip at the lowest coin spend, Veo 3.1 Fast beats Grok Video on raw economics. Grok Video's advantage is cinematic quality at shorter durations and the reference-image workflow.
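The arithmetic behind the Grok Video and Veo 3.1 Fast columns is simple enough to check in a few lines. This is a minimal sketch using only the figures quoted on this page; the constant and function names are made up for illustration and are not part of any Synclip API.

```python
GROK_COINS_PER_SECOND = 3      # linear rate quoted on this page
GROK_DURATIONS = (6, 10, 15)   # the three selectable lengths, in seconds
VEO_FAST_FLAT_COINS = 18       # flat rate per generation quoted on this page


def grok_cost(seconds: int) -> int:
    """Coins for one Grok Video clip of the given length."""
    if seconds not in GROK_DURATIONS:
        raise ValueError(f"duration must be one of {GROK_DURATIONS}, got {seconds}")
    return GROK_COINS_PER_SECOND * seconds


for s in GROK_DURATIONS:
    extra = grok_cost(s) - VEO_FAST_FLAT_COINS
    print(f"{s:>2} s: Grok Video {grok_cost(s)} coins, "
          f"Veo 3.1 Fast {VEO_FAST_FLAT_COINS} coins, difference {extra:+d}")
```

The loop reproduces the 18, 30, and 45 coin figures and shows the gap to the flat rate growing from 0 to 12 to 27 coins.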

Reference Image — One Image, Consistent Results

Upload a single image alongside your prompt and Grok Video will use it to anchor the visual identity of the clip. This is the primary consistency tool for the model: a character's face or outfit, a scene location, a product look, or even a colour palette can be locked with one reference.

Best for:
  • Character consistency across multiple generated clips
  • Continuing a scene with the same background or location
  • Product shots that must match an existing brand visual
  • Locking a colour-grade or lighting style

Tip: Keep the reference image clean and representative. A single well-lit face or product on a neutral background gives the model the clearest signal. Avoid busy compositions with multiple focal points.

Four-Step Workflow

A repeatable sequence that works whether you're generating a one-off clip or building a short-form series.

Step 1 · Select Grok Video in the model picker

Open the Video Creator workspace in Synclip and choose Grok Video from the model dropdown. The interface will show the three aspect ratio options and the duration selector.

Step 2 · Write your prompt

Structure the prompt with five elements: subject, scene, camera move, motion beat, and style constraints. Keep it under 120 words. Avoid asking for readable text in-frame.

  • Subject: who or what is in the shot
  • Scene: environment and background
  • Camera: shot type (close-up / medium / wide) and move (dolly / pan / orbit)
  • Motion beat: what changes during the clip
  • Style: realistic / cinematic / commercial / etc.
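One way to keep prompts consistent across a series is to assemble them from the five elements above and check the length before pasting the result into Synclip. The sketch below is an assumption about how you might organise this locally: `ShotPrompt` and its fields are hypothetical names, and the trailing "No text" follows the convention used in the templates on this page.

```python
from dataclasses import dataclass

WORD_LIMIT = 120  # "keep it under 120 words"


@dataclass
class ShotPrompt:
    subject: str      # who or what is in the shot
    scene: str        # environment and background
    camera: str       # shot type and move
    motion_beat: str  # what changes during the clip
    style: str        # realistic / cinematic / commercial / ...

    def render(self) -> str:
        """Join the five elements into one prompt and enforce the word limit."""
        parts = [self.subject, self.scene, self.camera, self.motion_beat, self.style]
        if any(not p.strip() for p in parts):
            raise ValueError("all five prompt elements must be filled in")
        # End with "No text" to avoid asking for readable text in-frame.
        parts.append("No text")
        text = ". ".join(p.strip().rstrip(".") for p in parts) + "."
        if len(text.split()) >= WORD_LIMIT:
            raise ValueError(f"prompt must stay under {WORD_LIMIT} words")
        return text


prompt = ShotPrompt(
    subject="A red fox",
    scene="snowy forest at dawn",
    camera="wide shot, slow dolly-in",
    motion_beat="the fox lifts its head",
    style="cinematic, realistic",
)
print(prompt.render())
```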

Step 3 · Set ratio, duration, and optional reference image

Pick the aspect ratio for your target platform. Choose 6 s for a loop or teaser, 10 s for a product beat, or 15 s for a full narrative moment. If you need visual consistency, upload one reference image before generating.
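If you prepare batches of clips in a script or spreadsheet before opening Synclip, it can help to check each row against the options this page lists. The following is a sketch under that assumption; `GenerationSettings` is a hypothetical local structure, not a Synclip or xAI API object.

```python
from dataclasses import dataclass, field

ASPECT_RATIOS = {"3:2", "2:3", "1:1"}   # ratios listed on this page
DURATIONS = {6, 10, 15}                 # seconds
MAX_REFERENCE_IMAGES = 1                # single reference image per generation


@dataclass
class GenerationSettings:
    aspect_ratio: str
    duration_s: int
    reference_images: list = field(default_factory=list)

    def problems(self) -> list:
        """Return human-readable issues; an empty list means the settings fit."""
        found = []
        if self.aspect_ratio not in ASPECT_RATIOS:
            found.append(f"aspect ratio {self.aspect_ratio!r} is not offered")
        if self.duration_s not in DURATIONS:
            found.append(f"duration {self.duration_s} s is not offered")
        if len(self.reference_images) > MAX_REFERENCE_IMAGES:
            found.append("only one reference image is supported")
        return found


# "hero.png" is a placeholder file name.
settings = GenerationSettings("2:3", 10, ["hero.png"])
print(settings.problems() or "settings look valid")
```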

Step 4 · Generate and iterate

Run the generation. The model returns the video and an auto-generated thumbnail. If the shot direction is right but details need adjustment, tweak the motion beat or camera language and re-run — the reference image stays locked between iterations.

Prompt Templates — Copy, Replace, Generate

Replace the bracketed fields with your project specifics.

A) Landscape cinematic (3:2) — establishing shot

Prompt
"A [SUBJECT] in [LOCATION]. Wide establishing shot, slow dolly-in from left. Natural light, slight lens flare, cinematic colour grade. Realistic motion, subtle camera shake. No text."
When to use:
  • YouTube intros
  • Film-style b-roll
  • Travel and destination content

B) Portrait short-form (2:3) — vertical character story

Prompt
"Close-up portrait of [CHARACTER] in [SCENE]. Camera holds still, subject looks directly into lens, then glances away. Soft bokeh background, warm skin tones. Authentic, handheld documentary feel. No text."

Tip: Pair with a reference image of the character for face consistency across clips.

C) Square social (1:1) — product reveal

Prompt
"Commercial product shot of [PRODUCT] on [BACKGROUND]. Camera starts on a detail macro, then pulls back to a full product hero reveal. Clean studio lighting, crisp reflections, premium feel. No text."
When to use:
  • Instagram ads
  • E-commerce product videos
  • Brand content
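Filling the bracketed fields by hand works for a single clip; for a series, a small substitution helper avoids leaving a stray `[PRODUCT]` in the prompt. This is a minimal sketch: `fill_template` is a hypothetical helper, and the sample values are invented for illustration.

```python
import re

# Matches bracketed fields such as [SUBJECT] or [BACKGROUND].
PLACEHOLDER = re.compile(r"\[([A-Z_]+)\]")


def fill_template(template: str, values: dict) -> str:
    """Replace every [FIELD] in the template; fail loudly if one has no value."""
    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for [{name}]")
        return values[name]

    return PLACEHOLDER.sub(substitute, template)


TEMPLATE_C = (
    "Commercial product shot of [PRODUCT] on [BACKGROUND]. "
    "Camera starts on a detail macro, then pulls back to a full product hero reveal. "
    "Clean studio lighting, crisp reflections, premium feel. No text."
)

print(fill_template(TEMPLATE_C, {
    "PRODUCT": "a matte black espresso machine",
    "BACKGROUND": "a white marble counter",
}))
```

Raising on a missing field is a deliberate choice here: a prompt that still contains a literal placeholder would otherwise be sent to the model unnoticed.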

Model Comparison — Grok Video vs Veo 3.1 Fast vs Sora 2

A quick reference for choosing the right model per use case:

| Feature | Grok Video | Veo 3.1 Fast | Sora 2 |
| --- | --- | --- | --- |
| Output resolution | 720p | 720p | 720p |
| Aspect ratios | 3:2 / 2:3 / 1:1 | 16:9 / 9:16 | 16:9 / 9:16 / 1:1 |
| Max duration | 15 s | 25 s | 15 s |
| Reference image | 1 image | Multiple (ingredients) | No |
| First / last frame | No | Yes (Veo 3.1) | No |
| Auto thumbnail | Yes | No | No |
| Pricing (15 s) | 45 coins | 18 coins (flat) | 12 coins |
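The comparison can also be read as a filter: given a ratio, a clip length, and whether you need a reference image, which models remain? The sketch below encodes three rows of the table. The `MODELS` dictionary and `candidates` function are hypothetical, and the check uses only the maximum duration, so it does not know which exact lengths each model offers.

```python
# Ratios, maximum duration, and reference-image support as stated in the comparison.
MODELS = {
    "Grok Video":   {"ratios": {"3:2", "2:3", "1:1"},   "max_seconds": 15, "reference_images": True},
    "Veo 3.1 Fast": {"ratios": {"16:9", "9:16"},        "max_seconds": 25, "reference_images": True},
    "Sora 2":       {"ratios": {"16:9", "9:16", "1:1"}, "max_seconds": 15, "reference_images": False},
}


def candidates(ratio: str, seconds: int, needs_reference: bool = False) -> list:
    """Names of models whose listed ratio, max duration, and reference support fit."""
    return [
        name
        for name, spec in MODELS.items()
        if ratio in spec["ratios"]
        and seconds <= spec["max_seconds"]
        and (spec["reference_images"] or not needs_reference)
    ]


print(candidates("1:1", 15, needs_reference=True))
print(candidates("9:16", 10))
```

With the table as given, a square 15 s clip that needs a reference image leaves only Grok Video, while a 9:16 clip leaves Veo 3.1 Fast and Sora 2.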

FAQ

What resolution does Grok Video output?

720p. The model also generates a thumbnail image automatically alongside the video clip.

Can I use more than one reference image?

Grok Video currently supports a single reference image per generation. For multi-image reference workflows (ingredients-style), use Veo 3.1 in Synclip.

Why does 15 seconds cost more than Veo 3.1 Fast?

Grok Video uses linear per-second billing (3 coins/s), so longer clips cost proportionally more. Veo 3.1 Fast is flat-rate per generation regardless of duration. If coin efficiency at max duration is your priority, Veo 3.1 Fast is the better pick.

Can I use Grok Video for portrait (vertical) content?

Yes — the 2:3 aspect ratio is designed for vertical short-form platforms like Reels, TikTok, and Shorts.