AI Models on Upsampler

Every model we support, grouped by tool. Click any model for examples, capabilities, and pricing details.

Nano Banana

Google's Gemini 2.5 Flash-powered text-to-image generator. Supports reference images for subject consistency and produces clean, well-composed shots with the prompt-following accuracy you'd expect from a Gemini-grounded model. A solid pick for fast iteration when you don't need 4K output or the multi-reference workflow that Nano Banana 2 introduces.

4 Credits

Nano Banana 2

Google DeepMind's Gemini 3.1 Flash-powered text-to-image model. Renders up to 4K with industry-leading prompt adherence, native multi-image references, and web search grounding for factual scenes. Ideal for product photography with brand consistency, character series across multi-shot campaigns, and complex compositional work that benefits from compositing several references at once.

11 Credits

Nano Banana Pro

The premium tier of Google's Nano Banana lineup. Tuned for magazine-cover fidelity at 4K, with the strongest identity preservation in the family across faces, products, and brand elements. Best when the cost per image is justified by the output going into print, paid ads, or hero placements where every detail matters.

15 Credits

Imagen 3

Google's previous-generation Imagen text-to-image model. Strong on natural lighting and skin tones with a documentary photographic feel, but supplanted by Imagen 4 and Imagen 4 Ultra on prompt adherence and text rendering. Reasonable choice when you want the older Imagen aesthetic specifically — otherwise step up to Imagen 4.

5 Credits

Imagen 4

Google's mid-tier model in the Imagen series. Delivers stronger prompt adherence and improved text rendering over Imagen 3, with the photorealistic skin tones and natural lighting Imagen is known for. Sits below Imagen 4 Ultra on overall quality and above Imagen 3 on instruction-following, which makes it a pragmatic default for product, ecommerce, and editorial photography.

4 Credits

Imagen 4 Ultra

Google's premium Imagen tier, built for magazine-cover-quality photorealism. Hand-tuned for fine-grained detail in faces, fabrics, and natural materials with faithful prompt adherence and improved typography. Sits at the top of the Google image lineup alongside Nano Banana Pro — pick this when you want documentary-style realism over Gemini's broader compositional flexibility.

6 Credits

Seedream 3

ByteDance's earlier Seedream image model with native 2K output and a recognizable cinematic look. Useful when you specifically want the early-Seedream aesthetic — fashion editorial, Asian-cinema mood, vibrant stylization. For most production use cases Seedream 4, 4.5, or 5 Lite outperform it on detail and prompt adherence.

3 Credits

Seedream 4

ByteDance's evolution of Seedream — refined character work, fashion-grade outputs, and the cinematic stylization the line is known for, now with reference-image support. Strong choice for editorial campaigns, music-video stills, and stylized portraiture where mood and color palette matter more than literal photographic realism.

3 Credits

Seedream 4.5

ByteDance's flagship cinematic image model. Top-tier in the Seedream lineup for color, mood, and editorial stylization, with stronger spatial understanding than Seedream 4: convincing depth, perspective, and prop placement. Pair with Seedream 5 Lite for high-volume iteration once you've nailed the look.

4 Credits

Seedream 5 Lite

A faster, lower-cost variant of the Seedream 4.5 aesthetic — same cinematic palette and mood handling at reduced fidelity. Built-in reasoning and example-based editing differentiate it from other budget tiers: pass an example image of the desired result and the model interprets the intent. Good for high-volume social content, moodboards, and exploration before committing to a hero shot.

4 Credits

GPT Image

OpenAI's first standalone text-to-image model. Best-in-class typography rendering at the time of release — making it well-suited to posters, packaging mockups, and text-heavy designs — though GPT Image 1.5 and 2 supersede it on photorealism and instruction-following. Still worth picking when you want the original GPT Image aesthetic for typographic work specifically.

6 Credits

GPT Image 1.5 Medium

The mid-quality tier of OpenAI's GPT Image 1.5. Improved photorealism and instruction-following over GPT Image 1 at a more reasonable cost than GPT Image 1.5 High. Solid pick for ads, packaging, and marketing creative when GPT Image 2 pricing is excessive.

5 Credits

GPT Image 1.5 High

The high-quality tier of OpenAI's GPT Image 1.5 mid-generation release. Improved photorealism over GPT Image 1 with the same industry-leading text rendering, ideal for ads, packaging, and editorial layouts where typography and image quality both matter. GPT Image 2 High surpasses it for premium work, but this tier costs slightly less per image.

14 Credits

GPT Image 2 Low

OpenAI's GPT Image 2 budget tier. Same model architecture as GPT Image 2 Medium and High at a fraction of the cost — useful for high-volume iteration, draft generation, and exploration before committing to higher tiers for finals.

2 Credits

GPT Image 2 Medium

OpenAI's GPT Image 2 at the medium quality tier. Combines top-of-the-leaderboard prompt adherence on Artificial Analysis with the strongest typography rendering of any image model — exact text in quotes, complex layouts, and packaging mockups all just work. The default choice for production marketing creative when GPT Image 2 High's premium pricing isn't justified.

6 Credits

GPT Image 2 High

OpenAI's GPT Image 2 at maximum quality, currently ranked #1 on the Artificial Analysis text-to-image arena. Delivers magazine-cover fidelity with industry-best text rendering and complex prompt adherence — pick it for hero shots, premium ad creative, and editorial covers where the cost per image is worth the output.

15 Credits

Flux Schnell

Black Forest Labs' fastest, lowest-cost Flux variant. Built for rapid prototyping, moodboards, and high-volume batch generation where speed and credit cost matter more than raw fidelity. Open-weight and well-supported in the community, with extensive prompt-pattern documentation accumulated since the original Flux release.

1 Credit

Flux 1.1 Pro

Black Forest Labs' workhorse flagship before the Flux 2 release. Strong stylization range across photorealistic and illustrated work with broad community knowledge of prompt patterns — a known quantity for production workflows. Edged out by Flux 2 Pro on detail and prompt adherence but still cost-competitive for everyday creative.

4 Credits

Flux 2 Klein

The smaller, faster open-weight variant of Black Forest Labs' Flux 2 family. Self-hostable for teams that want the Flux 2 aesthetic on their own infrastructure, with strong quality-per-credit at a lower cost than Flux 2 Pro. Good for iteration and exploration before stepping up to Pro or Max for finals.

2 Credits

Flux 2 Pro

Black Forest Labs' Flux 2 production tier — the current Flux flagship for most use cases. Stylized photorealism with stronger prompt adherence than Flux 1.1 Pro, broad aesthetic range, and reference-image support. Sits below Flux 2 Max on top-end fidelity but offers a better cost-quality ratio for production work.

5 Credits

Flux 2 Max

The highest-fidelity tier in Black Forest Labs' Flux 2 family. Top-5 on the Artificial Analysis text-to-image arena with magazine-grade detail in textures, fabrics, and natural materials. Reach for it on hero shots, magazine layouts, and premium ad creative where the cost per image is justified.

10 Credits

Flux Kontext Pro

Black Forest Labs' instruction-based image editor — describe an edit in plain English ('replace the background with a sunset beach', 'change the shirt to red') and get a clean result that preserves the rest of the scene. Strong identity and lighting consistency makes it production-ready for outfit swaps, background replacement, and prop changes.

4 Credits

Flux Kontext Max

The premium Flux Kontext tier from Black Forest Labs. Improved typography handling and stronger scene understanding than Flux Kontext Pro, suited to complex multi-step edits, art-directed photo manipulations, and edits involving on-image text. Pick this when Pro's results aren't sticking the landing on harder transformations.

8 Credits

Wan 2.2

Alibaba's Wan 2.2 image-generation model with cinematic 2MP output in seconds. Open-weight, with the same Wan-line stylization that the video model is known for. Useful for rapid concept work and budget-tier generation when speed and cost matter more than top-tier fidelity.

2 Credits

Z Image Turbo

A fast 6B-parameter text-to-image model from the Z-Image lineup with surprisingly strong photorealism for its size. Built for high-volume iteration where credit cost and speed matter — moodboards, social variants, exploration before stepping up to a flagship. Open-weight and competitive at this budget tier.

1 Credit

Qwen Image

Alibaba's Asian-trained text-to-image model with notable strength in complex multilingual text rendering — Chinese, Japanese, and Korean characters render reliably where most Western models struggle. Open-weight, with broad stylization range. Useful for content targeting Asian markets or any project requiring native-script typography in generated imagery.

3 Credits

Qwen Image 2

Alibaba's unified generation-and-editing model with native 2K resolution and improved fidelity over the original Qwen Image. Strong text rendering across multiple scripts and natural cinematic stylization make it a solid pick for fashion, editorial, and content targeting Asian markets — and it doubles as an editor in the same checkpoint.

4 Credits

Qwen Image 2 Pro

The premium tier of Alibaba's Qwen Image 2. Enhanced realism and text accuracy at native 2K with precise image-editing capability built in — competitive with Flux 2 Pro on the Artificial Analysis text-to-image arena. Best-in-class for stylized fashion photography with multilingual text demands.

8 Credits

Recraft V4

Recraft's design-focused flagship, built around design taste rather than photorealism. Best-in-class typography rendering for an image model alongside strong prompt accuracy and art-directed composition. The natural choice for posters, packaging, brand assets, and any creative where text and graphic design matter as much as the image.

4 Credits

Recraft V4 SVG

Recraft V4 in SVG output mode — generates production-ready vector images you can edit directly in Figma or Illustrator. Built for logos, icons, and scalable design assets where the output needs to be vector from the start. Unique in the AI image space; most models output rasters that don't trace cleanly.

8 Credits

Recraft V3 Realism

Recraft's previous-generation realism model with strong long-form text rendering and photorealistic outputs. Surpassed by Recraft V4 on overall design taste and prompt accuracy, but still useful when you want the older Recraft V3 aesthetic specifically. Same checkpoint underneath as the V3 Digital Illustration variant.

4 Credits

Recraft V3 Digital Illustration

Recraft V3 in digital-illustration mode — consistent stylized output for editorial spot art, blog imagery, and content series. Useful when you need a series of images with a unified illustrated treatment rather than photorealism. Lower-cost than Recraft V4 if the simpler V3 aesthetic fits the brief.

4 Credits

Grok Imagine Image

xAI's image generation and editing model with strong prompt adherence and a distinctive personality-driven aesthetic. Built for X-platform-aligned content, social-media-ready imagery, and irreverent creative work where polish matters less than tone. Strong cost-quality at this tier compared to the major US flagships.

2 Credits

Ideogram V3 Realism

Ideogram's typography-focused image generator at the V3 release. Best-in-class text rendering at a lower cost than GPT Image, with photorealistic outputs that work for posters, packaging, and text-prominent ad creative. The natural choice when you need accurate on-image typography but GPT Image 2 pricing is excessive.

6 Credits

Ideogram V3 Digital Illustration

Ideogram V3 tuned for digital illustration — typography-aware editorial spot art and blog imagery with consistent stylized treatment. Same V3 checkpoint as the realism variant; the illustration mode just shifts default style. Cheaper than Recraft for editorial content where text legibility matters.

6 Credits

Flux

Upsampler's in-house creative upscaler powered by Flux. Adds plausible new detail at higher resolution rather than just enlarging existing pixels — the adjustable creativity slider lets you dial between faithful upscaling and aggressive detail enhancement. Built for AI-generated images, stylized work, and any upscale where the source benefits from a quality lift, not just a resolution lift. Only available on Upsampler.

1 Credit/MP

Stable Diffusion

Upsampler's custom Stable Diffusion-powered upscaler. Preserves the overall composition while regenerating fine detail — good for upscaling that should keep the source recognizable while filling in plausible texture and detail. A different aesthetic from the Flux upscaler; pick whichever style fits the source. Only available on Upsampler.

1 Credit/MP

Seed VR 2 Small

ByteDance's SeedVR2 — a lightweight diffusion upscaler that prioritizes faithfulness to the source over creative interpretation. The right choice for documentary preservation, ecommerce product upscaling, and any work where 'keep the original character' matters more than 'add new detail'.

0.25 Credits/MP

Seed VR 2 Large

The larger variant of ByteDance's SeedVR2 with higher fidelity for archival-grade outputs. Same source-preserving philosophy as Seed VR 2 Small, with no creative reinterpretation and no invented detail, just better recovery of fine texture and edge sharpness for hero archival pieces.

0.5 Credits/MP

P Image Upscale

Pruna AI's optimized hyperrealistic image enhancer. Balanced speed-quality for general-purpose upscaling, a good middle ground between fast GAN-based options and slower premium diffusion upscalers. Strong for photorealistic content where you want a quality lift without the cost of Seed VR 2 Large.

0.25 Credits/MP

GAN

Traditional GAN-based upscaling — fast, low-cost, and faithful to the source. Doesn't add invented detail like diffusion-based upscalers; preserves what's there at higher resolution. The pragmatic choice for high-volume batch upscaling where speed and credit cost matter more than aesthetic lift.

0.25 Credits/MP
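
The per-megapixel rates above turn into a cost per image once you know the pixel count. The sketch below is a rough estimate only. It assumes the charge is based on output megapixels (width × height / 1,000,000) with no rounding and no minimum charge, none of which the listings state. The function and dictionary names are illustrative and not part of any Upsampler API.

```python
# Credits per megapixel, copied from the upscaler listings above.
UPSCALER_RATES = {
    "Flux": 1.0,
    "Stable Diffusion": 1.0,
    "Seed VR 2 Small": 0.25,
    "Seed VR 2 Large": 0.5,
    "P Image Upscale": 0.25,
    "GAN": 0.25,
}


def estimate_upscale_credits(model: str, width: int, height: int) -> float:
    """Estimate the credits for one upscale.

    Assumption (not confirmed by the listings): the charge is based on the
    output size in megapixels, with no rounding and no minimum charge.
    """
    megapixels = (width * height) / 1_000_000
    return UPSCALER_RATES[model] * megapixels


if __name__ == "__main__":
    # A 4000 x 3000 output is 12 MP.
    for name in UPSCALER_RATES:
        credits = estimate_upscale_credits(name, 4000, 3000)
        print(f"{name}: {credits:.2f} credits")
```

Under these assumptions, a 12 MP output would cost 12 credits with the Flux upscaler and 3 credits with the GAN upscaler, which shows how quickly the gap widens on large batches.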

Nano Banana

Google's Gemini 2.5 Flash applied to photo restoration. Fast turnaround for casual restoration jobs — scratch removal, color correction on faded photos, light damage repair — with strong identity preservation across the restored image. Cheaper than Nano Banana 2 or Pro and well-suited to high-volume restoration where archival fidelity isn't required.

7 Credits

Nano Banana 2

Google's Gemini 3.1 Flash for high-fidelity photo restoration. Removes damage and noise with exceptional identity preservation — faces stay recognizable, color grading stays natural, fine detail comes back without invented features. Top 5 on the Artificial Analysis image-editing arena. Strong default for production restoration work.

14 Credits

Nano Banana Pro

Google's flagship for premium photo restoration. Magazine-quality manipulations with archival-grade fidelity — faces, fabrics, fine detail all recover without inventing implausible features. Top 3 on the image-editing arena. Reach for it on family-archive hero pieces, photo books, and any restoration where the result is going into print.

18 Credits

Seedream 4

ByteDance Seedream 4 applied to restoration. Stylized photo recovery with cinematic mood preservation — useful for fashion archives, music-video-era stills, and atmospheric vintage photo treatment where you want to keep the original feel rather than aim for documentary realism.

6 Credits

Seedream 4.5

ByteDance Seedream 4.5 for premium restoration. Rich color recovery and cinematic mood preservation — well-suited to editorial archives, fashion shoot recoveries, and any restoration where preserving the source's stylistic intent matters more than literal fidelity. Top of the Seedream restoration lineup.

7 Credits

Seedream 5 Lite

Lower-cost Seedream restoration tier — same cinematic aesthetic at faster iteration speeds. Built for batch archival workflows and high-volume restoration where credit cost matters. The built-in reasoning means it handles tricky inputs (heavily damaged sources, unusual subjects) better than other budget tiers.

7 Credits

Flux Kontext Fast

Fast Flux Kontext for restoration. Lower cost and faster turnaround on instruction-driven restoration: describe the damage in plain English and the model handles the rest. Best for high-volume archival workflows where Flux Kontext Pro's quality isn't worth the per-image cost difference.

4 Credits

Flux Kontext Pro

Black Forest Labs' production-grade instruction-based editor applied to restoration. Describe damage in plain English ('remove scratches', 'restore faded color', 'sharpen the blur') and get clean results that preserve the original character. Strong identity and lighting consistency makes it reliable for archival work.

7 Credits

Flux Kontext Max

The premium Flux Kontext tier for archival restoration. Improved typography handling helps with old documents and signage; stronger scene understanding handles complex multi-issue damage where Pro's outputs miss the mark. Reach for it on archival hero pieces and document restoration that warrants the cost.

11 Credits

Flux 2 Klein

Open-weight Flux 2 in restoration mode. Lightweight Flux 2 quality at a lower cost than Pro or Max — practical for self-hosted archival pipelines and high-volume restoration where the Flux 2 aesthetic matters but flagship pricing doesn't fit.

5 Credits

Flux 2 Pro

Flux 2 Pro applied to restoration. Production-quality vintage photo recovery with stylistic preservation — useful when you want the recovered image to retain a deliberately stylized feel rather than aim for documentary realism. Sits below Flux 2 Max on top-end fidelity at better cost-quality.

8 Credits

Flux 2 Max

The highest-fidelity Flux 2 restoration tier. Magazine-grade recovery for archival hero pieces with art-directed control over the restoration treatment. Reach for it on family heirlooms, photo book covers, and any restoration headed for print where the cost per image is justified by the output.

13 Credits

GPT Image 2 Low

OpenAI GPT Image 2 budget restoration tier. Cheapest GPT Image 2 option for quick restoration — strong text-handling carries through to old documents and signage, useful even at the low tier. Good for high-volume archival workflows where the medium or high tier isn't justified.

5 Credits

GPT Image 2 Medium

OpenAI GPT Image 2 mid-tier for restoration. Industry-best text rendering carries through to archival document recovery — old letters, signage, and newspapers all reconstruct legibly where most restoration models fail. The default for production-quality archival when high-tier pricing isn't justified.

9 Credits

GPT Image 2 High

OpenAI GPT Image 2 at maximum quality for archival restoration. Highest text rendering accuracy in any image model, exceptional fine-detail recovery, and top-tier instruction adherence. The premium choice for hero archival pieces, document restoration, and any restoration work where every detail counts.

18 Credits

Qwen Edit

Alibaba's Qwen Edit applied to photo restoration. Open-weight, with multilingual text rendering strength that helps when restoring old Chinese, Japanese, or Korean documents where Western models can't read the source — let alone reconstruct it. Specialized but strong in its niche.

6 Credits

Qwen Edit 2511

The newer release of Alibaba's Qwen Edit applied to restoration. Improved fidelity over the original with stronger stylistic control. Best open-weight option for stylized Asian-aesthetic restoration and multilingual document recovery.

6 Credits

Nano Banana

Google's Gemini 2.5 Flash applied to instruction-driven image editing. Fast turnaround on quick edits — background swaps, object additions, color adjustments — with the identity preservation and prompt-following accuracy that comes from being Gemini-grounded. A practical choice for casual editing workflows that don't need Nano Banana 2's multi-image reference support.

4 Credits

Nano Banana 2

Google's Gemini 3.1 Flash for image editing. Currently in the top 5 on the Artificial Analysis image-editing arena, with native multi-image references that make complex edits and identity-preservation jobs significantly more reliable. Strong default for instruction-based edits: describe what to change, and the rest of the scene stays consistent.

11 Credits

Nano Banana Pro

The flagship tier of Google's image-editing lineup. Magazine-quality photo manipulations with the strongest identity preservation in any current model — faces, products, brand elements all hold up across edits. Top 3 on the Artificial Analysis image-editing arena. Justified when the edit is going into print or premium ad placement.

15 Credits

Seedream 4

ByteDance Seedream 4 applied to editing. Stylized photo edits that preserve the cinematic mood and color palette of the source — useful for fashion editorial revisions, music-video-style retouching, and atmospheric tone shifts. Less photographically literal than Nano Banana, more cinematic — pick it when aesthetic preservation matters more than precise instruction-following.

3 Credits

Seedream 4.5

ByteDance Seedream 4.5 in edit mode. Premium cinematic edits with rich color recovery and mood handling, with stronger spatial understanding than Seedream 4 — character outfits, props, and backgrounds all sit more naturally in their scenes. Top of the Seedream editing lineup; pair with Seedream 5 Lite for high-volume iteration.

4 Credits

Seedream 5 Lite

A faster, lower-cost alternative to Seedream 4.5 in edit mode. Same cinematic Seedream aesthetic at reduced fidelity, built for batch editing workflows where credit cost matters. The built-in reasoning and example-based editing differentiate it from other budget tiers — pass an example image of the desired result and the model interprets the intent.

4 Credits

Flux Kontext Fast

The fast variant of Black Forest Labs' instruction-based editor. Same Flux Kontext interface — describe edits in plain English — at lower cost and faster turnaround. Best for iteration, ideation, and high-volume editing workflows where Flux Kontext Pro's quality isn't worth the credit difference per edit.

1 Credit

Flux Kontext Pro

Black Forest Labs' production-grade instruction-based image editor. Describe an edit in plain English and get a clean result that preserves the rest of the scene — strong identity, lighting, and color consistency across iterations. The default choice for instruction-based edits where Nano Banana isn't preferred.

4 Credits

Flux Kontext Max

The premium tier of Black Forest Labs' Flux Kontext editor. Improved typography handling and stronger scene understanding than Pro — useful for complex multi-step edits, art-directed manipulations, and edits involving on-image text. Reach for it when Kontext Pro's outputs aren't holding up on the harder transformations.

8 Credits

Flux Fill Pro

Black Forest Labs' mask-based inpainting model. Pixel-perfect localized edits: paint a mask over the region you want changed, describe what should fill it, and the rest of the image stays untouched. The complement to Flux Kontext: Kontext for instruction-based whole-image edits, Fill Pro for precision regional work.

5 Credits

Flux 2 Klein

Open-weight Flux 2 in edit mode. Self-hostable Flux 2 quality at a lower cost than Pro or Max, useful when you want the Flux 2 aesthetic preserved across edits without paying flagship pricing. Strong stylization preservation makes it a good pick for iterating on Flux 2 generations specifically.

2 Credits

Flux 2 Pro

Black Forest Labs' Flux 2 Pro applied to edits. Production-quality stylized photo edits with strong stylization preservation and reference-image support — useful for iterating on Flux 2 generations or for editing existing photos in a Flux 2 aesthetic. Sits below Flux 2 Max on top-end fidelity but offers better cost-quality.

5 Credits

Flux 2 Max

The highest-fidelity Flux 2 edit tier. Top-25 on the Artificial Analysis image-editing arena with magazine-grade detail and exceptional preservation of source character. Pick it for hero edits, premium ad revisions, and any work going into print where the cost per edit is justified by the output.

10 Credits

GPT Image 2 Low

OpenAI GPT Image 2 in edit mode at the budget tier. Cheapest GPT Image 2 option for quick, lower-detail edits with the same model's text rendering and instruction-following advantages preserved. Useful for high-volume iteration, draft edits, and exploration before committing to medium or high tier for the final.

2 Credits

GPT Image 2 Medium

OpenAI's GPT Image 2 mid-tier for edits. Industry-best text rendering carries through to editing — exact text replacements, typography revisions, and on-image text edits all work where most editors fall apart. The default for production-quality edits when high-tier pricing isn't warranted.

6 Credits

GPT Image 2 High

OpenAI's GPT Image 2 at maximum quality for edits. Ranked #2 on the Artificial Analysis image-editing arena with magazine-cover fidelity and best-in-class text rendering during edits. The premium choice for hero edits, complex art-directed manipulations, and any typography-heavy editing work.

15 Credits

Grok Imagine Image

xAI's image editor with strong prompt adherence and a distinctive personality-driven aesthetic. Best for stylized edits where polish matters less than tone — social-media content, irreverent creative, X-platform-aligned visuals. Cost-competitive at this tier compared to the US flagships.

2 Credits

Ideogram V3

Ideogram V3 in edit mode. Typography-aware revisions for design-focused workflows — logo refinements, packaging text edits, poster typography revisions. Cheaper than GPT Image 2 for typography-heavy editing, though weaker on general photo edits where instruction-following matters more than text rendering.

6 Credits

Qwen Edit

Alibaba's Qwen Image Edit — an edit-trained variant of Qwen Image with native multi-image editing support. Open-weight, with the multilingual text rendering strength carried through to editing. Useful for stylized photo revisions, fashion editorial, and content targeting Asian markets.

3 Credits

Qwen Edit 2511

The newer release of Alibaba's Qwen Edit, with improved fidelity and stylistic control over the original. Ranked #28 on the Artificial Analysis image-editing arena. Strongest open-weight option for stylized Asian-aesthetic edits where mainstream Western models miss the brief.

3 Credits

Qwen Image 2

Alibaba's Qwen Image 2 in edit mode — the unified generation-and-editing model that handles both from a single checkpoint. Native 2K resolution with strong text rendering across scripts. A natural choice when you want consistency between generation and editing in the same workflow.

4 Credits

Qwen Image 2 Pro

The premium tier of Qwen Image 2 in edit mode. Enhanced realism and text accuracy at native 2K with precise image-editing capability — strong cinematic and fashion sensibility makes it well-suited to stylized editorial revisions where Western flagships can feel too literal.

8 Credits

BiRefNet Background Removal

BiRefNet — a specialized background-removal model purpose-built for clean edge extraction. Preserves fine detail in hair, fur, and transparent or translucent regions where general-purpose tools leave halos and jagged edges. The right choice for ecommerce cutouts, compositing prep, and asset extraction at scale.

1 Credit

Depth Anything V2

Depth Anything V2 — a specialized depth-estimation model that generates high-quality depth maps from any image. Useful for ControlNet workflows, 3D-aware compositing, AR/VR asset preparation, and anywhere downstream tools need a depth signal. Not a creative editor; a feeder model for other pipelines.

1 Credit

Wan 2.2 5B

The smaller, faster variant of Alibaba's Wan 2.2 video model. Open-weight, ultra-fast text-to-video and image-to-video for rapid prototyping. Lower fidelity than the 14B variant but practical for ideation, draft renders, and high-volume iteration before committing to a flagship.

0.2 Credits/sec

Wan 2.2 14B

Alibaba's flagship Wan 2.2 video model with crisp 480p output and strong stylization. Open-weight availability makes it useful for self-hosted pipelines and teams that want production-quality video generation without closed-API costs. Edged out by Veo, Sora, and Kling at the top tier but cost-competitive.

1 Credit/sec

Happy Horse

HappyHorse 1.0 — Alibaba-ATH's image-to-video flagship, currently ranked #1 on the Artificial Analysis text-to-video arena. Synchronized native audio with multilingual lip-sync makes it strong for character-driven content where dialogue matters. Best when you have a reference image and want it animated with audio.

14 Credits/sec

P-Video Draft

Pruna AI's draft-tier video generation — roughly 4x faster than the full P-Video model for rapid previews. Built for the iteration phase: nail down the motion and composition cheaply, then commit to a full render only when you're sure. Cuts cost meaningfully on exploratory work.

1 Credit/sec

P-Video

Pruna AI's production-tier video model with fast generation, built-in audio, and multi-aspect-ratio support. Optimized for cost-quality balance — solid for production workflows where the top-tier closed models (Veo, Sora) feel too expensive for the use case.

4 Credits/sec
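
The draft-then-final workflow described for P-Video Draft can be put into numbers. The sketch below compares iterating on drafts and rendering once at full quality against iterating at full quality throughout. It assumes every attempt has the same duration and is billed as rate × seconds with no rounding, which the listings do not confirm. The function name is hypothetical.

```python
# Credits per second, copied from the two P-Video listings above.
DRAFT_RATE = 1.0  # P-Video Draft
FINAL_RATE = 4.0  # P-Video


def iteration_cost(clip_seconds: float, attempts: int, draft_first: bool) -> float:
    """Total credits to reach one final render after `attempts` tries.

    Assumption: each attempt is billed as rate * seconds. With
    draft_first=True, every attempt is a draft and one full render is added
    at the end; otherwise every attempt is a full render and the last one
    is kept as the final.
    """
    if draft_first:
        drafts = attempts * clip_seconds * DRAFT_RATE
        final = clip_seconds * FINAL_RATE
        return drafts + final
    return attempts * clip_seconds * FINAL_RATE


if __name__ == "__main__":
    # Six attempts at a 5-second clip.
    print("draft first:", iteration_cost(5, 6, draft_first=True))
    print("full only:  ", iteration_cost(5, 6, draft_first=False))
```

With six attempts at a 5-second clip, the draft-first route comes to 50 credits against 120 for full renders only, under the billing assumption stated above.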

LTX 2 Distilled

Lightricks' distilled LTX 2 — open-source audio-video model for expressive clips with sound. Lower fidelity than the full LTX 2 / 2.3 line but practical for open-weight workflows and teams that want LTX-style video without closed-API costs.

2 Credits/sec

LTX 2.3 Fast

Lightricks' fastest LTX 2.3 tier with synchronized audio. Open-weight cinematic concept iteration — describe a clip with audio cues, get a quick render with sound. Use for iteration before stepping up to LTX 2.3 Pro for final renders.

4 Credits/sec

LTX 2.3 Pro

Lightricks' flagship LTX 2.3 with synchronized audio. Higher-fidelity video generation in the open-weight LTX line, with cinematic camera handling and audio sync. Strong choice for teams that want premium video quality on infrastructure they control.

6 Credits/sec

Seedance 1 Pro

ByteDance's Seedance 1.0 — fast, cost-efficient video generation with strong motion physics and detail. The original Seedance flagship, now succeeded by Seedance 1.5 Pro and 2 Fast for top-tier work. Still a practical pick when 1.5 Pro and 2 are overkill for the brief.

3 Credits/sec

Seedance 1.5 Pro (No Audio)

ByteDance Seedance 1.5 Pro without the audio overhead. Cinema-quality video with precise motion and cinematic camera control at lower cost than the audio-enabled variant. Built for B-roll, silent narrative, and any content where audio is added in post.

3 Credits/sec

Seedance 1.5 Pro

ByteDance Seedance 1.5 Pro with synchronized audio. Cinema-quality video with precise lip-syncing and cinematic camera control — strong for narrative short-form content, music video aesthetics, and anywhere dialogue matters as much as visuals.

5 Credits/sec

Seedance 2

ByteDance Seedance 2.0 — the next-generation Seedance flagship with native audio, multimodal inputs, and 720p output. Currently #2 on the Artificial Analysis text-to-video arena. Reach for it on hero clips, premium ad creative, and narrative content where the cost per second is justified.

13 Credits/sec

Grok Imagine Video

xAI's stylized video generator with synchronized audio. High-quality text-to-video and image-to-video with the distinctive xAI personality. Currently ranked #5 on the Artificial Analysis text-to-video arena. Strong for X-platform-aligned content and stylized social video where polish matters less than tone.

5 Credits/sec

Kling 2.5 Pro

Kling 2.5 Turbo Pro — the flagship Kling tier with pro-grade text-to-video and image-to-video. Smooth motion, strong prompt fidelity, and exceptional motion physics for complex camera work — tracking shots, dolly moves, and crane sweeps all hold up. Strong choice for cinematic narrative work.

7 Credits/sec

Runway Gen 4.5

Runway Gen-4.5 — premium text-to-video and image-to-video with cinematic quality, rich detail, and fluid motion. Mature creator-focused tooling with strong cinematic handling. Currently #12 on the Artificial Analysis text-to-video arena. The natural choice for teams already on Runway's broader video stack.

12 Credits/sec

Veo 3.1 (No Audio)

Google DeepMind's Veo 3.1 Fast without audio — high-fidelity video with strong temporal coherence and native video extension. Lower cost than the audio-enabled variant, built for B-roll and silent cinematic content where audio is added in post.

10 Credits/sec

Veo 3.1

Google DeepMind's Veo 3.1 Fast with synchronized audio. Industry-leading dialogue lip-sync and audio realism — context-aware audio generation, smooth motion, and native video and audio extension. Reach for it on dialogue-heavy short-form content where Veo's lip-sync advantage justifies the cost.

15 Credits/sec
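
For the video models that ship in both audio and no-audio variants, the per-second rates make the cost of native audio easy to estimate. The sketch below is an illustration only. It assumes linear per-second billing with no rounding or minimum clip length, which the listings do not state, and the helper names are hypothetical.

```python
# Credits per second, copied from the video listings above.
VIDEO_RATES = {
    "Seedance 1.5 Pro (No Audio)": 3.0,
    "Seedance 1.5 Pro": 5.0,
    "Veo 3.1 (No Audio)": 10.0,
    "Veo 3.1": 15.0,
}


def clip_credits(model: str, seconds: float) -> float:
    """Estimate credits for one clip, assuming billing is rate * seconds."""
    return VIDEO_RATES[model] * seconds


def audio_premium(base_model: str, seconds: float) -> float:
    """Extra credits for the audio variant over its no-audio counterpart."""
    with_audio = clip_credits(base_model, seconds)
    without_audio = clip_credits(f"{base_model} (No Audio)", seconds)
    return with_audio - without_audio


if __name__ == "__main__":
    for base in ("Seedance 1.5 Pro", "Veo 3.1"):
        print(f"{base}: +{audio_premium(base, 8):.0f} credits for audio on an 8 s clip")
```

If audio is going to be replaced in post anyway, the no-audio variant avoids that premium on every render.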