AI Models in a Unified API

Experiment quickly in a clean UI, explore example runs for insights,
or integrate via SDKs and APIs.

LongCat-Video t2v

Turn plain text into cinematic, on-brand video. Describe the scene and camera feel; LongCat generates smooth, consistent shots with adjustable length, FPS, and quality.

Jewelry Modeling

Generate realistic jewelry modeling photos by applying provided jewelry images onto a model or person, with professional lighting and detailed textures.

Real-ESRGAN Image Upscaler

High-quality image upscaler with optional face enhancement.

Seedream v4

Seedream 4.0 is a next-generation image creation model that unifies generation and editing in a single architecture, enabling advanced multimodal reasoning and reference consistency while delivering stunning 4K images with significantly faster inference.

SDXL Lightning 4-step

SDXL-Lightning is a lightning-fast text-to-image generation model that produces high-quality 1024px images in just a few steps, distilled from Stable Diffusion XL.

LongCat-Video i2v

LongCat-Video turns a single still image into minutes-long, smooth 480p, 30fps video with stable style, lighting, and identity — fast, consistent, production-ready animation from one frame.

Veo 3.1 Text to Video

Create cinematic 8-second videos with Veo 3.1, Google’s latest text-to-video model in the Gemini API — now with native audio, frame control, and reference image support.

Seedance 1.0 Pro

Seedance 1.0 generates 1080p videos with smooth motion, rich detail, and diverse styles, while the Pro version adds multi-shot narrative and advanced instruction following for cinematic results.

Nano Banana

State-of-the-art image editing model from Google's Gemini 2.5 family.

Inspyrenet Image Mask

Helps find and highlight important objects in high-resolution images. It works without needing special high-quality training data and gives sharp, accurate results.

From Idea to Production in Minutes

A streamlined workflow for building with AI models.

FLUX

Veo 3.1

Imagen

Seedream V4

Seedance

Wan 2.1

No infrastructure to manage. Focus on building; we handle the rest.

API Quickstart

Explore the unified API. For each model, we provide a code snippet to get you started.

import { modelrunner } from "@modelrunner/client";

// Run Veo 3.1 text-to-video and wait for the result.
const result = await modelrunner.subscribe("google/veo-3.1-text-to-video", {
  input: {
    prompt: "An eye-level shot glides through a misty pine forest at dawn. Soft sunlight filters through the trees, illuminating particles in the air. A red fox slowly emerges from the fog, stretches, and walks across moss-covered ground. The camera tracks its gentle movement in shallow focus. Natural ambiance fills the soundscape — birds chirping, distant rustle of leaves, and a light breeze passing through the forest.",
    duration: "6",
    resolution: "720p",
    aspect_ratio: "16:9",
    person_generation: "allow_all",
  },
});

Frequently Asked Questions

Everything you need to know about ModelRunner

How is ModelRunner different from other AI providers?

ModelRunner provides a unified API for generative AI models across image and video. Instead of managing multiple provider integrations, you access Google, ByteDance, and other providers through a single endpoint. This simplifies development, reduces integration overhead, and gives you flexibility to switch between providers or models without changing your code.

What models does ModelRunner support?

ModelRunner supports models across four categories: text-to-image, image-to-image, text-to-video, and image-to-video. We integrate with leading providers including Google (Veo 3.1, Imagen) and ByteDance (Seedance, Seedream), as well as open-source models like FLUX running on serverless compute. New models are added regularly, and you can test any model instantly in our Playground before integrating.

How does pricing work?

ModelRunner offers transparent, pay-as-you-go pricing with no hidden fees. Pricing varies by model type: per-second GPU time for serverless models, per-output for simple generation, or per-output-second for video (based on duration). All pricing is clearly displayed for each model, including GPU costs for serverless inference. You purchase credits via Stripe and only pay for what you use.
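
As a rough illustration of the three billing modes described above, the sketch below estimates the cost of a single request. The `Pricing` and `Usage` types, the `estimateCost` function, and every rate are hypothetical placeholders, not ModelRunner's actual prices or API; the real figures are the ones shown on each model's page.

```typescript
// Hypothetical pricing shapes mirroring the three billing modes above.
type Pricing =
  | { kind: "per_gpu_second"; ratePerSecond: number } // serverless models
  | { kind: "per_output"; ratePerOutput: number } // simple generation
  | { kind: "per_output_second"; ratePerSecond: number }; // video, by duration

interface Usage {
  gpuSeconds?: number; // GPU time consumed by a serverless run
  outputs?: number; // number of generated outputs
  outputSeconds?: number; // total duration of generated video
}

// Estimate the cost of one request, in the same unit as the rates.
function estimateCost(pricing: Pricing, usage: Usage): number {
  switch (pricing.kind) {
    case "per_gpu_second":
      return pricing.ratePerSecond * (usage.gpuSeconds ?? 0);
    case "per_output":
      return pricing.ratePerOutput * (usage.outputs ?? 0);
    case "per_output_second":
      return pricing.ratePerSecond * (usage.outputSeconds ?? 0);
  }
}

// Example with made-up rates: a 6-second video billed per output second.
const videoCost = estimateCost(
  { kind: "per_output_second", ratePerSecond: 0.5 },
  { outputSeconds: 6 },
);
console.log(videoCost); // 3
```

Modeling the billing mode as a tagged union keeps the calculation for each mode in one place, so a model that changes billing mode only needs a different `Pricing` value.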

Can I try models before integrating?

Yes. Every model on ModelRunner has an interactive Playground where you can test generation with full parameter control. You can explore example runs to see real inputs and outputs, then copy ready-to-use code snippets for your integration. The Playground mirrors the exact API behavior, so what you see is what you get in production.

How do I integrate ModelRunner into my application?

Integration is straightforward: sign up, purchase credits, and create an API key. You can then make requests via our REST API or use our JavaScript SDK for a more streamlined experience. All models share a consistent interface—same authentication, same request/response patterns—so switching between models requires minimal code changes.
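
One way to picture the shared interface is a single helper that takes the model ID as a parameter. The sketch below assumes only the `subscribe(modelId, { input })` call shape from the quickstart snippet; the `Runner` interface, the `generate` helper, the recording stand-in client, and the `example/other-model` ID are hypothetical and exist so the example runs without the SDK or an API key.

```typescript
// Minimal shape of the client call used in the quickstart snippet.
// The response type is left as `unknown` because its shape is not
// documented on this page.
interface Runner {
  subscribe(
    model: string,
    options: { input: Record<string, unknown> },
  ): Promise<unknown>;
}

// One helper for every model: only the model ID and the input change.
function generate(
  runner: Runner,
  model: string,
  input: Record<string, unknown>,
): Promise<unknown> {
  return runner.subscribe(model, { input });
}

// Stand-in client that records which models were requested.
// In an application you would pass the real client instead (assuming
// it matches the `Runner` shape above).
const calls: string[] = [];
const fakeRunner: Runner = {
  subscribe: (model, options) => {
    calls.push(model);
    return Promise.resolve({ model, input: options.input });
  },
};

// "google/veo-3.1-text-to-video" appears in the quickstart;
// "example/other-model" is a made-up ID for illustration.
const prompt = "A red fox walks through a misty pine forest at dawn.";
generate(fakeRunner, "google/veo-3.1-text-to-video", { prompt });
generate(fakeRunner, "example/other-model", { prompt });
console.log(calls);
```

Input parameters still differ per model, so in practice the `input` object is the part that changes when you switch.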

Is my data private and secure?

Yes. ModelRunner does not use your inputs or outputs for training. We support secure authentication including passkeys (WebAuthn), OAuth (GitHub, Google), and two-factor authentication. Your API keys are managed securely, and all requests are processed through our infrastructure without data retention beyond what's needed to fulfill your request.

Can I use ModelRunner for commercial projects?

Yes. You can use ModelRunner-generated content for commercial purposes. However, licensing terms depend on the underlying model provider. We clearly document licensing for each model in our catalog. For open-source models, standard open-source licenses apply. For provider APIs like Google or ByteDance, their respective terms of service govern commercial use.

Does ModelRunner support enterprise workloads?

Yes. ModelRunner supports production applications requiring high throughput and reliability. Our async queue system handles requests at scale with real-time status updates via SSE. For enterprise needs like dedicated capacity, custom SLAs, or volume pricing, contact our sales team to discuss tailored solutions.
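
For readers unfamiliar with SSE, the sketch below parses a block of text in the standard `text/event-stream` format into events. It covers the wire format only: the endpoint that serves status updates is not described on this page, and the `status` event name and JSON fields in the sample are made up for illustration.

```typescript
// One parsed Server-Sent Event: event name plus its data payload.
interface SseEvent {
  event: string;
  data: string;
}

// Parse a complete text/event-stream string into events.
// Events are separated by blank lines; multiple `data:` lines in one
// event are joined with "\n".
function parseSse(text: string): SseEvent[] {
  const events: SseEvent[] = [];
  for (const block of text.split(/\r?\n\r?\n/)) {
    let event = "message"; // default event type in the SSE format
    const data: string[] = [];
    for (const line of block.split(/\r?\n/)) {
      if (line === "" || line.startsWith(":")) continue; // blank or comment
      const colon = line.indexOf(":");
      const field = colon === -1 ? line : line.slice(0, colon);
      let value = colon === -1 ? "" : line.slice(colon + 1);
      if (value.startsWith(" ")) value = value.slice(1); // strip one space
      if (field === "event") event = value;
      else if (field === "data") data.push(value);
    }
    // Blocks without any data line (e.g. keep-alive comments) are skipped.
    if (data.length > 0) events.push({ event, data: data.join("\n") });
  }
  return events;
}

// Hypothetical status stream; event names and payload fields are invented.
const sample =
  'event: status\ndata: {"status":"queued"}\n\n' +
  ": keep-alive\n\n" +
  'event: status\ndata: {"status":"completed"}\n\n';

const statuses = parseSse(sample).map((e) => JSON.parse(e.data).status);
console.log(statuses); // logs the two statuses: queued, completed
```

In a browser, the built-in `EventSource` handles this parsing for you; a hand-written parser like this is mainly useful when reading the stream from a raw response body.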

Do pro models ever use faster or cheaper models under the hood?

No. When you select a pro model, you always get that exact model—we never substitute it with a faster or cheaper alternative. Every request is processed by the model you chose, ensuring consistent quality and predictable results. This transparency is core to how ModelRunner operates: what you select is exactly what runs your generation.

From Idea to Production in Minutes

1. Explore
2. Experiment
3. Learn
4. Ship
