stability-ai / stable-diffusion-v3.5-large

Generate high-quality images from a text prompt with strong prompt adherence, accurate typography, and diverse visual styles.

0.065 per megapixel of image
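As a rough illustration of per-megapixel pricing, the sketch below estimates the charge for one image. It assumes a megapixel means 1,000,000 pixels and that the charge scales linearly with no rounding or minimum; neither is stated on this page. `estimateCost` is a hypothetical helper, not part of any client library, and the result is in whatever currency the listed rate uses.

```js
// Hypothetical helper: estimate the price of one generated image.
// Assumptions (not confirmed by this page): 1 megapixel = 1,000,000 pixels,
// and the charge scales linearly with no rounding or minimum.
const PRICE_PER_MEGAPIXEL = 0.065; // rate listed on this page

function estimateCost(width, height, pricePerMegapixel = PRICE_PER_MEGAPIXEL) {
  const megapixels = (width * height) / 1e6;
  return megapixels * pricePerMegapixel;
}

// Example: a 1024 x 768 image is about 0.786 megapixels.
console.log(estimateCost(1024, 768).toFixed(4));
```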

OpenAPI

Input

Prompt

The text prompt describing the image to generate.

Image Size

Width

Height

The size of the generated image. Choose a preset (e.g. 'square_hd', 'portrait_16_9') or pass a custom {width, height} object.
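The two accepted forms of the image size can be handled with a small normalizer such as the sketch below. `toImageSize` is a hypothetical helper; the requirement that custom dimensions be positive integers is an assumption, and preset names are passed through without checking them against a list.

```js
// Hypothetical helper: normalize the image size input.
// Accepts a preset name (passed through unchanged) or a { width, height } object.
function toImageSize(size) {
  if (typeof size === "string" && size.length > 0) {
    return size; // e.g. "square_hd"; preset names are not validated here
  }
  const { width, height } = size ?? {};
  // Assumption: custom dimensions must be positive integers.
  if (!Number.isInteger(width) || !Number.isInteger(height) || width <= 0 || height <= 0) {
    throw new TypeError("image size must be a preset name or { width, height } with positive integers");
  }
  return { width, height };
}

console.log(toImageSize("portrait_16_9"));
console.log(toImageSize({ width: 1024, height: 768 }));
```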

Num Inference Steps

Min: 1 - Max: 50

The number of inference steps to perform. More steps can improve detail at the cost of speed.

Additional Settings

Customize your input with more control.

Guidance Scale

Min: 0 - Max: 20

The CFG (Classifier Free Guidance) scale. Higher values increase adherence to the prompt.

Negative Prompt

Describe what you do NOT want to appear in the image.

Seed

The same seed and the same prompt given to the same version of the model will output the same image every time.

Output Format

The format of the generated image.

Enable Safety Checker

The safety checker can only be disabled through the API.

ControlNet

Path

URL or HuggingFace path to the ControlNet model.

End Percentage

Min: 0 - Max: 1

Fraction of the denoising process at which ControlNet conditioning ends.

Start Percentage

Min: 0 - Max: 1

Fraction of the denoising process at which ControlNet conditioning starts.

Control Image URL

URL of the control image that guides the generation.

Conditioning Scale

Min: 0 - Max: 2

How strongly the ControlNet conditioning is applied.

Optional ControlNet conditioning. Provide a control model path and a control image to guide structure/composition.
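A minimal sketch of assembling the ControlNet settings with the ranges listed above. `buildControlNet` is a hypothetical helper. The `path` and `control_image_url` keys appear on this page; `conditioning_scale`, `start_percentage`, and `end_percentage` are assumed snake_case forms of the field labels, and the default values and the start-before-end check are assumptions as well.

```js
// Hypothetical helper: build a ControlNet settings object.
// Key names other than `path` and `control_image_url` are assumed, not confirmed.
function buildControlNet({ path, controlImageUrl, conditioningScale = 1, start = 0, end = 1 }) {
  const inRange = (v, min, max) => typeof v === "number" && v >= min && v <= max;
  if (!path || !controlImageUrl) throw new Error("path and controlImageUrl are required");
  if (!inRange(conditioningScale, 0, 2)) throw new RangeError("conditioning scale must be in [0, 2]");
  if (!inRange(start, 0, 1) || !inRange(end, 0, 1)) throw new RangeError("percentages must be in [0, 1]");
  if (start > end) throw new RangeError("start must not exceed end"); // assumed sanity check
  return {
    path,
    control_image_url: controlImageUrl,
    conditioning_scale: conditioningScale,
    start_percentage: start,
    end_percentage: end,
  };
}
```

The returned object would then be passed as the `controlnet` field of the request input.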

LoRAs

Optional list of LoRA weights to apply. Each entry references a LoRA path and an optional scale.
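The list shape described above could be produced with a sketch like the following. `buildLoras` is a hypothetical helper, and the paths in the usage line are placeholders rather than real repositories.

```js
// Hypothetical helper: normalize LoRA entries into { path, scale? } objects.
// A bare string is treated as a path with no explicit scale.
function buildLoras(entries) {
  return entries.map((entry) => {
    const { path, scale } = typeof entry === "string" ? { path: entry } : entry;
    if (!path) throw new Error("each LoRA entry needs a path");
    return scale === undefined ? { path } : { path, scale };
  });
}

// Placeholder paths for illustration only.
console.log(buildLoras(["your-org/your-style-lora", { path: "your-org/your-subject-lora", scale: 0.8 }]));
```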

IP-Adapter

Path

URL or HuggingFace path to the IP-Adapter weights.

Scale

How strongly the IP-Adapter conditioning is applied.

Image URL

URL of the reference image used for IP-Adapter conditioning.

Subfolder

Optional subfolder within the IP-Adapter repository.

Weight Name

Optional specific weight file name within the IP-Adapter repository.

Mask Image URL

Optional URL of a mask image restricting where IP-Adapter conditioning applies.

Mask Threshold

Min: 0.01 - Max: 0.99

Threshold used to binarize the IP-Adapter mask.

Image Encoder Path

URL or HuggingFace path to the image encoder.

Image Encoder Subfolder

Optional subfolder within the image encoder repository.

Image Encoder Weight Name

Optional specific weight file name within the image encoder repository.

Optional IP-Adapter image prompting. Provide an adapter path and a reference image to condition generation on that image.
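A minimal sketch of assembling the required IP-Adapter fields plus the optional mask. `buildIpAdapter` is a hypothetical helper. `image_url` appears on this page; `path`, `scale`, `mask_image_url`, and `mask_threshold` are assumed key names derived from the field labels. The remaining optional fields (subfolder, weight name, image encoder settings) are left out for brevity.

```js
// Hypothetical helper: build an IP-Adapter settings object.
// Key names other than `image_url` are assumed from the field labels.
function buildIpAdapter({ path, imageUrl, scale, maskImageUrl, maskThreshold }) {
  if (!path || !imageUrl) throw new Error("path and imageUrl are required");
  const adapter = { path, image_url: imageUrl };
  if (scale !== undefined) adapter.scale = scale;
  if (maskImageUrl !== undefined) adapter.mask_image_url = maskImageUrl;
  if (maskThreshold !== undefined) {
    // Range taken from the Mask Threshold field above.
    if (typeof maskThreshold !== "number" || maskThreshold < 0.01 || maskThreshold > 0.99) {
      throw new RangeError("mask threshold must be in [0.01, 0.99]");
    }
    adapter.mask_threshold = maskThreshold;
  }
  return adapter;
}
```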


Output

{
  "error": "",
  "inferenceTime": 4879,
  "output": [
    "https://media.modelrunner.ai/jM1PZPPq6rMSJOuzY0QrM.jpeg"
  ],
  "input": {
    "loras": [],
    "prompt": "an ornate Victorian greenhouse filled with tropical plants and butterflies, afternoon sunlight streaming through iron and glass panels, warm amber tones, ultra-detailed digital painting",
    "image_size": "landscape_4_3",
    "output_format": "jpeg",
    "guidance_scale": 3.5,
    "negative_prompt": "",
    "num_inference_steps": 28,
    "enable_safety_checker": true
  },
  "logs": "Generated 1 output(s)"
}

Generated in 4.879 seconds
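A response shaped like the JSON above can be summarized with a sketch such as this one. `summarizeResult` is a hypothetical helper. It treats `inferenceTime` as milliseconds, which matches the 4.879 seconds reported above, and treats a non-empty `error` string as a failure, which is an assumption about the error convention.

```js
// Hypothetical helper: pull the image URLs and timing out of a response object.
function summarizeResult(result) {
  // Assumption: a non-empty `error` string signals a failed generation.
  if (result.error) throw new Error(`generation failed: ${result.error}`);
  return {
    urls: result.output ?? [],
    seconds: result.inferenceTime / 1000, // inferenceTime appears to be in milliseconds
  };
}

const sample = {
  error: "",
  inferenceTime: 4879,
  output: ["https://media.modelrunner.ai/jM1PZPPq6rMSJOuzY0QrM.jpeg"],
};
console.log(summarizeResult(sample).seconds); // 4.879
```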


Model Details

Stable Diffusion 3.5 Large turns a text prompt into a high-quality image. It is Stability AI's flagship 8-billion-parameter Multimodal Diffusion Transformer (MMDiT), built for strong prompt adherence, accurate typography, and a wide range of visual styles from photorealism to illustration and 3D renders. Give it a descriptive prompt and pick an image size; it returns one or more generated images. Its standout strengths are complex-prompt understanding and stylistic diversity, making it a strong general-purpose text-to-image default.

## Best for

- Photorealistic scenes and portraits from a detailed text description
- Rendering legible text, signage, and typography inside an image
- Concept art, illustration, and stylized renders across many aesthetics
- Marketing visuals, product mockups, and social imagery from a prompt
- Complex multi-subject compositions that need faithful prompt adherence

## Choose another model when

- You want to edit, restyle, or inpaint an existing image rather than generate from scratch: use an image-editing model
- You need to enlarge or add detail to an existing image: use an upscaling model
- You need a video or animation: use a text-to-video model

## Tips

- Be specific about subject, setting, lighting, and style; SD 3.5 Large rewards detailed prompts
- Raise `guidance_scale` (default 3.5) toward 5-7 for tighter prompt adherence; lower it for more creative latitude
- Use `negative_prompt` to exclude unwanted elements (e.g. "blurry, extra fingers, watermark")
- Set `image_size` to a preset ("square_hd", "portrait_16_9", "landscape_4_3", …) or pass a custom `{ width, height }` object
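The tips above can be combined into a single input object, as in the sketch below. `buildInput` is a hypothetical helper. The ranges come from the input form (`guidance_scale` 0 to 20, `num_inference_steps` 1 to 50); the defaults for steps and image size are copied from the sample request rather than documented defaults, and the `seed` key name is assumed from the field label.

```js
// Hypothetical helper: assemble an input object and check the documented ranges.
function buildInput({ prompt, negativePrompt = "", guidanceScale = 3.5, steps = 28, imageSize = "landscape_4_3", seed }) {
  if (typeof prompt !== "string" || prompt.trim() === "") throw new Error("prompt is required");
  if (!(guidanceScale >= 0 && guidanceScale <= 20)) throw new RangeError("guidance_scale must be in [0, 20]");
  if (!Number.isInteger(steps) || steps < 1 || steps > 50) {
    throw new RangeError("num_inference_steps must be an integer in [1, 50]");
  }
  const input = {
    prompt,
    negative_prompt: negativePrompt,
    guidance_scale: guidanceScale,
    num_inference_steps: steps,
    image_size: imageSize,
  };
  if (seed !== undefined) input.seed = seed; // assumed key name
  return input;
}

// Tighter prompt adherence plus a negative prompt, as suggested in the tips.
console.log(buildInput({
  prompt: "a neon sign reading OPEN in a rainy alley, photorealistic",
  negativePrompt: "blurry, extra fingers, watermark",
  guidanceScale: 6,
}));
```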

## Advanced Configuration

- `controlnet`: condition generation on a control image (structure/pose/edges) via a ControlNet model path plus a `control_image_url`. API-only.
- `ip_adapter`: image-prompt the generation from a reference image (`image_url`) via an IP-Adapter path, optionally masked. API-only.
- `loras`: apply one or more LoRA weights (by `path`, with an optional `scale`) for custom styles or subjects. API-only.

To run via the ModelRunner JavaScript client:

```js
import { modelrunner } from "@modelrunner/client";

const result = await modelrunner.subscribe("stability-ai/stable-diffusion-v3.5-large", {
  input: {
    prompt: "a serene mountain lake at golden hour, mist over the water, photorealistic",
    image_size: "landscape_4_3",
    num_inference_steps: 28,
    guidance_scale: 3.5,
  },
});
```
