
stability-ai / stable-diffusion-v3.5-large

Generate high-quality images from a text prompt with strong prompt adherence, accurate typography, and diverse visual styles.

0.065 per megapixel of image
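As a rough illustration of the per-megapixel rate above, the sketch below estimates the charge for a request. It assumes one megapixel means 1,000,000 pixels and that billing scales linearly with no rounding; this page confirms neither. `estimateCost` is a hypothetical helper, not part of any client library.

```js
// Rate quoted on this page, per megapixel (the currency unit is not stated on the page).
const RATE_PER_MEGAPIXEL = 0.065;

// Hypothetical helper: estimate the charge for `numImages` images of width x height pixels.
// Assumes 1 megapixel = 1,000,000 pixels and no rounding by the billing system.
function estimateCost(width, height, numImages = 1) {
  const megapixels = (width * height) / 1e6;
  return megapixels * RATE_PER_MEGAPIXEL * numImages;
}

console.log(estimateCost(1024, 1024).toFixed(4)); // prints "0.0682"
```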

Model Input


The text prompt describing the image to generate.

The size of the generated image. Choose a preset (e.g. 'square_hd', 'portrait_16_9') or pass a custom {width, height} object.
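The two accepted shapes for the image size can be checked before a request is sent. This is a minimal sketch: `normalizeImageSize` is a hypothetical helper, the full list of preset names is not given on this page (so any non-empty string is passed through), and requiring positive integer dimensions is an assumption.

```js
// Hypothetical helper: validate an image size before placing it in the request input.
// Accepts a preset name (string) or a custom { width, height } object.
function normalizeImageSize(size) {
  if (typeof size === "string" && size.length > 0) {
    return size; // preset name, e.g. "square_hd" or "portrait_16_9"
  }
  if (
    size &&
    Number.isInteger(size.width) &&
    Number.isInteger(size.height) &&
    size.width > 0 &&
    size.height > 0
  ) {
    // Copy only the two expected fields so stray properties are not sent along.
    return { width: size.width, height: size.height };
  }
  throw new TypeError("image size must be a preset name or a { width, height } object");
}

const preset = normalizeImageSize("square_hd");
const custom = normalizeImageSize({ width: 1280, height: 720 });
```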

The number of inference steps to perform (min 1, max 50). More steps can improve detail at the cost of speed.

Additional Settings

Customize your input with more control.

The CFG (classifier-free guidance) scale (min 0, max 20). Higher values increase adherence to the prompt.

Describe what you do NOT want to appear in the image.

The same seed and the same prompt given to the same version of the model will output the same image every time.

The format of the generated image.

The safety checker can only be disabled through an API call.

URL or HuggingFace path to the ControlNet model.

Fraction of the denoising process at which ControlNet conditioning ends (min 0, max 1).

Fraction of the denoising process at which ControlNet conditioning starts (min 0, max 1).

URL of the control image that guides the generation.

How strongly the ControlNet conditioning is applied (min 0, max 2).

Optional ControlNet conditioning. Provide a control model path and a control image to guide structure/composition.
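To make the ControlNet start and end fractions concrete, the sketch below maps them onto step indices. The mapping follows one common convention (step i of N is conditioned when i/N is at or after the start fraction and (i+1)/N is at or before the end fraction). How this endpoint actually rounds is not stated on this page, so treat the rule as an assumption. `controlnetActiveSteps` is a hypothetical helper.

```js
// Hypothetical helper: list the step indices at which ControlNet conditioning is applied.
// Assumed rule: step i (0-based) is active when i/N >= start and (i+1)/N <= end.
function controlnetActiveSteps(numSteps, start, end) {
  if (!(start >= 0 && end <= 1 && start <= end)) {
    throw new RangeError("expected 0 <= start <= end <= 1");
  }
  const active = [];
  for (let i = 0; i < numSteps; i++) {
    if (i / numSteps >= start && (i + 1) / numSteps <= end) {
      active.push(i);
    }
  }
  return active;
}

// With 10 steps, conditioning from 0.2 to 0.8 covers steps 2 through 7.
const activeSteps = controlnetActiveSteps(10, 0.2, 0.8);
```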

Optional list of LoRA weights to apply. Each entry references a LoRA path and an optional scale.

URL or HuggingFace path to the IP-Adapter weights.

How strongly the IP-Adapter conditioning is applied.

URL of the reference image used for IP-Adapter conditioning.

Optional subfolder within the IP-Adapter repository.

Optional specific weight file name within the IP-Adapter repository.

Optional URL of a mask image restricting where IP-Adapter conditioning applies.

Threshold used to binarize the IP-Adapter mask (min 0.01, max 0.99).
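Binarizing a mask with a threshold can be pictured as follows. This is an illustration only: it assumes mask values are normalized to the range 0 to 1 and that values at or above the threshold count as "inside" the mask; the service's exact comparison is not documented here. `binarizeMask` is a hypothetical helper.

```js
// Hypothetical illustration: turn grayscale mask values in [0, 1] into 0/1 values.
// Assumes values at or above the threshold become 1 and all others become 0.
function binarizeMask(values, threshold) {
  if (!(threshold >= 0.01 && threshold <= 0.99)) {
    throw new RangeError("threshold must be between 0.01 and 0.99");
  }
  return values.map((v) => (v >= threshold ? 1 : 0));
}

// A higher threshold keeps only the brightest mask regions.
const looseMask = binarizeMask([0.1, 0.5, 0.9], 0.3); // 0, 1, 1
const strictMask = binarizeMask([0.1, 0.5, 0.9], 0.7); // 0, 0, 1
```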

URL or HuggingFace path to the image encoder.

Optional subfolder within the image encoder repository.

Optional specific weight file name within the image encoder repository.

Optional IP-Adapter image prompting. Provide an adapter path and a reference image to condition generation on that image.




Model Details


Stable Diffusion 3.5 Large turns a text prompt into a high-quality image. It is Stability AI's flagship 8-billion-parameter Multimodal Diffusion Transformer (MMDiT), built for strong prompt adherence, accurate typography, and a wide range of visual styles from photorealism to illustration and 3D renders. Give it a descriptive prompt and pick an image size; it returns one or more generated images. Its standout strengths are complex-prompt understanding and stylistic diversity, making it a strong general-purpose text-to-image default.

## Best for

- Photorealistic scenes and portraits from a detailed text description
- Rendering legible text, signage, and typography inside an image
- Concept art, illustration, and stylized renders across many aesthetics
- Marketing visuals, product mockups, and social imagery from a prompt
- Complex multi-subject compositions that need faithful prompt adherence

## Choose another model when

- You want to edit, restyle, or inpaint an existing image rather than generate from scratch: use an image-editing model
- You need to enlarge or add detail to an existing image: use an upscaling model
- You need a video or animation: use a text-to-video model

## Tips

- Be specific about subject, setting, lighting, and style; SD 3.5 Large rewards detailed prompts
- Raise `guidance_scale` (default 3.5) toward 5-7 for tighter prompt adherence; lower it for more creative latitude
- Use `negative_prompt` to exclude unwanted elements (e.g. "blurry, extra fingers, watermark")
- Set `image_size` to a preset ("square_hd", "portrait_16_9", "landscape_4_3", …) or pass a custom `{ width, height }` object
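The tips above can be sketched as a small input builder. `buildInput` and its `tight` option are hypothetical, and the value 6 is simply one point inside the suggested 5-7 range, not a recommended setting.

```js
// Hypothetical helper: assemble a request input following the tips above.
function buildInput(prompt, { tight = false, negativePrompt, imageSize = "square_hd" } = {}) {
  const input = {
    prompt,
    image_size: imageSize,
    // 3.5 is the documented default; 6 is an arbitrary pick from the 5-7 range.
    guidance_scale: tight ? 6 : 3.5,
  };
  // Only send negative_prompt when the caller actually supplied one.
  if (negativePrompt) {
    input.negative_prompt = negativePrompt;
  }
  return input;
}

const input = buildInput(
  "a neon diner sign reading OPEN LATE, rainy street at night, 35mm photo",
  {
    tight: true,
    negativePrompt: "blurry, extra fingers, watermark",
    imageSize: "landscape_4_3",
  }
);
```

The resulting object is meant to be passed as the `input` field of a request.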

## Advanced Configuration

- `controlnet`: condition generation on a control image (structure/pose/edges) via a ControlNet model path plus a `control_image_url`. API-only.
- `ip_adapter`: image-prompt the generation from a reference image (`image_url`) via an IP-Adapter path, optionally masked. API-only.
- `loras`: apply one or more LoRA weights (by `path`, with an optional `scale`) for custom styles or subjects. API-only.
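A sketch of how these options might look inside a request input. The field names `control_image_url`, `image_url`, and the LoRA `path` and `scale` come from the descriptions above; using `path` as the field name for the ControlNet and IP-Adapter model paths is an assumption, and every URL and repository name is a placeholder. The three options appear together only for brevity; whether they can be combined in one request is not stated here.

```js
// Illustrative input object only; all URLs and repository paths are placeholders.
const advancedInput = {
  prompt: "a dancer mid-leap on a rooftop at dusk, cinematic lighting",
  loras: [
    { path: "https://example.com/my-style-lora.safetensors", scale: 0.8 },
    { path: "your-org/your-lora-repo" }, // scale omitted because it is optional
  ],
  controlnet: {
    path: "your-org/your-controlnet-repo", // assumed field name for the model path
    control_image_url: "https://example.com/pose.png",
  },
  ip_adapter: {
    path: "your-org/your-ip-adapter-repo", // assumed field name for the adapter path
    image_url: "https://example.com/reference.png",
  },
};
```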

To run via the ModelRunner JavaScript client:

```js
import { modelrunner } from "@modelrunner/client";

const result = await modelrunner.subscribe("stability-ai/stable-diffusion-v3.5-large", {
  input: {
    prompt: "a serene mountain lake at golden hour, mist over the water, photorealistic",
    image_size: "landscape_4_3",
    num_inference_steps: 28,
    guidance_scale: 3.5,
  },
});
```