Model Details
Z-Image Turbo ControlNet generates images that follow the structure of a control image while matching a text prompt. Provide a reference image and choose how it conditions the result: by its edges (canny), depth, or human pose. The output keeps the composition, geometry, or figure layout of the source, while the prompt sets subject, style, and details. Built on the 6B Z-Image Turbo architecture, it runs in 1–8 inference steps for near-real-time turnaround and can return up to 4 images per request.

## Best for

- Redrawing a scene while keeping its layout: feed a photo and prompt a new subject or style with the same composition
- Edge-guided generation from a sketch or line drawing (canny) so the output traces your linework
- Depth-guided generation that preserves 3D structure and camera perspective from a reference
- Pose-controlled character or figure generation that copies a body pose from a reference photo
- Fast, cheap structural-conditioning iterations where you want many variations at low cost

## Choose another model when

- You want to transform an image by prompt strength alone with no structural map: use the z-image image-to-image variant
- You want a pure text-to-image render with no reference image to anchor to: use a text-to-image model
- You need to edit specific regions of an existing image with a mask: use an inpainting/edit model
- You need video output: use an image-to-video model

## Tips

- Set `preprocess` to match your control image: `canny` for line art / edges, `depth` for 3D structure, `pose` for figures, or `none` to condition on the raw image
- Tune `control_scale` (0–1, default 0.75) to trade prompt freedom against how tightly the output follows the control image; lower it if the result feels over-constrained
- Use `control_start` and `control_end` to apply conditioning only during part of the denoising process; ending early (e.g. 0.8) lets the model add prompt-driven detail late
- Leave `image_size` at `auto` to inherit the control image's aspect ratio, or pass a preset / custom `{width, height}`
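
The conditioning parameters from the tips can be combined in one input object. The sketch below is a minimal illustration, not part of the ModelRunner client: `buildControlInput` is a hypothetical helper, the URL is a placeholder, and the 0–1 range and the 0 / 1 defaults for `control_start` / `control_end` are assumptions inferred from the `0.8` example rather than documented values.

```js
// Hypothetical helper (not part of the ModelRunner client).
// Parameter names come from the tips; the 0–1 range and the
// 0 / 1 defaults for control_start / control_end are assumptions.
function buildControlInput({
  prompt,
  imageUrl,
  preprocess = "canny",
  controlScale = 0.75, // documented default
  controlStart = 0,    // assumed default
  controlEnd = 1,      // assumed default
  imageSize = "auto",
}) {
  const modes = ["canny", "depth", "pose", "none"];
  if (!modes.includes(preprocess)) {
    throw new RangeError(`preprocess must be one of: ${modes.join(", ")}`);
  }

  const inUnitRange = (x) => typeof x === "number" && x >= 0 && x <= 1;
  if (!inUnitRange(controlScale)) {
    throw new RangeError("control_scale must be between 0 and 1");
  }
  if (!inUnitRange(controlStart) || !inUnitRange(controlEnd) || controlStart >= controlEnd) {
    throw new RangeError("expected 0 <= control_start < control_end <= 1");
  }

  return {
    prompt,
    image_url: imageUrl,
    preprocess,
    control_scale: controlScale,
    control_start: controlStart,
    control_end: controlEnd,
    image_size: imageSize,
  };
}

// Looser depth guidance that stops at 80% of the denoising process,
// leaving the final steps to the prompt.
const input = buildControlInput({
  prompt: "A watercolor street scene at dusk",
  imageUrl: "https://example.com/reference.png", // placeholder URL
  preprocess: "depth",
  controlScale: 0.6,
  controlEnd: 0.8,
});
```

The returned object is shaped to be passed as the `input` field of a request. Validating ranges on the client side is a design choice that surfaces mistakes before a request is sent.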

## Advanced Configuration

- `enable_prompt_expansion` (default false) rewrites your prompt for richer detail; enabling it adds a small per-request surcharge.
- `acceleration` (`none` / `regular` / `high`) trades a little quality for speed.
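
As a rough sketch, the two advanced options can be layered onto an existing input object. `withAdvancedOptions` is a hypothetical helper, not part of the ModelRunner client. Because the default `acceleration` level is not stated here, the helper omits the field unless a value is passed explicitly.

```js
// Hypothetical helper (not part of the ModelRunner client).
// The allowed acceleration values are taken from the list above.
const ACCELERATION_LEVELS = ["none", "regular", "high"];

function withAdvancedOptions(input, { promptExpansion = false, acceleration } = {}) {
  // Copy the input so the caller's object is not mutated.
  const out = { ...input, enable_prompt_expansion: promptExpansion };

  // Only send acceleration when the caller chose a level explicitly.
  if (acceleration !== undefined) {
    if (!ACCELERATION_LEVELS.includes(acceleration)) {
      throw new RangeError(
        `acceleration must be one of: ${ACCELERATION_LEVELS.join(", ")}`
      );
    }
    out.acceleration = acceleration;
  }
  return out;
}

// Fast draft pass: highest acceleration, no prompt expansion (so no surcharge).
const draft = withAdvancedOptions(
  { prompt: "A lighthouse in a storm", preprocess: "canny" },
  { acceleration: "high" }
);
```

Keeping `enable_prompt_expansion` off for draft iterations avoids the per-request surcharge; it can be switched on for a final render once the composition is settled.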

To run via the ModelRunner JavaScript client:

```js
import { modelrunner } from "@modelrunner/client";

const result = await modelrunner.subscribe("tongyi-mai/z-image/turbo/controlnet", {
  input: {
    prompt: "A futuristic city skyline at night, neon lights",
    image_url: "https://media.modelrunner.ai/2ZBTR6fvTxz172zb027cJ.png",
    preprocess: "canny",
    control_scale: 0.75,
    image_size: "landscape_16_9",
  },
});
```




