The 75-token limit
Stable Diffusion's CLIP text encoder accepts up to 75 prompt tokens per pass (77 positions once the start and end markers are counted). An English word is roughly 1–2 tokens; a non-Latin character can take 2–3. Long prompts easily exceed this.
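The arithmetic behind the limit can be sketched as follows. This is only an illustration, not the encoder's actual code: `chunks_needed` is a hypothetical helper, and the token count is passed in by hand, since real counts come from the CLIP tokenizer.

```python
import math

USABLE_TOKENS = 75   # prompt tokens per encoder pass
CONTEXT_SIZE = 77    # 75 prompt tokens plus the start and end markers


def chunks_needed(token_count: int) -> int:
    """Number of encoder passes a prompt of `token_count` tokens needs."""
    # An empty prompt still takes one pass.
    return max(1, math.ceil(token_count / USABLE_TOKENS))


for n in (40, 75, 76, 200):
    passes = chunks_needed(n)
    print(f"{n} tokens -> {passes} pass(es), {passes * CONTEXT_SIZE} positions")
```

The jump from one pass to two happens at the 76th token, which is why a prompt that is only slightly too long behaves differently from one that just fits.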
How BREAK works
a young woman, long black hair, BREAK, in a misty forest, soft morning light, BREAK, photorealistic, sharp focus
BREAK splits the prompt into segments. Each segment is encoded independently, in its own chunk of at most 75 tokens, and the resulting embeddings are concatenated before they reach the image model. There is no hard cap on the number of segments you can chain.
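A minimal sketch of the chunking step, under loud assumptions: `toy_tokenize` is a stand-in that treats words and punctuation marks as tokens (the real CLIP tokenizer uses byte-pair encoding and gives different counts), `<pad>` is a placeholder padding token, and `split_on_break` is a hypothetical helper rather than any UI's actual implementation. It shows only how each BREAK closes the current chunk early and pads it out to full length.

```python
import re

CHUNK_SIZE = 75  # usable tokens per chunk


def toy_tokenize(text: str) -> list[str]:
    """Stand-in tokenizer (assumption): one token per word or punctuation mark."""
    return re.findall(r"\w+|[^\w\s]", text)


def split_on_break(prompt: str) -> list[list[str]]:
    """Split a prompt at BREAK and pad every segment to whole chunks."""
    chunks = []
    for segment in re.split(r"\s*\bBREAK\b\s*", prompt):
        tokens = toy_tokenize(segment)
        # A segment longer than 75 tokens spills into further chunks.
        for start in range(0, max(len(tokens), 1), CHUNK_SIZE):
            piece = tokens[start:start + CHUNK_SIZE]
            piece += ["<pad>"] * (CHUNK_SIZE - len(piece))
            chunks.append(piece)
    return chunks


prompt = ("a young woman, long black hair, BREAK, in a misty forest, "
          "soft morning light, BREAK, photorealistic, sharp focus")
chunks = split_on_break(prompt)
print(len(chunks), "chunks,", sum(len(c) for c in chunks), "token positions")
```

With the example prompt above, the three short segments become three full-length chunks, so the conditioning grows with the number of BREAKs rather than with the number of words.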
Typical uses
- Subject block + BREAK + scene block + BREAK + style block
- Stop the subject from being diluted by the scene description
- Multi-character: one BREAK segment per character
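The block layout in the list above can be assembled programmatically. The helper below is a hypothetical convenience, not part of any Stable Diffusion tool; it assumes the blocks are plain strings and simply joins them with the BREAK keyword.

```python
def build_prompt(*blocks: str) -> str:
    """Join non-empty prompt blocks with the BREAK keyword."""
    # Trim stray whitespace and commas so no block starts or ends with one.
    cleaned = (block.strip(" ,\n\t") for block in blocks)
    return " BREAK ".join(block for block in cleaned if block)


subject = "a young woman, long black hair"
scene = "in a misty forest, soft morning light"
style = "photorealistic, sharp focus"
print(build_prompt(subject, scene, style))
```

Keeping the subject, scene, and style as separate variables makes it easy to swap one block without touching the others.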
BREAK is not magic
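On its own, BREAK does not assign segments to image regions; that takes a regional-prompting extension. It only controls how the prompt is chunked before text encoding.
Prompts with 4–5 segments often drift away from the intended result. Stay at 2–3.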
SDXL note
SDXL uses two text encoders, each with the same 77-position window (75 usable tokens plus the start and end markers). BREAK still helps, but the benefit is smaller than in SD 1.5. Flux does not need BREAK; its natural-language handling is much better.