Why text-only character consistency rarely works
Write "a young woman with long black hair" and you get a different person every time. Image models generate from the prompt embedding plus random noise, so text alone can lock down a category of people but never a specific person. "Character consistency" is therefore really a search problem: narrow the possibility space until the outputs stay nearly identical across runs.
There are five tools, weakest to strongest: identity anchors (age, profession, personality) → appearance anchors (precise hair / skin / eyes / mole) → fixed wardrobe (one outfit + accessories) → signature props (always carries the same brass pocket watch) → LoRA or reference images (the strongest). Stack them and consistency rises.
5 lock-in techniques
1. Identity anchors
a 28-year-old female violinist named "Mei", calm and introverted, often slightly slouched posture
Give the character a specific role and temperament. "Named Mei" does no semantic work, but it stabilizes how you refer to her and seeds a personality.
2. Appearance anchors
shoulder-length wavy black hair with a faint copper highlight, fair olive skin, narrow almond eyes, small mole on the right cheek under the eye
Lock at least five points: hairstyle + hair color + skin + eye shape + one distinguishing mark (mole / freckle / birthmark).
3. Fixed wardrobe
always wears a charcoal wool coat over a cream turtleneck, dark navy trousers, a vintage leather satchel
One outfit reused across every shot is more recognizable than changing wardrobe. Readers come to equate the outfit with the character.
4. Signature props
always carries a brass pocket watch on a thin chain
A specific recurring object lifts recognizability another notch and is easy to keep verbatim.
5. LoRA or reference images (strongest)
Train a character LoRA, or use Midjourney's --cref / --sref parameters. A LoRA holds up across many shots; --cref is well suited to a mini-series of 3–5 shots.
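The four text anchors are easier to keep stable if they are stored once and always rendered in the same order. A minimal Python sketch of that idea follows; the `CharacterBlock` class and its field names are assumptions made for illustration, not part of Midjourney or any LoRA tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CharacterBlock:
    """Hypothetical container for the four text anchors (techniques 1-4)."""
    identity: str
    appearance: str
    wardrobe: str
    prop: str

    def render(self) -> str:
        # Join the anchors in a fixed order so the block is identical on every call.
        parts = [self.identity, self.appearance, self.wardrobe, self.prop]
        return ", ".join(p.strip() for p in parts if p.strip())


MEI = CharacterBlock(
    identity='a 28-year-old female violinist named "Mei", calm and introverted',
    appearance=(
        "shoulder-length wavy black hair with a faint copper highlight, "
        "fair olive skin, narrow almond eyes, small mole on the right cheek"
    ),
    wardrobe="always wears a charcoal wool coat over a cream turtleneck",
    prop="always carries a brass pocket watch on a thin chain",
)

print(MEI.render())
```

Because the dataclass is frozen, the anchors cannot be edited by accident between shots; changing the character means creating a new block on purpose.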
Midjourney --cref and --sref
| Parameter | Purpose | Typical use |
|---|---|---|
| --cref [image URL] | Reference an image as the character | Fix face and identity |
| --cw 0–100 | Character weight | 0 = face only; 100 (default) = face + hair + outfit |
| --sref [image URL] | Reference an image as the style | Lock the look / palette |
| --sw 0–1000 | Style weight | 100–300 is the comfortable range |
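The ranges in the table can be checked before a prompt is sent. The sketch below builds the trailing flag string and rejects out-of-range weights; `mj_reference_flags` is a hypothetical helper name, and the default values (`cw=60`, `sw=100`) are assumptions picked from the ranges discussed in this article.

```python
def mj_reference_flags(cref=None, cw=60, sref=None, sw=100):
    """Build the trailing Midjourney reference flags (hypothetical helper)."""
    flags = []
    if cref:
        # --cw is only meaningful together with a character reference.
        if not 0 <= cw <= 100:
            raise ValueError("--cw must be between 0 and 100")
        flags += ["--cref", cref, "--cw", str(cw)]
    if sref:
        # --sw is only meaningful together with a style reference.
        if not 0 <= sw <= 1000:
            raise ValueError("--sw must be between 0 and 1000")
        flags += ["--sref", sref, "--sw", str(sw)]
    return " ".join(flags)


print(mj_reference_flags(cref="https://example.com/mei-ref.jpg", cw=60))
```

Keeping the flags in one function means every shot in a series gets the same reference image and the same weights unless you change them deliberately.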
Wrong vs. right examples
✗ Wrong
a beautiful young woman with long hair
"Beautiful + young + long hair" describes hundreds of millions of training images. Ten generations give ten different people.
✓ Right
a 28-year-old female violinist named "Mei", shoulder-length wavy black hair with a faint copper highlight, fair olive skin, narrow almond eyes, small mole on the right cheek, always wears a charcoal wool coat over a cream turtleneck, carries a brass pocket watch --cref https://example.com/mei-ref.jpg --cw 60
Identity + five appearance anchors + outfit + signature prop + --cref reference. As a rough rule of thumb, recognizability lands in the 70–85% range.
5 real samples
{fixed character block} walks across a stone bridge in light snow, golden hour backlight, medium shot --cref [url] --cw 60 --ar 3:4
Reuse the character block verbatim, change only the scene. Same reference image, same --cw.
{fixed character block} sits at a wooden cafe table reading a paperback book, soft afternoon window light --cref [url] --cw 60 --ar 3:4
Identical block + same reference. Only the environment changes.
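The "same block, same flags, new scene" pattern from the two samples above can be generated rather than retyped, which removes the chance of accidental rewording. A small sketch, assuming a hypothetical `build_series` helper; the placeholder strings are kept exactly as they appear in the samples.

```python
# Placeholders as used in the samples; paste the real block and URL here.
CHARACTER_BLOCK = "{fixed character block}"
REFERENCE_FLAGS = "--cref [url] --cw 60 --ar 3:4"

SCENES = [
    "walks across a stone bridge in light snow, golden hour backlight, medium shot",
    "sits at a wooden cafe table reading a paperback book, soft afternoon window light",
]


def build_series(block, scenes, flags):
    """Return one prompt per scene; only the scene text varies."""
    return [f"{block} {scene} {flags}" for scene in scenes]


for prompt in build_series(CHARACTER_BLOCK, SCENES, REFERENCE_FLAGS):
    print(prompt)
```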
<lora:meiv1:0.85> a young female violinist Mei, charcoal wool coat over cream turtleneck, standing in a sunlit pine forest, soft backlight, photorealistic portrait, shallow depth of field
Use the trained LoRA at weight 0.7–0.9; below about 0.6 the likeness usually becomes too weak to hold.
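The weight guidance can be encoded as a small guard when assembling the `<lora:name:weight>` tag used in the sample above. This is a sketch only: `lora_tag` is a hypothetical helper, and the 0.7–0.9 bounds are this article's suggestion rather than a limit enforced by any tool.

```python
def lora_tag(name, weight=0.85, low=0.7, high=0.9):
    """Return (tag, warning); warning is None when weight is inside [low, high]."""
    if weight <= 0:
        raise ValueError("weight must be positive")
    tag = f"<lora:{name}:{weight:g}>"
    warning = None
    if not low <= weight <= high:
        # Outside the suggested band the character may fade or overpower the scene.
        warning = f"weight {weight:g} is outside the suggested {low:g}-{high:g} range"
    return tag, warning


tag, warning = lora_tag("meiv1", 0.85)
print(tag, warning)
```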
{fixed character block}, panel of a manga page, three-quarter view, looking down at the pocket watch, soft ink wash style --cref [url] --cw 80 --niji 6
For panel-to-panel comics, push --cw to 70–85 to suppress drift.
a 32-year-old male detective with a deep scar over his left eyebrow, always wears a black trench coat and gray fedora, smoking a thin cigar, neo-noir aesthetic
Without a LoRA or --cref, a strong identifying prop (scar / hat / cigar) lifts recognizability into the 50–60% band.
5 common pitfalls
The fixed block has to be reused verbatim. Even a small rewording can turn "her" into someone else.
With --cw left at 100, the new image copies the reference's outfit, and often its pose, along with the face. Use 40–70 for most shots.
"Named Mei" alone teaches the model nothing. Pair it with five appearance anchors.
"Black backpack" does nothing. Use "brass pocket watch on a thin chain"; specific objects work.
A series must stay on the same model and version. Swap the base model (which can also break LoRA compatibility) or change Midjourney major versions, and the character drifts.
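Several of these pitfalls can be caught mechanically before a batch is submitted. The sketch below checks a series of Midjourney-style prompts for a reworded block, mixed reference images, and mixed version flags. `lint_series` and `find_flag` are hypothetical helpers, the check only recognizes the `--v` and `--niji` spellings, and it cannot judge whether the anchors themselves are specific enough.

```python
import re


def find_flag(prompt, pattern):
    """Return the first match of a flag pattern in the prompt, or None."""
    match = re.search(pattern, prompt)
    return match.group(0) if match else None


def lint_series(block, prompts):
    """Return a list of warnings for a series of prompts sharing one character."""
    warnings = []
    for i, prompt in enumerate(prompts, start=1):
        # The character block must appear verbatim in every prompt.
        if block not in prompt:
            warnings.append(f"prompt {i}: character block missing or reworded")
    # Every shot should use the same reference image (or none at all).
    crefs = {find_flag(p, r"--cref\s+\S+") for p in prompts}
    if len(crefs) > 1:
        warnings.append("series mixes different --cref settings")
    # Mixing versions, or explicit and default versions, invites drift.
    versions = {find_flag(p, r"--(?:v|niji)\s+\S+") for p in prompts}
    if len(versions) > 1:
        warnings.append("series mixes model versions")
    return warnings


block = "a violinist named Mei, brass pocket watch on a thin chain"
series = [
    f"{block}, on a stone bridge --cref URL --cw 60 --v 6",
    f"{block}, in a cafe --cref URL --cw 60 --v 6",
]
print(lint_series(block, series))
```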