Most prompt engineering guides focus on getting ChatGPT to write better emails. This isn’t that.
This is about making AI characters that feel like real people — characters that remember what happened yesterday, whose mood shifts based on your relationship, and whose speech patterns stay consistent across thousands of messages.
Everything here comes from building Suzune, our production roleplay bot. These aren’t theories — they’re patterns we use every day.
The Core Problem: Characters That Don’t Feel Like Characters
If you’ve tried roleplay with vanilla ChatGPT or Claude, you’ve experienced this:
- The character “breaks” mid-conversation and starts talking like a helpful assistant
- Speech patterns drift — a tsundere suddenly becomes polite and accommodating
- The AI forgets what happened 10 messages ago
- Every response starts the same way
The root cause? A static system prompt isn’t enough. You need a dynamic prompt architecture — one that evolves with every message.
Layer 1: The Character Persona
The foundation is a character definition file. In Suzune, we use YAML + Markdown:
```yaml
name: sakura
display_name: Sakura
system_prompt: persona.md  # loaded from file
example_dialogue: |
  User: How was your day?
  Sakura: *leans back in her chair and stretches* Ugh, don't even
  ask. My editor called THREE times about the deadline. *sighs*
  ...I guess it wasn't all bad though. I found this amazing café
  near the station. The matcha latte was actually decent.
```
The persona.md file contains the character’s core identity. Here’s what matters:
What to Include
- Background & Motivation — Who is this character? What do they want? What are they afraid of?
- Speech Patterns — Specific mannerisms, verbal tics, vocabulary preferences
- Emotional Range — How do they express happiness vs. anger vs. vulnerability?
- Relationship Dynamics — How do they treat strangers vs. close friends vs. romantic partners?
What to Avoid
- Don’t write a novel. Keep the persona under 800 tokens. LLMs pay less attention to the middle of long prompts.
- Don’t say “you are kind and caring.” Instead, show it: “When someone is upset, she tries to make them laugh — often with terrible puns.”
- Don’t list personality traits. Describe behaviors. Traits are abstract; behaviors are actionable.
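To make the shape concrete, here is a minimal sketch of how a definition like the YAML above might be turned into a base prompt. A plain dict stands in for the parsed YAML, and `build_base_prompt` is an illustrative helper, not Suzune's actual loader; in production you would parse the file with a YAML library and read persona.md from disk.

```python
# Sketch: combining a parsed character definition into the base system
# prompt. A plain dict stands in for the parsed YAML file here.

def build_base_prompt(character: dict) -> str:
    """Combine persona text and example dialogue into one prompt block."""
    sections = [character["persona"]]
    if character.get("example_dialogue"):
        sections.append(
            "## Example dialogue\n" + character["example_dialogue"].strip()
        )
    return "\n\n".join(sections)

sakura = {
    "name": "sakura",
    "persona": "Sakura is a sharp-tongued magazine editor with a soft spot for cafés.",
    "example_dialogue": "User: How was your day?\nSakura: *stretches* Ugh, don't ask.",
}

prompt = build_base_prompt(sakura)
```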
The Example Dialogue Trick
This is the single most impactful technique for voice consistency: include 3–5 example exchanges that demonstrate exactly how the character talks.
LLMs are pattern-matching machines. If you show them a character who uses *action asterisks*, speaks in short sentences, and drops casual slang — the model will mimic that pattern far more reliably than if you just wrote “she speaks casually.”
```text
User: What do you think about the new project?
Sakura: *taps her pen on the desk* Honestly? It's a mess.
The specs are vague, the timeline is insane, and I'm
pretty sure the client doesn't actually know what they want.
*pauses* ...But the tech stack is interesting. I wouldn't
mind getting my hands on it. *small grin*
```
Notice:
- Action descriptions in *asterisks*
- Internal monologue through pauses
- Mix of complaint and genuine interest (character complexity)
- Natural speech rhythm with interruptions
Layer 2: Dynamic Context Injection
Here’s where it gets interesting. A static prompt produces static behavior. For characters that feel alive, you need context that changes every message.
The System Prompt Assembly Pipeline
In Suzune, the system prompt is rebuilt from scratch on every message from ~20 data sources:
```text
┌─────────────────────────────────────────────┐
│ SYSTEM PROMPT (assembled)                   │
├─────────────────────────────────────────────┤
│ 1. Character persona (persona.md)           │
│ 2. Behavior rules (rules.md)                │
│ 3. Example dialogue                         │
│ 4. World info (lorebook entries)            │
│ 5. Relationship stage guidance              │
│ 6. Character's diary/memory (memo.md)       │
│ 7. Compressed chat history (summary)        │
│ 8. Emotion/affection scores                 │
│ 9. Current outfit                           │
│10. Current date & time                      │
│11. Tone reminder (recency bias fix)         │
│12. Story direction hints                    │
└─────────────────────────────────────────────┘
```
Why rebuild every time? Because the character’s context changes:
- It’s 11 PM → the character should be sleepy
- Affection score just crossed a threshold → unlock new behaviors
- A lorebook keyword was mentioned → inject relevant world info
- The character changed outfits earlier → reference the new outfit
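The rebuild step can be sketched as a single function of the current state. Section names mirror the diagram above, but the function name and state keys are illustrative, not Suzune's actual API; the point is that every section is recomputed per message, so the prompt changes whenever the state does.

```python
# Sketch: per-message system prompt assembly from mutable state.
from datetime import datetime

def assemble_system_prompt(state: dict) -> str:
    sections = [
        state["persona"],                         # 1. character persona
        state["rules"],                           # 2. behavior rules
        state.get("lore", ""),                    # 4. triggered world info (may be empty)
        f"Relationship stage: {state['stage']}",  # 5. stage guidance
        f"Current time: {datetime.now():%Y-%m-%d %H:%M}",  # 10. datetime
        state["tone_reminder"],                   # 11. recency-bias fix, kept last
    ]
    return "\n\n".join(s for s in sections if s)  # drop empty sections

prompt = assemble_system_prompt({
    "persona": "Sakura is a sharp-tongued magazine editor.",
    "rules": "Stay in character at all times.",
    "stage": "close",
    "tone_reminder": "Speak casually; use *action asterisks*.",
})
```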
Keyword-Triggered World Info (Lorebooks)
Borrowed from SillyTavern’s World Info concept, lorebooks inject context only when relevant keywords appear in recent messages:
```json
{
  "keys": ["café", "coffee", "matcha"],
  "content": "Sakura's favorite café is 'Tsuki no Shizuku' near the station. She orders matcha latte every time.",
  "sticky": 3,
  "priority": 5
}
```
When someone mentions “café,” this entry activates and stays in the prompt for 3 turns (sticky: 3). This means the character naturally “remembers” details about the café without it cluttering every single response.
You can also gate entries by relationship level — NSFW-related world info only activates after the relationship reaches a certain intimacy threshold.
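As a sketch, the activation logic might look like this. The entry format follows the JSON above, but the function name and sticky-counter bookkeeping are illustrative rather than Suzune's actual code:

```python
# Sketch: keyword-triggered lorebook activation with sticky counters.
# An active entry stays injected for `sticky` turns after its last trigger.

def activate_lorebooks(entries, recent_messages, active):
    """Return updated {entry_index: turns_left} after scanning recent text."""
    text = " ".join(recent_messages).lower()
    # Decrement existing sticky counters; drop entries that expired.
    active = {i: n - 1 for i, n in active.items() if n > 1}
    for i, entry in enumerate(entries):
        if any(key.lower() in text for key in entry["keys"]):
            active[i] = entry.get("sticky", 1)  # (re)arm for N turns
    return active

entries = [{
    "keys": ["café", "coffee", "matcha"],
    "content": "Sakura's favorite café is 'Tsuki no Shizuku' near the station.",
    "sticky": 3,
}]

active = activate_lorebooks(entries, ["Want to grab coffee later?"], {})
injected = "\n".join(entries[i]["content"] for i in active)
```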
Time Awareness
Every user message in Suzune is timestamped:
```text
[03/31 22:45] User: Still up?
```
The current datetime is also injected into the system prompt. This means the character knows:
- What time of day it is (and acts accordingly)
- How much time passed between messages
- What day of the week it is
A character who says “Good morning!” at 10 PM breaks immersion instantly. Time awareness prevents this.
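A minimal sketch of deriving that context might look like the following. The helper name, phrasing, and time-of-day thresholds are illustrative choices, not Suzune's exact values:

```python
# Sketch: building a time-awareness section for the system prompt.
from datetime import datetime

def time_context(now, last_message=None):
    # Map the hour to a coarse time-of-day label.
    period = ("night" if now.hour >= 22 or now.hour < 5 else
              "morning" if now.hour < 12 else
              "afternoon" if now.hour < 18 else "evening")
    lines = [f"Current time: {now:%m/%d %H:%M} ({now:%A}, {period})"]
    if last_message is not None:
        gap = now - last_message
        if gap.days >= 1:  # surface long silences so the character can react
            lines.append(f"It has been {gap.days} day(s) since the last message.")
    return "\n".join(lines)

ctx = time_context(datetime(2025, 3, 31, 22, 45),
                   last_message=datetime(2025, 3, 30, 9, 0))
```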
Layer 3: Tone Enforcement (Fighting Recency Bias)
Here’s a problem every RP bot developer hits: the character’s voice drifts over long conversations. After 30+ messages, a sarcastic character starts sounding generic.
Why? Recency bias. LLMs pay more attention to tokens near the end of the context. As the conversation grows, the character persona (at the beginning of the system prompt) gets “pushed out” of the model’s attention.
The Double-Injection Pattern
Our solution: inject the tone rules twice — once at the beginning (in the persona) and once at the very end of the system prompt:
```text
[System prompt start]
  ... character persona with tone rules ...
[50+ messages of conversation]
[System prompt end]
  ... ★ tone rules again ★ ...
```
We literally re-inject a section called “Absolute Speech Rules” as the final system message before the model generates. It’s redundant, but it works — voice consistency improves dramatically.
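In message-array terms, the pattern is simply appending a second system message after the conversation history. A minimal sketch (function name illustrative):

```python
# Sketch: double-injection of speech rules, once in the persona at the
# top and once as the final system message, where recency bias helps.

def build_messages(persona, speech_rules, history):
    messages = [{"role": "system", "content": persona + "\n\n" + speech_rules}]
    messages += history  # possibly 50+ user/assistant turns
    messages.append({"role": "system",
                     "content": "★ Absolute Speech Rules ★\n" + speech_rules})
    return messages

msgs = build_messages("You are Sakura, a sharp-tongued editor.",
                      "Short sentences. Casual slang. *action asterisks*.",
                      [{"role": "user", "content": "Hey, you busy?"}])
```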
Anti-Repetition Detection
Another common problem: the model starts every response with the same phrase. “Sakura sighs and…” over and over.
Suzune tracks the opening patterns of recent responses. If the same pattern appears 3+ times, a warning is injected:
```text
[System] Your last 3 responses started with similar patterns.
Vary your opening — start with dialogue, an action, a thought,
or a scene description.
```
Simple, but effective.
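One simplified way to implement the check is to compare the first few words of the last few responses and emit the warning when they match. This sketch uses exact prefix equality as a stand-in for whatever fuzzier pattern matching production code might do:

```python
# Sketch: detect repeated response openings and return an injectable warning.

def repetition_warning(responses, window=3, prefix_words=3):
    """Return a warning string if the last `window` responses open alike."""
    openings = [" ".join(r.split()[:prefix_words]).lower()
                for r in responses[-window:]]
    if len(openings) == window and len(set(openings)) == 1:
        return ("[System] Your last responses started with similar patterns. "
                "Vary your opening: start with dialogue, an action, "
                "a thought, or a scene description.")
    return None

warn = repetition_warning([
    "*Sakura sighs and looks up from her desk*",
    "*Sakura sighs and taps her pen*",
    "*Sakura sighs and shrugs*",
])
```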
Layer 4: The Relationship Arc
The most powerful technique in our system: the character’s behavior evolves based on relationship scores.
Affection Scoring
Suzune tracks 5 dimensions, each scored 1–10:
| Score | What It Represents |
|---|---|
| Trust | How safe the character feels with you |
| Affection | How much they like you |
| Respect | How much they admire you |
| Excitement | How stimulating they find interactions |
| Devotion | How dedicated they are to the relationship |
The character itself updates these scores during conversation — they’re part of the AI’s toolset, not calculated externally.
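A sketch of what that toolset entry could look like, using the common JSON-schema function-calling convention; the tool name, description, and clamping helper are illustrative, not Suzune's actual definitions:

```python
# Sketch: exposing relationship-score updates as a tool the model can call.

DIMENSIONS = ["trust", "affection", "respect", "excitement", "devotion"]

update_scores_tool = {
    "name": "update_relationship_scores",
    "description": "Adjust the character's feelings after this exchange.",
    "parameters": {
        "type": "object",
        "properties": {d: {"type": "integer", "minimum": 1, "maximum": 10}
                       for d in DIMENSIONS},
    },
}

def apply_update(scores: dict, update: dict) -> dict:
    """Merge a model-provided update, clamping each value into 1-10."""
    return {d: max(1, min(10, update.get(d, scores[d]))) for d in DIMENSIONS}

scores = apply_update({d: 5 for d in DIMENSIONS}, {"trust": 12, "affection": 7})
```

Clamping on the server side matters: the model occasionally proposes out-of-range values, and the scores gate real behavior downstream.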
Stage-Gated Behavior
Based on the average of Trust and Affection (we call this “affinity”), the system prompt fundamentally changes:
| Affinity | Relationship Stage | Prompt Behavior |
|---|---|---|
| < 3 | Strangers | Formal language, no romance, keeps distance |
| 3–5 | Acquaintances | Warming up, occasional vulnerability |
| 5–7 | Close | Can be romantic, sends selfies, shares secrets |
| 7+ | Deep Trust | Full vulnerability, exclusive behaviors unlocked |
This isn’t just cosmetic. At low affinity, the system prompt literally forbids romantic content. The character will reject advances — not because of an AI content filter, but because the character isn’t ready yet.
This creates a genuine relationship arc that users find incredibly engaging. You have to actually build trust before the character opens up.
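The gating logic above can be sketched in a few lines. Thresholds come from the table; the function name, exact boundary handling, and guidance strings are illustrative:

```python
# Sketch: mapping affinity (average of trust and affection) to a
# relationship stage and stage-specific prompt guidance.

def relationship_stage(trust: float, affection: float) -> str:
    affinity = (trust + affection) / 2
    if affinity < 3:
        return "strangers"
    if affinity < 5:
        return "acquaintances"
    if affinity < 7:
        return "close"
    return "deep_trust"

STAGE_GUIDANCE = {
    "strangers": "Use formal language. No romance. Keep emotional distance.",
    "acquaintances": "Warming up; occasional vulnerability is allowed.",
    "close": "Romance is allowed; may share secrets and send selfies.",
    "deep_trust": "Full vulnerability; exclusive behaviors are unlocked.",
}
```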
Layer 5: Model Routing for Uncensored Content
This is the elephant in the room for NSFW RP bots: most high-quality models censor adult content.
Claude writes beautifully but refuses explicit scenes. GPT-4 is similar. DeepSeek V3.2 will write NSFW content, but with rougher prose.
The Draft → Rewrite Pipeline
Suzune’s solution is a two-model pipeline:
```text
User message
     │
     ▼
DeepSeek V3.2 (draft) ── uncensored, handles NSFW content
     │
     ▼
Claude Haiku (rewrite) ── polishes voice, fixes prose quality
     │
     ├── If rewrite is good → use rewrite
     └── If Claude censored it → discard rewrite, use DS3.2 draft
```
The key innovation: censorship detection. If Claude’s rewrite is suspiciously shorter than the draft (< 60% length), or contains refusal patterns, or suddenly switches to a different language — we throw away the rewrite and serve the original uncensored draft. (Full details in Navigating AI Content Filters.)
This gives you the best of both worlds: uncensored content with polished prose quality. When it works (which is most of the time for non-explicit scenes), the quality improvement is noticeable.
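The three heuristics described above can be sketched as follows. The length threshold, refusal phrases, and the crude language-switch check are illustrative stand-ins for Suzune's exact values:

```python
# Sketch: censorship detection on the Claude rewrite. If any heuristic
# fires, fall back to the uncensored DeepSeek draft.

REFUSAL_PATTERNS = ["i can't", "i cannot", "i'm not able to",
                    "i won't be able", "as an ai"]

def looks_censored(draft: str, rewrite: str) -> bool:
    if len(rewrite) < 0.6 * len(draft):               # suspiciously short
        return True
    lowered = rewrite.lower()
    if any(p in lowered for p in REFUSAL_PATTERNS):   # refusal boilerplate
        return True
    # Crude language-switch check: draft mostly ASCII, rewrite mostly not.
    def ascii_ratio(s):
        return sum(c.isascii() for c in s) / max(len(s), 1)
    return ascii_ratio(draft) > 0.9 and ascii_ratio(rewrite) < 0.5

def choose_response(draft: str, rewrite: str) -> str:
    return draft if looks_censored(draft, rewrite) else rewrite
```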
Pre-Emptive Routing
For scenes that are obviously going to be explicit, we skip the rewrite entirely and route straight to DeepSeek. No point wasting an API call on a rewrite that’s going to get censored.
Putting It All Together
Here’s what happens when a user sends a message to Suzune:
- Message arrives → timestamped and saved
- System prompt assembled → 20+ dynamic sections combined
- Lorebooks scanned → keyword-triggered world info injected
- Relationship stage applied → behavior gates based on affection scores
- Model generates → DeepSeek V3.2 produces draft
- Quality rewrite → Claude polishes (with censorship detection)
- Emotion detected → character’s emotional state extracted from response
- Response delivered → with appropriate expression sprite
The result? A character that feels like a person — not an AI playing a character.
Try It Yourself
If this sounds overwhelming, here’s where to start:
- Start with example dialogue. It’s the highest-impact, lowest-effort technique.
- Add time awareness. Just injecting the current datetime makes a surprising difference.
- Implement tone re-injection. Copy your speech rules to the end of the system prompt.
- Layer in lorebooks gradually. Start with 5–10 entries for your character’s world.
And if you don’t want to build from scratch — the best AI chatbot platforms have done a lot of this work for you. Candy AI and JanitorAI in particular offer good customization options that let you apply some of these techniques without writing code.
This article is part of WaifuStack’s series on building AI roleplay bots. Next up: how we design character personalities with YAML.
Building something similar? Share your approach on X — we’d love to see what you’re working on.