Skip to content
WaifuStack
Go back

Prompt Engineering for Immersive AI Roleplay: Lessons from Building Suzune

Most prompt engineering guides focus on getting ChatGPT to write better emails. This isn’t that.

This is about making AI characters that feel like real people — characters that remember what happened yesterday, whose mood shifts based on your relationship, and whose speech patterns stay consistent across thousands of messages.

Everything here comes from building Suzune, our production roleplay bot. These aren’t theories — they’re patterns we use every day.

Table of contents

Open Table of contents

The Core Problem: Characters That Don’t Feel Like Characters

If you’ve tried roleplay with vanilla ChatGPT or Claude, you’ve experienced this:

The root cause? A static system prompt isn’t enough. You need a dynamic prompt architecture — one that evolves with every message.


Layer 1: The Character Persona

The foundation is a character definition file. In Suzune, we use YAML + Markdown:

name: sakura
display_name: Sakura
system_prompt: persona.md  # loaded from file
example_dialogue: |
  User: How was your day?
  Sakura: *leans back in her chair and stretches* Ugh, don't even
  ask. My editor called THREE times about the deadline. *sighs*
  ...I guess it wasn't all bad though. I found this amazing café
  near the station. The matcha latte was actually decent.

The persona.md file contains the character’s core identity. Here’s what matters:

What to Include

  1. Background & Motivation — Who is this character? What do they want? What are they afraid of?
  2. Speech Patterns — Specific mannerisms, verbal tics, vocabulary preferences
  3. Emotional Range — How do they express happiness vs. anger vs. vulnerability?
  4. Relationship Dynamics — How do they treat strangers vs. close friends vs. romantic partners?

What to Avoid

The Example Dialogue Trick

This is the single most impactful technique for voice consistency: include 3–5 example exchanges that demonstrate exactly how the character talks.

LLMs are pattern-matching machines. If you show them a character who uses *action asterisks*, speaks in short sentences, and drops casual slang — the model will mimic that pattern far more reliably than if you just wrote “she speaks casually.”

User: What do you think about the new project?
Sakura: *taps her pen on the desk* Honestly? It's a mess.
The specs are vague, the timeline is insane, and I'm
pretty sure the client doesn't actually know what they want.
*pauses* ...But the tech stack is interesting. I wouldn't
mind getting my hands on it. *small grin*

Notice:


Layer 2: Dynamic Context Injection

Here’s where it gets interesting. A static prompt produces static behavior. For characters that feel alive, you need context that changes every message.

The System Prompt Assembly Pipeline

In Suzune, the system prompt is rebuilt from scratch on every message from ~20 data sources:

┌─────────────────────────────────────────────┐
│           SYSTEM PROMPT (assembled)          │
├─────────────────────────────────────────────┤
│ 1. Character persona (persona.md)           │
│ 2. Behavior rules (rules.md)                │
│ 3. Example dialogue                         │
│ 4. World info (lorebook entries)            │
│ 5. Relationship stage guidance              │
│ 6. Character's diary/memory (memo.md)       │
│ 7. Compressed chat history (summary)        │
│ 8. Emotion/affection scores                 │
│ 9. Current outfit                           │
│10. Current date & time                      │
│11. Tone reminder (recency bias fix)         │
│12. Story direction hints                    │
└─────────────────────────────────────────────┘

Why rebuild every time? Because the character’s context changes:

Keyword-Triggered World Info (Lorebooks)

Borrowed from SillyTavern’s World Info concept, lorebooks inject context only when relevant keywords appear in recent messages:

{
  "keys": ["café", "coffee", "matcha"],
  "content": "Sakura's favorite café is 'Tsuki no Shizuku' near the station. She orders matcha latte every time.",
  "sticky": 3,
  "priority": 5
}

When someone mentions “café,” this entry activates and stays in the prompt for 3 turns (sticky: 3). This means the character naturally “remembers” details about the café without it cluttering every single response.

You can also gate entries by relationship level — NSFW-related world info only activates after the relationship reaches a certain intimacy threshold.

Time Awareness

Every user message in Suzune is timestamped:

[03/31 22:45] User: Still up?

The current datetime is also injected into the system prompt. This means the character knows:

A character who says “Good morning!” at 10 PM breaks immersion instantly. Time awareness prevents this.


Layer 3: Tone Enforcement (Fighting Recency Bias)

Here’s a problem every RP bot developer hits: the character’s voice drifts over long conversations. After 30+ messages, a sarcastic character starts sounding generic.

Why? Recency bias. LLMs pay more attention to tokens near the end of the context. As the conversation grows, the character persona (at the beginning of the system prompt) gets “pushed out” of the model’s attention.

The Double-Injection Pattern

Our solution: inject the tone rules twice — once at the beginning (in the persona) and once at the very end of the system prompt:

[System prompt start]
... character persona with tone rules ...
[50+ messages of conversation]
[System prompt end]
... ★ tone rules again ★ ...

We literally re-inject a section called “Absolute Speech Rules” as the final system message before the model generates. It’s redundant, but it works — voice consistency improves dramatically.

Anti-Repetition Detection

Another common problem: the model starts every response with the same phrase. “Sakura sighs and…” over and over.

Suzune tracks the opening patterns of recent responses. If the same pattern appears 3+ times, a warning is injected:

[System] Your last 3 responses started with similar patterns.
Vary your opening — start with dialogue, an action, a thought,
or a scene description.

Simple, but effective.


Layer 4: The Relationship Arc

The most powerful technique in our system: the character’s behavior evolves based on relationship scores.

Affection Scoring

Suzune tracks 5 dimensions, each scored 1–10:

ScoreWhat It Represents
TrustHow safe the character feels with you
AffectionHow much they like you
RespectHow much they admire you
ExcitementHow stimulating they find interactions
DevotionHow dedicated they are to the relationship

The character itself updates these scores during conversation — they’re part of the AI’s toolset, not calculated externally.

Stage-Gated Behavior

Based on the average of Trust and Affection (we call this “affinity”), the system prompt fundamentally changes:

AffinityRelationship StagePrompt Behavior
< 3StrangersFormal language, no romance, keeps distance
3–5AcquaintancesWarming up, occasional vulnerability
5–7CloseCan be romantic, sends selfies, shares secrets
7+Deep TrustFull vulnerability, exclusive behaviors unlocked

This isn’t just cosmetic. At low affinity, the system prompt literally forbids romantic content. The character will reject advances — not because of an AI content filter, but because the character isn’t ready yet.

This creates a genuine relationship arc that users find incredibly engaging. You have to actually build trust before the character opens up.


Layer 5: Model Routing for Uncensored Content

This is the elephant in the room for NSFW RP bots: most high-quality models censor adult content.

Claude writes beautifully but refuses explicit scenes. GPT-4 is similar. DeepSeek V3 will write NSFW content but with rougher prose quality.

The Draft → Rewrite Pipeline

Suzune’s solution is a two-model pipeline:

User message


DeepSeek V3.2 (draft) ── uncensored, handles NSFW content


Claude Haiku (rewrite) ── polishes voice, fixes prose quality

    ├── If rewrite is good → use rewrite
    └── If Claude censored it → discard rewrite, use DS3.2 draft

The key innovation: censorship detection. If Claude’s rewrite is suspiciously shorter than the draft (< 60% length), or contains refusal patterns, or suddenly switches to a different language — we throw away the rewrite and serve the original uncensored draft. (Full details in Navigating AI Content Filters.)

This gives you the best of both worlds: uncensored content with polished prose quality. When it works (which is most of the time for non-explicit scenes), the quality improvement is noticeable.

Pre-Emptive Routing

For scenes that are obviously going to be explicit, we skip the rewrite entirely and route straight to DeepSeek. No point wasting an API call on a rewrite that’s going to get censored.


Putting It All Together

Here’s what happens when a user sends a message to Suzune:

  1. Message arrives → timestamped and saved
  2. System prompt assembled → 20+ dynamic sections combined
  3. Lorebooks scanned → keyword-triggered world info injected
  4. Relationship stage applied → behavior gates based on affection scores
  5. Model generates → DeepSeek V3.2 produces draft
  6. Quality rewrite → Claude polishes (with censorship detection)
  7. Emotion detected → character’s emotional state extracted from response
  8. Response delivered → with appropriate expression sprite

The result? A character that feels like a person — not an AI playing a character.


Try It Yourself

If this sounds overwhelming, here’s where to start:

  1. Start with example dialogue. It’s the highest-impact, lowest-effort technique.
  2. Add time awareness. Just injecting the current datetime makes a surprising difference.
  3. Implement tone re-injection. Copy your speech rules to the end of the system prompt.
  4. Layer in lorebooks gradually. Start with 5–10 entries for your character’s world.

And if you don’t want to build from scratch — the best AI chatbot platforms have done a lot of this work for you. Candy AI and JanitorAI in particular offer good customization options that let you apply some of these techniques without writing code.


This article is part of WaifuStack’s series on building AI roleplay bots. Next up: how we design character personalities with YAML.

Building something similar? Share your approach on X — we’d love to see what you’re working on.


Share this post on:

Previous Post
From Idea to Production: How I Built an AI Roleplay Bot That Actually Works
Next Post
Dynamic Character Visuals: How One Character Can Look Like Two Different People