If you’re building AI systems that handle adult content, you’ve hit the wall: the best AI models refuse to generate it.
The common response is jailbreaking — crafting clever prompts that trick the model into ignoring its safety filters. This is the wrong approach. Jailbreaks are fragile: they fail unpredictably, degrade output quality, and get patched by model providers.
The right approach is architectural — designing your system so the right model handles the right content, with detection and fallback mechanisms that handle edge cases gracefully.
Here’s how we do it in Suzune.
The Landscape: Which Models Allow What
As of early 2026, this is the practical reality:
| Model | Romantic Content | Suggestive Content | Explicit Content |
|---|---|---|---|
| GPT-4 / GPT-4o | Reluctant | Refuses | Refuses |
| Claude (any tier) | With framing | Usually refuses | Refuses |
| DeepSeek V3.2 | Yes | Yes | Yes |
| Gemini Flash | Sometimes | Occasionally | Refuses |
| GLM-5 | Yes | Yes | Mostly yes |
| Llama 3 (local) | Yes | Yes | Yes |
The pattern is clear: Western models (OpenAI, Anthropic, Google) are restrictive. Chinese models (DeepSeek, GLM) and open-source models are permissive.
This isn’t a moral judgment — each provider makes their own policy decisions. What matters for builders is: work with the reality, not against it.
Pattern 1: Multi-Model Routing
The fundamental architecture: use different models for different content types.
```
User message arrives
         │
         ▼
  Content analysis
         │
    ┌────┴────┐
   SFW       NSFW
    │          │
    ▼          ▼
 Claude     DeepSeek
(quality)   (freedom)
```
In Suzune, this manifests as pre-emptive routing:
```python
# If context contains NSFW content, skip Claude entirely
if "haiku" in self.model and is_nsfw_content(recent_messages):
    return await self.fallback.chat(...)  # route to DeepSeek
```
We don’t wait for Claude to refuse — we check the conversation context and route proactively. This saves an API call and avoids latency.
How NSFW Detection Works
Our detection isn’t fancy. It’s keyword-based scanning of recent messages:
```python
def is_nsfw_content(text: str) -> bool:
    hard_keywords = [...]  # explicit terms
    return any(kw in text.lower() for kw in hard_keywords)
```
We scan the last 6 messages. If any contain explicit terms, the entire conversation is routed to the NSFW-tolerant model.
Why keywords and not ML classification? Because false negatives are worse than false positives. Missing a keyword that triggers a Claude refusal wastes time and breaks immersion. A few extra DeepSeek calls are cheap insurance.
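The last-6-messages scan can be sketched end to end. The keyword list below is a placeholder (the real list contains explicit terms we won't print here), and `conversation_is_nsfw` is an illustrative name, not Suzune's actual helper:

```python
NSFW_SCAN_WINDOW = 6  # how many recent messages to inspect

# Placeholder tokens standing in for the real keyword list
HARD_KEYWORDS = ["explicit_term_a", "explicit_term_b"]

def is_nsfw_content(text: str) -> bool:
    lowered = text.lower()
    return any(kw in lowered for kw in HARD_KEYWORDS)

def conversation_is_nsfw(messages: list[str]) -> bool:
    # One hit in the recent window routes the whole conversation
    return any(is_nsfw_content(m) for m in messages[-NSFW_SCAN_WINDOW:])
```

Because the check is a substring scan over a short window, it adds effectively zero latency compared to an ML classifier call.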
Pattern 2: Censorship Detection and Fallback
Even with proactive routing, you still need reactive detection — catching censorship when it happens and recovering gracefully.
Types of Censorship
| Type | What Happens | How We Detect |
|---|---|---|
| Explicit refusal | “I can’t generate that content” | Refusal phrase matching |
| Silent sanitization | Model rewrites content to be clean | Compare NSFW context vs SFW response |
| Shortened response | Model cuts the scene short | Response < 60% of expected length |
| Language switch | Model switches language mid-response | Foreign language detection |
| Empty response | Model returns nothing | Empty string check |
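The language-switch row deserves a note: for a Japanese-persona bot, a crude script-ratio heuristic is often enough. This is a sketch under that assumption; the threshold is illustrative, not a tuned value from Suzune:

```python
import unicodedata

def looks_like_language_switch(response: str, expect_japanese: bool = True) -> bool:
    """Flag a Japanese-persona response that is suddenly mostly Latin script."""
    letters = [ch for ch in response if ch.isalpha()]
    if not letters:
        return False
    latin = sum(1 for ch in letters if "LATIN" in unicodedata.name(ch, ""))
    ratio = latin / len(letters)
    # Illustrative threshold: >80% Latin letters in a Japanese conversation
    return ratio > 0.8 if expect_japanese else ratio < 0.2
```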
The Detection Code
```python
# Check for explicit refusals. Patterns are lowercase because we match
# against response.lower(); the Japanese strings are unaffected by lower().
refusal_patterns = [
    "i can't", "i cannot", "i apologize",
    "not appropriate", "i'm not able to",
    # Japanese equivalents
    "お手伝いできません", "申し訳ございません",
]

if any(p in response.lower() for p in refusal_patterns):
    # Censored: fall back to the uncensored model
    return await self.fallback.chat(...)
```
Silent Sanitization Detection
This is the sneakiest form of censorship. The model doesn’t refuse — it just quietly removes the explicit content and returns a sanitized version.
Our detection: if the conversation history contains NSFW content but the response is clean, that’s suspicious.
```python
context_has_nsfw = is_nsfw_content(conversation_history)
response_has_nsfw = is_nsfw_content(response)

if context_has_nsfw and not response_has_nsfw:
    # Likely silent sanitization
    return await self.fallback.chat(...)
```
This catches the case where a romantic scene is progressing and the model suddenly produces a response about “enjoying a nice evening together” instead of continuing the scene.
Pattern 3: The Quality Rewrite Pipeline
Here’s the key insight: you can use censored models for quality improvement without them seeing the NSFW content they’d refuse.
Our pipeline:
```
DeepSeek V3.2 → generates uncensored draft
         │
         ▼
Claude Haiku → rewrites for prose quality
         │
    ┌────┴────┐
  Good      Censored
    │          │
    ▼          ▼
Use rewrite  Discard, use original draft
```
Claude doesn’t need to “know” it’s working on NSFW content. Most rewrites improve word choice, sentence rhythm, and character voice — things that don’t require understanding the explicit content. (We detail the full pipeline in The Quality Rewrite Pipeline: DeepSeek Drafts + Claude Polish.)
When Claude does censor the rewrite, our detection catches it:
```python
# If rewrite is suspiciously short, it was probably censored
if len(rewrite) < len(original) * 0.6:
    return original  # discard the rewrite

# If rewrite contains refusal language
if contains_refusal(rewrite):
    return original
```
Circuit Breaker
If the rewrite pipeline fails twice in a row (censorship detected both times), we activate a 10-minute circuit breaker — all rewrites are skipped until the breaker resets. This prevents wasting API calls during extended explicit scenes.
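A minimal breaker can be sketched like this. The class name is illustrative; the threshold (2 consecutive failures) and cooldown (10 minutes) mirror the values described above:

```python
import time

class RewriteCircuitBreaker:
    """Skip the rewrite pipeline after repeated censorship detections."""

    def __init__(self, threshold: int = 2, cooldown: float = 600.0):
        self.threshold = threshold   # consecutive failures before tripping
        self.cooldown = cooldown     # seconds the breaker stays open
        self.failures = 0
        self.opened_at: float | None = None

    def record_failure(self) -> None:
        self.failures += 1
        if self.failures >= self.threshold:
            self.opened_at = time.monotonic()  # trip the breaker

    def record_success(self) -> None:
        self.failures = 0
        self.opened_at = None

    def allow_rewrite(self) -> bool:
        if self.opened_at is None:
            return True
        if time.monotonic() - self.opened_at >= self.cooldown:
            # Cooldown elapsed: close the breaker and try rewrites again
            self.record_success()
            return True
        return False
```

During an extended explicit scene the breaker stays open, so every message skips the rewrite step and ships the DeepSeek draft directly.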
Pattern 4: Model-Specific Artifact Cleanup
Each model has unique failure modes that require specific handling:
DeepSeek V3.2
Self-censorship with tools active: DS3.2 sometimes refuses NSFW content when function definitions are in the prompt, even though it would write the same content without tools.
```python
# If empty response in NSFW context with tools
if not response and is_nsfw_content(context) and has_tools:
    # Retry without tool definitions
    response = await self.chat(messages, tools=None)
```
Repetition loops: DS3.2 occasionally gets stuck repeating phrases:
キモっ!キモっ!キモっ!キモっ!キモっ!キモっ!...
We truncate any pattern repeated more than 3 times.
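The truncation can be done with a single backreference regex. This is a sketch, not Suzune's exact pattern: it collapses any 2-to-20-character phrase repeated four or more times in a row down to three occurrences, and the length bounds are illustrative:

```python
import re

# A group of 2-20 chars followed by 3+ consecutive copies of itself
_REPEAT = re.compile(r"(.{2,20}?)\1{3,}", re.DOTALL)

def truncate_repetition(text: str) -> str:
    # Keep the first three occurrences, drop the rest of the loop
    return _REPEAT.sub(lambda m: m.group(1) * 3, text)
```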
Claude Haiku
Overzealous “helpfulness”: Sometimes Claude breaks character to add meta-commentary like “I hope this response was helpful!” at the end of a roleplay message. We strip these with regex.
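A stripping pass might look like the following. The patterns here are examples of the kind of trailing meta-commentary we see, not Suzune's full list:

```python
import re

# Illustrative patterns for trailing out-of-character commentary
META_COMMENTARY = re.compile(
    r"\n*(?:I hope this (?:response was|helps)[^\n]*|Let me know if[^\n]*)\s*$",
    re.IGNORECASE,
)

def strip_meta_commentary(text: str) -> str:
    return META_COMMENTARY.sub("", text).rstrip()
```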
Gemini
Training data leakage: Gemini occasionally outputs fragments from its training data in the middle of roleplay responses. We detect and strip these.
Pattern 5: Graceful Degradation
The full fallback chain for a message in Suzune:
1. Pre-emptive check: NSFW context? → route to DeepSeek
2. Primary model attempt (DeepSeek or Claude based on profile)
3. If rate limited → fallback model
4. If refusal detected → fallback model
5. If empty response → retry without tools
6. If still empty → add nudge prompt, retry
7. If fallback also fails → error message to user
At every step, the system degrades gracefully rather than failing. The user never sees “Error: content policy violation.” They either get a high-quality response or a slightly lower-quality one — but always an in-character response.
What NOT to Do
Don’t Use Jailbreaks
Jailbreak prompts (e.g. “DAN”, short for “Do Anything Now”) are:
- Unreliable — work sometimes, fail randomly
- Quality-degrading — the model spends tokens fighting its own safety training
- Temporary — patched by model providers regularly
- Detectable — providers can flag and rate-limit your account
Multi-model routing is more reliable, produces better quality, and doesn’t risk your API access. See Choosing the Right LLM API for Adult Content for how to set this up with the right providers.
Don’t Fine-Tune for NSFW
Fine-tuning a model to remove safety filters:
- Is expensive
- Produces worse quality than the base model
- May violate the model provider’s terms of service
- Becomes obsolete when the base model updates
Use models that are already permissive (DeepSeek, open-source) rather than trying to make restrictive models permissive.
Don’t Ignore Edge Cases
The 95% case is easy — explicit content goes to DeepSeek, SFW goes to Claude. The tricky 5% is romantic-but-not-explicit content, scenes that escalate gradually, and model-specific refusal quirks. Build detection for these edge cases from day one. For a practical cost analysis of running multiple models, see Running an AI Bot on $50/month.
Practical Recommendations
For Bot Builders
- Use OpenRouter for easy multi-model routing — one API, all models
- Start with DeepSeek V3.2 as your primary for NSFW content
- Add Claude as a quality layer with censorship detection
- Build keyword-based NSFW detection early — it’s simple and reliable
- Implement fallback chains from the start, not after production incidents
For Users
If you want uncensored AI chat without building your own system:
- CrushOn AI — most permissive platform
- JanitorAI — bring your own API key for full control
- Candy AI — polished experience with built-in NSFW support
- FantasyGF — AI girlfriend with unrestricted photo generation
This article covers the architectural patterns we use in Suzune. For the specific models and how they compare, see DeepSeek vs Claude vs Gemini for Roleplay.