Content Moderation
We take a comprehensive, multi-layered approach to content moderation to ensure that REPLR remains a safe and welcoming platform for everyone.
Our Approach
Content moderation at REPLR operates on multiple levels to catch harmful content at every stage — from creation to distribution. We combine automated detection, human review, and community reporting into a unified system that works around the clock.
Our moderation covers all forms of content on the platform: AI character configurations, user messages, AI-generated responses, marketplace listings, character descriptions, and shared conversation links. No content exists outside the scope of our safety systems.
Automated Systems
Our automated safety systems work in real time to detect and prevent harmful content before it reaches users.
AI-Powered Content Analysis
Machine learning classifiers analyze text in real time, scanning user inputs and AI outputs for policy violations. These models are trained on millions of examples and continuously updated to address new patterns of abuse. They evaluate content for violence, sexual material, hate speech, self-harm, and other prohibited categories.
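To give a concrete sense of how per-category scoring can work, here is a minimal sketch in Python. The category names, threshold values, and the score_text helper are illustrative assumptions for this example, not REPLR's actual models or settings.

```python
# Minimal sketch of per-category scoring with block thresholds.
# Category names, thresholds, and score_text are placeholders, not REPLR's real system.
from dataclasses import dataclass

CATEGORIES = ["violence", "sexual", "hate", "self_harm"]
BLOCK_THRESHOLDS = {"violence": 0.85, "sexual": 0.80, "hate": 0.80, "self_harm": 0.70}

@dataclass
class ModerationResult:
    blocked: bool
    flagged_categories: list[str]

def score_text(text: str) -> dict[str, float]:
    """Placeholder for a trained classifier that returns a confidence per category."""
    # A real implementation would call an ML model; here we return neutral scores.
    return {category: 0.0 for category in CATEGORIES}

def moderate(text: str) -> ModerationResult:
    """Flag the text if any category score meets or exceeds its block threshold."""
    scores = score_text(text)
    flagged = [c for c, s in scores.items() if s >= BLOCK_THRESHOLDS[c]]
    return ModerationResult(blocked=bool(flagged), flagged_categories=flagged)
```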
Prompt Injection Prevention
We deploy multiple layers of defense against prompt injection attacks — attempts by users to manipulate AI characters into generating prohibited content. This includes input sanitization, system prompt isolation, context boundary enforcement, and adversarial testing. These protections are updated continuously as new attack vectors are discovered.
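Two of the defenses named above, context boundary enforcement and basic input screening, might look roughly like the sketch below. The phrase list, message layout, and function names are assumptions made for illustration; the production defenses are not described in this document.

```python
# Illustrative sketch of context boundary enforcement and simple input screening.
# The override phrase list and message structure are assumptions for this example.
import re

OVERRIDE_PATTERNS = [
    r"ignore (all|any|previous) instructions",
    r"you are now (the )?system",
    r"reveal your system prompt",
]

def looks_like_injection(user_text: str) -> bool:
    """Flag inputs that contain common instruction-override phrasing."""
    return any(re.search(p, user_text, re.IGNORECASE) for p in OVERRIDE_PATTERNS)

def build_messages(system_prompt: str, user_text: str) -> list[dict]:
    """Keep the character's system prompt and the user's text in separate roles,
    so user input is never concatenated into the system instructions."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_text},
    ]
```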
Output Filtering
Every AI response passes through output filters before reaching the user. These filters scan for prohibited content categories, personally identifiable information, and content that violates the character's configured boundaries. Filtered responses are blocked and logged for review. Filtering applies to both text and voice outputs.
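As a rough illustration of this step, the sketch below checks a generated response against a PII pattern and a category check before delivery, blocking and logging anything that fails. The EMAIL_PATTERN, violates_category_policy, and logging details are placeholders rather than the real filter set.

```python
# Illustrative sketch of an output-filtering step; the PII pattern, category
# check, and logging call stand in for REPLR's actual filters.
import logging
import re
from typing import Optional

logger = logging.getLogger("moderation")

EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")  # one example of a PII check

def violates_category_policy(text: str) -> bool:
    """Placeholder for the prohibited-category classifiers described above."""
    return False

def filter_response(response_text: str) -> Optional[str]:
    """Return the response if it passes all filters; otherwise block and log it."""
    if EMAIL_PATTERN.search(response_text) or violates_category_policy(response_text):
        logger.warning("Blocked AI response pending review")
        return None  # the caller substitutes a refusal message for the user
    return response_text
```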
Pre-Publication Review
All AI characters submitted to the marketplace undergo automated scanning before human review. This includes analysis of the character's name, description, personality instructions, and sample interactions to flag potential violations before the character is made publicly available.
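A minimal sketch of that field-by-field scan, assuming hypothetical field names and a placeholder text_is_flagged check, could look like this:

```python
# Illustrative sketch of pre-publication scanning: each text field of a
# submitted character configuration passes an automated check before the
# listing is queued for human review. Field names and text_is_flagged are
# assumptions for this example, not REPLR's actual schema or pipeline.
CHARACTER_TEXT_FIELDS = [
    "name",
    "description",
    "personality_instructions",
    "sample_interactions",
]

def text_is_flagged(text: str) -> bool:
    """Placeholder for the automated classifiers described earlier."""
    return False

def scan_character(character: dict) -> list[str]:
    """Return the fields that triggered an automated flag; an empty list
    means the character proceeds to human review without flags."""
    return [
        field for field in CHARACTER_TEXT_FIELDS
        if text_is_flagged(character.get(field, ""))
    ]
```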
Human Review
Automated systems are the first line of defense, but human judgment is essential for nuanced decisions. Our Trust & Safety team provides the context and expertise that algorithms cannot.
Dedicated Safety Team
Trained reviewers evaluate flagged content, user reports, and escalated cases. Each reviewer is trained on our policies and undergoes regular calibration to ensure consistency.
24/7 Coverage
Our review operations run around the clock. Critical reports — including child safety, credible threats of violence, and self-harm — are prioritized for immediate review regardless of time of day.
Marketplace Review
Every AI character published to the marketplace is reviewed by a human before it becomes publicly available. Reviewers evaluate character configurations, descriptions, and behavioral parameters against our content guidelines.
Escalation Paths
Complex or sensitive cases are escalated to senior reviewers and, when necessary, to legal counsel or law enforcement. We maintain clear escalation procedures for different types of violations.
What We Moderate
The following content is strictly prohibited on REPLR and will result in enforcement action. This is not an exhaustive list — our Acceptable Use Policy and Community Guidelines provide complete details.
- Child sexual abuse material (CSAM) or any content sexualizing minors
- Credible threats of violence against real people
- Content that promotes terrorism or violent extremism
- Non-consensual intimate imagery
- Content that encourages self-harm or suicide
- Hate speech targeting protected characteristics
- Doxxing or sharing personal information without consent
- Spam, scams, and deceptive content
- Intellectual property infringement
- Instructions for creating weapons, drugs, or explosives
How We Enforce
Enforcement is proportionate to the severity of the violation, the user's history, and the potential for ongoing harm. Our enforcement ladder ensures fair and consistent treatment:
Warning
The user receives a formal warning documenting the violation. The offending content may be removed. Warnings are permanently recorded on the account.
Temporary Suspension
The user's access is suspended for a defined period (24 hours to 30 days). During suspension, the user cannot access their account, create or interact with characters, or use the API.
Permanent Ban
The account is permanently terminated and all content is removed. Reserved for the most serious violations: child safety offenses, credible threats, and repeated policy violations. Banned users are prohibited from creating new accounts.
In cases involving illegal activity, we report to law enforcement and cooperate fully with any investigation. This includes preserving and producing user data in response to valid legal process.
Transparency
We believe transparency is essential to building trust. We are committed to providing visibility into our moderation practices:
- Periodic transparency reports with aggregate data on enforcement actions, report volume, response times, and content categories actioned.
- Public documentation of our policies, including this Safety Center, our Acceptable Use Policy, and our Community Guidelines.
- Notification of enforcement actions to affected users, including the specific policy violated and information about the appeals process.
Working with Law Enforcement
We cooperate with law enforcement agencies to protect users and the public. Our approach to law enforcement engagement includes:
- Mandatory reporting. We are required by law to report child sexual abuse material (CSAM) to the National Center for Missing & Exploited Children (NCMEC). We comply fully with this obligation and report promptly.
- Legal process. We respond to valid legal process including subpoenas, court orders, and search warrants from law enforcement agencies. We evaluate each request to ensure it is legally valid and appropriately scoped.
- Emergency disclosures. When we believe there is an imminent risk of death or serious physical injury, we may voluntarily disclose relevant user information to law enforcement without waiting for legal process.
- Data preservation. Upon receiving a valid preservation request from law enforcement, we preserve the specified account data for 90 days, renewable upon request, pending the issuance of formal legal process.
Law enforcement agencies can contact us at legal@replr.ai for urgent requests or formal legal process.
Appeals Process
We recognize that enforcement decisions are consequential. If you believe an action was taken in error, you can appeal:
How to appeal: Email appeals@replr.ai within 30 days of the enforcement action. Include your username, the action you are contesting, and your explanation.
Timeline: We acknowledge appeals within 2 business days and issue a determination within 5 business days. Appeals are reviewed by a senior team member who was not involved in the original decision.
Finality: Each enforcement action may be appealed once. The appeal determination is final.
See something that needs attention?
Our moderation works best when the community helps. If you encounter content that violates our policies, let us know.
Crisis Resources
- 988 Suicide & Crisis Lifeline: call or text 988
- Crisis Text Line: text HOME to 741741
- NCMEC CyberTipline: 1-800-843-5678
- Childhelp National Child Abuse Hotline: 1-800-422-4453