Safety Notes: How Our Dual Safety Checks Work
A transparent look at our two-layer safety system: automated validation plus human review for all kid-facing content.
By Safety Team
Trust isn't built with promises—it's built with systems. Here's exactly how we ensure every piece of kid-facing content meets our safety and quality standards.
Why Dual Checks Matter
AI is powerful, but it's not perfect. Automated systems can catch obvious issues (inappropriate words, external links, formatting problems), but they can't evaluate emotional tone, age-appropriateness, or subtle context.
That's why we use both automated validation and human review for all kid-facing content.
Layer 1: Automated Validation
Our automated safety checks run on every piece of content before it reaches human reviewers:
Inappropriate Language Detection
We maintain a list of words and phrases that aren't appropriate for children. This includes:
- Profanity and crude language
- Scary or violent terms
- Negative self-talk patterns
- Age-inappropriate concepts
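A check like this can be sketched as a blocklist scan. The words and the function name below are illustrative only, not our production list:

```python
import re

# Illustrative blocklist -- the real list is maintained by the safety team
# and covers profanity, scary/violent terms, and negative self-talk.
BLOCKLIST = {"scary", "monster", "stupid", "i can't do anything"}

def find_flagged_phrases(text: str) -> list[str]:
    """Return blocklist entries found in the text (case-insensitive,
    matched on word boundaries so 'scary' does not match 'scarcity')."""
    lowered = text.lower()
    hits = []
    for phrase in BLOCKLIST:
        if re.search(r"\b" + re.escape(phrase) + r"\b", lowered):
            hits.append(phrase)
    return sorted(hits)
```

Any non-empty result stops the content from moving forward until a writer revises it.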
External Link Safety
Any link that points outside our domain gets flagged for manual review. We want to ensure children never accidentally leave our safe environment.
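In code, that check is a domain comparison on the parsed URL. Here is a minimal sketch, using `example.com` as a stand-in for our actual domain:

```python
from urllib.parse import urlparse

SAFE_DOMAIN = "example.com"  # placeholder; the real check uses our own domain

def is_external(url: str, safe_domain: str = SAFE_DOMAIN) -> bool:
    """Flag any absolute URL whose host is not the safe domain
    (or one of its subdomains). Relative URLs stay on-site, so they pass."""
    host = urlparse(url).netloc.lower()
    if not host:
        return False  # relative link, never leaves our domain
    return not (host == safe_domain or host.endswith("." + safe_domain))
```

A flagged link is not automatically removed; it is routed to a human for manual review, as described above.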
Tone Consistency
We check that content matches our "comfort-first, family-safe" tone guidelines. This catches content that might be technically appropriate but emotionally jarring.
Metadata Validation
We verify that all kid-facing content has proper safety metadata:
- Review status
- Reviewer name
- Review date
- Age range appropriateness
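A validator for those four fields might look like the following sketch. The field names are illustrative, not our exact schema:

```python
from datetime import date

REQUIRED_FIELDS = ("review_status", "reviewer_name", "review_date", "age_range")

def validate_metadata(meta: dict) -> list[str]:
    """Return a list of problems; an empty list means the metadata passes."""
    problems = [f"missing field: {f}" for f in REQUIRED_FIELDS if f not in meta]
    if "review_status" in meta and meta["review_status"] != "approved":
        problems.append("content not yet approved")
    if "review_date" in meta:
        try:
            date.fromisoformat(meta["review_date"])
        except ValueError:
            problems.append("review_date is not a valid ISO date")
    return problems
```

Returning every problem at once, rather than failing on the first, gives content creators one complete fix list per run.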
Layer 2: Human Review
Every piece of content that passes automated checks goes to our human review team. They evaluate:
Emotional Tone
Is this content warm and supportive? Could anything be misinterpreted as scary or mean? Does it maintain our "cozy and wonder-filled" feeling?
Age Appropriateness
Is the complexity right for the target age range? Are concepts explained clearly? Is the pacing appropriate?
Learning Value
Does this content support cognitive growth? Are there opportunities for curiosity and exploration? Does it celebrate effort and diverse thinking?
Character Consistency
Do characters sound like themselves? Are their actions consistent with their established personalities? Does the content respect our world rules?
What Gets Reviewed
Always reviewed:
- Story content for children
- Character dialogue
- World descriptions
- Interactive prompts
- Learning activities
Sometimes reviewed:
- Parent/educator content (spot-checked)
- Technical documentation (safety team discretion)
- Build journal posts (content team review)
Never kid-facing (no review needed):
- Internal documentation
- Technical specifications
- Partner communications
Our Review Standards
We ask reviewers to evaluate content against these questions:
- Safety: Could this content scare, confuse, or upset a child?
- Tone: Does this feel warm, supportive, and age-appropriate?
- Quality: Is this content clear, engaging, and well-crafted?
- Learning: Does this support cognitive growth and curiosity?
- Consistency: Does this fit our world rules and character voices?
Content must pass all five criteria to be approved.
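The approval rule above is a strict conjunction, which can be sketched in a few lines (the key names simply mirror the checklist and are illustrative):

```python
# The five review criteria from the checklist above.
CRITERIA = ("safety", "tone", "quality", "learning", "consistency")

def is_approved(scores: dict[str, bool]) -> bool:
    """Approve only when every one of the five criteria passed.
    A missing criterion counts as a failure, never a pass."""
    return all(scores.get(c, False) for c in CRITERIA)
```

Treating a missing score as a failure is deliberate: content is never approved by omission.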
Transparency in Action
We're sharing this process because transparency builds trust. Parents and educators deserve to know exactly how we protect children.
Current status: All four worlds have completed initial safety reviews. All story engine outputs go through both automated and human checks before any child sees them.
What We're Still Building
Our safety system is evolving. We're working on:
- Faster review turnaround times
- More sophisticated automated checks
- Better tools for reviewers
- Clearer feedback loops for content creators
Questions?
If you have questions about our safety practices, contact us. We're committed to transparency and continuous improvement.
Coming up: "Privacy by Design" - how we're building a platform that doesn't need to collect child data