Safety Notes: How Our Dual Safety Checks Work
A transparent look at our two-layer safety system: automated validation plus human review for all kid-facing content.
By Safety Team
Trust isn't built with promises—it's built with systems. Here's exactly how we ensure every piece of kid-facing content meets our safety and quality standards.
Why Dual Checks Matter
AI is powerful, but it's not perfect. Automated systems can catch obvious issues (inappropriate words, external links, formatting problems), but they can't evaluate emotional tone, age-appropriateness, or subtle context.
That's why we use both automated validation and human review for all kid-facing content.
Layer 1: Automated Validation
Our automated safety checks run on every piece of content before it reaches human reviewers:
Inappropriate Language Detection
We maintain a list of words and phrases that aren't appropriate for children. This includes:
- Profanity and crude language
- Scary or violent terms
- Negative self-talk patterns
- Age-inappropriate concepts
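A check like this can be sketched as a blocklist scan. The words and the function name below are illustrative only, not our production list:

```python
import re

# Illustrative blocklist -- the real list is maintained by the safety team
# and covers profanity, scary/violent terms, and negative self-talk.
BLOCKLIST = {"scary", "monster", "stupid", "i can't do anything"}

def find_flagged_phrases(text: str) -> list[str]:
    """Return blocklist entries found in the text (case-insensitive,
    matched on word boundaries so 'scary' does not match 'scarcity')."""
    lowered = text.lower()
    hits = []
    for phrase in BLOCKLIST:
        if re.search(r"\b" + re.escape(phrase) + r"\b", lowered):
            hits.append(phrase)
    return sorted(hits)
```

Any non-empty result stops the content from moving forward until a writer revises it.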
External Link Safety
Any link that points outside our domain gets flagged for manual review. We want to ensure children never accidentally leave our safe environment.
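In code, that check is a domain comparison on the parsed URL. Here is a minimal sketch, using `example.com` as a stand-in for our actual domain:

```python
from urllib.parse import urlparse

SAFE_DOMAIN = "example.com"  # placeholder; the real check uses our own domain

def is_external(url: str, safe_domain: str = SAFE_DOMAIN) -> bool:
    """Flag any absolute URL whose host is not the safe domain
    (or one of its subdomains). Relative URLs stay on-site, so they pass."""
    host = urlparse(url).netloc.lower()
    if not host:
        return False  # relative link, never leaves our domain
    return not (host == safe_domain or host.endswith("." + safe_domain))
```

A flagged link is not automatically removed; it is routed to a human for manual review, as described above.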
Tone Consistency
We check that content matches our "comfort-first, family-safe" tone guidelines. This catches content that might be technically appropriate but emotionally jarring.
Metadata Validation
We verify that all kid-facing content has proper safety metadata:
- Review status
- Reviewer name
- Review date
- Age range appropriateness
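A validator for those four fields might look like the following sketch. The field names are illustrative, not our exact schema:

```python
from datetime import date

REQUIRED_FIELDS = ("review_status", "reviewer_name", "review_date", "age_range")

def validate_metadata(meta: dict) -> list[str]:
    """Return a list of problems; an empty list means the metadata passes."""
    problems = [f"missing field: {f}" for f in REQUIRED_FIELDS if f not in meta]
    if "review_status" in meta and meta["review_status"] != "approved":
        problems.append("content not yet approved")
    if "review_date" in meta:
        try:
            date.fromisoformat(meta["review_date"])
        except ValueError:
            problems.append("review_date is not a valid ISO date")
    return problems
```

Returning every problem at once, rather than failing on the first, gives content creators one complete fix list per run.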
Layer 2: Human Review
Every piece of content that passes automated checks goes to our human review team. They evaluate:
Emotional Tone
Is this content warm and supportive? Could anything be misinterpreted as scary or mean? Does it maintain our "cozy and wonder-filled" feeling?
Age Appropriateness
Is the complexity right for the target age range? Are concepts explained clearly? Is the pacing appropriate?
Learning Value
Does this content support cognitive growth? Are there opportunities for curiosity and exploration? Does it celebrate effort and diverse thinking?
Character Consistency
Do characters sound like themselves? Are their actions consistent with their established personalities? Does the content respect our world rules?
What Gets Reviewed
Always reviewed:
- Story content for children
- Character dialogue
- World descriptions
- Interactive prompts
- Learning activities
Sometimes reviewed:
- Parent/educator content (spot-checked)
- Technical documentation (safety team discretion)
- Build journal posts (content team review)
Never kid-facing (no review needed):
- Internal documentation
- Technical specifications
- Partner communications
Our Review Standards
We ask reviewers to evaluate content against these questions:
- Safety: Could this content scare, confuse, or upset a child?
- Tone: Does this feel warm, supportive, and age-appropriate?
- Quality: Is this content clear, engaging, and well-crafted?
- Learning: Does this support cognitive growth and curiosity?
- Consistency: Does this fit our world rules and character voices?
Content must pass all five criteria to be approved.
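The approval rule above is a strict conjunction, which can be sketched in a few lines (the key names simply mirror the checklist and are illustrative):

```python
# The five review criteria from the checklist above.
CRITERIA = ("safety", "tone", "quality", "learning", "consistency")

def is_approved(scores: dict[str, bool]) -> bool:
    """Approve only when every one of the five criteria passed.
    A missing criterion counts as a failure, never a pass."""
    return all(scores.get(c, False) for c in CRITERIA)
```

Treating a missing score as a failure is deliberate: content is never approved by omission.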
Transparency in Action
We're sharing this process because transparency builds trust. Parents and educators deserve to know exactly how we protect children.
Current status: All four worlds have completed initial safety reviews. All story engine outputs go through both automated and human checks before any child sees them.
What We're Still Building
Our safety system is evolving. We're working on:
- Faster review turnaround times
- More sophisticated automated checks
- Better tools for reviewers
- Clearer feedback loops for content creators
Questions?
If you have questions about our safety practices, contact us. We're committed to transparency and continuous improvement.
Coming up: "Privacy by Design" - how we're building a platform that doesn't need to collect child data