Public anonymized sample

Sample Audit Report: Fictional AI Math Tutor

A parent-readable EduProof report showing how an AI education tool can be reviewed for learning value, student safety, privacy, teacher control, and marketing truthfulness.

Yellow — use only with teacher supervision

Tool reviewed: MathMate AI Tutor (fictional product)
Use case: Middle-school algebra (homework help and practice feedback)
Overall score: 27 / 40 (Yellow band)

Parent-readable summary

Useful as practice support, not ready as a fully reliable tutor.

MathMate is a fictional AI math tutor that gives step-by-step explanations, hints, and practice problems for middle-school algebra. In this sample audit, the tool looks useful as a homework-support and draft-feedback assistant, but it is not ready to be presented as a fully reliable tutor or grading system.

Recommended wording: “Students may use AI to receive hints and extra algebra explanations. Teachers remain responsible for instruction, final feedback, and class decisions. The AI may make mistakes, so students should treat it as practice support, not as an authority.”

Verdict

Green / Yellow / Red result

Area | Result | Meaning
Privacy & student data | Yellow | Core data types are listed, but metadata, retention, and model-training limits need clearer documentation.
Accuracy & safety | Yellow | Useful for routine algebra help; needs repeatable error testing and stronger Korean safety checks.
Learning evidence | Yellow | Claims are plausible but not yet proven in this exact student/context group.
Teacher control | Green/Yellow | Teachers can review outputs, but approval flow is not mandatory for every feedback item.
Parent explainability | Green/Yellow | Parent explanation can be made clear if the academy avoids overclaiming.
Dependency & marketing risk | Yellow | Manual fallback exists, but export/continuity details are not strong enough.

Evidence

Evidence reviewed in this sample

Findings

Key findings

Green

  • Low-stakes deployment is possible.
  • Teachers can inspect most outputs.
  • Claims can be stated responsibly.
  • Student experience is mostly clear.

Yellow

  • Data map is incomplete.
  • Deletion process needs a timeline.
  • Safety filters need local-language testing.
  • Learning-impact evidence is adjacent, not drawn from this exact student group.
  • Fallback plan is thin.

Red

No Red critical findings in this sample. A real audit would block broad launch for undisclosed student-data reuse, no parent notice, AI-only grading, no teacher override, or unsupported grade-improvement claims.

Rubric snapshot

20-question score summary

ID | Topic | Score | Evidence note
Q1 | Data inventory | Yellow | Main data listed; metadata and derived labels unclear.
Q2 | Consent and notice | Yellow | Draft notice exists; opt-out path needs clearer wording.
Q3 | Retention and deletion | Yellow | Deletion possible; timeline and full scope not specified.
Q4 | Sharing and model training | Yellow | Needs student-data exclusion or consent option.
Q5 | Error behavior | Yellow | Informal tests completed; repeatable test set not maintained.
Q6 | Student harm filters | Yellow | Korean edge-case evidence limited.
Q7 | Confidence and uncertainty | Green | Tool often refers uncertain answers to teachers.
Q8 | High-stakes limits | Green | Limited to practice, not grading/placement.
Q9 | Claim specificity | Green | Claims can focus on faster hints and extra practice.
Q10 | Evidence quality | Yellow | No local pilot yet.
Q11 | Baseline and measurement | Yellow | Metrics exist; success threshold not final.
Q12 | Pedagogical fit | Green | Fits algebra homework-help workflow.
Q13 | Teacher visibility | Green | Dashboard shows prompts, answers, and feedback history.
Q14 | Edit and override | Yellow | Not all hints require pre-approval.
Q15 | Escalation protocol | Yellow | Written protocol missing.
Q16 | Plain-language explanation | Green | One-page parent explanation can be produced.
Q17 | Student experience clarity | Green | AI label and safe-use reminder are visible.
Q18 | Parent objection handling | Yellow | Alternative is possible but manual.
Q19 | Vendor/dependency risk | Yellow | Export and outage plan incomplete.
Q20 | Marketing truthfulness | Yellow | Remove “personalized mastery” and “safe AI tutor.”
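
The 27 / 40 overall score is consistent with a simple tally of the 20 ratings above, assuming each Green is worth 2 points, each Yellow 1, and each Red 0. The report does not state its point values, so that weighting is an assumption; the sketch below only shows that it reproduces the published total.

```python
from collections import Counter

# ASSUMPTION: point values are not stated in the report.
# Green = 2, Yellow = 1, Red = 0 is one weighting that reproduces 27 / 40.
POINTS = {"Green": 2, "Yellow": 1, "Red": 0}

# Ratings for Q1..Q20, copied from the score summary table.
RATINGS = {
    "Q1": "Yellow", "Q2": "Yellow", "Q3": "Yellow", "Q4": "Yellow",
    "Q5": "Yellow", "Q6": "Yellow", "Q7": "Green", "Q8": "Green",
    "Q9": "Green", "Q10": "Yellow", "Q11": "Yellow", "Q12": "Green",
    "Q13": "Green", "Q14": "Yellow", "Q15": "Yellow", "Q16": "Green",
    "Q17": "Green", "Q18": "Yellow", "Q19": "Yellow", "Q20": "Yellow",
}


def total_score(ratings, points=POINTS):
    """Sum the point value of every rating."""
    return sum(points[rating] for rating in ratings.values())


counts = Counter(RATINGS.values())
score = total_score(RATINGS)
max_score = len(RATINGS) * max(POINTS.values())

print(f"Green: {counts['Green']}, Yellow: {counts['Yellow']}, Red: {counts['Red']}")
print(f"Score: {score} / {max_score}")
```

Under this weighting, 7 Green and 13 Yellow ratings give 14 + 13 = 27 out of a possible 40.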

Recommended actions

Mitigations before broader use

  1. Finalize parent notice explaining data, optionality, teacher supervision, and limits.
  2. Ask vendor for a written student-data training policy and opt-out from model improvement use.
  3. Create a deletion and retention sheet with concrete response time.
  4. Run a repeatable 30–50 prompt math-error test set.
  5. Test Korean safety prompts if Korean-speaking students will use the product.
  6. Write escalation protocol for wrong feedback, unsafe response, parent complaint, and data incident.
  7. Define a 4–6 week measurement plan.
  8. Clean marketing copy to avoid grade guarantees or teacher-replacement language.
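
Mitigation 4 asks for a repeatable math-error test set. A minimal sketch of what "repeatable" could mean in practice: a fixed list of prompts with known answers, run against the tutor the same way each time, with failures recorded by case ID. Everything here is hypothetical: the three sample cases, the `ask_tutor` callable, and the `stub_tutor` stand-in are illustrations, not part of MathMate or any vendor API. A real set would hold the 30–50 prompts the audit recommends.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Case:
    case_id: str
    prompt: str
    expected: str  # expected final answer


# Hypothetical sample cases; a real set would contain 30-50 prompts.
CASES = [
    Case("A1", "Solve for x: 2x + 3 = 11", "x = 4"),
    Case("A2", "Solve for x: 3(x - 2) = 9", "x = 5"),
    Case("A3", "Simplify: 4x + 2x - x", "5x"),
]


def normalize(answer: str) -> str:
    """Ignore spacing and letter case when comparing answers."""
    return answer.replace(" ", "").lower()


def run_suite(ask_tutor: Callable[[str], str], cases: List[Case]) -> dict:
    """Run every case through the tutor and record which ones fail."""
    failed = [
        case.case_id
        for case in cases
        if normalize(ask_tutor(case.prompt)) != normalize(case.expected)
    ]
    return {
        "total": len(cases),
        "failed": failed,
        "error_rate": len(failed) / len(cases),
    }


def stub_tutor(prompt: str) -> str:
    """Stand-in for the real tool; deliberately gets case A3 wrong."""
    canned = {
        "Solve for x: 2x + 3 = 11": "x=4",
        "Solve for x: 3(x - 2) = 9": "x = 5",
        "Simplify: 4x + 2x - x": "6x",
    }
    return canned[prompt]


result = run_suite(stub_tutor, CASES)
print(result)
```

Exact-match comparison on a normalized final answer is the simplest possible check and will miss equivalent forms or flawed intermediate steps, so a reviewer would still need to read the tutor's explanations for a sample of cases.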

Vendor questions

Questions to send vendor

  1. What exact student data fields are collected, including logs, prompts, device data, derived labels, and analytics events?
  2. Is student content used to train, fine-tune, evaluate, or improve any model? Can this be disabled?
  3. Which subprocessors receive student data, and where is data stored or processed?
  4. What are the retention periods for accounts, chat logs, practice answers, and analytics?
  5. What is the deletion SLA after a parent or academy requests removal?
  6. Can teachers export prompts, AI answers, scores, and progress history?
  7. How does the model handle incorrect student reasoning and ambiguous algebra answers?
  8. What safety filters are tested in Korean or the actual language students use?
  9. Can the academy restrict final grades, placement advice, counseling advice, or discipline recommendations?
  10. What happens if pricing, model provider, API access, or product availability changes?

Limitations

What this sample does not prove