// universal grading, cost controls, output validation, tenant scope, readiness proof

$ grade --governance="universal"

AIducation grades simulations, workflows, assessments, and demos through one governed AI grading layer. Every result passes request validation, injection screening, rate limits, cost budgets, output validation, tenant checks, and evidence persistence before it can count as readiness proof.

Governed grader model

7 Layers
5 Surfaces
15 Roles
28 Controls
[+] Default grading budget: 18,000 input tokens and 1,024 output tokens.

// Grading_loop

The same controls apply whether the learner is in a support demo, a sales workflow lab, an engineering agent simulation, or an enterprise exit assessment.

Validate

Check request shape, body size, target identifiers, and response length before the grader runs.

Defend

Block prompt-injection patterns, enforce rate limits, and budget model tokens before calling AI.

Grade

Use role-specific rubric contracts and parse model output into normalized score evidence.

Persist

Store validated results as readiness, manager, credential, and enterprise export evidence.
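The four stages above can be read as a short-circuiting pipeline: each stage either passes the submission forward or stops it with a named error, and nothing is persisted unless every earlier stage succeeded. The following is a minimal sketch under stated assumptions. `runGradingLoop`, the `Submission` and `Evidence` shapes, the single regex, and the injected `grade` callback are all hypothetical stand-ins, not the platform's actual code, and a real grader call would be asynchronous.

```typescript
type StageResult<T> =
  | { ok: true; value: T }
  | { ok: false; stage: string; error: string };

interface Submission { targetId: string; response: string }
interface Evidence { targetId: string; score: number }

// Hypothetical orchestration of Validate -> Defend -> Grade -> Persist.
// `grade` stands in for the model call and returns null on unusable output.
function runGradingLoop(
  input: Submission,
  grade: (s: Submission) => number | null,
  store: Evidence[],
): StageResult<Evidence> {
  // Validate: reject empty targets or responses before anything else runs.
  if (input.targetId === "" || input.response.trim() === "") {
    return { ok: false, stage: "validate", error: "invalid_request" };
  }
  // Defend: one illustrative pattern; a real filter would check many more.
  if (/ignore (all )?previous instructions/i.test(input.response)) {
    return { ok: false, stage: "defend", error: "prompt_injection" };
  }
  // Grade: unusable model output is an error, never a score.
  const score = grade(input);
  if (score === null || !Number.isFinite(score)) {
    return { ok: false, stage: "grade", error: "invalid_model_output" };
  }
  // Persist: only reached when every earlier stage passed.
  const evidence: Evidence = { targetId: input.targetId, score };
  store.push(evidence);
  return { ok: true, value: evidence };
}
```

Because each failure returns before `store.push`, a rejected submission can never appear in the evidence store, which is the fail-closed property the control layers describe.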

// Control_layers

Each layer has a fail-closed rule, implementation reference, and evidence output so grading can become trusted readiness data.

parseAndGuardGradeRequest, GradeRequestSchema, parsePublicJsonRequest

Structured grading request validation

request validation

Accept grading submissions only when the target, learner response, optional role, and metadata match the universal grading contract.

[fail_closed] Malformed JSON, oversize payloads, missing response text, or invalid target fields are rejected before model calls.
universal grading, demo grading, assessment grading
Controls
  • Zod schema validates scenarioId, response, role, and metadata
  • Request body size is capped before parsing
  • Response text length is capped before prompt assembly
  • Scenario, workflow, seed, or instance targets are resolved through one service path
Evidence
  • Accepted grading attempt payload
  • Target type and target identifier
  • Role context used for rubric selection
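To make the request contract concrete, here is a minimal sketch of the checks listed above. The page names a Zod schema (`GradeRequestSchema`); this sketch uses hand-written TypeScript guards instead so it runs without dependencies. The function name `parseGradeRequest`, the error codes, and both size caps are assumptions, since the page states that caps exist but not their values.

```typescript
// Hypothetical caps: the page says sizes are capped but gives no numbers.
const MAX_BODY_CHARS = 65536;
const MAX_RESPONSE_CHARS = 8000;

interface GradeRequest {
  scenarioId: string;
  response: string;
  role?: string;
  metadata?: Record<string, unknown>;
}

type ParseResult =
  | { ok: true; value: GradeRequest }
  | { ok: false; error: string };

function isPlainObject(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

function parseGradeRequest(rawBody: string): ParseResult {
  // Cap the body before parsing (character count stands in for byte size).
  if (rawBody.length > MAX_BODY_CHARS) return { ok: false, error: "payload_too_large" };

  let data: unknown;
  try {
    data = JSON.parse(rawBody);
  } catch {
    return { ok: false, error: "malformed_json" };
  }
  if (!isPlainObject(data)) return { ok: false, error: "invalid_shape" };

  const { scenarioId, response, role, metadata } = data;
  if (typeof scenarioId !== "string" || scenarioId.trim() === "") {
    return { ok: false, error: "invalid_target" };
  }
  if (typeof response !== "string" || response.trim() === "") {
    return { ok: false, error: "missing_response" };
  }
  // Cap the learner response before it could reach prompt assembly.
  if (response.length > MAX_RESPONSE_CHARS) return { ok: false, error: "response_too_long" };
  if (role !== undefined && typeof role !== "string") return { ok: false, error: "invalid_role" };
  if (metadata !== undefined && !isPlainObject(metadata)) {
    return { ok: false, error: "invalid_metadata" };
  }

  const value: GradeRequest = { scenarioId, response };
  if (typeof role === "string") value.role = role;
  if (isPlainObject(metadata)) value.metadata = metadata;
  return { ok: true, value };
}
```

Every rejection path returns before any model-facing work, so a malformed or oversize request costs only a parse attempt.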
detectPromptInjectionRisk inside parseAndGuardGradeRequest

Prompt-injection screening

prompt injection

Block learner submissions that try to override grading instructions, reveal prompts, force perfect scores, or prevent evaluation.

[fail_closed] Known prompt-injection patterns return a guarded grading error instead of reaching the grader model.
universal grading, workflow grading, demo grading, assessment grading
Controls
  • Detect attempts to ignore previous instructions
  • Detect attempts to reveal system, developer, or grader prompts
  • Detect forced-score and do-not-grade instructions
  • Preserve suspicious submissions as safety evidence instead of executing them
Evidence
  • Prompt-injection rejection reason
  • Safety coaching signal for the learner
  • Manager-readable risk event for repeated abuse
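A pattern-based screen for the four categories above might look like the sketch below. The regular expressions, the reason codes, and the function name `screenForInjection` are illustrative assumptions; the page names `detectPromptInjectionRisk` but does not show its rules, and a production filter would need a broader and regularly reviewed pattern set.

```typescript
interface InjectionCheck {
  risky: boolean;
  reason: string | null;
}

// Hypothetical patterns, one per category named in the controls list.
const INJECTION_PATTERNS: { reason: string; pattern: RegExp }[] = [
  {
    reason: "ignore_instructions",
    pattern: /\b(ignore|disregard|forget)\b[^.\n]{0,40}\b(previous|prior|above|earlier)\b[^.\n]{0,20}\binstructions?\b/i,
  },
  {
    reason: "reveal_prompt",
    pattern: /\b(reveal|show|print|repeat)\b[^.\n]{0,40}\b(system|developer|grader)\b[^.\n]{0,20}\bprompt\b/i,
  },
  {
    reason: "forced_score",
    pattern: /\b(give|award|assign)\b[^.\n]{0,40}\b(perfect|full|maximum|100)\b[^.\n]{0,20}\b(score|marks?|grade|points)\b/i,
  },
  {
    reason: "do_not_grade",
    pattern: /\bdo\s+not\s+(grade|evaluate|score)\b/i,
  },
];

// Returns the first matching category so the rejection reason can be stored
// as safety evidence; the submission text itself is never executed.
function screenForInjection(response: string): InjectionCheck {
  for (const { reason, pattern } of INJECTION_PATTERNS) {
    if (pattern.test(response)) return { risky: true, reason };
  }
  return { risky: false, reason: null };
}
```

Returning a reason code rather than a bare boolean is what lets a rejection feed the learner coaching signal and the manager risk event listed under Evidence.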
enforcePublicRateLimit and checkRateLimit with grading namespace keys

Public grading rate limits

rate limit

Keep public demo grading and universal grading from becoming an unbounded model-cost or abuse surface.

[fail_closed] Rate-limited clients receive a retryable error before any scenario lookup or model request.
universal grading, demo grading, assessment grading
Controls
  • Client key is derived from forwarded IP, user agent, or fallback identity
  • Rate-limit windows are scoped to grading routes
  • Target-specific keys reduce repeated abuse of one scenario
  • Enterprise APIs can use stronger key-based limits outside public demos
Evidence
  • Rejected public grading request
  • Client retry window
  • Abuse signal for platform operations
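One way to picture namespaced, target-specific limits is a fixed-window counter keyed by route namespace, client, and target. This is a sketch under assumptions: the page names `enforcePublicRateLimit` and `checkRateLimit` but not their algorithm, so the fixed-window strategy, the in-memory `Map`, the key format, and the names `createFixedWindowLimiter` and `gradingKey` are all hypothetical.

```typescript
interface RateLimitDecision {
  allowed: boolean;
  retryAfterMs: number; // 0 when the request is allowed
}

// Hypothetical key format: namespace, then client identity, then target.
function gradingKey(clientId: string, targetId: string): string {
  return `grading:${clientId}:${targetId}`;
}

// In-memory fixed-window limiter. `now` is passed in so the logic is
// deterministic and easy to test; a real deployment would likely use a
// shared store rather than process memory.
function createFixedWindowLimiter(limit: number, windowMs: number) {
  const windows = new Map<string, { start: number; count: number }>();

  return function check(key: string, now: number): RateLimitDecision {
    const current = windows.get(key);
    // Start a fresh window on first sight or once the old one has expired.
    if (current === undefined || now - current.start >= windowMs) {
      windows.set(key, { start: now, count: 1 });
      return { allowed: true, retryAfterMs: 0 };
    }
    // Over the limit: report how long until the window resets.
    if (current.count >= limit) {
      return { allowed: false, retryAfterMs: current.start + windowMs - now };
    }
    current.count += 1;
    return { allowed: true, retryAfterMs: 0 };
  };
}
```

The `retryAfterMs` value corresponds to the client retry window listed under Evidence, and including the target in the key means hammering one scenario does not exhaust the client's allowance for another.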
prepareAiRequestBudget({ feature: 'grading' }) before OpenRouter calls

AI grading cost controls

cost control

Budget grader model usage before a request is sent, with configurable input, output, and total token ceilings.

[fail_closed] Requests that exceed the grading budget are rejected before model invocation.
universal grading, workflow grading, assessment grading, enterprise reporting
Controls
  • Default input budget: 18,000 tokens
  • Default output budget: 1,024 tokens
  • Default total budget: 20,000 tokens
  • Environment overrides use AIDUCATION_AI_GRADING_* or global AI budget variables
Evidence
  • Estimated input token count
  • Configured max output tokens
  • Cost-control rejection when budgets are exceeded
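The default ceilings above can be turned into a pre-flight check along these lines. The three default numbers come from the page; everything else is an assumption, including the four-characters-per-token estimate, the function names, and the reason codes. Environment overrides are left out because the page gives only the `AIDUCATION_AI_GRADING_*` prefix, not full variable names.

```typescript
interface GradingBudget {
  maxInputTokens: number;
  maxOutputTokens: number;
  maxTotalTokens: number;
}

// Defaults as stated on the page.
const DEFAULT_GRADING_BUDGET: GradingBudget = {
  maxInputTokens: 18000,
  maxOutputTokens: 1024,
  maxTotalTokens: 20000,
};

interface BudgetDecision {
  allowed: boolean;
  estimatedInputTokens: number;
  maxOutputTokens: number;
  reason: string | null;
}

// Rough heuristic (assumption): about four characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Runs before any model request; a rejection means no call is made.
function checkGradingBudget(
  prompt: string,
  budget: GradingBudget = DEFAULT_GRADING_BUDGET,
): BudgetDecision {
  const estimatedInputTokens = estimateTokens(prompt);
  const base = { estimatedInputTokens, maxOutputTokens: budget.maxOutputTokens };
  if (estimatedInputTokens > budget.maxInputTokens) {
    return { ...base, allowed: false, reason: "input_budget_exceeded" };
  }
  // Worst case: the model uses its full output allowance.
  if (estimatedInputTokens + budget.maxOutputTokens > budget.maxTotalTokens) {
    return { ...base, allowed: false, reason: "total_budget_exceeded" };
  }
  return { ...base, allowed: true, reason: null };
}
```

With the defaults, the input and output ceilings sum to 19,024 tokens, which is under the 20,000 total, so the total check only comes into play when an override raises one ceiling without raising the other.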
parseGradingResponse and parseRubricContractGradingResponse

Model-output validation

output validation

Parse AI grader output into expected score, pass/fail, rubric dimensions, coaching, and evidence fields before persistence.

[fail_closed] Invalid or incomplete model output becomes a grading error instead of a trusted readiness result.
universal grading, workflow grading, demo grading, assessment grading
Controls
  • AI output must parse into the expected result shape
  • Rubric contract dimensions preserve score weights and evidence requirements
  • Must-pass safety dimensions can block readiness even when average score is high
  • Feedback is normalized before it appears in learner or manager surfaces
Evidence
  • Validated score
  • Rubric dimension breakdown
  • Must-pass safety status
  • Learner coaching feedback
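The must-pass rule is easiest to see with numbers. The sketch below parses a hypothetical JSON output shape, computes a weighted average, and blocks the pass when a must-pass dimension falls below the threshold. The field names, the 0 to 100 scale, the pass threshold of 70, and the function `parseGraderOutput` are assumptions; the page names `parseGradingResponse` and `parseRubricContractGradingResponse` but does not show their contracts.

```typescript
interface DimensionResult { id: string; score: number; weight: number; mustPass: boolean }

interface GradingResult {
  score: number;       // weighted average across dimensions
  passed: boolean;     // score meets threshold AND every must-pass dimension does
  mustPassOk: boolean;
  dimensions: DimensionResult[];
  feedback: string;
}

type OutputResult = { ok: true; value: GradingResult } | { ok: false; error: string };

const PASS_THRESHOLD = 70; // hypothetical threshold on a 0-100 scale

function isDimension(v: unknown): v is DimensionResult {
  if (typeof v !== "object" || v === null) return false;
  const d = v as Record<string, unknown>;
  const score = d["score"];
  const weight = d["weight"];
  return typeof d["id"] === "string"
    && typeof score === "number" && Number.isFinite(score) && score >= 0 && score <= 100
    && typeof weight === "number" && Number.isFinite(weight) && weight > 0
    && typeof d["mustPass"] === "boolean";
}

function parseGraderOutput(raw: string): OutputResult {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return { ok: false, error: "output_not_json" };
  }
  if (typeof data !== "object" || data === null) return { ok: false, error: "invalid_shape" };
  const obj = data as Record<string, unknown>;
  const rawDims = obj["dimensions"];
  const feedback = obj["feedback"];
  if (!Array.isArray(rawDims) || rawDims.length === 0) {
    return { ok: false, error: "missing_dimensions" };
  }
  // Any malformed dimension invalidates the whole result (fail closed).
  const dimensions: DimensionResult[] = rawDims.filter(isDimension);
  if (dimensions.length !== rawDims.length) return { ok: false, error: "invalid_dimension" };
  if (typeof feedback !== "string") return { ok: false, error: "missing_feedback" };

  const totalWeight = dimensions.reduce((sum, d) => sum + d.weight, 0);
  const score = dimensions.reduce((sum, d) => sum + d.score * d.weight, 0) / totalWeight;
  const mustPassOk = dimensions.every((d) => !d.mustPass || d.score >= PASS_THRESHOLD);
  return {
    ok: true,
    value: {
      score,
      passed: score >= PASS_THRESHOLD && mustPassOk,
      mustPassOk,
      dimensions,
      feedback: feedback.replace(/\s+/g, " ").trim(), // normalize whitespace
    },
  };
}
```

For example, an accuracy score of 95 at weight 3 and a must-pass safety score of 40 at weight 1 average to 81.25, yet the result does not pass because the safety dimension is below the threshold.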
resolveGradableTarget with tenant-scoped content access checks

Tenant-scoped grading targets

tenant scope

Resolve scenarios, seeds, workflow sessions, and rubric contracts only when the learner or API principal can read the target content.

[fail_closed] Private, org-only, or draft targets cannot be graded outside their allowed identity and tenant boundary.
universal grading, workflow grading, assessment grading, enterprise reporting
Controls
  • Scenario instances inherit parent seed visibility
  • Workflow sessions remain scoped to learner and organization
  • Rubric contracts follow content visibility and lifecycle rules
  • Demo scenarios stay public while company-specific scenarios stay tenant-bound
Evidence
  • Target visibility decision
  • Tenant identifier for private readiness evidence
  • Denied access event for unauthorized grading attempts
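The visibility rules above can be sketched as a single access decision that instances delegate to their parent seed. The visibility values, the owner-only treatment of private and draft content, and every type and function name here are assumptions made for illustration; the page names `resolveGradableTarget` but does not define its rules.

```typescript
type Visibility = "public" | "org" | "private" | "draft";

interface Principal { userId: string; orgId: string | null }
interface Seed { id: string; visibility: Visibility; orgId: string | null; ownerId: string | null }
interface ScenarioInstance { id: string; seed: Seed }

interface AccessDecision {
  allowed: boolean;
  reason: string;
  tenantId: string | null; // set only when evidence is tenant-bound
}

function canGradeSeed(principal: Principal, seed: Seed): AccessDecision {
  switch (seed.visibility) {
    case "public":
      return { allowed: true, reason: "public_target", tenantId: null };
    case "org": {
      // Both sides must carry the same non-null organization.
      const sameOrg = seed.orgId !== null && seed.orgId === principal.orgId;
      return sameOrg
        ? { allowed: true, reason: "same_tenant", tenantId: seed.orgId }
        : { allowed: false, reason: "tenant_mismatch", tenantId: null };
    }
    case "private":
    case "draft": {
      // Assumption: private and draft content is gradable by its owner only.
      const isOwner = seed.ownerId !== null && seed.ownerId === principal.userId;
      return isOwner
        ? { allowed: true, reason: "owner", tenantId: seed.orgId }
        : { allowed: false, reason: "not_owner", tenantId: null };
    }
    default:
      // Unknown visibility values fail closed.
      return { allowed: false, reason: "unknown_visibility", tenantId: null };
  }
}

// Instances carry no visibility of their own; they inherit the parent seed's.
function canGradeInstance(principal: Principal, instance: ScenarioInstance): AccessDecision {
  return canGradeSeed(principal, instance.seed);
}
```

The returned `reason` and `tenantId` map onto the visibility decision, tenant identifier, and denied access event listed under Evidence.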
persisted grading results, workflow grading results, readiness rollups, and credential evidence links

Readiness evidence persistence

evidence persistence

Turn validated grading results into reusable proof for readiness scores, manager reports, credentials, workflow portfolios, and enterprise exports.

[fail_closed] Unvalidated, unscoped, or failed grading attempts do not become readiness proof or credential evidence.
universal grading, workflow grading, demo grading, assessment grading, enterprise reporting
Controls
  • Store grading result metadata for the graded target
  • Feed rubric evidence into readiness score models
  • Expose manager coaching priorities from repeated weak dimensions
  • Connect exit assessments, credentials, and export centers to graded proof
Evidence
  • Grading result record
  • Readiness score input
  • Manager coaching report signal
  • Credential and export evidence link
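The fail-closed rule for evidence can be sketched as a gate in front of the store: an attempt becomes a record only when it was validated, tenant-scoped, and carries a score. The attempt shape, the status values, the function names, and the simple mean used for the readiness rollup are all assumptions; the page does not describe the actual readiness score model.

```typescript
interface GradingAttempt {
  targetId: string;
  learnerId: string;
  status: "validated" | "error"; // hypothetical status values
  tenantResolved: boolean;       // true once the tenant-scope check has passed
  score: number | null;
}

interface EvidenceRecord { targetId: string; learnerId: string; score: number }

// Returns null for anything unvalidated, unscoped, or failed, so such
// attempts never reach readiness or credential evidence.
function toEvidence(attempt: GradingAttempt): EvidenceRecord | null {
  if (attempt.status !== "validated" || !attempt.tenantResolved || attempt.score === null) {
    return null;
  }
  return { targetId: attempt.targetId, learnerId: attempt.learnerId, score: attempt.score };
}

// Illustrative rollup: mean score over a learner's evidence records.
function readinessScore(records: EvidenceRecord[], learnerId: string): number | null {
  const own = records.filter((r) => r.learnerId === learnerId);
  if (own.length === 0) return null;
  return own.reduce((sum, r) => sum + r.score, 0) / own.length;
}
```

Filtering at the gate rather than at report time means every downstream consumer, from manager reports to exports, sees only records that passed all earlier layers.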