Methodology: Mastery Thresholds

What “mastered” means here

When a LEARN module displays a mastery layer, you should be able to look up exactly what that label means in code. This page is the source-of-truth document for the five layers Oxford Ledge uses, the precise thresholds for each, and the recency and calibration requirements that gate progression.

The model follows Khan Academy’s canonical seven-stage learning loop — signal → diagnosis → offer → delivery → re-check → calibration → progression. We render five visible layers on the ladder; the other two stages (diagnosis and offer) sit between the score-update and the next attempt and don’t carry a label.

1. The five layers

Each concept you study carries one layer per user. The layer never passively regresses: a Mastered concept that goes stale is marked decaying but is not demoted. Only a hard regression (two failed calibration items) demotes the layer.

| Layer | EMA score | Attempts | Recency | Calibration |
| --- | --- | --- | --- | --- |
| Attempted | any | any | any | n/a |
| Familiar | ≥ 50 | ≥ 2 | any | n/a |
| Proficient | ≥ 75 | ≥ 4 | ≤ 30 days | n/a |
| Mastered | ≥ 90 | ≥ 6 | ≤ 30 days | ≥ 1 pass |
| Enduring | ≥ 90 | ≥ 10 | ≤ 60 days | ≥ 2 passes spanning ≥ 14 days; first reached Mastered ≥ 30 days ago |

These thresholds are policy, not tuning knobs. The exact constants live in learn/mastery/layer.py as FAMILIAR_SCORE, PROFICIENT_SCORE, MASTERED_SCORE, ENDURING_SCORE, and the matching attempt / recency / calibration constants. Any change here requires a changelog entry below.
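The classification implied by the table can be sketched as a pure function over the four inputs. This is an illustrative reconstruction, not the actual `learn/mastery/layer.py` source: the function name `compute_layer` is taken from the text, but the signature and the way the Enduring pass-spacing conditions are passed in are assumptions. Note also that the real system stores a monotone layer (decay is shown, never applied); this sketch only computes the layer the current stats would qualify for.

```python
# Threshold constants mirroring the table in §1 (names from the text;
# the authoritative values live in learn/mastery/layer.py).
FAMILIAR_SCORE = 50
PROFICIENT_SCORE = 75
MASTERED_SCORE = 90
ENDURING_SCORE = 90

def compute_layer(score, attempts, days_since_attempt,
                  calibration_passes, passes_span_days,
                  days_since_first_mastered):
    """Return the highest layer the inputs qualify for (checked top-down)."""
    if (score >= ENDURING_SCORE and attempts >= 10
            and days_since_attempt <= 60
            and calibration_passes >= 2
            and passes_span_days >= 14
            and days_since_first_mastered >= 30):
        return "Enduring"
    if (score >= MASTERED_SCORE and attempts >= 6
            and days_since_attempt <= 30 and calibration_passes >= 1):
        return "Mastered"
    if score >= PROFICIENT_SCORE and attempts >= 4 and days_since_attempt <= 30:
        return "Proficient"
    if score >= FAMILIAR_SCORE and attempts >= 2:
        return "Familiar"
    return "Attempted"
```

Checking the rungs top-down makes each row of the table a strict superset of requirements over the one below it, so a concept always lands on exactly one layer.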

2. EMA score

The score on a concept is an exponentially weighted moving average over your quiz attempts on questions tagged with that concept. The most recent attempt carries roughly 30% of the weight; the prior attempts share the remaining 70% with geometrically decaying weights. A perfect attempt pulls the EMA up sharply; a missed attempt pulls it down sharply. The effect: a long streak of correct answers moves the score asymptotically toward 100, but a single recent miss pulls it down materially, so you can't coast on history alone.

The score is bounded on [0, 100]. We do not display decimal places; the layer-classification thresholds are aligned with the rounded integer score the UI displays.
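The update described above can be sketched in a few lines. The exact smoothing factor is not stated in this document; `ALPHA = 0.3` below is an assumption matching the "roughly 30%" weight on the most recent attempt.

```python
ALPHA = 0.3  # assumed weight on the most recent attempt (~30%, per the text)

def update_ema(prev_score, attempt_score):
    """Fold one quiz attempt (scored 0-100) into the concept EMA, clamped to [0, 100]."""
    new = ALPHA * attempt_score + (1 - ALPHA) * prev_score
    return min(100.0, max(0.0, new))

# Ten perfect attempts from zero approach 100 asymptotically (~97.2)...
score = 0.0
for _ in range(10):
    score = update_ema(score, 100)

# ...but a single miss drops the EMA by 30% of its value (~68 here),
# which is below the 70-point re-teach trigger described in §3.
after_miss = update_ema(score, 0)
```

This is why "you can't coast on history alone": with a 0.3 smoothing factor, one missed attempt on a near-perfect concept is enough to cross the remediation threshold.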

3. Why the 70% trigger

If your post-attempt EMA on a concept drops below 70, the platform offers a re-teach: a shorter, alternate-angle primer on the specific sub-skill the diagnosis step identified as missed. The offer is opt-in. We borrowed the threshold from the standard Khan Academy mastery curve: 70 is high enough that chance-correct guessing on 4-option questions (25% baseline) cannot sustain it, and low enough that a user who is genuinely close to Proficient (75) gets the offer rather than passing through with thin knowledge.

The offer is throttled to at most one per concept per seven days. Forced remediation triggers reactance; a single nudge respects the learner. The constant lives in learn/remediation.py as part of the offer-eligibility check.
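The two gates described in this section (the 70-point trigger and the 7-day throttle) compose into a single eligibility check. A minimal sketch, assuming the check takes the post-attempt EMA and the timestamp of the last offer; the function name and signature are illustrative, not the actual `learn/remediation.py` API.

```python
from datetime import datetime, timedelta

REMEDIATION_TRIGGER = 70            # post-attempt EMA below this offers a re-teach
OFFER_COOLDOWN = timedelta(days=7)  # at most one offer per concept per week

def offer_eligible(post_attempt_ema, last_offer_at, now):
    """True if a re-teach offer may be surfaced for this concept right now."""
    if post_attempt_ema >= REMEDIATION_TRIGGER:
        return False  # score is healthy; no offer
    if last_offer_at is not None and now - last_offer_at < OFFER_COOLDOWN:
        return False  # throttled: an offer was already made this week
    return True
```

Keeping the throttle inside the eligibility check (rather than suppressing the offer at render time) means a declined offer still starts the 7-day cooldown, which matches the "single nudge" intent.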

4. Calibration vs. quiz items

Quiz items inside a module are scoped to that module’s content; you see them while learning. Calibration items are stress-tests pulled from a separate corpus, surfaced after Proficient on a concept, and designed to falsify the proficient label rather than confirm it. They are harder than the easiest quiz item, ask the same sub-skill from a different angle, and sometimes introduce a distractor that maps to a common misconception.

Passing one calibration item gates the Mastered layer. Passing two calibration items, with at least 14 days between the first and the second, gates the Enduring layer. The 14-day window is a spaced-recall requirement — the Enduring layer specifically tests whether the user can recall after a gap, not just within a single study session.
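The Enduring gate combines three date conditions from §1 and this section. A sketch of that check under stated assumptions: it takes the dates of passed calibration items and the date Mastered was first reached; the helper name is hypothetical.

```python
from datetime import date

def enduring_calibration_met(pass_dates, first_mastered_on, today):
    """Enduring gate: >= 2 calibration passes spanning >= 14 days,
    with Mastered first reached >= 30 days ago (per §1 and §4)."""
    if len(pass_dates) < 2:
        return False
    if (max(pass_dates) - min(pass_dates)).days < 14:
        return False  # passes too close together: no spaced-recall evidence
    return (today - first_mastered_on).days >= 30
```

The span check is the load-bearing part: two passes in one sitting prove nothing about recall after a gap, which is exactly what Enduring claims.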

5. Decay (shown, not regressed)

A concept’s layer never passively decreases. If you reach Mastered and then don’t practice for 90 days, you don’t fall back to Proficient — you stay at Mastered with an explicit decaying mark (a desaturated color + dotted-circle marker in the UI). The reason is that demotion-by-time-passing makes the platform feel punitive; the visible decay signal nudges you back without erasing your prior effort.

The per-layer decay windows are: Proficient at 30 days idle; Mastered at 60 days idle; Enduring at 90 days idle. Familiar and Attempted carry no decay signal — they’re early-arc layers where staleness doesn’t carry useful information.

The only path to a layer demotion is a hard regression: two failed calibration items on the same concept within a short window. That signals the prior mastery claim was likely overstated, and the layer steps back one rung.
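The decay-shown-not-applied rule and the hard-regression demotion can be sketched together. The windows mirror this section; the function names and the `>=` boundary on the idle windows are assumptions (the text says "at N days idle" without specifying inclusivity).

```python
DECAY_WINDOWS = {"Proficient": 30, "Mastered": 60, "Enduring": 90}  # idle days
LADDER = ["Attempted", "Familiar", "Proficient", "Mastered", "Enduring"]

def show_decay_mark(layer, idle_days):
    """True if the UI should render the decaying mark; the stored layer is untouched."""
    window = DECAY_WINDOWS.get(layer)  # Attempted/Familiar carry no decay signal
    return window is not None and idle_days >= window

def demote_on_hard_regression(layer):
    """Step the layer back one rung after two failed calibration items."""
    i = LADDER.index(layer)
    return LADDER[max(0, i - 1)]
```

Separating the two functions mirrors the policy: time only ever changes presentation (`show_decay_mark`), while evidence of overstated mastery changes state (`demote_on_hard_regression`).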

6. What we deliberately do not do

7. Changelog

| Date | Change |
| --- | --- |
| 2026-05-13 | Initial publication of the methodology page. Thresholds in §1 mirror learn/mastery/layer.py as of this date. |
| 2026-04-28 | Streak v2 ratification: spaced-rep recall semantics; never burn on miss (S55). |
| 2026-04-24 | Mastery Definition plan ratified: four named layers, decay-shown-no-regression rule, share-card composition. Schemas v51 + v52 authored dormant pending measurement window. |
| 2026-04-17 | Adaptive Mastery Foundation Phases 1–6 shipped (S50). 70% remediation trigger live. EMA scoring on user_concept_mastery live. Phase 7b/8c/9b gated on measurement window. |

Source code and references

The active source-of-truth code: learn/mastery/layer.py (compute_layer + threshold constants), learn/remediation.py (70% trigger + 7-day throttle), pg_db/schema_tables/v51_mastery_layer_fields.py (column definitions), pg_db/schema_tables/v52_mastery_transitions.py (audit log).

Editorial: the methodology was reviewed against the public Khan Academy mastery-system documentation and adapted for the financial-investing domain. Where we diverged from Khan’s defaults, the deviation is named in the changelog above.

This page is subject to and should be read alongside the Editorial Standards. Corrections: editorial@oxfordledge.com.