Bỏ qua để đến nội dung

EVT 2026-02-22 - Cross-stage Recommendation Hardening Defaults (Draft Only, No Decision Promotion)

DomainsDOL EnglishProduct638 words3 min read

active

Session scope

Continue autonomous product ideation to close remaining recommendation-policy gaps while keeping UX simple, actionable, and scalable.

Working assumptions

Lane remains DOL English V2.
Baseline recommendation contracts from DEC-0075/0077/0078 stay active.
This file proposes defaults only; no formal DEC promotion in this cycle.

Draft hardening package (5 unresolved logic areas)

1) Mixed-program tie-breaker (weekly behavior spans multiple programs)

Proposed default (simple deterministic order):

Session explicit preference (if user just set skill/program control) wins.
If one program contributes >=60% of last 5 submitted attempts, use that program.
Else if goal_program_id exists, use goal program.
Else use most recent submitted program.

Why this default:

Keeps learner intent first when explicitly expressed.
Avoids overfitting noisy mixed behavior.
Preserves goal alignment without forcing it.

2) Minimum recommendation set under thin inventory + entitlement constraints

Proposed default (actionability first):

Target set size remains 3-7.
Hard floor: at least 2 actionable items.
If only 2 actionable items exist, allow at most 1 lock teaser to reach 3.
If still <2 actionable, expand by nearest-ladder fallback (same program -> adjacent form -> cross-program easy-start) until floor is met.

Why this default:

Prevents dead-end UX.
Controls lock frustration.
Keeps composition logic lean and predictable.

3) Confidence signal visibility

Proposed default (low-only exposure):

Show confidence badge only for low-confidence items.
Max 1 low-confidence item per set and place near set end.
High/medium confidence items do not show badge to reduce visual noise.

Why this default:

Gives transparency when needed most.
Keeps UI clean in normal conditions.

4) Cold-start -> personalized transition

Proposed default (early but stable switch):

Personalized-ready when either:
- submitted_attempts_14d >= 2, or
- goal_program_id is set and submitted_attempts_30d >= 1.
Before ready: use trending_14d + easy_start_bias + program hint.
If inactive for >45 days, degrade to warm-start (not full reset): keep last-known program hint + trending refresh.

Why this default:

Switches early enough to feel smart.
Avoids fragile personalization from too little signal.
Handles long inactivity without hard reset shock.

5) Rollout/rollback guardrails for recommendation policy changes

Proposed default (3-phase release):

Rollout phases: 10% -> 50% -> 100%.
Minimum observation window each phase: 7 days.
Advance gate requires:
- recommendation_start_rate >= +5% vs control, and
- no guardrail breach.
Guardrail breach (any):
- day7_active_rate drop >1.5%,
- recommendation_bounce_rate increase >3%,
- recommendation-related support complaints +20%.
If breach persists 2 consecutive days -> automatic rollback to previous stable config.

Why this default:

Protects core retention while enabling fast iteration.
Creates explicit stop rules and lowers operational risk.

Compact validation plan (draft)

Tie-breaker quality test

Pass: +6% recommendation_start_rate on mixed-program users.
Fail: <2% lift or higher cross-program bounce.

Thin-inventory composition test

Pass: no_actionable_set_rate <= 0.5% and lock_popup_rage_tap_rate not increased.
Fail: actionable floor not met in >1% sessions.

Confidence badge policy test

Pass: helpful_recommendation_pulse +5 points, no CTR drop >2%.
Fail: perceived trust not improved.

Transition threshold test

Pass: personalized cohort outperforms cold-start on start_within_10m by >=8%.
Fail: lift <3%.

Rollout guardrail simulation

Pass: rollback triggers correctly in synthetic breach drills.
Fail: delayed or missing rollback activation.

Open items carried forward (pending user confirmation)

Should inactivity fallback window be 45 days or 30 days?
In mixed-program tie-breaker, should threshold be 60% (current draft) or 70% for stricter consistency?
For phase gate, keep +5% start-rate threshold or lower to +4% for faster rollout?