EVT 2026-02-22 - Cross-stage UX Optimization Iteration Loop (Draft Only, No Decision Promotion)

DomainsDOL EnglishProduct1.394 words7 min read

active

Session scope

Autonomous ideation->refine loop for DOL English V2. Goal: UX best quality with high completeness but lean complexity.

Assumptions (max 5)

Product lane is DOL English V2 only.
Existing confirmed DEC baseline remains valid; this cycle creates draft ideas only.
Recommendation remains optional (no mandatory task, no penalty model).
Current stack can support rule-based ranking before ML.
Existing metrics events can be extended without redesigning tracking architecture.

Iteration 1

1) Draft ideas (9 ideas -> 3 directions)

Direction A - Guided Autonomy Loop

Next-best-action card with one primary CTA and 2 alternates.
Program-adaptive ranking with continuity after each submission.
Session-level controls (skill/difficulty/duration) with one-click reset.

Direction B - Low-friction Activation and Return

Home first-week activation strip (day 0-7), fully optional.
Intent-preserving return after auth/payment.
Empty-state trio CTA (Practice -> Vocab -> Course) with contextual reorder if user has course.

Direction C - Trust and Visibility Layer

One-line reason label per recommendation.
Progress-without-goal panel (trend only, no target gap).
Entitlement-aware lock previews with in-place upgrade path.

2) Choose top direction

Top 1: Direction A - Guided Autonomy Loop.

Impact: direct effect on daily learning actions.
Feasible: rule-based v1 is practical.
Differentiation: preserves user freedom while still guiding.
Risk: recommendation repetition can reduce trust.
Scope: bounded to recommendation surfaces and interactions.

3) Concept draft (v1)

Value prop: Give each learner a clear next step that fits their current context without forcing a rigid path.
User/problem:
- User: self-study learner (free/pro/pro max), mixed motivation.
- Problem: too many choices; users know they should practice but don’t know what to do next.
Scope in:
- Context-aware candidate selection.
- Optional user controls in-session.
- Explainable recommendation reason.
Scope out:
- ML model training.
- New grading logic.
- Mandatory learning path.
Core flow (6):
1. User lands on recommendation surface (Home/Practice Result/Practice Management).
2. System resolves context (program intent, recent submissions, entitlement).
3. Candidate pool built from available items.
4. Ranking applies continuity -> habit-fit -> diversity cap.
5. UI shows 3-7 items with one-line reason.
6. User starts/switches/ignores; system logs action for next refresh.
Must:
- Always actionable recommendation set.
- Optional and non-blocking behavior.
- Fast response on refresh.
Should:
- Respect session controls.
- Cap same-skill repetition in one set.
Could:
- Show low-confidence hint.
Risks + mitigation:
1. Repetition fatigue -> add topic/skill cap and freshness quota.
2. Sparse inventory -> nearest-ladder fallback.
3. Over-control complexity -> keep controls at 3 only.

4) Validation (v1)

H1: Context-aware ranking increases recommendation start rate.
- Test: A/B context-aware vs baseline.
- Pass: +8% recommendation_start_rate and no day7_active_rate drop >2%.
- Fail: lift <3% or retention drop >2%.
H2: Session controls reduce irrelevant skips.
- Test: controls on vs off.
- Pass: -6% recommendation_ignore_rate.
- Fail: ignore unchanged or worse.
H3: One-line reason improves trust.
- Test: reason label on vs off.
- Pass: +5% recommendation_item_click_through_rate.
- Fail: <2% lift.

Self-score (1-5)

Clarity: 4
Simplicity: 4
Feasible: 5
Differentiation: 3
Testable: 4

DoD checklist (>=8)

8-10 ideas created: Dat
Exactly 3 directions: Dat
Top direction selected with 5 criteria: Dat
Value prop one sentence: Dat
User/problem explicit: Dat
Scope in/out explicit: Dat
Flow 5-7 steps: Dat
Must/Should/Could included: Dat
3 risks + mitigation included: Dat
3 hypotheses with pass/fail metrics: Dat
Metric mapping to goal explicit: Chua dat (missing explicit mapping table)
Differentiation >=4: Chua dat (score 3)

Result: iteration 1 NOT PASS (score<4 and DoD has gaps).

Iteration 2 (refined)

1) Updated direction positioning (to raise differentiation)

Top direction renamed: Adaptive Practice Copilot (Rule-based). Differentiation anchor:

Competitors often provide static lists; this direction gives adaptive-but-simple guidance that stays optional, explainable, and entitlement-aware.

2) Final concept draft

Value prop: Adaptive Practice Copilot giúp user luôn có “bước tiếp theo phù hợp nhất” theo chương trình và ngữ cảnh hiện tại, nhưng vẫn giữ toàn quyền lựa chọn.
User/problem:
- User segment: self-study learners + paid learners in mixed consistency states.
- Core problem: choice overload + low confidence in what to practice next, especially after finishing one attempt.
Scope in:
- Program-adaptive candidate resolution.
- Continuity-first ranking after submit.
- Session controls: skill/difficulty/duration.
- Explainability: one-line primary reason.
- Fallback ladder when inventory/entitlement constraints occur.
Scope out:
- Mandatory curriculum.
- New payment plans.
- New scoring rubric.
Core flow (7):
1. User reaches recommendation surface (result/home/management).
2. Context resolver reads current signals (program intent, recent attempts, streak state, entitlement).
3. Candidate generator filters by available-now and access rights.
4. Ranker applies: continuity -> habit-fit -> freshness/diversity -> lock teaser cap.
5. UI renders dynamic set (3-7) + one-line reason + quick controls.
6. User action:
  - Start item -> open attempt directly.
  - Adjust controls -> rerank in-session.
  - Ignore item(s) -> adaptive strategy shift after ignore streak threshold.
7. Telemetry logs action/outcome; next refresh uses latest interaction memory (session TTL).
Must:
- Optional guidance only; no penalty.
- Set always actionable even with thin inventory.
- Stable and explainable ranking order.
- Refresh triggers on submit + manual.
Should:
- Same-skill cap in one cluster.
- Ignore-streak adaptation.
- Entitlement-aware composition with minimal lock noise.
Could:
- Confidence badge shown only when low-confidence.
Risks + mitigation:
1. Risk: Low inventory causes poor relevance.
  - Mitigation: nearest-ladder fallback + transparent reason tag.
2. Risk: User sees too many locked items and churns.
  - Mitigation: lock teaser cap, prioritize available-now first.
3. Risk: Logic drift becomes hard to maintain.
  - Mitigation: contractized ranking order and weekly QA snapshot.

3) Validation (final)

H1: Adaptive continuity ranking improves immediate practice continuation.
- Test A: A/B adaptive continuity vs static mixed list.
  - Pass: +10% post_result_start_within_10m_rate.
  - Fail: <4% lift.
- Test B: holdout cohort by program.
  - Pass: lift is positive in >=70% program cohorts.
  - Fail: positive in <50% cohorts.
H2: Explainability + controls reduce irrelevant skipping without harming simplicity.
- Test A: reason label + controls vs reason-only.
  - Pass: ignore_rate down >=7%, manual_refresh_per_session not up >10%.
  - Fail: ignore drop <3% or refresh spike >10%.
- Test B: usability smoke (5-task completion).
  - Pass: >=85% complete task without assistance.
  - Fail: <75%.
H3: Entitlement-aware composition improves conversion quality and learner trust.
- Test A: available-now-first + lock cap vs no cap.
  - Pass: +8% upgrade_from_lock_click_rate with no rise in recommendation_bounce_rate >3%.
  - Fail: conversion flat and bounce +>3%.
- Test B: qualitative pulse survey (in-product quick poll).
  - Pass: >=70% users choose “suggestion useful/relevant”.
  - Fail: <55%.

Metric mapping (explicit)

User/Problem: “khong biet hoc tiep gi” -> Outcome: start nhanh sau ket qua -> Metric: post_result_start_within_10m_rate -> Experiment: H1A/H1B.
User/Problem: “goi y khong phu hop” -> Outcome: giam bo qua -> Metric: recommendation_ignore_rate, manual_refresh_per_session -> Experiment: H2A.
User/Problem: “bi kho chiu vi lock” -> Outcome: tang chap nhan nang cap + giu trust -> Metric: upgrade_from_lock_click_rate, recommendation_bounce_rate -> Experiment: H3A/H3B.

Self-score (1-5)

Clarity: 5
Simplicity: 4
Feasible: 5
Differentiation: 4
Testable: 5

DoD checklist (>=8)

8-10 ideas created: Dat
Exactly 3 directions: Dat
Top direction selected with impact/feasible/diff/risk/scope: Dat
Value prop one sentence: Dat
User/problem explicit: Dat
Scope in/out explicit: Dat
Flow 5-7 steps: Dat
Must/Should/Could included: Dat
3 risks + mitigation included: Dat
3 hypotheses included: Dat
Each hypothesis has pass/fail criteria: Dat
Metrics map to outcomes explicitly: Dat

Result: iteration 2 PASS.

Change log (3 lines)

Raised differentiation by formalizing unique “Adaptive Practice Copilot” positioning and competitor contrast.
Added explicit User/Problem -> Outcome -> Metric -> Experiment mapping to close testability gap.
Tightened risk controls with lock-teaser cap + contractized ranking order for long-term maintainability.