Hybrid Knowledge Ingestion Implementation Plan
Execution note: Use executing-plans to implement this plan task-by-task.
Goal: Build a private-first ingestion flow that extracts Google Docs/Sheets content, normalizes it, and writes structured markdown drafts into this repository.
Architecture: Use a single CLI (scripts/kb-ingest.mjs) with per-source auth mode (oauth_refresh_token, oauth_access_token, or service_account). Convert raw Google payloads into a stable normalized model via scripts/kb-ingest-lib.mjs, then write staging docs plus raw cache for traceability.
Tech Stack: Node.js (ESM), googleapis, built-in Node test runner (node --test).
Task 1: Create Test Coverage for Normalization Layer
Files:
- Test: tests/kb-ingest/normalize.test.mjs
Steps:
- Add tests for slugify, extractGoogleLinks, classifyContent, docsToMarkdown, sheetsToMarkdown.
- Run node --test tests/kb-ingest/normalize.test.mjs and verify failure when implementation is missing.
Task 2: Implement Normalization Library
Files:
- Create: scripts/kb-ingest-lib.mjs
Steps:
- Implement the pure utility functions used by the ingestion pipeline.
- Re-run normalization tests until all pass.
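To make "pure utility functions" concrete, here is a minimal sketch of two of the helpers named in Task 1. The signatures, return shapes, and markdown layout are assumptions for illustration; only the function names and the gdoc/gsheet type labels come from this plan. The input to sheetsToMarkdown is assumed to be a 2D array of cell values whose rows may be shorter than the header.

```javascript
// Hypothetical sketches: signatures and output formats are assumptions.

// Collect unique Google Docs/Sheets links found in free text.
function extractGoogleLinks(text) {
  const pattern =
    /https:\/\/docs\.google\.com\/(document|spreadsheets)\/d\/([A-Za-z0-9_-]+)/g;
  const seen = new Map();
  for (const [url, kind, id] of String(text).matchAll(pattern)) {
    // Keep the first occurrence of each file id.
    if (!seen.has(id)) {
      seen.set(id, { id, url, type: kind === "document" ? "gdoc" : "gsheet" });
    }
  }
  return [...seen.values()];
}

// Render a 2D array of cell values (first row = header) as a markdown table.
function sheetsToMarkdown(rows) {
  if (!Array.isArray(rows) || rows.length === 0) return "";
  const width = Math.max(...rows.map((row) => row.length));
  const renderRow = (row) => {
    // Pad short rows and escape characters that would break the table.
    const cells = Array.from({ length: width }, (_, i) =>
      String(row[i] ?? "").replace(/\|/g, "\\|").replace(/\r?\n/g, " ")
    );
    return `| ${cells.join(" | ")} |`;
  };
  const [header, ...body] = rows;
  const divider = `| ${Array(width).fill("---").join(" | ")} |`;
  return [renderRow(header), divider, ...body.map(renderRow)].join("\n");
}

console.log(sheetsToMarkdown([["Name", "Owner"], ["Runbook", "Ops"], ["FAQ"]]));
```

Because neither function touches the network or the filesystem, both can be tested with plain inputs and no Google credentials.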
Task 3: Implement Hybrid Ingestion CLI
Files:
- Create: scripts/kb-ingest.mjs
Steps:
- Add argument parsing and config loading.
- Add auth client builder covering each auth mode (oauth_refresh_token, oauth_access_token, service_account).
- Add extractors for gdoc and gsheet.
- Add staging markdown + raw cache outputs.
- Add summary/index outputs.
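The argument-parsing step could be sketched with Node's built-in util.parseArgs, as below. The flag names (--config, --dry-run, --limit, --help) are the ones used in this plan's verification commands. The returned object shape, the rule that --config is required, and the error messages are assumptions for illustration.

```javascript
import { parseArgs } from "node:util";

// Parse the flags that appear in this plan's verification commands.
// Validation rules and the returned shape are assumptions.
function parseCliArgs(argv) {
  const { values } = parseArgs({
    args: argv,
    options: {
      config: { type: "string" },
      "dry-run": { type: "boolean" },
      limit: { type: "string" },
      help: { type: "boolean" },
    },
    strict: true, // reject unknown flags instead of silently ignoring them
  });

  const help = values.help ?? false;

  let limit = Infinity; // no --limit means "process every source"
  if (values.limit !== undefined) {
    limit = Number(values.limit);
    if (!Number.isInteger(limit) || limit < 1) {
      throw new Error(`--limit must be a positive integer, got "${values.limit}"`);
    }
  }
  if (!help && !values.config) {
    throw new Error("--config <path> is required");
  }
  return {
    configPath: values.config,
    dryRun: values["dry-run"] ?? false,
    limit,
    help,
  };
}

const options = parseCliArgs([
  "--config", "config/knowledge-ingest/sources.example.json",
  "--dry-run",
  "--limit", "1",
]);
console.log(options);
```

In the CLI itself the argument list would come from process.argv.slice(2). Keeping the parser a function of its input array makes it testable without spawning a process.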
Task 4: Add Config Template and Security Guardrails
Files:
- Create: config/knowledge-ingest/sources.example.json
- Modify: .gitignore
Steps:
- Provide config template with hybrid examples.
- Ignore local secrets/config/cache paths.
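One possible shape for a hybrid config template, with a matching validation helper, is sketched below. Every field name (sources, id, type, fileId, auth, tokenEnv, keyFileEnv) and both environment variable names are hypothetical; only the auth mode names and the gdoc/gsheet source types come from this plan. Referencing secrets by environment variable name rather than inlining them is one way to keep the committed example file free of credentials.

```javascript
// Hypothetical config shape: field names are illustrative, not from the plan.
// Only the auth mode and source type names are taken from the plan itself.
const AUTH_MODES = ["oauth_refresh_token", "oauth_access_token", "service_account"];
const SOURCE_TYPES = ["gdoc", "gsheet"];

const exampleConfig = {
  sources: [
    {
      id: "ops-handbook",
      type: "gdoc",
      fileId: "<google-doc-id>",
      // Secrets are referenced by environment variable name, never inlined.
      auth: { mode: "oauth_refresh_token", tokenEnv: "KB_GOOGLE_REFRESH_TOKEN" },
    },
    {
      id: "vendor-list",
      type: "gsheet",
      fileId: "<google-sheet-id>",
      auth: { mode: "service_account", keyFileEnv: "KB_GOOGLE_SA_KEY_FILE" },
    },
  ],
};

// Return a list of human-readable problems; an empty list means no issues found.
function validateConfig(config) {
  if (!Array.isArray(config?.sources)) {
    return ["config.sources must be an array"];
  }
  const problems = [];
  config.sources.forEach((source, index) => {
    const label = source?.id ?? `sources[${index}]`;
    if (!SOURCE_TYPES.includes(source?.type)) {
      problems.push(`${label}: unknown type "${source?.type}"`);
    }
    if (!AUTH_MODES.includes(source?.auth?.mode)) {
      problems.push(`${label}: unknown auth mode "${source?.auth?.mode}"`);
    }
  });
  return problems;
}

console.log(validateConfig(exampleConfig)); // prints []
```

Returning a list of problems instead of throwing on the first one lets a dry run report every misconfigured source in a single pass.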
Task 5: Integrate NPM Commands and Documentation
Files:
- Modify: package.json
- Create: docs/Shared/Operations/Knowledge/NB_Ops_Knowledge_Ingestion_Workflow.md
Steps:
- Add kb:ingest, kb:ingest:dry, and test scripts.
- Document setup, auth, execution, output paths, and review workflow.
Task 6: Verification
Files:
- Verify changed files
Steps:
- Run node --test tests/kb-ingest/normalize.test.mjs.
- Run node scripts/kb-ingest.mjs --help.
- Run node scripts/kb-ingest.mjs --config config/knowledge-ingest/sources.example.json --dry-run --limit 1 (expected auth/config failure is acceptable; parser + flow should execute).
- Report evidence and remaining gaps.