Kontext Embeddings Layer Design
Architecture for semantic context injection feeding both API and CLI paths
Core Concept
Kontext is a semantic intelligence layer that sits between incoming requests and the two LLM paths. For each request it answers "What context does this request need?" and injects the relevant knowledge into either Quick Turn (API) or Full Session (CLI).
Architecture
┌─────────────────────────────────────────────────────────────┐
│ User Request │
└─────────────────────────┬───────────────────────────────────┘
▼
┌─────────────────────────────────────────────────────────────┐
│ KONTEXT (Embeddings Layer) │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────────────┐ │
│ │ Classifier │ │ Retriever │ │ Context Builder │ │
│ │ (intent) │→ │ (search) │→ │ (assemble prompt) │ │
│ └─────────────┘ └─────────────┘ └─────────────────────┘ │
│ ▲ ▲ │
│ │ │ │
│ ┌──────┴────────────────┴──────────────────────────────┐ │
│ │ Vector Store (SQLite + vec) │ │
│ │ • Code chunks • CLAUDE.md • Kontask history │ │
│ │ • File summaries • VIBE.md • Tool outputs │ │
│ └──────────────────────────────────────────────────────┘ │
└─────────────────────────┬───────────────────────────────────┘
│
┌──────────────┴──────────────┐
▼ ▼
┌─────────────────────┐ ┌─────────────────────┐
│ Quick Turn (API) │ │ Full Session (CLI) │
│ + injected context │ │ + injected context │
│ → fast response │ │ → tool access │
└─────────────────────┘ └─────────────────────┘
Components
| Component | Role |
|---|---|
| 1. Classifier | Fast intent detection: Is this a code question? Config? Listing data? Quick fact? Determines retrieval strategy. |
| 2. Retriever | Semantic search via embeddings. Finds relevant code chunks, docs, past kontasks, tool outputs. |
| 3. Context Builder | Assembles retrieved chunks into a coherent context block. Handles token budgets. |
| 4. Vector Store | SQLite with sqlite-vec extension. Local, fast, no external dependencies. |
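A minimal sketch of how the Classifier could steer the Retriever, assuming a keyword heuristic for intent and an in-memory cosine-similarity search. The intent labels, keyword lists, source mapping, and the `Chunk` type are all hypothetical; the design names the intent categories but does not say how they are detected or which sources each one searches.

```python
import math
from dataclasses import dataclass

# Hypothetical keyword cues per intent (assumption, not part of the design).
INTENT_KEYWORDS = {
    "code": ("function", "class", "bug", "implement", "refactor"),
    "config": ("config", "setting", "env", "port"),
    "listing": ("list", "show all", "which files"),
}

# Hypothetical retrieval strategy: which sources each intent searches.
SOURCES_BY_INTENT = {
    "code": ["code", "docs"],
    "config": ["docs"],
    "listing": ["kontasks"],
    "quick_fact": ["docs", "kontasks"],
}


def classify(prompt: str) -> str:
    """Return the first intent whose keywords appear in the prompt."""
    text = prompt.lower()
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return intent
    return "quick_fact"  # fallback when no keyword matches


@dataclass
class Chunk:
    source: str  # "code", "docs", or "kontasks"
    text: str
    embedding: list[float]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_embedding, chunks, sources, top_k=3):
    """Score chunks from the allowed sources and return the best top_k
    as (score, chunk) pairs, highest score first."""
    candidates = [c for c in chunks if c.source in sources]
    scored = [(cosine(query_embedding, c.embedding), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]
```

In a real build the linear scan in `retrieve` would presumably be replaced by a query against the vector store; the sketch only shows the data flow from intent to filtered search.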
What Gets Embedded
| Source | Chunking Strategy | Update Trigger |
|---|---|---|
| Code files | Function/class boundaries | Git commit |
| CLAUDE.md / VIBE.md | Section headers | File change |
| Kontask outputs | Full output + summary | Kontask complete |
| Tool results | Condensed summaries | Tool execution |
| Conversations | Turn summaries | Session end |
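The "section headers" strategy for CLAUDE.md / VIBE.md could look roughly like the sketch below. It assumes ATX-style `#` headings and treats each heading plus its body as one chunk; the function name and the returned dict shape are hypothetical.

```python
import re

# Matches ATX headings such as "# Title" or "### Sub-section".
# Caveat: a "# comment" line inside a fenced code block would also match;
# a fuller version would need to track fences.
HEADER = re.compile(r"^#{1,6}\s+(.*)$")


def chunk_by_headers(markdown: str) -> list[dict]:
    """Split a markdown document into one chunk per heading.

    Text before the first heading is kept under the title "(preamble)".
    Sections with no body text are skipped.
    """
    chunks = []
    title, lines = "(preamble)", []

    def flush():
        body = "\n".join(lines).strip()
        if body:
            chunks.append({"title": title, "text": body})

    for line in markdown.splitlines():
        match = HEADER.match(line)
        if match:
            flush()
            title, lines = match.group(1).strip(), []
        else:
            lines.append(line)
    flush()
    return chunks
```

Each returned chunk would then be embedded and stored, and re-generated whenever the file changes.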
Embedding Model Options
| Provider | Model and notes |
|---|---|
| Voyage AI | voyage-code-2 - best for code, $0.10/1M tokens |
| OpenAI | text-embedding-3-small - general purpose, cheap |
| Local (nomic) | nomic-embed-text - runs locally, no API cost |
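Since the model choice is still open, one option is to hide it behind a small interface so the rest of Kontext does not depend on a provider. The sketch below is an assumption, not part of the design: `Embedder` is a hypothetical protocol, and `HashingEmbedder` is a toy stand-in for local testing, not one of the models listed above and not semantically meaningful.

```python
import hashlib
import math
from typing import Protocol


class Embedder(Protocol):
    """Hypothetical interface that any provider wrapper would satisfy."""

    dimensions: int

    def embed(self, texts: list[str]) -> list[list[float]]: ...


class HashingEmbedder:
    """Toy embedder: hashes each word into a bucket and L2-normalizes.

    Useful only for wiring up and testing the pipeline without API calls.
    """

    def __init__(self, dimensions: int = 64):
        self.dimensions = dimensions

    def embed(self, texts: list[str]) -> list[list[float]]:
        vectors = []
        for text in texts:
            vec = [0.0] * self.dimensions
            for word in text.lower().split():
                digest = hashlib.sha256(word.encode("utf-8")).digest()
                bucket = int.from_bytes(digest[:4], "big") % self.dimensions
                vec[bucket] += 1.0
            norm = math.sqrt(sum(v * v for v in vec))
            # Leave the zero vector untouched for empty input.
            vectors.append([v / norm for v in vec] if norm else vec)
        return vectors
```

A wrapper around any of the listed models would expose the same `embed` method, so switching providers would mean re-embedding the store but not changing the retriever.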
API Design
// Request context for a prompt
POST /api/kontext/retrieve
{
  "prompt": "How does the Quick Turn mode work?",
  "maxTokens": 4000,
  "sources": ["code", "docs", "kontasks"]  // optional filter
}

// Response: assembled context
{
  "context": "## Relevant Code\n...",
  "sources": [{"file": "...", "score": 0.89}],
  "tokensUsed": 2341
}
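To illustrate how the Context Builder might honor `maxTokens` and produce the response shape above, here is a rough sketch. The 4-characters-per-token estimate, the function names, and the input chunk format are assumptions; a real implementation would likely use the target model's tokenizer.

```python
def estimate_tokens(text: str) -> int:
    """Rough heuristic of about 4 characters per token (an assumption,
    not a real tokenizer)."""
    return max(1, len(text) // 4)


def build_response(ranked_chunks: list[dict], max_tokens: int) -> dict:
    """Greedily pack the best-ranked chunks into the token budget.

    ranked_chunks: dicts with "file", "score", and "text", best first.
    Returns a dict shaped like the /api/kontext/retrieve response.
    """
    parts, sources, used = [], [], 0
    for chunk in ranked_chunks:
        block = f"## {chunk['file']}\n{chunk['text']}"
        cost = estimate_tokens(block)
        if used + cost > max_tokens:
            continue  # too big for the remaining budget; a later chunk may fit
        parts.append(block)
        sources.append({"file": chunk["file"], "score": chunk["score"]})
        used += cost
    return {
        "context": "\n\n".join(parts),
        "sources": sources,
        # Estimated total; separators between blocks are not counted.
        "tokensUsed": used,
    }
```

Skipping an oversized chunk rather than stopping at it lets smaller, lower-ranked chunks still use the remaining budget; truncating the oversized chunk instead would be another reasonable choice.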
Implementation Path
- Phase 1: SQLite + sqlite-vec setup, embed CLAUDE.md/VIBE.md
- Phase 2: Code file chunking and embedding on git hooks
- Phase 3: Integrate with Quick Turn - inject context before API call
- Phase 4: Integrate with CLI - generate context file for --system-prompt
- Phase 5: Kontask/conversation memory for long-term recall
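As a starting point for Phase 1, the sketch below stores chunks and their vectors using only Python's standard `sqlite3` module, with embeddings packed as float32 BLOBs and a brute-force cosine scan. This is a stand-in under stated assumptions: it does not load or call sqlite-vec, and the table layout and function names are hypothetical. With sqlite-vec in place, the scan in `search` would presumably be replaced by the extension's vector query.

```python
import math
import sqlite3
import struct


def pack(vector: list[float]) -> bytes:
    """Serialize a vector as consecutive 32-bit floats."""
    return struct.pack(f"{len(vector)}f", *vector)


def unpack(blob: bytes) -> list[float]:
    """Inverse of pack(): 4 bytes per float32 value."""
    return list(struct.unpack(f"{len(blob) // 4}f", blob))


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the chunk store. Table layout is hypothetical."""
    db = sqlite3.connect(path)
    db.execute(
        """CREATE TABLE IF NOT EXISTS chunks (
               id INTEGER PRIMARY KEY,
               source TEXT NOT NULL,
               text TEXT NOT NULL,
               embedding BLOB NOT NULL
           )"""
    )
    return db


def add_chunk(db, source: str, text: str, vector: list[float]) -> None:
    db.execute(
        "INSERT INTO chunks (source, text, embedding) VALUES (?, ?, ?)",
        (source, text, pack(vector)),
    )
    db.commit()


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(db, query: list[float], top_k: int = 3):
    """Brute-force scan: return (score, source, text) tuples, best first."""
    rows = db.execute("SELECT source, text, embedding FROM chunks").fetchall()
    scored = [(cosine(query, unpack(blob)), source, text)
              for source, text, blob in rows]
    scored.sort(key=lambda row: row[0], reverse=True)
    return scored[:top_k]
```

A brute-force scan is adequate for a handful of CLAUDE.md / VIBE.md sections; it is the code-chunk volume in Phase 2 that would make an indexed vector search worthwhile.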
Details
Type: General
Status: Completed
Scope: vibetools
Tags: kontext, architecture, embeddings
Created: 5 Jan 2026, 2:07 pm
Updated: 5 Jan 2026, 2:07 pm
Timing: 60.4s total, 1 tool call (TTFE 11%, Thinking 88%, Tools 1%)
Created By: claude
Raw Data
{
"id": "26af35d9-3707-41df-a5c7-86d84e2bfae2",
"type": "general",
"status": "completed",
"title": "Kontext Embeddings Layer Design",
"description": "Architecture for semantic context injection feeding both API and CLI paths",
"context": {
"requestedAt": "2026-01-05T03:52:00Z",
"requestId": "ede2600e-1396-41d5-b843-3c3e905116e9",
"choices": [
{
"label": "Start Phase 1",
"value": "Let's start implementing kontext Phase 1 - SQLite + sqlite-vec setup with CLAUDE.md/VIBE.md embedding"
},
{
"label": "Add to backlog",
"value": "Add kontext embeddings layer to the VIBE.md backlog with this design"
},
{
"label": "Explore alternatives",
"value": "What other approaches could we consider for context management?"
}
],
"turnTiming": {
"totalMs": 60423,
"ttfeMs": 6494,
"thinkingMs": 53299,
"toolExecutionMs": 627,
"toolCallCount": 1,
"thinkingPct": 88,
"toolsPct": 1,
"ttfePct": 11
}
},
"createdBy": "claude",
"createdAt": "2026-01-05T04:07:27.302Z",
"updatedAt": "2026-01-05T04:07:35.027Z",
"requestId": "ede2600e-1396-41d5-b843-3c3e905116e9",
"scope": "vibetools",
"tags": [
"kontext",
"architecture",
"embeddings"
],
"targetUser": "claude"
}