RAG & Semantic Search

ChromaDB-powered vector search across project files, memories, and error patterns. Semantic file ranking, hybrid error matching, and intelligent context enrichment.


RAG capabilities

Project file indexing

Background indexing embeds the first 2 KB of each project file into ChromaDB using nomic-embed-text. Indexing is incremental: only changed files are re-indexed.
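As a rough illustration of incremental indexing, the sketch below decides which files need re-embedding by hashing the same 2 KB head that gets embedded. The function names (`read_head`, `plan_reindex`) and the hash-based change check are assumptions made for this example; the product may detect changes differently (for instance by modification time).

```python
import hashlib
from pathlib import Path

HEAD_BYTES = 2048  # the "first 2 KB" that is embedded per file


def read_head(path: Path) -> bytes:
    """Read only the leading bytes of a file (hypothetical helper)."""
    with path.open("rb") as f:
        return f.read(HEAD_BYTES)


def plan_reindex(heads: dict, seen_hashes: dict) -> list:
    """Return the paths whose head changed since the last run.

    heads:       {path: leading bytes of the file}
    seen_hashes: {path: sha256 hex digest from the previous run},
                 updated in place so the next call sees the new state.
    """
    changed = []
    for path, head in heads.items():
        digest = hashlib.sha256(head[:HEAD_BYTES]).hexdigest()
        if seen_hashes.get(path) != digest:
            seen_hashes[path] = digest
            changed.append(path)  # only these would be re-embedded
    return changed
```

Only the paths returned by `plan_reindex` would be sent to the embedding model, so an unchanged workspace costs a hash per file and no embedding calls.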

Semantic file ranking

When building prompts, files are ranked by semantic relevance to the feature request. Narrows 500+ workspace files to the ~30 most relevant.
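A minimal sketch of the ranking step, assuming file and request embeddings are already available as plain lists of floats. Cosine similarity is used here as an assumed relevance measure, and `rank_files` and its `top_k` parameter are illustrative names, not the product's API.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_files(query_vec, file_vecs, top_k=30):
    """Return the top_k file paths, most similar to the request first.

    file_vecs: {path: embedding vector}
    """
    scored = sorted(
        file_vecs.items(),
        key=lambda item: cosine(query_vec, item[1]),
        reverse=True,
    )
    return [path for path, _ in scored[:top_k]]
```

With the default `top_k=30`, a workspace of several hundred files is cut to the 30 highest-scoring paths before the prompt is built.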

Memory semantic search

Search across error patterns, lessons, conventions, and facts using vector similarity. Combines global and per-project memory collections.

Hybrid error matching

When a step fails, the error is matched against known error patterns using both regex (fast, precise) and vector search (semantic). The two result sets are merged, and each match carries a confidence score.
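One way the merge could work is sketched below. The scoring rule is an assumption for illustration: a regex hit gets a fixed confidence of 1.0, a vector hit uses its similarity score, and a pattern found by both keeps the higher value. The `Match` type and `hybrid_match` function are hypothetical names.

```python
import re
from dataclasses import dataclass


@dataclass
class Match:
    pattern_id: str
    confidence: float
    source: str  # "regex", "vector", or "both"


def hybrid_match(error_text, regex_patterns, vector_hits, regex_confidence=1.0):
    """Merge regex and vector matches into one list, best match first.

    regex_patterns: {pattern_id: regular expression string}
    vector_hits:    {pattern_id: similarity in [0, 1]} from a vector search
    """
    merged = {}
    for pattern_id, regex in regex_patterns.items():
        if re.search(regex, error_text):
            merged[pattern_id] = Match(pattern_id, regex_confidence, "regex")
    for pattern_id, similarity in vector_hits.items():
        if pattern_id in merged:
            best = max(merged[pattern_id].confidence, similarity)
            merged[pattern_id] = Match(pattern_id, best, "both")
        else:
            merged[pattern_id] = Match(pattern_id, similarity, "vector")
    return sorted(merged.values(), key=lambda m: m.confidence, reverse=True)
```

Keeping the `source` field makes it possible to show why a pattern was suggested: an exact regex hit, a semantic neighbor, or both.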

Memory across the pipeline

Project knowledge is injected into every phase of the orchestration pipeline, not just planning.

Planning phase

When generating the task prompt, the system semantically searches memories for conventions, lessons, and facts relevant to the feature request. These are included as a 'Project Knowledge' section so the coder model respects established patterns.
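A sketch of how retrieved memories might be rendered into that section. The heading markup, the bullet format, and the `max_items` cap are assumptions for this example; only the section name 'Project Knowledge' comes from the description above.

```python
def project_knowledge_section(memories, max_items=10):
    """Render ranked memories as a prompt section.

    memories: list of (memory_type, text) tuples, most relevant first.
    Returns an empty string when there is nothing to inject.
    """
    if not memories:
        return ""
    lines = ["## Project Knowledge"]
    for memory_type, text in memories[:max_items]:
        lines.append(f"- [{memory_type}] {text}")
    return "\n".join(lines)
```

Returning an empty string for an empty list lets the caller concatenate the section unconditionally without leaving a dangling heading in the prompt.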

Pre-execution step review

Before executing proposed steps, the reviewer model receives relevant memories alongside the safety checklist. Known conventions help catch violations before any code runs.

Post-execution review

After execution, the reviewer evaluates results against acceptance criteria with full memory context. Lessons from past reviews inform the verdict, for example by flagging patterns that violated known conventions.

Auto-fix error recall

When a step fails, hybrid regex + vector search matches the error against known patterns. Solutions that worked before are included in the fix prompt with confidence scores and success rates.

Memory loading is visible in the pipeline graph — each phase shows a purple chip with the number of memories injected into its prompt.

How it integrates

Feature request enrichment

When you create a request, vector pre-filtering identifies the most relevant files. The LLM then classifies them as target (to modify) or reference (for context).

Embedding model routing

Embedding inference is routed separately from code generation. Run nomic-embed-text on a different host to keep your GPU free for generation.
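The routing idea can be pictured as a small lookup from task kind to host and model. Everything in this sketch except the model name nomic-embed-text is a placeholder: the host URLs, the `ModelRoute` type, and the generation model name are hypothetical and do not reflect the product's actual configuration format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelRoute:
    host: str
    model: str


# Hypothetical hosts: embeddings run on a separate machine so the GPU box
# only serves code generation.
ROUTES = {
    "embedding": ModelRoute("http://embed-host.example:8000", "nomic-embed-text"),
    "generation": ModelRoute("http://gpu-host.example:8000", "your-coder-model"),
}


def route_for(task: str) -> ModelRoute:
    """Look up where a task should be sent; raises KeyError for unknown tasks."""
    return ROUTES[task]
```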

Graceful degradation

All RAG features are optional. If ChromaDB is down, the system falls back to text search. If embeddings fail, file lists pass through unranked.
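The two fallbacks described above can be sketched as thin wrappers. The function names are hypothetical, the plain-text fallback is shown as a case-insensitive substring match, and the broad `except Exception` is a simplification; a real implementation would likely catch narrower error types.

```python
def search_with_fallback(query, documents, vector_search):
    """Try vector search first; fall back to substring search if it fails.

    documents:     {doc_id: text}
    vector_search: callable(query) -> list of doc_ids (may raise if the
                   vector store is unreachable)
    """
    try:
        return vector_search(query)
    except Exception:
        needle = query.lower()
        return [doc_id for doc_id, text in documents.items() if needle in text.lower()]


def rank_or_passthrough(files, rank):
    """Return rank(files), or the original order if ranking fails."""
    try:
        return rank(files)
    except Exception:
        return list(files)
```

In both cases the caller gets a usable list either way, so the rest of the pipeline does not need to know whether the vector store or the embedding model was available.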

Memory types

Four memory types: conventions (coding standards), lessons (learnings from past runs), facts (project knowledge), and error patterns (regex + solution steps).
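For illustration, the four types could be modeled as below. The class and field names are assumptions chosen for this sketch; only the four categories and the fact that error patterns carry a regex plus solution steps come from the description above.

```python
from dataclasses import dataclass, field
from enum import Enum


class MemoryType(Enum):
    CONVENTION = "convention"        # coding standards
    LESSON = "lesson"                # learnings from past runs
    FACT = "fact"                    # project knowledge
    ERROR_PATTERN = "error_pattern"  # regex + solution steps


@dataclass
class Memory:
    type: MemoryType
    text: str


@dataclass
class ErrorPattern(Memory):
    """An error-pattern memory with the extra fields it needs for matching."""
    regex: str = ""
    solution_steps: list[str] = field(default_factory=list)
```

A usage example under the same assumptions:

```python
pattern = ErrorPattern(
    type=MemoryType.ERROR_PATTERN,
    text="Missing Python dependency",
    regex=r"No module named '(\w+)'",
    solution_steps=["Add the package to requirements.txt", "Reinstall dependencies"],
)
```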