Overview
The vector store is the retrieval layer of the platform, built on Qdrant. It stores document chunks as dense vectors with rich metadata payloads and supports three search modes: vector (semantic similarity), keyword (full-text match), and hybrid (both fused via Reciprocal Rank Fusion). All searches are scoped by user and optionally by space, feed, or document metadata filters.
The store supports multiple embedding models simultaneously via model-specific collections. When a search runs, it queries all collections in parallel and fuses results across them, so embedding model upgrades do not invalidate existing vectors.
Key Concepts
- Model-specific collections — Each embedding model gets its own Qdrant collection (e.g., chunks__text_embedding_3_small_1536), enabling side-by-side model comparison and zero-downtime upgrades.
- Rich payload — Every point carries document metadata (category, type, data structure), source and feed IDs, and the chunk text itself, enabling pre-filtered search.
- Three search modes — Vector, keyword, and hybrid search are available, with hybrid as the default.
- RRF fusion — Hybrid search uses Qdrant's built-in Reciprocal Rank Fusion to combine vector and keyword results.
- Cross-collection fusion — Multi-model searches fuse results across collections using RRF with k=60.
- Feed scoping via PostgreSQL — Feed-to-document resolution happens in PostgreSQL (the authoritative source) before filtering in Qdrant.
- Optional reranking — Results can be reranked via Cohere, Jina, or a local cross-encoder model after initial retrieval.
Data Model
Collection Naming
Collections follow the pattern {base}__{model_name}_{dimensions}, where the base name comes from the QDRANT_COLLECTION env var (default: chunks).
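As an illustration of the naming pattern, here is a minimal sketch. The helper name `collectionNameFor` and the normalisation rule (replacing runs of non-alphanumeric characters with an underscore) are assumptions chosen to reproduce the example name above; the actual rule in the codebase may differ.

```typescript
// Sketch: derive a collection name of the form {base}__{model_name}_{dimensions}.
// Assumption: the model name is normalised by replacing every run of
// non-alphanumeric characters with "_". The real implementation may differ.
function collectionNameFor(
  base: string,
  modelName: string,
  dimensions: number,
): string {
  const normalised = modelName.replace(/[^A-Za-z0-9]+/g, "_");
  return `${base}__${normalised}_${dimensions}`;
}

// "chunks" is the documented default for QDRANT_COLLECTION.
console.log(collectionNameFor("chunks", "text-embedding-3-small", 1536));
// prints "chunks__text_embedding_3_small_1536"
```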
Payload Structure (ChunkPayload)
| Field | Type | Description |
|---|---|---|
| content | text | Chunk text (indexed for full-text keyword search) |
| document_id | keyword | Parent document ID |
| user_id | keyword | Owner user ID |
| space_id | keyword | Parent space ID |
| chunk_index | integer | Position within the document |
| filename | string | Original filename |
| start_char | integer | Start character offset in source text |
| end_char | integer | End character offset in source text |
| doc_title | string | Extracted document title |
| doc_category | string | Extracted category (from metadata extraction) |
| doc_document_type | string | Extracted document type |
| doc_data_structure | string | Data structure classification (prose, tabular, mixed, structured) |
| source_ids | keyword[] | Associated source IDs |
| feed_ids | keyword[] | Associated feed IDs |
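
For readers who prefer types to tables, the payload can be approximated as a TypeScript interface. This is a sketch derived only from the table above: the name `ChunkPayloadSketch` is hypothetical, and treating every field as required is an assumption, since the table does not say which fields may be absent.

```typescript
// Sketch of the payload shape described in the table.
// Assumptions: keyword/string/text map to string, integer maps to number,
// and every field is required. The real ChunkPayload type may differ.
interface ChunkPayloadSketch {
  content: string;
  document_id: string;
  user_id: string;
  space_id: string;
  chunk_index: number;
  filename: string;
  start_char: number;
  end_char: number;
  doc_title: string;
  doc_category: string;
  doc_document_type: string;
  doc_data_structure: "prose" | "tabular" | "mixed" | "structured";
  source_ids: string[];
  feed_ids: string[];
}

// Illustrative values only; none of these IDs come from a real system.
const samplePayload: ChunkPayloadSketch = {
  content: "Quarterly revenue grew in all regions.",
  document_id: "doc-1",
  user_id: "user-1",
  space_id: "space-1",
  chunk_index: 0,
  filename: "report.pdf",
  start_char: 0,
  end_char: 38,
  doc_title: "Quarterly Report",
  doc_category: "finance",
  doc_document_type: "report",
  doc_data_structure: "prose",
  source_ids: ["source-1"],
  feed_ids: ["feed-1"],
};
```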
Collection Indexes
Collections are auto-created with indexes on: user_id, space_id, document_id, source_ids, feed_ids (all keyword), chunk_index (integer), and content (text, for full-text search).
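The index list can be expressed as data. The sketch below builds one spec per indexed field using the `field_name` / `field_schema` shape of Qdrant's create-payload-index request body; the helper name `payloadIndexSpecs` is hypothetical, and how the real `ensureModelCollection()` issues these requests is not shown in this document.

```typescript
// Sketch: the payload indexes listed above, as plain data.
// Each entry has the shape of a Qdrant "create payload index" request body.
type FieldSchema = "keyword" | "integer" | "text";

interface PayloadIndexSpec {
  field_name: string;
  field_schema: FieldSchema;
}

function payloadIndexSpecs(): PayloadIndexSpec[] {
  const keywordFields = [
    "user_id",
    "space_id",
    "document_id",
    "source_ids",
    "feed_ids",
  ];
  const specs: PayloadIndexSpec[] = keywordFields.map(
    (field): PayloadIndexSpec => ({ field_name: field, field_schema: "keyword" }),
  );
  specs.push({ field_name: "chunk_index", field_schema: "integer" });
  // The text index is what makes full-text matching on chunk content possible.
  specs.push({ field_name: "content", field_schema: "text" });
  return specs;
}

console.log(payloadIndexSpecs().map((s) => `${s.field_name}:${s.field_schema}`));
```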
How It Works
Search Modes
Vector search — Computes cosine similarity between the query embedding and stored chunk vectors via Qdrant's query() method.
Keyword search — Full-text match on the content field via Qdrant's scroll() with a text filter. Results are assigned a uniform score of 1.0 since keyword matching does not produce a relevance score.
Hybrid search — Uses Qdrant's prefetch mechanism. Two prefetches run in parallel: one for dense vector similarity and one for keyword matching, each retrieving 4x the requested limit. Results are fused using { fusion: "rrf" } (Reciprocal Rank Fusion).
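A hybrid request of this kind might look roughly like the object built below. This is a sketch, not the platform's actual request: the function name `buildHybridRequest`, the local types, and the choice to repeat the user_id scope inside each prefetch are assumptions. Only the two prefetches, the 4x prefetch limit, and `{ fusion: "rrf" }` come from the description above.

```typescript
// Local types mirroring the JSON shape of a Qdrant query request,
// declared here so the sketch has no dependencies.
type Condition = { key: string; match: { value: string } | { text: string } };

interface Prefetch {
  query?: number[]; // omitted for the keyword (filter-only) prefetch
  filter: { must: Condition[] };
  limit: number;
}

interface HybridRequest {
  prefetch: Prefetch[];
  query: { fusion: "rrf" };
  limit: number;
  with_payload: boolean;
}

const PREFETCH_MULTIPLIER = 4; // each prefetch retrieves 4x the requested limit

function buildHybridRequest(
  queryVector: number[],
  queryText: string,
  userId: string,
  limit: number,
): HybridRequest {
  const scope: Condition = { key: "user_id", match: { value: userId } };
  const prefetchLimit = limit * PREFETCH_MULTIPLIER;
  return {
    prefetch: [
      // Dense prefetch: nearest neighbours of the query embedding.
      { query: queryVector, filter: { must: [scope] }, limit: prefetchLimit },
      // Keyword prefetch: full-text match on the content field.
      {
        filter: { must: [scope, { key: "content", match: { text: queryText } }] },
        limit: prefetchLimit,
      },
    ],
    query: { fusion: "rrf" },
    limit,
    with_payload: true,
  };
}

console.log(JSON.stringify(buildHybridRequest([0.1, 0.2], "model X-200", "user-1", 10), null, 2));
```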
Multi-Collection Search
- All model-specific collections are discovered at search time.
- The query is embedded using each collection's corresponding model.
- Searches run in parallel across all collections, with each collection returning 3x the requested limit.
- Results are fused across collections using RRF with k=60.
- Collections that fail to respond are skipped gracefully — partial results are still returned.
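The cross-collection fusion step can be sketched as follows. The function name `fuseWithRrf` and the `Hit` type are hypothetical, and the use of 1-based ranks is an assumption (it is the common RRF formulation, where each list contributes 1 / (k + rank) to a point's fused score). Only k=60 comes from the description above.

```typescript
// Sketch of Reciprocal Rank Fusion across per-collection result lists.
// Assumption: ranks are 1-based, so the top hit of a list contributes 1 / (k + 1).
interface Hit {
  id: string;
  score: number;
}

function fuseWithRrf(rankedLists: Hit[][], limit: number, k = 60): Hit[] {
  const fused = new Map<string, number>();
  for (const list of rankedLists) {
    list.forEach((hit, index) => {
      const rank = index + 1;
      fused.set(hit.id, (fused.get(hit.id) ?? 0) + 1 / (k + rank));
    });
  }
  return Array.from(fused.entries())
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}

// Two collections; the original scores are ignored, only ranks matter.
const collectionA: Hit[] = [
  { id: "a", score: 0.91 },
  { id: "b", score: 0.88 },
  { id: "c", score: 0.7 },
];
const collectionB: Hit[] = [
  { id: "b", score: 0.52 },
  { id: "d", score: 0.5 },
];

console.log(fuseWithRrf([collectionA, collectionB], 4).map((h) => h.id));
// prints [ 'b', 'a', 'd', 'c' ]
```

Because the fused score depends only on rank, scores from different embedding models never need to be normalised against each other. A collection that fails to respond simply contributes no list.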
Filter Building
Every search applies a mandatory user_id filter. Additional optional filters can be combined:
- space_id — Scope to a single space
- document_id or document_ids — Scope to specific documents
- feed_ids — Resolved to document IDs via a PostgreSQL feed_documents join, then filtered by document_id in Qdrant
- doc_category, doc_data_structure, doc_document_type — Filter by metadata fields
- source_ids — Scope to specific sources
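The filter rules above could be assembled along these lines. This is a sketch rather than the real `buildFilter()`: the `SearchScope` type and the function name `sketchFilter` are hypothetical, and feed IDs are assumed to have already been resolved to document IDs in PostgreSQL before this function is called. The `must`, `match.value`, and `match.any` shapes follow Qdrant's filter format.

```typescript
// Sketch: build a Qdrant-style filter from a search scope.
type MatchCondition =
  | { key: string; match: { value: string } }
  | { key: string; match: { any: string[] } };

interface SearchScope {
  userId: string; // mandatory on every search
  spaceId?: string;
  documentIds?: string[]; // assumed to include IDs resolved from feeds
  sourceIds?: string[];
  docCategory?: string;
  docDataStructure?: string;
  docDocumentType?: string;
}

function sketchFilter(scope: SearchScope): { must: MatchCondition[] } {
  const must: MatchCondition[] = [
    { key: "user_id", match: { value: scope.userId } },
  ];
  const exact: Array<[string, string | undefined]> = [
    ["space_id", scope.spaceId],
    ["doc_category", scope.docCategory],
    ["doc_data_structure", scope.docDataStructure],
    ["doc_document_type", scope.docDocumentType],
  ];
  for (const [key, value] of exact) {
    if (value !== undefined) must.push({ key, match: { value } });
  }
  if (scope.documentIds && scope.documentIds.length > 0) {
    must.push({ key: "document_id", match: { any: scope.documentIds } });
  }
  if (scope.sourceIds && scope.sourceIds.length > 0) {
    must.push({ key: "source_ids", match: { any: scope.sourceIds } });
  }
  return { must };
}

console.log(
  JSON.stringify(
    sketchFilter({ userId: "user-1", spaceId: "space-1", documentIds: ["doc-1", "doc-2"] }),
  ),
);
```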
Reranking
When enabled, an optional reranking step runs after initial retrieval:
| Provider | Model | Notes |
|---|---|---|
| Cohere | rerank-v3.5 | Cloud API |
| Jina | jina-reranker-v2-base-multilingual | Cloud API, multilingual |
| Local | cross-encoder/ms-marco-MiniLM-L-6-v2 | Self-hosted, no API key needed |
Why It Works This Way
Rich Payloads Enable Pre-Filtered Search
Storing doc_category, doc_document_type, doc_data_structure, source_ids, and feed_ids directly in the Qdrant payload allows filters to be applied before vector comparison. Scoped queries therefore rank only the chunks that match the scope: a search within a single feed or category needs no post-filtering step, which could otherwise discard most of the retrieved results.
Multi-Model Collections Avoid Migration Pain
Creating separate collections per embedding model means upgrading to a better model does not require re-embedding all existing documents. New documents are written to the new model's collection, old documents remain searchable in their original collection, and cross-collection RRF fusion combines results transparently.
Hybrid Search Catches What Pure Vector Misses
Vector search excels at semantic similarity but can miss exact terminology, acronyms, and model numbers. Keyword search catches these exactly. RRF fusion combines both ranked lists without requiring weight tuning, producing results that satisfy both semantic and lexical relevance.
Feed Scoping via PostgreSQL Ensures Consistency
Feed-to-document membership is managed in PostgreSQL with proper foreign keys and RLS. Resolving feed scope to document IDs in PostgreSQL before querying Qdrant ensures the search always reflects the current, authoritative membership — not a potentially stale denormalised copy in the vector payload.
Configuration
| Env Var | Description |
|---|---|
| QDRANT_URL | Qdrant server URL (default http://localhost:6333) |
| QDRANT_API_KEY | Qdrant API key (optional, for managed Qdrant) |
| QDRANT_COLLECTION | Base collection name (default chunks) |
| SEARCH_MODE | Default search mode: vector, keyword, or hybrid (default hybrid) |
| SEARCH_VECTOR_WEIGHT | Vector weight for hybrid search (default 0.5) |
| RERANK_ENABLED | Enable reranking (default false) |
| RERANK_PROVIDER | Reranking provider: cohere, jina, or local (default cohere) |
| RERANK_MODEL | Override default reranking model (optional) |
| RERANK_BASE_URL | Override reranking API base URL (optional) |
| COHERE_API_KEY | Cohere API key (for reranking) |
| JINA_API_KEY | Jina API key (for reranking) |
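
To show how these variables and their defaults fit together, here is a minimal sketch of a config loader. The function name `loadSearchConfig`, the `SearchConfig` shape, the rule that only the literal string "true" enables reranking, and the fallback to defaults on invalid values are all assumptions; the real logic lives in search-config.ts and may differ. The default values and per-provider models are taken from the tables in this document.

```typescript
type SearchMode = "vector" | "keyword" | "hybrid";
type RerankProvider = "cohere" | "jina" | "local";

// Default reranking model per provider, from the Reranking table.
const DEFAULT_RERANK_MODELS: Record<RerankProvider, string> = {
  cohere: "rerank-v3.5",
  jina: "jina-reranker-v2-base-multilingual",
  local: "cross-encoder/ms-marco-MiniLM-L-6-v2",
};

interface SearchConfig {
  qdrantUrl: string;
  baseCollection: string;
  searchMode: SearchMode;
  vectorWeight: number;
  rerankEnabled: boolean;
  rerankProvider: RerankProvider;
  rerankModel: string;
}

// Takes the environment as a plain object so the sketch runs anywhere.
function loadSearchConfig(env: Record<string, string | undefined>): SearchConfig {
  const mode = env.SEARCH_MODE;
  const searchMode: SearchMode =
    mode === "vector" || mode === "keyword" || mode === "hybrid" ? mode : "hybrid";

  const provider = env.RERANK_PROVIDER;
  const rerankProvider: RerankProvider =
    provider === "cohere" || provider === "jina" || provider === "local"
      ? provider
      : "cohere";

  const rawWeight = env.SEARCH_VECTOR_WEIGHT;
  const weight = rawWeight ? Number(rawWeight) : NaN;

  return {
    qdrantUrl: env.QDRANT_URL ?? "http://localhost:6333",
    baseCollection: env.QDRANT_COLLECTION ?? "chunks",
    searchMode,
    vectorWeight: isFinite(weight) ? weight : 0.5,
    rerankEnabled: env.RERANK_ENABLED === "true", // assumption: only "true" enables
    rerankProvider,
    rerankModel: env.RERANK_MODEL ?? DEFAULT_RERANK_MODELS[rerankProvider],
  };
}

console.log(loadSearchConfig({ RERANK_ENABLED: "true", RERANK_PROVIDER: "jina" }));
```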
Code Reference
| File | Description |
|---|---|
| apps/data-plane/src/lib/vector-store.ts | Qdrant client, search(), searchAllCollections(), upsertChunks(), pointId(), buildFilter(), ensureModelCollection() |
| apps/data-plane/src/services/retrieval.ts | searchChunks(), formatChunksForLLM(), multi-model embedding resolution |
| apps/data-plane/src/lib/search-config.ts | Search mode and reranking configuration |
| apps/data-plane/src/services/rerank.ts | Reranking provider implementations (Cohere, Jina, local) |
Relationships
- Chunking & Embedding — Chunks and their embeddings are the input to the vector store
- Metadata Extraction — Extracted metadata is stored as payload fields for filtered search
- Tool Registry — The search_documents tool queries the vector store during agent conversations
- Feeds — Feed scoping resolves to document IDs before Qdrant filtering
- Spaces — All searches are scoped by space_id with RLS enforcement