Overview
Sources are the data repositories that feed the Condelo platform. Each source belongs to a space and represents a distinct collection of documents — whether uploaded files, structured tabular data, webhook payloads, or API-connected external systems. Sources support hierarchical folder organisation, allowing users to structure their knowledge base the same way they think about it.
Documents are linked to sources through a junction table, so a single document can appear in multiple sources without duplication. Every space gets one default source (typically the "Uploads" bucket) that acts as the catch-all for new documents.
Key Concepts
- Source types: bucket (file uploads), tabular (structured CSV/JSON data), webhook (inbound payloads), api (external system connectors), directory (curated collections).
- Default source: Each space has one default source. New uploads land here unless a specific source is selected.
- Hierarchical folders: Folders within a source use a self-referencing parentId to form a tree. Folder names are unique within their parent (or at root level).
- Document-source junction: The document_sources table links documents to sources and optionally to a specific folder within that source.
- Schema and config: Each source carries a schema (jsonb) describing its data shape and a config (jsonb) for type-specific settings.
Data Model
sources
| Column | Type | Notes |
|---|---|---|
| id | uuid (PK) | |
| spaceId | uuid (FK → spaces) | Owning space |
| name | text | Display name |
| description | text | Purpose of this source |
| sourceType | text | Default "bucket". One of: bucket, tabular, webhook, api, directory |
| isDefault | boolean | One default source per space |
| icon | text | Display icon |
| schema | jsonb | Data shape descriptor (for tabular sources) |
| config | jsonb | Type-specific configuration |
| createdAt | timestamp | |
| updatedAt | timestamp | |
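As a reading aid, the columns above could be mirrored in TypeScript roughly as follows. This is a minimal sketch, not the definition in packages/db/src/schema/sources.ts: the nullability of each column is an assumption (the table does not state it), and `isSourceType` and `findDefaultSource` are hypothetical helpers introduced only for illustration.

```typescript
// The five source types listed in the sourceType column.
const SOURCE_TYPES = ["bucket", "tabular", "webhook", "api", "directory"] as const;
type SourceType = (typeof SOURCE_TYPES)[number];

// Hypothetical row shape mirroring the sources table. Which columns are
// nullable is an assumption; the table above does not say.
interface SourceRow {
  id: string;
  spaceId: string;
  name: string;
  description: string | null;
  sourceType: SourceType;
  isDefault: boolean;
  icon: string | null;
  schema: Record<string, unknown> | null;
  config: Record<string, unknown> | null;
  createdAt: Date;
  updatedAt: Date;
}

// Hypothetical guard: narrows an arbitrary string to a known source type.
function isSourceType(value: string): value is SourceType {
  return (SOURCE_TYPES as readonly string[]).includes(value);
}

// Hypothetical helper: returns the default source of a space, if one exists.
function findDefaultSource(sources: SourceRow[], spaceId: string): SourceRow | undefined {
  return sources.find((s) => s.spaceId === spaceId && s.isDefault);
}
```

Because sourceType is stored as text, a guard of this kind is one way to validate values read from the database before branching on them.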
source_folders
| Column | Type | Notes |
|---|---|---|
| id | uuid (PK) | |
| sourceId | uuid (FK → sources) | Parent source |
| parentId | uuid (self-referencing) | Null for root-level folders |
| name | text | Folder name |
| createdAt | timestamp | |
| updatedAt | timestamp | |
Constraints:
| Constraint | Columns | Condition |
|---|---|---|
| Unique | (sourceId, parentId, name) | No duplicate names at same level |
| Unique partial | (sourceId, name) | WHERE parentId IS NULL — root folder uniqueness |
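The reason for two constraints is worth spelling out. Assuming the tables live in PostgreSQL (the jsonb columns suggest it, but the source does not say so), a plain unique constraint treats NULL values as distinct by default, so (sourceId, parentId, name) on its own would still allow two root folders with the same name. The partial constraint closes that gap. The sketch below mirrors both rules as an in-memory check; `isFolderNameAvailable` is a hypothetical helper, not part of the codebase.

```typescript
// Minimal folder shape; parentId is null for root-level folders.
interface FolderRow {
  id: string;
  sourceId: string;
  parentId: string | null;
  name: string;
}

// Hypothetical pre-insert check mirroring both constraints: a name is taken
// when a sibling (same source, same parent, including the null root) has it.
// Unlike SQL, `null === null` is true here, so one comparison covers both.
function isFolderNameAvailable(
  existing: FolderRow[],
  sourceId: string,
  parentId: string | null,
  name: string,
): boolean {
  return !existing.some(
    (f) => f.sourceId === sourceId && f.parentId === parentId && f.name === name,
  );
}
```

A check like this can give users a friendlier error before the insert, but the database constraints remain the actual guarantee.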
document_sources
| Column | Type | Notes |
|---|---|---|
| documentId | uuid (FK → documents) | |
| sourceId | uuid (FK → sources) | |
| folderId | uuid (FK → source_folders) | Optional; which folder within the source |
| createdAt | timestamp | |
Primary key: (documentId, sourceId)
How It Works
- Source created: When a space is provisioned, a default bucket source is created. Users can create additional sources of any type.
- Folders organised: Users create folders within a source. The self-referencing parentId forms a tree of arbitrary depth.
- Documents linked: When a document is uploaded or ingested, a row is inserted into document_sources linking it to a source and optionally a folder.
- Agents query: LLM agents use list_sources, folder_tree, and list_folder_contents tools to navigate the source hierarchy and find relevant documents.
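To illustrate how a self-referencing parentId yields a tree, here is a minimal sketch that assembles flat source_folders rows into nested nodes. It is not the folder_tree tool's implementation; `FolderNode` and `buildFolderTree` are hypothetical names, and the rows are assumed to belong to a single source.

```typescript
// Flat row as stored in source_folders (only the columns needed here).
interface FolderRow {
  id: string;
  parentId: string | null;
  name: string;
}

// Hypothetical nested shape for presenting the hierarchy.
interface FolderNode {
  id: string;
  name: string;
  children: FolderNode[];
}

// Builds the tree in two passes: create every node, then attach each node
// to its parent. Rows with a null parentId become roots; rows whose parent
// is missing from the input are also treated as roots.
function buildFolderTree(rows: FolderRow[]): FolderNode[] {
  const nodes = new Map<string, FolderNode>();
  for (const row of rows) {
    nodes.set(row.id, { id: row.id, name: row.name, children: [] });
  }

  const roots: FolderNode[] = [];
  for (const row of rows) {
    const node = nodes.get(row.id)!;
    const parent = row.parentId === null ? undefined : nodes.get(row.parentId);
    if (parent) {
      parent.children.push(node);
    } else {
      roots.push(node);
    }
  }
  return roots;
}
```

The two-pass approach means rows can arrive in any order, which matters because nothing in the table guarantees that parents are returned before their children.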
Why It Works This Way
Hierarchical Folders for LLM Navigation
The folder tree structure is not just for human organisation. The folder_tree and list_folder_contents tools give LLM agents a way to progressively narrow down to relevant documents without loading the entire document set into context. An agent can traverse the tree top-down, choosing branches based on folder names and descriptions.
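The top-down traversal can be sketched as a loop that exposes only one level of folder names at a time. This is an illustration of the idea, not the actual tool interface: `narrowDown` and `ChooseBranch` are hypothetical, and the callback stands in for whatever decision the agent makes at each level.

```typescript
// Hypothetical nested folder shape.
interface FolderNode {
  id: string;
  name: string;
  children: FolderNode[];
}

// Hypothetical stand-in for the agent's decision: given the folder names at
// one level, return the name to open, or null to stop at the current level.
type ChooseBranch = (names: string[]) => string | null;

// Walks the tree top-down, showing the chooser one level of names at a
// time, and returns the ids of the folders opened along the way.
function narrowDown(roots: FolderNode[], choose: ChooseBranch): string[] {
  const path: string[] = [];
  let level = roots;
  while (level.length > 0) {
    const picked = choose(level.map((f) => f.name));
    const next = level.find((f) => f.name === picked);
    if (!next) break; // chooser stopped or named a folder that is not here
    path.push(next.id);
    level = next.children;
  }
  return path;
}
```

The chooser only ever sees the names of sibling folders, so the context cost grows with the depth of the path taken rather than with the size of the whole tree.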
Source Type Determines Query Strategy
The sourceType field tells agents upfront how to query a source. A bucket source contains unstructured prose best searched with vector similarity. A tabular source contains structured data better queried with text-to-SQL or column filtering. This distinction prevents agents from wasting tokens on the wrong retrieval strategy.
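A minimal sketch of that dispatch is shown below. Only the bucket and tabular mappings come from the text above; the strategy labels and the `pickQueryStrategy` function are hypothetical, and the remaining source types are deliberately left as "unspecified" because no strategy is named for them here.

```typescript
type SourceType = "bucket" | "tabular" | "webhook" | "api" | "directory";

// Hypothetical labels for retrieval strategies.
type QueryStrategy = "vector-similarity" | "text-to-sql" | "unspecified";

// Maps a source type to the retrieval strategy described in the text.
function pickQueryStrategy(sourceType: SourceType): QueryStrategy {
  switch (sourceType) {
    case "bucket":
      // Unstructured prose: search by vector similarity.
      return "vector-similarity";
    case "tabular":
      // Structured data: text-to-SQL (column filtering is the stated alternative).
      return "text-to-sql";
    default:
      // webhook, api, directory: the text does not name a strategy.
      return "unspecified";
  }
}
```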
Junction Table for Multi-Source Documents
A single document can belong to multiple sources. For example, a quarterly report might appear in both a "Finance" source and a "Board Materials" source. The junction table avoids physical duplication while maintaining logical organisation.
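The quarterly report example can be sketched with an in-memory stand-in for the junction table. The class below is hypothetical and only models the (documentId, sourceId) primary key: the same document may be linked to many sources, but at most once per source.

```typescript
// One row of the junction table (createdAt omitted for brevity).
interface DocumentSourceRow {
  documentId: string;
  sourceId: string;
  folderId: string | null;
}

// Hypothetical in-memory model of document_sources.
class DocumentSourceLinks {
  // Keyed by "documentId:sourceId" to mimic the composite primary key.
  private rows = new Map<string, DocumentSourceRow>();

  // Links a document to a source. Returns false when that pair already
  // exists, which is where a real insert would violate the primary key.
  link(documentId: string, sourceId: string, folderId: string | null = null): boolean {
    const key = `${documentId}:${sourceId}`;
    if (this.rows.has(key)) return false;
    this.rows.set(key, { documentId, sourceId, folderId });
    return true;
  }

  // Lists every source a document appears in, in insertion order.
  sourcesOf(documentId: string): string[] {
    return Array.from(this.rows.values())
      .filter((r) => r.documentId === documentId)
      .map((r) => r.sourceId);
  }
}

const links = new DocumentSourceLinks();
links.link("quarterly-report", "finance");
links.link("quarterly-report", "board-materials");
```

One consequence of this key is that a document occupies at most one folder within any given source; placing it in a second folder of the same source would require a different key design.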
Code Reference
| File | Description |
|---|---|
| packages/db/src/schema/sources.ts | Sources and source_folders table definitions |
| packages/db/src/schema/documents.ts | document_sources junction table |
| apps/data-plane/src/tools/definitions/list-sources.ts | LLM tool: list available sources |
| apps/data-plane/src/tools/definitions/folder-tree.ts | LLM tool: navigate folder hierarchy |
| apps/data-plane/src/tools/definitions/list-folder-contents.ts | LLM tool: list documents in a folder |
Relationships
- Spaces — Every source belongs to a space
- Feeds — Source-default feeds track all documents in a source automatically
- Inferences & Signals — Inferences reference source documents as evidence