Sources & Folders

Data repositories, source types (bucket, tabular, webhook, api, directory), hierarchical folder organisation, and document-source junction.

Overview

Sources are the data repositories that feed the Condelo platform. Each source belongs to a space and represents a distinct collection of documents — whether uploaded files, structured tabular data, webhook payloads, or API-connected external systems. Sources support hierarchical folder organisation, allowing users to structure their knowledge base the same way they think about it.

Documents are linked to sources through a junction table, so a single document can appear in multiple sources without duplication. Every space gets one default source (typically the "Uploads" bucket) that acts as the catch-all for new documents.

Key Concepts

  • Source types — bucket (file uploads), tabular (structured CSV/JSON data), webhook (inbound payloads), api (external system connectors), directory (curated collections).
  • Default source — Each space has one default source. New uploads land here unless a specific source is selected.
  • Hierarchical folders — Folders within a source use a self-referencing parentId to form a tree. Folder names are unique within their parent (or at root level).
  • Document-source junction — The document_sources table links documents to sources and optionally to a specific folder within that source.
  • Schema and config — Each source carries a schema (jsonb) describing its data shape and a config (jsonb) for type-specific settings.

Data Model

sources

| Column | Type | Notes |
| --- | --- | --- |
| id | uuid (PK) | |
| spaceId | uuid (FK → spaces) | Owning space |
| name | text | Display name |
| description | text | Purpose of this source |
| sourceType | text | Default "bucket". One of: bucket, tabular, webhook, api, directory |
| isDefault | boolean | One default source per space |
| icon | text | Display icon |
| schema | jsonb | Data shape descriptor (for tabular sources) |
| config | jsonb | Type-specific configuration |
| createdAt | timestamp | |
| updatedAt | timestamp | |
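
As a rough illustration, the columns above could be modelled in TypeScript along the following lines. The names SourceType, SourceRow, resolveSourceType, and findDefaultSource are hypothetical and are not taken from packages/db/src/schema/sources.ts; which columns are nullable is also an assumption.

```typescript
// Hypothetical types mirroring the sources table; names are illustrative only.
const SOURCE_TYPES = ["bucket", "tabular", "webhook", "api", "directory"] as const;
type SourceType = (typeof SOURCE_TYPES)[number];

interface SourceRow {
  id: string; // uuid
  spaceId: string; // uuid of the owning space
  name: string;
  description: string | null; // nullability is an assumption
  sourceType: SourceType;
  isDefault: boolean;
  icon: string | null; // nullability is an assumption
  schema: unknown; // jsonb
  config: unknown; // jsonb
  createdAt: Date;
  updatedAt: Date;
}

// Narrow an arbitrary string (for example, request input) to a known source type.
function isSourceType(value: string): value is SourceType {
  return (SOURCE_TYPES as readonly string[]).includes(value);
}

// Apply the documented default of "bucket" when no type is given.
function resolveSourceType(value?: string): SourceType {
  if (value === undefined) return "bucket";
  if (!isSourceType(value)) throw new Error(`Unknown source type: ${value}`);
  return value;
}

// Find the default source of a space, if one is present in the given rows.
function findDefaultSource(rows: SourceRow[], spaceId: string): SourceRow | undefined {
  return rows.find((r) => r.spaceId === spaceId && r.isDefault);
}
```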

source_folders

| Column | Type | Notes |
| --- | --- | --- |
| id | uuid (PK) | |
| sourceId | uuid (FK → sources) | Parent source |
| parentId | uuid (self-referencing) | Null for root-level folders |
| name | text | Folder name |
| createdAt | timestamp | |
| updatedAt | timestamp | |

Constraints:

| Constraint | Columns | Condition |
| --- | --- | --- |
| Unique | (sourceId, parentId, name) | No duplicate names at same level |
| Unique partial | (sourceId, name) | WHERE parentId IS NULL (root folder uniqueness) |
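
Assuming the underlying database is PostgreSQL (suggested by the uuid and jsonb column types, though not stated here), a unique constraint treats NULL values as distinct by default, so the three-column constraint alone would not stop two root folders from sharing a name. That is presumably why the partial constraint exists. The sketch below mimics the combined rule in memory; FolderRow and violatesFolderUniqueness are hypothetical names, and the case-sensitive comparison is an assumption.

```typescript
interface FolderRow {
  id: string;
  sourceId: string;
  parentId: string | null; // null for root-level folders
  name: string;
}

// Returns true when `candidate` would collide with an existing sibling folder.
// Root folders (parentId === null) are compared with each other, which is the
// case the partial unique constraint on (sourceId, name) covers.
function violatesFolderUniqueness(
  existing: FolderRow[],
  candidate: Omit<FolderRow, "id">,
): boolean {
  return existing.some(
    (f) =>
      f.sourceId === candidate.sourceId &&
      f.parentId === candidate.parentId &&
      f.name === candidate.name,
  );
}
```

The same name can still be reused under a different parent or in a different source, since both sourceId and parentId take part in the comparison.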

document_sources

| Column | Type | Notes |
| --- | --- | --- |
| documentId | uuid (FK → documents) | |
| sourceId | uuid (FK → sources) | |
| folderId | uuid (FK → source_folders) | Optional: which folder within the source |
| createdAt | timestamp | |

Primary key: (documentId, sourceId)
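
One consequence of the composite primary key is that a document can be linked to many sources, but only once per source, and therefore sits in at most one folder within any given source. The in-memory sketch below illustrates that behaviour; DocumentSourceLink and DocumentSourceIndex are hypothetical names, and returning false on a duplicate link is an illustrative choice rather than the platform's actual conflict handling.

```typescript
interface DocumentSourceLink {
  documentId: string;
  sourceId: string;
  folderId: string | null; // optional folder within the source
  createdAt: Date;
}

// In-memory stand-in for the document_sources table, keyed by the composite
// primary key (documentId, sourceId).
class DocumentSourceIndex {
  private links = new Map<string, DocumentSourceLink>();

  private key(documentId: string, sourceId: string): string {
    return `${documentId}|${sourceId}`;
  }

  // Returns false if the link already exists, mirroring a primary-key conflict.
  link(documentId: string, sourceId: string, folderId: string | null = null): boolean {
    const k = this.key(documentId, sourceId);
    if (this.links.has(k)) return false;
    this.links.set(k, { documentId, sourceId, folderId, createdAt: new Date() });
    return true;
  }

  // All sources a document appears in, in insertion order.
  sourcesOf(documentId: string): string[] {
    return Array.from(this.links.values())
      .filter((l) => l.documentId === documentId)
      .map((l) => l.sourceId);
  }
}
```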

How It Works

  1. Source created — When a space is provisioned, a default bucket source is created. Users can create additional sources of any type.
  2. Folders organised — Users create folders within a source. The self-referencing parentId forms a tree of arbitrary depth.
  3. Documents linked — When a document is uploaded or ingested, a row is inserted into document_sources linking it to a source and optionally a folder.
  4. Agents query — LLM agents use list_sources, folder_tree, and list_folder_contents tools to navigate the source hierarchy and find relevant documents.
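
To make step 2 concrete, here is a minimal sketch of how flat source_folders rows could be assembled into a tree using parentId. It is not the implementation behind the folder_tree tool; FlatFolder, FolderNode, and buildFolderTree are hypothetical names, and treating a folder whose parent is missing from the input as a root is an assumption made for the example.

```typescript
interface FlatFolder {
  id: string;
  parentId: string | null; // null for root-level folders
  name: string;
}

interface FolderNode {
  id: string;
  name: string;
  children: FolderNode[];
}

function buildFolderTree(rows: FlatFolder[]): FolderNode[] {
  // First pass: create one node per row so parents can be looked up by id.
  const nodes = new Map<string, FolderNode>();
  for (const r of rows) {
    nodes.set(r.id, { id: r.id, name: r.name, children: [] });
  }

  // Second pass: attach each node to its parent, or collect it as a root.
  const roots: FolderNode[] = [];
  for (const r of rows) {
    const node = nodes.get(r.id)!;
    const parent = r.parentId === null ? undefined : nodes.get(r.parentId);
    if (parent) {
      parent.children.push(node);
    } else {
      roots.push(node); // root-level, or parent missing from the input
    }
  }
  return roots;
}
```

Because the two passes are independent of row order, the rows do not need to be sorted parent-first.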

Why It Works This Way

Hierarchical Folders for LLM Navigation

The folder tree structure is not just for human organisation. The folder_tree and list_folder_contents tools give LLM agents a way to progressively narrow down to relevant documents without loading the entire document set into context. An agent can traverse the tree top-down, choosing a branch at each level based on folder names, with the source's name and description as additional context.
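
The top-down traversal can be sketched as a loop that only ever looks at one level of the tree at a time. The listChildren and choose callbacks below are hypothetical stand-ins: the first for whatever the folder tools return, the second for the agent's decision. The actual tool signatures are not documented here, and the maxDepth guard is an added safety assumption.

```typescript
type FolderSummary = { id: string; name: string };

// listChildren: folders directly under parentId (null means the source root).
// choose: picks one branch, or returns null to stop descending.
function descend(
  listChildren: (parentId: string | null) => FolderSummary[],
  choose: (options: FolderSummary[]) => FolderSummary | null,
  maxDepth = 32,
): FolderSummary[] {
  const path: FolderSummary[] = [];
  let current: string | null = null;
  while (path.length < maxDepth) {
    const options = listChildren(current);
    if (options.length === 0) break; // reached a leaf folder
    const picked = choose(options);
    if (picked === null) break; // nothing at this level looks relevant
    path.push(picked);
    current = picked.id;
  }
  return path;
}
```

Only the sibling folders at each visited level are listed, so the amount of text the agent has to read grows with the depth of the path rather than with the total number of folders.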

Source Type Determines Query Strategy

The sourceType field tells agents upfront how to query a source. A bucket source contains unstructured prose best searched with vector similarity. A tabular source contains structured data better queried with text-to-SQL or column filtering. This distinction prevents agents from wasting tokens on the wrong retrieval strategy.
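
A minimal sketch of this dispatch, covering only the two cases described above. The strategy labels and the pickQueryStrategy function are hypothetical; how webhook, api, and directory sources are queried is not specified here, so they are deliberately left unmapped.

```typescript
type SourceKind = "bucket" | "tabular" | "webhook" | "api" | "directory";
type QueryStrategy = "vector_search" | "structured_query" | "unspecified";

function pickQueryStrategy(sourceType: SourceKind): QueryStrategy {
  switch (sourceType) {
    case "bucket":
      // Unstructured prose: search by vector similarity.
      return "vector_search";
    case "tabular":
      // Structured data: text-to-SQL or column filtering.
      return "structured_query";
    default:
      // Not covered by the description above; left for the caller to decide.
      return "unspecified";
  }
}
```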

Junction Table for Multi-Source Documents

A single document can belong to multiple sources. For example, a quarterly report might appear in both a "Finance" source and a "Board Materials" source. The junction table avoids physical duplication while maintaining logical organisation.

Code Reference

| File | Description |
| --- | --- |
| packages/db/src/schema/sources.ts | sources and source_folders table definitions |
| packages/db/src/schema/documents.ts | document_sources junction table |
| apps/data-plane/src/tools/definitions/list-sources.ts | LLM tool: list available sources |
| apps/data-plane/src/tools/definitions/folder-tree.ts | LLM tool: navigate folder hierarchy |
| apps/data-plane/src/tools/definitions/list-folder-contents.ts | LLM tool: list documents in a folder |

Relationships

  • Spaces — Every source belongs to a space
  • Feeds — Source-default feeds track all documents in a source automatically
  • Inferences & Signals — Inferences reference source documents as evidence
