Architecture

How Teidelum works: modules, data flow, and key design patterns.

Module Overview

Module              Role
main.rs             Entrypoint: opens TeidelumApi, registers FK relationships, serves MCP over stdio
api.rs              Unified API: wraps catalog, search, router, graph behind thread-safe interface
mcp.rs              MCP tool definitions via rmcp; delegates to TeidelumApi for all operations
router.rs           Query router: dispatches SQL to libteide (local) or connectors (remote)
search.rs           tantivy wrapper: BM25 ranking + fuzzy matching search engine
catalog.rs          Metadata catalog: schemas, FK relationships, local vs remote tracking
graph.rs            SQL-based graph traversal: BFS over catalog FK relationships (max 10 hops)
server.rs           HTTP server setup: axum router, CORS, API key auth middleware, MCP Streamable HTTP
routes.rs           REST endpoint handlers: 12 routes under /api/v1/ delegating to TeidelumApi
connector/kdb.rs    kdb+ live query adapter
sync/notion.rs      Notion incremental sync with cursor tracking
sync/zulip.rs       Zulip incremental sync with cursor tracking

Data Flow

  1. Sync sources pull data from Notion and Zulip via the SyncSource trait.
  2. Each sync output splits into structured fields (inserted into libteide columnar tables for SQL) and freeform content (indexed into tantivy for full-text search).
  3. The Catalog registers every table's schema and foreign-key relationships.
  4. The Query Router inspects the catalog to dispatch queries: local tables go to libteide, remote tables go through connectors.
  5. The Graph Engine uses catalog FK relationships to traverse between entities, running SQL queries at each BFS hop.
  6. MCP tools wrap everything behind simple operations: search, sql, describe, graph, sync, create_table, add_rows, add_documents, drop_table.
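The hop-limited BFS in step 5 can be sketched as follows. This is a simplified, hypothetical illustration: the edges here are a plain in-memory adjacency map, whereas the real graph engine derives them from catalog FK relationships and runs a SQL query at each hop. The names `FkGraph` and `reachable` are inventions for this sketch.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Hypothetical adjacency map: table name -> tables reachable via one FK edge.
type FkGraph = HashMap<&'static str, Vec<&'static str>>;

/// BFS from `start`, returning every table reachable within `max_hops`
/// (Teidelum caps traversal at 10 hops).
fn reachable(graph: &FkGraph, start: &'static str, max_hops: usize) -> HashSet<&'static str> {
    let mut seen: HashSet<&'static str> = HashSet::from([start]);
    let mut queue: VecDeque<(&'static str, usize)> = VecDeque::from([(start, 0)]);
    while let Some((table, depth)) = queue.pop_front() {
        if depth == max_hops {
            continue; // hop limit reached: do not expand further
        }
        for &next in graph.get(table).into_iter().flatten() {
            if seen.insert(next) {
                queue.push_back((next, depth + 1));
            }
        }
    }
    seen
}

fn main() {
    let graph: FkGraph = HashMap::from([
        ("tasks", vec!["people", "projects"]),
        ("projects", vec!["teams"]),
    ]);
    let one_hop = reachable(&graph, "tasks", 1);
    assert!(one_hop.contains("people") && !one_hop.contains("teams"));
    assert!(reachable(&graph, "tasks", 2).contains("teams"));
}
```

In the real engine each `graph.get(table)` step would instead be a SQL query joining on the FK columns recorded in the catalog.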

HTTP Server Layer

Teidelum is a single binary with dual transport. MCP stdio is always on for AI agent integration. An HTTP server is opt-in via the --port flag, enabling REST API access for applications and scripts.

Both transports share a single Arc<TeidelumApi> instance—there is no data duplication or sync between them.
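The shared-instance design can be sketched with std types only. This is a stand-in, not the actual server code: `Api` here is a hypothetical miniature of `TeidelumApi`, and the spawned thread stands in for one of the two transports.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

/// Hypothetical stand-in for TeidelumApi: one piece of shared state.
struct Api {
    tables: RwLock<Vec<String>>,
}

fn main() {
    let api = Arc::new(Api { tables: RwLock::new(vec!["tasks".to_string()]) });

    // Each transport gets a clone of the same Arc, so a write made through
    // one side is immediately visible to the other: no duplication, no sync.
    let mcp_side = Arc::clone(&api);
    let http_side = Arc::clone(&api);

    let writer = thread::spawn(move || {
        mcp_side.tables.write().unwrap().push("pages".to_string());
    });
    writer.join().unwrap();

    assert_eq!(http_side.tables.read().unwrap().len(), 2);
}
```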

teidelum [--port 8080]
│
├── MCP stdio (always on)
│
└── HTTP server (opt-in via --port)
    ├── /api/v1/*     REST API (axum handlers)
    └── /mcp          MCP Streamable HTTP (rmcp)

Key Design Patterns

Dual Storage

Sync modules split incoming data into two stores. Structured fields (names, dates, statuses, assignees) go into libteide columnar tables for fast SQL queries. Freeform content (page bodies, message text) goes into a tantivy full-text index for BM25 search. This separation lets each engine do what it does best.
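The split might look like this in miniature. `Record` and `split` are hypothetical names for illustration; the real sync modules write the two halves to libteide and tantivy rather than returning them.

```rust
/// Hypothetical sync record, e.g. one Notion page or Zulip message.
struct Record {
    title: String,  // structured field -> columnar table
    status: String, // structured field -> columnar table
    body: String,   // freeform content -> full-text index
}

/// Split one record into a structured row (for SQL) and a document (for search).
fn split(r: &Record) -> (Vec<(&'static str, String)>, String) {
    let row = vec![("title", r.title.clone()), ("status", r.status.clone())];
    let doc = format!("{}\n{}", r.title, r.body);
    (row, doc)
}

fn main() {
    let rec = Record {
        title: "Q3 plan".to_string(),
        status: "draft".to_string(),
        body: "Goals for the quarter...".to_string(),
    };
    let (row, doc) = split(&rec);
    assert_eq!(row.len(), 2);            // structured fields become columns
    assert!(doc.contains("quarter"));    // freeform text becomes searchable
}
```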

Catalog-Driven Routing

The Catalog knows which tables exist, where they live (local libteide or remote connector), and how they relate to each other via foreign keys. The query router reads the catalog to decide where to send each SQL query. No hardcoded table lists—add a table to the catalog and the router picks it up automatically.
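A minimal sketch of the routing decision, assuming a catalog reduced to a table-to-location map (`Location` and `route` are hypothetical names, not the actual types in catalog.rs or router.rs):

```rust
use std::collections::HashMap;

/// Where a table's data lives, per the catalog.
#[derive(Clone, Copy, PartialEq, Debug)]
enum Location {
    Local,                // libteide columnar storage
    Remote(&'static str), // named connector, e.g. "kdb"
}

/// The router consults only the catalog: registering a table
/// is enough for queries against it to be routed correctly.
fn route(catalog: &HashMap<&str, Location>, table: &str) -> Option<Location> {
    catalog.get(table).copied()
}

fn main() {
    let mut catalog = HashMap::new();
    catalog.insert("tasks", Location::Local);
    catalog.insert("trades", Location::Remote("kdb"));

    assert_eq!(route(&catalog, "tasks"), Some(Location::Local));
    assert_eq!(route(&catalog, "trades"), Some(Location::Remote("kdb")));
    assert_eq!(route(&catalog, "missing"), None); // unknown table: no route
}
```

Keeping the mapping in one place is what makes the "no hardcoded table lists" property hold: the router has no table names of its own.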

Incremental Sync

Sync sources track cursors (timestamps or pagination tokens) so each run pulls only changed data. First sync fetches everything; subsequent syncs fetch only what changed since the last cursor position. Cursors are persisted to disk.
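The cursor pattern can be sketched with timestamps as the cursor type. `Source` and `pull` are illustrative names; the real sync modules persist the cursor to disk and fetch from the Notion and Zulip APIs rather than from a slice.

```rust
/// Hypothetical cursor-tracking source: the first pull takes everything,
/// later pulls take only items newer than the stored cursor.
struct Source {
    cursor: Option<u64>, // persisted to disk in the real modules
}

impl Source {
    /// `items` are (timestamp, body) pairs standing in for an API response.
    fn pull(&mut self, items: &[(u64, &str)]) -> Vec<String> {
        let since = self.cursor.unwrap_or(0);
        let fresh: Vec<String> = items
            .iter()
            .filter(|(ts, _)| *ts > since)
            .map(|(_, body)| body.to_string())
            .collect();
        // Advance the cursor to the newest timestamp seen.
        if let Some(max) = items.iter().map(|(ts, _)| *ts).max() {
            self.cursor = Some(since.max(max));
        }
        fresh
    }
}

fn main() {
    let mut src = Source { cursor: None };
    let first = src.pull(&[(1, "a"), (2, "b")]);
    assert_eq!(first.len(), 2); // first sync fetches everything
    let delta = src.pull(&[(1, "a"), (2, "b"), (3, "c")]);
    assert_eq!(delta, vec!["c"]); // later syncs fetch only what changed
}
```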

Unified API

TeidelumApi wraps all subsystems (catalog, search engine, query router, graph engine) behind a single thread-safe facade. It uses std::sync::RwLock for concurrent read access—multiple MCP tool calls can read the catalog and query data simultaneously.
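The facade pattern might be sketched like this, with one subsystem reduced to a `Vec` behind a `std::sync::RwLock` (`Facade` and its methods are hypothetical names, not the real `TeidelumApi` surface):

```rust
use std::sync::RwLock;

/// Hypothetical facade: interior mutability lets callers share `&Facade`
/// while reads proceed concurrently and writes take exclusive access.
struct Facade {
    tables: RwLock<Vec<String>>,
}

impl Facade {
    fn list_tables(&self) -> Vec<String> {
        self.tables.read().unwrap().clone() // shared read lock
    }
    fn create_table(&self, name: &str) {
        self.tables.write().unwrap().push(name.to_string()); // exclusive write lock
    }
}

fn main() {
    let api = Facade { tables: RwLock::new(vec!["tasks".to_string()]) };

    // Two read guards held at once: reads never block reads,
    // which is what lets concurrent MCP tool calls query simultaneously.
    let r1 = api.tables.read().unwrap();
    let r2 = api.tables.read().unwrap();
    assert_eq!(r1.len(), r2.len());
    drop((r1, r2)); // release before writing, or the write would deadlock here

    api.create_table("pages");
    assert_eq!(api.list_tables().len(), 2);
}
```

Note the `drop` before the write: `std::sync::RwLock` is not reentrant, so a thread holding a read guard must release it before taking the write lock.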