AI Search
Ebla includes a knowledge layer that indexes your files for semantic search and AI-powered Q&A with mandatory citations.
AI search requires an embedding provider (OpenAI or Ollama) and optionally an LLM for Q&A (OpenAI, Anthropic, or Ollama).
How It Works
Indexing Pipeline
┌─────────────┐    ┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│    Files    │───▶│   Parser    │───▶│   Chunker   │───▶│  Embedder   │
│  (PDF, MD,  │    │  (extract   │    │ (split into │    │ (vectorize  │
│  TXT, code) │    │    text)    │    │  passages)  │    │  passages)  │
└─────────────┘    └─────────────┘    └─────────────┘    └─────────────┘
                                                                │
                                                                ▼
                                                         ┌─────────────┐
                                                         │  pgvector   │
                                                         │ (store and  │
                                                         │  search)    │
                                                         └─────────────┘
Supported File Types
| Type | Extensions | Parser |
|---|---|---|
| PDF | .pdf | Text extraction with page numbers |
| Markdown | .md, .markdown | Section-aware with headings |
| Plain Text | .txt | Paragraph-based chunking |
| Code | .go, .py, .js, .ts, etc. | Function/class-aware chunking |
| Office | .docx, .xlsx, .pptx | Text extraction (coming soon) |
Configuration
Enabling the Knowledge Layer
Add to your server.toml:
[knowledge]
enabled = true
[knowledge.embedding]
provider = "openai" # openai or ollama
model = "text-embedding-3-small" # OpenAI embedding model
api_key = "sk-..." # Your OpenAI API key
# Or for Ollama:
# provider = "ollama"
# model = "nomic-embed-text"
# endpoint = "http://localhost:11434"
[knowledge.llm]
provider = "openai" # openai, anthropic, or ollama
model = "gpt-4-turbo-preview" # LLM for Q&A
api_key = "sk-..."
temperature = 0.3 # Lower = more focused answers
max_tokens = 1024
# Or for Anthropic:
# provider = "anthropic"
# model = "claude-3-sonnet-20240229"
# api_key = "sk-ant-..."
# Or for Ollama (local):
# provider = "ollama"
# model = "llama3"
# endpoint = "http://localhost:11434"
Embedding Providers
OpenAI (Recommended)
[knowledge.embedding]
provider = "openai"
model = "text-embedding-3-small" # Cheaper, good quality
# model = "text-embedding-3-large" # Better quality, higher cost
api_key = "sk-..."
Ollama (Self-Hosted)
# First, install Ollama and pull an embedding model
$ ollama pull nomic-embed-text
# Then configure:
[knowledge.embedding]
provider = "ollama"
model = "nomic-embed-text"
endpoint = "http://localhost:11434"
LLM Providers
OpenAI
[knowledge.llm]
provider = "openai"
model = "gpt-4-turbo-preview" # Best quality
# model = "gpt-4o" # Faster, multimodal
# model = "gpt-3.5-turbo" # Cheapest
api_key = "sk-..."
temperature = 0.3
Anthropic
[knowledge.llm]
provider = "anthropic"
model = "claude-3-sonnet-20240229" # Balanced
# model = "claude-3-opus-20240229" # Most capable
# model = "claude-3-haiku-20240307" # Fastest
api_key = "sk-ant-..."
temperature = 0.3
Ollama (Local)
# Pull a model
$ ollama pull llama3
$ ollama pull mistral
# Configure
[knowledge.llm]
provider = "ollama"
model = "llama3"
endpoint = "http://localhost:11434"
temperature = 0.3
Indexing
Automatic Indexing
When the knowledge layer is enabled, files are indexed automatically:
- New files are indexed when synced
- Modified files are re-indexed incrementally
- Deleted files are removed from the index
Manual Indexing
# Trigger indexing for a library
$ ebla search index e299620b
Indexing library: My Documents
Processing: 156 files
Indexed: 152 files (4 skipped - unsupported types)
Chunks: 2,341
Time: 45s
# Check index status
$ ebla search status e299620b
Library: My Documents
Indexed: 152/156 files (97%)
Chunks: 2,341
Last indexed: 2 minutes ago
Status: up-to-date
Viewing Index Status in Admin UI
Access /admin/indexing to see:
- Indexing queue and progress
- Per-library index status
- Error logs for failed files
- Manual re-index trigger
Searching
Hybrid Search
Ebla combines keyword (full-text) matching with semantic (vector) similarity, so queries find both exact terms and related concepts:
# CLI: Search files
$ ebla search query e299620b "quarterly revenue report"
Results for "quarterly revenue report":
1. documents/Q1-2026-Report.pdf (page 5)
Score: 0.92
"...total quarterly revenue increased by 15% year-over-year..."
2. documents/financial-summary.md (section: Revenue)
Score: 0.87
"...Q1 revenue exceeded projections by 8%..."
3. notes/board-meeting.md (section: Financials)
Score: 0.81
"...discussed quarterly revenue targets..."
Search Modes
| Mode | Description | Best For |
|---|---|---|
| Hybrid (default) | Combines keyword + semantic | General queries |
| Semantic | Vector similarity only | Conceptual queries |
| Keyword | Full-text search only | Exact matches |
API Search
# Search via API
curl -X POST http://server:6333/api/v1/libraries/lib_xyz/knowledge/search \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"query": "quarterly revenue report",
"mode": "hybrid",
"limit": 10
}'
# Response
{
"results": [
{
"file_path": "documents/Q1-2026-Report.pdf",
"page": 5,
"section": null,
"score": 0.92,
"snippet": "...total quarterly revenue increased by 15% year-over-year...",
"highlights": ["quarterly", "revenue"]
}
],
"total": 3,
"query_time_ms": 45
}
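When scripting against this endpoint, the response can be post-processed with standard tools. A minimal sketch using jq, assuming the server accepts the lowercase mode names from the Search Modes table (only hybrid appears in the example request above):
# Semantic-only search; print score, path, and snippet for each hit
curl -s -X POST http://server:6333/api/v1/libraries/lib_xyz/knowledge/search \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"query": "quarterly revenue report", "mode": "semantic", "limit": 5}' \
  | jq -r '.results[] | "\(.score)\t\(.file_path)\t\(.snippet)"'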
AI Q&A
Asking Questions
Get answers with mandatory citations:
# CLI: Ask a question
$ ebla search ask e299620b "What was our Q1 revenue growth?"
Question: What was our Q1 revenue growth?
Answer:
Based on the Q1 2026 Report, quarterly revenue increased by 15%
year-over-year. The financial summary notes that this exceeded
projections by 8%.
Sources:
[1] documents/Q1-2026-Report.pdf (page 5)
[2] documents/financial-summary.md (section: Revenue)
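The same question can also be asked over the API. The non-streaming endpoint is not documented in this section, so the path below is an assumption (the streaming endpoint minus the /stream suffix):
# Assumed non-streaming Q&A endpoint; see Streaming Responses below for the documented path
curl -X POST http://server:6333/api/v1/libraries/lib_xyz/knowledge/query \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"query": "What was our Q1 revenue growth?"}'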
Streaming Responses
For long answers, use streaming:
# API: Streaming Q&A (Server-Sent Events)
curl -N http://server:6333/api/v1/libraries/lib_xyz/knowledge/query/stream \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"query": "Summarize the Q1 financial results"
}'
# SSE events
data: {"type": "chunk", "content": "Based on the "}
data: {"type": "chunk", "content": "Q1 2026 Report"}
data: {"type": "chunk", "content": ", the company..."}
data: {"type": "sources", "sources": [{"file": "Q1-Report.pdf", "page": 5}]}
data: {"type": "done"}
Web UI Search
The file browser (/app) includes an integrated search:
- Files Tab: Search by file name
- Content Tab: Full-text and semantic search
- Ask Tab: AI-powered Q&A with streaming answers
Citations
Mandatory Citations
Every AI answer includes citations to source documents:
- File Path: Which file contains the information
- Page Number: For PDFs, the exact page
- Section: For Markdown, the heading hierarchy
- Snippet: The relevant passage
Citation Format
{
"citations": [
{
"file_path": "documents/Q1-2026-Report.pdf",
"page": 5,
"section": null,
"snippet": "...total quarterly revenue increased by 15% year-over-year...",
"relevance_score": 0.92
},
{
"file_path": "notes/meeting.md",
"page": null,
"section": "Financials > Q1 Review",
"snippet": "...discussed quarterly revenue targets...",
"relevance_score": 0.81
}
]
}
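If you need citations in a human-readable form, the array can be flattened with jq. A small sketch, assuming the response above is saved as answer.json:
# Render citations as a plain source list (page for PDFs, section for Markdown)
jq -r '.citations[] | "- \(.file_path)"
  + (if .page then " (page \(.page))" else "" end)
  + (if .section then " (section: \(.section))" else "" end)' answer.json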
Evidence Export
Q&A Sessions
Save and export Q&A sessions for documentation:
# Create a Q&A session
curl -X POST http://server:6333/api/v1/evidence/sessions \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{"name": "Q1 Financial Review", "library_id": "lib_xyz"}'
# Add Q&A pairs to session
curl -X POST http://server:6333/api/v1/evidence/sessions/:id/pairs \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"question": "What was Q1 revenue growth?",
"answer": "15% year-over-year",
"citations": [...]
}'
# Export session as Markdown
curl http://server:6333/api/v1/evidence/sessions/:id/export/markdown \
-H "Authorization: Bearer $TOKEN" > q1-review.md
# Export as PDF
curl http://server:6333/api/v1/evidence/sessions/:id/export/pdf \
-H "Authorization: Bearer $TOKEN" > q1-review.pdf
Export Format
# Exported Markdown
# Q1 Financial Review
## Q: What was Q1 revenue growth?
**A:** 15% year-over-year growth, exceeding projections by 8%.
**Sources:**
- documents/Q1-2026-Report.pdf (page 5)
- documents/financial-summary.md (section: Revenue)
---
## Q: What were the main revenue drivers?
**A:** The primary drivers were...
**Sources:**
- ...
Cross-Library Search
Searching Multiple Libraries
Search across all libraries you can access. The library_ids field is optional; omit it to search every accessible library:
# API: Cross-library search
curl -X POST http://server:6333/api/v1/knowledge/search \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{
"query": "security audit findings",
"library_ids": ["lib_abc", "lib_xyz"], // optional, omit for all
"limit": 20
}'
Local Search
Client-Side Embeddings
The client can sync embeddings for offline semantic search:
# Enable local embeddings in client config
# ~/.ebla/config.toml
[knowledge]
local_embeddings = true
sync_embeddings = true # Download embeddings from server
# Search works offline
$ ebla search query e299620b "revenue report"
# Uses locally cached embeddings
Performance
Indexing Performance
- Batch Size: Configurable chunks per API call
- Parallel Processing: Multiple files indexed concurrently
- Incremental Updates: Only changed content is re-indexed
Search Performance
- pgvector: Efficient vector similarity search
- IVFFlat Index: Approximate nearest neighbor for large libraries
- Query Caching: Repeated queries are cached
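For context, an IVFFlat index in pgvector is created with a statement along these lines; the table and column names here are illustrative, not Ebla's actual schema (the server manages its own indexes):
# Illustrative only: how an approximate nearest-neighbor index looks in pgvector
psql "$DATABASE_URL" -c \
  "CREATE INDEX ON chunks USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);"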
Tuning
[knowledge]
# Chunk size (characters)
chunk_size = 1000
chunk_overlap = 200
# Batch size for embedding API calls
embedding_batch_size = 100
# Search result limit
max_results = 20
# Context window for Q&A
max_context_chunks = 10
Privacy Considerations
Data Flow
- Embeddings: File content is sent to the embedding provider (OpenAI/Ollama)
- Q&A: Relevant passages are sent to the LLM for answer generation
- Storage: Embeddings are stored in PostgreSQL (pgvector)
Using Ollama for Privacy
For maximum privacy, use Ollama to run models locally:
# All AI processing stays on your server
[knowledge.embedding]
provider = "ollama"
model = "nomic-embed-text"
endpoint = "http://localhost:11434"
[knowledge.llm]
provider = "ollama"
model = "llama3"
endpoint = "http://localhost:11434"
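Before enabling this configuration, you can confirm the local Ollama instance is reachable and has the required models pulled by querying its model list:
# Lists models available to the local Ollama instance
$ curl http://localhost:11434/api/tags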
Disabling AI Features
# Disable knowledge layer entirely
[knowledge]
enabled = false