Personal Knowledge Base

Feed it URLs and PDFs and it indexes them locally with vector embeddings; ask a question and it answers from the indexed material, citing its sources. Embeddings, vector search, and the database itself all run on your machine; only the answer-synthesis call goes to Anthropic.

Setup

1. Download Friday

Go to hellofriday.ai and download the macOS installer
Open the DMG and drag Friday to your Applications folder
Launch Friday and complete the initial setup

2. Import the workspace

Open Friday and go to Discover Spaces
Find Personal Knowledge Base and click it
Click Add Space

3. First-run dependencies

The two Python agents bring their own dependencies via pyproject.toml. The first time you trigger ingestion or a query, Friday runs uv to:

Provision a Python 3.12 interpreter under ~/.friday/local/uv/python/
Install sentence-transformers, sqlite-vec, and pymupdf into a cached environment
Download the BAAI/bge-large-en-v1.5 embedding model (~1.3 GB) from HuggingFace into ~/.cache/huggingface/

This happens once per host and takes a couple of minutes; subsequent runs are fast. There is no manual pip or venv step.

On a slow or flaky connection the ~1.3 GB model download can stall. If it does, set HF_HUB_DISABLE_XET=1 in Friday's environment and retrigger; this switches HuggingFace to the plain HTTP downloader, which is more resilient to interrupted transfers.

4. Make sure Python can load SQLite extensions

sqlite-vec is a loadable SQLite extension, so the agents need a Python whose sqlite3 module was compiled with extension support. The default macOS system Python and the python.org 3.12 build ship without it, and uv run --python 3.12 may select one of those. When that happens, the agents fail immediately with a clear error:

This Python lacks SQLite loadable-extension support, which sqlite-vec requires…

The fix is one command — install a uv-managed CPython, which has extension support enabled:

uv python install 3.12

Then retrigger the workspace. The requires-python = ">=3.12" setting in each agent's pyproject.toml gates only the version, not the build flag, so the problem can't be caught at install time; the preflight guard catches it at runtime instead.
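
The guard's actual code is not shown here. As a rough sketch of how such a runtime check could work, the snippet below uses only the standard library to test whether the current interpreter's sqlite3 module can load extensions. The function names and the error text are hypothetical, not the agents' own.

```python
import sqlite3


def sqlite_extensions_supported() -> bool:
    """Return True if this interpreter's sqlite3 module can load extensions."""
    conn = sqlite3.connect(":memory:")
    try:
        # The method is missing entirely when CPython was built without
        # --enable-loadable-sqlite-extensions.
        if not hasattr(conn, "enable_load_extension"):
            return False
        conn.enable_load_extension(True)
        conn.enable_load_extension(False)
        return True
    except (AttributeError, sqlite3.NotSupportedError):
        return False
    finally:
        conn.close()


def preflight() -> None:
    """Hypothetical guard: stop early with a readable message."""
    if not sqlite_extensions_supported():
        raise SystemExit(
            "sqlite3 cannot load extensions in this interpreter; "
            "try: uv python install 3.12"
        )
```

Calling preflight() before opening kb.db would turn a confusing failure deep inside the extension loader into a single actionable message.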

5. That's it

No API keys to configure beyond Anthropic (already set during Friday setup). No external services. The vector DB lives at ~/.friday/local/workspaces/personal-knowledge-base/kb.db.

How it works

Component	Role
`ingest-url` / `ingest-pdf` / `query-kb` signals	HTTP triggers at the matching paths
`ingest-url` job	Two-state FSM: `idle` → `ingest`
`ingest-pdf` job	Two-state FSM: `idle` → `ingest`
`query-kb` job	Two-state FSM: `idle` → `answer`
`url-ingester`, `pdf-ingester`	Both point at the `kb-ingest-agent` Python agent — same code, different signal payloads
`kb-query`	Points at the `kb-query-agent` Python agent
`kb-ingest-agent`	Python user agent: chunks content, embeds with BGE, writes to SQLite + `sqlite-vec`. Uses `pymupdf` for PDF text extraction.
`kb-query-agent`	Python user agent: embeds the question, runs a nearest-neighbor (KNN) search via a `sqlite-vec` `MATCH` query, hands the top 10 chunks to Claude Sonnet for synthesis.
`knowledge-base` long-term memory	Narrative log of every successful ingestion (source, title, doc_id, chunk count) — useful for "what have I ingested?" queries via chat.
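
The table says kb-ingest-agent chunks content and embeds it with BGE, but does not describe the chunking strategy. The sketch below assumes fixed-size word windows with overlap, which is one common approach and not necessarily the agent's. The function names, window size, and overlap are hypothetical. The sentence-transformers import is deferred so the chunker runs without the model installed.

```python
def chunk_text(text: str, size: int = 200, overlap: int = 40) -> list[str]:
    """Split text into overlapping windows of `size` words (assumed strategy)."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed_chunks(chunks: list[str]):
    """Embed chunks with the BGE model named in the setup section."""
    # Deferred import: needs sentence-transformers and the downloaded model.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("BAAI/bge-large-en-v1.5")
    # Normalized vectors make cosine similarity equal to the dot product.
    return model.encode(chunks, normalize_embeddings=True)
```

For example, chunk_text("a b c d e f g", size=3, overlap=1) yields three windows that each share one word with their neighbor.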

The DB schema is three tables: documents (one row per ingested source), chunk_metadata (chunk text + foreign key), and chunk_embeddings (a vec0 virtual table holding 1024-dim float vectors). The kb-query-agent joins the vector match against chunk_metadata and documents to surface source titles in the synthesized answer.
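
To make the join concrete, here is a minimal sketch of the schema and the lookup that maps vector hits back to source titles. Only the three table names and the 1024-dim vec0 table come from the description above; every column name, the resolve_hits helper, and the demo data are assumptions. The two sqlite-vec statements are kept as strings because they need the extension loaded, so the runnable part covers only the plain tables.

```python
import sqlite3

# Hypothetical column names; the source only names the tables.
SCHEMA = """
CREATE TABLE documents (
    doc_id INTEGER PRIMARY KEY,
    source TEXT NOT NULL,
    title  TEXT
);
CREATE TABLE chunk_metadata (
    chunk_id INTEGER PRIMARY KEY,
    doc_id   INTEGER NOT NULL REFERENCES documents(doc_id),
    text     TEXT NOT NULL
);
"""

# Requires sqlite-vec to be loaded, so it is not executed in this sketch.
VEC_SCHEMA = "CREATE VIRTUAL TABLE chunk_embeddings USING vec0(embedding float[1024])"

# KNN lookup against the vec0 table; also requires sqlite-vec.
KNN_SQL = """
SELECT rowid, distance FROM chunk_embeddings
WHERE embedding MATCH ? AND k = 10
ORDER BY distance
"""


def resolve_hits(conn, chunk_ids):
    """Map matched chunk ids to (title, source, text), keeping rank order."""
    if not chunk_ids:
        return []
    placeholders = ",".join("?" * len(chunk_ids))
    rows = conn.execute(
        f"""
        SELECT c.chunk_id, d.title, d.source, c.text
        FROM chunk_metadata AS c
        JOIN documents AS d ON d.doc_id = c.doc_id
        WHERE c.chunk_id IN ({placeholders})
        """,
        list(chunk_ids),
    ).fetchall()
    by_id = {row[0]: row[1:] for row in rows}
    # IN (...) does not preserve order, so restore the vector-search ranking.
    return [by_id[cid] for cid in chunk_ids if cid in by_id]


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute(
        "INSERT INTO documents VALUES (1, 'https://example.com/a', 'Example A')"
    )
    conn.executemany(
        "INSERT INTO chunk_metadata VALUES (?, 1, ?)",
        [(1, "first chunk"), (2, "second chunk")],
    )
    # Pretend the vector search ranked chunk 2 ahead of chunk 1.
    return resolve_hits(conn, [2, 1])
```

Keeping the chunk text and titles in ordinary tables means the vec0 table only has to store vectors keyed by rowid, and the titles needed for citations come from one join.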