# pgmnemo

> A PostgreSQL extension for AI-agent memory that grades itself by real outcomes and lets you read why in plain SQL. Runs inside your own Postgres, not a SaaS. Apache-2.0. Current version: v0.10.1.

pgmnemo is agent memory, not a black box. Recall ranking is exposed as inspectable SQL columns (`recall_hybrid()` returns `rrf_score`, `vec_score`, and `bm25_score`). Outcomes feed back into ranking via `reinforce()` / `confidence` (v0.7.0). Ingestion is a SQL constraint check with zero LLM API calls. As of v0.10.0, `recall_fast()` is a pure-HNSW synchronous path (p50 22–46ms, p95 57–99ms at k=10 on a live 7,612-lesson corpus, 0% timeouts) and is the MCP default; pass `deep=true` for full hybrid fusion. v0.10.1 hardens the `recall_hybrid` BM25 path (graceful vector-only fallback on timeout) and makes `recall_fast` reject a NULL embedding instead of returning NULL scores.
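
To make the `rrf_score` column concrete, here is a minimal Python sketch of Reciprocal Rank Fusion, the general technique that the name refers to. It is an illustration only, not pgmnemo's implementation: the function name `rrf_fuse`, the constant `k=60` (the conventional default for RRF), the equal weighting of the two rankings, and the sample IDs are all assumptions that this document does not confirm.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked ID lists with Reciprocal Rank Fusion.

    Each item scores sum(1 / (k + rank)) over the rankings it appears in,
    with ranks starting at 1. k=60 is an assumed, conventional constant.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, item_id in enumerate(ranking, start=1):
            scores[item_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for a stable order.
    return sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))


# Hypothetical rankings from a vector search and a BM25 search.
vec_ranking = ["a", "b", "c"]
bm25_ranking = ["b", "d", "a"]

fused = rrf_fuse([vec_ranking, bm25_ranking])
fused_ids = [item_id for item_id, _ in fused]
```

Items that rank well in both lists rise to the top, which is why a fused score can be inspected alongside the separate vector and BM25 scores to see which signal drove a result.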

Positioning vs others: Zep = temporal; Mem0 = extraction; Letta = agent statefulness; pgmnemo = outcome-graded + SQL-inspectable, in your own Postgres, with the first published p50/p95 synchronous-recall latency we've found for a SQL-native memory layer.

## Core
- [Homepage](https://pgmnemo.com/): what it is, how it works, comparison, benchmarks
- [Full LLM context](https://pgmnemo.com/llms-full.txt): complete site content as markdown
- [GitHub](https://github.com/pgmnemo/pgmnemo): source, README, AGENTS.md integration guide
- [CHANGELOG](https://github.com/pgmnemo/pgmnemo/blob/main/CHANGELOG.md): per-version release notes

## Docs
- [Install](https://pgmnemo.com/#install): PGXN, Docker, source, PyPI/MCP. Postgres 17, pgvector ≥0.7.0.
- [API](https://pgmnemo.com/#api): `ingest()`, `recall_fast()`, `recall_hybrid()`, `navigate_locate()`/`navigate_expand()`, `reinforce()`
- [FAQ](https://pgmnemo.com/#faq)

## Blog
- [The first published p50/p95 latency for SQL-native agent recall (v0.10.0)](https://pgmnemo.com/blog/2026-06-22-pgmnemo-recall-fast-latency/)
- [Token-economy navigation (v0.8.0)](https://pgmnemo.com/blog/2026-06-05-pgmnemo-v0.8.0/)
- [What ships in v0.6.0](https://pgmnemo.com/blog/2026-05-22-pgmnemo-v0.6.0/)

## Benchmarks (honest)
- `recall_fast()` synchronous recall: p50 22–46ms, p95 57–99ms at k=10 on a live 7,612-lesson corpus, 0% timeouts (first published p50/p95 we've found for SQL-native agent recall)
- `recall_fast` reaches 80% overlap@10 with full `recall_hybrid` at 2–6.6× lower latency; honest tradeoff: the 10th-percentile overlap@10 is 0.40 (the worst 10% of queries share at most 4 of their top 10 results), so pass `deep=true` for exploratory or safety-critical lookups
- `recall_hybrid` times out on 13.8–27% of queries on dense-graph corpora (73k+ edges), which is why `recall_fast` is the correct MCP default
- $0 ingestion (zero LLM calls at write time)
- LongMemEval-S recall@10 = 0.9604 (RRF); a 50-line BM25 script reaches 0.9820. We publish both.
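
The overlap@10 and p50/p95 figures above can be read with a small Python sketch like the following. The definitions are assumptions, since this document does not spell them out: overlap@k is taken as the share of the deep path's top k that the fast path also returns, and percentiles use the nearest-rank method. The helper names and all sample values are hypothetical and are not benchmark data.

```python
import math


def overlap_at_k(fast_ids, deep_ids, k=10):
    """Fraction of the top-k IDs shared by two result lists (assumed definition)."""
    shared = set(fast_ids[:k]) & set(deep_ids[:k])
    return len(shared) / k


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical top-10 results for one query from each recall path.
fast_ids = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
deep_ids = [1, 2, 3, 4, 11, 12, 13, 14, 15, 16]
query_overlap = overlap_at_k(fast_ids, deep_ids)

# Hypothetical per-query latencies in milliseconds.
latencies_ms = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
```

Under these definitions, a query like the one above scores 0.40, which is the kind of low-overlap case where the full hybrid path is worth its extra latency.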