Gives Claude Code persistent memory across sessions. Claude recalls your codebase architecture without file traversal, enforces project conventions during code generation, tracks decisions with rationale, and remembers your preferences across all projects. Zero-config via Hooks + MCP.
 Dependencies

Runtime

~> 2.14
~> 0.1, >= 0.1.9
~> 5.102
~> 1.8
 Project Readme

ClaudeMemory

Long-term memory for Claude Code - automatic, intelligent, zero-configuration

What It Does

ClaudeMemory gives Claude Code a persistent memory across all your conversations. It automatically:

  • ✅ Extracts durable facts from conversations (tech stack, preferences, decisions)
  • ✅ Remembers project-specific and global knowledge
  • ✅ Provides instant recall without manual prompting
  • ✅ Maintains truth (handles conflicts, supersession)

No API keys. No configuration. Just works.

Quick Start

1. Install the Gem

gem install claude_memory

2. Install the Plugin

From within Claude Code, add the marketplace and install the plugin:

# Add the marketplace (one-time setup)
/plugin marketplace add codenamev/claude_memory

# Install the plugin
/plugin install claude-memory

3. Initialize Memory

Initialize both global and project-specific memory:

claude-memory init

This creates:

  • Global database (~/.claude/memory.sqlite3) - User-wide preferences
  • Project database (.claude/memory.sqlite3) - Project-specific knowledge

4. Analyze Your Project (Optional)

Bootstrap memory with your project's tech stack:

/claude-memory:analyze

This reads your project files (Gemfile, package.json, etc.) and stores facts about languages, frameworks, tools, and conventions.

5. Verify Setup

claude-memory doctor

Use with Claude Code

Just talk naturally! Memory happens automatically.

You: "I'm building a Rails app with PostgreSQL, deploying to Heroku"
Claude: [helps with setup]

# Behind the scenes:
# - Session transcript ingested
# - Facts extracted automatically
# - No user action needed

Later:

You: "Help me add a background job"
Claude: "Based on my memory, you're using Rails with PostgreSQL..."

👉 See Getting Started Guide →
👉 View Example Conversations →

Why It Matters: Real A/B Test Results

We tested identical prompts with and without ClaudeMemory to measure the actual impact. Here's what we found:

Architecture Recall Without File Traversal

Prompt: "Explain the conflict detection and resolution system. Answer from knowledge only - do not read any files."

|          | Without Memory | With Memory |
|----------|----------------|-------------|
| Response | 16 lines: "I don't know this codebase - let me read the files" | 76 lines: correct 4-role PredicatePolicy explanation, resolution pipeline, specific examples |
| Outcome  | Honest refusal - zero architectural understanding | Deep understanding without touching the filesystem |

Correct File Paths vs Hallucinated Guesses

Prompt: "I want to add a new predicate. Walk me through every file I need to update."

|          | Without Memory | With Memory |
|----------|----------------|-------------|
| Response | 6 steps targeting 3 non-existent files (predicate.rb, predicate_synonyms.rb, json_schema.rb) | 8 steps, all targeting real files with correct paths |
| Outcome  | Plausible but wrong - would waste developer time | Actionable, correct, references actual commits |

Cross-Project Preferences

Prompt: "What are my standard development environment preferences across all my projects?"

|          | Without Memory | With Memory |
|----------|----------------|-------------|
| Response | "I don't have stored knowledge of your preferences" | Lists 7 real preferences, including iTerm2, tmux, VS Code, PostgreSQL, Redis, and Docker |
| Outcome  | Blank slate every session | Personalized from day one |

When Memory Doesn't Help

File-searchable questions ("what version is this?") and one-shot code generation without explicit recall don't benefit from memory, because grep is equally effective there. Memory shines when the answer isn't in any single file: architecture spanning dozens of classes, conventions from past sessions, decisions with rationale, and user preferences.

How It Works

  1. You chat with Claude - Tell it about your project
  2. Facts are extracted - Claude identifies durable knowledge
  3. Memory persists - Stored locally in SQLite
  4. Automatic recall - Claude remembers in future conversations

👉 Architecture Deep Dive →

Key Features

  • Dual Scope: Project-specific + global user preferences
  • Hybrid Search: FTS5 full-text + semantic vector search with Reciprocal Rank Fusion
  • Native Vector Storage: sqlite-vec for fast KNN search with local embeddings (fastembed-rb, no API key)
  • Session Context: Automatic context injection at session start with recent facts
  • Privacy First: <private> tags exclude sensitive data
  • Progressive Disclosure: Lightweight queries before full details
  • Semantic Shortcuts: Quick access to decisions, conventions, architecture
  • Truth Maintenance: Automatic conflict resolution
  • Claude-Powered: Uses Claude's intelligence to extract facts (no API key needed)
  • Token Efficient: 10x reduction in memory queries with progressive disclosure
  • Database Maintenance: Compact, export, and backup commands
  • Built-in Observability (0.10.0+): claude-memory dashboard opens a local web UI with a moments feed, trust panel (token budget, quality score, utilization, feedback), conflicts dedup, knowledge index, and 👍/👎 feedback (see Dashboard guide →). claude-memory digest writes a weekly markdown report (Activity, Context cost, Quality, New knowledge, Utilization, Conflicts, Feedback); claude-memory show prints what would be injected at the next SessionStart; claude-memory census audits the predicate vocabulary across projects.
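
The Hybrid Search bullet above mentions Reciprocal Rank Fusion. As a rough illustration of that technique only (not ClaudeMemory's actual code), the sketch below fuses two ranked lists of fact ids. The method name, the sample ids, and the constant k = 60 are assumptions chosen for the example.

```ruby
# Reciprocal Rank Fusion: score(id) = sum over lists of 1 / (k + rank),
# where rank starts at 1. k = 60 is a commonly used default, assumed here.
def reciprocal_rank_fusion(ranked_lists, k: 60)
  scores = Hash.new(0.0)
  ranked_lists.each do |ids|
    ids.each_with_index do |id, index|
      scores[id] += 1.0 / (k + index + 1)
    end
  end
  # Highest fused score first; ties are broken by id for a stable order.
  scores.sort_by { |id, score| [-score, id.to_s] }.map(&:first)
end

fts_hits    = [:fact_a, :fact_b, :fact_c] # hypothetical full-text ranking
vector_hits = [:fact_b, :fact_d, :fact_a] # hypothetical vector KNN ranking

fused = reciprocal_rank_fusion([fts_hits, vector_hits])
puts fused.inspect # => [:fact_b, :fact_a, :fact_d, :fact_c]
```

Facts that appear near the top of both lists rise above facts that rank well in only one, which is why fact_b and fact_a lead the fused order.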

What's New in 0.11.0

Five user-visible signals so you can answer "is memory still worth it?" with numbers, not vibes:

  • Token budget telemetry: every SessionStart context injection now records its estimated context_tokens. claude-memory stats --tokens [--since DAYS] reports p50/p95/avg/min/max plus a histogram across <500 / 500-1k / 1-2k / 2-5k / 5k+ buckets so you can see the per-session cost at a glance. The dashboard's Trust panel and claude-memory digest surface the same numbers.
  • Hallucination-rate metric: the dashboard now scores how clean the fact base is, not just how full it is. Distill::BareConclusionDetector flags decision / convention facts that skipped the reason-clause requirement. The Trust panel shows quality_score (live 30-day window with historical baseline beneath). claude-memory digest adds a Quality section with rejection rate.
  • claude-memory show: a new command that prints what memory would inject at the next SessionStart in plain Markdown. The footer reports fact count, ~token estimate, and char count so you see the cost at a glance. By default it hides the raw-transcript "Pending Knowledge" dump for readability; --pending opts in. --source startup|resume|clear simulates each fresh-session entrypoint.
  • First-week ROI nudge: at SessionEnd, memory now prints "memory contributed N facts this session, %used = X" for the first 10 sessions, then goes quiet. It is a cold-start trust signal, so you don't have to know about the dashboard. Opt out with CLAUDE_MEMORY_NO_NUDGE=1.
  • Harm benchmark prototype: the first ClaudeMemory benchmark that measures whether memory can make Claude wrong. Three hand-written cases (stale-tech, mismatched-scope, superseded-but-undetected) live under spec/benchmarks/e2e/harm_bench_spec.rb. A real-mode run on the 0.11 release reported 0/3 harm; the full 10-15-case corpus + release gate lands in 0.12.
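
To make the token budget numbers above concrete, here is a minimal sketch of how p50/p95/avg/min/max and the five-bucket histogram could be computed from a list of per-session token estimates. It is not the gem's implementation: the method names, the nearest-rank percentile method, the exact bucket boundaries (for example, whether 500 falls in "<500" or "500-1k"), and the sample values are all assumptions.

```ruby
# Nearest-rank percentile over a sorted copy of the samples (assumed method).
def percentile(values, pct)
  sorted = values.sort
  rank = (pct / 100.0 * sorted.size).ceil
  sorted[[rank, 1].max - 1]
end

# Bucket labels follow the README; the boundary handling is an assumption.
BUCKETS = {
  "<500"   => (0...500),
  "500-1k" => (500...1_000),
  "1-2k"   => (1_000...2_000),
  "2-5k"   => (2_000...5_000),
  "5k+"    => (5_000..)
}.freeze

def token_summary(samples)
  {
    p50: percentile(samples, 50),
    p95: percentile(samples, 95),
    avg: samples.sum.to_f / samples.size,
    min: samples.min,
    max: samples.max,
    histogram: BUCKETS.transform_values { |range| samples.count { |n| range.cover?(n) } }
  }
end

# Hypothetical context_tokens values from seven sessions.
samples = [320, 480, 750, 1_200, 1_800, 2_400, 5_600]
summary = token_summary(samples)
puts summary.inspect
```

With these sample values the median is 1200 tokens while p95 is 5600, the kind of gap between typical and worst-case sessions that a histogram makes visible.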

What's New in 0.10.0

Three behavior changes worth knowing about. They affect what you'll see in extracted facts and SessionStart context, even if you don't change anything:

  • Auto-memory mirror: On fresh sessions, the SessionStart context hook scans ~/.claude/projects/<slug>/memory/*.md and surfaces new or changed entries as candidates for extraction into ClaudeMemory. You'll see a "Pending Knowledge Extraction" section in Claude's startup context citing files from your auto-memory directory. Claude reviews these and calls memory.store_extraction for the high-signal ones; you don't need to copy-paste manually anymore.
  • Why-clause enforcement: When Claude distills decision and convention facts, it's now required to embed a reason ("…because…", "…so that…", "…to avoid…"). A bare conclusion is dead weight; a fact with a reason stays useful when the situation changes. You'll see this reflected in fact text being longer and more justified.
  • Reference predicate: Active facts that look like reference material (LOC counts, "X is a plugin/library/tool" templates, "by Firstname Lastname" attributions) are auto-tagged predicate=reference instead of convention. This keeps the conventions list signal-rich. Browse them in the dashboard's Knowledge → References section, or run claude-memory reclassify-references --dry-run to see candidates.

Plus: staleness detection (claude-memory stats --stale) lists active facts that haven't been recalled in N days, so you can prune dead weight explicitly. The dashboard's Trust → Needs review panel surfaces the count.
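
The why-clause rule described above can be pictured as a simple text check. The sketch below is a hypothetical illustration and is not the gem's Distill::BareConclusionDetector: the method name, the marker list (taken from the three example phrases above), and the sample facts are assumptions.

```ruby
# Phrases that signal an embedded reason, taken from the README's examples.
REASON_MARKERS = /\b(because|so that|to avoid)\b/i

# Only these predicates are described as requiring a reason clause.
REASON_REQUIRED_PREDICATES = %w[decision convention].freeze

# True when a fact that should carry a reason has none of the markers.
def bare_conclusion?(predicate, text)
  REASON_REQUIRED_PREDICATES.include?(predicate) && !text.match?(REASON_MARKERS)
end

puts bare_conclusion?("decision", "Use PostgreSQL")                          # => true
puts bare_conclusion?("decision", "Use PostgreSQL because we rely on JSONB") # => false
puts bare_conclusion?("reference", "Rails is a web framework")               # => false
```

A keyword check like this is deliberately crude. It misses reasons phrased in other ways, so treat it as a picture of the idea, not of the detector's real behavior.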

Privacy Control

Exclude sensitive data from memory using privacy tags:

You: "My API key is <private>sk-abc123</private>"
Claude: [uses it during session]

# Stored: "API endpoint configured with key"
# NOT stored: "sk-abc123"

Supported tags: <private>, <no-memory>, <secret>
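
As a rough picture of what tag-based exclusion involves, the sketch below strips any span wrapped in one of the three supported tags before the text would be stored. It is an assumption-laden illustration, not the gem's code: the constant and method names are hypothetical, and it simply deletes the tagged span instead of producing a summarized fact like the one in the example above.

```ruby
# Tag names come from the README; everything else here is hypothetical.
PRIVACY_TAGS = %w[private no-memory secret].freeze

# Matches <tag>...</tag> for any supported tag, across line breaks.
# The backreference \1 requires the closing tag to match the opening one.
PRIVACY_PATTERN = %r{<(#{PRIVACY_TAGS.map { |tag| Regexp.escape(tag) }.join("|")})>.*?</\1>}m

def strip_private(text)
  text.gsub(PRIVACY_PATTERN, "")
end

puts strip_private("My API key is <private>sk-abc123</private>").inspect
# => "My API key is "
```

Mismatched pairs such as `<private>…</secret>` are left untouched by this pattern. A real implementation would need to decide how to handle them.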

Upgrading

Existing users can upgrade seamlessly:

gem update claude_memory

All database migrations happen automatically. Run claude-memory doctor to verify.

See CHANGELOG.md for detailed release notes.

Troubleshooting

Check Setup Status

If memory tools aren't working, check initialization status:

memory.check_setup

This returns:

  • Initialization status (healthy, needs_upgrade, not_initialized)
  • Version information
  • Missing components
  • Actionable recommendations

Installation Help

Need help getting started? Run:

/setup-memory

This skill provides:

  • Step-by-step installation instructions
  • Common error solutions
  • Post-installation verification
  • Upgrade guidance

Health Check

Verify your ClaudeMemory installation:

claude-memory doctor

This checks:

  • Database existence and integrity
  • Schema version compatibility
  • sqlite-vec availability and index coverage
  • Hooks configuration
  • Snapshot status
  • Stuck operations

Uninstalling

To remove ClaudeMemory configuration:

# Remove hooks and MCP configuration (keeps databases)
claude-memory uninstall

# Remove everything including databases
claude-memory uninstall --full

# For global uninstall
claude-memory uninstall --global
claude-memory uninstall --global --full

The uninstall command removes:

  • Hooks from .claude/settings.json
  • MCP server from .claude.json
  • ClaudeMemory section from CLAUDE.md
  • Databases and generated files (with --full)

Note: The doctor command will warn you if orphaned hooks are detected (hooks configured but MCP plugin removed). Run claude-memory uninstall to clean them up.

Documentation

  • 📖 Getting Started - Step-by-step onboarding
  • 💡 Examples - Use cases and workflows
  • 📊 Dashboard - Local web UI for inspection and trust signals (0.10.0+)
  • 🔧 Plugin Setup - Claude Code integration
  • 🏗️ Architecture - Technical deep dive
  • 🔒 API Stability - What's stable / experimental / internal across releases (0.12.0+)
  • 📝 Changelog - Release notes

Benchmarks

ClaudeMemory includes DevMemBench, a developer-domain benchmark suite that measures retrieval quality and truth maintenance accuracy. All offline benchmarks run locally at zero cost.

Latest Results

| Benchmark          | Metric                           | Score |
|--------------------|----------------------------------|-------|
| Truth Maintenance  | Accuracy (100 cases)             | 100%  |
| FTS5 Retrieval     | Recall@5 (40 easy queries)       | 97.5% |
| Semantic Retrieval | Recall@5 (85 queries aggregate)  | 78.6% |
| Semantic Retrieval | Recall@5 (40 medium queries)     | 69.6% |
| Hybrid Retrieval   | Recall@5 (100 queries aggregate) | 72.7% |
| Hybrid Retrieval   | Recall@10 (20 hard queries)      | 62.8% |
| Scope Ranking      | Queries returning expected facts | 5/5   |

Semantic and hybrid retrieval use fastembed-rb with the BAAI/bge-small-en-v1.5 model (384-dim, runs locally, no API key needed).

What the benchmarks measure

Retrieval accuracy -- Given a database of ~105 developer-domain facts across 5 simulated projects, how well does search find the right facts? Measured with standard IR metrics (Recall@k, MRR, nDCG@10) across 155 queries at varying difficulty levels (exact keyword match, semantic paraphrase, cross-category synthesis, abstention, temporal).
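
For readers unfamiliar with the metrics named above, here is a minimal sketch of Recall@k and MRR using their standard definitions. It is not taken from the benchmark suite: the method names and sample ids are made up, and returning 0.0 for a query with no relevant facts is an assumption (abstention queries may well be scored differently).

```ruby
# Recall@k: fraction of the relevant ids that appear in the top k results.
def recall_at_k(retrieved, relevant, k)
  return 0.0 if relevant.empty? # assumption: no relevant facts scores 0.0
  (retrieved.first(k) & relevant).size.to_f / relevant.size
end

# Reciprocal rank: 1 / position of the first relevant result, or 0.0 if none.
def reciprocal_rank(retrieved, relevant)
  index = retrieved.index { |id| relevant.include?(id) }
  index ? 1.0 / (index + 1) : 0.0
end

# MRR: mean of the reciprocal ranks over all queries.
def mean_reciprocal_rank(runs)
  runs.sum { |retrieved, relevant| reciprocal_rank(retrieved, relevant) } / runs.size
end

# Each run is [retrieved ids in rank order, relevant ids]; values are made up.
runs = [
  [[1, 2, 3], [2]],    # first relevant hit at rank 2
  [[4, 5, 6], [4, 9]], # first relevant hit at rank 1, one relevant id missed
  [[7, 8],    [1]]     # no relevant hit
]

puts recall_at_k([4, 5, 6], [4, 9], 5) # => 0.5
puts mean_reciprocal_rank(runs)        # => 0.5
```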

Truth maintenance -- Given pairs of existing and incoming facts, does the resolver correctly determine the outcome? 100 FEVER-inspired cases test four outcomes: supersession (new stated fact replaces old), conflict (inferred fact contradicts stated), accumulation (multi-value predicates coexist), and corroboration (same fact adds provenance).
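
The four outcomes above can be read as a small decision procedure. The sketch below is a simplified illustration under stated assumptions and is not the gem's resolver: the Fact struct, the predicate names (uses_database, uses_framework), and the rule order are hypothetical, both facts are assumed to share the same subject and predicate, and any contradiction that is not a stated replacement is treated as a conflict.

```ruby
# Hypothetical fact shape; source is :stated or :inferred.
Fact = Struct.new(:predicate, :value, :source, keyword_init: true)

# Hypothetical list of predicates that may hold several values at once.
MULTI_VALUE_PREDICATES = %w[uses_framework].freeze

# Assumes both facts share the same subject and predicate.
def resolve(existing, incoming)
  return :corroboration if existing.value == incoming.value
  return :accumulation  if MULTI_VALUE_PREDICATES.include?(incoming.predicate)
  return :supersession  if incoming.source == :stated
  :conflict
end

old_db = Fact.new(predicate: "uses_database", value: "mysql", source: :stated)
new_db = Fact.new(predicate: "uses_database", value: "postgresql", source: :stated)
guess  = Fact.new(predicate: "uses_database", value: "sqlite", source: :inferred)

puts resolve(old_db, new_db) # => supersession
puts resolve(new_db, guess)  # => conflict
```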

End-to-end with Claude -- 31 scenarios across 5 LongMemEval ability categories (information extraction, multi-session reasoning, temporal reasoning, knowledge updates, abstention). Requires EVAL_MODE=real and costs ~$2-8 per run.

Running benchmarks

# Offline benchmarks ($0, ~8 seconds)
bundle exec rspec spec/benchmarks/ --tag benchmark --format documentation

# Full evals + benchmarks
./bin/run-evals --all

# End-to-end with real Claude (~$2-8)
EVAL_MODE=real bundle exec rspec spec/benchmarks/e2e/ --tag eval_real

The benchmark dataset draws from real CLAUDE.md patterns and is designed specifically for ClaudeMemory's 6 predicates and 8 entity types. Open IR datasets (BEIR, FEVER, LongMemEval) informed the methodology but don't cover developer-domain knowledge.

👉 Benchmark Details →

For Developers

  • Language: Ruby 3.2+
  • Storage: SQLite3 (no external services)
  • Testing: 1964 examples (~1700 unit/integration + ~250 benchmarks/evals), 100% core coverage
  • Code Style: Standard Ruby
git clone https://github.com/codenamev/claude_memory
cd claude_memory
bin/setup
bundle exec rspec

👉 Development Guide →

Support

License

MIT - see LICENSE.txt


Made with ❤️ by Valentino Stoll