Self-Hosting
Momo is a self-hostable AI memory system designed to run as a single binary with environment-variable-based configuration.
Architecture
```mermaid
graph TD
    subgraph MomoServer["Momo Server"]
        subgraph API["API Layer"]
            REST["REST API (v1)<br/>Documents · Memories · Search"]
            Admin["Admin API<br/>(authed)"]
        end
        subgraph Services["Services Layer"]
            SearchSvc["SearchService"]
            MemorySvc["MemoryService"]
            Pipeline["ProcessingPipeline"]
            Forgetting["ForgettingManager"]
            Decay["EpisodeDecay"]
            Profile["ProfileRefresh"]
        end
        subgraph Core["Core Modules"]
            Intelligence["Intelligence<br/>Inference · Contradict · Filter"]
            Processing["Processing<br/>Extract · Chunk · Embed"]
            Embeddings["Embeddings<br/>FastEmbed or API · Reranker"]
        end
        subgraph DB["Database Layer (LibSQL / Turso)"]
            Documents["Documents"]
            Chunks["Chunks"]
            Memories["Memories"]
            Vectors["Vectors"]
            Relationships["Relationships"]
        end
        subgraph Providers["External Providers"]
            OCR["OCR<br/>Tesseract or API"]
            Transcription["Transcription<br/>Whisper or API"]
            LLM["LLM Provider<br/>OpenAI / Ollama /<br/>OpenRouter / Local"]
        end
        REST & Admin --> Services
        Services --> Intelligence & Processing & Embeddings
        Processing --> DB
    end
```
Prerequisites
- Rust 1.75+ (if building from source)
- Tesseract OCR (optional): for text extraction from images and PDFs.
  - macOS: `brew install tesseract`
  - Ubuntu/Debian: `sudo apt-get install tesseract-ocr tesseract-ocr-eng`
- LLM API key (optional): required for advanced features like contradiction detection, query rewriting, and memory inference.
Installation
From Source

```sh
git clone https://github.com/momomemory/momo.git
cd momo
cargo build --release
```

The compiled binary will be located at `./target/release/momo`.
Docker
Momo is available as a pre-built image on GitHub Container Registry (GHCR).

```sh
# One-command setup (recommended)
docker run --name momo -d --restart unless-stopped \
  -p 3000:3000 \
  -e MOMO_API_KEYS=dev-key \
  -v momo-data:/data \
  ghcr.io/momomemory/momo:latest

# Follow logs
docker logs -f momo

# Stop and remove the container (data remains in the momo-data volume)
docker stop momo && docker rm momo
```

Note: The `/data` volume stores the database. Using the named volume `momo-data` keeps data across container restarts and redeploys.
Using Just (Development)
The monorepo includes a justfile for common tasks:

```sh
# Development server (backend + frontend with hot reload)
just dev

# Backend only (requires cargo-watch)
just dev-backend

# Frontend only (requires Bun)
just dev-frontend

# Debug/trace logging
just dev-debug   # RUST_LOG=momo=debug
just dev-trace   # RUST_LOG=momo=trace

# Build release binary
just build-release

# Run tests
just test

# Lint and format
just fmt
just lint

# Full CI check
just ci
```

Running
Momo is configured entirely via environment variables. When running, three interfaces are available:
| Interface | Path | Description |
|---|---|---|
| Web Console | / | Built-in Preact frontend for browsing memories and documents |
| REST API | /api/v1 | Full REST API for integration |
| MCP | /mcp (configurable) | Model Context Protocol endpoint |
```sh
# Running with default settings (creates momo.db in the current directory)
./target/release/momo

# Running with custom configuration
DATABASE_URL=file:my-memory.db MOMO_PORT=8080 ./target/release/momo
```

After starting, open http://localhost:3000 to access the web console.
Embedded Use (C FFI)
Momo also ships an embedded C FFI in `momo/ffi` for applications that want to link the engine directly instead of talking to the HTTP server.
Build it from `momo/`:

```sh
cargo build -p momo-ffi
```

This produces:

```
target/debug/libmomo_ffi.dylib
target/debug/libmomo_ffi.a
ffi/include/momo.h
```
The included C example lives at `momo/ffi/examples/c` and shows the minimal lifecycle:

- Create an engine with `momo_engine_new`
- Call JSON APIs such as `momo_engine_create_memory_json`
- Free returned strings with `momo_string_free`
- Free the engine with `momo_engine_free`
If you are using the FFI instead of REST, see Embedded C FFI Reference for exported functions, JSON request/response shapes, worker behavior, and loader-path notes.
Configuration
Momo follows a `provider/model` string format for external services (Embeddings, LLM, OCR, Transcription).
For a complete reference of all environment variables organized by concern, see Configuration Reference.
If you are embedding Momo through the C FFI, the engine uses this same configuration when `momo_engine_new` is called with `config_json = NULL`.
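To illustrate the `provider/model` convention, here is a minimal parser. The `parse_model_string` helper is hypothetical, not part of Momo; it just shows that only the first slash separates provider from model, so provider-prefixed routes (such as OpenRouter's) keep their full model path:

```python
def parse_model_string(value: str) -> tuple[str, str]:
    """Split a 'provider/model' string into (provider, model).

    The model part may itself contain slashes (e.g. the OpenRouter
    route 'openrouter/openai/text-embedding-3-small'), so only the
    first '/' acts as the separator.
    """
    provider, _, model = value.partition("/")
    if not model:
        raise ValueError(f"expected 'provider/model', got {value!r}")
    return provider, model

print(parse_model_string("openai/text-embedding-3-small"))
# ('openai', 'text-embedding-3-small')
print(parse_model_string("openrouter/openai/text-embedding-3-small"))
# ('openrouter', 'openai/text-embedding-3-small')
```

Note that bare local model names such as `BAAI/bge-small-en-v1.5` also contain a slash; Momo's own handling of local versus `provider/model` strings is internal to the server.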
Server
| Variable | Description | Default |
|---|---|---|
| `MOMO_HOST` | Bind address | `0.0.0.0` |
| `MOMO_PORT` | Listen port | `3000` |
| `MOMO_API_KEYS` | Comma-separated API keys for authentication (required for protected API routes) | (None) |
MCP (Built-in)
| Variable | Description | Default |
|---|---|---|
| `MOMO_MCP_ENABLED` | Enable the built-in MCP server routes | `true` |
| `MOMO_MCP_PATH` | Path for the streamable HTTP MCP endpoint | `/mcp` |
| `MOMO_MCP_REQUIRE_AUTH` | Require Bearer auth for MCP requests | `true` |
| `MOMO_MCP_DEFAULT_CONTAINER_TAG` | Fallback project/container tag when none is provided | `default` |
| `MOMO_MCP_PROJECT_HEADER` | Header used for project scoping (Supermemory compatible) | `x-sm-project` |
| `MOMO_MCP_PUBLIC_URL` | Optional public base URL used in OAuth discovery responses | (None) |
| `MOMO_MCP_AUTHORIZATION_SERVER` | Optional OAuth issuer URL for discovery responses | (None) |
Notes:
- MCP auth keys come from `MOMO_API_KEYS`.
- When `MOMO_MCP_REQUIRE_AUTH=true` and no API keys are configured, MCP requests return `401 Unauthorized`.
- Full protocol usage and manual examples are documented in the MCP Guide.
Database
| Variable | Description | Default |
|---|---|---|
| `DATABASE_URL` | SQLite/LibSQL path or Turso URL | `file:momo.db` |
| `DATABASE_AUTH_TOKEN` | Auth token for a Turso cloud DB | (None) |
| `DATABASE_LOCAL_PATH` | Local replica path for a remote DB | (None) |
Embeddings
Local (FastEmbed):

- `EMBEDDING_MODEL`: Model name (default: `BAAI/bge-small-en-v1.5`)
- `EMBEDDING_DIMENSIONS`: Vector dimensions (default: `384`)
- `EMBEDDING_BATCH_SIZE`: Batch size (default: `256`)

External API:

- `EMBEDDING_MODEL`: Use `provider/model` (e.g., `openai/text-embedding-3-small`)
- `EMBEDDING_API_KEY`: API key for the provider
- `EMBEDDING_BASE_URL`: Custom base URL
- `EMBEDDING_TIMEOUT`: Request timeout in seconds (default: `30`)
- `EMBEDDING_MAX_RETRIES`: Max retry attempts (default: `3`)
- `EMBEDDING_RATE_LIMIT`: Requests per second (optional)
Processing
| Variable | Description | Default |
|---|---|---|
| `CHUNK_SIZE` | Chunk size in tokens | `512` |
| `CHUNK_OVERLAP` | Overlap between chunks, in tokens | `50` |

Note: File uploads are limited to 25MB (hardcoded).
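To see what these two settings control, the sketch below implements a sliding-window chunker where consecutive chunks share `CHUNK_OVERLAP` tokens of context. It is illustrative only (it operates on pre-split token lists; Momo's actual chunker and tokenizer are internal):

```python
def chunk_tokens(tokens: list[str],
                 chunk_size: int = 512,
                 overlap: int = 50) -> list[list[str]]:
    """Slide a window of `chunk_size` tokens, stepping by
    chunk_size - overlap so consecutive chunks share `overlap`
    tokens of context. (A production chunker would also drop a
    trailing chunk already fully covered by the previous one.)"""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk size")
    step = chunk_size - overlap
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), step)]

tokens = [f"t{i}" for i in range(1200)]
chunks = chunk_tokens(tokens)
print(len(chunks))     # 3
print(chunks[1][0])    # 't462', inside the tail of chunk 0 (which ends at t511)
```

With the defaults, each chunk starts 462 tokens after the previous one, so a fact near a chunk boundary still appears intact in one of the two overlapping chunks.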
Transcription
| Variable | Description | Default |
|---|---|---|
| `TRANSCRIPTION_MODEL` | Model (e.g., `local/whisper-small` or `openai/whisper-1`) | `local/whisper-small` |
| `TRANSCRIPTION_API_KEY` | API key for cloud providers | (None) |
| `TRANSCRIPTION_BASE_URL` | Custom base URL | (None) |
| `TRANSCRIPTION_TIMEOUT` | Timeout in seconds | `300` |
| `TRANSCRIPTION_MAX_FILE_SIZE` | Max file size in bytes | `104857600` (100MB) |
| `TRANSCRIPTION_MAX_DURATION` | Max duration in seconds | `7200` (2h) |
Memory & Decay
| Variable | Description | Default |
|---|---|---|
| `EPISODE_DECAY_DAYS` | Half-life for episode decay, in days | `30.0` |
| `EPISODE_DECAY_FACTOR` | Decay multiplier per period | `0.9` |
| `EPISODE_DECAY_THRESHOLD` | Scores below this become candidates for forgetting | `0.3` (0.0-1.0) |
| `EPISODE_FORGET_GRACE_DAYS` | Grace period before permanent forgetting, in days | `7` |
| `FORGETTING_CHECK_INTERVAL` | Check interval in seconds | `3600` |
| `ENABLE_INFERENCES` | Enable the background inference engine | `false` |
| `INFERENCE_INTERVAL_SECS` | Inference run interval in seconds | `86400` (24h) |
| `INFERENCE_CONFIDENCE_THRESHOLD` | Minimum confidence for inferred memories | `0.7` |
| `INFERENCE_MAX_PER_RUN` | Max inferences per cycle | `50` |
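To get a feel for how these decay settings interact, here is a sketch under the assumption that an episode's score starts at 1.0 and decays exponentially by `EPISODE_DECAY_FACTOR` per `EPISODE_DECAY_DAYS` period. The exact formula is internal to Momo; this only models the shape the settings suggest:

```python
def episode_score(age_days: float,
                  decay_days: float = 30.0,
                  decay_factor: float = 0.9) -> float:
    """Decay an episode's score from 1.0 by `decay_factor`
    per elapsed `decay_days` period (continuous exponent)."""
    return decay_factor ** (age_days / decay_days)

def is_forgetting_candidate(age_days: float, threshold: float = 0.3) -> bool:
    """An episode whose score falls below the threshold becomes a
    candidate for forgetting (subject to the grace period)."""
    return episode_score(age_days) < threshold

# Under these assumptions, the score crosses the 0.3 threshold
# after roughly a year (about 11-12 thirty-day periods).
for days in (30, 180, 365):
    print(days, round(episode_score(days), 3))
```

Raising `EPISODE_DECAY_DAYS` or `EPISODE_DECAY_FACTOR` makes memories persist longer; lowering `EPISODE_DECAY_THRESHOLD` makes forgetting more conservative.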
Reranking
- `RERANK_ENABLED`: Enable reranking (opt-in) (default: `false`)
- `RERANK_MODEL`: Reranker model (default: `bge-reranker-base`)
LLM Provider
- `LLM_MODEL`: Model (format: `provider/model`, e.g., `openai/gpt-4o-mini`)
- `LLM_API_KEY`: API key
- `ENABLE_CONTRADICTION_DETECTION`: Enable contradiction logic (default: `false`)
- `ENABLE_QUERY_REWRITE`: Enable query expansion (default: `false`)
- `ENABLE_AUTO_RELATIONS`: Auto-detect relationships (default: `true`)

OCR

- `OCR_MODEL`: OCR provider (default: `local/tesseract`)
- `OCR_LANGUAGES`: Comma-separated language codes (default: `eng`)
- `OCR_MAX_DIMENSION`: Max image dimension (default: `4096`)
Logging
- `RUST_LOG`: Logging level (default: `momo=info,tower_http=debug`)
Models & Providers
Local Embedding Models (FastEmbed)

| Model | Dimensions | Quality | Speed |
|---|---|---|---|
| `BAAI/bge-small-en-v1.5` (default) | 384 | Good | Fast |
| `BAAI/bge-base-en-v1.5` | 768 | Better | Medium |
| `BAAI/bge-large-en-v1.5` | 1024 | Best | Slower |
| `all-MiniLM-L6-v2` | 384 | Good | Fast |
| `nomic-embed-text-v1.5` | 768 | Better | Medium |
External Embedding Providers
| Provider | Example Model | Default Base URL |
|---|---|---|
| OpenAI | `openai/text-embedding-3-small` | `https://api.openai.com/v1` |
| OpenRouter | `openrouter/openai/text-embedding-3-small` | `https://openrouter.ai/api/v1` |
| Ollama | `ollama/nomic-embed-text` | `http://localhost:11434/v1` |
| LM Studio | `lmstudio/bge-small-en-v1.5` | `http://localhost:1234/v1` |
OCR Providers
- `local/tesseract`: Local Tesseract (default)
- `mistral/pixtral-12b`: Mistral OCR API
- `deepseek/deepseek-vl`: DeepSeek OCR API
- `openai/gpt-4o`: OpenAI Vision API
Transcription Providers
- `local/whisper-small`: Local Whisper (default)
- `openai/whisper-1`: OpenAI Whisper API
Features & Management
Content Types

Momo automatically detects and processes:
- Text: Plain text, Markdown, HTML.
- Documents: PDF, DOCX, XLSX, CSV.
- Web: URLs (scrapes page content).
- Images: JPEG, PNG, WebP, TIFF, BMP (via OCR).
- Media: Audio (MP3, WAV, M4A) and Video (MP4, WebM, AVI, MKV) via Transcription.
Changing Embedding Models
If you change your embedding model, Momo will detect a dimension mismatch at startup.
- Without flags: Startup fails with an error if the dimensions don't match.
- With the `--rebuild-embeddings` flag: Documents are queued for reprocessing with the new model. Migration runs in the background; search continues to function with partial results.
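The reason a mismatch is fatal for search can be shown directly: similarity scoring requires stored and query vectors of the same length, so an index built with one model cannot be queried with another. A minimal sketch (not Momo code):

```python
def dot(a: list[float], b: list[float]) -> float:
    """Dot product, the core of cosine similarity; refuses
    vectors of different dimensions."""
    if len(a) != len(b):
        raise ValueError(
            f"embedding dimension mismatch: stored={len(a)}, query={len(b)}; "
            "rebuild embeddings with the new model"
        )
    return sum(x * y for x, y in zip(a, b))

stored = [0.1] * 384   # indexed with a 384-dim model (e.g. bge-small)
query = [0.1] * 768    # embedded with a 768-dim model (e.g. bge-base)
try:
    dot(stored, query)
except ValueError as e:
    print(e)
```

This is why Momo refuses to start on a mismatch unless you opt into the background rebuild.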
Contradiction Detection
Momo can detect when new information contradicts existing memories.
- Heuristic: Immediate detection via negation and value changes (<1ms).
- LLM Confirmation: Optional refinement (~200-500ms).
- Resolution: Old memories are marked as "not latest" and linked to the new entry.
- Required: Set `ENABLE_CONTRADICTION_DETECTION=true`.
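The heuristic tier can be approximated as a fast negation check: flag statements that share most of their content words but differ in negation. This sketch is illustrative only and far cruder than Momo's detector, which also compares extracted values:

```python
import re

NEGATIONS = {"not", "no", "never"}

def _terms(text: str) -> set[str]:
    return set(re.findall(r"[a-z']+", text.lower()))

def might_contradict(old: str, new: str) -> bool:
    """Flag a likely contradiction when two statements share most of
    their content words but only one of them is negated. A real
    detector would also compare values (dates, numbers, entities)."""
    old_t, new_t = _terms(old), _terms(new)
    content_old = old_t - NEGATIONS
    content_new = new_t - NEGATIONS
    union = content_old | content_new
    if not union or len(content_old & content_new) / len(union) < 0.6:
        return False  # the statements are about different things
    return bool(old_t & NEGATIONS) != bool(new_t & NEGATIONS)

print(might_contradict("Alice is vegetarian", "Alice is not vegetarian"))  # True
print(might_contradict("Alice is vegetarian", "Bob plays chess"))          # False
```

Because the check is pure set arithmetic, it runs in well under a millisecond; borderline cases are what the optional LLM confirmation pass is for.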
Graceful Degradation
Momo is designed to remain functional even without external dependencies:
- No LLM: Search and storage work, but advanced features (inference, rewrites) are disabled.
- No Tesseract: Image processing will fail, but text documents work.
- No Whisper: Audio/Video processing will fail, but other ingestion works.
For detailed API information, see API Reference. For MCP integration, see MCP Guide.