Q: Is my data truly private with Maple Memory?

Yes. We use a Zero-Knowledge architecture with AES-256 Application-Level Encryption. We mathematically cannot read your chat history or memories.

Beyond RAG. True Memory. Real intelligence.

Universal Memory for AI Agents.

Q: How does it differ from standard RAG?

Unlike standard stateless RAG, Maple Memory provides a persistent Contextual Memory Graph that automatically branches and merges topics, achieving 98% recall accuracy.

Q: Can I integrate this with existing LLMs?

Absolutely. Maple Memory provides a clean REST API (hm_live_ keys) that works as a middleware for Gemini, Claude, and local models.
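
As a rough illustration of what a middleware call could look like, the sketch below builds (but does not send) an authenticated request with Python's standard library. Only the `hm_live_` key prefix comes from the text above. The base URL, the `/memories` path, the payload fields, and the Bearer-token header are assumptions made for the example, not the documented Maple Memory API.

```python
import json
import urllib.request

# Hypothetical values: the base URL, path, and payload fields are placeholders,
# not taken from the Maple Memory documentation.
BASE_URL = "https://api.example.com/v1"
API_KEY = "hm_live_xxxxxxxxxxxx"  # placeholder key using the hm_live_ prefix


def build_memory_request(user_id: str, message: str) -> urllib.request.Request:
    """Build (but do not send) a POST request that would store one message."""
    body = json.dumps({"user_id": user_id, "content": message}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/memories",
        data=body,
        method="POST",
        headers={
            # Bearer authentication is an assumption for this sketch.
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
    )


req = build_memory_request("user-123", "We decided to use FastAPI for the backend.")
print(req.get_method(), req.full_url)
```

In a real integration the request would be sent before each LLM call, and the returned context would be added to the prompt for Gemini, Claude, or a local model.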

Position your agents for the next frontier. A secure, zero-knowledge memory layer that gives LLMs persistent, contextual intelligence.

Get the Memory Layer Technical Docs

Maple Memory AI memory latency benchmark: 206 ms retrieval

Domain Expertise

Generic RAG is a liability. We deploy tailored architectures that master your unique organisational knowledge with 98% recall accuracy.

Music & Creative

Intelligent retrieval for rights management, stem analysis, and creative metadata. Designed for the complexities of modern media production.

Business Solutions

Streamline customer support and internal knowledge management with a personalised RAG layer.

Financial Analysis

High-precision retrieval for market sentiment, regulatory filings, and risk modelling workflows.

Visualise your memory as a graph

Gain deep insights into AI performance. Look inside the brain of your agent to see where it went off track and how it connects complex information.

DEBUG

Traceability

See exactly which memory clusters were activated for any given response.

INSIGHT

Performance Audit

Identify weak connections and refine your agent's knowledge base in real-time.

BRAIN

Neural Mapping

Explore the semantic relationships between disparate data points in your organisation.

The Core Comparison

Understanding the fundamental difference between fragmented retrieval and intelligent memory mapping.

THE PROBLEM

"Dumb" Text Retrieval

Most AI apps use standard vector databases. They treat human conversation like a scattered pile of sticky notes.

If a user asks, "What did we discuss about Python?", a standard system just searches for matching keywords and returns fragmented, isolated quotes:

"I like Python."

"Python has pandas."

"FastAPI is great."

Your AI gets the words, but completely loses the context of why they were said.

THE SOLUTION

Relational Memory Mapping

We built a memory engine that mimics human cognition. Instead of scattering sentences, our system automatically groups related conversations into structured knowledge concepts.

When a user asks about Python, our API doesn't just return a quote; it returns the complete, organised context.
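
The difference can be sketched with a toy example. The snippet below contrasts a plain keyword match with a lookup over messages that have already been grouped by topic. The topic labels are assigned by hand and the function names are hypothetical. How the actual engine infers groupings is not described here, so this only illustrates the idea.

```python
from collections import defaultdict

# Toy conversation log. Topic labels are hand-assigned for illustration;
# a real engine would have to infer them.
messages = [
    {"topic": "python", "text": "I like Python."},
    {"topic": "python", "text": "Python has pandas."},
    {"topic": "python", "text": "FastAPI is great."},
    {"topic": "frontend", "text": "React hooks are confusing."},
]


def keyword_search(query, log):
    """Return only the messages whose text contains the query word."""
    q = query.lower()
    return [m["text"] for m in log if q in m["text"].lower()]


def grouped_context(topic, log):
    """Return every message filed under the given topic, in order."""
    groups = defaultdict(list)
    for m in log:
        groups[m["topic"]].append(m["text"])
    return groups[topic]


print(keyword_search("python", messages))
print(grouped_context("python", messages))
```

The keyword match misses "FastAPI is great." because the word "Python" never appears in it, while the grouped lookup returns it alongside the rest of the Python discussion.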

Why Developers Choose Us

Self-Organising Context

No manual tagging or complex prompt engineering required. Our memory engine actively analyses the conversational flow in the background.

Topic Evolution & Branching

Human conversations aren't linear; they branch out. When a user switches from talking about "API Security" to "Frontend Frameworks," our system maps that relationship.

Continuous Learning

Our intelligent semantic engine prevents memory bloat. If a user brings up a topic again weeks later, our system knows to gracefully update and merge the existing knowledge.

Full-Picture Retrieval

Never lose the narrative thread.
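
A minimal sketch of branching and merging, assuming topics are already labelled: each topic is a node, a topic switch records a link between nodes, and a repeated note is merged into the existing node instead of being stored twice. The class and method names are hypothetical and the logic is deliberately simplified. It is not the production engine.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class TopicNode:
    name: str
    notes: List[str] = field(default_factory=list)
    branches: Set[str] = field(default_factory=set)  # topics the user moved on to


class TopicGraph:
    def __init__(self) -> None:
        self.nodes: Dict[str, TopicNode] = {}
        self.current: Optional[str] = None

    def add(self, topic: str, note: str) -> None:
        # Reuse the existing node if the topic was seen before.
        node = self.nodes.setdefault(topic, TopicNode(topic))
        # Merge: store the note only if it is not already known.
        if note not in node.notes:
            node.notes.append(note)
        # Branch: record the switch from the previous topic to this one.
        if self.current is not None and self.current != topic:
            self.nodes[self.current].branches.add(topic)
        self.current = topic


graph = TopicGraph()
graph.add("API Security", "Rotate keys every 90 days.")
graph.add("Frontend Frameworks", "Evaluating React.")
graph.add("API Security", "Rotate keys every 90 days.")  # revisited later: merged, not duplicated
graph.add("API Security", "Add rate limiting.")

print(graph.nodes["API Security"])
```

Exact string comparison stands in for the merge step here. A semantic engine would presumably compare meaning rather than text.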

Technical Architecture

A schematic overview of our retrieval-augmented generation pipeline.

Zero-Knowledge Entry

End-to-end encryption at the edge. Your memory is your own, mathematically locked before it ever hits the cloud.

Local Retrieval

High-speed vector search within your private infrastructure, ensuring zero data leakage.

Context Synthesis

Blending retrieved facts with agent personality for a grounded, personalised response.

Start Building

Request Early Access

Universal Memory is currently in private beta. Join the waitlist to help shape the future of integrated AI memory.

We'll keep you updated on development progress and beta availability.

We build AI that remembers you, but we built a database that can't read you.

Most AI companies hoard your chat logs in plain text. We think that's a massive security risk. Instead, we use a Zero-Logging, Encrypted-at-Rest architecture. Your data is processed in milliseconds, encrypted instantly, and then digitally shredded from our active memory.

STEP 01

The AI Gets the Meaning, Not the Words

When you send a message, our system immediately converts your English sentences into a mathematical map (called a Vector Embedding). To the AI, it looks like [0.015, -0.892, 0.441]. This math represents the meaning of your message rather than its exact wording, which makes the original text difficult to reconstruct from the numbers alone.
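
To show how vectors like this are compared, here is a small sketch using cosine similarity, a common measure for embeddings. The three-dimensional vectors are hand-written toy values and the memory names are hypothetical. Real embeddings come from a model and typically have hundreds of dimensions. The text above does not say which similarity measure is used in production.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hand-written toy vectors, not output from a real embedding model.
stored = {
    "python_backend": [0.015, -0.892, 0.441],
    "holiday_plans": [0.730, 0.120, -0.655],
}
query = [0.020, -0.850, 0.500]

# Pick the stored memory whose vector points in the most similar direction.
best = max(stored, key=lambda name: cosine_similarity(query, stored[name]))
print(best)
```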

STEP 02

Zero-RAM Logging

Your actual words are held in our server's temporary memory (RAM) for just a fraction of a second, only long enough to route them to the AI brain. The moment the AI replies, our system's Garbage Collector physically wipes the temporary memory. We don't write your prompts to our logs. If it's not in the logs, it can't be leaked.
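
One common way to keep prompts out of logs is to record only metadata about each request. The sketch below does that with Python's standard `logging` module. The logger name, function name, and log fields are hypothetical choices for the example and do not describe the actual server code.

```python
import io
import logging

# Capture log output in memory so the example is self-contained.
stream = io.StringIO()
logger = logging.getLogger("memory_gateway")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.addHandler(logging.StreamHandler(stream))
logger.propagate = False


def handle_prompt(request_id: str, prompt: str) -> None:
    """Log request metadata only; the prompt text is never passed to the logger."""
    logger.info("request_id=%s prompt_chars=%d", request_id, len(prompt))


handle_prompt("req-001", "My salary is 90k")
log_output = stream.getvalue()
print(log_output)
```

The log line carries a request ID and a character count, which is enough for debugging traffic without exposing what was said.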

STEP 03

AES-256 Encryption

Before your conversation is saved to your long-term memory profile, it is locked using AES-256 Application-Level Encryption. This means by the time your data hits our database, it is completely scrambled.
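
A minimal sketch of application-level encryption with a 256-bit AES key, using the third-party `cryptography` package. The GCM mode, the helper names, and the use of a user ID as associated data are assumptions for the example. The text above specifies only AES-256, not the mode or the key-management scheme.

```python
import os

from cryptography.hazmat.primitives.ciphers.aead import AESGCM

NONCE_SIZE = 12  # bytes; the standard nonce length for AES-GCM


def encrypt_memory(key: bytes, plaintext: str, user_id: str) -> bytes:
    """Encrypt one memory; the user ID is authenticated but not encrypted."""
    nonce = os.urandom(NONCE_SIZE)  # must be unique for every encryption
    ciphertext = AESGCM(key).encrypt(
        nonce, plaintext.encode("utf-8"), user_id.encode("utf-8")
    )
    return nonce + ciphertext  # store the nonce alongside the ciphertext


def decrypt_memory(key: bytes, blob: bytes, user_id: str) -> str:
    """Reverse encrypt_memory; raises if the key, blob, or user ID is wrong."""
    nonce, ciphertext = blob[:NONCE_SIZE], blob[NONCE_SIZE:]
    plaintext = AESGCM(key).decrypt(nonce, ciphertext, user_id.encode("utf-8"))
    return plaintext.decode("utf-8")


key = AESGCM.generate_key(bit_length=256)  # 32-byte key
blob = encrypt_memory(key, "I prefer FastAPI over Flask.", "user-123")
print(blob.hex())
print(decrypt_memory(key, blob, "user-123"))
```

Only `blob` would be written to the database. Without the key, a reader of the table sees opaque bytes.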

Frequently Asked Questions

Everything you need to know about our universal memory infrastructure.

Is my data truly private?

Yes. Our local version of the memory service is designed for privacy-first deployment. Your data never leaves your infrastructure, ensuring complete sovereignty and security.

How does it differ from standard RAG?

Standard RAG is often stateless. Our universal memory provides a persistent context layer that lets you visualise your AI agent's memory and debug performance issues by pinpointing exactly where the AI went off track.

Can I integrate this with existing LLMs?

Absolutely. We provide a clean API that works with all major LLM providers and local models, acting as a sophisticated middleware for context management.

What are the performance overheads?

Minimal. Our vector search and retrieval pipeline are optimised for high-speed, low-latency performance, even with massive organisational datasets.

Can your database administrators read my chat history?

No. Even if our lead engineer queries the database directly, all they will see are mathematical arrays and AES-256 ciphertext (gibberish).

What happens if your database gets hacked?

If a bad actor steals our entire database, they get nothing but encrypted blobs and numbers.

Does the AI model provider see my data?

To generate a response, the temporary plaintext is securely transmitted to our AI model provider via TLS encryption. However, they are legally bound by enterprise agreements not to use your data to train their models, and the data is discarded after the generation is complete. Our servers act as a blindfolded middleman.

Transparent, Scale-Ready Pricing

Choose the tier that fits your growth. From individual developers to enterprise-grade security.

Self Hosted

Free forever

DIY option for developers who want to manage their own infrastructure.

Unlimited memories (SQLite only)
No API limits

Basic Features

Simple semantic search (BGE)
Basic graph memory
AES-128 encryption
Optional local Llama 8B
Manual scaling required

Trade-offs

You manage DevOps & uptime
Community support only

Get Started

View Documentation →

Developer

$0/month

Cloud-hosted for testing and prototyping. Limited features.

1,000 Memories
500 API calls / month

Cloud Advantages

Fully managed infrastructure
PostgreSQL + pgvector
Basic Gemini AI integration
Google OAuth multi-tenant

Core Features

Standard memory algorithms
AES-256 encryption
Basic auto-merge

Launch

$15/month

Production-ready with advanced AI and full feature access.

100,000 Memories
10,000 API calls / month

Everything in Developer, plus:

Advanced memory algorithms
Full Gemini AI capabilities
Autonomous auto-merge
Conversation clustering
Version history & rollback
Topic switch detection
Advanced semantic ranking
Custom retention policies
Priority API endpoints
Email support (24h response)

Scale

$99/month

High-performance infrastructure for production SaaS applications.

500,000 Memories
50,000 API calls / month

Everything in Launch, plus:

Dedicated async workers
Advanced analytics dashboard
Custom embedding models
Batch import/export API
Horizontal auto-scaling
99.9% uptime SLA
Priority support (4h response)

Enterprise

Custom

White-glove service with absolute security and compliance guarantees.

Unlimited / Custom volume

Everything in Scale, plus:

Bring Your Own Key (BYOK)
Dedicated PostgreSQL instance
HIPAA / SOC2 compliance
On-premise deployment option
Custom AI model fine-tuning
Advanced role-based access
99.99% uptime SLA
24/7 phone support + CSM