5-Minute Quickstart
Our Memory API is fully asynchronous. When you send a message, we return a task_id instantly, so your web server never blocks while the AI organises your contextual memory graph.
Here is how to add structured, long-term memory to your AI application in just a few lines of code.
Step 1: Authentication
Get your secret Maple Memory API Key from your Developer Dashboard. You will use this key to authenticate your requests.
import requests
API_URL = "https://api.yourdomain.com"
API_KEY = "heymaple_live_your_secret_heymaple_api_key_here"
HEADERS = {
    "Authorization": "Bearer " + API_KEY,
    "Content-Type": "application/json"
}
const API_URL = 'https://api.yourdomain.com';
const API_KEY = 'heymaple_live_your_secret_heymaple_api_key_here';
const HEADERS = {
  'Authorization': 'Bearer ' + API_KEY,
  'Content-Type': 'application/json'
};
Step 2: Send a User Message
Send the user's chat message to the /chat endpoint. We instantly return a task_id while our background workers generate the vector embeddings and update the memory relationships.
# 1. Send the chat message
chat_payload = {
    "content": "I'm starting a new project to build a Docker API for tabular queries."
}
response = requests.post(API_URL + "/chat", json=chat_payload, headers=HEADERS)
task_data = response.json()
print("Task Started: " + task_data['task_id'])
// 1. Send the chat message
const chatPayload = {
  content: "I'm starting a new project to build a Docker API for tabular queries."
};

const response = await fetch(API_URL + '/chat', {
  method: 'POST',
  headers: HEADERS,
  body: JSON.stringify(chatPayload)
});
const taskData = await response.json();
console.log('Task Started: ' + taskData.task_id);
Step 3: Retrieve the Structured Memory
Because we process memories intelligently in the background (mapping relationships and preventing duplicates), you simply poll the task endpoint to get the updated memory node. Note: Background tasks usually complete in under 2 seconds.
import time

task_id = task_data['task_id']
status = "pending"
attempts = 0

# 2. Poll the task status (with a 10-second timeout safety)
while status in ["pending", "processing"] and attempts < 10:
    time.sleep(1)
    task_response = requests.get(API_URL + "/task/" + task_id, headers=HEADERS)
    result_data = task_response.json()
    status = result_data["status"]
    attempts += 1

# 3. View your structured memory!
if status == "completed":
    memory_node = result_data["result"]["memory"]
    print("Action Taken: " + result_data["result"]["action"])
    print("Theme: " + memory_node['theme'])
    print("Summary: " + memory_node['summary'])
    print("Entities Found: " + str(memory_node['entities']))
elif status == "failed":
    print("Task failed:", result_data.get("error"))
else:
    print("Task timed out.")
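If you poll for tasks in more than one place, the loop above can be factored into a small reusable helper. This is an illustrative sketch, not part of any client library: the wait_for_task name and the injectable get_status parameter are our own.

```python
import time

def wait_for_task(task_id, get_status, timeout_seconds=10, interval=1.0):
    """Poll get_status(task_id) until the task leaves the pending/processing
    states or the timeout elapses. Returns the final task payload."""
    deadline = time.monotonic() + timeout_seconds
    result = get_status(task_id)
    while result["status"] in ("pending", "processing"):
        if time.monotonic() >= deadline:
            # Mirror the "Task timed out." branch of the inline example
            return {"status": "timed_out"}
        time.sleep(interval)
        result = get_status(task_id)
    return result
```

With the requests-based setup above you would pass something like get_status=lambda tid: requests.get(API_URL + "/task/" + tid, headers=HEADERS).json().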
Step 4: Search Your Memories
Use semantic search to retrieve the most relevant contextual memories when your user asks a question.
search_payload = {
    "query": "What projects am I working on?",
    "limit": 5
}

response = requests.post(API_URL + "/memories/search", json=search_payload, headers=HEADERS)
results = response.json()

for memory in results["memories"]:
    print("[" + str(memory['similarity']) + "] " + memory['theme'] + ": " + memory['summary'])
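A common next step is to fold the retrieved memories into the system prompt of your LLM call. Here is a minimal sketch; the build_context helper is our own, and it only assumes each result carries the theme and summary fields shown above:

```python
def build_context(memories, max_items=5):
    """Render search results as a bulleted context block for an LLM prompt."""
    lines = ["Relevant memories about this user:"]
    for memory in memories[:max_items]:
        lines.append("- " + memory["theme"] + ": " + memory["summary"])
    return "\n".join(lines)
```

You would then prepend the returned string to your model's system prompt before answering the user's question.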
For queries with highly specific technical terms, custom IDs, or jargon, use Hybrid Search. Setting the mode to "hybrid" tells our engine to balance exact keyword extraction against semantic intent automatically, with no manual tuning required.
hybrid_payload = {
    "query": "Docker API deployment",
    "limit": 5,
    "mode": "hybrid"
}
const hybridPayload = {
  query: "Docker API deployment",
  limit: 5,
  mode: "hybrid"
};
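If your application switches between semantic and hybrid search at runtime, you can build the payload conditionally. A small sketch (the make_search_payload helper is hypothetical; it only emits the fields shown in the examples above):

```python
def make_search_payload(query, limit=5, mode=None):
    """Build the /memories/search request body; omit 'mode' to use the
    default semantic search."""
    payload = {"query": query, "limit": limit}
    if mode is not None:
        payload["mode"] = mode
    return payload
```

You would send it the same way as before, e.g. requests.post(API_URL + "/memories/search", json=make_search_payload("Docker API deployment", mode="hybrid"), headers=HEADERS).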
Batch Import
Import historical conversation logs in a single API call. Our system will automatically chunk, cluster, and map them into a clean relational graph.
batch_payload = {
    "conversations": [
        {
            "input": "What's the best way to containerize a Python app?",
            "output": "Docker is great for Python apps."
        },
        {
            "input": "I prefer FastAPI over Flask",
            "output": "FastAPI is excellent!"
        }
    ]
}
const batchPayload = {
  conversations: [
    { input: "What's the best way to containerize a Python app?", output: "Docker is great!" },
    { input: "I prefer FastAPI over Flask", output: "FastAPI is excellent!" }
  ]
};
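If your chat history lives in another store, you can map it into the payload shape above with a few lines. A sketch under our own naming (the to_batch_payload helper is hypothetical; the payload shape matches the example, but check your API reference for the exact import endpoint to post it to):

```python
def to_batch_payload(pairs):
    """Convert (user_input, assistant_output) pairs into the batch-import body."""
    return {
        "conversations": [
            {"input": user_input, "output": assistant_output}
            for user_input, assistant_output in pairs
        ]
    }
```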
Memory Graph (Linked Memories)
Explore related memories as a connected graph to trace how a user's topics shift over time:
# Get a memory's graph (all connected memories)
memory_id = 42
response = requests.get(API_URL + "/memories/" + str(memory_id) + "/graph", headers=HEADERS)
graph = response.json()
print("Main Memory: " + graph['memory']['theme'])
print("Linked Memories: " + str(len(graph['linked_memories'])))
// Get a memory's graph (all connected memories)
const memoryId = 42;
const response = await fetch(API_URL + '/memories/' + memoryId + '/graph', { headers: HEADERS });
const graph = await response.json();
console.log('Main Memory: ' + graph.memory.theme);
console.log('Linked Memories: ' + graph.linked_memories.length);
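Once you have the graph response, you can scan the linked memories, for example to surface the themes most closely related to the main node. A sketch over the response shape shown above (the linked_themes helper is our own; it assumes each linked memory carries the same theme field as the main node):

```python
def linked_themes(graph):
    """Collect the main memory's theme followed by the themes of its links."""
    themes = [graph["memory"]["theme"]]
    for linked in graph["linked_memories"]:
        themes.append(linked["theme"])
    return themes
```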
Rate Limits & Monitoring
Default Limit: 60 requests/minute per user.
Monitoring: Check your API usage, error rates, and response times anytime by hitting the /metrics endpoint or visiting your dashboard.
If you exceed the rate limit, the API returns an error body such as:
{"detail": "Rate limit exceeded. Try again in 30 seconds."}
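To stay within the 60 requests/minute limit, you can wrap calls in a simple retry with exponential backoff whenever a rate-limit error comes back. An illustrative sketch, not part of any client library: the with_backoff helper and the RateLimited exception are our own, and your code would raise RateLimited when it sees the error body above.

```python
import time

class RateLimited(Exception):
    """Raised by the caller when the API answers with a rate-limit error."""

def with_backoff(call, retries=3, base_delay=1.0):
    """Run call(); on RateLimited, wait 1s, 2s, 4s, ... then retry.
    Re-raises after the final attempt fails."""
    for attempt in range(retries + 1):
        try:
            return call()
        except RateLimited:
            if attempt == retries:
                raise
            time.sleep(base_delay * (2 ** attempt))
```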