API Reference

Build with Persistent Memory

Everything you need to add long-term memory to your AI app — with copy-paste examples for every endpoint.

Base URL: https://memory-layer-api.onrender.com

What is Memory Layer?

Memory Layer is a persistent memory API for AI applications. It stores what your users say, finds relevant context automatically, and injects it into your LLM prompts — so your AI always knows who it's talking to.

One API key. Two calls. Your AI now has memory.

Store

Save facts, preferences, and context for any user.

Chat

Memory is auto-retrieved and injected into every LLM call.

Isolate

Millions of end-users — each with their own private memory.

Works with any LLM

Memory Layer is model-agnostic. Use it alongside OpenAI, Anthropic, Google Gemini, Mistral, or any other LLM — we handle the memory layer, you handle the model.

Quick Start

Get your AI talking to memory in under 5 minutes.

Important: Always use the end user's email as external_user_id

Every API call must include external_user_id set to the logged-in user's email address (e.g. john_doe@gmail.com). This is what isolates each user's memories from one another. Never use generic placeholders like "user-123" — they break per-user isolation and may expose one user's memories to another.

1. Get your API key

Log in at memorylayer.tech/dashboard, go to API Keys, and click Create Key. Copy the key — it's only shown once.

2. Store your first memory

bash
curl -X POST https://memory-layer-api.onrender.com/v1/memory/ \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"content": "User is building a SaaS dashboard in React", "external_user_id": "john_doe@gmail.com"}'

3. Chat with memory context

bash
curl -X POST https://memory-layer-api.onrender.com/v1/chat/ \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"message": "What am I building?", "external_user_id": "john_doe@gmail.com"}'

4. See the response

json
{
  "reply": "You are building a SaaS dashboard in React.",
  "used_memories": [
    { "memory_text": "User is building a SaaS dashboard in React", "score": 0.91 }
  ],
  "memory_created": true
}

Authentication

All API requests use a Bearer token in the Authorization header. Your API key looks like mlive_xxxxx_...

http
Authorization: Bearer mlive_abc123_XyzAbCdEfGhIjKlMnOpQrStUvWxYz

How identity works

Every API key is linked to the user who created it. That user is automatically the owner of every memory the key writes — you never need to send your own user ID. To isolate memories per end-user inside your app, send external_user_id. Those are the only two identifiers that ever matter: your account (from the key) and your end-user (from the body).

Keep your key secret

Never expose your API key in client-side browser code. Always call the Memory Layer API from your backend server. Store your key in an environment variable like MEMORY_LAYER_API_KEY.
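As a minimal sketch of the advice above, a backend could read the key from the environment once and build the request headers from it. The helper name `auth_headers` is hypothetical and not part of any official client; only the `MEMORY_LAYER_API_KEY` variable name and the Bearer header format come from this page.

```python
import os


def auth_headers(env=os.environ):
    """Build Memory Layer request headers from MEMORY_LAYER_API_KEY.

    `env` is injectable so the function can be exercised without touching
    the real process environment (an illustrative design choice).
    """
    api_key = env.get("MEMORY_LAYER_API_KEY", "").strip()
    if not api_key:
        # Fail fast at startup rather than sending unauthenticated requests
        raise RuntimeError("MEMORY_LAYER_API_KEY is not set")
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
```

Failing at startup when the variable is missing tends to be easier to debug than a stream of 401 responses later.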

How to get an API key (step by step)

  1. Sign in at memorylayer.tech
  2. Go to Dashboard → API Keys in the sidebar
  3. Click "Create New Key" and give it a name (e.g. "Production")
  4. Copy the full key immediately; it is only displayed once
  5. Add it to your server environment: MEMORY_LAYER_API_KEY=mlive_...

Memory API

Store, search, retrieve, and delete memories. All endpoints require an API key.

POST /v1/memory/

Store a new memory for a user. Generates a vector embedding automatically and persists it to the database.

Request Body

| Field | Type | Required / Default | Description |
| --- | --- | --- | --- |
| content | string | required | The text to remember. Can be any fact, preference, or context. |
| external_user_id | string | required | ⚠️ REQUIRED: Your end-user's email address (e.g. john_doe@gmail.com). Must be unique per end-user. Memories are stored in complete isolation per ID. Never use a generic placeholder like "user-123". |
| metadata | object | default: {} | Optional key-value metadata (source, session_id, tags, etc.). |

Response Fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| status | string | optional | "success" on successful store. |
| memory.id | string (UUID) | optional | Unique ID of the stored memory. Save this if you want to delete it later. |
| memory.importance_score | float | optional | Auto-calculated importance score (0–1). Increases as the memory is accessed. |
| memory.created_at | ISO datetime | optional | Timestamp when the memory was stored. |

bash
curl -X POST https://memory-layer-api.onrender.com/v1/memory/ \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "content": "User prefers dark mode and uses VS Code",
    "external_user_id": "john_doe@gmail.com",
    "metadata": { "source": "onboarding" }
  }'

POST /v1/memory/search

Search memories using semantic similarity. Returns the most relevant memories for a given query, ranked by cosine similarity.

Request Body

| Field | Type | Required / Default | Description |
| --- | --- | --- | --- |
| query | string | required | Natural language query to search for. Uses semantic similarity, not keyword matching. |
| external_user_id | string | required | ⚠️ REQUIRED: End user's email address (e.g. john_doe@gmail.com). Scopes the search to only that user's memories. Without this, all your users' memories are searched together. |
| limit | integer | default: 10 | Max number of results to return (1–100). |
| similarity_threshold | float | default: 0.35 | Minimum similarity score to include a result (0–1). Lower = more results. |

Response Fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| results | array | optional | Array of matching memories, sorted by relevance. |
| results[].similarity | float | optional | Cosine similarity score (0–1). 1.0 = exact match. |
| results[].memory.content | string | optional | The memory text. |
| results[].combined_score | float | optional | Weighted score: 70% similarity + 30% importance. |
| total_count | integer | optional | Number of results returned. |
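
To make the 70/30 weighting of combined_score concrete, here is a small worked example. It assumes the score is a plain weighted sum of the two values; any rounding or normalization the server applies is not documented here, and `combined_score` is a hypothetical local helper, not an API call.

```python
def combined_score(similarity: float, importance: float) -> float:
    """Weighted score as described above: 70% similarity + 30% importance."""
    return 0.7 * similarity + 0.3 * importance


# A close match with middling importance:
# 0.7 * 0.9 + 0.3 * 0.5 = 0.63 + 0.15 = 0.78
score = combined_score(similarity=0.9, importance=0.5)
```

Under this reading, a frequently accessed memory (high importance) can outrank a slightly more similar but rarely used one.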

Also available as GET

You can also search with a GET request using query params: GET /v1/memory/search?query=...&external_user_id=...&limit=5
bash
curl -X POST https://memory-layer-api.onrender.com/v1/memory/search \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What editor does the user prefer?",
    "external_user_id": "john_doe@gmail.com",
    "limit": 5,
    "similarity_threshold": 0.3
  }'
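
For the GET variant of search, the values travel as query parameters and need URL-encoding, since an email contains `@` and a query usually contains spaces. The sketch below only builds the URL; `search_url` is a hypothetical helper name, and the parameter names are the ones listed in the request table.

```python
from urllib.parse import urlencode

BASE_URL = "https://memory-layer-api.onrender.com"


def search_url(query: str, external_user_id: str, limit: int = 10) -> str:
    """Build the GET /v1/memory/search URL with properly encoded parameters."""
    params = {
        "query": query,
        "external_user_id": external_user_id,
        "limit": limit,
    }
    # urlencode escapes spaces, '?', '@' and other reserved characters
    return f"{BASE_URL}/v1/memory/search?{urlencode(params)}"
```

The resulting URL can be passed to any HTTP client together with the usual Authorization header.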

GET /v1/memory/

List all memories owned by your API key, ordered by creation date (newest first). Supports pagination.

Query Parameters

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| limit | integer | default: 50 | Number of memories to return (1–200). |
| offset | integer | default: 0 | Number of memories to skip (for pagination). |

bash
curl "https://memory-layer-api.onrender.com/v1/memory/?limit=20&offset=0" \
  -H "Authorization: Bearer YOUR_API_KEY"
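
Walking the whole list with limit and offset might look like the sketch below. The shape of the list response is not shown on this page, so the HTTP call is left to a caller-supplied `fetch_page(limit, offset)` function that is assumed to return one page as a Python list; both `iter_memories` and `fetch_page` are hypothetical names.

```python
def iter_memories(fetch_page, limit=50):
    """Yield every memory by advancing offset until a short page appears.

    fetch_page(limit=..., offset=...) is assumed to return a list holding
    at most `limit` items for the requested window.
    """
    offset = 0
    while True:
        page = fetch_page(limit=limit, offset=offset)
        yield from page
        if len(page) < limit:
            # A short (or empty) page means there is nothing left to fetch
            break
        offset += limit
```

Stopping on the first short page avoids needing a total count, at the cost of one extra request when the total is an exact multiple of limit.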

GET /v1/memory/{memory_id}

Retrieve a single memory by its ID.

Path Parameters

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| memory_id | string (UUID) | required | The ID of the memory to retrieve. |

bash
curl "https://memory-layer-api.onrender.com/v1/memory/e32754fa-9b24-4c3e-b8f1-2a1c9d3e4f56" \
  -H "Authorization: Bearer YOUR_API_KEY"

DELETE /v1/memory/{memory_id}

Permanently delete a memory by ID. This action cannot be undone.

Path Parameters

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| memory_id | string (UUID) | required | The ID of the memory to delete. Get this from the store or list endpoints. |

bash
curl -X DELETE \
  "https://memory-layer-api.onrender.com/v1/memory/e32754fa-9b24-4c3e-b8f1-2a1c9d3e4f56" \
  -H "Authorization: Bearer YOUR_API_KEY"

GET /v1/memory/stats

Get aggregate statistics for memories owned by your API key.

bash
curl "https://memory-layer-api.onrender.com/v1/memory/stats" \
  -H "Authorization: Bearer YOUR_API_KEY"

Chat API

Send a message and get an LLM response with memory automatically retrieved and injected as context. No need to manage context windows manually.

POST /v1/chat/

Primary chat endpoint. Automatically: embeds your message → finds relevant memories → injects them into the LLM prompt → returns the response.

What happens under the hood:

  1. Embed: Your message is converted to a vector embedding (~400ms)
  2. Search: Top-K relevant memories are retrieved from the vector DB (~100ms)
  3. Augment: Memories are injected into the system prompt as context
  4. LLM call: Gemini generates a personalized response (~800ms)
  5. Store: Your message is saved as a new memory (fire-and-forget)

Request Body

| Field | Type | Required / Default | Description |
| --- | --- | --- | --- |
| message | string | required | The user's message (max 8,000 characters). |
| external_user_id | string | required | ⚠️ REQUIRED: End user's email address (e.g. john_doe@gmail.com). Memories are stored and retrieved scoped to this ID. This is what makes the AI remember the right person. |
| top_k | integer | default: 5 | How many memories to inject (1–20). |
| conversation_id | string | optional | Optional ID to group messages into a conversation thread. |

Response Fields

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| reply | string | optional | The LLM's response, personalized with memory context. |
| used_memories | array | optional | List of memories that were injected into the prompt. |
| used_memories[].memory_text | string | optional | The content of the memory that was used. |
| used_memories[].score | float | optional | Similarity score (0–1) for this memory. |
| memory_created | boolean | optional | Whether the user's message was saved as a new memory. |

bash
curl -X POST https://memory-layer-api.onrender.com/v1/chat/ \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "What tools do I prefer for coding?",
    "external_user_id": "john_doe@gmail.com",
    "top_k": 5
  }'

POST /v1/chat/legacy

Backward-compatible chat endpoint. Returns reply, memory_ids, and memories_used.

Request Body

| Field | Type | Required / Default | Description |
| --- | --- | --- | --- |
| message | string | required | The user's message. |
| external_user_id | string | optional | Your end-user's ID for per-user memory isolation. |
| top_k | integer | default: 5 | Number of memories to retrieve. |

bash
curl -X POST https://memory-layer-api.onrender.com/v1/chat/legacy \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"message": "What am I building?", "external_user_id": "john_doe@gmail.com"}'
json
{
  "reply": "You are building a SaaS dashboard in React.",
  "memory_ids": ["e32754fa-..."],
  "mode": "process",
  "memories_used": [
    { "id": "e32754fa-...", "content": "User is building a React dashboard", "similarity": 0.91 }
  ]
}

Per-User Isolation

If you're building a SaaS product with many end-users, you need each end-user's memories to be completely private from one another. Memory Layer handles this with a single field: external_user_id.

❌ Without external_user_id

All memories under your API key are pooled together. A search for one end-user could return another end-user's memories.

✅ With external_user_id

Each end-user's memories are stored and searched in complete isolation. One API key safely serves millions of end-users.

How it works

You use one API key for your entire app; it identifies your account. Every call passes external_user_id set to a unique identifier for the end-user (this guide uses the end user's email address). Memory Layer stores and retrieves memories scoped to that ID, so one end-user's memories never appear in another's results.

javascript
// Store memory for end-user "alice" in your SaaS app
await fetch('https://memory-layer-api.onrender.com/v1/memory/', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',  // One key for your whole app
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    content: 'Alice is on the Pro plan and works in fintech',
    external_user_id: 'alice@company.com',  // ✅ Use end user's email address
  }),
});

// Store memory for end-user "bob"
await fetch('https://memory-layer-api.onrender.com/v1/memory/', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',
    'Content-Type': 'application/json',  // Required for a JSON body
  },
  body: JSON.stringify({
    content: 'Bob is building a healthcare app with Python',
    external_user_id: 'bob@company.com',  // ✅ Different email — completely isolated
  }),
});

Always set external_user_id — use the end user's email address

Use your end user's actual email address (e.g. john_doe@gmail.com) — not a generic placeholder like "user-123". Without a proper external_user_id, all users' memories are pooled and any search may return memories belonging to a different user. Treat this field as mandatory in every store, search, and chat call.

API Key Management

You can manage API keys programmatically. These endpoints use your Supabase session JWT (obtained after logging in), not the API key itself.

POST /v1/keys/

Create a new API key. The full key value is only returned once — save it immediately.

Request Body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| name | string | required | A human-readable name for this key (e.g. "Production", "Testing"). |

bash
curl -X POST https://memory-layer-api.onrender.com/v1/keys/ \
  -H "Authorization: Bearer YOUR_SUPABASE_JWT" \
  -H "Content-Type: application/json" \
  -d '{ "name": "Production Key" }'

Save your key immediately

The full API key (api_key) is returned only once at creation. It cannot be retrieved again. If you lose it, revoke and create a new one.

GET /v1/keys/

List all API keys for your account. The key values are masked for security.

bash
curl https://memory-layer-api.onrender.com/v1/keys/ \
  -H "Authorization: Bearer YOUR_SUPABASE_JWT"

POST /v1/keys/{key_id}/revoke

Revoke an API key. Requests using this key will immediately start receiving 401 errors.

Path Parameters

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| key_id | string | required | The ID of the key to revoke (from the list endpoint). |

bash
curl -X POST https://memory-layer-api.onrender.com/v1/keys/key_abc123/revoke \
  -H "Authorization: Bearer YOUR_SUPABASE_JWT"

Integration Guides

Drop-in client libraries for your preferred language or framework.

🐍 Python

Works with Django, FastAPI, Flask, or any Python app. No extra dependencies beyond requests.

python
import requests

API_KEY = "YOUR_API_KEY"
BASE_URL = "https://memory-layer-api.onrender.com"

class MemoryLayerClient:
    def __init__(self, api_key: str, user_id: str):
        self.headers = {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        }
        self.user_id = user_id

    def remember(self, content: str) -> dict:
        """Store a memory for this user."""
        res = requests.post(
            f"{BASE_URL}/v1/memory/",
            headers=self.headers,
            json={"content": content, "external_user_id": self.user_id},
            timeout=30,  # requests has no default timeout; avoid hanging forever
        )
        res.raise_for_status()  # raise on 4xx/5xx instead of parsing an error body
        return res.json()

    def recall(self, query: str, limit: int = 5) -> list:
        """Search memories by semantic similarity."""
        res = requests.post(
            f"{BASE_URL}/v1/memory/search",
            headers=self.headers,
            json={
                "query": query,
                "external_user_id": self.user_id,
                "limit": limit,
                "similarity_threshold": 0.3,
            },
            timeout=30,
        )
        res.raise_for_status()
        return res.json().get("results", [])

    def chat(self, message: str) -> str:
        """Send a message with memory-augmented context."""
        res = requests.post(
            f"{BASE_URL}/v1/chat/",
            headers=self.headers,
            json={"message": message, "external_user_id": self.user_id},
            timeout=30,
        )
        res.raise_for_status()
        return res.json().get("reply", "")


# Usage
client = MemoryLayerClient(API_KEY, user_id="john_doe@gmail.com")
client.remember("I am building a React dashboard for a SaaS startup")
reply = client.chat("What am I building?")
print(reply)
# "You are building a React dashboard for a SaaS startup."

🟨 JavaScript / Node.js

Works in Node.js, Deno, Bun, or any environment with the native Fetch API. Zero dependencies.

javascript
// memory-layer.js: drop-in client for server-side JavaScript (Node.js, Deno, Bun). Do not ship it to the browser with your API key.
const BASE_URL = 'https://memory-layer-api.onrender.com';

export class MemoryLayerClient {
  constructor(apiKey, userId) {
    this.headers = {
      'Authorization': `Bearer ${apiKey}`,
      'Content-Type': 'application/json',
    };
    this.userId = userId;
  }

  async remember(content, metadata = {}) {
    const res = await fetch(`${BASE_URL}/v1/memory/`, {
      method: 'POST',
      headers: this.headers,
      body: JSON.stringify({
        content,
        external_user_id: this.userId,
        metadata,
      }),
    });
    return res.json();
  }

  async recall(query, limit = 5, threshold = 0.3) {
    const res = await fetch(`${BASE_URL}/v1/memory/search`, {
      method: 'POST',
      headers: this.headers,
      body: JSON.stringify({
        query,
        external_user_id: this.userId,
        limit,
        similarity_threshold: threshold,
      }),
    });
    return res.json();
  }

  async chat(message, topK = 5) {
    const res = await fetch(`${BASE_URL}/v1/chat/`, {
      method: 'POST',
      headers: this.headers,
      body: JSON.stringify({
        message,
        external_user_id: this.userId,
        top_k: topK,
      }),
    });
    return res.json();
  }
}

// Usage
const client = new MemoryLayerClient('YOUR_API_KEY', 'john_doe@gmail.com');
await client.remember('I prefer React over Vue for frontend work');
const { reply } = await client.chat('What frontend framework do I use?');
console.log(reply);
// "You prefer React for frontend work."

Next.js

Keep your API key secure on the server using Next.js API routes. Never expose it to the browser.

typescript
// app/api/chat/route.ts — Next.js App Router API route
import { NextRequest, NextResponse } from 'next/server';

const MEMORY_API = 'https://memory-layer-api.onrender.com';
const API_KEY = process.env.MEMORY_LAYER_API_KEY!; // Store in .env.local

export async function POST(req: NextRequest) {
  const { message, userId } = await req.json();

  const res = await fetch(`${MEMORY_API}/v1/chat/`, {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      message,
      external_user_id: userId,  // End user's email address
      top_k: 5,
    }),
  });

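  if (!res.ok) {
    // Pass upstream failures through instead of reading fields from an error body
    return NextResponse.json({ error: 'Memory Layer request failed' }, { status: res.status });
  }

  const data = await res.json();
  return NextResponse.json({
    reply: data.reply,
    memoriesUsed: data.used_memories?.length ?? 0,  // Guard against a missing array
  });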
}

// components/Chat.tsx — Frontend component
'use client';
import { useState } from 'react';

export function Chat({ userId }: { userId: string }) {
  const [messages, setMessages] = useState<string[]>([]);

  const send = async (message: string) => {
    const res = await fetch('/api/chat', {
      method: 'POST',
      body: JSON.stringify({ message, userId }),
    });
    const { reply } = await res.json();
    setMessages(prev => [...prev, message, reply]);
  };
  // ...
}

.env.local setup

Add MEMORY_LAYER_API_KEY=mlive_... to your .env.local file. Because the name has no NEXT_PUBLIC_ prefix, Next.js keeps it server-side and does not bundle it into browser code.

Advanced Features

Tools to keep your memory store clean, organized, and efficient at scale.

POST /v1/memory/summarize

Clusters related memories and generates summary memories using the LLM. Reduces noise and improves retrieval quality over time.

bash
curl -X POST "https://memory-layer-api.onrender.com/v1/memory/summarize" \
  -H "Authorization: Bearer YOUR_API_KEY"

POST /v1/memory/prune

Remove old or low-importance memories to keep your store lean. Use dry_run=true first to preview what would be pruned.

Query Parameters

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| max_age_days | integer | default: 90 | Archive memories older than this many days. |
| min_importance | float | default: 0.1 | Archive memories with importance score below this threshold. |
| max_memories | integer | default: 10000 | Maximum memories to keep. Prunes oldest when exceeded. |
| archive_only | boolean | default: true | Archive instead of hard-delete (recommended). |
| dry_run | boolean | default: false | Preview what would be pruned without actually doing it. |

bash
curl -X POST \
  "https://memory-layer-api.onrender.com/v1/memory/prune?max_age_days=90&min_importance=0.1&dry_run=true" \
  -H "Authorization: Bearer YOUR_API_KEY"
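
A preview-first workflow for pruning could be sketched as below. The helper `prune_url` is hypothetical; it only assembles the query string from the parameters in the table above and defaults dry_run to true so that a real prune has to be requested explicitly. Booleans are lowercased to match the `dry_run=true` form used in the curl example.

```python
from urllib.parse import urlencode

BASE_URL = "https://memory-layer-api.onrender.com"


def prune_url(dry_run: bool = True, **params) -> str:
    """Build the POST /v1/memory/prune URL, previewing by default."""
    query = {**params, "dry_run": dry_run}
    # Python renders booleans as "True"/"False"; the docs use lowercase
    encoded = {
        key: str(value).lower() if isinstance(value, bool) else value
        for key, value in query.items()
    }
    return f"{BASE_URL}/v1/memory/prune?{urlencode(encoded)}"
```

One way to use it is to POST to `prune_url(max_age_days=90)` first, review the preview, and only then repeat the call with `dry_run=False`.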

POST /v1/memory/maintenance/{job_type}

Run maintenance jobs on-demand. Available job types are listed below.

Job Types

| Job type | Type | Required | Description |
| --- | --- | --- | --- |
| pruning | string | optional | Remove old and low-importance memories. |
| summarization | string | optional | Cluster and summarize related memories. |
| link-optimization | string | optional | Rebuild similarity links between memories. |
| importance-recalculation | string | optional | Recalculate importance scores based on access counts. |
| all | string | optional | Run all maintenance jobs in sequence. |

bash
curl -X POST "https://memory-layer-api.onrender.com/v1/memory/maintenance/all" \
  -H "Authorization: Bearer YOUR_API_KEY"

Error Reference

Memory Layer uses standard HTTP status codes. Most errors include a machine-readable detail field; validation errors (422) return a details array instead.

| Code | Name | Meaning | What to do |
| --- | --- | --- | --- |
| 200 | OK | Request succeeded | Read the response body. |
| 201 | Created | Memory was stored | Read memory.id from the response. |
| 400 | Bad Request | Invalid input or missing required fields | Check your request body against the parameter table. |
| 401 | Unauthorized | Missing or invalid API key | Check your Authorization header. Ensure you have the "Bearer " prefix. |
| 404 | Not Found | Memory ID does not exist | Verify the memory_id. It may have been deleted. |
| 422 | Validation Error | Request body failed validation | Check the details array in the response for which field is wrong. |
| 429 | Rate Limited | Too many requests | Slow down, implement exponential backoff, or upgrade your plan. |
| 500 | Server Error | Internal error on our side | Retry with backoff. If it persists, contact support. |

Error response examples

json
// 401 — Invalid or missing API key
{
  "detail": "Invalid or missing API key"
}

// 422 — Validation error (missing required field)
{
  "status": "error",
  "message": "Validation error",
  "details": [
    {
      "type": "missing",
      "loc": ["body", "query"],
      "msg": "Field required"
    }
  ],
  "timestamp": "2026-03-14T08:00:00.000Z"
}

// 404 — Memory not found
{
  "detail": "Memory not found"
}

// 429 — Rate limit exceeded
{
  "detail": "Rate limit exceeded. Upgrade your plan or wait."
}

// 500 — Internal server error
{
  "detail": "Internal server error. Please try again."
}
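
Because the examples above use two body shapes (a `detail` string, or a `details` array for validation errors), client code may want a single helper that turns either into a readable message. This is a sketch based only on those examples; `error_message` is a hypothetical name, and other shapes the API might return are not covered.

```python
def error_message(body: dict) -> str:
    """Extract a human-readable message from an error response body."""
    if "detail" in body:
        # 401 / 404 / 429 / 500 examples: a single string
        return str(body["detail"])
    if "details" in body:
        # 422 example: one entry per failing field, located by "loc"
        parts = []
        for item in body["details"]:
            location = ".".join(str(part) for part in item.get("loc", []))
            parts.append(f"{location}: {item.get('msg', '')}")
        return "; ".join(parts)
    return str(body.get("message", "Unknown error"))
```

For the 422 example above this produces `body.query: Field required`, which points directly at the missing field.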

Rate Limits & Plan Limits

| Plan | Memories | Searches / mo | Tokens / mo |
| --- | --- | --- | --- |
| Free | 1,000 | 100 / day | 100K |
| Starter ($9/mo) | 100,000 | 1,000 | 1M |
| Pro ($49/mo) | 1,000,000 | 10,000 | 10M |
| Enterprise | Unlimited | Unlimited | Unlimited |

Exceeding limits

Requests that exceed your plan limits return a 429 Rate Limited response. Upgrade your plan to increase limits.

Best Practices

⚠️ Always set external_user_id (use email)

Pass the end user's email address as external_user_id in every store, search, and chat call — no exceptions. Using a real email prevents cross-user memory leakage and enables accurate per-user stats. Never use generic IDs like "user-123".

Call from your backend

Never expose your API key in browser JavaScript. Make your API calls from your server and proxy the response to your frontend.

Use lower similarity thresholds for chat

For conversational context, a threshold of 0.3–0.4 captures relevant memories without being too strict. Use 0.7+ for precision lookups.

Implement exponential backoff

If you receive a 429 or 500, wait 1s, then 2s, then 4s before retrying. This keeps your app resilient under load.
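
The 1s, 2s, 4s schedule above could be sketched as follows. `call_with_backoff` is a hypothetical helper, not part of the Memory Layer API; `send` is assumed to be any zero-argument callable that performs one HTTP request and returns a `(status_code, body)` pair, and `sleep` is injectable so the logic can be tested without waiting.

```python
import time

RETRYABLE = {429, 500}  # the two status codes this guide says to retry


def call_with_backoff(send, max_retries=3, base_delay=1.0, sleep=time.sleep):
    """Call send(), retrying on 429/500 with delays of 1s, 2s, 4s."""
    for attempt in range(max_retries + 1):
        status, body = send()
        # Return on success, on a non-retryable error, or once retries run out
        if status not in RETRYABLE or attempt == max_retries:
            return status, body
        sleep(base_delay * 2 ** attempt)  # 1s, then 2s, then 4s
```

Capping the number of retries keeps a persistent outage from blocking a request handler indefinitely; the last response is returned so the caller can decide what to show the user.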

Run maintenance periodically

For long-running apps, schedule /v1/memory/maintenance/all weekly to keep the memory store clean and retrieval fast.

Use top_k=3–5 for chat

Injecting too many memories increases token usage and can confuse the LLM. 3–5 relevant memories is the sweet spot for most use cases.