Memory Layer API Documentation

API Reference

Everything you need to add persistent memory to your AI app — with real examples for every endpoint.

Live base URL: https://memory-layer-api.onrender.com

What is Memory Layer?

Memory Layer is a persistent memory API for AI applications. It stores what your users say, finds relevant context automatically, and injects it into your LLM prompts — so your AI always knows who it's talking to.

One API key. Two calls. Your AI now has memory.

Store

Save facts, preferences, and context for any user.

Chat

Memory is auto-retrieved and injected into every LLM call.

Isolate

Millions of end-users — each with their own private memory.

Works with any LLM

Memory Layer is model-agnostic. Use it alongside OpenAI, Anthropic, Google Gemini, Mistral, or any other LLM — we handle the memory layer, you handle the model.
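In practice the model-agnostic pattern is: retrieve with /v1/memory/search, then prepend the results to whatever system prompt your LLM provider expects. A minimal sketch in Python (the result shape follows the search endpoint's response fields documented below; the prompt wording is illustrative, not part of the API):

```python
def build_system_prompt(results: list) -> str:
    """Format retrieved memories as a system-prompt preamble for any LLM."""
    if not results:
        return "You are a helpful assistant."
    lines = [
        f"- {r['memory']['content']} (similarity {r['similarity']:.2f})"
        for r in results
    ]
    return "You are a helpful assistant. Known facts about this user:\n" + "\n".join(lines)

# Pass the returned string as the system message to OpenAI, Anthropic,
# Gemini, Mistral, or any other chat API, alongside the user's message.
```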

Quick Start

Get your AI talking to memory in under 5 minutes.

1. Get your API key

Log in at memorylayer.tech/dashboard, go to API Keys, and click Create Key. Copy the key — it's only shown once.

2. Store your first memory

bash
curl -X POST https://memory-layer-api.onrender.com/v1/memory/ \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"content": "User is building a SaaS dashboard in React", "external_user_id": "user-123"}'

3. Chat with memory context

bash
curl -X POST https://memory-layer-api.onrender.com/v1/chat/ \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"message": "What am I building?", "external_user_id": "user-123"}'

4. See the response

json
{
  "reply": "You are building a SaaS dashboard in React.",
  "used_memories": [
    { "memory_text": "User is building a SaaS dashboard in React", "score": 0.91 }
  ],
  "memory_created": true
}

Authentication

All API requests use a Bearer token in the Authorization header. Your API key looks like mlive_xxxxx_...

http
Authorization: Bearer mlive_abc123_XyzAbCdEfGhIjKlMnOpQrStUvWxYz

Keep your key secret

Never expose your API key in client-side browser code. Always call the Memory Layer API from your backend server. Store your key in an environment variable like MEMORY_LAYER_API_KEY.

How to get an API key (step by step)

  1. Sign in at memorylayer.tech
  2. Go to Dashboard → API Keys in the sidebar
  3. Click "Create New Key" and give it a name (e.g. "Production")
  4. Copy the full key immediately — it is only displayed once
  5. Add it to your server environment: MEMORY_LAYER_API_KEY=mlive_...
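The last step can be enforced at process startup so a missing key fails immediately rather than on the first API call. A small sketch (the helper name is ours, not part of any SDK):

```python
import os

def require_api_key(env_var: str = "MEMORY_LAYER_API_KEY") -> str:
    """Return the API key from the environment, failing fast if it is missing."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"{env_var} is not set; create a key in the dashboard")
    return key
```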

Memory API

Store, search, retrieve, and delete memories. All endpoints require an API key.

POST /v1/memory/

Store a new memory for a user. Generates a vector embedding automatically and persists it to the database.

Request Body

content (string, required): The text to remember. Can be any fact, preference, or context.
external_user_id (string, optional): Your end-user's ID. Required for multi-tenant apps to keep memories isolated per user.
metadata (object, default: {}): Optional key-value metadata (source, session_id, tags, etc.).

Response Fields

status (string): "success" on successful store.
memory.id (string, UUID): Unique ID of the stored memory. Save this if you want to delete it later.
memory.importance_score (float): Auto-calculated importance score (0–1). Increases as the memory is accessed.
memory.created_at (ISO datetime): Timestamp when the memory was stored.
bash
curl -X POST https://memory-layer-api.onrender.com/v1/memory/ \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "content": "User prefers dark mode and uses VS Code",
    "external_user_id": "user-123",
    "metadata": { "source": "onboarding" }
  }'
POST /v1/memory/search

Search memories using semantic similarity. Returns the most relevant memories for a given query, ranked by cosine similarity.

Request Body

query (string, required): Natural language query to search for. Uses semantic similarity, not keyword matching.
external_user_id (string, optional): Filter results to a specific end-user. Highly recommended for multi-tenant apps.
limit (integer, default: 10): Max number of results to return (1–100).
similarity_threshold (float, default: 0.35): Minimum similarity score to include a result (0–1). Lower = more results.

Response Fields

results (array): Array of matching memories, sorted by relevance.
results[].similarity (float): Cosine similarity score (0–1). 1.0 = exact match.
results[].memory.content (string): The memory text.
results[].combined_score (float): Weighted score: 70% similarity + 30% importance.
total_count (integer): Number of results returned.
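The combined_score weighting is simple enough to reproduce locally, for example to re-rank cached results client-side. A one-function sketch of the stated formula:

```python
def combined_score(similarity: float, importance: float) -> float:
    """Weighted ranking score as documented: 70% similarity + 30% importance."""
    return 0.7 * similarity + 0.3 * importance

# A memory with similarity 0.9 and importance 0.5 ranks at 0.78 (to two decimals).
```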

Also available as GET

You can also search with a GET request using query params: GET /v1/memory/search?query=...&external_user_id=...&limit=5
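Assuming the GET form accepts the same parameter names as the POST body, the URL can be assembled safely with the standard library's query encoding; a sketch:

```python
from urllib.parse import urlencode

BASE_URL = "https://memory-layer-api.onrender.com"

def search_url(query: str, external_user_id: str, limit: int = 5) -> str:
    """Build the GET /v1/memory/search URL with percent-encoded query params."""
    params = urlencode({
        "query": query,
        "external_user_id": external_user_id,
        "limit": limit,
    })
    return f"{BASE_URL}/v1/memory/search?{params}"

# Fetch it with your HTTP client of choice, e.g.
# requests.get(search_url("preferred editor", "user-123"),
#              headers={"Authorization": f"Bearer {API_KEY}"})
```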
bash
curl -X POST https://memory-layer-api.onrender.com/v1/memory/search \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What editor does the user prefer?",
    "external_user_id": "user-123",
    "limit": 5,
    "similarity_threshold": 0.3
  }'
GET /v1/memory/

List all memories for your tenant, ordered by creation date (newest first). Supports pagination.

Query Parameters

limit (integer, default: 50): Number of memories to return (1–200).
offset (integer, default: 0): Number of memories to skip (for pagination).
bash
curl "https://memory-layer-api.onrender.com/v1/memory/?limit=20&offset=0" \
  -H "Authorization: Bearer YOUR_API_KEY"
GET /v1/memory/{memory_id}

Retrieve a single memory by its ID.

Path Parameters

memory_id (string, UUID, required): The ID of the memory to retrieve.
bash
curl "https://memory-layer-api.onrender.com/v1/memory/e32754fa-9b24-4c3e-b8f1-2a1c9d3e4f56" \
  -H "Authorization: Bearer YOUR_API_KEY"
DELETE /v1/memory/{memory_id}

Permanently delete a memory by ID. This action cannot be undone.

Path Parameters

memory_id (string, UUID, required): The ID of the memory to delete. Get this from the store or list endpoints.
bash
curl -X DELETE \
  "https://memory-layer-api.onrender.com/v1/memory/e32754fa-9b24-4c3e-b8f1-2a1c9d3e4f56" \
  -H "Authorization: Bearer YOUR_API_KEY"
GET /v1/memory/stats

Get aggregate statistics for all memories stored under your API key.

bash
curl "https://memory-layer-api.onrender.com/v1/memory/stats" \
  -H "Authorization: Bearer YOUR_API_KEY"

Chat API

Send a message and get an LLM response with memory automatically retrieved and injected as context. No need to manage context windows manually.

POST /v1/chat/

Primary chat endpoint. Automatically: embeds your message → finds relevant memories → injects them into the LLM prompt → returns the response.

What happens under the hood:

1. Embed: Your message is converted to a vector embedding (~400ms).
2. Search: Top-K relevant memories are retrieved from the vector DB (~100ms).
3. Augment: Memories are injected into the system prompt as context.
4. LLM call: Gemini generates a personalized response (~800ms).
5. Store: Your message is saved as a new memory (fire-and-forget).

Request Body

message (string, required): The user's message (max 8,000 characters).
external_user_id (string, optional): Your end-user's ID. Only that user's memories are searched and injected. Required for multi-tenant apps.
top_k (integer, default: 5): How many memories to inject (1–20).
conversation_id (string, optional): Optional ID to group messages into a conversation thread.
user_id (string, optional): Internal user ID. Defaults to your tenant ID from the API key.

Response Fields

reply (string): The LLM's response, personalized with memory context.
used_memories (array): List of memories that were injected into the prompt.
used_memories[].memory_text (string): The content of the memory that was used.
used_memories[].score (float): Similarity score (0–1) for this memory.
memory_created (boolean): Whether the user's message was saved as a new memory.
bash
curl -X POST https://memory-layer-api.onrender.com/v1/chat/ \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "message": "What tools do I prefer for coding?",
    "external_user_id": "user-123",
    "top_k": 5
  }'
POST /v1/chat/legacy

Backward-compatible chat endpoint. Returns reply, memory_ids, and memories_used.

Request Body

message (string, required): The user's message.
user_id (string, optional): User ID. Defaults to tenant from API key.
top_k (integer, default: 5): Number of memories to retrieve.
external_user_id (string, optional): End-user ID for multi-tenant isolation.
bash
curl -X POST https://memory-layer-api.onrender.com/v1/chat/legacy \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"message": "What am I building?", "external_user_id": "user-123"}'
json
{
  "reply": "You are building a SaaS dashboard in React.",
  "memory_ids": ["e32754fa-..."],
  "mode": "process",
  "memories_used": [
    { "id": "e32754fa-...", "content": "User is building a React dashboard", "similarity": 0.91 }
  ]
}

Multi-Tenant Users

If you're building a SaaS product with many end-users, you need each user's memories to be completely private. Memory Layer handles this with a single field: external_user_id.

❌ Without external_user_id

All memories for all users are pooled together. A search for one user could return memories from another user.

✅ With external_user_id

Each user's memories are stored and searched in complete isolation. One API key serves millions of users safely.

How it works

You use one API key for your entire app. Every API call includes external_user_id set to your own user's ID (UUID, email, or any unique string). Memory Layer stores and retrieves memories scoped to that ID — zero cross-user leakage.
javascript
// Store memory for end-user "alice" in your SaaS app
await fetch('https://memory-layer-api.onrender.com/v1/memory/', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_KEY',  // One key for your whole app
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    content: 'Alice is on the Pro plan and works in fintech',
    external_user_id: 'alice',  // Alice's ID in YOUR system
  }),
});

// Store memory for end-user "bob"
await fetch('https://memory-layer-api.onrender.com/v1/memory/', {
  method: 'POST',
  headers: { 'Authorization': 'Bearer YOUR_API_KEY' },
  body: JSON.stringify({
    content: 'Bob is building a healthcare app with Python',
    external_user_id: 'bob',   // Bob's ID — completely isolated from Alice
  }),
});

Always set external_user_id in production

If you forget to set it, memories are stored without user isolation and all users' searches will return pooled results. Make this a required field in your API wrapper.
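One way to make the field effectively required is to validate it in your own wrapper before any request leaves your server. A sketch (the helper is illustrative; the payload shape follows the store endpoint documented above):

```python
from typing import Optional

def build_memory_payload(content: str, external_user_id: str,
                         metadata: Optional[dict] = None) -> dict:
    """Build a /v1/memory/ request body, refusing to proceed without a user ID."""
    if not external_user_id:
        raise ValueError("external_user_id is required for multi-tenant isolation")
    return {
        "content": content,
        "external_user_id": external_user_id,
        "metadata": metadata or {},
    }
```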

API Key Management

You can manage API keys programmatically. These endpoints use your Supabase session JWT (obtained after logging in), not the API key itself.

POST /v1/keys/

Create a new API key. The full key value is only returned once — save it immediately.

Request Body

name (string, required): A human-readable name for this key (e.g. "Production", "Testing").
bash
curl -X POST https://memory-layer-api.onrender.com/v1/keys/ \
  -H "Authorization: Bearer YOUR_SUPABASE_JWT" \
  -H "Content-Type: application/json" \
  -d '{ "name": "Production Key" }'

Save your key immediately

The full API key (api_key) is returned only once at creation. It cannot be retrieved again. If you lose it, revoke and create a new one.
GET /v1/keys/

List all API keys for your account. The key values are masked for security.

bash
curl https://memory-layer-api.onrender.com/v1/keys/ \
  -H "Authorization: Bearer YOUR_SUPABASE_JWT"
POST /v1/keys/{key_id}/revoke

Revoke an API key. Requests using this key will immediately start receiving 401 errors.

Path Parameters

key_id (string, required): The ID of the key to revoke (from the list endpoint).
bash
curl -X POST https://memory-layer-api.onrender.com/v1/keys/key_abc123/revoke \
  -H "Authorization: Bearer YOUR_SUPABASE_JWT"

Integration Guides

Drop-in client libraries for your preferred language or framework.

🐍 Python

Works with Django, FastAPI, Flask, or any Python app. No extra dependencies beyond requests.

python
import requests

API_KEY = "YOUR_API_KEY"
BASE_URL = "https://memory-layer-api.onrender.com"

class MemoryLayerClient:
    def __init__(self, api_key: str, user_id: str):
        self.headers = {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        }
        self.user_id = user_id

    def remember(self, content: str) -> dict:
        """Store a memory for this user."""
        res = requests.post(
            f"{BASE_URL}/v1/memory/",
            headers=self.headers,
            json={"content": content, "external_user_id": self.user_id},
        )
        return res.json()

    def recall(self, query: str, limit: int = 5) -> list:
        """Search memories by semantic similarity."""
        res = requests.post(
            f"{BASE_URL}/v1/memory/search",
            headers=self.headers,
            json={
                "query": query,
                "external_user_id": self.user_id,
                "limit": limit,
                "similarity_threshold": 0.3,
            },
        )
        return res.json().get("results", [])

    def chat(self, message: str) -> str:
        """Send a message with memory-augmented context."""
        res = requests.post(
            f"{BASE_URL}/v1/chat/",
            headers=self.headers,
            json={"message": message, "external_user_id": self.user_id},
        )
        return res.json().get("reply", "")


# Usage
client = MemoryLayerClient(API_KEY, user_id="user-123")
client.remember("I am building a React dashboard for a SaaS startup")
reply = client.chat("What am I building?")
print(reply)
# "You are building a React dashboard for a SaaS startup."

🟨 JavaScript / Node.js

Works in Node.js, Deno, Bun, or any environment with the native Fetch API. Zero dependencies.

javascript
// memory-layer.js — Drop-in client for Node.js (call from your server, not the browser)
const BASE_URL = 'https://memory-layer-api.onrender.com';

export class MemoryLayerClient {
  constructor(apiKey, userId) {
    this.headers = {
      'Authorization': `Bearer ${apiKey}`,
      'Content-Type': 'application/json',
    };
    this.userId = userId;
  }

  async remember(content, metadata = {}) {
    const res = await fetch(`${BASE_URL}/v1/memory/`, {
      method: 'POST',
      headers: this.headers,
      body: JSON.stringify({
        content,
        external_user_id: this.userId,
        metadata,
      }),
    });
    return res.json();
  }

  async recall(query, limit = 5, threshold = 0.3) {
    const res = await fetch(`${BASE_URL}/v1/memory/search`, {
      method: 'POST',
      headers: this.headers,
      body: JSON.stringify({
        query,
        external_user_id: this.userId,
        limit,
        similarity_threshold: threshold,
      }),
    });
    return res.json();
  }

  async chat(message, topK = 5) {
    const res = await fetch(`${BASE_URL}/v1/chat/`, {
      method: 'POST',
      headers: this.headers,
      body: JSON.stringify({
        message,
        external_user_id: this.userId,
        top_k: topK,
      }),
    });
    return res.json();
  }
}

// Usage
const client = new MemoryLayerClient('YOUR_API_KEY', 'user-123');
await client.remember('I prefer React over Vue for frontend work');
const { reply } = await client.chat('What frontend framework do I use?');
console.log(reply);
// "You prefer React for frontend work."

Next.js

Keep your API key secure on the server using Next.js API routes. Never expose it to the browser.

typescript
// app/api/chat/route.ts — Next.js App Router API route
import { NextRequest, NextResponse } from 'next/server';

const MEMORY_API = 'https://memory-layer-api.onrender.com';
const API_KEY = process.env.MEMORY_LAYER_API_KEY!; // Store in .env.local

export async function POST(req: NextRequest) {
  const { message, userId } = await req.json();

  const res = await fetch(`${MEMORY_API}/v1/chat/`, {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      message,
      external_user_id: userId,  // Your app's user ID
      top_k: 5,
    }),
  });

  const data = await res.json();
  return NextResponse.json({
    reply: data.reply,
    memoriesUsed: data.used_memories.length,
  });
}

// components/Chat.tsx — Frontend component
'use client';
import { useState } from 'react';

export function Chat({ userId }: { userId: string }) {
  const [messages, setMessages] = useState<string[]>([]);

  const send = async (message: string) => {
    const res = await fetch('/api/chat', {
      method: 'POST',
      body: JSON.stringify({ message, userId }),
    });
    const { reply } = await res.json();
    setMessages(prev => [...prev, message, reply]);
  };
  // ...
}

.env.local setup

Add MEMORY_LAYER_API_KEY=mlive_... to your .env.local file. Next.js keeps this server-side automatically.

Advanced Features

Tools to keep your memory store clean, organized, and efficient at scale.

POST /v1/memory/summarize

Clusters related memories and generates summary memories using the LLM. Reduces noise and improves retrieval quality over time.

bash
curl -X POST "https://memory-layer-api.onrender.com/v1/memory/summarize" \
  -H "Authorization: Bearer YOUR_API_KEY"
POST /v1/memory/prune

Remove old or low-importance memories to keep your store lean. Use dry_run=true first to preview what would be pruned.

Query Parameters

max_age_days (integer, default: 90): Archive memories older than this many days.
min_importance (float, default: 0.1): Archive memories with importance score below this threshold.
max_memories (integer, default: 10000): Maximum memories to keep. Prunes oldest when exceeded.
archive_only (boolean, default: true): Archive instead of hard-delete (recommended).
dry_run (boolean, default: false): Preview what would be pruned without actually doing it.
bash
curl -X POST \
  "https://memory-layer-api.onrender.com/v1/memory/prune?max_age_days=90&min_importance=0.1&dry_run=true" \
  -H "Authorization: Bearer YOUR_API_KEY"
POST /v1/memory/maintenance/{job_type}

Run maintenance jobs on-demand. Available job types are listed below.

Job Types

pruning: Remove old and low-importance memories.
summarization: Cluster and summarize related memories.
link-optimization: Rebuild similarity links between memories.
importance-recalculation: Recalculate importance scores based on access counts.
all: Run all maintenance jobs in sequence.
bash
curl -X POST "https://memory-layer-api.onrender.com/v1/memory/maintenance/all" \
  -H "Authorization: Bearer YOUR_API_KEY"

Error Reference

Memory Layer uses standard HTTP status codes. All errors include a machine-readable detail field.

200 OK: Request succeeded. Read the response body.
201 Created: Memory was stored. Read memory.id from the response.
400 Bad Request: Invalid input or missing required fields. Check your request body against the parameter table.
401 Unauthorized: Missing or invalid API key. Check your Authorization header and make sure it includes the "Bearer " prefix.
404 Not Found: Memory ID does not exist. Verify the memory_id; it may have been deleted.
422 Validation Error: Request body failed validation. Check the details array in the response to see which field is wrong.
429 Rate Limited: Too many requests. Slow down, implement exponential backoff, or upgrade your plan.
500 Server Error: Internal error on our side. Retry with backoff; if it persists, contact support.

Error response examples

json
// 401 — Invalid or missing API key
{
  "detail": "Invalid or missing API key"
}

// 422 — Validation error (missing required field)
{
  "status": "error",
  "message": "Validation error",
  "details": [
    {
      "type": "missing",
      "loc": ["body", "query"],
      "msg": "Field required"
    }
  ],
  "timestamp": "2026-03-14T08:00:00.000Z"
}

// 404 — Memory not found
{
  "detail": "Memory not found"
}

// 429 — Rate limit exceeded
{
  "detail": "Rate limit exceeded. Upgrade your plan or wait."
}

// 500 — Internal server error
{
  "detail": "Internal server error. Please try again."
}
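The 422 shape lends itself to a small debugging helper that names the offending field. A sketch written against the example payload above (the helper name is ours):

```python
def missing_fields(error_body: dict) -> list:
    """Extract human-readable field paths from a 422 validation error body."""
    fields = []
    for d in error_body.get("details", []):
        if d.get("type") == "missing":
            # loc is a path like ["body", "query"]; join it for display
            fields.append(".".join(str(part) for part in d.get("loc", [])))
    return fields
```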

Rate Limits & Plan Limits

Free: 1,000 memories, 100 searches/day, 100K tokens/mo
Starter ($9/mo): 100,000 memories, 1,000 searches/mo, 1M tokens/mo
Pro ($49/mo): 1,000,000 memories, 10,000 searches/mo, 10M tokens/mo
Enterprise: unlimited memories, searches, and tokens

Exceeding limits

Requests that exceed your plan limits return a 429 Rate Limited response. Upgrade your plan to increase limits.

Best Practices

Always set external_user_id

In any app with more than one user, always pass external_user_id in every store, search, and chat call. Use your own user IDs — UUIDs, emails, or integers all work.

Call from your backend

Never expose your API key in browser JavaScript. Make your API calls from your server and proxy the response to your frontend.

Use lower similarity thresholds for chat

For conversational context, a threshold of 0.3–0.4 captures relevant memories without being too strict. Use 0.7+ for precision lookups.

Implement exponential backoff

If you receive a 429 or 500, wait 1s, then 2s, then 4s before retrying. This keeps your app resilient under load.
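That schedule (1s, then 2s, then 4s) can wrap any request function. A generic sketch (the retryable status set and helper name are our choices, not part of the API):

```python
import time

RETRYABLE = {429, 500}  # statuses worth retrying per the error reference

def with_backoff(call, retries: int = 3, base_delay: float = 1.0):
    """Call `call()` (returning (status, body)); retry on 429/500 with
    exponential backoff: base_delay, 2x, 4x, ... between attempts."""
    for attempt in range(retries + 1):
        status, body = call()
        if status not in RETRYABLE or attempt == retries:
            return status, body
        time.sleep(base_delay * (2 ** attempt))

# Usage: status, body = with_backoff(lambda: do_request(...))
```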

Run maintenance periodically

For long-running apps, schedule /v1/memory/maintenance/all weekly to keep the memory store clean and retrieval fast.

Use top_k=3–5 for chat

Injecting too many memories increases token usage and can confuse the LLM. 3–5 relevant memories is the sweet spot for most use cases.