# Agent Memory with Cognee on Tigris S3 Storage

An AI agent starts every conversation from scratch unless you give it a memory layer to hold what it has learned. Tigris makes a good home for that memory. It is S3-compatible object storage, and the part that matters here is that a whole memory state becomes something you can snapshot, fork, and roll back the way you branch code.

Cognee is the memory engine that runs on top. Feed it text, files, or URLs and it builds a knowledge graph of entities, relationships, and the embeddings your agent searches against. Point Cognee at a Tigris bucket and that graph, its vector indexes, and the raw source data all live in storage you control. The memory API reads as a lifecycle of four verbs:

| Verb       | What it does                                                                                |
| ---------- | ------------------------------------------------------------------------------------------- |
| `remember` | Store data permanently: ingest it and build the knowledge graph.                            |
| `recall`   | Retrieve relevant knowledge, auto-routing across graph, vector, and lexical search.         |
| `improve`  | Enrich the graph: apply feedback, build indexes, fold session memory into permanent memory. |
| `forget`   | Remove memory. One item, a dataset, or everything, through one call.                        |

The older `add`, `cognify`, `search`, and `memify` calls still exist as lower-level building blocks, but the four verbs above are the surface you reach for.

Here is what storing that memory on Tigris buys you. In Cognee's default file-based mode, the entire memory lives as files under a single bucket: the raw data, the vector indexes, and the knowledge graph together. Snapshot the bucket and you capture all of it at one instant. Fork it and an agent gets a private copy to work in, where it can trial a new embedding model, test a risky `improve`, or explore a what-if without touching the memory that production depends on. Recovering from a bad learning becomes a rollback rather than a re-ingestion. You also get Tigris's global distribution and zero egress fees.

This guide starts with those file-based defaults, which need no servers to run, then covers the **Postgres** backend for distributed, multi-instance deployments. The [snapshot and fork workflow](#experimenting-with-memory-snapshots-and-forks) comes later.

The basic agent memory architecture is a three-stage pipeline: Agent → Cognee → Tigris. The agent calls `remember`, `recall`, and `forget` on Cognee, and Cognee stores the resulting memory on Tigris.

## Prerequisites

1. **A Tigris account** with an Access Key ID and Secret Access Key (keys start with `tid_` and `tsec_`). Create them via the [Tigris Access Key guide](/docs/iam/manage-access-key/).
2. **A Tigris bucket** for storing agent memory.
3. **Python 3.10+** installed.
4. **An LLM API key**, OpenAI by default, though Cognee also supports Anthropic, Gemini, Ollama, and others.

## Step 1: Install Cognee

```
pip install "cognee[aws]"
```

The `[aws]` extra adds S3 support through `s3fs`. The default database backends ship with core Cognee: LanceDB for vectors, Ladybug for the knowledge graph, and SQLite for metadata. This one install covers the full stack. For the Postgres backend (see [Choosing a database backend](#choosing-a-database-backend)), install `"cognee[postgres,aws]"` instead.

## Step 2: Create a Tigris bucket

**AWS CLI**

```
aws s3api create-bucket \
  --bucket my-agent-memory \
  --endpoint-url https://t3.storage.dev \
  --region auto
```

**Python**

```
import boto3
from botocore.config import Config

s3 = boto3.client(
    "s3",
    endpoint_url="https://t3.storage.dev",
    aws_access_key_id="tid_YOUR_ACCESS_KEY_ID",
    aws_secret_access_key="tsec_YOUR_SECRET_ACCESS_KEY",
    region_name="auto",
    config=Config(s3={"addressing_style": "virtual"}),
)

s3.create_bucket(Bucket="my-agent-memory")
```

**Note:** Tigris requires `virtual` addressing style when using boto3. The endpoint is `https://t3.storage.dev`.
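To make the addressing requirement concrete, the sketch below builds the request URL that each S3 addressing style produces for the same object. The helper `object_url` and the key `cognee/data/notes.txt` are hypothetical and exist only for illustration; boto3 builds these URLs internally once `addressing_style` is set.

```python
from urllib.parse import quote, urlsplit


def object_url(endpoint: str, bucket: str, key: str, style: str = "virtual") -> str:
    """Return the request URL for an object under the given S3 addressing style.

    Hypothetical helper for illustration only; boto3 does this for you.
    """
    parts = urlsplit(endpoint)
    path = quote(key, safe="/")
    if style == "virtual":
        # Virtual-hosted style: the bucket becomes part of the hostname.
        return f"{parts.scheme}://{bucket}.{parts.netloc}/{path}"
    if style == "path":
        # Path style: the bucket is the first path segment.
        return f"{parts.scheme}://{parts.netloc}/{bucket}/{path}"
    raise ValueError(f"unknown addressing style: {style!r}")


endpoint = "https://t3.storage.dev"
print(object_url(endpoint, "my-agent-memory", "cognee/data/notes.txt"))
# https://my-agent-memory.t3.storage.dev/cognee/data/notes.txt
print(object_url(endpoint, "my-agent-memory", "cognee/data/notes.txt", style="path"))
# https://t3.storage.dev/my-agent-memory/cognee/data/notes.txt
```

The first form is the one the `Config(s3={"addressing_style": "virtual"})` setting selects.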

## Step 3: Configure environment variables

Create a `.env` file in your project root. Cognee loads `.env` automatically on import, so it must exist before you `import cognee`.

```
# Tigris credentials (boto3/s3fs read these automatically)
AWS_ACCESS_KEY_ID=tid_YOUR_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY=tsec_YOUR_SECRET_ACCESS_KEY
AWS_REGION=auto
AWS_ENDPOINT_URL=https://t3.storage.dev

# LLM provider
LLM_API_KEY=sk-your-openai-api-key
LLM_MODEL=openai/gpt-4o-mini

# Store everything on Tigris instead of the local filesystem
STORAGE_BACKEND=s3
STORAGE_BUCKET_NAME=my-agent-memory
DATA_ROOT_DIRECTORY=s3://my-agent-memory/cognee/data
SYSTEM_ROOT_DIRECTORY=s3://my-agent-memory/cognee/system
```
```

| Variable                                      | Purpose                                                                                                                                 |
| --------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------- |
| `AWS_ACCESS_KEY_ID` / `AWS_SECRET_ACCESS_KEY` | Tigris credentials. Cognee and boto3 read these automatically.                                                                          |
| `AWS_REGION`                                  | Set to `auto`. Tigris handles region routing for you.                                                                                   |
| `AWS_ENDPOINT_URL`                            | Points S3 requests at Tigris instead of AWS. Required for any S3-compatible endpoint.                                                   |
| `STORAGE_BACKEND`                             | Set to `s3` to use Tigris instead of the local filesystem. Without it, Cognee writes to disk even if you've set `s3://` URIs elsewhere. |
| `STORAGE_BUCKET_NAME`                         | Your Tigris bucket name. Cognee uses it to auto-configure the cache directory on S3 (e.g. `s3://my-bucket/cognee/cache`).               |
| `DATA_ROOT_DIRECTORY`                         | Where Cognee stores raw ingested data and file uploads. This is your durable source of truth.                                           |
| `SYSTEM_ROOT_DIRECTORY`                       | Where Cognee stores its databases: vector indexes (LanceDB), the knowledge graph (Ladybug), and metadata (SQLite).                      |
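Several of these settings fail silently when they are wrong (for example, a missing `STORAGE_BACKEND` quietly writes to local disk), so a small preflight check can save debugging time. The following is a minimal sketch using only the rules stated in the table above. The function name `check_tigris_env` is made up for this example and is not part of Cognee.

```python
from typing import Mapping


def check_tigris_env(env: Mapping[str, str]) -> list[str]:
    """Return human-readable problems; an empty list means the config looks sane.

    Hypothetical helper: it only encodes the rules from this guide's table.
    """
    problems: list[str] = []
    if not env.get("AWS_ACCESS_KEY_ID", "").startswith("tid_"):
        problems.append("AWS_ACCESS_KEY_ID should start with tid_")
    if not env.get("AWS_SECRET_ACCESS_KEY", "").startswith("tsec_"):
        problems.append("AWS_SECRET_ACCESS_KEY should start with tsec_")
    if not env.get("AWS_ENDPOINT_URL"):
        problems.append("AWS_ENDPOINT_URL is not set")
    if env.get("STORAGE_BACKEND") != "s3":
        problems.append("STORAGE_BACKEND must be s3")
    for name in ("DATA_ROOT_DIRECTORY", "SYSTEM_ROOT_DIRECTORY"):
        if not env.get(name, "").startswith("s3://"):
            problems.append(f"{name} must be an s3:// URI")
    return problems


example = {
    "AWS_ACCESS_KEY_ID": "tid_YOUR_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY": "tsec_YOUR_SECRET_ACCESS_KEY",
    "AWS_ENDPOINT_URL": "https://t3.storage.dev",
    "STORAGE_BACKEND": "s3",
    "DATA_ROOT_DIRECTORY": "s3://my-agent-memory/cognee/data",
    "SYSTEM_ROOT_DIRECTORY": "s3://my-agent-memory/cognee/system",
}
print(check_tigris_env(example))  # prints []
```

In a real project you could pass `os.environ` instead of the example dictionary, once your `.env` has been loaded into the process environment.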

## Step 4: Build agent memory

Here's a complete example that teaches your agent a couple of facts and then queries what it knows. Run it once to confirm everything is wired up before integrating into your agent loop.

```
import asyncio

import cognee


async def main():
    # Start fresh during development (drops all memory for the current user).
    await cognee.forget(everything=True)

    # Teach the agent. remember() ingests the text AND builds the graph,
    # so there's no separate "process" step.
    await cognee.remember(
        "Tigris is a globally distributed, S3-compatible object storage "
        "service. It distributes data to regions closest to your users and "
        "caches frequently accessed data at the edge.",
        dataset_name="facts",
    )
    await cognee.remember(
        "Cognee is a memory engine for AI agents. It builds knowledge graphs "
        "from unstructured data and retrieves with combined vector similarity "
        "and graph traversal.",
        dataset_name="facts",
    )

    # Recall. Cognee auto-routes to the best retrieval strategy.
    results = await cognee.recall("How does Tigris distribute data globally?")
    for i, result in enumerate(results, 1):
        print(f"[{i}] {result}")

    return results


if __name__ == "__main__":
    asyncio.run(main())
```

### What each verb does

1. **`cognee.remember()`** ingests raw content *and* builds structured memory in one call: chunking, embedding generation, entity extraction, and knowledge graph construction. With `STORAGE_BACKEND=s3`, the raw input is persisted to Tigris immediately, so your source of truth is safe even if a later step fails.
2. **`cognee.recall()`** retrieves relevant information, querying vector indexes for semantic matches and traversing the knowledge graph for related entities. It auto-selects the retrieval strategy, so you usually just pass a question.
3. **`cognee.improve()`** (optional) enriches an existing graph. It applies feedback weights and builds indexes. `remember()` runs a light improvement pass by default. Call it directly when you want heavier enrichment.
4. **`cognee.forget()`** removes memory. See [Managing memory with forget](#managing-memory-with-forget).

**Note:** For short-term context, pass `session_id="..."` to `remember()` and `recall()`. Session data is stored for fast retrieval and bridged into the permanent graph in the background. This is how per-conversation memory graduates into long-term knowledge.
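The session pattern described above can be sketched as a small helper. This is an illustration under assumptions: the name `session_exchange` and the session ID `conversation-42` are invented, and the only Cognee behavior relied on is the `session_id` keyword that this guide describes. The `memory` argument is the imported `cognee` module, or any object exposing the same async `remember` and `recall` methods, which keeps the helper easy to test without a live backend.

```python
import asyncio


async def session_exchange(memory, session_id: str, statement: str, question: str):
    """Store one statement in a session, then ask a question within that session.

    `memory` is expected to be the `cognee` module (assumption: it accepts
    `session_id` on both calls, as described in this guide).
    """
    await memory.remember(statement, session_id=session_id)
    return await memory.recall(question, session_id=session_id)


async def main():
    # Imported here so the helper above stays importable without Cognee installed.
    import cognee

    results = await session_exchange(
        cognee,
        "conversation-42",  # hypothetical session ID
        "The user deploys to Tigris from CI.",
        "Where does the user deploy?",
    )
    for result in results:
        print(result)


# Uncomment once Cognee is installed and configured as in Step 3:
# asyncio.run(main())
```

Using one session ID per conversation keeps short-term context separate between conversations while the permanent graph is shared.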

## Step 5: Feed memory from files and S3

Your agent can learn from local files, S3 objects, or a mix. This is useful for bootstrapping an agent with an existing document corpus.

```
import asyncio

import cognee


async def feed_documents():
    await cognee.remember("/path/to/research-paper.pdf", dataset_name="corpus")
    await cognee.remember("/path/to/documents/", dataset_name="corpus")          # whole directory
    await cognee.remember("s3://my-agent-memory/uploads/report.txt", dataset_name="corpus")
    await cognee.remember("s3://my-agent-memory/uploads/", dataset_name="corpus") # all files under a prefix

    results = await cognee.recall("key findings about distributed storage")
    for result in results:
        print(result)


asyncio.run(feed_documents())
```

**Tip:** An `s3://` URI to a single file ingests that file. A prefix URI (ending in `/`) recursively discovers everything underneath it, so you can point Cognee at an entire document library in one call. Inputs already on `s3://` are read in place. Cognee records them by reference rather than copying them.
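Because a trailing slash changes a single-file ingest into a recursive one, it can help to check a URI before handing it to `remember`. The sketch below mirrors the rule as stated above; it is not Cognee's own parsing code, and `split_s3_uri` is a hypothetical name. Treating a bare bucket URI as a prefix is an assumption of this sketch.

```python
def split_s3_uri(uri: str) -> tuple[str, str, bool]:
    """Split an s3:// URI into (bucket, key, is_prefix).

    Hypothetical helper: `is_prefix` is True when the URI ends in "/" (or names
    only a bucket, which this sketch assumes means "everything in it").
    """
    scheme = "s3://"
    if not uri.startswith(scheme):
        raise ValueError(f"not an s3:// URI: {uri!r}")
    bucket, _, key = uri[len(scheme):].partition("/")
    if not bucket:
        raise ValueError(f"missing bucket name: {uri!r}")
    is_prefix = key == "" or key.endswith("/")
    return bucket, key, is_prefix


for uri in ("s3://my-agent-memory/uploads/report.txt", "s3://my-agent-memory/uploads/"):
    bucket, key, is_prefix = split_s3_uri(uri)
    kind = "prefix (recursive)" if is_prefix else "single object"
    print(f"{uri} -> bucket={bucket} key={key} [{kind}]")
```

A check like this is mostly useful in ingestion scripts that build URIs from user input, where a stray trailing slash could pull in far more data than intended.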

## Choosing a database backend

Cognee consists of three structured stores plus a raw-data layer. The stores are relational (state), vector (embeddings), and graph (entities). **Raw data belongs on Tigris in every configuration.** The structured stores are pluggable, and the right choice depends on whether you run one Cognee process or many.

### Default: file-based stores (single instance)

SQLite, Ladybug, and LanceDB are embedded engines. With the Step 3 config they live entirely on Tigris, with Cognee syncing the database files to and from S3, pulling on open and pushing on commit. This is ideal for development and single-worker agents, since there is no infrastructure to run and everything sits in one bucket.

**Warning:** This sync is **last-writer-wins**. It is safe for a single Cognee process, but multiple instances writing concurrently can clobber each other's database files. For distributed deployments, use Postgres below.
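To see why last-writer-wins loses data, consider the toy model below. It is not Cognee's sync code: `Bucket` and `Instance` are invented stand-ins that only imitate the pull-on-open, push-on-commit pattern described above, with a dictionary playing the role of a database file.

```python
import copy


class Bucket:
    """Stand-in for object storage holding one database file."""

    def __init__(self):
        self.db_file = {"facts": []}


class Instance:
    """Stand-in for one process that syncs the whole file, not individual rows."""

    def __init__(self, bucket: Bucket):
        self.bucket = bucket
        self.local = None

    def open(self):
        # Pull on open: download a private copy of the file.
        self.local = copy.deepcopy(self.bucket.db_file)

    def write(self, fact: str):
        self.local["facts"].append(fact)

    def commit(self):
        # Push on commit: overwrite the remote file with the local copy.
        self.bucket.db_file = copy.deepcopy(self.local)


bucket = Bucket()
a, b = Instance(bucket), Instance(bucket)

a.open()
b.open()                      # both instances pull the same (empty) file
a.write("fact from A")
b.write("fact from B")
a.commit()
b.commit()                    # B's push replaces the file A just pushed

print(bucket.db_file["facts"])  # ['fact from B']
```

The write from instance A is gone even though both commits succeeded, which is the failure a networked database avoids by coordinating writes at the row level.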

### Postgres: networked stores (distributed production)

When several stateless Cognee instances share memory, the structured stores need a networked database. One Postgres instance backs all three: relational, vector (pgvector), and graph. Raw data stays on Tigris. Install `"cognee[postgres,aws]"` and use this `.env` instead:

```
# Raw data still on Tigris
STORAGE_BACKEND=s3
AWS_ENDPOINT_URL=https://t3.storage.dev
AWS_REGION=auto
AWS_ACCESS_KEY_ID=tid_YOUR_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY=tsec_YOUR_SECRET_ACCESS_KEY
DATA_ROOT_DIRECTORY=s3://my-agent-memory/cognee/data
SYSTEM_ROOT_DIRECTORY=s3://my-agent-memory/cognee/system

# Structured stores on one Postgres (local, or managed: Neon / Supabase / RDS)
DB_PROVIDER=postgres
DB_HOST=127.0.0.1
DB_PORT=5432
DB_USERNAME=cognee
DB_PASSWORD=cognee
DB_NAME=cognee_db
VECTOR_DB_PROVIDER=pgvector
GRAPH_DATABASE_PROVIDER=postgres

# Embedding dims must stay within pgvector's index ceiling (2000)
EMBEDDING_MODEL=openai/text-embedding-3-small
EMBEDDING_DIMENSIONS=1536
```

The Postgres graph and pgvector adapters reuse the `DB_*` credentials, so one instance serves all three structured stores. The Postgres instance must have the `vector` extension available (managed providers like Neon and Supabase ship it).
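Two things in this configuration are easy to get wrong: the connection details and the embedding dimension limit. The sketch below derives a standard `postgresql://` connection URL from the `DB_*` values and checks the dimension ceiling quoted in the `.env` comment. Both function names are hypothetical, and this is a preflight aid rather than anything Cognee runs itself.

```python
from typing import Mapping
from urllib.parse import quote

# Index ceiling as stated in this guide's .env comment.
PGVECTOR_INDEX_MAX_DIMS = 2000


def postgres_url(env: Mapping[str, str]) -> str:
    """Build a postgresql:// URL from the DB_* settings (hypothetical helper)."""
    user = quote(env["DB_USERNAME"], safe="")
    password = quote(env["DB_PASSWORD"], safe="")  # escape characters like @ or /
    host = env["DB_HOST"]
    port = env.get("DB_PORT", "5432")
    return f"postgresql://{user}:{password}@{host}:{port}/{env['DB_NAME']}"


def check_embedding_dims(env: Mapping[str, str]) -> int:
    """Return the configured dimensions, or raise if they exceed the ceiling."""
    dims = int(env["EMBEDDING_DIMENSIONS"])
    if dims > PGVECTOR_INDEX_MAX_DIMS:
        raise ValueError(
            f"EMBEDDING_DIMENSIONS={dims} exceeds the index ceiling "
            f"of {PGVECTOR_INDEX_MAX_DIMS}"
        )
    return dims


env = {
    "DB_HOST": "127.0.0.1",
    "DB_PORT": "5432",
    "DB_USERNAME": "cognee",
    "DB_PASSWORD": "cognee",
    "DB_NAME": "cognee_db",
    "EMBEDDING_DIMENSIONS": "1536",
}
print(postgres_url(env))          # postgresql://cognee:cognee@127.0.0.1:5432/cognee_db
print(check_embedding_dims(env))  # 1536
```

The resulting URL can be handed to a client such as `psql` to confirm the instance is reachable before starting Cognee.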

**Tip:** The application code does not change between backends. The same `remember`, `recall`, `improve`, and `forget` calls run the same way whether memory sits on the file-based defaults or Postgres. Only the `.env` differs.

### What can live where

| Layer      | Tigris (`s3://`)                 | Networked DB           |
| ---------- | -------------------------------- | ---------------------- |
| Raw data   | **Yes, always**                  | —                      |
| Vectors    | Yes (LanceDB)                    | Yes (pgvector)         |
| Relational | Single instance only (file sync) | Yes (Postgres)         |
| Graph      | Single instance only (file sync) | Yes (Postgres / Neo4j) |

## Managing memory with forget

`forget` replaces the older prune/delete calls with one set of patterns:

```
# Forget a single item from a dataset
await cognee.forget(data_id=item_id, dataset="corpus")

# Forget an entire dataset (raw data + graph + vectors)
await cognee.forget(dataset="corpus")

# Drop only the graph/vectors but KEEP raw files, so you can re-process
# with different settings (e.g. a new embedding model)
await cognee.forget(dataset="corpus", memory_only=True)

# Forget everything the current user owns (handy as a dev reset)
await cognee.forget(everything=True)
```

Raw data is the durable source of truth on Tigris, which makes `memory_only=True` the safe way to rebuild derived memory. Drop the graph and vectors, then `remember` the same data again under new settings, and you never re-upload the files.
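The rebuild flow described above can be sketched as one function. It assumes only the calls already shown in this guide (`forget(dataset=..., memory_only=True)` and `remember(..., dataset_name=...)`). The name `rebuild_dataset` and the source URI are illustrative, and `memory` stands for the imported `cognee` module.

```python
import asyncio


async def rebuild_dataset(memory, dataset: str, sources: list[str]) -> None:
    """Drop derived memory for `dataset`, keep raw files, then rebuild it.

    Hypothetical helper. Change settings such as the embedding model in
    `.env` before calling it, so the rebuild picks them up.
    """
    # Remove the graph and vectors only; raw data stays on Tigris.
    await memory.forget(dataset=dataset, memory_only=True)
    # Re-process the same inputs under the new settings.
    for source in sources:
        await memory.remember(source, dataset_name=dataset)


async def main():
    # Imported here so the helper above stays importable without Cognee installed.
    import cognee

    await rebuild_dataset(
        cognee,
        "corpus",
        ["s3://my-agent-memory/uploads/"],  # illustrative source prefix
    )


# Uncomment once Cognee is installed and configured:
# asyncio.run(main())
```

Taking a bucket snapshot before running a rebuild like this gives you a way back if the new settings turn out worse.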

## Experimenting with memory: snapshots and forks

Because your memory lives in a bucket, you can treat whole memory states as cheap, branchable objects rather than just durable storage.

* **Snapshot before anything destructive.** A bulk `forget` or a heavy `improve` rewrites memory in place, so snapshot the bucket first and a bad run becomes a one-command rollback instead of a full re-ingestion.
* **Fork to experiment.** Fork the bucket and point a throwaway agent at the copy to trial a new embedding model, a different ontology, or an aggressive pruning policy, all against real memory with zero risk to production.
* **Branch per tenant or per run.** Give each customer or each experiment its own fork so their memories evolve independently from a shared baseline.

In the default file-based config, a single bucket snapshot captures the *entire* memory, because raw data, vectors, and graph all live under one bucket. A fork is therefore a complete, independent memory the agent can mutate freely. The [storagesdk](https://storagesdk.dev) makes both a one-liner:

```
import { Storage } from "@storagesdk/core";
import { tigris } from "@storagesdk/adapters/tigris";

const storage = new Storage({ adapter: tigris({ bucket: "my-agent-memory" }) });

// Capture the current memory state before a risky change
const snap = await storage.snapshots.create({ name: "pre-improve" });

// ...run cognee.improve(...) / cognee.forget(...)...
// If it goes badly, the snapshot is your rollback.

// Or branch the whole memory so an agent can experiment on a copy
const fork = await storage.forks.create({
  name: "exp-new-embeddings",
  fromSnapshot: snap.id,
});
```

**Note:** With the Postgres backend, the structured stores live in Postgres rather than the bucket, so a bucket snapshot captures only raw data. Pair it with a Postgres backup on the same schedule if you want derived memory restored to the same instant, too.

## Per-agent memory isolation

**Tip:** For multi-agent or multi-user systems, Cognee's `ENABLE_BACKEND_ACCESS_CONTROL` is `True` by default. Cognee keeps separate databases per user and dataset, so agents can't read each other's memory. Scope memory by passing distinct `dataset_name` values to `remember` and `datasets=[...]` to `recall`.
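One way to apply the scoping described above is to derive the dataset name from an agent identifier and route every call through a pair of small wrappers. This is a sketch under assumptions: the `agent-<id>` naming scheme and the function names are invented here, and the only Cognee parameters used are `dataset_name` and `datasets`, as named in this guide.

```python
import asyncio


def dataset_for(agent_id: str) -> str:
    """Map an agent ID to its dataset name (hypothetical naming scheme)."""
    return f"agent-{agent_id}"


async def remember_for(memory, agent_id: str, data: str) -> None:
    """Store `data` in the dataset that belongs to `agent_id`."""
    await memory.remember(data, dataset_name=dataset_for(agent_id))


async def recall_for(memory, agent_id: str, question: str):
    """Search only the dataset that belongs to `agent_id`."""
    return await memory.recall(question, datasets=[dataset_for(agent_id)])


async def main():
    # Imported here so the helpers above stay importable without Cognee installed.
    import cognee

    await remember_for(cognee, "billing", "Invoices are issued on the 1st.")
    await remember_for(cognee, "support", "Refunds take five business days.")
    print(await recall_for(cognee, "billing", "When are invoices issued?"))


# Uncomment once Cognee is installed and configured:
# asyncio.run(main())
```

Keeping the naming rule in one function means a later change to the scheme touches a single place.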

## Production considerations

### Performance

Tigris caches frequently accessed objects at edge locations closest to your users. Agents running repeated lookups against the same knowledge base benefit automatically, with no configuration needed. Cold reads see S3-level latency, and recently accessed data comes back faster.

### Cost

Tigris charges for storage but has **zero egress fees**. That matters for agents that query memory repeatedly, and for multi-region deployments where agents and users sit far apart. Keeping raw data on Tigris also keeps Postgres sized to structured state, instead of bloating it with blobs.

### Security

**Warning:** Never commit credentials to version control. In production, prefer IAM roles or instance profiles over static keys.

Lock down Cognee before exposing it to external traffic:

```
ACCEPT_LOCAL_FILE_PATH=False        # Disable local file path access
ALLOW_HTTP_REQUESTS=False           # Restrict outbound requests
REQUIRE_AUTHENTICATION=True         # Enable API auth
ENABLE_BACKEND_ACCESS_CONTROL=True  # Per-agent isolation
```

## Troubleshooting

| Issue                                  | Solution                                                                                                                                                      |
| -------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Environment variables not loaded       | Ensure `.env` exists before `import cognee` (it loads `.env` on import). Variables set after import won't take effect.                                        |
| Authentication errors                  | Verify `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` are set. Tigris keys start with `tid_` and `tsec_`. Watch for whitespace when copy-pasting.            |
| Region errors                          | Set `AWS_REGION=auto`. Tigris handles routing automatically.                                                                                                  |
| Wrong S3 endpoint                      | Confirm `AWS_ENDPOINT_URL=https://t3.storage.dev` is in your `.env`.                                                                                          |
| Cognee uses local storage              | Ensure `STORAGE_BACKEND=s3` and that `DATA_ROOT_DIRECTORY` / `SYSTEM_ROOT_DIRECTORY` use `s3://` URIs.                                                        |
| Database locked / corrupted under load | Multiple instances writing file-based stores on S3. Switch to the Postgres backend.                                                                           |
| pgvector index creation fails          | Embedding dimensions exceed pgvector's 2000-dim index ceiling. Use a model with at most 2000 dimensions, such as `text-embedding-3-small` (1536).            |
| Slow first query                       | Expected. The first read fetches from Tigris; subsequent queries benefit from edge caching.                                                                   |
| boto3 addressing errors                | Tigris requires `virtual` addressing style. Use `Config(s3={"addressing_style": "virtual"})` with boto3 directly. LanceDB and s3fs handle this automatically. |

## References

* [Cognee Documentation](https://docs.cognee.ai/)
* [Cognee GitHub Repository](https://github.com/topoteretes/cognee)
* [storagesdk: snapshots and forks for Tigris](https://storagesdk.dev)
* [Tigris Object Storage Documentation](https://www.tigrisdata.com/docs/)