
# Durable global streams in Tigris with S2

[Xe Iaso](https://xeiaso.net), Senior Cloud Whisperer · May 5, 2026 · 12 min read

![An oil painting of a bengal tiger surfing down a river of data streams](/blog/assets/images/hero-image-aae3393167ced3f242f3c27fa8058678.webp)

Streaming infrastructure is an essential part of production workloads, but most options either do too little or too much. The simple ones will gladly discard your data if a consumer disconnects, while the more complicated ones require a dedicated team of arcane experts who speak in strange tongues.

There's room in the middle for something between over-engineered and under-engineered, and [S2](https://s2.dev/) fits right into that gap. S2 is serverless storage for streams — like Kafka and S3 had a baby. It's not a message queue; it's a durable, ordered, append-only stream store. With [S2 Lite](https://github.com/s2-streamstore/s2), you can run it yourself on hardware you can look at on top of Tigris. I have it running in my homelab Kubernetes cluster to ingest the Bluesky public feed.

S2 themselves are pitching this as [infrastructure for distributed AI agents](https://s2.dev/blog/distributed-ai-agents), and the fit is obvious once you see it: a stream per agent, durable reasoning traces, replay from sequence zero to audit how a conclusion formed. Traditional queues weren't built for that shape of workload. S2 was.

## The streaming spectrum

Streaming and messaging systems can get surprisingly complicated depending on the levels of abstraction and complexity at play. To understand where S2 fits, let's cover the different kinds of systems you'll see in the wild. None of these are perfect (sometimes you *do* want to just drop data), but each has its own tradeoffs that make things easier or harder.

### Shout to the void

If you've ever used an IRC channel, this is what I'm talking about. If nobody is around to hear your message, it is as good as gone. If a consumer is disconnected when the message is sent, they won't receive it. This is great for things like cache invalidation, typing indicators in chat applications, or other fundamentally ephemeral data along those lines.

These systems are usually very cheap to run and have minimal resource requirements. Consider these options:

* IRC servers (channels work as a simple publish-subscribe mechanism).
* Redis/Valkey PUBLISH/SUBSCRIBE.
* NATS Core (without the durability layer JetStream).

### Worker queues

Shout-to-the-void systems work great for broadcast-style or unicast-style messages, but they don't have a good way to handle the case where exactly one worker out of a group should handle a given message. This is where worker queues like Amazon's SQS, RabbitMQ, or Redis/Valkey RPUSH/LPOP come into play.

With those queues you build up state server-side, and as workers grab data they slowly chew through the backlog until nothing is left. This idea is the backbone of many web frameworks (if you've ever gone deep into Rails, Sidekiq implements this pattern).

Usually if the worker crashes, the message gets redelivered, but once you consume the message, it's *gone*. No replay, no going back.
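Those two semantics — redelivery on failure, deletion on success — can be sketched in a few lines. This is a toy in-memory model, not any real broker's API:

```
import queue

# Toy model of worker-queue semantics: a message is redelivered if the
# worker fails, but once acknowledged (consumed successfully) it is
# gone for good -- there is no log left to replay.
class WorkerQueue:
    def __init__(self):
        self._q = queue.Queue()

    def push(self, msg):
        self._q.put(msg)

    def pop(self):
        return self._q.get_nowait()

    def nack(self, msg):
        # Simulated redelivery: the failed message goes back on the queue.
        self._q.put(msg)

q = WorkerQueue()
q.push("resize-image-42")

msg = q.pop()
q.nack(msg)      # worker "crashed": the broker redelivers
msg = q.pop()    # a retry sees the same message again
assert msg == "resize-image-42"

# After successful consumption, nothing is left to replay.
assert q._q.empty()
```

Real brokers add visibility timeouts, dead-letter queues, and at-least-once delivery guarantees on top, but the core lifecycle is exactly this.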

### Durable streams with replay

One of the other big problems with a "simple" worker queue is that only *one* consumer group can handle a queue at once. If you want multiple worker groups to consume the queue, each at its own pace, you must replicate messages to multiple topics, or risk data loss by having your consumers do that replication after consumption.

This does work (for what it's worth I've seen this pan out at fairly large scales with the right kind of discipline), but there's a better option: durable, replayable streams.

This is where systems like NATS JetStream and Redis Streams come in. They maintain an ordered log of events on the server side, with consumer groups tracking their positions. The server typically retains at least a day of scrollback, though most of the time you only need a few hours.
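The key difference from a worker queue is that the server keeps the ordered log *and* a cursor per consumer group, so groups advance independently and can rewind. A toy model (not JetStream's or Redis Streams' actual API):

```
# Toy model of a server-side durable log with consumer groups: the
# server retains the ordered log plus each group's position, so groups
# advance independently and any of them can replay history.
class DurableLog:
    def __init__(self):
        self.records = []   # ordered, retained log
        self.cursors = {}   # consumer group -> next sequence number

    def append(self, record):
        self.records.append(record)

    def next_for(self, group):
        seq = self.cursors.get(group, 0)
        if seq >= len(self.records):
            return None     # caught up to the tail
        self.cursors[group] = seq + 1
        return self.records[seq]

log = DurableLog()
for event in ["signup", "login", "purchase"]:
    log.append(event)

# Two groups consume the same log at different paces.
assert log.next_for("billing") == "signup"
assert log.next_for("analytics") == "signup"
assert log.next_for("analytics") == "login"

# Replay: reset a group's cursor and it sees history again.
log.cursors["analytics"] = 0
assert log.next_for("analytics") == "signup"
```

Consuming a record here advances a cursor rather than deleting anything, which is exactly what makes multiple groups and replay possible.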

### Apache Kafka

And then there's the final boss of streaming infrastructure: Apache Kafka. Don't set it up from scratch; save that for your enemies. It is a byzantine abomination that really earns its name from the [kafkaesque](https://en.wikipedia.org/wiki/Franz_Kafka) process you need to go through to use it. Once it's set up it's fairly decent, but because it's the workhorse of many Fortune 500 companies, it has accumulated features that you'll probably never need at any point in your career.

As a result, you do get *everything* you could possibly want, at the cost of maintaining the sanity of a slowly dwindling group of experts who can type out JVM flags from memory. I don't think we should encourage that if we can avoid it.

### S2 is a durable stream store

S2 takes a different approach from the systems above. It's not a message queue with consumer groups — it's a stream storage API. Clients maintain their own read position by tracking sequence numbers, timestamps, or tail offsets. Any number of consumers can independently read the same stream at their own pace without any server-side coordination.

Where S2 really stands apart is durability. NATS JetStream acknowledges writes before they hit disk, flushing only every two minutes by default (Jepsen [found problems](https://jepsen.io/analyses/nats-jetstream-2.9.0) with this). Redis Streams is in-memory first. S2 writes to object storage *before* acknowledging. Your data is durable on S3-compatible storage before S2 tells you the write succeeded. This helps avoid the [MongoDB problem of data durability](https://youtu.be/b2F-DItXtZs).

This matters more than usual when the stream *is* the audit trail. If you're logging an agent's reasoning steps and tool calls, "acknowledged but lost" isn't a missing log line — it's a hallucinated decision with no provenance. You want the bytes on disk before the ack comes back.

## What S2 actually is

S2 is serverless storage for streams, accessible over HTTP (with SDKs and a CLI too). Every stream gets its own URL. S2 has these concepts:

* **Basins** are namespaces (akin to object storage buckets or other generic namespacing concepts)
* **Streams** are ordered sequences of records within a basin (an append-only log, essentially)
* **Records** are the individual messages: headers, a body (up to 1 MiB), a sequence number, and a timestamp

Three operations cover most of what you need: **append** records to a stream, **read** records from any position forward, and **check-tail** to see where the stream ends.
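Those three operations, plus client-side position tracking, are the whole mental model. Here's an in-memory sketch of the semantics — this models behavior only, not S2's actual HTTP or SDK surface:

```
# In-memory sketch of S2's three core operations -- append, read,
# check_tail -- with clients (not the server) tracking read positions.
class Stream:
    def __init__(self):
        self._records = []

    def append(self, body):
        seq = len(self._records)
        self._records.append((seq, body))
        return seq                       # sequence number of the new record

    def read(self, start_seq, limit=100):
        return self._records[start_seq:start_seq + limit]

    def check_tail(self):
        return len(self._records)        # next sequence to be assigned

events = Stream()
events.append(b"hello")
events.append(b"from s2")

# Each consumer keeps its own cursor; no server-side coordination.
indexer_pos, audit_pos = 0, 0
batch = events.read(indexer_pos)
indexer_pos += len(batch)

assert events.check_tail() == 2
assert indexer_pos == 2
assert audit_pos == 0    # the slow consumer lost nothing
```

Because the server never tracks who has read what, adding a tenth consumer is exactly as cheap as adding the first.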

The managed cloud service at [s2.dev](https://s2.dev) handles all of this serverlessly for you. But for this post, we're taking a look at [S2 Lite](https://github.com/s2-streamstore/s2): the open-source, self-hostable, MIT-licensed, single-binary implementation you can point at any S3-compatible object storage, including Tigris.

Under the hood, S2 Lite uses [SlateDB](https://slatedb.io), an embedded key-value database that stores its data entirely in object storage. SlateDB writes its SST files and WAL to S3. S2 Lite wraps SlateDB to provide the streaming API on top. The whole thing is written in Rust and ships as a single Docker image.

## Running S2 Lite on Tigris

First, create a bucket in Tigris to hold your S2 data. You can use the [Tigris CLI](/blog/tigris-cli/.md) or the dashboard:

```
tigris mk s2-streams
```

Then start S2 Lite, pointing it at your Tigris bucket:

```
docker run -p 8080:80 \
  -e AWS_ACCESS_KEY_ID=$TIGRIS_ACCESS_KEY \
  -e AWS_SECRET_ACCESS_KEY=$TIGRIS_SECRET_KEY \
  -e AWS_ENDPOINT_URL_S3=https://t3.storage.dev \
  ghcr.io/s2-streamstore/s2 lite \
  --bucket s2-streams \
  --path s2data
```

That's it. S2 Lite is now running locally, storing all stream data durably in Tigris. Since Tigris replicates globally, your stream data inherits that dynamic data placement. A consumer reading from Frankfurt gets low-latency access to data written from Virginia.
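If you'd rather keep this in version control, the same invocation translates to a Compose file. This is a sketch assembled from the `docker run` flags above (the variable substitution syntax assumes you export the Tigris keys in your shell):

```
# docker-compose.yml -- equivalent of the `docker run` command above
services:
  s2-lite:
    image: ghcr.io/s2-streamstore/s2
    command: ["lite", "--bucket", "s2-streams", "--path", "s2data"]
    ports:
      - "8080:80"
    environment:
      AWS_ACCESS_KEY_ID: ${TIGRIS_ACCESS_KEY}
      AWS_SECRET_ACCESS_KEY: ${TIGRIS_SECRET_KEY}
      AWS_ENDPOINT_URL_S3: https://t3.storage.dev
```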

Install the S2 CLI to interact with it:

```
brew install s2-streamstore/tap/s2
```

Configure it to talk to your local S2 Lite instance:

```
export S2_ACCOUNT_ENDPOINT=http://localhost:8080
export S2_BASIN_ENDPOINT=http://localhost:8080
export S2_ACCESS_TOKEN=ignored
```

Now create a basin and a stream:

```
s2 create-basin my-basin
s2 create-stream my-basin/events
```

Append some records and read them back:

```
echo "hello from s2" | s2 append my-basin/events
s2 read my-basin/events
```

Records go in, records come out. The main difference is that those records live in Tigris instead of on some disk somewhere. You can read them again tomorrow, next month, or whenever your retention policy allows. Records that don't fit in the server's memory are fetched seamlessly from Tigris.

## Building things with Bluesky

The use case that got me interested: ingesting the [Bluesky firehose](https://docs.bsky.app/docs/advanced-guides/firehose). The AT Protocol firehose pushes every post, like, follow, and block across the entire network over a WebSocket connection. It's a lot of data, and if your consumer crashes, you want to resume from where you left off without missing events.

The raw firehose is also a terrifying amount of data, enough that you'd want to avoid wasting Bluesky's bandwidth by running multiple redundant consumers. Using a durable stream store like S2 means multiple apps can consume the data without multiple raw firehose feeds being active at once.

With S2 on Tigris, you'd write a small service that reads from the firehose WebSocket and appends each event to an S2 stream. Downstream consumers (a search indexer, a feed generator, a moderation pipeline) each read from the same stream at their own pace. If one crashes, it picks up from its last sequence number. The data stays in Tigris for as long as you want it. I'd suggest maintaining the data for a day or so.
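The crash-recovery part of that pattern is just "persist the last sequence number you processed, resume from there." A simulation with a plain list standing in for the S2 stream (no real firehose or S2 calls here):

```
# Sketch of the resume pattern: a consumer checkpoints the last
# sequence number it processed, so after a crash it continues without
# missing or re-reading events. A list stands in for the S2 stream.
stream = [f"event-{i}" for i in range(10)]

def consume(stream, checkpoint):
    processed = []
    for seq in range(checkpoint, len(stream)):
        processed.append(stream[seq])
        checkpoint = seq + 1   # in real life, persist this per batch
    return processed, checkpoint

# First run dies after persisting checkpoint 4...
_, checkpoint = consume(stream[:4], 0)
assert checkpoint == 4

# ...and the restart picks up exactly where it left off.
processed, checkpoint = consume(stream, checkpoint)
assert processed == [f"event-{i}" for i in range(4, 10)]
assert checkpoint == 10
```

Each downstream consumer (indexer, feed generator, moderation pipeline) keeps its own checkpoint, so a crash in one never stalls the others.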

This pattern works for any high-volume event source: webhook ingestion, IoT telemetry, CDC streams from your database, logs from code execution sandboxes. S2 calls these "high-cardinality streams" because you can create a separate stream per user, per session, per device, or per agent without worrying about partition limits. Those limits are where systems like Kafka cause a lot of grief; being able to spin up arbitrary numbers of streams at a moment's notice is a blessing you won't really appreciate until you need it and can't have it.

## Coordinating distributed agents

Agents need an awkward combination of properties: each one bounded to its own execution context with no crosstalk between reasoning paths, but all of those contexts durable and visible to a supervisor or human reviewer. A stream per agent gives you that.

Give every agent its own S2 stream. Tool calls and model outputs get appended in order. When an agent forks a sub-task, spin up a new stream. When cohorts need to coordinate, let them tail each other. A supervisor agent, an evals pipeline, and a human reviewer can all read from the same data at their own pace without you fanning anything out.

To debug why an agent reached a conclusion, replay from sequence zero. The reasoning trace sits durably in Tigris in the order the agent produced it, which is what makes evals and post-hoc analysis tractable.
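The stream-per-agent shape is small enough to sketch. Everything below is illustrative — the agent names, step strings, and `record` helper are made up for the example:

```
from collections import defaultdict

# Sketch of the stream-per-agent pattern: every agent appends its
# reasoning steps to its own ordered log, and a reviewer replays any
# agent's trace from sequence zero with no crosstalk between agents.
streams = defaultdict(list)   # agent id -> ordered reasoning trace

def record(agent, step):
    streams[agent].append(step)

record("agent-1", "tool_call: search('tigris s2')")
record("agent-1", "observation: found blog post")
record("agent-2", "tool_call: fetch_feed()")
record("agent-1", "conclusion: use S2 Lite")

# Replay agent-1 from sequence zero: the full trace, in order.
trace = streams["agent-1"]
assert trace[0].startswith("tool_call")
assert trace[-1] == "conclusion: use S2 Lite"

# agent-2's interleaved work never leaks into agent-1's trace.
assert len(streams["agent-2"]) == 1
```

With real S2 streams, the dict lookup becomes a stream name and the replay becomes a read from sequence zero, but the isolation and ordering guarantees are the same.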

## Why object storage for streams

If you squint at a durable message stream, it looks a lot like an append-only sequence of monotonically increasing objects. Object storage is already the cheapest durable storage available, with [costs based on how much you are storing](https://www.tigrisdata.com/docs/pricing/). Using it as the backing store for streams means:

* **Storage costs stay low.** Tigris charges $0.02/GiB/month for storage. A terabyte of retained stream data costs $20/month, not the hundreds you'd pay for EBS volumes backing a Kafka cluster.
* **No disks to manage.** S2 Lite is a single stateless binary. There's no local disk state to lose, no RAID arrays to rebuild, no volume snapshots to coordinate. You can even run it on a diskless server that boots from the network.
* **Global distribution comes free.** Tigris replicates data to the regions where it's accessed. You don't configure this; it happens automatically.

The trade-off is latency. Object storage writes are slower than local NVMe drives, so S2 Lite on object storage will have higher tail latencies; a round trip over the internet by definition takes longer than a write to local storage.

To be clear though: this is fine for most stream processing workloads. Tigris' [global performance](https://www.tigrisdata.com/docs/concepts/regions/) will mitigate the worst of it. But keep in mind that you probably won't build a real-time stock trading platform on this.

## Try it out

S2 Lite is MIT-licensed and [on GitHub](https://github.com/s2-streamstore/s2). The managed cloud service at [s2.dev](https://s2.dev) has a free tier if you want to skip the self-hosting. Either way, Tigris makes a natural backend: the zero egress fees mean your consumers can read as much as they want without racking up transfer costs.

I set this up in my homelab to monitor the Bluesky firehose for my own private needs, and not having to worry about egress fees or paying The Big Cloud for the privilege of reading my own data is refreshing. Let's be real though: not having to wrangle JVMs is its own reward.

Next on my list is building something to monitor Hacker News and maybe ingest the contents into a vector database for advanced semantic search against Silicon Valley's collective subconscious. Stay tuned for that!

If you build something interesting with S2 on Tigris, come tell us about it in [our Discord](https://community.tigrisdata.com). Otherwise keep warm out there and we'll keep telling you about all the cool ways you can use object storage.

Durable streams for multi-agent systems

Tigris gives you globally distributed, S3-compatible object storage with zero egress fees. Point S2 Lite at it and start streaming.

[Get started with Tigris](https://www.tigrisdata.com/docs/get-started/)

**Tags:**

* [Build with Tigris](/blog/tags/build-with-tigris/.md)
* [Engineering](/blog/tags/engineering/.md)
* [Streaming](/blog/tags/streaming/.md)
* [Open Source](/blog/tags/open-source/.md)
* [AI](/blog/tags/ai/.md)
