Stream Training Datasets
Stream directly from object storage
Feed datasets to GPUs without staging locally.
Training jobs need data fed to the GPU fast enough that compute never stalls waiting on storage. With Tigris you can stream objects straight into PyTorch DataLoaders without staging anything locally, and the same bucket works from any cloud or region.
The S3 Connector for PyTorch reads objects directly from Tigris into your training loop. S3IterableDataset streams sequentially; S3MapDataset gives random access for shuffling. Each DataLoader worker automatically gets a distinct partition.
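A minimal sketch of that pattern, under stated assumptions: the `s3torchconnector` package, its `endpoint` and `region` keywords for S3-compatible stores, and a hypothetical bucket name and prefix.

```python
# Sketch: streaming a Tigris prefix with the S3 Connector for PyTorch.
# Assumptions: `s3torchconnector` is installed, its `endpoint`/`region`
# keywords accept an S3-compatible store, and the bucket is a placeholder.
TIGRIS_ENDPOINT = "https://t3.storage.dev"

def tigris_uri(bucket: str, prefix: str) -> str:
    """Build the s3:// URI that the connector's from_prefix expects."""
    return f"s3://{bucket}/{prefix.strip('/')}/"

def build_loader(batch_size: int = 32, num_workers: int = 4):
    # Imports kept local so the URI helper stays importable anywhere.
    from s3torchconnector import S3IterableDataset, S3MapDataset
    from torch.utils.data import DataLoader

    uri = tigris_uri("my-training-bucket", "imagenet/train")  # hypothetical

    # Sequential streaming; each DataLoader worker reads a distinct shard.
    streaming = S3IterableDataset.from_prefix(
        uri,
        region="auto",
        endpoint=TIGRIS_ENDPOINT,
        transform=lambda obj: obj.read(),  # S3Reader -> raw bytes
    )

    # Random access by index, for shuffled sampling:
    # shuffled = S3MapDataset.from_prefix(uri, region="auto", endpoint=TIGRIS_ENDPOINT)

    return DataLoader(streaming, batch_size=batch_size,
                      num_workers=num_workers, collate_fn=list)
```

Call `build_loader()` on a machine with Tigris credentials in the environment; the returned batches are lists of raw object bytes to decode in your training loop.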
Benefits
Near-local speed with TAG caching
Adding TAG (Tigris Acceleration Gateway) as a local S3-compatible caching proxy eliminates network round-trips after epoch 1. In benchmarks, warm epochs run 5.7× faster, workers needed to saturate the GPU drop from 16 to 4, and local NVMe cache provides ~200× throughput headroom over what the GPU can consume.
Multi-cloud without egress costs
GPU instances can run on any provider — Lambda, CoreWeave, Crusoe, or a hyperscaler. All nodes read from the same global Tigris bucket at the nearest replica, so there are no cross-cloud egress costs and no per-region storage to manage.
DataLoader best practices
Use pin_memory=True and persistent_workers=True on your DataLoader for faster host-to-GPU transfers and lower worker startup overhead between epochs.
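A minimal sketch of those settings, using a synthetic in-memory dataset in place of an S3-backed one; pin_memory is gated on CUDA availability only so the sketch also runs on CPU-only machines.

```python
# Sketch: DataLoader flags recommended above, on a stand-in dataset.
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(
    torch.randn(256, 3, 32, 32),          # fake images
    torch.zeros(256, dtype=torch.long),   # fake labels
)

loader = DataLoader(
    dataset,
    batch_size=32,
    num_workers=2,                         # parallel fetch/decode
    pin_memory=torch.cuda.is_available(),  # page-locked memory for fast H2D copies
    persistent_workers=True,               # keep workers alive across epochs
)

for epoch in range(2):                     # no worker respawn on epoch 2
    for images, labels in loader:
        pass  # images.cuda(non_blocking=True) on a GPU machine
```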
Works with any S3-compatible cache or accelerator
Tigris is a drop-in backend for the tools your infrastructure already uses. Any product that reads from S3-compatible storage works with Tigris out of the box. Point it at t3.storage.dev and your existing caching layer, GPU accelerator, or data loader keeps working: no code changes, no vendor lock-in.
This includes local caching proxies like TAG, distributed caching layers, parallel filesystems with S3 import (Weka, VAST, DDN, managed Lustre), GPU-direct storage solutions, and framework-native data loaders that accept an S3 endpoint.
Benefits
Local caches and proxies
S3-compatible caching proxies like TAG sit between your compute and Tigris, caching hot data on local NVMe. After the first read, subsequent requests are served at disk speed. Any proxy that speaks S3 can use Tigris as its upstream origin — just set the endpoint URL.
Distributed caching layers
Distributed caching systems that sit between compute frameworks and S3-compatible storage work with Tigris as a backend. Configure Tigris as the under-storage system and the caching layer handles tiered caching across your cluster — hot data stays on local SSD/NVMe, warm data spills to remote storage. This is especially useful for multi-tenant clusters where many jobs read overlapping datasets.
Parallel filesystems
Weka, VAST, DDN, and managed Lustre products all support S3 data import. Use Tigris as the durable source of truth and hydrate into the parallel filesystem before compute starts. The filesystem is provisioned only for the duration of the job, so you avoid paying for high-performance storage around the clock.
GPU-direct storage
NVIDIA's GPUDirect Storage enables direct NVMe-to-GPU data paths, bypassing the CPU during data loading. Combine with Tigris hydration to local NVMe or a parallel filesystem for maximum throughput.
Framework-native loaders
PyTorch's S3 Connector, Hugging Face datasets, and other framework loaders accept an S3 endpoint URL. Point them at Tigris and they stream data directly into your training loop: no staging, no custom integration code.
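For Hugging Face datasets, the endpoint is passed through fsspec/s3fs storage options. A sketch under stated assumptions: `datasets` and `s3fs` are installed, credentials come from the environment, and the bucket and file layout are placeholders.

```python
# Sketch: streaming from Tigris with Hugging Face `datasets` over s3fs.
import os

def tigris_storage_options() -> dict:
    """fsspec/s3fs options that route s3:// URLs to Tigris."""
    return {
        "key": os.environ.get("AWS_ACCESS_KEY_ID"),
        "secret": os.environ.get("AWS_SECRET_ACCESS_KEY"),
        "client_kwargs": {"endpoint_url": "https://t3.storage.dev"},
    }

def stream_corpus():
    # Local import: requires `datasets` and `s3fs` to be installed.
    from datasets import load_dataset

    return load_dataset(
        "parquet",
        data_files="s3://my-training-bucket/corpus/*.parquet",  # hypothetical
        storage_options=tigris_storage_options(),
        streaming=True,  # iterate records without downloading everything first
    )
```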