DuckDB Quickstart

DuckDB is an in-process embedded analytical database optimized for fast queries, ease of use, and embedding efficient analytics inside existing applications. It's the SQLite for analytics, but with a stronger focus on data analytics, aggregation, and Online Analytical Processing (OLAP) queries.

To get started using DuckDB with Tigris, you need to do the following things:

Create new bucket at storage.new if you don't have one
Create an access keypair for that bucket at storage.new/accesskey
Configure DuckDB to use Tigris
Query files in Tigris

(OPTIONAL) Create a new bucket

Open storage.new in your web browser.

Give your bucket a name and select what storage tier it should use by default. As a general rule of thumb:

Standard is the default. If you're not sure what you want, pick standard.
Infrequent Access is cheaper than Standard, but charges per gigabyte of retrieval.
Instant Retrieval Archive is for long-term storage where you might need urgent access at any moment.
Archive is for long-term storage where you don't mind having to wait for data to be brought out of cold storage.

Click "Create".

Create an access keypair for that bucket

Open storage.new/accesskey in your web browser.

Give the keypair a name. This name will be shown in your list of access keys, so be sure to make it descriptive enough that you can figure out what it's for later.

You can either give this key access to all of the buckets you have access to or grant access to an individual bucket by name. Type the name of your bucket and give it Editor permissions.

Click "Create".

Copy the Access Key ID, Secret Access Key, and other values into a safe place such as your password manager. Tigris will not show you the Secret Access Key again.

Configure DuckDB to use Tigris

Open the DuckDB command line in your terminal:

$ duckdb
v1.2.2 7c039464e4
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
D

Create a new secret with the CREATE SECRET command:

CREATE OR REPLACE SECRET tigris
  (      TYPE  s3
  ,  PROVIDER  config
  ,    KEY_ID  'tid_access_key_id'
  ,    SECRET  'tsec_secret_access_key'
  ,    REGION  'auto'
  ,  ENDPOINT  't3.storage.dev'
  , URL_STYLE  'vhost'
  );

Query files in DuckDB

Once you're in, you can query files in your bucket like you would normally, just open them with the s3://<bucketname> prefix. For example, you can import the data from the LinkedIn Data Jobs Dataset by creating a local table like this:

CREATE TABLE IF NOT EXISTS jobs
  ( id               INT64  NOT NULL
  , title            TEXT   NOT NULL
  , company          TEXT   NOT NULL
  , "location"       TEXT   NOT NULL
  , link             TEXT   NOT NULL
  , source           TEXT   NOT NULL
  , date_posted      DATE   NOT NULL
  , work_type        TEXT   NOT NULL
  , employment_type  TEXT   NOT NULL
  , "description"    TEXT   NOT NULL
  );

And then importing the data from Tigris:

INSERT INTO jobs
SELECT * FROM
  READ_CSV
    ( 's3://xe-duckdb/clean_jobs.csv'
    , header = true
    , columns =
      {              'id': 'INT64'
      ,           'title': 'TEXT'
      ,         'company': 'TEXT'
      ,        'location': 'TEXT'
      ,            'link': 'TEXT'
      ,          'source': 'TEXT'
      ,     'date_posted': 'DATE'
      ,       'work_type': 'TEXT'
      , 'employment_type': 'TEXT'
      ,     'description': 'TEXT'
      }
    );

Now you can query it like normal:

SELECT
  *
FROM
  jobs
WHERE
  location LIKE 'Seattle%';

(OPTIONAL) Create a new bucket​

Create an access keypair for that bucket​

Configure DuckDB to use Tigris​

Query files in DuckDB​

(OPTIONAL) Create a new bucket

Create an access keypair for that bucket

Configure DuckDB to use Tigris

Query files in DuckDB