Importing data
Tigris supports importing datasets in JSON and CSV formats, with automatic schema inference. The file format is detected automatically.
JSON format
The dataset for import can be a JSON array of documents or a stream of newline-delimited documents. Every document must conform to the JSON specification.
The import process recognizes the standard JSON types:
- float64
- string
- boolean
- object
- array
In addition, Tigris infers the following types:
- UUID - string in UUID format
- Time - string in RFC3339 format
- Binary array - base64 encoded string
- Int64
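As an illustration, inference of these extended string types can be sketched in Python. The function name and the exact matching rules here are hypothetical, not Tigris's actual implementation:

```python
import base64
import binascii
import uuid
from datetime import datetime

def infer_string_type(value: str) -> str:
    """Hypothetical sketch of extended-type inference for string values.

    Tigris's real rules are internal; this only mirrors the documented
    categories: UUID, RFC 3339 time, and base64-encoded binary.
    """
    try:
        uuid.UUID(value)
        return "uuid"
    except ValueError:
        pass
    try:
        # RFC 3339 timestamps are close to ISO 8601; normalize the "Z" suffix.
        datetime.fromisoformat(value.replace("Z", "+00:00"))
        return "time"
    except ValueError:
        pass
    try:
        # Caveat: short plain strings can accidentally be valid base64,
        # which is one reason real inference needs more context than this.
        base64.b64decode(value, validate=True)
        return "binary"
    except binascii.Error:
        pass
    return "string"
```

A value that matches none of the extended formats stays a plain string.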
CSV format
CSV files for import should conform to RFC 4180
and must have a header with the field names as the first line of the file.
Nested field names should be separated by a .
(dot).
Tigris detects the same field types as in the JSON case, except for arrays, which are currently unsupported.
Importing using CLI
To import a dataset you need to be logged in to the Tigris instance and have the CLI installed.
Here is an example dataset in JSON and CSV formats:
- JSON
- CSV
{
"address": {
"city": "Bend",
"state": "Oregon"
},
"id": "14a0d22a-78c0-4a14-a5ea-294b02e52be6",
"name": "Allie",
"height": 164
}
{
"address": {
"city": "San Francisco",
"state": "California"
},
"id": "95f0561c-8367-4b75-9c54-7c26a36c3131",
"name": "Krystalle",
"height": 176
}
id,name,address.city,address.state,height
95f0561c-8367-4b75-9c54-7c26a36c3131,Krystalle,San Francisco,California,176
14a0d22a-78c0-4a14-a5ea-294b02e52be6,Allie,Bend,Oregon,164
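To make the dotted-header convention concrete, here is a small Python sketch showing how a header like address.city maps onto a nested document. The helper name is hypothetical and not part of the Tigris CLI:

```python
import csv
import io

def rows_to_documents(csv_text):
    """Turn CSV rows with dotted headers into nested documents."""
    docs = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        doc = {}
        for key, value in row.items():
            target = doc
            # "address.city" -> parents ["address"], leaf "city"
            *parents, leaf = key.split(".")
            for part in parents:
                target = target.setdefault(part, {})
            target[leaf] = value
        docs.append(doc)
    return docs

csv_text = """id,name,address.city,address.state,height
95f0561c-8367-4b75-9c54-7c26a36c3131,Krystalle,San Francisco,California,176
"""
docs = rows_to_documents(csv_text)
```

Note that CSV values arrive as strings; type inference (for example, height becoming an integer) is applied afterwards.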
Assuming the above example is saved to a file, it can be imported by running:
- JSON
- CSV
tigris import --project test users <users.json
tigris import --project test users <users.csv
This automatically creates the users
collection if it doesn't exist.
After a successful import the data is available for querying. You can
navigate to the web console and search through the data.
The created collection's schema can be checked by running:
tigris --project test describe collection users
or by using the web console, where the schema in TypeScript, Go, and Java is generated as well.
Output
{
"collection": "users",
"schema": {
"title": "users",
"properties": {
"address": {
"type": "object",
"properties": {
"city": {
"type": "string"
},
"state": {
"type": "string"
}
}
},
"height": {
"type": "integer"
},
"id": {
"type": "string",
"format": "uuid"
},
"name": {
"type": "string"
}
},
"primary_key": ["id"]
}
}
Modifying import behavior
- To prevent unintentional appending to an existing collection, appending is rejected by default. Use the --append option to force it.
- By default, the field named "id" is set as the primary key of the collection. To alter that, use the --primary-key option with a comma-separated list of the field names that constitute the primary key.
- Fields can be marked as autogenerated by the server with the --autogenerate option, which also takes a comma-separated list of field names.
- An alternative CSV delimiter and comment character can be set with the --csv-delimiter and --csv-comment options.
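Putting these options together, a combined invocation might look like the following. This is a sketch using the flags above with the example users collection; the chosen key fields are purely illustrative:

```shell
tigris import --project test --append \
  --primary-key id,name \
  --autogenerate id \
  users <users.json
```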