Metrics and Tracing
In addition to the reports and charts built into the Cloud console, Tigris provides an OpenMetrics / Prometheus-compatible metrics endpoint, which can be used to gather insights into the health, performance and usage of Tigris.
Tigris also supports tracing. At the moment only server-side tracing is available, and only with Datadog. In the near future, the Tigris server will support OpenTelemetry/OpenMetrics as well as client-side tracing, so that Tigris-level metrics can be tied to meaningful user sessions at the application level.
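Because the metrics endpoint speaks the Prometheus text format, it can be read with any HTTP client, not only a Prometheus server. The following is a minimal sketch using only the Python standard library. The URL (host and port) and the `tigris_requests_count_ok` metric name are assumptions made for illustration, and the parser handles only plain sample lines, not every corner of the exposition format.

```python
import re
from urllib.request import urlopen

# Hypothetical address: adjust host and port to your own deployment.
METRICS_URL = "http://localhost:8081/metrics"

# name{label="value",...} value [timestamp]
_SAMPLE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
_LABEL = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def fetch_metrics(url=METRICS_URL, timeout=5.0):
    """Download the raw text served by the metrics endpoint."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def parse_metrics(text):
    """Return a list of (metric_name, tags, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = _SAMPLE.match(line)
        if match is None:
            continue  # ignore anything this simple parser does not understand
        name, raw_tags, value = match.groups()
        tags = dict(_LABEL.findall(raw_tags or ""))
        samples.append((name, tags, float(value)))
    return samples


# Illustrative payload; the tigris_* metric name is made up for this example.
EXAMPLE = """\
# TYPE tigris_requests_count_ok counter
tigris_requests_count_ok{db="catalog",collection="products"} 42
tigris_requests_count_ok{db="catalog",collection="orders"} 8
go_goroutines 17
"""

samples = parse_metrics(EXAMPLE)
```

Against a running server, `parse_metrics(fetch_metrics())` would return the same kind of list. In production a Prometheus-compatible scraper is usually the better choice, since it also handles histograms, timestamps and retention.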
The metrics exposed by the Tigris server are counters and timers covering various areas of the server. They can be grouped by the layer they measure:
- Requests: metrics about the high-level gRPC/HTTP requests issued by the users of Tigris.
- Fdb: metrics about how Tigris uses the underlying FoundationDB.
- Search: metrics about how Tigris uses the underlying Typesense search engine.
- Session: metrics about query execution within the Tigris server.
- Size: metrics about the size of databases and collections.
- Net: metrics about the network traffic between the Tigris server and its clients.
- Auth: metrics about authentication.
- Go: standard Go process-level metrics.
Each metric group has tags, so you can easily drill down, for example, from global QPS (queries per second) to the QPS of a certain database or collection. If you are using Tigris Cloud, these metrics are the source of what is displayed in the web console.
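The drill-down described above amounts to computing the same rate while grouping by more and more tags. A minimal sketch follows, assuming two scrapes of one request counter taken 30 seconds apart. The tag names `db` and `collection` and all the numbers are illustrative assumptions, not values taken from a real Tigris server, and counter resets are not handled.

```python
from collections import defaultdict


def rate_by(before, after, interval_s, keys=()):
    """Per-second increase of a counter, grouped by the given tag names.

    `before` and `after` are lists of (tags, value) pairs from two scrapes.
    A series missing from `before` is treated as starting at zero.
    """
    def totals(samples):
        out = defaultdict(float)
        for tags, value in samples:
            out[tuple(tags.get(k, "") for k in keys)] += value
        return out

    t0, t1 = totals(before), totals(after)
    return {group: (t1[group] - t0.get(group, 0.0)) / interval_s for group in t1}


# Two hypothetical scrapes of the same counter, 30 seconds apart.
BEFORE = [
    ({"db": "catalog", "collection": "products"}, 100),
    ({"db": "catalog", "collection": "orders"}, 40),
    ({"db": "users", "collection": "profiles"}, 10),
]
AFTER = [
    ({"db": "catalog", "collection": "products"}, 400),
    ({"db": "catalog", "collection": "orders"}, 100),
    ({"db": "users", "collection": "profiles"}, 70),
]

global_qps = rate_by(BEFORE, AFTER, 30)                 # {(): 14.0}
qps_per_db = rate_by(BEFORE, AFTER, 30, keys=("db",))   # {("catalog",): 12.0, ("users",): 2.0}
qps_per_collection = rate_by(BEFORE, AFTER, 30, keys=("db", "collection"))
```

In a Prometheus setup the same grouping is normally expressed in the query language rather than in application code; the sketch only shows what the tags make possible.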
A line appears in the output of /metrics only after the corresponding metric has been recorded for the first time. This is normal for services that expose metrics in this way. Because of the rich tagging, it is impossible to pre-populate every metric with a zero counter: there is no way to know in advance which databases and collections will be created in the future.
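The behaviour described above can be illustrated with a toy tagged counter that creates a series only when a tag combination is first seen. This is a simplified sketch of the general pattern, not the Tigris server's actual implementation, and the metric name `requests_total` is hypothetical.

```python
class TaggedCounter:
    """A counter whose series are created lazily, one per tag combination."""

    def __init__(self, name):
        self.name = name
        self._series = {}  # maps a sorted tuple of (tag, value) pairs to a count

    def inc(self, **tags):
        """Increment the series for this tag combination, creating it if needed."""
        key = tuple(sorted(tags.items()))
        self._series[key] = self._series.get(key, 0) + 1

    def render(self):
        """Render one exposition line per series that exists so far."""
        lines = []
        for key, value in sorted(self._series.items()):
            labels = ",".join(f'{k}="{v}"' for k, v in key)
            lines.append(f"{self.name}{{{labels}}} {value}")
        return "\n".join(lines)


counter = TaggedCounter("requests_total")
empty_output = counter.render()          # no lines yet: nothing has been recorded

counter.inc(db="catalog", collection="products")
counter.inc(db="catalog", collection="products")
first_output = counter.render()
```

Consumers of the endpoint should therefore treat a missing series as zero rather than as an error.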