This is part 2 of the NoSQL Data Modeling Series. The first part covered modeling NoSQL one-to-one relations. In this second part, we will explore two ways to model NoSQL one-to-many relations. The first uses an embedded NoSQL design pattern, and the second uses a separate collection with a relation field. We will also look at when each of these designs is the better choice.
Getting started
To follow along, you will need to get set up with Tigris. This is a quick and easy process. The Tigris Docs have a Quickstart to show you how to sign up. Once you have signed up, create a new project called relations. Then, create a new local application from a Tigris template to get started.
$ npx create-tigris-app@latest --project relations --example playground
Follow the prompts to add your clientId and clientSecret.
Embedded NoSQL one-to-many
Let us start with modeling NoSQL one-to-many relations with an embedded document. In the playground repository there is a Post model defined in src/db/models/post.ts. For our example we can reduce some of the fields. Edit it so that it looks like this:
import {
  Field,
  PrimaryKey,
  TigrisCollection,
  TigrisDataTypes,
} from "@tigrisdata/core";
@TigrisCollection("post")
export class Post {
  @PrimaryKey(TigrisDataTypes.UUID, { order: 1, autoGenerate: true })
  id?: string;
  @Field({ timestamp: "createdAt" })
  createdAt?: Date;
  @Field({ timestamp: "updatedAt" })
  updatedAt?: Date;
  @Field()
  title: string;
  @Field()
  content?: string;
  @Field({ default: false })
  published?: boolean;
}
This is a simplified model we will use to model blog posts. Next, add an embedded comments array to keep track of all the comments for this blog post.
import {
  Field,
  PrimaryKey,
  TigrisCollection,
  TigrisDataTypes,
} from "@tigrisdata/core";
// Embedded type: stored inside each Post document, not in its own collection
export class Comment {
  @Field()
  content: string;
  @Field()
  name: string;
  @Field({ timestamp: "createdAt" })
  createdAt?: Date;
}
@TigrisCollection("post")
export class Post {
  @PrimaryKey(TigrisDataTypes.UUID, { order: 1, autoGenerate: true })
  id?: string;
  @Field({ timestamp: "createdAt" })
  createdAt?: Date;
  @Field({ timestamp: "updatedAt" })
  updatedAt?: Date;
  @Field()
  title: string;
  @Field()
  content?: string;
  @Field({ default: false })
  published?: boolean;
  // All comments for this post live in the post document itself
  @Field(TigrisDataTypes.ARRAY, { elements: Comment })
  comments: Array<Comment>;
}
In the above code, we define a new Comment class ahead of the Post class. We then embed it as an array in our Post object through the new comments field at the end of the class.
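To make the embedded shape concrete, here is a minimal sketch of what a single post document carries once comments are embedded. It uses plain TypeScript interfaces rather than the Tigris decorators, and the names EmbeddedPost, EmbeddedComment, and the placeholder id are assumptions made up for illustration only.

```typescript
// Plain TypeScript shapes mirroring the Post and Comment models above.
// These interface names are hypothetical and exist only for this sketch.
interface EmbeddedComment {
  content: string;
  name: string;
  createdAt?: Date;
}

interface EmbeddedPost {
  id?: string;
  title: string;
  content?: string;
  published?: boolean;
  comments: EmbeddedComment[];
}

// One document holds the post and every comment that belongs to it.
const examplePost: EmbeddedPost = {
  id: "00000000-0000-0000-0000-000000000001", // placeholder id, not a real record
  title: "A book review of Designing Data-Intensive Applications",
  published: false,
  comments: [
    { content: "Thank you for the excellent book review.", name: "Henry" },
    { content: "I strongly disagree.", name: "Angry Max" },
  ],
};

// Reading the comments needs no second lookup: they arrive with the post.
const commenterNames: string[] = examplePost.comments.map((c) => c.name);
console.log(commenterNames);
```

The trade-off is that every new comment rewrites the whole post document, which matters once a post starts receiving many comments.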
When there is a new comment, we will fetch the post document from Tigris and append the latest comment. Open up index.ts and add the following code to do that:
import { Tigris } from "@tigrisdata/core";
import { Post } from "./db/models/post";
// setup client
const tigrisClient = new Tigris();
async function setup() {
  // ensure branch exists, create it if it needs to be created dynamically
  await tigrisClient.getDatabase().initializeBranch();
  // register schemas
  await tigrisClient.registerSchemas([Post]);
}
async function main() {
  await setup();
  const db = await tigrisClient.getDatabase();
  const postCollection = await db.getCollection<Post>(Post);
  // create the post with an empty comments array
  const post = await postCollection.insertOne({
    title: "A book review of Designing Data-Intensive Applications",
    content: `
Designing Data-Intensive Applications by Martin Kleppmann is a must read
for anyone that loves learning about databases and building distributed systems.
It covers a large range of topics on the complexity of building distributed systems.
It is one of my favorite technical books.
`,
    comments: [],
  });
  // append the comments to the in-memory document
  post.comments.push({
    content:
      "Thank you for the excellent book review. I've added it to my reading list",
    name: "Henry",
  });
  post.comments.push({
    content:
      "I strongly disagree, everyone should be learning distributed systems from reading academic papers only!",
    name: "Angry Max",
  });
  // write the whole document, including the new comments, back to Tigris
  const updated = await postCollection.insertOrReplaceOne(post);
  console.log(updated.comments);
  // await the query so we log the document rather than a pending promise
  const foundPost = await postCollection.findOne({
    filter: { title: "A book review of Designing Data-Intensive Applications" },
  });
  console.log("Fetching post from server");
  console.log(foundPost);
}
main()
  .then(async () => {
    console.log("Setup complete ...");
    process.exit(0);
  })
  .catch(async (e) => {
    console.error(e);
    process.exit(1);
  });
The main lines to focus on are in the body of main, from the insertOne call to the final console.log. Here we create a post and save it to Tigris. Then we add two comments to the comments array field in the document and save it to the server. Finally, we find the first document in the collection that matches the title and print it to the console. This is how you can work with one-to-many NoSQL relations using the embedded NoSQL design pattern.
This pattern works well for documents that rarely change, so there are very few writes of the document to the NoSQL database. However, if there are a lot of writes, as when a blog post receives many comments, concurrent updates to the same document will conflict. We can handle this by retrying the update: fetch the document again, add the comment, and try to save it again. But this is a tedious process and is prone to overwriting data.
Instead, we can solve this by breaking up the comments into their own collection and using Tigris's transactions.
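The conflict described above can be reproduced without a database. The sketch below is an in-memory simulation only: the version counter and the helper names readPost, replacePost, and replacePostIfUnchanged are assumptions invented for illustration and are not part of the Tigris API. It shows two clients overwriting each other, and then the same race handled with a version check and a retry.

```typescript
interface StoredPost {
  version: number; // hypothetical counter used to detect concurrent writes
  comments: string[];
}

// A single in-memory "document" standing in for the database record.
let storedPost: StoredPost = { version: 1, comments: [] };

// Return a copy, as a client would receive from the server.
function readPost(): StoredPost {
  return { version: storedPost.version, comments: [...storedPost.comments] };
}

// Blind replace: whatever the client sends overwrites the stored document.
function replacePost(doc: StoredPost): void {
  storedPost = { version: storedPost.version + 1, comments: [...doc.comments] };
}

// Conditional replace: reject the write if the document changed since it was read.
function replacePostIfUnchanged(doc: StoredPost): boolean {
  if (doc.version !== storedPost.version) return false;
  replacePost(doc);
  return true;
}

// Race 1: two clients read the same document, then both write blindly.
const clientA = readPost();
const clientB = readPost();
clientA.comments.push("from Henry");
clientB.comments.push("from Angry Max");
replacePost(clientA);
replacePost(clientB);
const afterBlindWrites = [...storedPost.comments]; // ["from Angry Max"], Henry's comment is lost

// Race 2: the same sequence, but stale writes are rejected and retried.
storedPost = { version: 1, comments: [] };
const clientC = readPost();
const clientD = readPost();
clientC.comments.push("from Henry");
clientD.comments.push("from Angry Max");
const firstWrite = replacePostIfUnchanged(clientC); // accepted
const secondWrite = replacePostIfUnchanged(clientD); // rejected, the document moved on

// Retry: fetch the latest document, re-apply the change, and save again.
const freshCopy = readPost();
freshCopy.comments.push("from Angry Max");
const retryWrite = replacePostIfUnchanged(freshCopy); // accepted
const afterRetry = [...storedPost.comments]; // both comments are kept

console.log(afterBlindWrites, firstWrite, secondWrite, retryWrite, afterRetry);
```

Every busy post pays this read, modify, retry cycle on each comment, which is the tedium the separate-collection design removes.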
Reference one-to-many NoSQL relations
Breaking up the comments into their own collection avoids write conflicts on the post document, but it means a query requires a read from two collections. We will query the post collection to get the post we want and then the comment collection to get all the comments for that specific post. Below is what is required to change the Tigris schema to follow this NoSQL design pattern:
import {
  Field,
  PrimaryKey,
  TigrisCollection,
  TigrisDataTypes,
} from "@tigrisdata/core";
@TigrisCollection("comment")
export class Comment {
  // relation field: the id of the post this comment belongs to
  @Field(TigrisDataTypes.UUID)
  postId: string;
  @Field()
  content: string;
  @Field()
  name: string;
}
@TigrisCollection("post")
export class Post {
  @PrimaryKey(TigrisDataTypes.UUID, { order: 1, autoGenerate: true })
  id?: string;
  @Field({ timestamp: "createdAt" })
  createdAt?: Date;
  @Field({ timestamp: "updatedAt" })
  updatedAt?: Date;
  @Field()
  title: string;
  @Field()
  content?: string;
  @Field({ default: false })
  published?: boolean;
}
In the above snippet, we have removed the reference to the comments in the Post collection and instead have created a new collection called comment. Now update the index.ts file to support the new collection:
import { Tigris } from "@tigrisdata/core";
import { Post, Comment } from "./db/models/post";
// setup client
const tigrisClient = new Tigris();
async function setup() {
  // ensure branch exists, create it if it needs to be created dynamically
  await tigrisClient.getDatabase().initializeBranch();
  // register schemas
  await tigrisClient.registerSchemas([Post, Comment]);
}
async function main() {
  await setup();
  const db = await tigrisClient.getDatabase();
  const postCollection = await db.getCollection<Post>(Post);
  const commentCollection = await db.getCollection<Comment>(Comment);
  const post = await postCollection.insertOne({
    title: "A book review of Designing Data-Intensive Applications",
    content: `
Designing Data-Intensive Applications by Martin Kleppmann is a must read
for anyone that loves learning about databases and building distributed systems.
It covers a large range of topics on the complexity of building distributed systems.
It is one of my favorite technical books.
`,
  });
  if (!post.id) {
    throw new Error("post id was not set");
  }
  // insert each comment on its own, as separate web requests would;
  // await each insert so the comments exist before we read them back
  await commentCollection.insertOne({
    postId: post.id,
    content:
      "Thank you for the excellent book review. I've added it to my reading list",
    name: "Henry",
  });
  await commentCollection.insertOne({
    postId: post.id,
    content:
      "I strongly disagree, everyone should be learning distributed systems from reading academic papers only!",
    name: "Angry Max",
  });
  // read the post and its comments inside one transaction
  const tx = await db.beginTransaction();
  const foundPost = await postCollection.findOne(
    {
      filter: {
        title: "A book review of Designing Data-Intensive Applications",
      },
    },
    tx
  );
  if (!foundPost) {
    throw new Error("post not found");
  }
  console.log("Fetching post from server");
  console.log(foundPost);
  const commentCursor = await commentCollection.findMany(
    {
      filter: { postId: foundPost.id },
    },
    tx
  );
  for await (const comment of commentCursor) {
    console.log(`"${comment.name}" wrote "${comment.content}"`);
  }
  await tx.commit();
}
main()
  .then(async () => {
    console.log("Complete ...");
    process.exit(0);
  })
  .catch(async (e) => {
    console.error(e);
    process.exit(1);
  });
We import the new Comment collection at the top of the file and add it to our NoSQL database schema in the registerSchemas call.
The main query changes start right after the post is created, where we insert each comment individually. In this small example, we could use insertMany to batch insert the comments. However, I've intentionally inserted them individually as if inserted by separate requests triggered via web requests.
With beginTransaction we start a transaction to read from the post and comment collections. We create a transaction so that we get a single consistent snapshot from the two collections. Inside it, we first query for the specific post with findOne, and then we fetch all comments related to the post with findMany. This allows us to have a high number of consistent reads and writes for this blog post without causing write conflicts and making sure our data is always consistent.
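Stripped of the database client, the two-step read described above amounts to a lookup followed by a filter on the relation field. The following is a minimal in-memory sketch of that idea. The arrays, the placeholder ids, and the helper findPostWithComments are hypothetical stand-ins for the two collections, not Tigris calls, and the sketch does not model transactions.

```typescript
interface PostDoc {
  id: string;
  title: string;
}

interface CommentDoc {
  postId: string; // relation field pointing at PostDoc.id
  name: string;
  content: string;
}

// Two separate in-memory "collections" with made-up ids.
const postDocs: PostDoc[] = [
  { id: "post-1", title: "A book review of Designing Data-Intensive Applications" },
  { id: "post-2", title: "Another post" },
];

const commentDocs: CommentDoc[] = [
  { postId: "post-1", name: "Henry", content: "Thank you for the excellent book review." },
  { postId: "post-2", name: "Sam", content: "Unrelated comment on another post." },
  { postId: "post-1", name: "Angry Max", content: "I strongly disagree." },
];

// Step 1: find the post. Step 2: collect the comments that reference its id.
function findPostWithComments(
  title: string
): { post: PostDoc; comments: CommentDoc[] } | undefined {
  const post = postDocs.find((p) => p.title === title);
  if (!post) return undefined;
  const comments = commentDocs.filter((c) => c.postId === post.id);
  return { post, comments };
}

const lookup = findPostWithComments(
  "A book review of Designing Data-Intensive Applications"
);
for (const comment of lookup?.comments ?? []) {
  console.log(`"${comment.name}" wrote "${comment.content}"`);
}
```

Adding a comment here only appends to the comment collection and never rewrites the post, which is why concurrent commenters no longer conflict with each other.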
The two different NoSQL one-to-many relation design patterns work well when used correctly. If you are not doing a lot of updates and not adding a lot of documents related to the parent, then the embedded NoSQL design pattern will be the most efficient. However, referencing a separate collection is the best pattern if many documents link to the parent document.
Thanks for following along. The final blog post in the NoSQL Data Modeling Series will cover NoSQL many-to-many relations.