Whole CRR Sync
Whole CRR (conflict free replicated relation) sync means that we want all of the CRRs on different databases to be identical after they merge.
What follows is an example of how to implement whole CRR sync to merge two databases together.
The base primitives around which everything is built are tables and crrs.
- Tables are just regular
sqlitetables which can be merged with other tables on other devices
You can have local-only data (regular tables) as well as synced data (crrs) in the same database. Anything that is not a
crr will not be synced.
For the rest of this guide we'll assume that you're making a collaborative presentation editor like Google Slides, Keynote or Powerpoint. This application will have completely local data (like selection state) and shared data (like decks, slides and slide contents).
provide keywords are just part of the documentation
framework to allow code cells to communicate by exporting and importing
variables from one another.
We can start by defining the base data model as normal
And then decide which of those should be synced and thus
crrs. We'll want to replicate the
component tables given those states should be shared across all devices and users. We won't replicate the tables which represent local-only state like
Now that we have a data model sketched out we can apply it to a database. Let's import the
crsqlite-wasm module and open a database in memory. Too learn how to persist data to disk, see /docs/js/persistence.
Then apply the schema we defined earlier to the database.
Great! Let's run a few commands to make sure that worked.
As you can see, all the tables were created as expected. There are a few internal tables as well which are used by crsqlite to track merge state.
You can modify the code above and press
shift + enter to re-run the cell.
Now that we have a database with a data model we can start writing data to it. Let's start by creating a deck along with a few slides and components.
And to check on things --
But what if we need to collaborate between multiple devices? We need to be able to replicate the data between devices. Let's see how that works.
The main primitive for this is crsql_changes which allows you to:
- pull changes since a given time
- apply changes from another db
While you can select all changes in the database, you can also narrow down to specific logical clock ranges, tables, primary keys and more.
Let us see what some of the changes in our DB looks like.
A bunch of stuff. The details don't really matter for now, just know that:
- The output of a select from
crsql_changescan be directly fed back into an insert into
crsql_changeson another database
- That a list of changes compresses well -- so don't worry about the repeated values
To understand everything in a chageset, see crsql_changes.
Let's see what happens when we apply a changeset to a new database.
Spin up a new DB that has no data in it.
One constraint is that databases must have the same set of CRRs in order to sync between them. So let's apply the schema to the new database.
Confirm that db2 is in fact empty.
And finally apply all changes from the first database to the second.
Now checking the contents of db2 post sync --
We can also perform concurrent edits on each DB and then merge them together.
In order to ensure we only sync what has changed since the last time we synced, we'll need to track the latest
db_version we've seen from our peers.
crsqlite provides a crsql_tracked_peers table that networking layers can use to track the
db_version of each peer. Networking layers are of course allowed to create their own tables to track this information as well.
crsql_tracked_peers table is essentially acting as a mechanism to store cursors that represent the last point in time that we synced with another peer or site. We use these values to only pull or push changes since the last time we pulled or pushed when talking with a peer.
db2 wants all changes from
db1 that it has not yet seen,
db2 can ask
db1 for changes where the
db_version is greater than the value it has stored for
You can experiment with the
db_version to see how it affects the changes that are returned.
You've seen how you can compose:
to build a sync layer for your application.
The basic ideas:
- Convert tables to conflict free replicated relations with
- Discover the unique id of each database with
- Pull and apply changes to/from databases with
- Keep track of what you've synced with
Another example implementation of the above: https://observablehq.com/@tantaman/cr-sqlite-basic-setup (opens in a new tab)