Background & Concepts
vlcn allows applications to maintain their state locally and, if desired, merge that state with other peers or a central service at some future point in time. This merging process is guaranteed to converge all nodes to the same state and not run into conflicts.
- Development of offline & Local-First applications
- Apps that are responsive even in the face of network partitions
- Reduced cloud and infrastructure costs by leveraging compute and storage of client devices
vlcn also provides some primitives to simplify application development by allowing developers to subscribe to the database and react on changes. The same concept as the proposal made here.
The core technologies that make this possible are:
- CRDTs & CRRs
- Live Queries
SQLite is the world's most used embedded database. Your phone, tv, car, desktop, laptop, wifi route, etc. all probably have one or many instances of
SQLite embedded into them.
SQLite is extremely tiny and fast and embeds directly into your application, meaning queries take microseconds to fulfill.
SQLite, however, has little support for merging and applying changes from other
SQLite databases. The support it does have requires the developer to manually resolve conflicts.
CRDTs & CRRs
That brings us to CRDTs or "conflict free replicated data types." Any number of people can make concurrent writes to a
CRDT. As messages are exchanged between nodes in the system, all instances of the
CRDT are guaranteed to eventually converge to the same state.
CRDTs sound a bit magic but they can be incredibly dumb (a set that can only grow is a
CRDT), or much more complex (a data structure for capturing edits to rich text). The other thing to consider is whether or not the academic notion of "convergence" matches the expectations of your product use case. For the vast majority of use cases,
vlcn has found the answer to be yes. Many use cases can also get away with simple
CRDT algorithms such as LWW maps, LWW registers, and RW-sets.
At the time of this writing,
- Models tables as remove win set (RW-Sets)
- Models rows as last write wins maps (LWW Maps)
- Models columns as last write wins registers (LWW Registers)
In other words:
- If multiple users concurrently add many rows (all rows have differing primary keys), all rows will be retained.
- If a user deletes a row that another user updated, the delete wins
- If multiple users concurrently edit different columns of the same row (same primary key), the final state of the row will be a combination of everyone's values
- If multiple users edit the same column of the same row, the final state of the cell will be the write from the user who is deemed to have written last.
Add-wins (rather than remove-wins) semantics can easily be added by the developer via including a column that acts as an
vlcn doesn't support this by default given the current climate around privacy and "deleted means deleted."
CRDTs that are on the roadmap but not yet built:
- Sequence/Array & Rich Text CRDTs
- Multi-value registers
- *Fractional Indexing
One thing to point out is that
CRDTs are eventually consistent. I.e.,
vlcn is adding partition tolerant and eventually consistent data structures to a normally strongly consistent relational model. This introduces some new angles to consider when designing a schema that includes
CRDTs which is covered in docs/bits-table-requirements.
The composition of the set of CRDTs required to fulfill the table abstraction in a RDBMS is called a
Conflict Free Replicated Relation.
*while "fractional indexing" doesn't sound like a CRDT it needs some special care to be usable in a multi-writer system.
Live queries allow your application to:
- Express the data it needs
- Be notified whenever that data changes
This is especially helpful in situations where your database could be merging in updates from other peers or a server in the background. E.g., when doing any sort of live editing, collaboration or device sync.