System Design of Google Docs

Let’s design a collaborative editing system and see how to scale it.

Flowchart

Functional Requirements

  1. Tools for Collaborative Editing

Non-Functional Requirements

  1. Availability >> Consistency (availability is prioritized over consistency)

  2. Performance / Low Latency

  3. Security

  4. Reliability

Capacity Estimation

  1. Write-to-read ratio: 1:4 (4 reads for every write on a document)

  2. Average size of a Google Doc: 1-5 MB

  3. 500 million active users, each storing an average of 100 documents, which makes 50 billion documents stored globally

  4. Concurrent users editing: 1-2 million

  5. Total stored data: 50 billion documents at 1-5 MB each is roughly 50-250 PB of raw data; 250-500 PB is a plausible planning figure once replicas and revision history are counted

These figures are approximations; real numbers will vary with scale.
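As a quick sanity check on the numbers above, here is a small back-of-the-envelope sketch. The user count, documents per user, and document sizes come from the list; the replication factor of 3 is purely an assumption for illustration and is not a figure from this post.

```python
# Back-of-the-envelope storage estimate using the figures listed above.
ACTIVE_USERS = 500_000_000
DOCS_PER_USER = 100
DOC_SIZE_MB = (1, 5)        # low and high average size per document
REPLICATION_FACTOR = 3      # assumption for illustration only

MB_PER_PB = 1_000_000_000   # decimal units: 1 PB = 10^9 MB


def estimate_storage_pb(size_mb: int, replicas: int = 1) -> float:
    """Return total storage in petabytes for one average document size."""
    docs = ACTIVE_USERS * DOCS_PER_USER
    return docs * size_mb * replicas / MB_PER_PB


total_docs = ACTIVE_USERS * DOCS_PER_USER
raw_low, raw_high = (estimate_storage_pb(s) for s in DOC_SIZE_MB)
rep_low, rep_high = (
    estimate_storage_pb(s, REPLICATION_FACTOR) for s in DOC_SIZE_MB
)

print(f"documents: {total_docs:,}")
print(f"raw storage: {raw_low:.0f}-{raw_high:.0f} PB")
print(f"with {REPLICATION_FACTOR}x replication: {rep_low:.0f}-{rep_high:.0f} PB")
```

The raw product of document count and size lands at 50-250 PB; anything above that depends on how many replicas and how much revision history you decide to keep.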

Algorithms used for Collaborative Editing

Google Docs uses Operational Transformation (OT).
Notion uses Operational Transformation with some CRDT techniques.

Different Algorithms

  • Operational Transformation (OT)

  • Conflict free replicated data types (CRDTs)

  • Differential Synchronization

  • Event Sourcing

  • TreeDoc Algorithm

  • Wave (Google's early collaborative editing platform)

Comparison

| Algorithm | Strengths | Weaknesses | Usage |
| --- | --- | --- | --- |
| Operational Transformation (OT) | Robust real-time editing with concurrent user handling. Highly optimized. | Requires a central server to coordinate transformations. | Google Docs, Etherpad, CKEditor |
| Conflict-Free Replicated Data Types (CRDTs) | Decentralized, supports offline-first applications. Conflict-free merging. | Higher memory and computational overhead. | Automerge, peer-to-peer apps |
| Differential Synchronization | Simple for syncing small text-based changes. | Not optimized for real-time editing. | Collaborative text editing |
| Event Sourcing | Tracks full edit history. Ideal for audit trails. | Harder to handle concurrency. Event logs can grow large. | Complex multi-user apps |
| Treedoc | Efficient, decentralized for text editing. | GUID overhead and complexity. | Text-based collaborative editing systems |
| Wave | Handles rich media and complex document structures. | Discontinued; limited support. | Early influences on collaboration tools |

Operational Transformation (OT)

Operational Transformation is a technique that enables real-time collaborative editing: multiple users make concurrent changes to a document, and the system keeps every copy consistent by resolving conflicts automatically. Each incoming operation is transformed against the concurrent operations that have already been applied, so it still has its intended effect on the current document state. This makes OT a foundational technology for applications like Google Docs and other collaborative platforms.

1. Basic Concepts

  • Operation: An edit made by a user, such as inserting or deleting text.

  • Document State: The current version of the document, represented as a sequence of characters or data elements.

  • Transformation: A method to modify operations so that they can be applied in a consistent manner, regardless of the order in which they are received.

2. Types of Operations

OT typically deals with two types of operations:

  • Insert: Adds a character or element at a specific position in the document.

  • Delete: Removes a character or element from a specific position.
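To make the two operation types concrete, here is a minimal sketch of how they could be modelled on a plain string. The names `Insert`, `Delete`, and `apply` are made up for this illustration; a real editor works on a richer document model than a flat string.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Insert:
    pos: int    # index at which the character is inserted
    char: str   # the character to insert


@dataclass(frozen=True)
class Delete:
    pos: int    # index of the character to remove


Operation = Union[Insert, Delete]


def apply(doc: str, op: Operation) -> str:
    """Return a new document state with `op` applied."""
    if isinstance(op, Insert):
        if not 0 <= op.pos <= len(doc):
            raise IndexError(f"insert position {op.pos} out of range")
        return doc[:op.pos] + op.char + doc[op.pos:]
    if not 0 <= op.pos < len(doc):
        raise IndexError(f"delete position {op.pos} out of range")
    return doc[:op.pos] + doc[op.pos + 1:]


doc = "AT"
doc = apply(doc, Insert(0, "H"))   # "HAT"
doc = apply(doc, Delete(1))        # "HT"
print(doc)
```

Keeping operations immutable and `apply` free of side effects makes it easy to replay or reorder them later, which is exactly what the transformation step needs.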

Let’s take an example. Sorry, I always prefer paper over online docs, so sorry 😭😭😭

Say the current doc contains the word AT, and we have two users, a and b. Let’s call them Andy and Bandy (sorry, but I promise to bring Candy for the next blog).

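Let’s say Andy decides to delete the character A, while Bandy decides to insert H before A to make the meaningful word HAT. Both edits target index 0, and together they should produce HT. But if each side simply applies the other's operation as it arrives, the two copies diverge: on Bandy's copy the delete at index 0 removes the new H instead of A, so what we get is

AT | HT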

We need a mechanism to reconcile these edits. We could definitely think about locks, but the document is supposed to be collaborative, and locking would make users wait on each other. Hence we turn to Operational Transformation, an algorithm that Google adopted for its collaborative editors and that is now widely used for collaboration-based editing.

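So, following the green part of the diagram, each side applies a transformed version of the other user's operation (written a' and b') instead of the original a and b, and both copies converge on the same text.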
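The AT example can be sketched in code. This is a minimal single-character transform written for illustration, assuming string documents and position-based operations; the names `transform` and `op_wins_tie` are hypothetical, and this is not Google's actual implementation.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Insert:
    pos: int
    char: str


@dataclass(frozen=True)
class Delete:
    pos: int


Op = Union[Insert, Delete]


def apply(doc: str, op: Optional[Op]) -> str:
    """Apply one operation; None means the operation became a no-op."""
    if op is None:
        return doc
    if isinstance(op, Insert):
        return doc[:op.pos] + op.char + doc[op.pos:]
    return doc[:op.pos] + doc[op.pos + 1:]


def transform(op: Op, against: Op, op_wins_tie: bool = False) -> Optional[Op]:
    """Rewrite `op` so it can run after `against` has already been applied."""
    if isinstance(against, Insert):
        if isinstance(op, Insert):
            before = op.pos < against.pos or (
                op.pos == against.pos and op_wins_tie
            )
            return op if before else Insert(op.pos + 1, op.char)
        # A delete at or after the inserted character shifts right by one.
        return op if op.pos < against.pos else Delete(op.pos + 1)
    # From here on, `against` is a Delete.
    if isinstance(op, Insert):
        return op if op.pos <= against.pos else Insert(op.pos - 1, op.char)
    if op.pos == against.pos:
        return None  # both users deleted the same character
    return op if op.pos < against.pos else Delete(op.pos - 1)


doc = "AT"
a = Delete(0)        # Andy deletes "A"
b = Insert(0, "H")   # Bandy inserts "H" before "A"

# Without transformation the two copies diverge.
naive_andy = apply(apply(doc, a), b)
naive_bandy = apply(apply(doc, b), a)

# With transformation each side applies the adjusted remote operation.
andy = apply(apply(doc, a), transform(b, a))
bandy = apply(apply(doc, b), transform(a, b))

print(naive_andy, naive_bandy)
print(andy, bandy)
```

On Bandy's side the delete is shifted from index 0 to index 1, so it removes A rather than the freshly inserted H, and both copies end up as HT. When two inserts land on the same position, the two sides need opposite `op_wins_tie` values (for example, decided by user ID) to stay in agreement.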

High Level Design

Deep dive

1. What is Google Docs?

  • Google Docs is an online word processor that lets you create and edit documents in your web browser.

  • It is part of Google Workspace and works well with other Google tools like Drive and Sheets.

2. Real-Time Collaboration

  • Multiple people can work on a document at the same time, and everyone can see changes instantly.

  • Google Docs uses a system called Operational Transformation to manage these changes and avoid conflicts.

3. Document Structure

  • Documents are organized in a tree-like structure, starting from the main document down to sections, paragraphs, and images.

  • This structure makes it easy to manage and edit the content.
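A tree-like structure of this kind can be sketched as follows. The `Node` class and the node kinds used here are assumptions made for illustration, not the real Google Docs data model.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Node:
    kind: str                      # e.g. "document", "section", "paragraph", "image"
    text: str = ""
    children: List["Node"] = field(default_factory=list)

    def walk(self, depth: int = 0) -> Iterator[Tuple[int, "Node"]]:
        """Yield (depth, node) pairs in document order."""
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)


doc = Node("document", children=[
    Node("section", "Introduction", [
        Node("paragraph", "Collaborative editing in a nutshell."),
        Node("image", "flowchart.png"),
    ]),
    Node("section", "Design", [
        Node("paragraph", "Operations are sent to a central server."),
    ]),
])


def outline(root: Node) -> List[str]:
    """Render the tree as indented lines, one per node."""
    return [f"{'  ' * depth}{node.kind}: {node.text}".rstrip()
            for depth, node in root.walk()]


def plain_text(root: Node) -> str:
    """Collect only the paragraph text, in document order."""
    return "\n".join(n.text for _, n in root.walk() if n.kind == "paragraph")


print("\n".join(outline(doc)))
```

Because every edit touches one node, a change to a paragraph does not require rewriting its sibling sections.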

4. Editing Features

  • You can easily add, delete, or change text without disrupting the whole document.

  • Each change is simple to implement, making editing fast and straightforward.

5. Version History

  • Google Docs automatically saves your work and keeps a history of all changes.

  • You can go back to earlier versions of the document if you need to, which is great for tracking changes.
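One simple way to picture version history is an append-only log of operations that can be replayed up to any point. The sketch below assumes that model; `VersionedDoc` and its methods are hypothetical names, and a production system would most likely add periodic snapshots rather than replay from the start every time.

```python
from typing import List, Tuple

# ("insert", pos, char) or ("delete", pos, "")
Op = Tuple[str, int, str]


def apply(doc: str, op: Op) -> str:
    kind, pos, char = op
    if kind == "insert":
        return doc[:pos] + char + doc[pos:]
    if kind == "delete":
        return doc[:pos] + doc[pos + 1:]
    raise ValueError(f"unknown operation: {kind}")


class VersionedDoc:
    """Keeps every edit so any earlier version can be rebuilt."""

    def __init__(self, initial: str = "") -> None:
        self.initial = initial
        self.log: List[Op] = []

    def edit(self, op: Op) -> int:
        """Record an edit and return the new version number."""
        self.log.append(op)
        return len(self.log)

    def at_version(self, version: int) -> str:
        """Rebuild the document by replaying the first `version` edits."""
        if not 0 <= version <= len(self.log):
            raise ValueError(f"no such version: {version}")
        doc = self.initial
        for op in self.log[:version]:
            doc = apply(doc, op)
        return doc

    def current(self) -> str:
        return self.at_version(len(self.log))


doc = VersionedDoc("AT")
doc.edit(("insert", 0, "H"))   # version 1: "HAT"
doc.edit(("delete", 1, ""))    # version 2: "HT"
doc.edit(("insert", 2, "!"))   # version 3: "HT!"

print(doc.current())
print(doc.at_version(1))
```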

6. Comments and Suggestions

  • Users can leave comments and suggestions linked to specific parts of the document, making it easy to provide feedback.

  • This allows for discussion without directly changing the main text.
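For a comment to stay attached to its text, its anchor has to move when someone edits earlier in the document. Here is a small sketch of that idea for insertions only, assuming comments are stored as character ranges; `Comment` and `shift_for_insert` are names invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Comment:
    start: int   # inclusive index of the first commented character
    end: int     # exclusive index just past the commented text
    text: str


def shift_for_insert(c: Comment, pos: int, length: int) -> Comment:
    """Adjust a comment anchor after `length` characters are inserted at `pos`."""
    if pos <= c.start:
        # Insert happened before the commented range: move the whole range.
        return Comment(c.start + length, c.end + length, c.text)
    if pos < c.end:
        # Insert happened inside the range: the range grows.
        return Comment(c.start, c.end + length, c.text)
    return c  # Insert happened after the range: nothing changes.


doc = "OT is neat"
comment = Comment(6, 10, "Maybe say why?")   # anchored to "neat"

inserted = "really "
new_doc = doc[:6] + inserted + doc[6:]
new_comment = shift_for_insert(comment, 6, len(inserted))

print(new_doc[new_comment.start:new_comment.end])
```

Deletions need the mirror-image adjustment, plus a rule for what happens when the commented text itself is deleted.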

7. Accessibility

  • You can access Google Docs from any device with internet access, whether it's a computer or a smartphone.

  • It also works offline, so you can edit documents without being connected to the internet.

8. Formatting Options

  • Google Docs offers many formatting tools, such as changing fonts, sizes, and colors, so you can make your documents look professional.

  • You can also insert images, tables, and links to enhance your content.

9. Add-ons

  • You can use third-party add-ons to add more features to Google Docs, like tools for managing citations or creating diagrams.

  • This customization helps you work more efficiently.

10. Security

  • Google ensures your documents are secure with strong protections like encryption and user login.

  • You can control who can view or edit your documents, keeping your information safe.

11. Performance

  • Google Docs uses techniques to load parts of the document only when needed, which speeds up the experience.

  • Frequently used parts of the document are cached for faster access, especially in large files.
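The combination of on-demand loading and caching can be sketched with a small least-recently-used cache over document chunks. Everything here is an assumption for illustration: the chunking scheme, the `ChunkCache` class, and the `fake_fetch` function that stands in for a network call.

```python
from collections import OrderedDict
from typing import Callable


class ChunkCache:
    """Loads document chunks on demand and keeps the most recent ones."""

    def __init__(self, fetch: Callable[[int], str], capacity: int = 2) -> None:
        self._fetch = fetch
        self._capacity = capacity
        self._chunks: "OrderedDict[int, str]" = OrderedDict()
        self.misses = 0   # how many times we had to go to the backend

    def get(self, index: int) -> str:
        if index in self._chunks:
            self._chunks.move_to_end(index)   # mark as most recently used
            return self._chunks[index]
        self.misses += 1
        chunk = self._fetch(index)
        self._chunks[index] = chunk
        if len(self._chunks) > self._capacity:
            self._chunks.popitem(last=False)  # evict the least recently used
        return chunk


def fake_fetch(index: int) -> str:
    # Stand-in for a request to the document storage service.
    return f"<contents of chunk {index}>"


cache = ChunkCache(fake_fetch, capacity=2)
cache.get(0)   # miss
cache.get(1)   # miss
cache.get(0)   # hit
cache.get(2)   # miss, evicts chunk 1
cache.get(1)   # miss again, evicts chunk 0

print(cache.misses)
```

Only the chunks a user actually scrolls to are fetched, and the ones they keep returning to stay in memory.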

12. Common Uses

  • People use Google Docs for many reasons, such as writing essays, collaborating on projects, or creating reports.

  • It is widely used across industries for most document work.
