๐—ช๐—ต๐˜† ๐—ก๐—ฒ๐˜๐—ณ๐—น๐—ถ๐˜… ๐—ฑ๐—ฒ๐—ฐ๐—ถ๐—ฑ๐—ฒ๐—ฑ ๐˜๐—ผ ๐—ฎ๐—ฏ๐—ฎ๐—ป๐—ฑ๐—ผ๐—ป ๐˜๐—ต๐—ฒ ๐—–๐—ค๐—ฅ๐—ฆ ๐—ฎ๐—ฝ๐—ฝ๐—ฟ๐—ผ๐—ฎ๐—ฐ๐—ต

Netflix launched Tudum in late 2021 as its official fan website, a destination for exclusive interviews, behind-the-scenes content, and show-related material.

The team built it using CQRS to optimize read performance for serving content to fans.

The write path used a third-party CMS, with a dedicated ingestion service handling content updates via webhooks.

This service transformed CMS data into a read-optimized format and published it to Kafka.

But their CQRS implementation couldn't deliver the read-after-write consistency their editorial workflow required.

Tudum launched with standard CQRS separation:

Write path: third-party CMS with webhook-driven ingestion service.

Read path: Kafka → Cassandra → Page Data Service with near cache → Page Construction Service.

The ingestion pipeline transformed CMS data into read-optimized format (CDN URLs instead of asset IDs, hydrated movie metadata instead of placeholders). Kafka decoupled writes from reads, and Cassandra provided the query store.

But they had a consistency problem. Editorial previews required strong read-after-write consistency, while the architecture delivered eventual consistency with significant delays:

1. CMS webhook triggers the ingestion service

2. Ingestion service queries CMS APIs, validates, transforms, and produces to Kafka

3. Data Service Consumer processes Kafka message, writes to Cassandra

4. Near cache refreshes on a 60-second cycle

With N keys refreshed on a 60-second cycle, a freshly written key could sit stale for most of a minute on top of the pipeline latency. Growing content made this worse, and preview delays stretched to over 30 seconds.

So Netflix replaced Kafka, Cassandra, and the cache layer with RAW Hollow, their in-memory object database.

Key characteristics:

✅ Entire dataset is distributed across the application cluster's memory

✅ Compression reduces the memory footprint to 25% of the uncompressed size

✅ Eventual consistency by default, with strong consistency as a per-request option

✅ O(1) synchronous data access, no I/O per request

This enabled them to achieve significant performance improvements: editorial previews dropped from minutes to seconds, and page construction from 1.4 seconds to 0.4 seconds.

CQRS optimizes for scale and separation of concerns. However, when your dataset fits in memory and you require strong consistency, simpler approaches often prove more effective.

The "right" pattern depends on your specific constraints. Netflix optimized for its editorial workflow rather than for theoretical scalability limits it hadn't yet reached.

Architecture decisions should align with actual requirements, not anticipated ones.

Image: Netflix

Mar 11 at 9:15 AM