If you're a data engineer and want to learn how big-tech companies do data engineering, read these articles:
◉ Uber
- Infra Evolution: lnkd.in/gqFspYhy
- Real-time infra: lnkd.in/gKe9sPEa
- Data infra: lnkd.in/g6M2Wwdq
- Solving the Spark shuffle problem: lnkd.in/gRzA3hBi
- Uber's motivation to create Hudi: lnkd.in/gvqCgbPD
- Data quality at Uber: lnkd.in/gJHeGeht
◉ LinkedIn
- Data infra: lnkd.in/gmWyDGt7
- 7 trillion messages with Kafka: lnkd.in/g2p5tf3R
- Real-time processing with Apache Beam: lnkd.in/dFzBU67q
- Metadata platform: lnkd.in/g4ftJghQ
- Replacing Kafka: lnkd.in/gaE8ydjB
◉ Netflix
- Real-time data infra: lnkd.in/gfPiJjHd
- Data engineering stack: lnkd.in/eXkQa5W4
- Operating Iceberg at scale: lnkd.in/g2MDDtJ3
- Data quality with Iceberg: lnkd.in/gqZN54qm
◉ Meta
- Real-time infra: lnkd.in/gvk6uWm7
- Modernizing the lakehouse infra: lnkd.in/gq-bZCCn
- Data Lineage at Scale: lnkd.in/gmMrxQsz
◉ DoorDash
- Real-time processing system: lnkd.in/gVBpXtNt
- Evolving the real-time processing platform with Iceberg: lnkd.in/gWXWnj4y
◉ Twitter
- 4 billion events in real-time daily: lnkd.in/gAgrcTMr
◉ Notion
- 200 billion data entities: lnkd.in/dQWJZ5c4
◉ Discord
- Evolving to handle trillions of data points: lnkd.in/dsMR95Zj
◉ ClickHouse
- How they build their internal data warehouse: lnkd.in/gUeATcxu
◉ Spotify
- How they build their data platform: lnkd.in/g6kC8bZE
◉ Walmart
- Why they chose Apache Hudi: lnkd.in/gi5cAFTy
◉ Airbnb
- How they build the semantic layer: lnkd.in/gkagEbah
(To be updated...)
--
I'm writing articles for 17,000+ data engineers worldwide. Join the community with a 50% discount on the annual subscription now: