I just finished the Data Ingestion with dlt workshop as part of Data Engineering Zoomcamp 2026 (#dezoomcamp).

Here’s what stuck:

dlt in one line

dlt is a Python “data load tool” that moves data from sources (APIs, DBs) into destinations (DuckDB, BigQuery, etc.) with declarative config instead of custom ETL code.

What I built

A small pipeline that pulls NYC Yellow Taxi trip data from a paginated REST API into DuckDB: one source, one resource, offset pagination, and a pipeline.run(). No API key, no manual pagination logic.
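To appreciate what dlt takes off your plate, here is the kind of manual offset-pagination loop it replaces — a minimal sketch with a stubbed `fetch_page` standing in for the paginated taxi API (the stub and its field names are hypothetical, not the real endpoint):

```python
def fetch_page(offset, limit):
    # Stub standing in for the paginated REST API (hypothetical data).
    # A real call would be e.g. requests.get(url, params={"offset": offset, "limit": limit}).
    data = [{"trip_id": i} for i in range(1000)]
    return data[offset:offset + limit]

def extract_all(limit=100):
    # The boilerplate dlt's REST helpers handle for you:
    # request pages by offset until an empty page signals the end.
    rows, offset = [], 0
    while True:
        page = fetch_page(offset, limit)
        if not page:
            break
        rows.extend(page)
        offset += limit
    return rows
```

With dlt, this loop (plus retries, schema inference, and loading) collapses into declaring the resource and calling `pipeline.run()`.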

Concepts that clicked

  • Pipeline = runner (source → destination).

  • Source = where data comes from; resource = one stream → one table.

  • Run = Extract → Normalize → Load; dlt handles schema and loading.
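The three run stages can be sketched as plain functions. This is an illustration of the Extract → Normalize → Load shape, not dlt's actual internals; stdlib `sqlite3` stands in for DuckDB, and the records and schema are made up:

```python
import sqlite3

def extract():
    # Extract: pull raw records from the source (stubbed; dlt would page through the API here).
    yield {"trip_id": 1, "fare": "12.5"}
    yield {"trip_id": 2, "fare": "8.0"}

def normalize(records):
    # Normalize: coerce values so every row fits a stable, typed schema.
    for r in records:
        yield {"trip_id": int(r["trip_id"]), "fare": float(r["fare"])}

def load(rows, conn):
    # Load: create the destination table if needed and insert the rows.
    conn.execute("CREATE TABLE IF NOT EXISTS rides (trip_id INTEGER, fare REAL)")
    conn.executemany("INSERT INTO rides VALUES (:trip_id, :fare)", list(rows))
    conn.commit()

conn = sqlite3.connect(":memory:")
load(normalize(extract()), conn)
```

dlt does each stage for you, including inferring the `CREATE TABLE` schema from the data instead of hand-writing it.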

Verdict

Less code, fewer bugs, and a clear path from “API” to “queryable dataset.” I’m going to reuse this pattern for other APIs.

If you’re doing the Zoomcamp, the dlt workshop is a solid way to get from zero to a working pipeline in an afternoon.

#dezoomcamp #dataengineering #dlt #datapipeline

Mar 1 at 4:23 PM