One skill every ML engineer has to master ↓

ML System design

Yes. And do you know why?

Because good ML system design has NOT changed at all in the last 5 years.

And it won't.

Why ❓

Because any ML system is (and will always be) made of 3 types of programs (aka pipelines):

1️⃣ → Feature pipelines that transform raw data into ML model features (e.g. vector embeddings) that are saved in a Feature Store or Vector DB.

2️⃣ → Training/Fine-tuning pipelines that read historical features from the Feature Store/Vector DB and generate a new model artifact, either by training from scratch or fine-tuning a base LLM. This model artifact is then pushed to a model registry.

3️⃣ → Inference pipelines that load the model from the registry, take the input from the client app (for example a vector of numerical features, or a text prompt), generate a prediction (or a generation), and return it to the client app.

This is a universal blueprint that, together with CI/CD workflows (aka MLOps), helps you build any ML system.
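
To make the 3 pipelines concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the in-memory `FeatureStore` and `ModelRegistry` classes, the function names, and the toy house-price data are hypothetical stand-ins, and a real system would use an actual feature store, model registry, and training library.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class FeatureStore:
    """Hypothetical in-memory stand-in for a Feature Store or Vector DB."""
    rows: list = field(default_factory=list)


@dataclass
class ModelRegistry:
    """Hypothetical in-memory stand-in for a model registry."""
    models: dict = field(default_factory=dict)


def to_feature(size_m2):
    """Toy transformation shared by the feature and inference pipelines."""
    return size_m2 / 100.0


def feature_pipeline(raw_records, store):
    """1) Transform raw data into features and save them in the store."""
    for record in raw_records:
        store.rows.append((to_feature(record["size_m2"]), record["price"]))


def training_pipeline(store, registry, name="price_model"):
    """2) Read historical features, fit a 1-D least-squares line,
    and push the resulting model artifact to the registry."""
    xs = [x for x, _ in store.rows]
    ys = [y for _, y in store.rows]
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in store.rows)
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    intercept = y_bar - slope * x_bar
    registry.models[name] = {"slope": slope, "intercept": intercept}


def inference_pipeline(registry, request, name="price_model"):
    """3) Load the model from the registry, take the client input,
    and return a prediction."""
    model = registry.models[name]
    feature = to_feature(request["size_m2"])
    return model["slope"] * feature + model["intercept"]


if __name__ == "__main__":
    store, registry = FeatureStore(), ModelRegistry()
    raw = [
        {"size_m2": 50, "price": 100},
        {"size_m2": 100, "price": 200},
        {"size_m2": 150, "price": 300},
    ]
    feature_pipeline(raw, store)
    training_pipeline(store, registry)
    # The toy data lies on a straight line, so this prints a value close to 240.
    print(inference_pipeline(registry, {"size_m2": 120}))
```

Note that the pipelines only talk to each other through the store and the registry, so each one can be developed, scheduled, and deployed on its own.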

Now you can go and thank Jim Dowling for the idea 💡

----

Hi there! It's Pau Labarta Bajo 👋

Every day I share free, hands-on content on production-grade ML to help you build real-world ML products.

Follow me on Substack so you don't miss what's coming next

#mlops #machinelearning

Dec 16 at 1:17 PM