SmolLM2 1.7B is a very good alternative to Qwen2.5 1.5B/3B and Llama 3.2.
fully open (training data, training recipe, etc. all to be released)
as good as or better than models of similar size on most tasks
much smaller vocabulary => smaller activations => much cheaper fine-tuning!
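A quick way to see what the smaller vocabulary buys you is to compare the vocab sizes and the size of the per-step logits tensor. The sketch below is illustrative only: the model IDs are what I believe the Hub checkpoints are called, and the batch size / sequence length are arbitrary assumptions.

```python
# Rough sketch: smaller vocab => smaller embedding/unembedding layers and a much
# smaller logits tensor during fine-tuning. Model IDs, batch size and sequence
# length are assumptions, not numbers from the release.
from transformers import AutoConfig

for model_id in ["HuggingFaceTB/SmolLM2-1.7B", "Qwen/Qwen2.5-1.5B"]:
    cfg = AutoConfig.from_pretrained(model_id)
    # logits for one step are batch * seq_len * vocab_size floats (fp32 here)
    logits_gb = 8 * 2048 * cfg.vocab_size * 4 / 1e9
    print(f"{model_id}: vocab_size={cfg.vocab_size}, logits/step ≈ {logits_gb:.1f} GB")
```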
The 135M version is also much better than the first iteration of SmolLM 135M, but I still struggle to find applications for this model other than educational purposes.
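For those educational purposes, a minimal sketch (assuming the checkpoint is published as HuggingFaceTB/SmolLM2-135M-Instruct) that is small enough to run on a laptop CPU:

```python
# Tiny demo of the 135M model; the model ID is an assumption about the Hub name.
from transformers import pipeline

generator = pipeline("text-generation", model="HuggingFaceTB/SmolLM2-135M-Instruct")
print(generator("Explain what a tokenizer does:", max_new_tokens=60)[0]["generated_text"])
```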
Nov 3, 2024 at 5:01 PM