PrismML recently released Bonsai-8b (along with 4b and 1.7b variants), which they call "Concentrating Intelligence".
The models are indeed remarkably small in terms of disk footprint.
It is a fantastic idea, focused on modern edge devices,
BUT
it forgets the 1-bit LLM agenda: removing the bottleneck of GPU compute, from training all the way to inference.
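For context on what the 1-bit LLM agenda looks like in practice, here is a minimal sketch (my own illustration, assuming NumPy; this is the absmean ternary scheme popularized by BitNet b1.58, not anything PrismML has published):

```python
import numpy as np

def ternary_quantize(w: np.ndarray):
    """Absmean ternary quantization (BitNet b1.58 style):
    every weight collapses to -1, 0, or +1 plus one per-tensor scale,
    so matmuls reduce to additions/subtractions instead of FP multiplies."""
    scale = np.mean(np.abs(w)) + 1e-8        # per-tensor absmean scale
    q = np.clip(np.round(w / scale), -1, 1)  # ternary values in {-1, 0, +1}
    return q.astype(np.int8), float(scale)

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximate float tensor from the ternary weights."""
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4)).astype(np.float32)
q, s = ternary_quantize(w)
print(q)  # int8 matrix containing only -1, 0, and +1
```

Because the quantized weights are ternary, inference needs no floating-point multiplications in the matmul itself, which is exactly the property that makes commodity CPUs and edge chips viable and sidesteps the GPU bottleneck.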
medium.com/artificial-i…