This is a news item from last month, but it was a big deal. Twice the sparsity of DeepSeek V3, with significant mathematics and coding chops. I find it telling that these models are geared toward agentic applications, which typically require reliability and accuracy; in other words, fewer hallucinations and other LLM-related issues.
Aug 18 at 10:44 PM