Joshua (@joshuahong80):


Thank you for this essay. Your writing has brought forth one of the most damning parts of agentic system development: non-determinism. When you ship an agent, the workflow is usually fixed at ship time. But because the underlying LLM pipeline will change ever so slightly due to the non-deterministic nature of these frontier language models, the workflow must also adjust during post-production operation, and it normally doesn’t. This renders the agent unreliable and testing nearly impossible. This is why so many eval + observability platforms (LangSmith, Arize, W&B) are proliferating, but they don’t actually fix the issue. I will have to do it manually, and that sucks.

May 18

5:34 PM
