Youssef Hosni (@youssefhosni95):

PaperBanana wraps gpt-image-1.5 or gemini-3-pro-image-preview in a 7-agent loop: retrieve 13 reference figures, plan the diagram in text, then run Visualizer against Critic until the image matches the methodology.

Single-shot image generation produces something diagram-shaped but unfaithful; the Critic re-reads the source context and requests revisions until the image is faithful. Includes an MCP server for Claude Code and Cursor.
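
The Visualizer-versus-Critic loop described above can be sketched roughly as follows. This is a minimal illustration, not PaperBanana's actual code: `refine`, `Critique`, `fake_visualize`, `fake_critique`, and the `max_rounds` cap are all hypothetical names and assumptions, and the stubs use plain strings in place of real image-model calls.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Critique:
    """Hypothetical critic verdict: is the image faithful, and if not, what to fix."""
    faithful: bool
    feedback: str


def refine(
    plan: str,
    source_context: str,
    visualize: Callable[[str], str],
    critique: Callable[[str, str], Critique],
    max_rounds: int = 3,  # assumed cap so the loop always terminates
) -> Tuple[str, List[Critique]]:
    """Generate, critique against the source, and regenerate until faithful."""
    prompt = plan
    history: List[Critique] = []
    image = ""
    for _ in range(max_rounds):
        image = visualize(prompt)
        verdict = critique(image, source_context)
        history.append(verdict)
        if verdict.faithful:
            break
        # Fold the critic's feedback into the next generation prompt.
        prompt = f"{plan}\nRevise: {verdict.feedback}"
    return image, history


# Stand-in stubs (assumptions): a real system would call an image model
# and a multimodal critic here instead of manipulating strings.
def fake_visualize(prompt: str) -> str:
    return f"image<{prompt}>"


def fake_critique(image: str, source_context: str) -> Critique:
    missing = [term for term in source_context.split() if term not in image]
    feedback = "add " + ", ".join(missing) if missing else ""
    return Critique(faithful=not missing, feedback=feedback)


image, history = refine(
    plan="encoder decoder",
    source_context="encoder decoder attention",
    visualize=fake_visualize,
    critique=fake_critique,
)
print(len(history), history[-1].faithful)
```

In this toy run the first draft omits a term from the source context, the critic flags it, and the second draft passes. The difference from single-shot generation is that the critic checks each draft against the source context rather than against the prompt alone.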

Unofficial implementation of the Google Research paper. 1.5k stars, MIT-licensed.

Repo link: github.com/llmsresearch…

May 18

12:32 PM
