Lots of commentary about RLMs and how they're amazing. I couldn't quite grok how, so this is my "we have RLMs at home" approach, which I've been using in bits and pieces for a while, now pulled together and shared.
Basically, the thesis is that everything is, or should be, a tool call. So when I'm coding in codex, I don't want the LLM's context window to get blown out, or /compact to screw things up. This is fixable in two ways:
1. Just store the transcript and tell the model about it so it can grep when needed
2. Reduce the size of the transcript by wiring instructions into /compact (which you can do in codex: point experimental_compact_prompt_file at your prompt)
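For option 2, the wiring might look something like this in the codex config. The key name comes from the note above; the file path and where the config lives are my assumptions, so check your own setup:

```toml
# Sketch of a codex config entry (assumed location: ~/.codex/config.toml).
# experimental_compact_prompt_file is the key mentioned above; the path
# is illustrative.
experimental_compact_prompt_file = "~/.codex/compact-prompt.md"
```

The prompt file itself is just your instructions for how /compact should summarize: keep the objective, keep progress so far, keep pointers to where the full details live.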
Which means you can take a "regular" codex, compare it to a "memex" codex, and see which works better. Lo and behold, the memex one does, because of course it does!
And why wouldn't it? You're reducing context rot directly by telling GPT to compact the conversation while keeping the relevant portions in some structured way, so you don't lose the objective and progress, but can still go look up the full info if and when needed.
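The "go look up the full info" half (option 1 above) is just a greppable transcript. A minimal sketch of what such a tool could look like — the function names (`log_turn`, `grep_transcript`) and the transcript format are my inventions, not anything codex ships:

```python
import re
from pathlib import Path

# Hypothetical helper: append each turn to a plain-text transcript file,
# so the full conversation survives even after /compact trims the context.
def log_turn(transcript: Path, role: str, text: str) -> None:
    with transcript.open("a", encoding="utf-8") as f:
        f.write(f"[{role}] {text}\n")

# Hypothetical "grep the transcript" tool the model could call when it
# needs a detail that got compacted away. Returns matching lines plus a
# little surrounding context.
def grep_transcript(transcript: Path, pattern: str, context: int = 1) -> list[str]:
    lines = transcript.read_text(encoding="utf-8").splitlines()
    rx = re.compile(pattern)
    hits: list[str] = []
    for i, line in enumerate(lines):
        if rx.search(line):
            lo, hi = max(0, i - context), min(len(lines), i + context + 1)
            hits.extend(lines[lo:hi])
    return hits
```

Expose `grep_transcript` as a tool, tell the model the transcript exists, and it can pull back exactly the slice it needs instead of carrying everything in context.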
I'm sure we can make this broader than just codex, but one step at a time.
I'm also like 90% sure this approach will scale, and OpenAI/Anthropic/Gemini/others will simply bake it into their next training runs.
Repo: