Things still annoying about codex
- it freaks out about any token usage
- constantly runs smoke tests instead of real tests, keeps watering them down
- fixated on reproducibility of experiments and messes with random variables
- tries to be too clever and solve everything at once, just slow down!