Imagine if you had 6 options now. Do you do all 6 in parallel? That's the best, but the most wasteful. Even with tokens we do things like speculative decoding! So we have to find better ways of doing the mix.
Things get more complex when you don't know what H and L are. For instance, if you had GPT 5.5 and Opus 4.7, how do you choose which to route where? Or do you do both every time? That's when it gets interesting.