At first glance, document parsing might look like OCR, but it’s really a three-part problem.
First, you detect the layout (where are the blocks?); then you recognize the content (what’s inside those blocks?); and finally, you work out how everything fits together in the order a human would read it (what’s the logical flow?).
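To make the three stages concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `Block` type, the toy `page` dictionary, and the function names `detect_layout`, `recognize_content`, and `reading_order` are hypothetical and do not come from any particular library. The first two stages are stubs that read pre-filled data instead of running real models, and the reading-order step is a deliberately naive two-column heuristic.

```python
from dataclasses import dataclass


@dataclass
class Block:
    region_id: int   # index into the toy page's region list (hypothetical)
    kind: str        # e.g. "title" or "paragraph"
    x0: float        # left edge of the block
    y0: float        # top edge of the block
    text: str = ""   # filled in by the recognition stage


def detect_layout(page):
    """Stage 1: find where the blocks are (stub: reads boxes from the toy page)."""
    blocks = []
    for i, region in enumerate(page["regions"]):
        x0, y0, _x1, _y1 = region["bbox"]
        blocks.append(Block(region_id=i, kind=region["kind"], x0=x0, y0=y0))
    return blocks


def recognize_content(page, blocks):
    """Stage 2: read what is inside each block (stub: copies the stored text)."""
    for block in blocks:
        block.text = page["regions"][block.region_id]["ink"]
    return blocks


def reading_order(page, blocks):
    """Stage 3: naive order: left-column blocks top to bottom, then right-column."""
    mid = page["width"] / 2
    # A block counts as "left column" if its left edge is in the left half.
    return sorted(blocks, key=lambda b: (0 if b.x0 < mid else 1, b.y0))


def parse(page):
    blocks = detect_layout(page)
    blocks = recognize_content(page, blocks)
    return reading_order(page, blocks)


# Toy two-column page; regions are listed out of reading order on purpose.
page = {
    "width": 600,
    "regions": [
        {"kind": "paragraph", "bbox": (320, 100, 580, 300), "ink": "Right column"},
        {"kind": "title", "bbox": (20, 20, 580, 60), "ink": "Title"},
        {"kind": "paragraph", "bbox": (20, 100, 280, 300), "ink": "Left column"},
    ],
}

ordered_text = [block.text for block in parse(page)]
print(ordered_text)  # ['Title', 'Left column', 'Right column']
```

The point of the sketch is the separation of concerns, not the heuristics. A full-width footer, for example, would be sorted ahead of the right column by this rule, which is exactly the kind of case that makes the third stage a problem in its own right.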