description:
Fixed a bug in PDF parsing: when the PDF was converted from a PPT file, the content layout comes out in reversed order. As shown in the picture below, if bboxes are sorted by their lower right corner, rectangle A is ranked behind rectangle B. But when the rectangles contain text, the text of rectangle A should be read before the text of rectangle B. I think using the upper left corner of the rectangle as the basis for bbox sorting fits most people's reading habits better.
So when switching the coordinate system, (x0, y0) should be kept as the upper left corner point of the rectangle.
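
To illustrate the idea, here is a minimal sketch, not the actual code of this PR. It assumes bboxes arrive as `(x0, y0, x1, y1)` in the PDF's native coordinate system (origin at the bottom left, y pointing up) and that the page height is known. The names `to_top_left_origin` and `reading_order` are hypothetical and only used for this example.

```python
from typing import List, Tuple

# (x0, y0, x1, y1); the meaning of y0/y1 depends on the coordinate system.
BBox = Tuple[float, float, float, float]


def to_top_left_origin(bbox: BBox, page_height: float) -> BBox:
    """Convert a bbox from a bottom-left origin (y up) to a top-left origin (y down).

    In the result, (x0, y0) is the upper left corner and (x1, y1) the lower right.
    """
    x0, y0, x1, y1 = bbox
    # The old top edge (y1) becomes the new y0 so that (x0, y0) stays upper left.
    return (x0, page_height - y1, x1, page_height - y0)


def reading_order(bboxes: List[BBox], page_height: float) -> List[BBox]:
    """Sort converted bboxes top to bottom, then left to right, by upper left corner."""
    converted = [to_top_left_origin(b, page_height) for b in bboxes]
    return sorted(converted, key=lambda b: (b[1], b[0]))


# Hypothetical layout: A is a tall rectangle that starts above B but ends below it.
rect_a = (10, 40, 50, 90)  # bottom-left-origin coordinates
rect_b = (60, 50, 90, 70)
ordered = reading_order([rect_b, rect_a], page_height=100)
```

In this example A starts higher on the page than B, so it comes first when sorting by the upper left corner. Sorting the same converted boxes by their lower right corner would put B first, which is the reversed order described above.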