Does the original image information in the PDF need to be parsed? #37

ic-xu · 2024-04-28T01:29:33Z

Description

PDFs mix graphics and text. When doing RAG, the images in a PDF often contain important information, so we generally need to return the parsed images to the user as they are. I currently have a private version that performs image extraction, but I am not sure whether the main branch needs this functionality.

Filimoa · 2024-04-28T15:03:57Z

Open a PR and we can take a look!

Filimoa added the enhancement (New feature or request) label Apr 28, 2024
