issues Search Results · repo:pymupdf/RAG language:Python
184 results
As the title suggests.
The identified content is shown in the following figure:
Image (1143 x 973): https://github.com/user-attachments/assets/1403f9fb-571c-4d5f-94f8-3505e4e5f6bc ...
not a bug
neverlatetolearn0
- 3
- Opened 5 days ago
- #298
The following diagram is not extracted as an image by pymupdf4llm.to_markdown("uart.pdf", write_images=True), although it
should be.
Image (1665 x 800): https://github.com/user-attachments/assets/f39dd327-6d4c-484d-84ed-939e72ee25ce ...
bug
fix developed
xcpky
- 4
- Opened 10 days ago
- #296
We highly recommend posting bugs, issues, feature requests and discussions on our forum at forum.pymupdf.com.
jamie-lemon
- Opened 13 days ago
- #295
I have a PDF version of a picture (manually filled out form). When using pymupdf4llm.to_markdown(doc, page_chunks=True),
the page image is not detected. I believe this has to do with the size of the image ...
bug
fix developed
jmoreno11
- 7
- Opened 13 days ago
- #294
The to_markdown function extracts a table in Markdown format perfectly if the table in the PDF has borders, like this ...
Image (1233 x 401): https://github.com/user-attachments/assets/a730d77f-87c9-4912-9456-cbfde65af19f ...
wontfix
Aryabhattacharjee
- 2
- Opened 14 days ago
- #293
Summary
I'm encountering an issue where pymupdf4llm.to_markdown() returns an empty string for a specific PDF in version 0.0.25
(and also 0.0.24). However, the same file works correctly in version 0.0.17. ...
azhurb
- 4
- Opened on Jun 18
- #289
Instead of printing progress in to_markdown(), report the progress through a generator or a similar mechanism so that
UI-based applications can use it.
enhancement
devilsaint99
- 1
- Opened on Jun 16
- #288
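The idea in this request can be sketched in plain Python. This is only an illustration of generator-based progress reporting, not pymupdf4llm's API: `convert_with_progress` and `fake_convert` are hypothetical names, and the pages are plain strings standing in for real PDF pages.

```python
from typing import Callable, Iterator, Sequence, Tuple


def convert_with_progress(
    pages: Sequence[str],
    convert_page: Callable[[str], str],
) -> Iterator[Tuple[int, int, str]]:
    """Yield (pages_done, total_pages, markdown) after each page is converted."""
    total = len(pages)
    for done, page in enumerate(pages, start=1):
        yield done, total, convert_page(page)


def fake_convert(page_text: str) -> str:
    # Hypothetical stand-in for a real per-page conversion step.
    return page_text.strip() + "\n"


chunks = []
progress_log = []
for done, total, md in convert_with_progress([" page one ", " page two "], fake_convert):
    progress_log.append(f"{done}/{total}")  # a UI would update its progress bar here
    chunks.append(md)

markdown = "".join(chunks)
```

Because the caller drives the loop, nothing is printed by the conversion itself; a GUI or web application can decide how to display each `(done, total)` pair.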
Hello, an exception is raised when trying to extract the PDF text. It seems that some fonts are missing. The exception
occurs in versions 20 to 25, but not in versions around 14. Here's my PDF: output_first_20_pages.pdf ...
upstream
yumingmin88
- 1
- Opened on Jun 16
- #287
Hello @JorjMcKie,
I tried to extract Markdown text from the given file and got the given error. Could you please help identify
and fix the issue?
Here is my code:
import pymupdf4llm
md_text ...
wontfix
urvisism
- 3
- Opened on Jun 11
- #283
Hi,
With pymupdf 1.26.0 and pymupdf4llm 0.0.24, I recently found that text is sometimes duplicated, and the duplication
seems to occur several times in a row. Don't know if it is something ...
fix developed
IronK77
- 2
- Opened on Jun 10
- #282
