## Load text from PDF files (using PyMuPDF4LLM)
??? note "Note"
**Underlying Library:** `pymupdf4llm`
PyMuPDF is a high-performance Python library for extracting, analyzing, converting, and manipulating data in PDF (and other) documents.
You can interact with the underlying library and fine-tune the outputs via `**kwargs`.
Use it in our library with:
```python
from medrag_multi_modal.document_loader.text_loader import PyMuPDF4LLMTextLoader
```
For details and available `**kwargs`, please refer to the sources below.
**Sources:**
- [Docs](https://pymupdf.readthedocs.io/en/latest/pymupdf4llm/)
- [GitHub](https://github.com/pymupdf/PyMuPDF)
- [PyPI](https://pypi.org/project/pymupdf4llm/)
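
The `**kwargs` pass-through mentioned above can be pictured with a small, self-contained sketch. Everything in it is hypothetical: `fake_to_markdown` stands in for the underlying converter and `SketchTextLoader` for a loader class. Neither reflects the actual signatures of `pymupdf4llm` or `PyMuPDF4LLMTextLoader`, so check the sources listed above for the real options.

```python
from __future__ import annotations

from typing import Any, Optional


def fake_to_markdown(
    text: str, *, uppercase: bool = False, max_chars: Optional[int] = None
) -> str:
    """Hypothetical stand-in for an underlying converter with tunable options."""
    result = text.upper() if uppercase else text
    return result if max_chars is None else result[:max_chars]


class SketchTextLoader:
    """Hypothetical loader that forwards extra keyword arguments unchanged."""

    def __init__(self, pages: list[str]) -> None:
        self.pages = pages

    def load_data(
        self, start_page: int = 0, end_page: Optional[int] = None, **kwargs: Any
    ) -> list[dict[str, Any]]:
        # Options the loader itself does not use go straight to the converter.
        selected = self.pages[start_page:end_page]
        return [
            {"page_idx": start_page + offset, "text": fake_to_markdown(page, **kwargs)}
            for offset, page in enumerate(selected)
        ]


loader = SketchTextLoader(["first page", "second page", "third page"])
# `uppercase` and `max_chars` are not loader parameters; they reach the converter via **kwargs.
rows = loader.load_data(start_page=1, uppercase=True, max_chars=6)
print(rows)
```

The design point is that the loader only needs to know its own arguments (here the page range); any other keyword argument is handed to the converter untouched, so the converter's options stay available without the loader re-declaring them.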
::: medrag_multi_modal.document_loader.text_loader.pymupdf4llm_text_loader