## Load text from PDF files (using PyMuPDF4LLM)

??? note "Note"
    **Underlying Library:** `pymupdf4llm`

    PyMuPDF is a high-performance Python library for data extraction, analysis, conversion, and manipulation of PDF (and other) documents. `pymupdf4llm` builds on it to convert PDF content to Markdown.

    You can interact with the underlying library and fine-tune the outputs via `**kwargs`.

    Use it in our library with:
    ```python
    from medrag_multi_modal.document_loader.text_loader import PyMuPDF4LLMTextLoader
    ```

    For details and available `**kwargs`, please refer to the sources below.

    **Sources:**

    - [Docs](https://pymupdf.readthedocs.io/en/latest/pymupdf4llm/)
    - [GitHub](https://github.com/pymupdf/PyMuPDF)
    - [PyPI](https://pypi.org/project/pymupdf4llm/)
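
The `**kwargs` pass-through described in the note can be sketched as follows. This is only an illustration of the forwarding pattern, not the loader's actual implementation: `page_to_markdown` and `fake_convert` are hypothetical names, and `fake_convert` is a stand-in that mimics the call shape of `pymupdf4llm.to_markdown` (a document path, a `pages` list, and further keyword options) so the example runs without a PDF or the real library.

```python
from typing import Any, Callable, Dict, List


def page_to_markdown(
    convert: Callable[..., str],
    pdf_path: str,
    page_idx: int,
    **kwargs: Any,
) -> Dict[str, Any]:
    """Convert a single page, forwarding extra options to the converter.

    `convert` is assumed to accept a document path, a `pages` list of
    0-based page indices, and arbitrary keyword options, and to return text.
    """
    text = convert(pdf_path, pages=[page_idx], **kwargs)
    return {"page_idx": page_idx, "text": text}


def fake_convert(path: str, pages: List[int], **options: Any) -> str:
    # Hypothetical stand-in: reports what it received instead of parsing a PDF.
    return f"{path} pages={pages} options={sorted(options)}"


# Any keyword the wrapper does not consume reaches the converter unchanged.
result = page_to_markdown(fake_convert, "paper.pdf", 0, write_images=False)
print(result["text"])
```

With `pymupdf4llm` installed, `pymupdf4llm.to_markdown` could be passed as `convert`, and an option such as `write_images=False` would be forwarded as-is. Which options the loader itself accepts is documented in the sources above.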

::: medrag_multi_modal.document_loader.text_loader.pymupdf4llm_text_loader