HuggingFaceM4
/

Idefics3-8B-Llama3

@@ -18,8 +18,6 @@ library_name: transformers
     <img src="https://huggingface.co/HuggingFaceM4/idefics-80b/resolve/main/assets/IDEFICS.png" alt="Idefics-Obelics logo" width="200" height="100">
 </p>
-**Transformers version**: until the next Transformers pypi release, please install Transformers from source and use [this PR](https://github.com/huggingface/transformers/pull/32473) to be able to use Idefics3. TODO: change when new version.
 # Idefics3
 Idefics3 is an open multimodal model that accepts arbitrary sequences of image and text inputs and produces text outputs. The model can answer questions about images, describe visual content, create stories grounded on multiple images, or simply behave as a pure language model without visual inputs. It improves upon [Idefics1](https://huggingface.co/HuggingFaceM4/idefics-80b-instruct) and [Idefics2](https://huggingface.co/HuggingFaceM4/idefics2-8b), significantly enhancing capabilities around OCR, document understanding and visual reasoning.

     <img src="https://huggingface.co/HuggingFaceM4/idefics-80b/resolve/main/assets/IDEFICS.png" alt="Idefics-Obelics logo" width="200" height="100">
 </p>
 # Idefics3
 Idefics3 is an open multimodal model that accepts arbitrary sequences of image and text inputs and produces text outputs. The model can answer questions about images, describe visual content, create stories grounded on multiple images, or simply behave as a pure language model without visual inputs. It improves upon [Idefics1](https://huggingface.co/HuggingFaceM4/idefics-80b-instruct) and [Idefics2](https://huggingface.co/HuggingFaceM4/idefics2-8b), significantly enhancing capabilities around OCR, document understanding and visual reasoning.