---
license: cc-by-nc-3.0
inference: false
datasets:
- wisdomik/QUILT-LLaVA-Instruct-107K
- wisdomik/Quilt_VQA
- wisdomik/QuiltVQA_RED
pipeline_tag: text-generation
tags:
- medical
- histopathology
- arxiv:2312.04746
extra_gated_prompt: >-
  Please read and agree to the following terms: 1. The requester details
  provided are not faked. 2. The model will not be used for commercial/clinical
  purposes and will be used for the purpose of scientific research only. 3. The
  data will not be re-distributed, published, copied, or further disseminated in
  any way or form whatsoever, whether for profit or not. 4. The relevant
  papers (Quilt-1M (https://quilt1m.github.io/) and Quilt-LLaVA
  (https://quilt-llava.github.io)) will be cited in any publication(s)
  that use this model/data.
extra_gated_fields:
  Email: text
  First and last name: text
  Affiliation: text
  Type of Affiliation:
    type: select
    options:
    - Academia
    - Industry
    - Other
  I want to use this model for:
    type: select
    options:
    - Research
    - Education
    - label: Other
      value: other
  I agree to the aforementioned terms of use: checkbox
---

<br>
<br>

<p align="center">
  <img src="https://quilt-llava.github.io/static/images/teaser.png" alt="fig2" width="70%"/>
</p>

# Quilt-LLaVA Model Card

## Model details

**Model type:**
[Quilt-LLaVA](https://quilt-llava.github.io/) is an open-source chatbot trained by fine-tuning LLaMA/Vicuna on images sourced from educational histopathology videos and on GPT-generated multimodal instruction-following data.
It is an auto-regressive language model, based on the transformer architecture.
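
A minimal usage sketch is shown below. It assumes the model is run through the LLaVA-style codebase that Quilt-LLaVA builds on; the `eval_model` helper and its arguments follow the upstream LLaVA repository, and the checkpoint id and image path are placeholders, so check the project page for the exact API and checkpoint name.

```python
# Minimal sketch, assuming the upstream LLaVA codebase (or the Quilt-LLaVA fork of it)
# is installed. Helper names follow upstream LLaVA; the checkpoint id and image path
# below are placeholders, not verified against this repository.
from llava.mm_utils import get_model_name_from_path
from llava.eval.run_llava import eval_model

model_path = "wisdomik/Quilt-Llava-v1.5-7b"  # placeholder: use the actual checkpoint id

# Single-turn visual question answering over one histopathology image.
args = type("Args", (), {
    "model_path": model_path,
    "model_base": None,
    "model_name": get_model_name_from_path(model_path),
    "query": "What type of tissue is shown in this image?",
    "conv_mode": None,
    "image_file": "path/to/histopathology_image.png",
    "sep": ",",
    "temperature": 0.2,
    "top_p": None,
    "num_beams": 1,
    "max_new_tokens": 512,
})()

eval_model(args)
```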


**Citation**
```bibtex
@article{seyfioglu2023quilt,
  title={Quilt-LLaVA: Visual Instruction Tuning by Extracting Localized Narratives from Open-Source Histopathology Videos},
  author={Seyfioglu, Mehmet Saygin and Ikezogwo, Wisdom O and Ghezloo, Fatemeh and Krishna, Ranjay and Shapiro, Linda},
  journal={arXiv preprint arXiv:2312.04746},
  year={2023}
}
```
**Model date:**
Quilt-LLaVA-v1.5-7B was trained in November 2023.

**Paper or resources for more information:**
https://quilt-llava.github.io/

## License
Llama 2 is licensed under the LLAMA 2 Community License, 
Copyright (c) Meta Platforms, Inc. All Rights Reserved.

**Where to send questions or comments about the model:**
https://github.com/quilt-llava/quilt-llava.github.io/issues

## Intended use
**Primary intended uses:**
The primary use of Quilt-LLaVA is research on medical large multimodal models and chatbots.

**Primary intended users:**
The primary intended users of these models are AI researchers.

We expect the model to be used primarily by researchers to better understand the robustness, generalization, and other capabilities, biases, and constraints of large vision-language generative models for histopathology.

## Training dataset
- 723K filtered image-text pairs from QUILT-1M https://quilt1m.github.io/.
- 107K GPT-generated multimodal instruction-following data from QUILT-Instruct https://huggingface.co/datasets/wisdomik/QUILT-LLaVA-Instruct-107K.
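
The instruction-tuning data listed above is hosted on the Hugging Face Hub. Below is a minimal sketch for inspecting it with the `datasets` library; it assumes the repository can be parsed by `load_dataset` directly and that the records follow the LLaVA instruction format (otherwise, download the raw JSON from the dataset page).

```python
# Minimal sketch for browsing QUILT-Instruct with the Hugging Face `datasets` library.
# Assumes the dataset repo is directly loadable by `load_dataset`; the record layout
# (e.g. a "conversations" field in LLaVA format) is an assumption, not verified here.
from datasets import load_dataset

ds = load_dataset("wisdomik/QUILT-LLaVA-Instruct-107K", split="train")
print(ds)      # number of rows and column names
print(ds[0])   # one instruction-following example
```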


## Evaluation dataset
A collection of four academic histopathology VQA benchmarks.