Fine Tuning LLAVA for classification

#24

by J812 - opened Apr 28

J812

Apr 28

Hello, Thank you for this great model :) I tried to use LLAVA as a classification model, where I needed to classify pictures in multiple categories, so far it seems that the model performed well for classification in couple of the categories but not in all of them.
I am wondering if it would be possible to finetune this model with my own data sets for better performance for my use case. If so could you please refer me to a documentation/piece of code on how I can to fine-tune this model ? Thank you :)

nielsr

Llava Hugging Face org Apr 29

Definitely! Although for image classification LLaVa might be a bit of an overkill. One could probably just leverage smaller image classifiers like ConvNeXT or SigLIP for this purpose.

See the demo notebooks: https://github.com/huggingface/notebooks/blob/main/examples/image_classification.ipynb.

If you want to fine-tune LLaVa, here's a demo script for that using the TRL library: https://github.com/huggingface/trl/blob/main/examples/scripts/vsft_llava.py

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment