Fine Tuning LLAVA for classification
Hello, Thank you for this great model :) I tried to use LLAVA as a classification model, where I needed to classify pictures in multiple categories, so far it seems that the model performed well for classification in couple of the categories but not in all of them.
I am wondering if it would be possible to finetune this model with my own data sets for better performance for my use case. If so could you please refer me to a documentation/piece of code on how I can to fine-tune this model ? Thank you :)
Definitely! Although for image classification LLaVa might be a bit of an overkill. One could probably just leverage smaller image classifiers like ConvNeXT or SigLIP for this purpose.
See the demo notebooks: https://github.com/huggingface/notebooks/blob/main/examples/image_classification.ipynb.
If you want to fine-tune LLaVa, here's a demo script for that using the TRL library: https://github.com/huggingface/trl/blob/main/examples/scripts/vsft_llava.py