therealcyberlord/vit-indian-food

therealcyberlord committed on Apr 9, 2024

Commit 85ebd16 · verified · 1 Parent(s): 86e95ff

Update README.md

Files changed (1)

README.md +24 -9

@@ -8,12 +8,27 @@ metrics:
 - recall
 ---
-Fine-tuned ViT on the Indian Food Dataset: https://huggingface.co/datasets/bharat-raghunathan/indian-foods-dataset
-Evaluation metrics on the testing set (961 images):
- • accuracy: 0.9667
- • precision: 0.9670
- • recall: 0.9667

+# Indian Food Classification with Vision Transformer (ViT)
+## Overview
+This model is a fine-tuned Vision Transformer (ViT) for the task of classifying images of Indian foods. The model was trained on the [Indian Foods Dataset](https://huggingface.co/datasets/bharat-raghunathan/indian-foods-dataset) from Hugging Face Datasets.
+## Dataset
+The Indian Foods Dataset contains 4,770 images across 15 different classes of popular Indian dishes. The dataset is split into:
+- Training: 3,047 images
+- Validation: 762 images
+- Testing: 961 images
+## Model
+The base model used is the vision transformer (google/vit-base-patch16-224-in21k). The model was fine-tuned on the Indian Foods Dataset for 10 epochs using the AdamW optimizer with a learning rate of 2e-4.
+## Evaluation
+The model was evaluated on the test set and achieved the following metrics:
+- Accuracy: 0.9667
+- Precision: 0.9670
+- Recall: 0.9667
+## Usage
+You can use this pre-trained model directly from Hugging Face
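
The Usage text above stops before showing any code. A minimal sketch of loading the model might look like the following, assuming the `transformers` library (with a backend such as PyTorch) and Pillow are installed and that the repository works with the standard `image-classification` pipeline. The helper names `top_prediction` and `classify` and the path `dish.jpg` are hypothetical and are not part of the commit.

```python
from typing import Dict, List

# Repository name as shown on this page.
MODEL_ID = "therealcyberlord/vit-indian-food"


def top_prediction(predictions: List[Dict]) -> Dict:
    """Return the highest-scoring entry from pipeline-style output.

    The image-classification pipeline returns a list of dicts, each with
    a "label" and a "score" key.
    """
    return max(predictions, key=lambda p: p["score"])


def classify(image_path: str) -> Dict:
    """Classify one image (local path or URL) and return the best entry."""
    # Imported lazily so top_prediction stays usable without transformers.
    from transformers import pipeline

    # Downloads the model weights from the Hub on first use.
    classifier = pipeline("image-classification", model=MODEL_ID)
    return top_prediction(classifier(image_path))


# Example call (hypothetical image path, requires network access):
# result = classify("dish.jpg")
# print(result["label"], result["score"])
```

Building the pipeline inside `classify` keeps the sketch short; for repeated predictions it would be cheaper to create the pipeline once and reuse it.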