motheecreator committed on
Commit
9057c4c
1 Parent(s): d7b7c78

Training in progress, epoch 5

Browse files
README.md CHANGED
@@ -1,62 +1,43 @@
  ---
  license: apache-2.0
- base_model: google/vit-base-patch16-224-in21k
  tags:
  - generated_from_trainer
  metrics:
  - accuracy
  model-index:
- - name: Facial Expression Recognition
  results:
  - task:
  name: Image Classification
  type: image-classification
  metrics:
  - name: Accuracy
  type: accuracy
- value: 0.8571428571428571
- pipeline_tag: image-classification
  ---

  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
  should probably proofread and complete it, then remove this comment. -->

- # Vision Transformer (ViT) for Facial Expression Recognition Model Card

- ## Model Overview
-
- - **Model Name:** [motheecreator/vit-Facial-Expression-Recognition](https://huggingface.co/motheecreator/vit-Facial-Expression-Recognition)
-
- - **Task:** Facial Expression/Emotion Recognition
-
- - **Datasets:** [FER2013](https://www.kaggle.com/datasets/msambare/fer2013), [MMI Facial Expression Database](https://mmifacedb.eu)
-
- - **Model Architecture:** [Vision Transformer (ViT)](https://huggingface.co/docs/transformers/model_doc/vit)
-
- - **Finetuned from model:** [vit-base-patch16-224-in21k](https://huggingface.co/google/vit-base-patch16-224-in21k)
-
- - Loss: 0.4353
- - Accuracy: 0.8571

  ## Model description

- The vit-face-expression model is a Vision Transformer fine-tuned for the task of facial emotion recognition.
-
- It is trained on the FER2013 and MMI Facial Expression datasets, which consist of facial images categorized into seven different emotions:
- - Angry
- - Disgust
- - Fear
- - Happy
- - Sad
- - Surprise
- - Neutral
-
- ## Data Preprocessing
-
- The input images are preprocessed before being fed into the model. The preprocessing steps include:
- - **Resizing:** Images are resized to the specified input size.
- - **Normalization:** Pixel values are normalized to a specific range.
- - **Data Augmentation:** Random transformations such as rotations, flips, and zooms are applied to augment the training dataset.

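For reference, the preprocessing steps described in the removed section above (resizing, normalization, augmentation) correspond to a fairly standard ViT input pipeline. Below is a minimal sketch of what such a pipeline might look like with `ViTImageProcessor` and `torchvision`; the specific augmentation parameters are illustrative assumptions, not settings recorded in this repository:

```python
# Sketch of a ViT preprocessing pipeline; augmentation parameters are
# illustrative assumptions, not the values used to train this model.
from transformers import ViTImageProcessor
from torchvision import transforms

processor = ViTImageProcessor.from_pretrained("google/vit-base-patch16-224-in21k")

# Normalize with the mean/std the base checkpoint's processor expects.
normalize = transforms.Normalize(mean=processor.image_mean, std=processor.image_std)

train_transforms = transforms.Compose([
    transforms.RandomResizedCrop(processor.size["height"]),  # resize + zoom-like crop
    transforms.RandomHorizontalFlip(),                       # random flips
    transforms.RandomRotation(10),                           # random rotations
    transforms.ToTensor(),
    normalize,                                               # pixel-value normalization
])

eval_transforms = transforms.Compose([
    transforms.Resize((processor.size["height"], processor.size["width"])),
    transforms.ToTensor(),
    normalize,
])
```
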
  ## Intended uses & limitations

@@ -72,16 +53,25 @@ More information needed

  The following hyperparameters were used during training:
  - learning_rate: 5e-05
- - train_batch_size: 32
- - eval_batch_size: 32
  - seed: 42
  - gradient_accumulation_steps: 4
- - total_train_batch_size: 128
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: linear
  - lr_scheduler_warmup_ratio: 0.1
- - num_epochs: 10

  ### Framework versions
@@ -89,4 +79,4 @@ The following hyperparameters were used during training:
  - Transformers 4.36.0
  - Pytorch 2.0.0
  - Datasets 2.1.0
- - Tokenizers 0.15.0

  ---
  license: apache-2.0
+ base_model: motheecreator/vit-Facial-Expression-Recognition
  tags:
  - generated_from_trainer
+ datasets:
+ - image_folder
  metrics:
  - accuracy
  model-index:
+ - name: vit-Facial-Expression-Recognition
  results:
  - task:
  name: Image Classification
  type: image-classification
+ dataset:
+ name: image_folder
+ type: image_folder
+ config: default
+ split: train
+ args: default
  metrics:
  - name: Accuracy
  type: accuracy
+ value: 0.7390639923591213
  ---

  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
  should probably proofread and complete it, then remove this comment. -->

+ # vit-Facial-Expression-Recognition

+ This model is a fine-tuned version of [motheecreator/vit-Facial-Expression-Recognition](https://huggingface.co/motheecreator/vit-Facial-Expression-Recognition) on the image_folder dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 0.8219
+ - Accuracy: 0.7391
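
Since the card keeps the `image-classification` task type, a checkpoint like this one can typically be loaded with the `transformers` pipeline API. A minimal usage sketch, not part of the auto-generated card; `face.jpg` is a placeholder path, not a file in this repository:

```python
# Minimal inference sketch for an image-classification checkpoint.
from transformers import pipeline

classifier = pipeline(
    "image-classification",
    model="motheecreator/vit-Facial-Expression-Recognition",
)

# Accepts a local path, URL, or PIL.Image; returns a list of
# {"label": ..., "score": ...} dicts sorted by score.
predictions = classifier("face.jpg")  # placeholder image path
print(predictions)
```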
  ## Model description

+ More information needed

  ## Intended uses & limitations
  The following hyperparameters were used during training:
  - learning_rate: 5e-05
+ - train_batch_size: 8
+ - eval_batch_size: 8
  - seed: 42
  - gradient_accumulation_steps: 4
+ - total_train_batch_size: 32
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: linear
  - lr_scheduler_warmup_ratio: 0.1
+ - num_epochs: 5
+
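The effective batch size above follows from per-device batch size times gradient accumulation: 8 × 4 = 32. A sketch of `TrainingArguments` matching these hyperparameters; `output_dir` is a placeholder, and the per-epoch evaluation strategy is an assumption inferred from the table below:

```python
# Sketch of TrainingArguments matching the hyperparameters listed above.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="vit-Facial-Expression-Recognition",  # placeholder
    learning_rate=5e-05,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=4,   # effective train batch size: 8 * 4 = 32
    seed=42,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=5,
    # Adam betas (0.9, 0.999) and epsilon 1e-08 are the library defaults.
    evaluation_strategy="epoch",     # assumption: one eval per epoch, as in the table
)
```
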
+ ### Training results

+ | Training Loss | Epoch | Step | Validation Loss | Accuracy |
+ |:-------------:|:-----:|:----:|:---------------:|:--------:|
+ | 0.7175 | 1.0 | 654 | 0.7081 | 0.7309 |
+ | 0.6952 | 2.0 | 1308 | 0.6931 | 0.7379 |
+ | 0.5041 | 3.0 | 1962 | 0.7038 | 0.7444 |
+ | 0.2461 | 4.0 | 2617 | 0.7843 | 0.7393 |
+ | 0.1846 | 5.0 | 3270 | 0.8219 | 0.7391 |

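The per-step history behind this table is stored in `trainer_state.json` (whose diff is too large to render below). A minimal sketch for inspecting it from a local clone of the repository; the flat file path is an assumption:

```python
# Sketch: inspect per-epoch eval metrics from trainer_state.json.
# Assumes the repository has been cloned locally.
import json

with open("trainer_state.json") as f:
    state = json.load(f)

# log_history holds one dict per logging/evaluation event.
for entry in state["log_history"]:
    if "eval_accuracy" in entry:
        print(entry["epoch"], entry["eval_loss"], entry["eval_accuracy"])
```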

  ### Framework versions

  - Transformers 4.36.0
  - Pytorch 2.0.0
  - Datasets 2.1.0
+ - Tokenizers 0.15.0
all_results.json CHANGED
@@ -1,13 +1,13 @@
  {
- "epoch": 10.0,
- "eval_accuracy": 0.8571428571428571,
- "eval_loss": 0.4352613687515259,
- "eval_runtime": 236.2313,
- "eval_samples_per_second": 108.097,
- "eval_steps_per_second": 3.378,
- "total_flos": 7.915696500716863e+19,
- "train_loss": 0.14746467811720712,
- "train_runtime": 9966.943,
- "train_samples_per_second": 102.483,
- "train_steps_per_second": 0.801
  }

  {
+ "epoch": 5.0,
+ "eval_accuracy": 0.7444126074498567,
+ "eval_loss": 0.7038247585296631,
+ "eval_runtime": 48.5993,
+ "eval_samples_per_second": 107.718,
+ "eval_steps_per_second": 13.478,
+ "total_flos": 8.109125174606561e+18,
+ "train_loss": 0.5082513862064489,
+ "train_runtime": 2793.8499,
+ "train_samples_per_second": 37.468,
+ "train_steps_per_second": 1.17
  }
eval_results.json CHANGED
@@ -1,8 +1,8 @@
  {
- "epoch": 10.0,
- "eval_accuracy": 0.8571428571428571,
- "eval_loss": 0.4352613687515259,
- "eval_runtime": 236.2313,
- "eval_samples_per_second": 108.097,
- "eval_steps_per_second": 3.378
  }

  {
+ "epoch": 5.0,
+ "eval_accuracy": 0.7444126074498567,
+ "eval_loss": 0.7038247585296631,
+ "eval_runtime": 48.5993,
+ "eval_samples_per_second": 107.718,
+ "eval_steps_per_second": 13.478
  }
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:fb8a0f55d171e73b2625993fddfb2d2c452c945a1cefeb5100d9dfb84cd26493
  size 343239356

  version https://git-lfs.github.com/spec/v1
+ oid sha256:e822056179703bef809ffdcacdb067655b1b63f4056e267851098e6fe7de4f60
  size 343239356
runs/May25_19-08-37_5f59a01ef625/events.out.tfevents.1716667021.5f59a01ef625.42.1 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:61108c513b18c46023d744fbd4ab6f219744c22b617198857655903d8a8701da
+ size 411
runs/May25_20-15-01_5f59a01ef625/events.out.tfevents.1716668143.5f59a01ef625.42.2 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1bdfcdd98c1d5e3991498dcb1a1445eecb33ca6d135649c0ecef6f64e4708abe
+ size 15217
train_results.json CHANGED
@@ -1,8 +1,8 @@
  {
- "epoch": 10.0,
- "total_flos": 7.915696500716863e+19,
- "train_loss": 0.14746467811720712,
- "train_runtime": 9966.943,
- "train_samples_per_second": 102.483,
- "train_steps_per_second": 0.801
  }

  {
+ "epoch": 5.0,
+ "total_flos": 8.109125174606561e+18,
+ "train_loss": 0.5082513862064489,
+ "train_runtime": 2793.8499,
+ "train_samples_per_second": 37.468,
+ "train_steps_per_second": 1.17
  }
trainer_state.json CHANGED
The diff for this file is too large to render. See raw diff
 
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:a16e3b200609261a095fe1416c4e1526bc10f4a534016d7fa4a74fc30454f98d
  size 4283

  version https://git-lfs.github.com/spec/v1
+ oid sha256:6e52b0836ce2c508087bf7e69ccdfb8eebe0f72926ffbd0e35808c19a08fdda1
  size 4283