twitter-xlmr-clip-finetuned-all-123

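This model is a fine-tuned version of cardiffnlp/twitter-xlm-roberta-base-sentiment-multilingual on the all dataset. It achieves the following results on the evaluation set; these match the step 11500 (epoch 1.38) row of the training results, which has the lowest validation loss in the log: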

  • Loss: 0.7405
  • Precision: 0.6431
  • Recall: 0.6554
  • F1: 0.6401
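
The reported F1 (0.6401) is lower than both precision and recall, which cannot happen when F1 is the harmonic mean of a single precision/recall pair. The card does not state the averaging mode; one plausible reading, and it is only an assumption, is that the three metrics are macro-averaged over the classes. The sketch below uses an invented confusion matrix (not this model's predictions) to show how macro-F1 can fall below both macro-precision and macro-recall.

```python
# Illustrative 3-class confusion matrix (rows = true label, columns = predicted).
# The counts are invented for this example; they are NOT this model's results.
labels = ["negative", "neutral", "positive"]
confusion = [
    [30, 60, 10],
    [2, 90, 8],
    [1, 29, 70],
]


def per_class_scores(matrix, k):
    """Return (precision, recall, f1) for class index k."""
    tp = matrix[k][k]
    predicted_k = sum(row[k] for row in matrix)  # column sum
    actual_k = sum(matrix[k])                    # row sum
    precision = tp / predicted_k if predicted_k else 0.0
    recall = tp / actual_k if actual_k else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


scores = [per_class_scores(confusion, k) for k in range(len(labels))]

# Macro averaging: unweighted mean of the per-class values.
macro_precision = sum(s[0] for s in scores) / len(scores)
macro_recall = sum(s[1] for s in scores) / len(scores)
macro_f1 = sum(s[2] for s in scores) / len(scores)

# Harmonic mean of the two macro values, for comparison only.
harmonic = 2 * macro_precision * macro_recall / (macro_precision + macro_recall)

print(f"macro P={macro_precision:.4f} R={macro_recall:.4f} F1={macro_f1:.4f}")
print(f"harmonic mean of macro P and macro R={harmonic:.4f}")
```

Macro-F1 averages the per-class F1 scores, so a class with very unbalanced precision and recall pulls it down more than it pulls down either macro-precision or macro-recall.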

Model description

More information needed

Usage

To run inference, use the following script. The Transform and VisionTextDualEncoderModel classes are not part of a library; they are defined in app.py and must be available before the script runs.

import torch
import torch.nn as nn

import torchvision
from torchvision.transforms import CenterCrop, ConvertImageDtype, Normalize, Resize
from torchvision.transforms.functional import InterpolationMode
from torchvision import transforms
from torchvision.io import ImageReadMode, read_image


from transformers import CLIPModel, AutoModel
from huggingface_hub import hf_hub_download
from safetensors.torch import load_model

from datasets import load_dataset, load_metric
from transformers import (
    AutoConfig,
    AutoImageProcessor,
    AutoModelForSequenceClassification,
    AutoTokenizer,
    logging,
)

id2label = {0: "negative", 1: "neutral", 2: "positive"}
label2id = {"negative": 0, "neutral": 1, "positive": 2}

tokenizer = AutoTokenizer.from_pretrained("cardiffnlp/twitter-xlm-roberta-base-sentiment-multilingual")

model = VisionTextDualEncoderModel(num_classes=3)
config = model.vision_encoder.config

# https://huggingface.co/FFZG-cleopatra/M2SA/blob/main/model.safetensors
sf_filename = hf_hub_download("FFZG-cleopatra/M2SA", filename="model.safetensors")

load_model(model, sf_filename) 
image_processor = AutoImageProcessor.from_pretrained("openai/clip-vit-base-patch32")

def predict_sentiment(text, image):
    # read the image file   
    image = read_image(image, mode=ImageReadMode.RGB)
       
    text_inputs = tokenizer(
        text,
        max_length=512,
        padding="max_length",
        truncation=True,
        return_tensors="pt",
    )
    
    image_transformations = Transform(
        config.vision_config.image_size,
        image_processor.image_mean,
        image_processor.image_std,
    )
    image_transformations = torch.jit.script(image_transformations)
    pixel_values = image_transformations(image)
    text_inputs["pixel_values"] = pixel_values.unsqueeze(0)
   
    prediction = None
    with torch.no_grad():
        outputs = model(**text_inputs)
        print(outputs)
        # numpy is never imported in this script; use torch's argmax on the logits tensor
        prediction = torch.argmax(outputs["logits"], dim=-1)
        print(id2label[prediction[0].item()])
    return id2label[prediction[0].item()]

text = "I feel good today"
image = "path/to/image.jpg"  # path to a local image file; read_image reads from disk
predict_sentiment(text, image)
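
The script above returns only the top label. If a confidence score is also wanted, the logits can be turned into probabilities with a softmax. The sketch below does this in plain Python on a hypothetical logits list (the values and the helper name logits_to_probs are illustrative, not from app.py); inside the script, torch.softmax(outputs["logits"], dim=-1) would be the equivalent call.

```python
import math

id2label = {0: "negative", 1: "neutral", 2: "positive"}


def logits_to_probs(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)  # subtract the max so exp() cannot overflow
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for one (text, image) pair; not real model output.
example_logits = [-1.2, 0.3, 2.1]

probs = logits_to_probs(example_logits)
best = max(range(len(probs)), key=probs.__getitem__)

print(id2label[best], round(probs[best], 3))
```

The predicted label is unchanged by the softmax, since it preserves the ordering of the logits; only the scale becomes interpretable as a probability.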

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 123
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 50.0
  • mixed_precision_training: Native AMP
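
For reference, a linear lr_scheduler_type decays the learning rate in a straight line from its initial value to zero over the run. A minimal sketch of that rule, assuming no warmup (the list above names none); the function name linear_lr and the total_steps value are illustrative and not taken from the training code.

```python
def linear_lr(step, total_steps, base_lr=5e-05, warmup_steps=0):
    """Learning rate at a given optimizer step under linear decay.

    base_lr matches the learning_rate listed above; warmup_steps=0 is an
    assumption, since the card does not mention warmup.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Hypothetical run length, chosen only to make the numbers easy to read.
total_steps = 1000
for step in (0, 250, 500, 750, 1000):
    print(step, linear_lr(step, total_steps))
```

Under this rule the learning rate is still close to its initial value early in a long run, since the decay is spread over the full planned number of steps.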

Training results

| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 |
|---|---|---|---|---|---|---|
| 0.6444 | 0.06 | 500 | 0.8771 | 0.6905 | 0.4537 | 0.4197 |
| 0.5499 | 0.12 | 1000 | 0.8167 | 0.7197 | 0.4260 | 0.4117 |
| 0.5357 | 0.18 | 1500 | 0.8084 | 0.7263 | 0.4696 | 0.4424 |
| 0.5175 | 0.24 | 2000 | 0.8704 | 0.6666 | 0.4266 | 0.3717 |
| 0.5285 | 0.3 | 2500 | 0.9067 | 0.7529 | 0.4565 | 0.4221 |
| 0.5081 | 0.36 | 3000 | 0.7414 | 0.7655 | 0.6114 | 0.6356 |
| 0.506 | 0.42 | 3500 | 0.8713 | 0.5830 | 0.6591 | 0.5786 |
| 0.5049 | 0.48 | 4000 | 0.7514 | 0.5551 | 0.4568 | 0.4464 |
| 0.4999 | 0.54 | 4500 | 0.7584 | 0.6519 | 0.5502 | 0.5767 |
| 0.507 | 0.6 | 5000 | 0.8072 | 0.6479 | 0.5626 | 0.5636 |
| 0.5048 | 0.66 | 5500 | 0.8080 | 0.6260 | 0.5725 | 0.5730 |
| 0.4907 | 0.72 | 6000 | 0.7966 | 0.6976 | 0.5138 | 0.5224 |
| 0.493 | 0.78 | 6500 | 0.8193 | 0.7099 | 0.4949 | 0.4922 |
| 0.4668 | 0.84 | 7000 | 0.7502 | 0.6282 | 0.6942 | 0.6501 |
| 0.4717 | 0.9 | 7500 | 0.7636 | 0.6372 | 0.5109 | 0.5191 |
| 0.4774 | 0.96 | 8000 | 0.7652 | 0.7513 | 0.5360 | 0.5587 |
| 0.4676 | 1.02 | 8500 | 0.8482 | 0.6372 | 0.5918 | 0.5836 |
| 0.4361 | 1.08 | 9000 | 0.7456 | 0.6687 | 0.5177 | 0.5175 |
| 0.4536 | 1.14 | 9500 | 0.8449 | 0.7363 | 0.5160 | 0.5156 |
| 0.4277 | 1.2 | 10000 | 0.8648 | 0.6382 | 0.5247 | 0.5173 |
| 0.4444 | 1.26 | 10500 | 0.8723 | 0.5871 | 0.6622 | 0.5959 |
| 0.4269 | 1.32 | 11000 | 0.7856 | 0.6151 | 0.5521 | 0.5526 |
| 0.4322 | 1.38 | 11500 | 0.7405 | 0.6431 | 0.6554 | 0.6401 |
| 0.4435 | 1.44 | 12000 | 0.7682 | 0.6568 | 0.5751 | 0.5923 |
| 0.4429 | 1.5 | 12500 | 0.8824 | 0.5956 | 0.6006 | 0.5545 |
| 0.4381 | 1.56 | 13000 | 0.7879 | 0.4457 | 0.4727 | 0.4395 |
| 0.4389 | 1.62 | 13500 | 0.7555 | 0.6260 | 0.6984 | 0.6502 |
| 0.4529 | 1.68 | 14000 | 0.7981 | 0.6621 | 0.5546 | 0.5663 |
| 0.4509 | 1.74 | 14500 | 0.7827 | 0.6160 | 0.6321 | 0.6172 |
| 0.4413 | 1.8 | 15000 | 0.7895 | 0.6381 | 0.6357 | 0.6285 |
| 0.4198 | 1.86 | 15500 | 0.8345 | 0.5940 | 0.5526 | 0.5602 |
| 0.4415 | 1.92 | 16000 | 0.8746 | 0.6615 | 0.6612 | 0.6459 |
| 0.443 | 1.98 | 16500 | 0.8155 | 0.6516 | 0.5265 | 0.5352 |
| 0.4068 | 2.04 | 17000 | 0.7642 | 0.5838 | 0.6220 | 0.5975 |
| 0.3905 | 2.1 | 17500 | 0.7929 | 0.6720 | 0.5555 | 0.5740 |
| 0.3969 | 2.16 | 18000 | 0.8949 | 0.5330 | 0.4771 | 0.4687 |
| 0.3841 | 2.22 | 18500 | 0.9233 | 0.6028 | 0.5410 | 0.5492 |
| 0.4031 | 2.28 | 19000 | 0.7720 | 0.6089 | 0.5719 | 0.5776 |
| 0.3878 | 2.34 | 19500 | 0.9046 | 0.6265 | 0.5358 | 0.5318 |
| 0.4001 | 2.41 | 20000 | 0.8451 | 0.6960 | 0.5622 | 0.5761 |
| 0.3997 | 2.47 | 20500 | 0.8964 | 0.6170 | 0.5665 | 0.5541 |
| 0.3945 | 2.53 | 21000 | 0.8001 | 0.5553 | 0.5180 | 0.5195 |
| 0.4005 | 2.59 | 21500 | 0.8357 | 0.5519 | 0.5100 | 0.5170 |
| 0.3907 | 2.65 | 22000 | 0.8017 | 0.5884 | 0.5409 | 0.5552 |
| 0.3858 | 2.71 | 22500 | 0.8283 | 0.6036 | 0.5792 | 0.5862 |
| 0.3973 | 2.77 | 23000 | 0.9024 | 0.5770 | 0.5665 | 0.5393 |
| 0.3969 | 2.83 | 23500 | 0.8341 | 0.5642 | 0.5528 | 0.5558 |
| 0.3911 | 2.89 | 24000 | 0.8966 | 0.6045 | 0.5088 | 0.5070 |
| 0.3856 | 2.95 | 24500 | 0.8349 | 0.6021 | 0.5586 | 0.5689 |
| 0.3961 | 3.01 | 25000 | 0.9364 | 0.6119 | 0.5412 | 0.5585 |
| 0.3301 | 3.07 | 25500 | 0.9542 | 0.5757 | 0.6084 | 0.5813 |
| 0.3385 | 3.13 | 26000 | 1.0137 | 0.5563 | 0.5294 | 0.5346 |
| 0.3475 | 3.19 | 26500 | 0.9311 | 0.6359 | 0.5675 | 0.5822 |
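
The headline metrics at the top of this card coincide with the step 11500 row, which has the lowest validation loss in the log. Whether the training run actually selected its final checkpoint by validation loss is an assumption; the card does not say. The sketch below copies five rows from the table and shows that choosing by lowest loss and choosing by highest F1 pick different checkpoints.

```python
# A few rows copied from the training results table above
# (step, validation loss, F1); the remaining rows are omitted for brevity.
rows = [
    {"step": 3000, "val_loss": 0.7414, "f1": 0.6356},
    {"step": 7000, "val_loss": 0.7502, "f1": 0.6501},
    {"step": 9000, "val_loss": 0.7456, "f1": 0.5175},
    {"step": 11500, "val_loss": 0.7405, "f1": 0.6401},
    {"step": 13500, "val_loss": 0.7555, "f1": 0.6502},
]

# Two common selection rules for the "best" checkpoint.
best_by_loss = min(rows, key=lambda r: r["val_loss"])
best_by_f1 = max(rows, key=lambda r: r["f1"])

print("lowest validation loss:", best_by_loss)
print("highest F1:", best_by_f1)
```

If F1 matters more than loss for a downstream use, the step 13500 and step 7000 rows report marginally higher F1 than the headline checkpoint, though the differences are small.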

Framework versions

  • Transformers 4.38.2
  • Pytorch 2.2.1+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2

Model size: 429M params (Safetensors, tensor type F32)