german-zeroshot

This model is a fine-tuned version of deepset/gbert-large, trained on the German (de) subset of the facebook/xnli dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4592
  • Accuracy: 0.8486

Usage

# Use a pipeline as a high-level helper
import torch
from transformers import pipeline

pipe = pipeline(
    "zero-shot-classification",
    model="kaixkhazaki/german-zeroshot",
    tokenizer="kaixkhazaki/german-zeroshot",
    device=0 if torch.cuda.is_available() else -1,  # Use the first GPU if available, otherwise the CPU
)

# Example 1: enter the text and the candidate labels to classify it against
sequence = "Können Sie mir die Schritte zur Konfiguration eines VPN auf einem Linux-Server erklären?"
candidate_labels = [
    "Technische Dokumentation",
    "IT-Support",
    "Netzwerkadministration",
    "Linux-Konfiguration",
    "VPN-Setup",
]
pipe(sequence, candidate_labels)
>>
{'sequence': 'Können Sie mir die Schritte zur Konfiguration eines VPN auf einem Linux-Server erklären?',
'labels': ['VPN-Setup', 'Linux-Konfiguration', 'Netzwerkadministration', 'IT-Support', 'Technische Dokumentation'],
'scores': [0.53142249584198, 0.26030370593070984, 0.09126164764165878, 0.06451434642076492, 0.052497804164886475]}


# Example 2
sequence = "Wie lautet die Garantiezeit für dieses Produkt?"
candidate_labels = [
    "Garantiebedingungen",
    "Kundendienst",
    "Produktdetails",
    "Reklamation",
    "Kaufberatung",
]
pipe(sequence, candidate_labels)
>>
{'sequence': 'Wie lautet die Garantiezeit für dieses Produkt?',
'labels': ['Garantiebedingungen', 'Kundendienst', 'Produktdetails', 'Reklamation', 'Kaufberatung'],
'scores': [0.414899080991745, 0.2377401739358902, 0.1381743848323822, 0.12171833217144012, 0.08746808022260666]}
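
The scores in both examples sum to 1. In the zero-shot pipeline's default single-label mode, each candidate label is turned into an NLI hypothesis, the model scores how strongly the input entails it, and the entailment logits are normalized with a softmax across labels. The sketch below reproduces only that post-processing in plain Python. The German template and the logits are made up for illustration; they are not taken from this model, and the pipeline's own default template is the English "This example is {}.".

```python
import math

# Hypothetical German hypothesis template (an assumption for illustration).
HYPOTHESIS_TEMPLATE = "In diesem Text geht es um {}."


def build_hypotheses(candidate_labels, template=HYPOTHESIS_TEMPLATE):
    """Turn each candidate label into an NLI hypothesis sentence."""
    return [template.format(label) for label in candidate_labels]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rank_labels(candidate_labels, entailment_logits):
    """Softmax the per-label entailment logits and sort labels by score."""
    scores = softmax(entailment_logits)
    return sorted(zip(candidate_labels, scores), key=lambda pair: pair[1], reverse=True)


labels = ["Kundendienst", "Garantiebedingungen", "Produktdetails"]
# Made-up entailment logits, one per hypothesis; a real run would read
# these from the model's output.
fake_entailment_logits = [1.0, 2.0, 0.5]

hypotheses = build_hypotheses(labels)
ranked = rank_labels(labels, fake_entailment_logits)
```

To try a German template with the real model, the pipeline call accepts `hypothesis_template` and `multi_label` keyword arguments, for example `pipe(sequence, candidate_labels, hypothesis_template="In diesem Text geht es um {}.")`. Whether a German template improves results for this model is untested here.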

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

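The model was fine-tuned on the German (de) subset of facebook/xnli. More information on the evaluation split and preprocessing is needed.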

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 64
  • eval_batch_size: 32
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 3
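
To make the learning-rate settings concrete, here is a minimal sketch of a cosine schedule with linear warmup, using the learning_rate and lr_scheduler_warmup_steps values listed above. The total step count is an assumption: it is estimated from the training results table (step 18000 falls at epoch 2.9335, which gives roughly 6,136 steps per epoch), not reported by the author. The function follows the usual half-cosine decay to zero; it is an illustration, not the training code used for this model.

```python
import math

BASE_LR = 5e-05        # learning_rate from the list above
WARMUP_STEPS = 500     # lr_scheduler_warmup_steps from the list above
# Assumption: about 18,408 optimizer steps over 3 epochs, estimated from
# the results table rather than stated in the model card.
TOTAL_STEPS = 18408


def lr_at(step, base_lr=BASE_LR, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Learning rate at a given optimizer step: linear warmup, then cosine decay."""
    if step < warmup:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup)
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (step - warmup) / max(1, total - warmup)
    # Half-cosine decay from base_lr down to 0.
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


# Learning rate at each evaluation step logged in the results table.
schedule = {step: lr_at(step) for step in range(1000, 19000, 1000)}
```

Under these assumptions the rate peaks at 5e-05 at step 500 and decays smoothly toward zero by the end of the third epoch.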

Training results

Training Loss  Epoch   Step   Validation Loss  Accuracy  F1      Precision  Recall
0.6429         0.1630  1000   0.5203           0.8004    0.8006  0.8009     0.8004
0.5715         0.3259  2000   0.5209           0.7964    0.7968  0.8005     0.7964
0.5897         0.4889  3000   0.5435           0.7924    0.7940  0.8039     0.7924
0.5701         0.6519  4000   0.5242           0.7880    0.7884  0.8078     0.7880
0.5238         0.8149  5000   0.4816           0.8233    0.8226  0.8263     0.8233
0.5285         0.9778  6000   0.4483           0.8265    0.8273  0.8303     0.8265
0.4302         1.1408  7000   0.4751           0.8209    0.8214  0.8277     0.8209
0.4163         1.3038  8000   0.4560           0.8285    0.8289  0.8344     0.8285
0.3942         1.4668  9000   0.4330           0.8414    0.8422  0.8454     0.8414
0.3875         1.6297  10000  0.4171           0.8430    0.8432  0.8455     0.8430
0.3639         1.7927  11000  0.4194           0.8442    0.8447  0.8487     0.8442
0.3768         1.9557  12000  0.4215           0.8474    0.8477  0.8492     0.8474
0.2443         2.1186  13000  0.4750           0.8390    0.8398  0.8452     0.8390
0.2404         2.2816  14000  0.4592           0.8486    0.8487  0.8505     0.8486
0.2154         2.4446  15000  0.4914           0.8418    0.8424  0.8466     0.8418
0.2157         2.6076  16000  0.4804           0.8454    0.8458  0.8488     0.8454
0.2249         2.7705  17000  0.4809           0.8466    0.8470  0.8507     0.8466
0.2204         2.9335  18000  0.4777           0.8466    0.8470  0.8502     0.8466
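
The headline results at the top of this card (loss 0.4592, accuracy 0.8486) match the row for step 14000. A short sketch, with the step, validation loss, and accuracy columns copied from the table above, shows that this row has the highest accuracy but not the lowest validation loss. Which criterion the author used to select the final checkpoint is not stated, so treat this as a reading of the table only.

```python
# (step, validation_loss, accuracy) copied from the training results table.
rows = [
    (1000, 0.5203, 0.8004), (2000, 0.5209, 0.7964), (3000, 0.5435, 0.7924),
    (4000, 0.5242, 0.7880), (5000, 0.4816, 0.8233), (6000, 0.4483, 0.8265),
    (7000, 0.4751, 0.8209), (8000, 0.4560, 0.8285), (9000, 0.4330, 0.8414),
    (10000, 0.4171, 0.8430), (11000, 0.4194, 0.8442), (12000, 0.4215, 0.8474),
    (13000, 0.4750, 0.8390), (14000, 0.4592, 0.8486), (15000, 0.4914, 0.8418),
    (16000, 0.4804, 0.8454), (17000, 0.4809, 0.8466), (18000, 0.4777, 0.8466),
]

# Row with the highest evaluation accuracy.
best_by_accuracy = max(rows, key=lambda row: row[2])
# Row with the lowest validation loss.
best_by_loss = min(rows, key=lambda row: row[1])

print("best accuracy:", best_by_accuracy)
print("lowest validation loss:", best_by_loss)
```

Validation loss bottoms out near the end of the second epoch and rises in the third while training loss keeps falling, which is consistent with mild overfitting late in training.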

Framework versions

  • Transformers 4.48.0.dev0
  • PyTorch 2.4.1+cu121
  • Datasets 3.1.0
  • Tokenizers 0.21.0

Model size

  • Parameters: 336M
  • Tensor type: F32
  • Weights format: Safetensors