---
license: mit
base_model: microsoft/deberta-v3-base
tags:
- generated_from_trainer
datasets:
- squad_v2
model-index:
- name: deberta-v3-base-finetuned-squad2
  results:
  - task:
      name: Question Answering
      type: question-answering
    dataset:
      type: squad_v2
      name: SQuAD 2
      config: squad_v2
      split: validation
    metrics:
    - type: exact_match
      value: 84.56161037648447
      name: Exact-Match
      verified: true
    - type: f1
      value: 87.81110592215731
      name: F1-score
      verified: true
language:
- en
pipeline_tag: question-answering
metrics:
- exact_match
- f1
---
## Model description
DeBERTa-v3-base fine-tuned on SQuAD 2.0: an encoder-based Transformer language model.
The DeBERTa V3 base model has 12 layers and a hidden size of 768.
It has only 86M backbone parameters, and its 128K-token vocabulary adds a further 98M parameters in the embedding layer.
The base model was pre-trained on the same 160GB of data as DeBERTa V2.
After fine-tuning, the model is suited to extractive question answering: it predicts answer spans within the provided context. A short span-prediction sketch follows the summary fields below.<br>
**Language model:** microsoft/deberta-v3-base<br>
**Language:** English<br>
**Downstream task:** Question Answering<br>
**Training data:** SQuAD 2.0 train set<br>
**Evaluation data:** SQuAD 2.0 validation set<br>
**Hardware accelerator used:** GPU Tesla T4
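
The span prediction described above can also be seen without the `pipeline` helper. The snippet below is a minimal sketch, assuming the standard `AutoTokenizer`/`AutoModelForQuestionAnswering` classes and an illustrative question; it greedily picks the highest-scoring start and end tokens and decodes that span.

```python
# pip install transformers torch
import torch
from transformers import AutoModelForQuestionAnswering, AutoTokenizer

model_checkpoint = "IProject-10/deberta-v3-base-finetuned-squad2"
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)
model = AutoModelForQuestionAnswering.from_pretrained(model_checkpoint)

question = "What is the hidden size of DeBERTa-v3-base?"
context = "The DeBERTa V3 base model has 12 layers and a hidden size of 768."

inputs = tokenizer(question, context, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Greedy decoding: take the most likely start and end token, then map back to text.
start = outputs.start_logits.argmax(dim=-1).item()
end = outputs.end_logits.argmax(dim=-1).item()
answer_tokens = inputs["input_ids"][0, start : end + 1]
print(tokenizer.decode(answer_tokens, skip_special_tokens=True))
```

Greedy start/end selection is enough for a sketch; the `pipeline` example in the next section additionally searches over valid spans and normalizes the scores.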
## Intended uses & limitations
For question answering:
```python
# pip install transformers
from transformers import pipeline

model_checkpoint = "IProject-10/deberta-v3-base-finetuned-squad2"
question_answerer = pipeline("question-answering", model=model_checkpoint)

context = """
🤗 Transformers is backed by the three most popular deep learning libraries — Jax, PyTorch and TensorFlow — with a seamless integration
between them. It's straightforward to train your models with one before loading them for inference with the other.
"""
question = "Which deep learning libraries back 🤗 Transformers?"

# Returns a dict with the predicted answer text, its character offsets and a confidence score.
print(question_answerer(question=question, context=context))
```
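
Because the model is trained on SQuAD 2.0, it can also abstain when the context contains no answer. A hedged usage note, assuming the question-answering pipeline's standard `handle_impossible_answer` flag and an illustrative question:

```python
# Allow an empty answer for unanswerable questions (SQuAD 2.0 behaviour).
print(question_answerer(
    question="Which deep learning library does 🤗 Transformers deprecate?",  # illustrative, likely unanswerable
    context=context,
    handle_impossible_answer=True,
))
# An empty "answer" string indicates the model judged the question unanswerable.
```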
## Results
Evaluation on the SQuAD 2.0 validation set (11,873 examples):
```
exact: 84.56161037648447,
f1: 87.81110592215731,
total: 11873,
HasAns_exact: 81.62955465587045,
HasAns_f1: 88.13786447600818,
HasAns_total: 5928,
NoAns_exact: 87.48528174936922,
NoAns_f1: 87.48528174936922,
NoAns_total: 5945,
best_exact: 84.56161037648447,
best_exact_thresh: 0.9994288682937622,
best_f1: 87.81110592215778,
best_f1_thresh: 0.9994288682937622,
total_time_in_seconds: 336.43560706100106,
samples_per_second: 35.29055709566211,
latency_in_seconds: 0.028336191953255374
```
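
The figures above follow the official SQuAD 2.0 metric definitions (exact match and F1, with separate HasAns/NoAns breakdowns and best-threshold variants). A minimal sketch of how comparable numbers can be reproduced, assuming the 🤗 `evaluate` library and purely illustrative dummy predictions:

```python
# pip install evaluate
import evaluate

squad_v2_metric = evaluate.load("squad_v2")

# Single dummy example for illustration only; the numbers above use the full 11,873-example validation split.
predictions = [
    {"id": "example-0", "prediction_text": "Jax, PyTorch and TensorFlow", "no_answer_probability": 0.0}
]
references = [
    {"id": "example-0", "answers": {"text": ["Jax, PyTorch and TensorFlow"], "answer_start": [60]}}
]

print(squad_v2_metric.compute(predictions=predictions, references=references))
# Returns exact, f1 and total, plus HasAns/NoAns breakdowns and best-threshold
# fields (when both answerable and unanswerable examples are present), matching the fields listed above.
```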
### Training hyperparameters
The following hyperparameters were used during training (a hedged `TrainingArguments` sketch follows the list):
- learning_rate: 3e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3
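
These values map directly onto 🤗 `TrainingArguments`. The snippet below is a minimal sketch rather than the original training script; the output directory is an assumption, and the Adam settings are the library defaults restated explicitly.

```python
from transformers import TrainingArguments

# Sketch of TrainingArguments matching the hyperparameters listed above.
training_args = TrainingArguments(
    output_dir="deberta-v3-base-finetuned-squad2",  # assumed name, not from the original script
    learning_rate=3e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    num_train_epochs=3,
    lr_scheduler_type="linear",
    adam_beta1=0.9,     # Adam betas/epsilon as listed above
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)
```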
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| 0.7299 | 1.0 | 8217 | 0.7246 |
| 0.5104 | 2.0 | 16434 | 0.7321 |
| 0.3547 | 3.0 | 24651 | 0.8493 |
This model is a fine-tuned version of [microsoft/deberta-v3-base](https://huggingface.co/microsoft/deberta-v3-base) on the squad_v2 dataset.
It achieves the following results on the evaluation set:
- Loss: 0.8493
### Framework versions
- Transformers 4.31.0
- Pytorch 2.0.1+cu118
- Datasets 2.14.3
- Tokenizers 0.13.3