---

license: mit
base_model: microsoft/deberta-v3-base
tags:
- generated_from_trainer
datasets:
- squad_v2
model-index:
- name: deberta-v3-base-finetuned-squad2
  results:
  - task:
      name: Question Answering
      type: question-answering
    dataset:
      type: squad_v2
      name: SQuAD 2
      config: squad_v2
      split: validation
    metrics:
    - type: exact_match
      value: 84.56161037648447
      name: Exact-Match
      verified: true
    - type: f1
      value: 87.81110592215731
      name: F1-score
      verified: true
      
language:
- en
pipeline_tag: question-answering
metrics:
- exact_match
- f1
---


## Model description

DeBERTa-v3-base fine-tuned on SQuAD 2.0: an encoder-based Transformer language model.
The DeBERTa V3 base model has 12 layers and a hidden size of 768.
It has only 86M backbone parameters; its vocabulary of 128K tokens adds another 98M parameters in the embedding layer.
The model was pre-trained on the same 160GB of data as DeBERTa V2.
Fine-tuned for Question-Answering, it predicts answer spans within the provided context.

**Language model:** microsoft/deberta-v3-base  
**Language:** English  
**Downstream task:** Question-Answering  
**Training data:** SQuAD 2.0 train set  
**Evaluation data:** SQuAD 2.0 validation set  
**Hardware accelerator:** NVIDIA Tesla T4 GPU
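The figures above can be checked directly from the published checkpoint. A minimal sketch (the expected values in the comments are taken from the description above, not re-measured):

```python
# Inspect the checkpoint's configuration and parameter counts.
from transformers import AutoConfig, AutoModelForQuestionAnswering

model_checkpoint = "IProject-10/deberta-v3-base-finetuned-squad2"

config = AutoConfig.from_pretrained(model_checkpoint)
print(config.num_hidden_layers, config.hidden_size, config.vocab_size)  # expected: 12, 768, ~128K

model = AutoModelForQuestionAnswering.from_pretrained(model_checkpoint)
embedding_params = model.get_input_embeddings().weight.numel()
other_params = sum(p.numel() for p in model.parameters()) - embedding_params
print(f"embeddings: {embedding_params / 1e6:.0f}M, backbone + QA head: {other_params / 1e6:.0f}M")
```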

## Intended uses & limitations

For Question-Answering:

```python
# pip install transformers
from transformers import pipeline

model_checkpoint = "IProject-10/deberta-v3-base-finetuned-squad2"
question_answerer = pipeline("question-answering", model=model_checkpoint)

context = """
🤗 Transformers is backed by the three most popular deep learning libraries — Jax, PyTorch and TensorFlow — with a seamless integration
between them. It's straightforward to train your models with one before loading them for inference with the other.
"""

question = "Which deep learning libraries back 🤗 Transformers?"

# Returns a dict with the keys "score", "start", "end" and "answer".
question_answerer(question=question, context=context)
```
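For more control than the pipeline offers, the span prediction can also be done by hand. A minimal sketch reusing `question` and `context` from above; it keeps only the highest-scoring span and ignores the unanswerable case that SQuAD 2.0 adds (the pipeline handles that via `handle_impossible_answer`):

```python
# Simplified version of what the pipeline does internally: pick the most likely
# start/end token positions and decode that span from the tokenized input.
import torch
from transformers import AutoModelForQuestionAnswering, AutoTokenizer

model_checkpoint = "IProject-10/deberta-v3-base-finetuned-squad2"
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)
model = AutoModelForQuestionAnswering.from_pretrained(model_checkpoint)

inputs = tokenizer(question, context, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

start = int(outputs.start_logits.argmax())
end = int(outputs.end_logits.argmax())
print(tokenizer.decode(inputs["input_ids"][0][start : end + 1]))
```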

## Results

Evaluation on the SQuAD 2.0 validation set:

```
 exact: 84.56161037648447,
 f1: 87.81110592215731,
 total: 11873,
 HasAns_exact: 81.62955465587045,
 HasAns_f1: 88.13786447600818,
 HasAns_total: 5928,
 NoAns_exact: 87.48528174936922,
 NoAns_f1: 87.48528174936922,
 NoAns_total: 5945,
 best_exact: 84.56161037648447,
 best_exact_thresh: 0.9994288682937622,
 best_f1: 87.81110592215778,
 best_f1_thresh: 0.9994288682937622,
 total_time_in_seconds: 336.43560706100106,
 samples_per_second: 35.29055709566211,
 latency_in_seconds: 0.028336191953255374
```
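These numbers follow the output format of the official SQuAD 2.0 evaluation, which the `evaluate` library exposes as the `squad_v2` metric. A minimal sketch with toy predictions (not the actual evaluation run):

```python
# pip install evaluate
# Toy example of producing metrics in the format above with the squad_v2 metric.
import evaluate

squad_v2_metric = evaluate.load("squad_v2")

predictions = [
    {"id": "q1", "prediction_text": "Jax, PyTorch and TensorFlow", "no_answer_probability": 0.0},
]
references = [
    {"id": "q1", "answers": {"text": ["Jax, PyTorch and TensorFlow"], "answer_start": [76]}},
]

print(squad_v2_metric.compute(predictions=predictions, references=references))
```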

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 3e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3
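
As a rough guide, these settings correspond to the following `TrainingArguments` (a sketch only; `output_dir` is a placeholder, and the Adam betas/epsilon listed above are the library defaults):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="deberta-v3-base-finetuned-squad2",  # placeholder, not from the original run
    learning_rate=3e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=3,
)
```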

### Training results

| Training Loss | Epoch | Step  | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| 0.7299        | 1.0   | 8217  | 0.7246          |
| 0.5104        | 2.0   | 16434 | 0.7321          |
| 0.3547        | 3.0   | 24651 | 0.8493          |

This model is a fine-tuned version of [microsoft/deberta-v3-base](https://huggingface.co/microsoft/deberta-v3-base) on the squad_v2 dataset.
It achieves the following results on the evaluation set:
- Loss: 0.8493
  
### Framework versions

- Transformers 4.31.0
- Pytorch 2.0.1+cu118
- Datasets 2.14.3
- Tokenizers 0.13.3