Model Card for etweedy/roberta-base-squad-v2
An instance of roberta-base fine-tuned for context-based (extractive) question answering on the SQuAD v2 dataset, a dataset of English-language context-question-answer triples designed for training and benchmarking extractive question answering models. Version 2 of SQuAD (Stanford Question Answering Dataset) combines the 100,000 questions from SQuAD v1.1 with 50,000 additional "unanswerable" questions, i.e. questions whose answer cannot be found in the provided context.
The original RoBERTa (Robustly Optimized BERT Pretraining Approach) model was introduced in [this paper](http://arxiv.org/abs/1907.11692) and its accompanying repository; the full citation appears at the end of this card.
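As a quick look at the data format, the SQuAD v2 splits can be loaded with the datasets library. The snippet below is a minimal illustration and not part of the original training code; the field names follow the public squad_v2 dataset on the Hugging Face Hub.

```python
from datasets import load_dataset

# Load the train and validation splits of SQuAD v2.
squad_v2 = load_dataset("squad_v2")

example = squad_v2["train"][0]
# Each example is a context-question-answer triple:
#   example["context"]  - the passage of text
#   example["question"] - a question about that passage
#   example["answers"]  - {"text": [...], "answer_start": [...]};
#                         both lists are empty for unanswerable questions
print(example["question"])
print(example["answers"])
```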
Demonstration space
Try out inference on this model using this app
Overview
- Pretrained model: roberta-base
- Language: English
- Downstream task: Extractive QA
- Training data: SQuAD v2 train split
- Eval data: SQuAD v2 validation split
How to Get Started with the Model
Initializing pipeline:
from transformers import AutoModelForQuestionAnswering, AutoTokenizer, pipeline

repo_id = "etweedy/roberta-base-squad-v2"

QA_pipeline = pipeline(
    task="question-answering",
    model=repo_id,
    tokenizer=repo_id,
    handle_impossible_answer=True,
)
Inference:
QA_input = {
    "question": "Who invented Twinkies?",
    "context": "Twinkies were invented on April 6, 1930, by Canadian-born baker James Alexander Dewar for the Continental Baking Company in Schiller Park, Illinois.",
}

response = QA_pipeline(**QA_input)
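The pipeline returns a dictionary containing the predicted answer span and a confidence score. Because handle_impossible_answer=True, an empty answer string means the model judged the question unanswerable from the given context. A minimal sketch of reading the result (the printed values are illustrative, not from an actual run):

```python
# response has the keys "answer", "score", "start", and "end"
# (character offsets of the span within the context).
if response["answer"] == "":
    print("Predicted unanswerable from this context.")
else:
    print(f"Answer: {response['answer']} (score {response['score']:.3f})")
```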
Training Hyperparameters
batch_size = 16
n_epochs = 3
learning_rate = 3e-5
base_LM_model = [roberta-base](https://huggingface.co/roberta-base)
max_seq_len = 384
stride = 128
lr_schedule = LinearWarmup
warmup_proportion = 0.0
mixed_precision = "fp16"
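For reference, here is a minimal sketch of how these hyperparameters map onto a Hugging Face Trainer setup. It is an illustrative reconstruction rather than the exact training script; in particular, the preprocessing that converts answer text into start/end token positions is omitted, and the output_dir name is arbitrary.

```python
from transformers import (
    AutoModelForQuestionAnswering,
    AutoTokenizer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForQuestionAnswering.from_pretrained("roberta-base")

# Long contexts are split into overlapping windows of at most 384 tokens
# with a stride of 128 tokens, mirroring max_seq_len and stride above.
def tokenize(examples):
    return tokenizer(
        examples["question"],
        examples["context"],
        max_length=384,
        stride=128,
        truncation="only_second",
        return_overflowing_tokens=True,
        return_offsets_mapping=True,
        padding="max_length",
    )

training_args = TrainingArguments(
    output_dir="roberta-base-squad-v2",  # arbitrary name for this sketch
    per_device_train_batch_size=16,      # batch_size
    num_train_epochs=3,                  # n_epochs
    learning_rate=3e-5,
    lr_scheduler_type="linear",          # LinearWarmup schedule
    warmup_ratio=0.0,                    # warmup_proportion
    fp16=True,                           # mixed_precision
)
```

A Trainer would then be built from these arguments together with the tokenized SQuAD v2 train split and a suitable data collator.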
Evaluation results
The model was evaluated on the validation split of SQuAD v2 and attained the following results:
{"exact": 79.87029394424324,
"f1": 82.91251169582613,
"total": 11873,
"HasAns_exact": 77.93522267206478,
"HasAns_f1": 84.02838248389763,
"HasAns_total": 5928,
"NoAns_exact": 81.79983179142137,
"NoAns_f1": 81.79983179142137,
"NoAns_total": 5945}
BibTeX citation for the base model:
@article{DBLP:journals/corr/abs-1907-11692,
  author    = {Yinhan Liu and
               Myle Ott and
               Naman Goyal and
               Jingfei Du and
               Mandar Joshi and
               Danqi Chen and
               Omer Levy and
               Mike Lewis and
               Luke Zettlemoyer and
               Veselin Stoyanov},
  title     = {RoBERTa: {A} Robustly Optimized {BERT} Pretraining Approach},
  journal   = {CoRR},
  volume    = {abs/1907.11692},
  year      = {2019},
  url       = {http://arxiv.org/abs/1907.11692},
  archivePrefix = {arXiv},
  eprint    = {1907.11692},
  timestamp = {Thu, 01 Aug 2019 08:59:33 +0200},
  biburl    = {https://dblp.org/rec/journals/corr/abs-1907-11692.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}