
Model Card for etweedy/roberta-base-squad-v2

An instance of roberta-base fine-tuned for extractive, context-based question answering on the SQuAD v2 dataset, a collection of English-language context-question-answer triples designed for training and benchmarking extractive QA models. Version 2 of SQuAD (Stanford Question Answering Dataset) contains the 100,000 examples from SQuAD Version 1.1, along with 50,000 additional "unanswerable" questions, i.e. questions whose answer cannot be found in the provided context.
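
A minimal sketch of loading and inspecting the dataset with the Hugging Face datasets library (the printed fields follow the standard SQuAD v2 schema):

from datasets import load_dataset

squad_v2 = load_dataset("squad_v2")
print(squad_v2["validation"].num_rows)  # 11873 validation examples

sample = squad_v2["train"][0]
print(sample.keys())  # 'id', 'title', 'context', 'question', 'answers'
# Unanswerable questions have empty 'text' and 'answer_start' lists in 'answers'.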

The original RoBERTa (Robustly Optimized BERT Pretraining Approach) model was introduced in this paper (https://arxiv.org/abs/1907.11692) and its accompanying repository.

Demonstration space

Try out inference on this model using this app

Overview

Pretrained model: roberta-base
Language: English
Downstream task: Extractive QA
Training data: SQuAD v2 train split
Eval data: SQuAD v2 validation split

How to Get Started with the Model

Initializing pipeline:

from transformers import pipeline

repo_id = "etweedy/roberta-base-squad-v2"

# handle_impossible_answer=True lets the pipeline return an empty answer
# when the question cannot be answered from the context (SQuAD v2 behavior).
QA_pipeline = pipeline(
    task='question-answering',
    model=repo_id,
    tokenizer=repo_id,
    handle_impossible_answer=True,
)

Inference:

QA_input = {
    'question': 'Who invented Twinkies?',
    'context': 'Twinkies were invented on April 6, 1930, by Canadian-born baker James Alexander Dewar for the Continental Baking Company in Schiller Park, Illinois.'
}
response = QA_pipeline(**QA_input)
print(response)  # dict with 'score', 'start', 'end', and 'answer' keys
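
Because the pipeline was created with handle_impossible_answer=True, it can also return an empty answer when the context does not contain one. A brief illustration (the question below is made up for demonstration; the exact score will vary):

impossible_input = {
    'question': 'Who invented the telephone?',
    'context': 'Twinkies were invented on April 6, 1930, by Canadian-born baker James Alexander Dewar for the Continental Baking Company in Schiller Park, Illinois.'
}
print(QA_pipeline(**impossible_input))
# If the model judges the question unanswerable, 'answer' is an empty string
# and 'start'/'end' are both 0.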

Training Hyperparameters

batch_size = 16
n_epochs = 3
learning_rate = 3e-5
base_LM_model = [roberta-base](https://huggingface.co/roberta-base)
max_seq_len = 384
stride = 128
lr_schedule = LinearWarmup
warmup_proportion = 0.0
mixed_precision = "fp16"
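
For reference, a minimal sketch of how these hyperparameters might be expressed with the Hugging Face Trainer API (the output directory and the mapping of names below are assumptions, not the exact training script):

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="roberta-base-squad-v2",  # hypothetical output directory
    per_device_train_batch_size=16,      # batch_size
    num_train_epochs=3,                  # n_epochs
    learning_rate=3e-5,
    lr_scheduler_type="linear",          # LinearWarmup schedule
    warmup_ratio=0.0,                    # warmup_proportion
    fp16=True,                           # mixed_precision
)
# max_seq_len=384 and stride=128 apply at tokenization time, e.g.
# tokenizer(questions, contexts, max_length=384, stride=128,
#           truncation="only_second", return_overflowing_tokens=True)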

Evaluation results

The model was evaluated on the validation split of SQuAD v2 and attained the following results:

{"exact": 79.87029394424324,
"f1": 82.91251169582613,
"total": 11873,
"HasAns_exact": 77.93522267206478,
"HasAns_f1": 84.02838248389763,
"HasAns_total": 5928,
"NoAns_exact": 81.79983179142137,
"NoAns_f1": 81.79983179142137,
"NoAns_total": 5945}

BibTeX citation for the base model:

@article{DBLP:journals/corr/abs-1907-11692,
  author    = {Yinhan Liu and
               Myle Ott and
               Naman Goyal and
               Jingfei Du and
               Mandar Joshi and
               Danqi Chen and
               Omer Levy and
               Mike Lewis and
               Luke Zettlemoyer and
               Veselin Stoyanov},
  title     = {RoBERTa: {A} Robustly Optimized {BERT} Pretraining Approach},
  journal   = {CoRR},
  volume    = {abs/1907.11692},
  year      = {2019},
  url       = {http://arxiv.org/abs/1907.11692},
  archivePrefix = {arXiv},
  eprint    = {1907.11692},
  timestamp = {Thu, 01 Aug 2019 08:59:33 +0200},
  biburl    = {https://dblp.org/rec/journals/corr/abs-1907-11692.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}
