Paper-Summarization-ArXiv
This model is a fine-tuned version of google/pegasus-x-base on the arxiv-summarization dataset.
Base Model: Pegasus-x-base (State-of-the-art for Long Context Summarization)
Finetuning Dataset:
- We used the full ArXiv dataset (Cohan et al., 2018, NAACL-HLT 2018)
- (The full dataset contains more than 200,000 documents)
GPU: 1 x RTX A6000
Training time: about 120 hours for 5 epochs
Test time: about 8 hours to generate summaries for the test set
Intended uses & limitations
- Research Paper Summarization
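A minimal inference sketch, assuming the checkpoint loads from the Hugging Face Hub under the ID `UNIST-Eunchan/Research-Paper-Summarization-Pegasus-x-ArXiv` through the generic `AutoTokenizer` and `AutoModelForSeq2SeqLM` classes. The helper names (`generation_options`, `load`, `summarize`) and the `max_input_tokens` default are illustrative assumptions, not settings documented in this card; the default decoding values are borrowed from the first decoding configuration reported in this card.

```python
MODEL_ID = "UNIST-Eunchan/Research-Paper-Summarization-Pegasus-x-ArXiv"

# Decoding defaults taken from the first configuration reported in this card.
DEFAULT_OPTIONS = {
    "num_beams": 2,
    "min_length": 150,
    "max_length": 128 * 4,
    "no_repeat_ngram_size": 3,
    "length_penalty": 1,
}


def generation_options(**overrides):
    """Return a copy of the default decoding options with caller overrides applied."""
    options = dict(DEFAULT_OPTIONS)
    options.update(overrides)
    return options


def load(model_id=MODEL_ID, device=None):
    """Load tokenizer and model. Imports are local so this file imports without torch."""
    import torch
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    device = device or ("cuda" if torch.cuda.is_available() else "cpu")
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id).to(device).eval()
    return tokenizer, model, device


def summarize(text, tokenizer, model, device, max_input_tokens=4096, **overrides):
    """Summarize one paper. max_input_tokens=4096 is an assumed, hypothetical default."""
    inputs = tokenizer(
        text, return_tensors="pt", truncation=True, max_length=max_input_tokens
    )
    output_ids = model.generate(
        input_ids=inputs["input_ids"].to(device),
        attention_mask=inputs["attention_mask"].to(device),
        **generation_options(**overrides),
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Typical use would be `tokenizer, model, device = load()` followed by `summarize(paper_text, tokenizer, model, device)`. Keeping the decoding options in a separate function makes it easy to override a single value, for example `summarize(..., num_beams=4)`.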
Comparison to Baseline
Pegasus-X-base zero-shot Performance:
- R-1 | R-2 | R-L | R-LSUM : 6.2269 | 0.7894 | 4.6905 | 5.4591
This model:
- R-1 | R-2 | R-L | R-LSUM : 43.2305 | 16.6571 | 24.4315 | 33.9399 with
model.generate(input_ids=inputs["input_ids"].to(device), attention_mask=inputs["attention_mask"].to(device), length_penalty=1, num_beams=2, max_length=128*4, min_length=150, no_repeat_ngram_size=3, top_k=25, top_p=0.95)
- R-1 | R-2 | R-L | R-LSUM : 40.8486 | 16.3717 | 25.2937 | 33.6923 (refer to the PEGASUS-X paper) with
model.generate(input_ids=inputs["input_ids"].to(device), attention_mask=inputs["attention_mask"].to(device), length_penalty=1, num_beams=1, max_length=128*2, top_p=1)
- R-1 | R-2 | R-L | R-LSUM : 38.1317 | 15.0357 | 23.0286 | 30.9938 (diverse beam search decoding) with
model.generate(input_ids=inputs["input_ids"].to(device), attention_mask=inputs["attention_mask"].to(device), num_beam_groups=5, diversity_penalty=1.0, num_beams=5, min_length=150, max_length=128*4)
- R-1 | R-2 | R-L | R-LSUM : 43.3017 | 16.6023 | 24.1867 | 33.7019 with
model.generate(input_ids=inputs["input_ids"].to(device), attention_mask=inputs["attention_mask"].to(device), length_penalty=1.2, num_beams=4, max_length=128*4, min_length=150, no_repeat_ngram_size=3, temperature=0.9, top_k=50, top_p=0.92)
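The four decoding configurations listed for this model can be collected into named presets so they are easy to swap and compare. This is only a sketch: the preset names and the `best_preset` helper are hypothetical labels introduced here, while the keyword arguments and ROUGE scores are copied from the list above.

```python
# Keyword arguments for model.generate() and the ROUGE scores reported with them.
# Preset names are illustrative; they do not come from the authors.
PRESETS = {
    "beam2": {
        "kwargs": dict(length_penalty=1, num_beams=2, max_length=128 * 4,
                       min_length=150, no_repeat_ngram_size=3, top_k=25, top_p=0.95),
        "rouge": {"R-1": 43.2305, "R-2": 16.6571, "R-L": 24.4315, "R-LSUM": 33.9399},
    },
    "single_beam_256": {
        "kwargs": dict(length_penalty=1, num_beams=1, max_length=128 * 2, top_p=1),
        "rouge": {"R-1": 40.8486, "R-2": 16.3717, "R-L": 25.2937, "R-LSUM": 33.6923},
    },
    "diverse_beam5": {
        "kwargs": dict(num_beam_groups=5, diversity_penalty=1.0, num_beams=5,
                       min_length=150, max_length=128 * 4),
        "rouge": {"R-1": 38.1317, "R-2": 15.0357, "R-L": 23.0286, "R-LSUM": 30.9938},
    },
    "beam4": {
        "kwargs": dict(length_penalty=1.2, num_beams=4, max_length=128 * 4,
                       min_length=150, no_repeat_ngram_size=3, temperature=0.9,
                       top_k=50, top_p=0.92),
        "rouge": {"R-1": 43.3017, "R-2": 16.6023, "R-L": 24.1867, "R-LSUM": 33.7019},
    },
}


def best_preset(metric):
    """Return the name of the preset with the highest reported score for `metric`."""
    return max(PRESETS, key=lambda name: PRESETS[name]["rouge"][metric])


if __name__ == "__main__":
    for metric in ("R-1", "R-2", "R-L", "R-LSUM"):
        print(metric, "->", best_preset(metric))
```

With a tokenized input in `inputs`, a preset would be applied as `model.generate(**inputs, **PRESETS["beam2"]["kwargs"])`. By the reported numbers, no single configuration wins on every metric, so the choice depends on which ROUGE variant matters for the use case.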
Training procedure
We trained in a Hugging Face environment, using the datasets library and the Trainer API.
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 64
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 1586
- num_epochs: 5
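The arithmetic behind these values can be checked with a short sketch. It assumes a single GPU (as stated in this card), takes the 3172 optimizer steps per epoch from the training results table, and models the schedule as linear warmup followed by linear decay to zero, which is how a `linear` scheduler with warmup is commonly defined. The function name `linear_lr` is a hypothetical helper, not part of any library.

```python
LEARNING_RATE = 1e-05
TRAIN_BATCH_SIZE = 1
GRADIENT_ACCUMULATION_STEPS = 64
NUM_EPOCHS = 5
STEPS_PER_EPOCH = 3172   # optimizer steps per epoch, from the training results table
WARMUP_STEPS = 1586

# One optimizer step sees batch_size * accumulation_steps examples (single GPU assumed).
effective_batch_size = TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS
total_steps = STEPS_PER_EPOCH * NUM_EPOCHS
examples_per_epoch = STEPS_PER_EPOCH * effective_batch_size
warmup_fraction = WARMUP_STEPS / total_steps


def linear_lr(step):
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / WARMUP_STEPS
    remaining = max(0, total_steps - step)
    return LEARNING_RATE * remaining / (total_steps - WARMUP_STEPS)


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size)
    print("total optimizer steps:", total_steps)
    print("examples seen per epoch:", examples_per_epoch)
    print("warmup fraction:", warmup_fraction)
    print("lr at end of warmup:", linear_lr(WARMUP_STEPS))
```

The numbers line up with the rest of the card: 1 x 64 gives the listed total_train_batch_size of 64, 3172 steps of 64 examples cover about 203,000 documents per epoch (consistent with the "200,000+" dataset size), and the 1586 warmup steps amount to half an epoch, or one tenth of the 15860 total steps.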
Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
2.6153 | 1.0 | 3172 | 2.1045 |
2.202 | 2.0 | 6344 | 2.0511 |
2.1547 | 3.0 | 9516 | 2.0282 |
2.132 | 4.0 | 12688 | 2.0164 |
2.1222 | 5.0 | 15860 | 2.0127 |
Framework versions
- Transformers 4.32.1
- Pytorch 2.0.1
- Datasets 2.12.0
- Tokenizers 0.13.2
Model tree for UNIST-Eunchan/Research-Paper-Summarization-Pegasus-x-ArXiv
- Base model: google/pegasus-x-base
Evaluation results
- ROUGE-1 on ccdv/arxiv-summarization (test set, self-reported): 43.230
- ROUGE-2 on ccdv/arxiv-summarization (test set, self-reported): 16.657
- ROUGE-L on ccdv/arxiv-summarization (test set, self-reported): 24.431
- ROUGE-LSum on ccdv/arxiv-summarization (test set, self-reported): 33.940
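For readers unfamiliar with the metric, the sketch below shows what ROUGE-1 and ROUGE-2 measure: the F1 score of n-gram overlap between a reference summary and a generated one. It is a deliberately simplified illustration (lowercasing and whitespace tokenization only, no stemming) and is not the scorer used to produce the numbers above, so its values will not match an official ROUGE implementation exactly. The function names are hypothetical. Scores here are on a 0 to 1 scale, whereas the card reports them multiplied by 100.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of a token list as a multiset."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(reference, candidate, n=1):
    """Simplified ROUGE-N F1 between two strings (whitespace tokens, lowercased)."""
    ref = ngram_counts(reference.lower().split(), n)
    cand = ngram_counts(candidate.lower().split(), n)
    # Multiset intersection: each n-gram counts at most as often as in both texts.
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    candidate = "the cat lay on the mat"
    print("ROUGE-1 F1:", rouge_n_f1(reference, candidate, n=1))
    print("ROUGE-2 F1:", rouge_n_f1(reference, candidate, n=2))
```

ROUGE-L and ROUGE-LSum are based on the longest common subsequence rather than fixed-length n-grams and are not covered by this sketch.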