Functional Interpolation for Relative Positions Improves Long Context Transformers Paper • 2310.04418 • Published Oct 6, 2023 • 4
SPBERT: An Efficient Pre-training BERT on SPARQL Queries for Question Answering over Knowledge Graphs Paper • 2106.09997 • Published Jun 18, 2021 • 2
Neural Machine Translation of Rare Words with Subword Units Paper • 1508.07909 • Published Aug 31, 2015 • 4
A Multimodal Approach to Device-Directed Speech Detection with Large Language Models Paper • 2403.14438 • Published Mar 21, 2024 • 2
Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models Paper • 2403.18814 • Published Mar 27, 2024 • 44
RoBERTa: A Robustly Optimized BERT Pretraining Approach Paper • 1907.11692 • Published Jul 26, 2019 • 7