---
license: mit
---

# BLaIR-roberta-base

BLaIR, short for "**B**ridging **La**nguage and **I**tems for **R**etrieval and **R**ecommendation", is a series of language models pre-trained on the Amazon Reviews 2023 dataset.

BLaIR is grounded on pairs of *(item metadata, language context)*, enabling the models to:
* derive strong item text representations, for both recommendation and retrieval;
* predict the most relevant item given a simple or complex language context.
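
The two capabilities above can be sketched as "embed the texts, then rank items by similarity to the language context". The snippet below is a minimal illustration, not the authors' reference usage. It makes several assumptions: the Hub ID `hyp1231/blair-roberta-base` is a guess that should be checked against this repository's actual name, taking the first-token hidden state as the text embedding is an assumed pooling choice, and `embed_texts`, `rank_items`, and `l2_normalize` are hypothetical helper names introduced here.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def rank_items(context_emb, item_embs):
    """Return item indices sorted by cosine similarity to the context, best first."""
    q = l2_normalize(context_emb)
    scores = [
        sum(a * b for a, b in zip(q, l2_normalize(item)))
        for item in item_embs
    ]
    return sorted(range(len(item_embs)), key=lambda i: scores[i], reverse=True)


def embed_texts(texts, model_id="hyp1231/blair-roberta-base"):
    """Encode texts with the checkpoint and return one vector per text.

    Assumptions: `model_id` is a guessed Hub ID, and first-token pooling
    is an illustrative choice rather than a confirmed recipe.
    """
    # Imported lazily so the ranking helpers work without these packages.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    model.eval()

    inputs = tokenizer(
        texts, padding=True, truncation=True, max_length=512, return_tensors="pt"
    )
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state
    return hidden[:, 0].tolist()  # first-token vector for each input text
```

With the model available, a call such as `rank_items(embed_texts([context])[0], embed_texts(item_metadata))` would order the items in `item_metadata` from most to least similar to `context`; both variable names are placeholders.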

[[📑 Paper](https://arxiv.org/abs/2403.03952)] · [[💻 Code](https://github.com/hyp1231/AmazonReviews2023)] · [[🌐 Amazon Reviews 2023 Dataset](https://amazon-reviews-2023.github.io/)] · [[🤗 Huggingface Datasets](https://huggingface.co/datasets/McAuley-Lab/Amazon-Reviews-2023)] · [[🔬 McAuley Lab](https://cseweb.ucsd.edu/~jmcauley/)]

## Model Details

- **Language(s) (NLP):** English
- **License:** MIT
- **Finetuned from model:** [roberta-base](https://huggingface.co/FacebookAI/roberta-base)
- **Repository:** [https://github.com/hyp1231/AmazonReviews2023](https://github.com/hyp1231/AmazonReviews2023)
- **Paper:** [https://arxiv.org/abs/2403.03952](https://arxiv.org/abs/2403.03952)

## Citation

If you find the Amazon Reviews 2023 dataset, the BLaIR checkpoints, the Amazon-C4 dataset, or our scripts/code helpful, please cite the following paper.

```bibtex
@article{hou2024bridging,
  title={Bridging Language and Items for Retrieval and Recommendation},
  author={Hou, Yupeng and Li, Jiacheng and He, Zhankui and Yan, An and Chen, Xiusi and McAuley, Julian},
  journal={arXiv preprint arXiv:2403.03952},
  year={2024}
}
```

## Contact

Please let us know if you encounter a bug or have any suggestions or questions by [filing an issue](https://github.com/hyp1231/AmazonReviews2023/issues/new) or emailing Yupeng Hou ([@hyp1231](https://github.com/hyp1231)) at [yphou@ucsd.edu](mailto:yphou@ucsd.edu).