---
language:
- en
- ru
license: mit
datasets:
- sberquad
- UCLNLP/adversarial_qa
metrics:
- rouge
pipeline_tag: text2text-generation
---

# Model Card for mTk-AdversarialQA_en-SberQuAD_ru-1B

This model is a generative in-context few-shot learner specialized in Russian. It was trained on a combination of the English AdversarialQA and Russian SberQuAD datasets.

You can find detailed information in the [project GitHub repository](https://github.com/fewshot-goes-multilingual/slavic-incontext-learning) and the referenced paper.

## Model Details

### Model Description

- **Developed by:** Michal Stefanik & Marek Kadlcik, Masaryk University
- **Model type:** mt5
- **Language(s) (NLP):** en,ru
- **License:** MIT
- **Finetuned from model:** google/mt5-large

### Model Sources

- **Repository:** https://github.com/fewshot-goes-multilingual/slavic-incontext-learning
- **Paper:** https://arxiv.org/abs/2304.01922

## Uses

This model is intended to be used in a few-shot in-context learning format, either in the target language (Russian) or in the source language (English, see below). It was evaluated on learning unseen tasks in Russian with k=3 demonstrations; see the referenced paper for details.

### How to Get Started with the Model

Use the code below to get started with the model.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model = AutoModelForSeq2SeqLM.from_pretrained("{this model path}")
tokenizer = AutoTokenizer.from_pretrained("{this model path}")

# For Russian few-shot prompts, use the keywords "Вопрос", "Контекст" and "Отвечать" instead
input_text = """
Question: What is the customer's name?
Context: Origin: Barrack Obama, Customer id: Bill Moe.
Answer: Bill Moe,
Question: What is the customer's name?
Context: Customer id: Barrack Obama, if not deliverable, return to Bill Clinton.
Answer:
"""
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs)

print("Answer:")
# generate() returns a batch of token-id sequences; decode the first (and only) one
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Training Details

To reproduce the training of this model, run `pip install -r requirements.txt && python train_mt5_qa_en_AQA+ru_info.py`. See the referenced script for hyperparameters and other training configurations.

## Citation

If you use our models or other resources in your research, please cite our work as follows.

**BibTeX:**

```bib
@inproceedings{stefanik2023resources,
  author = {\v{S}tef\'{a}nik, Michal and Kadlčík, Marek and Gramacki, Piotr and Sojka, Petr},
  title = {Resources and Few-shot Learners for In-context Learning in Slavic Languages},
  booktitle = {Proceedings of the 9th Workshop on Slavic Natural Language Processing},
  publisher = {ACL},
  numpages = {9},
  url = {https://arxiv.org/abs/2304.01922},
}
```
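
As a supplement to the getting-started snippet, the few-shot prompt format can be assembled programmatically. The following is a minimal sketch, not part of the original repository: the helper name `build_prompt` and the `KEYWORDS` table are hypothetical, and the trailing comma after each demonstration answer is an assumption that simply mirrors the example prompt in this card.

```python
from typing import Sequence, Tuple

# Keyword sets taken from this model card's example; the dict itself is illustrative.
KEYWORDS = {
    "en": ("Question", "Context", "Answer"),
    "ru": ("Вопрос", "Контекст", "Отвечать"),
}


def build_prompt(
    demonstrations: Sequence[Tuple[str, str, str]],
    question: str,
    context: str,
    lang: str = "ru",
) -> str:
    """Build a few-shot prompt from (question, context, answer) demonstrations.

    Hypothetical helper: the layout mirrors the card's example prompt,
    including the comma after each demonstration answer (an assumption).
    """
    q_kw, c_kw, a_kw = KEYWORDS[lang]
    lines = []
    for demo_question, demo_context, demo_answer in demonstrations:
        lines.append(f"{q_kw}: {demo_question}")
        lines.append(f"{c_kw}: {demo_context}")
        lines.append(f"{a_kw}: {demo_answer},")
    # The final block leaves the answer empty for the model to complete.
    lines.append(f"{q_kw}: {question}")
    lines.append(f"{c_kw}: {context}")
    lines.append(f"{a_kw}:")
    return "\n" + "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Reproduces the English prompt from the getting-started snippet.
    prompt = build_prompt(
        [("What is the customer's name?",
          "Origin: Barrack Obama, Customer id: Bill Moe.",
          "Bill Moe")],
        "What is the customer's name?",
        "Customer id: Barrack Obama, if not deliverable, return to Bill Clinton.",
        lang="en",
    )
    print(prompt)
```

The returned string can be passed to `tokenizer(...)` in place of `input_text`. Passing three demonstrations with `lang="ru"` matches the k=3 Russian setting mentioned under Uses.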