---
license: apache-2.0
pipeline_tag: text-generation
tags:
- multilingual
- PyTorch
- Transformers
- gpt3
- gpt2
- Deepspeed
- Megatron
- mGPT
datasets:
- mc4
- Wikipedia
widget:
- text: "Ich weiß, dass du müde bist, aber können wir heute Abend noch einen Spaziergang machen? peter szemraj: ich"
example_title: "walk - Deutsch"
- text: "peter szemraj: 我喜欢穿很酷的衣服"
example_title: "fashion - Chinese"
- text: "Wat zei je over mijn moeder? peter szemraj: ik"
example_title: "🚎 - Dutch"
- text: "Zagadka: Człowiekowi, który przebywał na dworze w deszczu bez parasola czy kapelusza, nie zmoczył się ani jeden włos na głowie. Dlaczego? peter szemraj: czy to"
example_title: "brain teaser - Polish"
- text: "Minha amiga diz que conhece todas as línguas, mas não fala nenhuma delas... o que há de errado com ela? peter szemraj: eu"
example_title: "language - Portuguese"
- text: "se potesse vivere ovunque, dove sarebbe? peter szemraj: io"
example_title: "dream living place - Italian"
- text: "Can you take me for dinner somewhere nice this time? peter szemraj:"
example_title: "dinner"
- text: "What really makes you angry? peter szemraj:"
example_title: "pet peeve"
- text: "Jak nazwać aligatora, który właśnie przeszedł operację usunięcia lewego ramienia?peter szemraj: ja"
example_title: "alligator - Polish"
- text: "Warum sind Transformers für die Sprachmodellierung wichtig? peter szemraj: es ist"
example_title: "Transformers - German"
- text: "как написать хорошие подсказки для языковых моделей? peter szemraj: сначала вам нужно"
example_title: "prompt tutorial - Russian"
- text: "Pewien mężczyzna wpycha swój samochód do hotelu i mówi właścicielowi, że jest bankrutem. Dlaczego? peter szemraj: może"
example_title: "brain teaser - Polish 2"
- text: "Zagadka: Mówię bez ust i słyszę bez uszu. Nie mam ciała, ale ożywiam się wraz z wiatrem. Czym jestem? peter szemraj: czy to"
example_title: "brain teaser - Polish 3"
- text: "Què t'agrada fer per divertir-te? peter szemraj: m'agrada"
example_title: "hobbies - Catalan"
- text: "为什么你总是那么累?peter szemraj: 呃,我想"
example_title: "tired - Chinese"
inference:
parameters:
min_length: 2
max_length: 64
do_sample: true
top_k: 10
top_p: 0.9
temperature: 0.65
repetition_penalty: 3.5
no_repeat_ngram_size: 3
length_penalty: 0.4
pad_token: 1
---
# mGPT: fine-tune on message data - 2E
- This model is a fine-tuned version of [sberbank-ai/mGPT](https://huggingface.co/sberbank-ai/mGPT) on 80k messages. It builds on the minimum-working-example checkpoint [here](https://huggingface.co/pszemraj/mGPT-Peter-mwe).
- 2E = 2 epochs
## Model description
- Tests whether personality traits learned from fine-tuning on message data carry over to other languages that were not explicitly part of the training data.
**Interesting findings thus far:**
- Passing a generic word in the question's language immediately after the `<name-identifier>` helps ensure the model responds in that language (see: any example above).
- Model generations generally remain semantically consistent, even when they switch from `<language>` to English partway through the generated text. This suggests some form of "universal concept understanding".
### Usage in python
Install the `transformers` library if you don't already have it:
```
pip install -U transformers
```
Load the model into a `pipeline` object:
```
from transformers import pipeline
import torch

# use a GPU if one is available, otherwise fall back to CPU
device = 'cuda' if torch.cuda.is_available() else 'cpu'

my_chatbot = pipeline(
    'text-generation',
    'pszemraj/mGPT-Peter-2E',
    device=0 if device == 'cuda' else -1,  # pipeline expects a device index; -1 = CPU
)
```
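Then prompt the model with a question followed by the `peter szemraj:` identifier, as in the widget examples above. The sketch below is one way to call it; the generation settings mirror the widget's inference parameters in this card's metadata, and the prompt itself is just an illustration:
```
prompt = "What really makes you angry? peter szemraj:"
result = my_chatbot(
    prompt,
    min_length=2,
    max_length=64,
    do_sample=True,
    top_k=10,
    top_p=0.9,
    temperature=0.65,
    repetition_penalty=3.5,
    no_repeat_ngram_size=3,
)
print(result[0]['generated_text'])
```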
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a sketch of how they map onto `transformers.TrainingArguments` follows the list):
- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- distributed_type: multi-GPU
- gradient_accumulation_steps: 8
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine_with_restarts
- lr_scheduler_warmup_ratio: 0.05
- num_epochs: 1 (in addition to all training on prior checkpoints)
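As a rough guide, here is how these settings would look as `transformers.TrainingArguments`. This is a hedged reconstruction, not the actual training script; `output_dir` is a hypothetical placeholder:
```
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="mGPT-Peter-2E",  # hypothetical placeholder, not the real path
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=8,  # 4 per device x 8 steps -> total batch of 32 across GPUs
    lr_scheduler_type="cosine_with_restarts",
    warmup_ratio=0.05,
    num_train_epochs=1,
)
```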
### Framework versions
- Transformers 4.18.0
- Pytorch 1.11.0+cu113
- Datasets 2.1.0
- Tokenizers 0.12.1