---
language:
  - en
  - ru
license: apache-2.0
tags:
- gpt
- NLG

---
# HuYaLM 100B

**Hu**gging Face **YaLM 100B** (by [BlackSamorez](https://github.com/BlackSamorez)) is a _transformers_-compatible implementation of the **YaLM 100B** model, originally trained by Yandex for 65 days on a cluster of 800 A100 GPUs, using 1.7 TB of online texts, books, and countless other sources in both English and Russian.

This implementation was motivated by the fact that the model was originally published with outdated code, incompatible with the latest advances in the field. Being compatible with _transformers_, this code should automatically support much-needed features such as [quantization](https://huggingface.co/docs/transformers/main_classes/quantization) and [adapter training](https://huggingface.co/docs/peft/index).
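
Because the implementation follows the standard _transformers_ causal-LM API, loading and using it should look like any other hub model. Below is a minimal, hedged sketch: the hub id `BlackSamorez/yalm-100b-hf`, the `trust_remote_code` flag, and the LoRA `target_modules` name are assumptions for illustration, so check this repository's files for the exact identifiers.

```python
# Minimal sketch, assuming a standard transformers causal-LM interface.
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "BlackSamorez/yalm-100b-hf"  # hypothetical hub id; use this repo's actual id

tokenizer = AutoTokenizer.from_pretrained(model_id)

# 8-bit quantization through bitsandbytes, as exposed by transformers.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
    trust_remote_code=True,  # assumption: may be needed for a custom implementation
)

prompt = "The meaning of life is"  # the model is bilingual, so Russian prompts work too
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Adapter training would then follow the usual PEFT pattern; the hyperparameters below are illustrative, not recommendations:

```python
from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    task_type="CAUSAL_LM",
    target_modules=["query_key_value"],  # hypothetical: must match this model's layer names
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the LoRA adapter weights are trainable
```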

Training details and best practices for acceleration and stabilization can be found in the **[Medium](https://medium.com/p/d1df53d0e9a6)** (English) and **[Habr](https://habr.com/ru/company/yandex/blog/672396/)** (Russian) articles. The original code published by Yandex is available on [GitHub](https://github.com/yandex/YaLM-100B).

This code, like the model itself, is published under the [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) license, which permits commercial use.