---
language:
- en
- ru
license: apache-2.0
tags:
- gpt
- NLG
---
# HuYaLM 100B
Hugging Face YaLM 100B (by BlackSamorez) is a transformers-compatible implementation of the YaLM 100B model, originally trained by Yandex for 65 days on a cluster of 800 A100 graphics cards, using 1.7 TB of online texts, books, and countless other sources in both English and Russian.
This particular implementation was motivated by the fact that the model was originally published with outdated code, incompatible with the latest advances in the field. Because this implementation is compatible with transformers, it should automatically support much-needed features such as quantization and adapter training.
Training details and best practices for acceleration and stabilization can be found in articles on Medium (English) and Habr (Russian). The original code published by Yandex is available on GitHub.
This code, as well as the model itself, is published under the Apache 2.0 license, which permits commercial use.