---
language:
- en
- ru
license: apache-2.0
tags:
- gpt
- NLG
---
# HuYaLM 100B

**Hu**gging Face **YaLM 100B** (by [BlackSamorez](https://github.com/BlackSamorez)) is a **transformers-compatible** implementation of the **YaLM 100B** model. Yandex originally trained it on a cluster of 800 A100 GPUs, using 1.7 TB of diverse text data, including online texts and books, in both English and Russian.

The motivation behind this implementation is to update the originally published, now-outdated code to align with the latest advancements in the field. Because the code is compatible with the _transformers_ library, it natively supports features such as [quantization](https://huggingface.co/docs/transformers/main_classes/quantization) (to reduce the model's memory footprint) and [adapter training](https://huggingface.co/docs/peft/index) (for parameter-efficient fine-tuning); a usage sketch follows below.
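
As a rough illustration of what _transformers_ compatibility enables, here is a minimal sketch that loads the checkpoint in 8-bit and attaches a LoRA adapter with [PEFT](https://huggingface.co/docs/peft/index). The repo id, the `trust_remote_code` flag, and the LoRA `target_modules` below are assumptions, not verified against this checkpoint; adjust them as needed.

```python
# A minimal sketch, not a verified recipe: the repo id, trust_remote_code,
# and target_modules below are assumptions about this checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "BlackSamorez/yalm-100b"  # hypothetical repo id; use this model's actual Hub id

tokenizer = AutoTokenizer.from_pretrained(model_id)

# Load in 8-bit to roughly halve the weights' memory footprint vs. fp16;
# trust_remote_code is assumed to be required for the custom YaLM model code.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
    trust_remote_code=True,
)

# Quick smoke test: generate a few tokens.
inputs = tokenizer("Once upon a time", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=32)[0]))

# For fine-tuning, train a small LoRA adapter instead of the full 100B weights.
model = prepare_model_for_kbit_training(model)
model = get_peft_model(
    model,
    LoraConfig(
        r=8,
        lora_alpha=16,
        lora_dropout=0.05,
        target_modules=["query_key_value"],  # assumed name; depends on the module layout
        task_type="CAUSAL_LM",
    ),
)
model.print_trainable_parameters()
```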
For more details on training, acceleration, and stabilization techniques, you can refer to articles on **[Medium](https://medium.com/p/d1df53d0e9a6)** (in English) and **[Habr](https://habr.com/ru/company/yandex/blog/672396/)** (in Russian). The original code from Yandex is available on [GitHub](https://github.com/yandex/YaLM-100B).

This code and model are distributed under the [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) license, which allows for commercial use.