llm-course-hw1 / README.md
---
tags:
  - model_hub_mixin
  - pytorch_model_hub_mixin
  - gpt-like
  - russian
  - jokes
language:
  - ru
metrics:
  - perplexity
model_type: transformer
library_name: pytorch
---

# Russian Jokes GPT-small

*LLM-course HW1*

## Model

This model generates short jokes in Russian. It is a GPT-like transformer that uses ALiBi for positional encoding and GQA (grouped-query attention) in the attention layers.
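To make the ALiBi mechanism concrete, here is a minimal sketch of how the per-head attention biases are built. This is an illustration of the general technique, not the code of this repository; the function name and shapes are assumptions.

```python
import torch


def alibi_bias(num_heads: int, seq_len: int) -> torch.Tensor:
    """Build the ALiBi bias tensor added to raw attention scores.

    Slopes form a geometric sequence starting at 2^(-8/num_heads),
    as in the ALiBi paper; the causal mask is applied separately.
    """
    slopes = torch.tensor(
        [2.0 ** (-8.0 * (i + 1) / num_heads) for i in range(num_heads)]
    )
    pos = torch.arange(seq_len)
    # distances[i, j] = j - i, so past positions (j <= i) get a non-positive bias
    distances = pos[None, :] - pos[:, None]
    # Result: (num_heads, seq_len, seq_len), broadcastable over the batch
    return slopes[:, None, None] * distances[None, :, :]
```

Because the bias depends only on relative distance, it lets the model extrapolate to sequences longer than those seen during training.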

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("your_username/my-ru-joke-small")
tokenizer = AutoTokenizer.from_pretrained("your_username/my-ru-joke-small")

prompt = "Встретились два экономиста,"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

# temperature and top_k only take effect when sampling is enabled
generated_ids = model.generate(
    input_ids,
    max_new_tokens=40,
    do_sample=True,
    temperature=0.9,
    top_k=40,
)

print(tokenizer.decode(generated_ids[0], skip_special_tokens=True))
```
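The card lists perplexity as the evaluation metric. A minimal sketch of computing it with a loaded causal LM, assuming the model returns a standard Hugging Face output with a `.loss` field when `labels` are passed:

```python
import torch


def perplexity(model, tokenizer, text: str) -> float:
    """Perplexity = exp(mean token-level cross-entropy)."""
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        # Passing labels makes a causal LM compute the shifted CE loss itself
        out = model(enc.input_ids, labels=enc.input_ids)
    return torch.exp(out.loss).item()
```

Lower is better; for a held-out corpus you would average the loss over all texts before exponentiating rather than averaging per-text perplexities.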

This model has been pushed to the Hub using the [PytorchModelHubMixin](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) integration:
- Code: [More Information Needed]
- Paper: [More Information Needed]
- Docs: [More Information Needed]
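For reference, the mixin integration mentioned above follows this general pattern; the class and field names below are illustrative placeholders, not this repository's actual model code.

```python
import torch
import torch.nn as nn
from huggingface_hub import PyTorchModelHubMixin


class JokeGPT(nn.Module, PyTorchModelHubMixin):
    """Toy model showing the mixin pattern; not the real architecture."""

    def __init__(self, vocab_size: int = 50000, dim: int = 512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.head = nn.Linear(dim, vocab_size)

    def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
        return self.head(self.embed(input_ids))


# Saving, loading, and pushing come for free from the mixin:
# model = JokeGPT()
# model.save_pretrained("my-ru-joke-small")
# model.push_to_hub("your_username/my-ru-joke-small")
# model = JokeGPT.from_pretrained("your_username/my-ru-joke-small")
```

The mixin serializes the `__init__` keyword arguments as the config, which is why they are plain JSON-friendly values here.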