	---
	license: gemma
	library_name: transformers
	pipeline_tag: text-generation
	extra_gated_heading: Access Gemma on Hugging Face
	extra_gated_prompt: To access Gemma on Hugging Face, you’re required to review and
	  agree to Google’s usage license. To do this, please ensure you’re logged in to Hugging
	  Face and click below. Requests are processed immediately.
	extra_gated_button_content: Acknowledge license
	tags:
	- conversational
	- mlx
	- mlx-my-repo
	base_model: rtzr/ko-gemma-2-9b-it
	language:
	- ko
	---

	# KYUNGYONG/ko-gemma-2-9b-it-4bit

	The model [KYUNGYONG/ko-gemma-2-9b-it-4bit](https://huggingface.co/KYUNGYONG/ko-gemma-2-9b-it-4bit) was converted to MLX format from [rtzr/ko-gemma-2-9b-it](https://huggingface.co/rtzr/ko-gemma-2-9b-it) using mlx-lm version 0.21.5.

	## Use with mlx

	```bash
	pip install mlx-lm
	```

	```python
	from mlx_lm import load, generate

	model, tokenizer = load("KYUNGYONG/ko-gemma-2-9b-it-4bit")

	prompt = "hello"

	# Wrap the prompt in the model's chat template when the tokenizer provides one
	if hasattr(tokenizer, "apply_chat_template") and tokenizer.chat_template is not None:
	    messages = [{"role": "user", "content": prompt}]
	    prompt = tokenizer.apply_chat_template(
	        messages, tokenize=False, add_generation_prompt=True
	    )

	# verbose=True prints the generated text as it is produced
	response = generate(model, tokenizer, prompt=prompt, verbose=True)
	```
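
To make the chat-template step above easier to inspect, the sketch below rebuilds the prompt string by hand. It assumes the tokenizer carries the standard Gemma 2 chat template inherited from the base model, which this README does not confirm; the template stored with the tokenizer remains authoritative. `format_gemma_prompt` is a hypothetical helper written for illustration and is not part of mlx-lm.

```python
# Hypothetical helper (not part of mlx-lm): mimics the Gemma 2 turn format
# so the string produced by the chat template can be inspected without a model.
def format_gemma_prompt(messages, add_generation_prompt=True, bos_token="<bos>"):
    # Assumption: like the standard Gemma 2 template, no system role is supported
    if messages and messages[0]["role"] == "system":
        raise ValueError("System role not supported")

    parts = [bos_token]
    for message in messages:
        # Gemma names the assistant role "model"
        role = "model" if message["role"] == "assistant" else message["role"]
        content = message["content"].strip()
        parts.append(f"<start_of_turn>{role}\n{content}<end_of_turn>\n")

    if add_generation_prompt:
        # Open a model turn so generation continues as the assistant
        parts.append("<start_of_turn>model\n")
    return "".join(parts)


messages = [{"role": "user", "content": "hello"}]
prompt = format_gemma_prompt(messages)
print(repr(prompt))
```

In practice, prefer `tokenizer.apply_chat_template` as shown earlier; the helper is only useful for checking that a prompt has the expected turn markers.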