suriya7 committed
Commit c0f6805 · verified · 1 Parent(s): 9ee9713

Update README.md

Files changed (1): README.md +11 -3
README.md CHANGED
@@ -34,12 +34,17 @@ Gemma-2B Fine-Tuned Python Model is a deep learning model based on the Gemma-2B
 1. **Install Gemma Python Package**:
 ```bash
 pip install -q -U transformers==4.38.0
+pip install torch
 ```
 
 ## Inference
+
 1. **How to use the model in our notebook**:
 ```python
+
+
 # Load model directly
+import torch
 from transformers import AutoTokenizer, AutoModelForCausalLM
 
 tokenizer = AutoTokenizer.from_pretrained("suriya7/Gemma-2B-Finetuned-Python-Model")
@@ -51,12 +56,15 @@ prompt_template = f"""
 <end_of_turn>\n<start_of_turn>model
 """
 prompt = prompt_template
-encodeds = tokenizer(prompt, return_tensors="pt", add_special_tokens=True)
+encodeds = tokenizer(prompt, return_tensors="pt", add_special_tokens=True).input_ids
+
+device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
+model.to(device)
+inputs = encodeds.to(device)
 
-model_inputs = encodeds.to('cuda')
 
 # Increase max_new_tokens if needed
-generated_ids = model.generate(**model_inputs, max_new_tokens=1000, do_sample=False, pad_token_id=tokenizer.eos_token_id)
+generated_ids = model.generate(inputs, max_new_tokens=1000, do_sample=False, pad_token_id=tokenizer.eos_token_id)
 ans = ''
 for i in tokenizer.decode(generated_ids[0], skip_special_tokens=True).split('<end_of_turn>')[:2]:
 ans += i
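
Put together, the inference example in the README after this commit looks roughly like the sketch below. The diff hunks do not show the model-loading line or the body of `prompt_template`, so the `AutoModelForCausalLM.from_pretrained(...)` call and the example user turn are assumptions added for illustration:

```python
# Sketch assembled from the diff; lines marked "assumed" are not shown in the hunks.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("suriya7/Gemma-2B-Finetuned-Python-Model")
# Assumed: the model is loaded from the same repo id (the loading line is outside the diff hunks).
model = AutoModelForCausalLM.from_pretrained("suriya7/Gemma-2B-Finetuned-Python-Model")

# Assumed example query; the diff only shows the tail of prompt_template.
user_query = "Write a Python function that reverses a string."
prompt_template = f"""<start_of_turn>user
{user_query}<end_of_turn>\n<start_of_turn>model
"""
prompt = prompt_template
encodeds = tokenizer(prompt, return_tensors="pt", add_special_tokens=True).input_ids

# Move model and inputs to the GPU if one is available, otherwise stay on CPU.
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model.to(device)
inputs = encodeds.to(device)

# Increase max_new_tokens if needed
generated_ids = model.generate(inputs, max_new_tokens=1000, do_sample=False,
                               pad_token_id=tokenizer.eos_token_id)

ans = ''
for i in tokenizer.decode(generated_ids[0], skip_special_tokens=True).split('<end_of_turn>')[:2]:
    ans += i
print(ans)
```

The practical effect of the commit is that the snippet no longer hard-codes `.to('cuda')`: it moves both the model and the encoded inputs to `cuda:0` only when a GPU is available and falls back to CPU otherwise, with `torch` now an explicit install and import.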