Spaces: joaogante/assisted_generation_demo (Running on Zero)

joaogante (HF staff) committed on Aug 21, 2024

Commit ef976dc (verified) · 1 parent: e08f7f0

Update app.py

Files changed (1)

app.py CHANGED

@@ -21,7 +21,7 @@ def run_generation(user_text, use_assistant, temperature, max_new_tokens):
         do_sample = True
     # Get the model and tokenizer, and tokenize the user text.
-    model_inputs = tokenizer([user_text], return_tensors="pt").to(torch_device)
+    model_inputs = tokenizer([user_text], return_tensors="pt").to(model.device)
     # Start generation on a separate thread, so that we don't block the UI. The text is pulled from the streamer
     # in the main thread. Adds timeout to the streamer to handle exceptions in the generation thread.
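
The commit message does not say why the target device changed. One plausible reading is that a device chosen once at import time (`torch_device`) can drift from where the model actually lives, while `model.device` always follows the model. The sketch below illustrates that pattern using only the standard library. `FakeModel`, `FakeInputs`, and `can_run` are hypothetical stand-ins, not part of app.py or of any library, and the device strings are assumed values.

```python
from dataclasses import dataclass


@dataclass
class FakeModel:
    """Hypothetical stand-in for a loaded model; only tracks its device."""
    device: str


@dataclass
class FakeInputs:
    """Hypothetical stand-in for tokenizer output; only tracks its device."""
    device: str = "cpu"

    def to(self, device: str) -> "FakeInputs":
        # Mirrors the `.to(device)` call in the diff: returns a copy on `device`.
        return FakeInputs(device=device)


def can_run(model: FakeModel, inputs: FakeInputs) -> bool:
    # A forward pass needs the model and its inputs on the same device.
    return model.device == inputs.device


# A device string chosen once at import time (assumed value).
torch_device = "cpu"

# Assume the model later ends up somewhere else, e.g. after being moved to a GPU.
model = FakeModel(device="cuda:0")

stale = FakeInputs().to(torch_device)  # old line: follows the global
fresh = FakeInputs().to(model.device)  # new line: follows the model

print(can_run(model, stale))  # False
print(can_run(model, fresh))  # True
```

Reading the device from the model removes the need to keep a separate global in sync with wherever the model was placed.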