Spaces: Junfeng5/Liquid_demo
Junfeng5 committed 19 days ago

Commit 9eb57f2 (verified) · 1 Parent(s): 5a81375

Update app.py

Files changed (1): app.py

@@ -61,9 +61,9 @@ vqllm = AutoModelForCausalLM.from_pretrained(
     model_id,
     attn_implementation='flash_attention_2',
     torch_dtype=torch.bfloat16,
-    load_in_8bit=True,
-    max_memory={0: "40GiB" },
-    ) # .to("cuda:0")
+    # load_in_8bit=True,
+    # max_memory={0: "40GiB" },
+    ).to("cuda:0")
 stop_flag = False
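
Note: the change above switches between two loading modes by commenting lines in and out. A minimal sketch of keeping both modes behind one flag follows, assuming that is desirable here. `load_options` is a hypothetical helper, not part of app.py. The 8-bit branch skips the explicit device move, which matches the old code, where `.to("cuda:0")` was commented out.

```python
from typing import Any, Dict, Tuple


def load_options(use_8bit: bool) -> Tuple[Dict[str, Any], bool]:
    """Hypothetical helper (not in app.py).

    Returns the extra keyword arguments for from_pretrained and a flag
    saying whether the caller should move the model with .to("cuda:0").
    """
    if use_8bit:
        # Old configuration from the diff: 8-bit weights, capped at 40GiB
        # on GPU 0, with no explicit .to() call afterwards.
        return {"load_in_8bit": True, "max_memory": {0: "40GiB"}}, False
    # New configuration from the diff: no quantization arguments, and the
    # model is moved to the GPU explicitly.
    return {}, True


# Intended use inside app.py (left as comments because it needs a GPU
# and the model weights):
#   extra, move = load_options(use_8bit=False)
#   vqllm = AutoModelForCausalLM.from_pretrained(
#       model_id,
#       attn_implementation='flash_attention_2',
#       torch_dtype=torch.bfloat16,
#       **extra,
#   )
#   if move:
#       vqllm = vqllm.to("cuda:0")
```

With `use_8bit=False` this reproduces the call as it stands after this commit; with `use_8bit=True` it reproduces the previous one.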