2) After you create your account, update your billing and navigate to the deploy page.
3) Select the following:
   - GPU Type: A6000
   - GPU Quantity: 4
   - Category: Creator
   - Image: Jon Durbin
   - Coupon Code: JonDurbin
4) Deploy the VM!
5) Navigate to 'Running Instances' to retrieve instructions to log in to the VM.
6) Once inside the VM, open the terminal and run `volume=$PWD/data`
7) Run `model=jondurbin/airoboros-110b-3.3`
8) Run `sudo docker run --gpus all --shm-size 1g -p 8080:80 -v $volume:/data ghcr.io/huggingface/text-generation-inference:1.3 --model-id $model`
9) The model will take some time to load.
10) Once loaded, the model will be available on port 8080.
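Once the server is up (step 10), it can be queried over HTTP via text-generation-inference's `/generate` endpoint. A minimal sketch; the prompt and generation parameters here are illustrative, not required values:

```shell
# Query the text-generation-inference server started in step 8.
URL="http://localhost:8080/generate"
PAYLOAD='{"inputs": "Write a haiku about GPUs.", "parameters": {"max_new_tokens": 128}}'

# Requires the container from step 8 to be running and the model loaded.
curl -s "$URL" -H 'Content-Type: application/json' -d "$PAYLOAD" \
  || echo "Server not reachable yet - the model may still be loading."
```

TGI also serves `GET /info` with model metadata, which makes a quick readiness check before sending real prompts.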
For assistance with the VM, join the [Massed Compute Discord Server](https://discord.gg/Mj4YMQY3DA).
## Support me

- https://bmc.link/jondurbin