cszhzleo committed on
Commit 1ab7d0c · verified · 1 Parent(s): 6c694ac

Update README.md

  1. README.md +27 -3
README.md CHANGED
@@ -1,3 +1,27 @@
- ---
- license: mit
- ---
+ ---
+ license: mit
+ ---
+
+ run
+ ```
+ docker run -it --name llama-31 --rm \
+ -p 8080:80 \
+ -v /home/ec2-user/models-hf/:/models \
+ -e HF_MODEL_ID=/models/NousResearch/Meta-Llama-3.1-8B-Instruct \
+ -e MAX_INPUT_TOKENS=256 \
+ -e MAX_TOTAL_TOKENS=4096 \
+ -e MAX_BATCH_SIZE=1 \
+ -e LOG_LEVEL="info,text_generation_router=debug,text_generation_launcher=debug" \
+ --device=/dev/neuron0 \
+ neuronx-tgi:latest \
+ --model-id /models/NousResearch/Meta-Llama-3.1-8B-Instruct \
+ --max-batch-size 1 \
+ --max-input-tokens 256 \
+ --max-total-tokens 4096
+
+ ```
+
+ test
+ ```
+ curl 127.0.0.1:8080/generate -X POST -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":20}}' -H 'Content-Type: application/json'
+ ```
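
The `test` step in the README sends a single request and assumes the container is already serving. A minimal sketch of how that step might be wrapped in a small script follows. The function names (`build_payload`, `generate`, `wait_for_server`) and the `BASE_URL` variable are hypothetical and not part of the commit. The `/health` route is assumed to be exposed by the TGI router in the `neuronx-tgi` image, as it is in upstream text-generation-inference; if it is not, polling `/generate` with a short prompt would be an alternative. The payload builder does no JSON escaping, so it only suits prompts without quotes or backslashes.

```shell
# Base URL matches the port mapping (-p 8080:80) used in the README's docker command.
BASE_URL="${BASE_URL:-http://127.0.0.1:8080}"

# Build the JSON body for /generate.
# $1 = prompt (assumed to contain no characters that need JSON escaping)
# $2 = max_new_tokens (integer)
build_payload() {
  printf '{"inputs":"%s","parameters":{"max_new_tokens":%d}}' "$1" "$2"
}

# Send one generation request, mirroring the curl call in the README.
generate() {
  curl -sS "$BASE_URL/generate" \
    -X POST \
    -H 'Content-Type: application/json' \
    -d "$(build_payload "$1" "$2")"
}

# Poll the (assumed) /health route until it answers or the attempts run out.
# $1 = number of attempts, 2 seconds apart (default 30)
wait_for_server() {
  attempts="${1:-30}"
  while [ "$attempts" -gt 0 ]; do
    if curl -sf -o /dev/null "$BASE_URL/health"; then
      return 0
    fi
    attempts=$((attempts - 1))
    sleep 2
  done
  return 1
}
```

With the container from the `run` step started, something like `wait_for_server 60 && generate "What is Deep Learning?" 20` would reproduce the README's test once the server is up. Because the container is launched with `--max-input-tokens 256` and `--max-total-tokens 4096`, prompts should stay within 256 tokens, and the prompt length plus `max_new_tokens` should not exceed 4096.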