LagPixelLOL
v2ray
AI & ML interests
Looking for compute sponsors; please contact me by email at [email protected]!
Recent Activity
- updated a model about 11 hours ago: v2ray/nai-lora-iewa
- published a model about 13 hours ago: v2ray/nai-lora-iewa
- new activity 1 day ago in cognitivecomputations/DeepSeek-R1-AWQ: "Can't get 48 TPS on 8x H800"
v2ray's activity
- Can't get 48 TPS on 8x H800 (1) · #21 opened 1 day ago by Light4Bear

- gpt-4chan Neo-J (1) · #1 opened 2 days ago by gman402
- Pipeline Parallelism (1) · #20 opened 2 days ago by leo98xh
- 8*A100 out of memory (1) · #19 opened 2 days ago by Jaren
- Requests get stuck when sending long prompts (already solved, but still don't know why) (1) · #18 opened 2 days ago by uv0xab
- Significant Speed Drop with Increasing Input Length on H800 GPUs (2) · #17 opened 3 days ago by wangkkk956
- Docker start with vLLM failed (official vLLM Docker image 0.7.3) (1) · #7 opened 3 days ago by kuliev-vitaly
- When I use vLLM v0.7.2 to deploy R1 AWQ, I get empty content (13) · #10 opened 11 days ago by bupalinyu
- Why "MLA is not supported with awq_marlin quantization. Disabling MLA." with 32x 4090 (4 nodes, vLLM 0.7.2) (3) · #14 opened 4 days ago by FightLLM
- When I run the command, it does not work (via vLLM 0.7.3) (2) · #16 opened 3 days ago by xueshuai
- Skips the thinking process (11) · #5 opened 16 days ago by muzizon
- Can anyone run this model with the SGLang framework? (2) · #13 opened 4 days ago by muziyongshixin
- Any threshold recommendations for this model? (3) · #1 opened 9 days ago by narugo

- GPTQ Support (2) · #1 opened about 2 months ago by warlock-edward
- vLLM support for A100 (17) · #2 opened about 1 month ago by HuggingLianWang
- Code used to convert this / could you do V3 base? (1) · #3 opened 30 days ago by deltanym

- What calibration dataset do you use when applying AWQ? (2) · #5 opened 11 days ago by HandH1998
- Deployment framework (27) · #2 opened about 1 month ago by xro7
- MLA is not supported with moe_wna16 quantization. Disabling MLA. (5) · #7 opened 12 days ago by AMOSE