ValueError: gemm_fp8_nt_groupwise is only supported on SM100, SM103 in trtllm backend.
#3
by Fernanda24 - opened
In sglang, other DSv32 quants work, such as the AWQ one from QuantTrio. I believe they fall back to the tilelang reference kernels provided by deepseek-ai. vLLM has no sm120 (RTX 5090, 6000, etc.) fallback, while sglang has the NSA backend, which falls back to tilelang. I can't get this one to work, though:
ValueError: gemm_fp8_nt_groupwise is only supported on SM100, SM103 in trtllm backend.
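For anyone checking whether their card is affected, here is a minimal sketch of the capability check implied by the error message. The helper names (`sm_version`, `is_supported`, `local_sm`) and the `SUPPORTED_SM` set are my own, taken only from the wording of the error and not from the library's source; the only real API used is `torch.cuda.get_device_capability()`.

```python
# Hypothetical helper: mirrors the condition described in the error text,
# not the library's actual implementation.
SUPPORTED_SM = {100, 103}  # SM versions named in the ValueError


def sm_version(major, minor):
    """Combine a (major, minor) compute capability into an SM number."""
    return major * 10 + minor


def is_supported(major, minor):
    """Return True if the SM version is one the error message lists."""
    return sm_version(major, minor) in SUPPORTED_SM


def local_sm():
    """Return the SM version of the current GPU, or None if unavailable."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    major, minor = torch.cuda.get_device_capability()
    return sm_version(major, minor)


if __name__ == "__main__":
    sm = local_sm()
    if sm is None:
        print("No CUDA device (or PyTorch) found")
    elif sm in SUPPORTED_SM:
        print(f"SM{sm}: listed as supported")
    else:
        print(f"SM{sm}: not listed, so this kernel path would be rejected")
```

An RTX 5090 reports compute capability 12.0, so it maps to SM120 and fails this check, which matches the error above.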