hai
cloudyu
AI & ML interests
Looking for a full time job.
Recent Activity
new activity about 1 month ago
Qwen/Qwen3.5-397B-A17B: Thanks, folks, for working through Lunar New Year's Eve
new activity about 1 month ago
inclusionAI/LLaDA2.1-mini: error report to run example
new activity about 1 month ago
Qwen/Qwen3-Coder-Next: "num_experts_per_tok": 10. Was this setting picked by management on a whim?
Organizations
Thanks, folks, for working through Lunar New Year's Eve
❤️ 6
#12 opened about 1 month ago
by
cloudyu
error report to run example
3
#3 opened about 1 month ago
by
cloudyu
"num_experts_per_tok": 10. Was this setting picked by management on a whim?
1
#12 opened about 1 month ago
by
cloudyu
Update README.md
#17 opened 2 months ago
by
cherry0328
Does this model really have to be updated before National Day??
😔👍 114
31
#1 opened 6 months ago
by
luckjone
National Day is a holiday; please give those of us following along some time to rest
👀👍 64
1
#10 opened 6 months ago
by
luckjone
Transformers does not recognize this architecture
6
#6 opened 6 months ago
by
eva20150932-atlascloud
mac studio : loading model vocabulary: unknown pre-tokenizer type: 'grok-2'
#5 opened 6 months ago
by
cloudyu
Could you run the demo yourselves and only publish it once it works?
#8 opened 7 months ago
by
cloudyu
Why is the chat_template mixed with Chinese and English?
👍 2
5
#8 opened 7 months ago
by
Daucloud
Please share how to export Qwen3 to ONNX format, many thanks!
👍 1
2
#1 opened 10 months ago
by
cloudyu
It's challenging for QwQ to generate long codes...
2
#38 opened about 1 year ago
by
DXBTR74
Error when trying this GGUF
👀 1
3
#3 opened about 1 year ago
by
cloudyu
unknown pre-tokenizer type: 'deepseek-r1-qwen'
👍 4
2
#1 opened about 1 year ago
by
Neman
Adding Evaluation Results
#3 opened about 2 years ago
by
leaderboard-pr-bot
Template Prompt
2
#20 opened over 1 year ago
by
sadra
there are 3 "r"s in the playful "strawrberry"?
🧠 2
5
#6 opened over 1 year ago
by
JieYingZhang
Adding Evaluation Results
#16 opened over 1 year ago
by
leaderboard-pr-bot
Not sure which files to download
1
#18 opened over 1 year ago
by
qcnace
Adding Evaluation Results
#1 opened about 2 years ago
by
leaderboard-pr-bot