AI & ML interests
NLP, RLHF, IR
Makrrr/qwen3-8B-reasonmed-finetune-extreme
Text Generation
• 8B • Updated
• 2
Makrrr/qwen2.5-7B-reasonmed-finetune-extreme
Text Generation
• 8B • Updated
• 4
Makrrr/Qwen3-1.7B-GSM8K-GRPO-verl
Reinforcement Learning
• 2B • Updated
• 28
• 3
Makrrr/a2c-PandaReachDense-v3
Reinforcement Learning
• Updated
Reinforcement Learning
• Updated
• 20
Makrrr/ppo-SnowballTarget
Reinforcement Learning
• Updated
• 12
Makrrr/Pixelcopter-PLE-v0
Reinforcement Learning
• Updated
Reinforcement Learning
• Updated
Makrrr/dqn-SpaceInvadersNoFrameskip-v4
Reinforcement Learning
• Updated
• 5
Reinforcement Learning
• Updated
Makrrr/q-FrozenLake-v1-4x4-noSlippery
Reinforcement Learning
• Updated
Reinforcement Learning
• Updated
• 23
Makrrr/ppo-LunarLander-v2
Reinforcement Learning
• Updated
• 1