Hugging Face
David Golchinfar
PRO
DavidGF
65 followers · 47 following
https://vago-solutions.ai
DavidGFar
dgolchin
AI & ML interests
Fine-tuning LLMs; improving the German language understanding and text generation of LLMs
Recent Activity
reacted to anakin87's post with ❤️ about 3 hours ago
A small model that struggled against a random opponent now beats GPT-5-mini at tic-tac-toe.
I took https://huggingface.co/LiquidAI/LFM2-2.6B and trained it through play. 🧑‍🍳
Here's how:
1️⃣ Build a solid RL env with Verifiers (Prime Intellect)
2️⃣ Generate synthetic data: <200 games sampled from GPT-5-mini playing in the env
3️⃣ SFT warm-up to teach format
4️⃣ Group-based RL (CISPO) against opponents making 20-70% random moves
5️⃣ RL again with stronger opponents (0-25% random moves) + 1.25 temperature to push exploration and shake off suboptimal strategies
Done! Beats GPT-5-mini 🏆
---
🎮 Play against the model: https://huggingface.co/spaces/anakin87/LFM2-2.6B-mr-tictactoe
🤗 Model: https://huggingface.co/anakin87/LFM2-2.6B-mr-tictactoe
📚 Walkthrough/course: https://github.com/anakin87/llm-rl-environments-lil-course
🤗 Dataset and checkpoints: https://huggingface.co/collections/anakin87/lfm2-26b-mr-tic-tac-toe
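The "opponents making 20-70% random moves" idea in the post can be illustrated with a small sketch. This is not the post author's code and does not use the Verifiers API: the board encoding (a 9-character string), the function names, and the choice of minimax for the non-random moves are all assumptions made for illustration.

```python
import random
from functools import lru_cache

# All eight three-in-a-row lines on a board indexed 0..8, row by row.
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]


def winner(board):
    """Return 'X' or 'O' if that player has three in a row, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def legal_moves(board):
    """Indices of the empty cells."""
    return [i for i, cell in enumerate(board) if cell == " "]


@lru_cache(maxsize=None)
def minimax(board, player):
    """Return (score, move) for `player` to move: 1 win, 0 draw, -1 loss."""
    w = winner(board)
    if w is not None:
        return (1 if w == player else -1), None
    moves = legal_moves(board)
    if not moves:
        return 0, None
    other = "O" if player == "X" else "X"
    best_score, best_move = -2, None
    for m in moves:
        child = board[:m] + player + board[m + 1:]
        score = -minimax(child, other)[0]  # opponent's gain is our loss
        if score > best_score:
            best_score, best_move = score, m
    return best_score, best_move


def opponent_move(board, player, random_rate, rng=random):
    """Hypothetical opponent: random legal move with probability
    `random_rate`, otherwise a minimax-optimal move (an assumption;
    the post does not say how the non-random moves are chosen)."""
    if rng.random() < random_rate:
        return rng.choice(legal_moves(board))
    return minimax(board, player)[1]
```

Under these assumptions, a first RL stage might draw `random_rate` per game with `random.uniform(0.20, 0.70)` and a second stage with `random.uniform(0.0, 0.25)`, so the opponent becomes harder to beat as training progresses.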
liked a Space about 3 hours ago: anakin87/LFM2-2.6B-mr-tictactoe
liked a model 2 days ago: lightonai/DenseOn
DavidGF's models: 1
DavidGF/SauerkrautTTS-Preview-0.1-Q8_0-GGUF • 3B • Updated Apr 2, 2025 • 15