---
title: README
emoji: 🐱
colorFrom: green
colorTo: red
sdk: static
pinned: false
---

# 🐱 KodCode: A Diverse, Challenging, and Verifiable Synthetic Dataset for Coding

KodCode is the largest fully synthetic open-source dataset providing verifiable solutions and tests for coding tasks. It contains 12 distinct subsets spanning various domains (from algorithmic to package-specific knowledge) and difficulty levels (from basic coding exercises to interview and competitive programming challenges). KodCode is designed for both supervised fine-tuning (SFT) and reinforcement learning (RL) tuning.

<div align="center">

🕸️ [Project Website](https://kodcode-ai.github.io/) | 📄 [Technical Report](https://arxiv.org/abs/2503.02951) | 💾 [GitHub Repo](https://github.com/KodCode-AI/kodcode) | 🤗 [KodCode-V1 (for RL)](https://huggingface.co/datasets/KodCode/KodCode-V1) | 🤗 [KodCode-V1-SFT-R1 (for SFT)](https://huggingface.co/datasets/KodCode/KodCode-V1-SFT-R1)

</div>
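
As a rough starting point, the two dataset links above could be inspected with the Hugging Face `datasets` library. This is a minimal sketch, not an official loader: the helper names `KODCODE_REPOS`, `repo_for`, and `peek` are hypothetical, the split name `"train"` is an assumption, and only the two repository IDs are taken from the links. Streaming is used so that a quick look does not require downloading the full dataset.

```python
# Hypothetical helper names; only the two repository IDs come from the links above.
KODCODE_REPOS = {
    "rl": "KodCode/KodCode-V1",          # linked as "for RL"
    "sft": "KodCode/KodCode-V1-SFT-R1",  # linked as "for SFT"
}


def repo_for(purpose: str) -> str:
    """Return the Hub repository ID for a training purpose ("rl" or "sft")."""
    key = purpose.strip().lower()
    if key not in KODCODE_REPOS:
        raise ValueError(
            f"unknown purpose {purpose!r}; expected one of {sorted(KODCODE_REPOS)}"
        )
    return KODCODE_REPOS[key]


def peek(purpose: str, n: int = 3, split: str = "train"):
    """Stream the first n records without downloading the whole dataset.

    Assumption: the repository exposes a split named `split`.
    """
    # Imported lazily so the helpers above work without the dependency.
    # Requires `pip install datasets` and network access.
    from datasets import load_dataset

    stream = load_dataset(repo_for(purpose), split=split, streaming=True)
    return list(stream.take(n))


if __name__ == "__main__":
    # Print the field names of a few records rather than assuming a schema.
    for record in peek("rl"):
        print(sorted(record.keys()))
```

Printing the record keys first is a deliberate choice: the column names are not stated here, so it is safer to inspect them than to hard-code any.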