| --- |
| base_model: Qwen/Qwen2.5-32B-Instruct |
| library_name: transformers |
| model_name: step-conditional-control |
| tags: |
| - generated_from_trainer |
| - trl |
| - sft |
| license: apache-2.0 |
| --- |
| |
| # Model Summary |
|
|
| - **Repository:** [simplescaling/s1](https://github.com/simplescaling/s1) |
| - **Paper:** https://arxiv.org/abs/2501.19393 |
|
|
| # Use |
|
|
| This is the token-conditional control model for our paper. You can evaluate using the information [here](https://github.com/simplescaling/s1?tab=readme-ov-file#evaluation). |
|
|
| # Training information |
|
|
| [<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/hashimoto-group/o1/runs/xaantfal) |
|
|
| - TRL: 0.13.0 |
| - Transformers: 4.48.0 |
| - Pytorch: 2.3.1 |
| - Datasets: 3.0.1 |
| - Tokenizers: 0.21.0 |
|
|
| # Citation |
|
|
| ```bibtex |
| @misc{muennighoff2025s1simpletesttimescaling, |
| title={s1: Simple test-time scaling}, |
| author={Niklas Muennighoff and Zitong Yang and Weijia Shi and Xiang Lisa Li and Li Fei-Fei and Hannaneh Hajishirzi and Luke Zettlemoyer and Percy Liang and Emmanuel Candès and Tatsunori Hashimoto}, |
| year={2025}, |
| eprint={2501.19393}, |
| archivePrefix={arXiv}, |
| primaryClass={cs.CL}, |
| url={https://arxiv.org/abs/2501.19393}, |
| } |
| ``` |