HerrHruby/offline_acemath_rl_30b_inst_hard_16k_filtered_with_dishsoap_curr_3_steps_2_samples_step_120 31B • Updated Dec 29, 2025 • 1
HerrHruby/offline_acemath_rl_30b_inst_hard_16k_filtered_with_dishsoap_curr_3_steps_2_samples_step_100 31B • Updated Dec 29, 2025 • 1
HerrHruby/offline_acemath_rl_30b_inst_hard_16k_filtered_with_dishsoap_curr_3_steps_2_samples_step_60 31B • Updated Dec 27, 2025 • 1
HerrHruby/online_acemath_rl_30b_inst_hard_16k_filtered_no_summ_3_steps_2_samples_step_80 31B • Updated Dec 24, 2025 • 1
HerrHruby/online_acemath_rl_30b_inst_hard_16k_no_summ_2_steps_2_samples_step_80_high_ent 31B • Updated Dec 19, 2025 • 1
HerrHruby/offline_acemath_rl_4b_inst_hard_with_dishsoap_16k_no_summ_curr_3_steps_2_samples_step_170 4B • Updated Nov 29, 2025 • 2
HerrHruby/offline_acemath_rl_4b_inst_hard_with_dishsoap_16k_no_summ_curr_3_steps_2_samples_step_120 4B • Updated Nov 28, 2025
HerrHruby/offline_acemath_rl_4b_inst_hard_with_dishsoap_16k_no_summ_curr_omit_step_150 4B • Updated Nov 26, 2025 • 1
HerrHruby/online_acemath_rl_4b_inst_hard_16k_no_summ_3_steps_2_samples_step_120 4B • Updated Nov 24, 2025 • 1
HerrHruby/offline_acemath_rl_4b_inst_hard_with_dishsoap_16k_no_summ_curr_step_180 4B • Updated Nov 23, 2025 • 1
HerrHruby/offline_acemath_rl_4b_inst_hard_with_dishsoap_16k_no_summ_curr_step_60 4B • Updated Nov 22, 2025
HerrHruby/offline_acemath_rl_4b_inst_hard_with_dishsoap_small_16k_thinking_no_summ_curr_step_100 4B • Updated Nov 13, 2025 • 1
HerrHruby/offline_acemath_rl_4b_inst_hard_with_dishsoap_small_16k_thinking_no_summ_curr_step_70 4B • Updated Nov 13, 2025 • 2
HerrHruby/online_acemath_rl_4b_hard_16k_thinking_no_summ_2_steps_2_samples_4b_base_028_clip_step_110 4B • Updated Nov 12, 2025 • 2
HerrHruby/online_acemath_rl_4b_hard_16k_thinking_no_summ_2_steps_2_samples_4b_base_027_clip_step_90 4B • Updated Nov 11, 2025 • 1
HerrHruby/online_pope_rl_800_4_steps_2_samples_omit_initial_thinking_step_2_step_140 4B • Updated Nov 9, 2025
HerrHruby/online_pope_rl_800_4_steps_2_samples_omit_initial_thinking_step_2_step_96 4B • Updated Nov 7, 2025 • 1
HerrHruby/online_acemath_rl_4b_inst_hard_16k_thinking_no_summ_2_steps_2_samples_4b_base_buggy_step_80 4B • Updated Nov 7, 2025 • 2
HerrHruby/online_acemath_rl_4b_inst_hard_with_dishsoap_curr_trained_90_steps_3_steps_2_samples_60_steps 4B • Updated Nov 2, 2025 • 1
HerrHruby/online_acemath_rl_4b_inst_hard_with_dishsoap_curr_trained_90_steps_3_steps_2_samples_20_steps 4B • Updated Nov 2, 2025 • 2
HerrHruby/online_acemath_rl_4b_inst_hard_16k_thinking_no_summ_4_steps_2_samples_step_200 4B • Updated Oct 29, 2025 • 3
HerrHruby/online_acemath_rl_4b_inst_hard_16k_thinking_no_summ_4_steps_2_samples_step_100 4B • Updated Oct 28, 2025 • 1
HerrHruby/offline_acemath_rl_4b_inst_hard_16k_thinking_no_summ_step_120 4B • Updated Oct 26, 2025 • 1
HerrHruby/online_acemath_rl_4b_inst_hard_16k_thinking_vanilla_like_step_90 4B • Updated Oct 21, 2025 • 2
HerrHruby/online_acemath_rl_4b_inst_hard_16k_thinking_no_summ_thinking_step_90 4B • Updated Oct 21, 2025 • 1