YAML Metadata Warning:empty or missing yaml metadata in repo card

Check out the documentation for more information.

RLT Qwen 7B Reasoning Teacher

Reinforcement Learning Teacher model based on Qwen2.5-7B-Instruct-AWQ.

Training Results

  • Training Steps: 30/30 completed
  • Final Loss: -21.89
  • Training Time: 43 minutes

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained('hiroshij/rlt-qwen-7b-reasoning-teacher')
model = AutoModelForCausalLM.from_pretrained('hiroshij/rlt-qwen-7b-reasoning-teacher')
Downloads last month
-
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support