---
base_model: unsloth/Olmo-3-7B-Think
library_name: transformers
license: apache-2.0
language:
- en
- fr
tags:
- olmo
- fine-tuned
- conversational
- distillation
- thinking
- reasoning
datasets:
- TeichAI/glm-4.7-2000x
pipeline_tag: text-generation
---

# olmo-3-DISTILL-glm-4.7-think
|
|
This model is a fine-tuned version of [unsloth/Olmo-3-7B-Think](https://huggingface.co/unsloth/Olmo-3-7B-Think), trained on high-reasoning conversational data distilled from GLM 4.7 by Z.ai.
|
|
## Model Details
|
|
- **Base Model:** unsloth/Olmo-3-7B-Think
- **Fine-tuning Dataset:** TeichAI/glm-4.7-2000x
- **Context Length:** 1,048,576 tokens
- **Special Feature:** emits its reasoning inside `<think>` tags before the final answer
|
|
## Quantized Versions (GGUF)
|
|
**🔗 GGUF versions available here: [olmo-3-DISTILL-glm-4.7-think-GGUF](https://huggingface.co/glogwa68/olmo-3-DISTILL-glm-4.7-think-GGUF)**
|
|
| Format | Size | Use Case |
|--------|------|----------|
| Q2_K | Smallest | Low memory, reduced quality |
| Q4_K_M | Medium | Recommended: best balance of size and quality |
| Q5_K_M | Larger | Higher quality |
| Q8_0 | Large | Near-lossless quality |
| F16 | Largest | Original precision |
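For rough sizing, a quantized file's footprint can be estimated from its bits per weight. A minimal sketch, assuming a nominal 7B parameter count and approximate bits-per-weight figures (actual GGUF sizes vary with architecture and metadata):

```python
# Approximate bits per weight for common GGUF formats (illustrative values).
APPROX_BPW = {"Q2_K": 2.6, "Q4_K_M": 4.8, "Q5_K_M": 5.5, "Q8_0": 8.5, "F16": 16.0}

def estimated_size_gb(params_billions: float, fmt: str) -> float:
    """Estimate file size in GB: parameters * bits-per-weight / 8 bytes."""
    bits = params_billions * 1e9 * APPROX_BPW[fmt]
    return bits / 8 / 1e9

for fmt in APPROX_BPW:
    print(f"{fmt}: ~{estimated_size_gb(7.0, fmt):.1f} GB")
```

This is only a back-of-the-envelope guide; check the file listing on the GGUF repo for exact sizes.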
|
|
## Usage
|
|
### Transformers
|
|
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "glogwa68/olmo-3-DISTILL-glm-4.7-think",
    torch_dtype="auto",   # keep the checkpoint's native precision
    device_map="auto",    # place weights on GPU if available
)
tokenizer = AutoTokenizer.from_pretrained("glogwa68/olmo-3-DISTILL-glm-4.7-think")

messages = [{"role": "user", "content": "Hello, how are you?"}]
inputs = tokenizer.apply_chat_template(
    messages, return_tensors="pt", add_generation_prompt=True
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
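Since the model wraps its reasoning in `<think>` tags, downstream code often wants the trace and the final answer separately. A minimal sketch, assuming a single well-formed `<think>…</think>` block in the decoded output:

```python
import re

def split_reasoning(text: str) -> tuple[str, str]:
    """Separate the <think>...</think> trace from the final answer."""
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if match is None:
        return "", text.strip()  # no reasoning block found
    reasoning = match.group(1).strip()
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer

reasoning, answer = split_reasoning("<think>2+2 is 4.</think>The answer is 4.")
print(answer)  # -> The answer is 4.
```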
|
|
### Ollama (GGUF)
|
|
```bash
ollama run hf.co/glogwa68/olmo-3-DISTILL-glm-4.7-think-GGUF:Q4_K_M
```
|
|
### llama.cpp
|
|
```bash
llama-cli --hf-repo glogwa68/olmo-3-DISTILL-glm-4.7-think-GGUF \
  --hf-file olmo-3-distill-glm-4.7-think-q4_k_m.gguf -p "Hello"
```
|
|
## Training Details
|
|
- **Epochs:** 2
- **Learning Rate:** 2e-5
- **Batch Size:** 8 per device (with gradient accumulation)
- **Precision:** FP16
- **Hardware:** multi-GPU with DeepSpeed ZeRO-3
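Under gradient accumulation, the effective batch size is the per-device batch multiplied by the accumulation steps and the number of GPUs. The card states only the per-device batch of 8, so the accumulation steps and GPU count below are purely illustrative:

```python
def effective_batch_size(per_device: int, accum_steps: int, num_gpus: int) -> int:
    """Samples consumed per optimizer step under gradient accumulation."""
    return per_device * accum_steps * num_gpus

# Hypothetical values for illustration; only per_device=8 comes from this card.
print(effective_batch_size(per_device=8, accum_steps=4, num_gpus=2))  # -> 64
```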
|
|
## License
|
|
This model is released under the Apache 2.0 license.
|
|