chatOP β SmolLM2-135M Fine-tuned
A fine-tuned version of SmolLM2-135M-Instruct trained to act as a friendly study assistant for CS and ML concepts.
Model Details
| Base model | HuggingFaceTB/SmolLM2-135M-Instruct |
| Model type | Causal Language Model |
| Fine-tuning method | SFT + LoRA |
| Language | English |
| License | Apache 2.0 |
Training Details
Dataset
- Name: HuggingFaceTB/smoltalk
- Split used:
all - Format: Conversational chat messages (system, user, assistant)
Hyperparameters
| Parameter | Value |
|---|---|
| Learning rate | 2e-4 |
| Training steps | 200 |
| Batch size | 2 |
| Packing | True |
| Optimizer | AdamW |
| LR scheduler | Linear |
| Precision | fp16 |
LoRA Config
| Parameter | Value |
|---|---|
Rank (r) |
16 |
Alpha (lora_alpha) |
16 |
| Dropout | 0.05 |
| Target modules | all-linear |
| Bias | none |
| Trainable parameters | ~1.7M / 135M (1.26%) |
Hardware
- GPU: NVIDIA Tesla T4 (Google Colab)
- Training time: ~15β20 minutes
Example Outputs
Prompt: What is machine learning?
Machine learning is a field of AI where systems learn from data to make predictions or decisions without being explicitly programmed for each task.
Prompt: Explain transformers in simple terms.
A Transformer is a neural network architecture that uses attention mechanisms to understand relationships between all words in a sentence simultaneously, rather than reading them one by one like older models did.
Prompt: What is the difference between LoRA and full fine-tuning?
Full fine-tuning updates all parameters of a model which requires a lot of memory. LoRA instead freezes the base model and adds small trainable matrices to each layer, training only ~1% of parameters while achieving similar results.
Limitations
- Small model (135M parameters) β responses may be less detailed than larger models
- Trained for only 200 steps β longer training would improve quality
- Best for short explanations β may struggle with very complex multi-step reasoning
- Not suitable for factual lookups β may hallucinate specific facts or numbers
Training Framework
Built as a learning project while studying the HuggingFace LLM Course β specifically Chapter 11 (Fine-tuning LLMs).
Libraries used:
- π€ Transformers
- π€ PEFT
- TRL (SFTTrainer)
- π€ Datasets
- Accelerate
Author
Made by puravky β undergrad student exploring ML and AI.
- Downloads last month
- 71
Model tree for puravky/chatOP
Base model
HuggingFaceTB/SmolLM2-135M