chatOP β€” SmolLM2-135M Fine-tuned

A fine-tuned version of SmolLM2-135M-Instruct trained to act as a friendly study assistant for CS and ML concepts.

Model Details

Base model HuggingFaceTB/SmolLM2-135M-Instruct
Model type Causal Language Model
Fine-tuning method SFT + LoRA
Language English
License Apache 2.0

Training Details

Dataset

Hyperparameters

Parameter Value
Learning rate 2e-4
Training steps 200
Batch size 2
Packing True
Optimizer AdamW
LR scheduler Linear
Precision fp16

LoRA Config

Parameter Value
Rank (r) 16
Alpha (lora_alpha) 16
Dropout 0.05
Target modules all-linear
Bias none
Trainable parameters ~1.7M / 135M (1.26%)

Hardware

  • GPU: NVIDIA Tesla T4 (Google Colab)
  • Training time: ~15–20 minutes

Example Outputs

Prompt: What is machine learning?

Machine learning is a field of AI where systems learn from data to make predictions or decisions without being explicitly programmed for each task.

Prompt: Explain transformers in simple terms.

A Transformer is a neural network architecture that uses attention mechanisms to understand relationships between all words in a sentence simultaneously, rather than reading them one by one like older models did.

Prompt: What is the difference between LoRA and full fine-tuning?

Full fine-tuning updates all parameters of a model which requires a lot of memory. LoRA instead freezes the base model and adds small trainable matrices to each layer, training only ~1% of parameters while achieving similar results.

Limitations

  • Small model (135M parameters) β€” responses may be less detailed than larger models
  • Trained for only 200 steps β€” longer training would improve quality
  • Best for short explanations β€” may struggle with very complex multi-step reasoning
  • Not suitable for factual lookups β€” may hallucinate specific facts or numbers

Training Framework

Built as a learning project while studying the HuggingFace LLM Course β€” specifically Chapter 11 (Fine-tuning LLMs).

Libraries used:

  • πŸ€— Transformers
  • πŸ€— PEFT
  • TRL (SFTTrainer)
  • πŸ€— Datasets
  • Accelerate

Author

Made by puravky β€” undergrad student exploring ML and AI.

Downloads last month
71
Safetensors
Model size
0.1B params
Tensor type
BF16
Β·
Inference Providers NEW
This model isn't deployed by any Inference Provider. πŸ™‹ Ask for provider support

Model tree for puravky/chatOP

Adapter
(43)
this model

Dataset used to train puravky/chatOP