chatOP — SmolLM2-135M Fine-tuned

A fine-tuned version of SmolLM2-135M-Instruct trained to act as a friendly study assistant for CS and ML concepts.

Model Details


Base model	HuggingFaceTB/SmolLM2-135M-Instruct
Model type	Causal Language Model
Fine-tuning method	SFT + LoRA
Language	English
License	Apache 2.0

Training Details

Dataset

Name: HuggingFaceTB/smoltalk
Split used: all
Format: Conversational chat messages (system, user, assistant)

Hyperparameters

Parameter	Value
Learning rate	2e-4
Training steps	200
Batch size	2
Packing	True
Optimizer	AdamW
LR scheduler	Linear
Precision	fp16

LoRA Config

Parameter	Value
Rank (`r`)	16
Alpha (`lora_alpha`)	16
Dropout	0.05
Target modules	all-linear
Bias	none
Trainable parameters	~1.7M / 135M (1.26%)

Hardware

GPU: NVIDIA Tesla T4 (Google Colab)
Training time: ~15–20 minutes

Example Outputs

Prompt: What is machine learning?

Machine learning is a field of AI where systems learn from data to make predictions or decisions without being explicitly programmed for each task.

Prompt: Explain transformers in simple terms.

A Transformer is a neural network architecture that uses attention mechanisms to understand relationships between all words in a sentence simultaneously, rather than reading them one by one like older models did.

Prompt: What is the difference between LoRA and full fine-tuning?

Full fine-tuning updates all parameters of a model which requires a lot of memory. LoRA instead freezes the base model and adds small trainable matrices to each layer, training only ~1% of parameters while achieving similar results.

Limitations

Small model (135M parameters) — responses may be less detailed than larger models
Trained for only 200 steps — longer training would improve quality
Best for short explanations — may struggle with very complex multi-step reasoning
Not suitable for factual lookups — may hallucinate specific facts or numbers

Training Framework

Built as a learning project while studying the HuggingFace LLM Course — specifically Chapter 11 (Fine-tuning LLMs).

Libraries used:

🤗 Transformers
🤗 PEFT
TRL (SFTTrainer)
🤗 Datasets
Accelerate

Author

Made by puravky — undergrad student exploring ML and AI.

Downloads last month: 71

Safetensors

Model size

0.1B params

Tensor type

BF16

Model tree for puravky/chatOP

Base model

HuggingFaceTB/SmolLM2-135M

Quantized

HuggingFaceTB/SmolLM2-135M-Instruct

Adapter

(43)

this model

puravky
/

chatOP