# MiniAxion1.5-3M

Emergent reasoning in a 2.7M-parameter model. A tiny Portuguese-first language model that learns how to think before it learns how to be correct.
## Overview
MiniAxion1.5-3M is an ultra-compact (~2.7M parameters) GPT-style language model designed to investigate reasoning emergence at extremely small scale.
Unlike typical small models optimized for fluency, MiniAxion is explicitly trained to produce:
- Structured reasoning traces
- Step-by-step thinking
- Deterministic answer formatting
It operates primarily in Portuguese, making it a rare example of a non-English reasoning-first nano model.
## Why This Model Is Interesting
Most models follow this trajectory:
Language → Knowledge → Reasoning
MiniAxion flips part of that:
Structure → Reasoning format → (still learning correctness)
**Key insight:** the model demonstrates that reasoning structure can emerge independently of reasoning accuracy.
## Evaluation: Task Performance

| Task | Accuracy |
|------|----------|
| Addition | 10% |
| Subtraction | 10% |
| Multiplication | 0% |
| Even/Odd | 100% |
| Comparison | 5% |
| Sequence Completion | 0% |
| Word Problems (Addition) | 10% |
| Word Problems (Subtraction) | 0% |
| Word Problems (Multiplication) | 10% |
| True/False | 100% |
| Chat/Greetings | 100% |
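As a sketch of how per-task numbers like these can be computed (this is not the card's actual harness), assuming a `generate(prompt)` helper for the model and `(prompt, expected)` pairs per task; all names are illustrative:

```python
import re

def extract_answer(output: str) -> str:
    """Take the last integer in the output as the model's final answer."""
    numbers = re.findall(r"-?\d+", output)
    return numbers[-1] if numbers else output.strip()

def task_accuracy(generate, examples) -> float:
    """examples: list of (prompt, expected_answer) pairs for one task."""
    correct = sum(
        extract_answer(generate(prompt)) == str(expected)
        for prompt, expected in examples
    )
    return correct / len(examples)
```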
## Reasoning Behavior Metrics

| Metric | Score |
|--------|-------|
| Thinking Rate | 100% |
| Step Format | 100% |
| Answer Completion | 100% |
- ✅ The model always thinks
- ✅ The model always structures its reasoning
- ✅ The model always produces an answer
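A minimal sketch of how such behavior metrics can be measured over a batch of sampled outputs. The `<think>` tag, the numbered-step pattern, and the `Resposta:` answer marker are assumptions about the trace format, not confirmed tokens from this model:

```python
import re

def behavior_metrics(outputs):
    """Score reasoning-behavior rates over a list of model outputs."""
    n = len(outputs)
    return {
        # Thinking Rate: fraction of outputs containing a reasoning block
        # (the <think> tag is an assumed marker, adapt to the real format)
        "thinking_rate": sum("<think>" in o for o in outputs) / n,
        # Step Format: fraction with at least one numbered step like "1."
        "step_format": sum(bool(re.search(r"^\s*\d+\.", o, re.M)) for o in outputs) / n,
        # Answer Completion: fraction with an explicit answer line
        "answer_completion": sum("Resposta:" in o for o in outputs) / n,
    }
```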
## Interpretation
MiniAxion exhibits a clear dissociation:
**✅ What it learned**

- Reasoning format
- Step-by-step decomposition
- Logical task patterns (parity, boolean)

**❌ What it did NOT learn**

- Arithmetic correctness
- Numerical reasoning
- Multi-step computation
## Core Finding
Reasoning ≠ Correctness
MiniAxion shows that models can internalize thinking patterns without actually learning how to solve problems.
This makes it a strong candidate for studying:
- Emergent reasoning
- Tiny Recursive Models (TRMs)
- Reasoning distillation
## Architecture

- Type: GPT-style Transformer
- Parameters: ~2.7M
- Objective: next-token prediction
- Language: Portuguese (primary)
- Specialization: structured reasoning traces
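For intuition on where ~2.7M parameters can come from in a GPT-style stack, here is a back-of-envelope counter. The hyperparameter values are illustrative guesses, since the actual MiniAxion config is not listed in this card:

```python
# Hypothetical GPT-style parameter count; the values below are guesses
# chosen to land near ~2.7M, not the published MiniAxion config.
def gpt_param_count(vocab, d_model, n_layers, tied_embeddings=True):
    emb = vocab * d_model          # token embedding (shared with LM head if tied)
    per_layer = 12 * d_model**2    # attention (4*d^2: Q,K,V,O) + 4x MLP (8*d^2),
                                   # ignoring biases and layer norms
    total = emb + n_layers * per_layer
    if not tied_embeddings:
        total += vocab * d_model   # separate LM head
    return total

print(gpt_param_count(vocab=8192, d_model=128, n_layers=8))  # 2,621,440 (~2.6M)
```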
## Training Strategy
The model was trained with a reasoning-first approach:
- Portuguese language grounding
- Structured reasoning data
- Emphasis on:
  - Deterministic formats
  - Multi-step thinking
  - Explicit reasoning tokens
- 🚫 No RLHF
- 🚫 No instruction tuning at scale
- 🚫 No large-model distillation (yet)
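To make "structured reasoning data" concrete, here is a hypothetical shape of one training sample in the spirit of the card's description. The `<think>` tags, the numbered steps, and the `Resposta:` marker are all assumptions, not the actual dataset format:

```python
# Hypothetical training sample; the <think> tags, the numbered steps, and
# the "Resposta:" marker are assumptions, not the actual dataset format.
sample = (
    "Quanto é 31 + 43?\n"                    # prompt: "What is 31 + 43?"
    "<think>\n"
    "1. Identifico os números: 31 e 43\n"    # "I identify the numbers"
    "2. Somo os valores: 31 + 43 = 74\n"     # "I add the values"
    "</think>\n"
    "Resposta: 74"                           # deterministic answer format
)
```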
## Limitations
- **Arithmetic collapse:** near-random performance on addition, subtraction, and multiplication, indicating a lack of numerical representation learning.
- **Template dependence:** strong reliance on prompt format, token patterns, and previously seen reasoning templates.
## Future Work
This model is just the beginning.
### Scaling

- 5M / 10M / 20M versions
- Track the emergence of correctness
### Distillation

- Inject reasoning from larger models
- Improve accuracy without scaling params
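One way this could be realized, sketched under assumptions: have a larger teacher write the traces and fine-tune MiniAxion on them. `teacher_generate` is a placeholder for any stronger model's sampling function, and the Portuguese step-by-step instruction is illustrative:

```python
# Hypothetical distillation-data sketch; `teacher_generate` is a placeholder
# for a larger model's sampling function, not part of this repo.
def build_distillation_set(teacher_generate, prompts):
    """Pair each prompt with a reasoning trace written by the teacher."""
    # "Pense passo a passo." = "Think step by step." (illustrative instruction)
    return [(p, teacher_generate(f"{p}\nPense passo a passo.")) for p in prompts]
```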
### Self-Play / Synthetic Data

- Generate reasoning loops
- Reinforce correct chains
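A minimal sketch of the reinforce-correct-chains idea, assuming `generate` and `extract_answer` helpers like those above and problems with verifiable answers; this is a rejection-sampling loop, not a confirmed pipeline for this model:

```python
# Hypothetical self-play sketch: sample several chains per problem and keep
# only those whose final answer checks out, for reuse as training data.
def collect_correct_chains(generate, extract_answer, problems, k=8):
    kept = []
    for prompt, expected in problems:
        for _ in range(k):                     # k samples per problem
            output = generate(prompt)
            if extract_answer(output) == str(expected):
                kept.append((prompt, output))  # verified chain -> new training pair
                break
    return kept
```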
### Hybrid Reasoning

- Combine symbolic and neural learning
- Fix the arithmetic weakness
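One hedged sketch of the symbolic side: let the model write the trace but emit a marker like `CALC(...)` wherever exact arithmetic is needed, and have a tiny symbolic evaluator fill in the value. The `CALC` convention and all names here are invented for illustration:

```python
# Hypothetical hybrid sketch: the model produces the reasoning and the
# expression; a symbolic evaluator supplies the exact arithmetic.
import ast, operator, re

OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}

def safe_eval(expr: str):
    """Evaluate a +,-,* arithmetic expression without using eval()."""
    def walk(node):
        if isinstance(node, ast.BinOp):
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.Constant):
            return node.value
        raise ValueError("unsupported expression")
    return walk(ast.parse(expr, mode="eval").body)

def fill_calculations(output: str) -> str:
    """Replace every CALC(expr) the model emits with its exact value."""
    return re.sub(r"CALC\(([^)]+)\)", lambda m: str(safe_eval(m.group(1))), output)

print(fill_calculations("Somo os valores: CALC(31 + 43)"))  # -> "Somo os valores: 74"
```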
## Example Output

```
Identifico os números
Tento somar os valores
Ajusto o resultado
74
```

(English: "I identify the numbers / I try to add the values / I adjust the result / 74.")

- ✅ Perfect reasoning structure
- ❌ Incorrect answer
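A hypothetical snippet for generating outputs like the one above with Hugging Face `transformers`; the repo id is a placeholder, since the published model path is not stated in this card:

```python
# Hypothetical usage sketch; the repo id below is a placeholder.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "your-namespace/MiniAxion1.5-3M"  # placeholder, not a confirmed path
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo)

inputs = tokenizer("Quanto é 31 + 43?", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```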
## Takeaway

MiniAxion1.5-3M demonstrates something important:

Even a 2.7M-parameter model can learn to simulate thinking before it learns to actually think correctly.
## Use Cases
- Research on emergent reasoning
- Tiny-model experimentation (CPU-friendly)
- Educational demos of:
  - Chain-of-Thought
  - Reasoning failure modes
- Base model for:
  - Distillation
  - TRM experiments