Mixture of Experts (MoE)
Sometimes I fine-tune models specifically to take on expert roles in an MoE configuration; other times I find interesting models that others have fine-tuned.
A Mixture of Experts model built on Llama 3.2 3B, combining four domain-specialized fine-tunes with a general-purpose base model.
| Expert | Specialization |
|---|---|
| LLM-Data-Science-Llama3.2-3B | Machine learning, neural networks, fine-tuning, pre-training |
| CreativeWriter-Llama3.2-3B | Fiction writing, story structure, scene development, plot analysis |
| Llama-3.2-3B-VanRossum | Python programming, debugging, algorithm implementation |
| CogBeTh-Llama3.2-3B | Mental health support, anxiety, stress management, self-care |
The model uses a hidden-state gating mechanism to route each input to the most relevant expert(s): the prompt's hidden representations are scored against per-expert gate vectors, and the highest-scoring expert(s) handle the token. Each expert was fine-tuned for its domain before being merged into this MoE architecture using mergekit.
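The gating step can be sketched as a top-k softmax router. This is a minimal illustration of the idea, not this model's actual implementation; the function name, shapes, and `top_k=2` choice are assumptions for the example:

```python
import numpy as np

def route(hidden_state, gate_weights, top_k=2):
    """Pick the top_k experts for one token and weight them by softmax.

    hidden_state: (hidden_dim,) vector for the current token.
    gate_weights: (num_experts, hidden_dim), one gate vector per expert.
    Returns a list of (expert_index, weight) pairs, weights summing to 1.
    """
    logits = gate_weights @ hidden_state            # one score per expert
    top = np.argsort(logits)[-top_k:][::-1]         # indices of the top_k experts
    shifted = np.exp(logits[top] - logits[top].max())
    probs = shifted / shifted.sum()                 # softmax over selected experts only
    return list(zip(top.tolist(), probs.tolist()))

# Toy example: the gate strongly matches expert 0 for this hidden state.
selected = route(np.array([2.0, 0.0, 0.0, 0.0]), np.eye(4), top_k=2)
```

At inference time the outputs of the selected experts are combined using these weights, so only a subset of experts runs per token.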
Compatible with any Llama 3.2 inference setup. No special configuration is required; routing happens automatically at inference time.