MiniLLM

https://github.com/microsoft/LMOps/tree/main/minillm

AI & ML interests

Training efficient language models (MiniLLM, MiniPLM)

Organization Card

Training Small Language Models with Knowledge Distillation

Official pre-trained models and baselines for:

MiniLLM: Knowledge distillation of LLMs during instruction tuning.
MiniPLM: Knowledge distillation of LLMs during pre-training.

Collections 2

Models 50

MiniLLM/MiniLLM-gpt2-340M

Text Generation • Updated Apr 11, 2025 • 1.02k • 6

MiniLLM/SFT-gpt2-120M

Text Generation • 0.1B • Updated Mar 25, 2025 • 1.25k

MiniLLM/SFT-gpt2-760M

Text Generation • 0.8B • Updated Mar 25, 2025 • 10

MiniLLM/MiniPLM-Qwen-500M

Text Generation • 0.5B • Updated Mar 25, 2025 • 73 • 7

MiniLLM/MiniPLM-llama3.1-212M

Text Generation • 0.2B • Updated Mar 25, 2025 • 11 • 6

MiniLLM/MiniPLM-Mamba-130M

Text Generation • 0.1B • Updated Mar 25, 2025 • 8 • 3

MiniLLM/MiniPLM-Qwen-1.2B

Text Generation • 1B • Updated Mar 25, 2025 • 145 • 4

MiniLLM/Ref-Pretrain-Qwen-104M

Text Generation • 0.1B • Updated Mar 25, 2025 • 17 • 2

MiniLLM/Pretrain-Qwen-1.2B

Text Generation • 1B • Updated Mar 25, 2025 • 6

MiniLLM/Pretrain-Qwen-500M

Text Generation • 0.5B • Updated Mar 25, 2025 • 16

Datasets 10

MiniLLM/pile-diff_samp-qwen_1.8B-qwen_104M-r0.5

Updated Mar 25, 2025 • 295

MiniLLM/pile-tokenized

Updated Nov 14, 2024 • 34 • 2

MiniLLM/roberta-corpus-processed

Updated Oct 22, 2024 • 35

MiniLLM/openwebtext-processed

Updated Sep 27, 2024 • 87

MiniLLM/dolly-processed

Viewer • Updated Sep 26, 2024 • 110k • 248 • 1

MiniLLM/sinst

Viewer • Updated Sep 26, 2024 • 8.35k • 136 • 1

MiniLLM/uinst

Viewer • Updated Sep 26, 2024 • 64.8k • 145 • 1

MiniLLM/self-inst

Viewer • Updated Sep 26, 2024 • 242 • 133 • 2

MiniLLM/Vicuna

Viewer • Updated Sep 26, 2024 • 80 • 82 • 1

MiniLLM/dolly

Viewer • Updated Sep 26, 2024 • 500 • 267