FuseChat: Knowledge Fusion of Chat Models (arXiv:2408.07990)
sthenno/tempesthenno-kto-0205-ckpt80
update: now checking for evaluations without chat templates
This is a merge of pre-trained language models created using mergekit.
This model was merged with the SCE merge method, using sthenno/tempesthenno-nuslerp-0124 as the base.
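The following models were included in the merge:

- sthenno/tempesthenno-icy-0130-01
- sthenno/tempesthenno-icy-0130-02
- sthenno/tempesthenno-icy-0130-03
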
The following YAML configuration was used to produce this model:

    name: tempesthenno-icy-0130
    merge_method: sce
    parameters:
      select_topk: 0.8
      normalize: true
    dtype: float32
    out_dtype: bfloat16
    base_model: sthenno/tempesthenno-nuslerp-0124
    tokenizer:
      source: base
    chat_template: chatml
    models:
      - model: sthenno/tempesthenno-icy-0130-01
      - model: sthenno/tempesthenno-icy-0130-02
      - model: sthenno/tempesthenno-icy-0130-03

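
To give a rough feel for what `select_topk` controls in the configuration above, here is a toy sketch of an SCE-style merge (select, calculate, erase) over flat lists of floats. It is an illustration under assumptions, not mergekit's implementation: the function name `sce_merge`, the list-based representation, and the example values are all hypothetical, and real merges operate per tensor on full model weights.

```python
def sce_merge(base, models, select_topk=0.8):
    """Toy SCE-style merge over flat lists of floats (illustrative only)."""
    n = len(base)
    # Task vectors: each model's difference from the base.
    deltas = [[m[i] - base[i] for i in range(n)] for m in models]

    # Select: keep the positions whose deltas vary most across models.
    def variance(i):
        vals = [d[i] for d in deltas]
        mean = sum(vals) / len(vals)
        return sum((v - mean) ** 2 for v in vals) / len(vals)

    keep_count = max(1, int(select_topk * n))
    ranked = sorted(range(n), key=variance, reverse=True)
    keep = set(ranked[:keep_count])
    deltas = [[d[i] if i in keep else 0.0 for i in range(n)] for d in deltas]

    # Calculate: weight each model by its share of the squared delta mass.
    energy = [sum(v * v for v in d) for d in deltas]
    total = sum(energy)
    weights = [e / total if total else 1.0 / len(deltas) for e in energy]

    # Erase: per position, drop deltas whose sign disagrees with the
    # majority direction, then renormalize the surviving weights.
    merged = list(base)
    for i in range(n):
        direction = sum(d[i] for d in deltas)
        agree = [(w, d[i]) for w, d in zip(weights, deltas) if d[i] * direction > 0]
        norm = sum(w for w, _ in agree)
        if norm > 0:
            merged[i] += sum(w * v for w, v in agree) / norm
    return merged


# Hypothetical toy values: one base and three fine-tuned variants.
base = [1.0, 1.0, 1.0, 1.0]
variants = [
    [2.0, 1.1, 1.0, 0.0],
    [1.0, 1.1, 1.0, 2.0],
    [3.0, 1.1, 1.0, 2.0],
]
merged = sce_merge(base, variants, select_topk=0.5)
```

With `select_topk=0.5`, only the two highest-variance positions (the first and last) receive updates. In the last position, the first variant's negative delta is erased because it disagrees with the majority direction.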
Detailed results can be found here
| Metric | Value |
|---|---|
| Avg. | 39.74 |
| IFEval (0-Shot) | 62.18 |
| BBH (3-Shot) | 50.10 |
| MATH Lvl 5 (4-Shot) | 37.99 |
| GPQA (0-shot) | 19.69 |
| MuSR (0-shot) | 19.84 |
| MMLU-PRO (5-shot) | 48.65 |
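
The Avg. row appears to be the unweighted mean of the six benchmark scores. A quick check in Python, using the values from the table (the dictionary keys are shortened labels chosen for illustration):

```python
# Benchmark scores copied from the table above.
scores = {
    "IFEval": 62.18,
    "BBH": 50.10,
    "MATH Lvl 5": 37.99,
    "GPQA": 19.69,
    "MuSR": 19.84,
    "MMLU-PRO": 48.65,
}

# Unweighted mean across the six benchmarks.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # prints 39.74
```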