gpt2-multilingual-20-deu-repair_3epochs

This model is a fine-tuned version of CausalNLP/gpt2-hf_multilingual-20. The fine-tuning dataset is not recorded in this card. It achieves the following results on the evaluation set:

Loss: 3.2187
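
For intuition, the evaluation loss can be converted to perplexity. The sketch below assumes the reported loss is the mean per-token cross-entropy in nats, which is the usual convention for causal language models trained with the Transformers Trainer but is not stated in this card.

```python
import math

# Evaluation loss reported in this card.
EVAL_LOSS = 3.2187


def perplexity(loss_nats: float) -> float:
    """Perplexity for a mean per-token cross-entropy given in nats (assumption)."""
    return math.exp(loss_nats)


if __name__ == "__main__":
    print(f"perplexity ~ {perplexity(EVAL_LOSS):.2f}")
```

Under that assumption the evaluation perplexity is roughly 25.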

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 32
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 128
optimizer: AdamW, fused PyTorch implementation (OptimizerNames.ADAMW_TORCH_FUSED), with betas=(0.9, 0.95) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: cosine
lr_scheduler_warmup_steps: 500
num_epochs: 3
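
The hyperparameters above imply an effective batch size and a learning-rate curve that can be sketched in a few lines. This is an illustration under stated assumptions, not the training code. A single device is assumed (32 x 4 = 128 only holds for one device). `TOTAL_STEPS` is an estimate derived from the results table, not a value recorded in this card. The curve is a linear warmup followed by a half-cosine decay to zero, which is the common shape for a cosine scheduler with warmup. The helper names `total_train_batch_size` and `lr_at` are hypothetical.

```python
import math

# Values taken from the hyperparameter list in this card.
LEARNING_RATE = 5e-05
WARMUP_STEPS = 500
TRAIN_BATCH_SIZE = 32
GRAD_ACCUM_STEPS = 4

# Assumption: estimated from the results table (step 17000 at epoch 2.9784),
# not recorded in the card.
TOTAL_STEPS = 17_124


def total_train_batch_size(per_device: int, grad_accum: int, num_devices: int = 1) -> int:
    """Sequences consumed per optimizer step (num_devices=1 is an assumption)."""
    return per_device * grad_accum * num_devices


def lr_at(step: int,
          base_lr: float = LEARNING_RATE,
          warmup: int = WARMUP_STEPS,
          total: int = TOTAL_STEPS) -> float:
    """Learning rate at an optimizer step: linear warmup, then half-cosine decay."""
    if step < warmup:
        return base_lr * step / warmup
    # Fraction of the post-warmup phase completed, clamped to [0, 1].
    progress = min((step - warmup) / max(1, total - warmup), 1.0)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print("effective batch size:", total_train_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
    for step in (0, 250, 500, 8000, TOTAL_STEPS):
        print(step, lr_at(step))
```

The learning rate peaks at 5e-05 at step 500 and decays toward zero by the end of training, which is consistent with the validation loss flattening over the final epoch.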

Training results

Training Loss	Epoch	Step	Validation Loss
3.2849	0.0876	500	3.3025
3.2491	0.1752	1000	3.2839
3.2539	0.2628	1500	3.2729
3.2063	0.3504	2000	3.2656
3.2336	0.4380	2500	3.2591
3.2188	0.5256	3000	3.2539
3.2212	0.6132	3500	3.2500
3.2204	0.7008	4000	3.2457
3.2059	0.7884	4500	3.2427
3.2236	0.8760	5000	3.2396
3.2001	0.9636	5500	3.2368
3.1861	1.0512	6000	3.2354
3.1757	1.1388	6500	3.2339
3.1804	1.2264	7000	3.2319
3.1862	1.3140	7500	3.2297
3.1533	1.4016	8000	3.2279
3.1724	1.4892	8500	3.2262
3.1641	1.5768	9000	3.2249
3.1587	1.6644	9500	3.2235
3.1577	1.7520	10000	3.2221
3.1716	1.8396	10500	3.2212
3.2024	1.9272	11000	3.2203
3.1689	2.0147	11500	3.2199
3.1569	2.1023	12000	3.2198
3.1717	2.1899	12500	3.2195
3.1449	2.2775	13000	3.2192
3.1487	2.3651	13500	3.2190
3.1555	2.4527	14000	3.2189
3.1932	2.5403	14500	3.2188
3.1394	2.6279	15000	3.2187
3.1678	2.7155	15500	3.2187
3.1385	2.8032	16000	3.2187
3.1675	2.8908	16500	3.2187
3.1830	2.9784	17000	3.2187
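
The Epoch and Step columns also give a rough size for the training set. The sketch below estimates optimizer steps per epoch from the first and last rows, then multiplies by the total train batch size of 128. It assumes each optimizer step consumes exactly 128 sequences. The Epoch values are rounded to four decimals, so the result is approximate. The helper name `steps_per_epoch` is hypothetical.

```python
# (epoch, step) pairs copied from the first and last rows of the table.
CHECKPOINTS = [(0.0876, 500), (2.9784, 17000)]

# total_train_batch_size from the hyperparameter list.
TOTAL_TRAIN_BATCH_SIZE = 128


def steps_per_epoch(checkpoints):
    """Average of step / epoch over the given rows, rounded to whole steps."""
    estimates = [step / epoch for epoch, step in checkpoints]
    return round(sum(estimates) / len(estimates))


if __name__ == "__main__":
    spe = steps_per_epoch(CHECKPOINTS)
    print("steps per epoch:", spe)
    print("approx. sequences per epoch:", spe * TOTAL_TRAIN_BATCH_SIZE)
```

This gives about 5,708 optimizer steps per epoch, or roughly 730,000 training sequences per epoch under the assumptions above.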

Framework versions

Transformers 4.57.3
Pytorch 2.9.0
Datasets 4.4.1
Tokenizers 0.22.1

Model size: 0.2B params (Safetensors)
Tensor type: BF16
