Uploaded finetuned model
- Developed by: gateremark
- License: apache-2.0
- Finetuned from model: google/translategemma-12b-it

This Gemma 3 model was trained 2x faster with Unsloth and Hugging Face's TRL library.
Kikuyu TranslateGemma-12B
Fine-tuned English → Kikuyu translation model based on Google's TranslateGemma-12B-it.
🌍 Live Demo: bit.ly/c-elo-live (Talk to our AI)
Read on for details of our training process and the challenges we faced.
Model Details
| Attribute | Value |
|---|---|
| Base Model | google/translategemma-12b-it |
| Parameters | 12.4B |
| Fine-tuning Method | LoRA (r=128, alpha=256) |
| Training Data | 30,430 English-Kikuyu pairs |
| BLEU Score | 19.61 |
| Framework | Unsloth + TRL |
Usage
Quick Start
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "gateremark/kikuyu_translategemma_12b_merged_V2"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

def translate_to_kikuyu(text: str) -> str:
    # TranslateGemma expects the source/target language codes
    # inside the message content.
    messages = [
        {
            "role": "user",
            "content": [{
                "type": "text",
                "source_lang_code": "en",
                "target_lang_code": "ki",
                "text": text,
            }],
        }
    ]
    input_ids = tokenizer.apply_chat_template(
        messages,
        tokenize=True,
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)

    terminators = [
        tokenizer.eos_token_id,
        tokenizer.convert_tokens_to_ids("<end_of_turn>"),
    ]

    with torch.no_grad():
        outputs = model.generate(
            input_ids=input_ids,
            max_new_tokens=256,
            temperature=0.3,
            do_sample=True,
            eos_token_id=terminators,
        )

    # Decode only the newly generated tokens.
    response = tokenizer.decode(
        outputs[0][input_ids.shape[1]:],
        skip_special_tokens=True,
    )
    return response.strip()

# Example
print(translate_to_kikuyu("Hello, how are you?"))
# Output: Hihi, ũrĩ atĩa?
```
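The message format above generalizes to other language pairs supported by TranslateGemma. A minimal sketch (the helper name `build_translation_message` is our own, not part of any library):

```python
def build_translation_message(text: str, source: str = "en", target: str = "ki") -> list:
    """Build a TranslateGemma-style chat message for one translation request."""
    return [{
        "role": "user",
        "content": [{
            "type": "text",
            "source_lang_code": source,
            "target_lang_code": target,
            "text": text,
        }],
    }]

msg = build_translation_message("Good morning")
print(msg[0]["content"][0]["target_lang_code"])  # -> ki
```

Pass the result straight to `tokenizer.apply_chat_template` as in the function above.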
Using with Unsloth (Faster Inference)
```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="gateremark/kikuyu_translategemma_12b_merged_V2",
    max_seq_length=2048,
    dtype=None,         # Auto-detect (bfloat16 on supported GPUs)
    load_in_4bit=True,  # Optional: reduce VRAM to ~6GB
)
FastLanguageModel.for_inference(model)  # Enable Unsloth's fast inference mode
```
Training Details
Dataset
- Source: gateremark/english-kikuyu-translations
- Size: 30,430 parallel sentences
- Split: 95% train (28,908) / 5% eval (1,522)
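The reported counts follow from flooring the train side of a 95/5 split (an assumption about the rounding, but it matches the numbers above):

```python
total = 30_430             # parallel sentence pairs
train = int(total * 0.95)  # floored -> 28,908
evals = total - train      # remainder -> 1,522
print(train, evals)
```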
Hyperparameters
| Parameter | Value |
|---|---|
| LoRA rank (r) | 128 |
| LoRA alpha | 256 |
| LoRA dropout | 0 |
| Learning rate | 2e-4 |
| Batch size | 64 (16 × 4 accum) |
| Epochs | 2 |
| Optimizer | AdamW 8-bit |
| NEFTune α | 5 |
| Weight decay | 0.01 |
Target Modules
["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]
Hardware
- GPU: NVIDIA H200 (139GB VRAM)
- Platform: Lightning.ai
- Training Time: ~90 minutes
Evaluation
| Metric | Score |
|---|---|
| BLEU | 19.61 |
| Eval Loss | 0.578 |
Sample Translations
| English | Kikuyu |
|---|---|
| Hello, how are you? | Hihi, ũrĩ atĩa? |
| The weather is beautiful today. | Rĩera nĩ rĩega mũno ũmũthĩ |
| I love learning new languages. | Nĩ ngenagĩra gũthoma thiomi njerũ |
Limitations
- Direction: English → Kikuyu only (reverse not trained)
- Domain: General text; may struggle with technical/specialized content
- Dialects: Trained on standard Kikuyu; dialect variations not covered
- Evaluation: BLEU-based; human evaluation in progress
Intended Use
- Translation tools for Kikuyu speakers
- Language learning applications
- Research on low-resource African language NLP
- Cultural preservation initiatives
Citation
```bibtex
@misc{gatere2026kikuyutranslategemma,
  author       = {Mark Gatere},
  title        = {Kikuyu TranslateGemma-12B: Fine-tuning TranslateGemma for Low-Resource Bantu Language Translation},
  year         = {2026},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/gateremark/kikuyu_translategemma_12b_merged_V2}}
}
```
Acknowledgments
- Google for the TranslateGemma base model
- Unsloth for 2x faster fine-tuning
- Lightning.ai for GPU compute