Supertron-embedding-300M: High-Efficiency Semantic Representation Model
Model Description
Supertron-embedding-300M is a high-performance, compact embedding model fine-tuned from google/embeddinggemma-300m. It is designed to provide state-of-the-art semantic representations for Retrieval-Augmented Generation (RAG), semantic search, and document clustering, while keeping a computational footprint low enough for production environments.
- Developed by: Surpem
- Model Type: Sentence Transformer
- Architecture: Gemma-based Dense Transformer
- Base Model: google/embeddinggemma-300m
- License: Apache 2.0
- Language: English (en)
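
As a quick illustration of the document-clustering use case mentioned above, the sketch below groups a few example documents with scikit-learn's KMeans on top of the model's embeddings. The sample documents, the cluster count, and the use of scikit-learn are illustrative assumptions, not part of this model's documented workflow.

```python
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

# Load the embedding model (same ID as elsewhere in this card).
model = SentenceTransformer("Surpem/Supertron-embedding-300M")

# Hypothetical mini-corpus; replace with your own documents.
documents = [
    "The central bank raised interest rates by 25 basis points.",
    "Quarterly earnings beat analyst expectations.",
    "The new vaccine showed strong efficacy in trials.",
    "Researchers reported promising results for the drug candidate.",
]

# Encode the documents and cluster them into an assumed number of groups.
embeddings = model.encode(documents)
labels = KMeans(n_clusters=2, random_state=0).fit_predict(embeddings)

for doc, label in zip(documents, labels):
    print(label, doc)
```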
Results
Supertron-embedding-300M demonstrates competitive performance across the Massive Text Embedding Benchmark (MTEB). It is particularly effective on Semantic Textual Similarity (STS) tasks, where it outperforms many larger models.
| Task Category | Task Name | Metric | Score |
|---|---|---|---|
| Semantic Similarity | STSBenchmark | cos_sim_spearman | 87.10 |
| Semantic Similarity | STS12 | cos_sim_spearman | 80.18 |
| Semantic Similarity | BIOSSES | cos_sim_spearman | 82.98 |
| Retrieval | NFCorpus | NDCG@10 | 37.07 |
| Classification | AmazonCounterfactual | Accuracy | 83.34 |
| Clustering | TwentyNewsgroups | V-Measure | 50.01 |
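
The cos_sim_spearman scores above are Spearman correlations between the model's cosine similarities and human-annotated similarity scores. Below is a minimal sketch of that computation, using scipy and a few invented gold scores purely for illustration; real benchmarks such as STSBenchmark provide thousands of annotated pairs.

```python
from scipy.stats import spearmanr
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Surpem/Supertron-embedding-300M")

# Toy STS-style pairs with made-up gold scores on a 0-5 scale.
pairs = [
    ("A man is playing a guitar.", "A person plays a guitar.", 4.8),
    ("A man is playing a guitar.", "A chef is cooking pasta.", 0.4),
    ("Kids are playing in the park.", "Children play outside.", 4.2),
]

emb1 = model.encode([p[0] for p in pairs])
emb2 = model.encode([p[1] for p in pairs])

# The diagonal of the similarity matrix gives the per-pair cosine similarity.
pred = model.similarity(emb1, emb2).diagonal().tolist()
gold = [p[2] for p in pairs]

corr, _ = spearmanr(pred, gold)
print(f"Spearman correlation: {corr:.4f}")
```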
Get Started
This model can be easily integrated using the sentence-transformers library.
```python
from sentence_transformers import SentenceTransformer

model_id = "Surpem/Supertron-embedding-300M"

# Load the model
model = SentenceTransformer(model_id)

# Define target text
sentences = [
    "The financial results exceeded market expectations.",
    "The company reported better than expected quarterly earnings."
]

# Compute embeddings
embeddings = model.encode(sentences)

# Calculate cosine similarity
similarity = model.similarity(embeddings[0], embeddings[1])
print(f"Semantic Similarity: {similarity.item():.4f}")
```
Training Procedure
Hyperparameters
- Precision: bfloat16
- Max Sequence Length: 256 tokens
- Optimizer: AdamW
- Batch Size: 256
- Learning Rate: 2e-5
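
As a rough guide, these hyperparameters map onto the sentence-transformers v3 training API as sketched below. The dataset, loss function, and epoch count are placeholders, since the card does not specify them; this is not the actual training script.

```python
from datasets import load_dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MultipleNegativesRankingLoss

# Start from the documented base model and sequence length.
model = SentenceTransformer("google/embeddinggemma-300m")
model.max_seq_length = 256  # Max Sequence Length: 256 tokens

args = SentenceTransformerTrainingArguments(
    output_dir="supertron-embedding-300m",
    per_device_train_batch_size=256,  # Batch Size: 256
    learning_rate=2e-5,               # Learning Rate: 2e-5
    optim="adamw_torch",              # Optimizer: AdamW
    bf16=True,                        # Precision: bfloat16
    num_train_epochs=1,               # placeholder: not stated in the card
)

# Placeholder (anchor, positive) pair dataset and loss; the actual training
# data and objective are not documented in this card.
train_dataset = load_dataset("sentence-transformers/all-nli", "pair", split="train")
loss = MultipleNegativesRankingLoss(model)

trainer = SentenceTransformerTrainer(
    model=model, args=args, train_dataset=train_dataset, loss=loss
)
trainer.train()
```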
Citation
```bibtex
@misc{surpem2026supertron,
  title={Supertron-embedding-300M: High-Efficiency Semantic Representation Model},
  author={Surpem},
  year={2026},
  url={https://huggingface.co/Surpem/Supertron-embedding-300M},
}
```