How to use malusama/M2-Encoder-1B with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("zero-shot-image-classification", model="malusama/M2-Encoder-1B", trust_remote_code=True)
pipe(
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/hub/parrots.png",
    candidate_labels=["animals", "humans", "landscape"],
)

# Load model directly
from transformers import AutoModel

model = AutoModel.from_pretrained("malusama/M2-Encoder-1B", trust_remote_code=True, dtype="auto")
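The pipeline call above returns a score per candidate label. As a rough illustration of how embedding-based zero-shot classification is commonly scored, the sketch below takes a softmax over cosine similarities between one image embedding and the label embeddings. It assumes a CLIP-style dual-encoder setup; the helper names (`l2_normalize`, `zero_shot_scores`), the `temperature` value, and the toy 4-dimensional vectors are hypothetical stand-ins and do not come from the model's own code.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def zero_shot_scores(image_emb, label_embs, temperature=0.01):
    """Softmax over cosine similarities between one image and several labels.

    `temperature` is an illustrative value, not the model's learned scale.
    """
    img = l2_normalize(image_emb)
    sims = [
        sum(a * b for a, b in zip(img, l2_normalize(label)))
        for label in label_embs
    ]
    logits = [s / temperature for s in sims]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(logit - peak) for logit in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy 4-dimensional vectors standing in for real image and text embeddings.
image_emb = [0.9, 0.1, 0.0, 0.2]
label_embs = {
    "animals": [1.0, 0.0, 0.0, 0.1],
    "humans": [0.0, 1.0, 0.0, 0.0],
    "landscape": [0.0, 0.0, 1.0, 0.0],
}

probs = zero_shot_scores(image_emb, list(label_embs.values()))
best = max(zip(label_embs, probs), key=lambda pair: pair[1])[0]
print(best, [round(p, 4) for p in probs])
```

With real embeddings the same scoring would apply to vectors produced by the model's image and text encoders; how those are obtained depends on the repository's custom code, which is not shown here.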
Add top-of-page benchmark figure
README.md CHANGED
@@ -96,6 +96,8 @@ It supports Chinese-English image-text retrieval, zero-shot image classification
 
 This is the larger M2-Encoder variant with a wider backbone and 1024-dimensional embeddings, intended for better retrieval and zero-shot classification quality.
 
+[benchmark figure image]
+
 ## Links
 
 - Paper: https://arxiv.org/abs/2401.15896