How to use malusama/M2-Encoder-1B with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("zero-shot-image-classification", model="malusama/M2-Encoder-1B", trust_remote_code=True)
pipe(
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/hub/parrots.png",
    candidate_labels=["animals", "humans", "landscape"],
)

# Load model directly
from transformers import AutoModel

model = AutoModel.from_pretrained("malusama/M2-Encoder-1B", trust_remote_code=True, dtype="auto")
import torch.nn as nn


class Pooler(nn.Module):
    """Pool a sequence by projecting its first token through Linear + Tanh."""

    def __init__(self, hidden_size):
        super().__init__()
        self.dense = nn.Linear(hidden_size, hidden_size)
        self.activation = nn.Tanh()

    def forward(self, hidden_states):
        # hidden_states: (batch, seq_len, hidden_size); keep only the first token.
        first_token_tensor = hidden_states[:, 0]
        pooled_output = self.dense(first_token_tensor)
        pooled_output = self.activation(pooled_output)
        return pooled_output


class ITCHead(nn.Module):
    """Bias-free linear projection from hidden_size to out_size."""

    def __init__(self, hidden_size, out_size):
        super().__init__()
        self.fc = nn.Linear(hidden_size, out_size, bias=False)

    def forward(self, x):
        x = self.fc(x)
        return x
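The two modules above can be exercised end to end with random tensors. The following is a minimal sketch, not the model's actual forward pass: it assumes that "ITC" refers to an image-text contrastive head, that each modality has its own `Pooler` and `ITCHead`, and that embeddings are L2-normalized before comparison. The `embed` helper, the sizes, and the random inputs are hypothetical and chosen only for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Pooler(nn.Module):
    def __init__(self, hidden_size):
        super().__init__()
        self.dense = nn.Linear(hidden_size, hidden_size)
        self.activation = nn.Tanh()

    def forward(self, hidden_states):
        return self.activation(self.dense(hidden_states[:, 0]))


class ITCHead(nn.Module):
    def __init__(self, hidden_size, out_size):
        super().__init__()
        self.fc = nn.Linear(hidden_size, out_size, bias=False)

    def forward(self, x):
        return self.fc(x)


def embed(hidden_states, pooler, head):
    """Hypothetical helper: pool, project, then L2-normalize (an assumption)."""
    pooled = pooler(hidden_states)           # (batch, hidden_size)
    return F.normalize(head(pooled), dim=-1)  # (batch, out_size), unit length


torch.manual_seed(0)
hidden_size, out_size = 32, 16  # illustrative sizes, not the model's config

image_pooler, text_pooler = Pooler(hidden_size), Pooler(hidden_size)
image_head = ITCHead(hidden_size, out_size)
text_head = ITCHead(hidden_size, out_size)

# Stand-ins for encoder outputs: 2 images (10 tokens), 3 texts (7 tokens).
image_states = torch.randn(2, 10, hidden_size)
text_states = torch.randn(3, 7, hidden_size)

with torch.no_grad():
    image_emb = embed(image_states, image_pooler, image_head)
    text_emb = embed(text_states, text_pooler, text_head)
    # Cosine similarity between every image and every text.
    similarity = image_emb @ text_emb.T

print(similarity.shape)
```

Because both embeddings are unit length, each entry of `similarity` is a cosine score in [-1, 1]; a zero-shot classifier would typically rank candidate labels by these scores.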