---
license: mit
datasets:
- mteb/mtop_intent
language:
- en
pipeline_tag: text-classification
library_name: sentence-transformers
tags:
- mteb
- text
- transformers
- text-embeddings-inference
- sparse-encoder
- sparse
- csr
base_model:
- nvidia/NV-Embed-v2
model-index:
- name: CSR
  results:
  - dataset:
      name: MTEB MTOPIntentClassification (en)
      type: mteb/mtop_intent
      revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba
      config: en
      split: test
      languages:
      - eng-Latn
    metrics:
    - type: accuracy
      value: 0.906407
    - type: f1
      value: 0.694457
    - type: f1_weighted
      value: 0.917326
    - type: main_score
      value: 0.906407
    task:
      type: Classification
  - dataset:
      name: MTEB MTOPIntentClassification (de)
      type: mteb/mtop_intent
      revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba
      config: de
      split: test
      languages:
      - deu-Latn
    metrics:
    - type: accuracy
      value: 0.851
    - type: f1
      value: 0.601279
    - type: f1_weighted
      value: 0.863969
    - type: main_score
      value: 0.851
    task:
      type: Classification
  - dataset:
      name: MTEB MTOPIntentClassification (es)
      type: mteb/mtop_intent
      revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba
      config: es
      split: test
      languages:
      - spa-Latn
    metrics:
    - type: accuracy
      value: 0.906738
    - type: f1
      value: 0.642295
    - type: f1_weighted
      value: 0.910882
    - type: main_score
      value: 0.906738
    task:
      type: Classification
  - dataset:
      name: MTEB MTOPIntentClassification (fr)
      type: mteb/mtop_intent
      revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba
      config: fr
      split: test
      languages:
      - fra-Latn
    metrics:
    - type: accuracy
      value: 0.849045
    - type: f1
      value: 0.59923
    - type: f1_weighted
      value: 0.863301
    - type: main_score
      value: 0.849045
    task:
      type: Classification
  - dataset:
      name: MTEB MTOPIntentClassification (hi)
      type: mteb/mtop_intent
      revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba
      config: hi
      split: test
      languages:
      - hin-Deva
    metrics:
    - type: accuracy
      value: 0.751094
    - type: f1
      value: 0.44095
    - type: f1_weighted
      value: 0.762567
    - type: main_score
      value: 0.751094
    task:
      type: Classification
  - dataset:
      name: MTEB MTOPIntentClassification (th)
      type: mteb/mtop_intent
      revision: ae001d0e6b1228650b7bd1c2c65fb50ad11a8aba
      config: th
      split: test
      languages:
      - tha-Thai
    metrics:
    - type: accuracy
      value: 0.75566
    - type: f1
      value: 0.498529
    - type: f1_weighted
      value: 0.76994
    - type: main_score
      value: 0.75566
    task:
      type: Classification
---
For more details, including benchmark evaluation, hardware requirements, and inference performance, please refer to our [GitHub](https://github.com/neilwen987/CSR_Adaptive_Rep).

## Usage

📌 **Tip**: For NV-Embed-v2, Transformers versions **later** than 4.47.0 may lead to performance degradation, since ``model_type=bidir_mistral`` in ``config.json`` is no longer supported. We recommend ``transformers==4.47.0``.
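To pin the recommended version, install it explicitly; the exact set of companion packages below (``sentence-transformers``, ``mteb``) is an assumption about a typical evaluation setup:

```shell
pip install "transformers==4.47.0" sentence-transformers mteb
```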
### Sentence Transformers Usage

You can evaluate this model loaded by Sentence Transformers with the following code snippet:

```python
import mteb
from sentence_transformers import SparseEncoder

model = SparseEncoder(
    "Y-Research-Group/CSR-NV_Embed_v2-Classification-MTOPIntent",
    trust_remote_code=True
)
model.prompts = {
    "MTOPIntentClassification": "Instruct: Classify the intent of the given utterance in task-oriented conversation\nQuery:"
}

task = mteb.get_tasks(tasks=["MTOPIntentClassification"])
evaluation = mteb.MTEB(tasks=task)
evaluation.run(
    model,
    eval_splits=["test"],
    output_folder="./results/MTOPIntentClassification",
    show_progress_bar=True,
    encode_kwargs={"convert_to_sparse_tensor": False, "batch_size": 8},
)  # MTEB doesn't support sparse tensors yet, so we convert to dense tensors
```
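CSR produces embeddings in which only a small number of dimensions are active per vector. A minimal NumPy sketch of the top-k sparsification idea, using random placeholder vectors rather than real model outputs (the dimension ``4096`` and ``k = 64`` are illustrative assumptions, not the model's actual configuration):

```python
import numpy as np

# Placeholder embeddings standing in for SparseEncoder.encode output
# (real CSR embeddings come from the model; these are illustrative only).
rng = np.random.default_rng(0)
emb = rng.normal(size=(3, 4096))

k = 64  # assumed number of active dimensions per vector
# Keep only the top-k magnitudes per row and zero out the rest.
drop_idx = np.argsort(np.abs(emb), axis=1)[:, :-k]
np.put_along_axis(emb, drop_idx, 0.0, axis=1)

density = np.count_nonzero(emb, axis=1) / emb.shape[1]
print(density)  # each row keeps k / 4096 ≈ 1.6% of its dimensions
```

Storing only the surviving indices and values is what makes such embeddings cheap to keep around at scale; passing ``convert_to_sparse_tensor=False`` above simply asks the encoder to return the dense form instead.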
## Citation

```bibtex
@inproceedings{wenbeyond,
  title={Beyond Matryoshka: Revisiting Sparse Coding for Adaptive Representation},
  author={Wen, Tiansheng and Wang, Yifei and Zeng, Zequn and Peng, Zhong and Su, Yudi and Liu, Xinyang and Chen, Bo and Liu, Hongwei and Jegelka, Stefanie and You, Chenyu},
  booktitle={Forty-second International Conference on Machine Learning}
}
```