---
license: apache-2.0
base_model:
- sentence-transformers/all-MiniLM-L6-v2
---
**This model is a Neuron-compiled version of https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2.**

It was compiled with version 2.19.1 of the Neuron SDK. If your environment uses a different SDK version, you may need to run the compilation process again.

See https://huggingface.co/docs/optimum-neuron/en/inference_tutorials/sentence_transformers for more details.

For information on how to run on SageMaker: https://huggingface.co/docs/optimum-neuron/en/inference_tutorials/sentence_transformers

To run:

```
from optimum.neuron import NeuronModelForSentenceTransformers
from transformers import AutoTokenizer

# Precompiled model hosted on the Hugging Face Hub
model_id = "jburtoft/all-MiniLM-L6-v2-neuron"

# Use the line below instead if you had to compile the model yourself;
# it points at the local output directory of the compile step.
# model_id = "all-MiniLM-L6-v2-neuron"

model = NeuronModelForSentenceTransformers.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Run inference on a single sentence
prompt = "I like to eat apples"
encoded_input = tokenizer(prompt, return_tensors="pt")
outputs = model(**encoded_input)

# Per-token embeddings and the pooled sentence-level embedding
token_embeddings = outputs.token_embeddings
sentence_embedding = outputs.sentence_embedding

print(f"token embeddings: {token_embeddings.shape}")  # torch.Size([1, 7, 384])
print(f"sentence_embedding: {sentence_embedding.shape}")  # torch.Size([1, 384])
```
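
Sentence embeddings such as the `sentence_embedding` returned above are commonly compared with cosine similarity. The following is a minimal sketch in plain Python, assuming each embedding has already been converted to a flat list of floats (for example with `sentence_embedding[0].tolist()`). The helper name `cosine_similarity` and the toy 3-dimensional vectors are illustrative assumptions, not part of this model or of optimum-neuron.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for real 384-dimensional sentence embeddings
# (hypothetical values, chosen only to make the behavior easy to see).
emb_apples = [1.0, 2.0, 0.0]
emb_pears = [2.0, 4.0, 0.0]  # same direction as emb_apples
emb_cars = [0.0, 0.0, 3.0]   # orthogonal to emb_apples

print(cosine_similarity(emb_apples, emb_pears))
print(cosine_similarity(emb_apples, emb_cars))
```

Vectors pointing in the same direction score close to 1, and orthogonal vectors score 0. With real embeddings, a higher score suggests the two sentences are closer in meaning.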

To compile the model yourself (the last argument is the output directory):

```
optimum-cli export neuron -m sentence-transformers/all-MiniLM-L6-v2 --sequence_length 512 --batch_size 1 --task feature-extraction all-MiniLM-L6-v2-neuron
```