# Model Card for Respeecher/ukrainian-data2vec

This model can be used as a feature extractor for Ukrainian-language audio data.
It can also serve as a backbone for downstream tasks such as automatic speech recognition (ASR) and audio classification.
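As a rough illustration of the backbone use case, the sketch below mean-pools frame-level features into one vector per clip and passes it to a small classification head. The `ClipClassifier` name, the hidden size of 768, and the three output classes are assumptions made for illustration and are not taken from this model's configuration. Random tensors stand in for the model's `last_hidden_state`.

```python
import torch
from torch import nn


class ClipClassifier(nn.Module):
    """Hypothetical head: mean-pool frame features, then apply a linear layer."""

    def __init__(self, hidden_size: int, num_classes: int):
        super().__init__()
        self.linear = nn.Linear(hidden_size, num_classes)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # hidden_states has shape (batch, frames, hidden_size)
        pooled = hidden_states.mean(dim=1)  # (batch, hidden_size)
        return self.linear(pooled)  # (batch, num_classes)


# Random stand-in for `outputs.last_hidden_state`; 768 is an assumed hidden size.
dummy_hidden_states = torch.randn(2, 49, 768)
head = ClipClassifier(hidden_size=768, num_classes=3)
logits = head(dummy_hidden_states)
print(list(logits.shape))  # [2, 3]
```

For batches of padded clips, a plain mean also averages over padding frames, so a masked mean would likely be a better fit there.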
### How to Get Started with the Model

```python
import torch
from datasets import Audio, load_dataset
from transformers import AutoProcessor, Data2VecAudioModel

dataset = load_dataset("mozilla-foundation/common_voice_11_0", "uk", split="validation")
# Resample to 16 kHz
dataset = dataset.cast_column("audio", Audio(sampling_rate=16_000))

processor = AutoProcessor.from_pretrained("Respeecher/ukrainian-data2vec")
model = Data2VecAudioModel.from_pretrained("Respeecher/ukrainian-data2vec")

# The audio file is decoded (and resampled) on the fly when the sample is accessed
audio = dataset[0]["audio"]
inputs = processor(audio["array"], sampling_rate=audio["sampling_rate"], return_tensors="pt")

# Inference only, so gradient tracking is disabled
with torch.no_grad():
    outputs = model(**inputs)

last_hidden_states = outputs.last_hidden_state
print(list(last_hidden_states.shape))  # [batch, frames, hidden_size]
```
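To help interpret the printed shape, the sketch below estimates how many frames the model emits for a clip of a given length. It assumes the checkpoint keeps the default `Data2VecAudioConfig` convolutional feature encoder (kernel sizes `(10, 3, 3, 3, 3, 2, 2)` and strides `(5, 2, 2, 2, 2, 2, 2)`), which has not been verified against this model's `config.json`. The helper name `num_output_frames` is hypothetical.

```python
def num_output_frames(
    num_samples: int,
    kernels=(10, 3, 3, 3, 3, 2, 2),   # assumed default conv kernel sizes
    strides=(5, 2, 2, 2, 2, 2, 2),    # assumed default conv strides
) -> int:
    """Estimate the frame count after a stack of unpadded 1-D convolutions."""
    length = num_samples
    for kernel, stride in zip(kernels, strides):
        # Standard output-length formula for a convolution without padding
        length = (length - kernel) // stride + 1
    return length


# One second of 16 kHz audio
print(num_output_frames(16_000))  # 49
```

Under these assumptions the strides multiply to 320 samples, so each frame covers roughly 20 ms of 16 kHz audio.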