How to use TJKlein/CLIP-ViT with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("zero-shot-image-classification", model="TJKlein/CLIP-ViT")
pipe(
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/hub/parrots.png",
    candidate_labels=["animals", "humans", "landscape"],
)
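The zero-shot image classification pipeline returns a list of dictionaries, one per candidate label, each holding a `label` and a `score`. A minimal sketch of reading that result follows. `top_label` and its `min_score` threshold are hypothetical helpers for illustration and are not part of Transformers.

```python
from typing import Optional


def top_label(predictions, min_score=0.0) -> Optional[str]:
    """Return the highest-scoring label, or None if nothing reaches min_score.

    `predictions` is assumed to be a list of {"label": str, "score": float}
    dictionaries, the shape the pipeline call returns for a single image.
    """
    if not predictions:
        return None
    # Use max() rather than relying on the order of the list.
    best = max(predictions, key=lambda p: p["score"])
    return best["label"] if best["score"] >= min_score else None
```

With the pipeline above, this could be called as `top_label(pipe(image_url, candidate_labels=["animals", "humans", "landscape"]))`, where `image_url` stands for the image URL or path being classified.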
# Load model directly
from transformers import AutoProcessor, AutoModelForZeroShotImageClassification

processor = AutoProcessor.from_pretrained("TJKlein/CLIP-ViT")
model = AutoModelForZeroShotImageClassification.from_pretrained("TJKlein/CLIP-ViT")
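Loading the processor and model directly does not run any inference by itself. The sketch below shows one way the two objects might be used together, assuming the checkpoint behaves like a standard CLIP model whose output exposes `logits_per_image`. The helpers `softmax`, `rank_labels`, and `classify` are hypothetical names introduced here, not part of Transformers.

```python
import math


def softmax(logits):
    """Convert raw similarity logits into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rank_labels(logits, candidate_labels):
    """Pair each label with its probability, highest probability first."""
    probs = softmax(logits)
    return sorted(zip(candidate_labels, probs), key=lambda pair: pair[1], reverse=True)


def classify(image, candidate_labels, model_id="TJKlein/CLIP-ViT"):
    """Score one PIL image against the candidate labels (hypothetical helper)."""
    # Heavy imports stay local so the helpers above work without them.
    import torch
    from transformers import AutoProcessor, AutoModelForZeroShotImageClassification

    processor = AutoProcessor.from_pretrained(model_id)
    model = AutoModelForZeroShotImageClassification.from_pretrained(model_id)

    # Tokenize the labels and preprocess the image in a single call.
    inputs = processor(
        text=candidate_labels, images=image, return_tensors="pt", padding=True
    )
    with torch.no_grad():
        outputs = model(**inputs)

    # Assumption: CLIP-style output, logits_per_image has shape (1, num_labels).
    logits = outputs.logits_per_image[0].tolist()
    return rank_labels(logits, candidate_labels)
```

Given a local copy of an image, a call might look like `classify(Image.open("parrots.png"), ["animals", "humans", "landscape"])` with `Image` imported from PIL; the file name is only an illustration. Applying softmax across the candidate labels mirrors what the pipeline does for CLIP-style models, so the two approaches should produce comparable scores.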