Pager / README.md
7lalala's picture
Create README.md
5332aaf verified
metadata
library_name: transformers
pipeline_tag: image-text-to-text
tags:
  - qwen3-vl
  - vision-language
  - multimodal
  - image-text-to-text

Pager

This repository contains the model weights, tokenizer, processor, and configuration files for Pager, a vision-language model based on the Qwen3-VL architecture.

Files

The repository includes:

  • config.json
  • generation_config.json
  • tokenizer.json
  • tokenizer_config.json
  • vocab.json
  • merges.txt
  • special_tokens_map.json
  • added_tokens.json
  • preprocessor_config.json
  • video_preprocessor_config.json
  • chat_template.jinja
  • model.safetensors.index.json
  • model-00001-of-00004.safetensors
  • model-00002-of-00004.safetensors
  • model-00003-of-00004.safetensors
  • model-00004-of-00004.safetensors

Usage

Install dependencies:

pip install -U transformers accelerate safetensors pillow

Load the model:

import torch
from transformers import AutoProcessor, AutoModelForImageTextToText

model_id = "OpenRaiser/Pager"

processor = AutoProcessor.from_pretrained(
    model_id,
    trust_remote_code=True
)

model = AutoModelForImageTextToText.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
    trust_remote_code=True
)

print("Model loaded successfully.")

If your local transformers version does not support this model class, please upgrade transformers first.

Notes

  • The model weights are stored in four .safetensors shards.
  • model.safetensors.index.json maps model parameters to the corresponding weight shards.
  • This repository is intended for research and development use.

Citation

If you use this model, please cite or link to this repository:

https://huggingface.co/OpenRaiser/Pager