Instructions for using Burnt-Toast/fujin-9b with libraries and local apps.

Libraries
Transformers
How to use Burnt-Toast/fujin-9b with Transformers:

```
# Use a pipeline as a high-level helper
from transformers import pipeline

# The messages below include an image, so use the image-text-to-text task
pipe = pipeline("image-text-to-text", model="Burnt-Toast/fujin-9b")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)
```

```
# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("Burnt-Toast/fujin-9b")
model = AutoModelForImageTextToText.from_pretrained("Burnt-Toast/fujin-9b")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
# Tokenize the chat and move the tensors to the model's device
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
# Decode only the newly generated tokens, skipping the prompt
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```

Local Apps

vLLM

How to use Burnt-Toast/fujin-9b with vLLM:

Install from pip and serve model

```
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Burnt-Toast/fujin-9b"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "Burnt-Toast/fujin-9b",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
```
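
The same request can also be sent from Python. The sketch below uses only the standard library and mirrors the curl call above. The helper names `build_chat_request` and `chat` are hypothetical, and the response parsing assumes the usual OpenAI-compatible shape (`choices[0].message.content`).

```python
import json
import urllib.request


def build_chat_request(prompt, model="Burnt-Toast/fujin-9b",
                       base_url="http://localhost:8000"):
    """Build a POST request for the OpenAI-compatible chat completions route."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, **kwargs):
    """Send the request and return the assistant's reply text.

    Assumes a server is already running and answers in the
    OpenAI-compatible response format.
    """
    request = build_chat_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With the vLLM server running, `chat("What is the capital of France?")` should return the reply text. For the SGLang server described on this page, pass `base_url="http://localhost:30000"`.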

Use Docker

```
docker model run hf.co/Burnt-Toast/fujin-9b
```

SGLang

How to use Burnt-Toast/fujin-9b with SGLang:

Install from pip and serve model

```
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Burnt-Toast/fujin-9b" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "Burnt-Toast/fujin-9b",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
```

Use Docker images

```
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Burnt-Toast/fujin-9b" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "Burnt-Toast/fujin-9b",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
```

Docker Model Runner
How to use Burnt-Toast/fujin-9b with Docker Model Runner:
```
docker model run hf.co/Burnt-Toast/fujin-9b
```

output-fujin

This model is a fine-tuned version of Qwen/Qwen3.5-9B.

W&B run: https://wandb.ai/cooawoo-personal/huggingface/runs/sr7glk4m

Training procedure

Hyperparameters

| Parameter | Value |
| --- | --- |
| Learning rate | 0.0002 |
| LR scheduler | cosine |
| Per-device batch size | 1 |
| Gradient accumulation | 8 |
| Effective batch size | 8 |
| Epochs | 1 |
| Max sequence length | 2048 |
| Optimizer | paged_ademamix_8bit |
| Weight decay | 0.01 |
| Warmup ratio | 0.05 |
| Max gradient norm | 1.0 |
| Precision | bf16 |
| Loss type | nll |
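
To make the learning-rate settings concrete, here is a minimal sketch of a cosine schedule with linear warmup, using the learning rate (0.0002) and warmup ratio (0.05) listed above. It assumes the common formulation: linear ramp from zero, then cosine decay to zero. The training framework's exact schedule is not shown in this card, and `lr_at_step` and the step count are hypothetical.

```python
import math


def lr_at_step(step, total_steps, base_lr=2e-4, warmup_ratio=0.05):
    """Learning rate after `step` optimizer steps.

    Linear warmup from 0 to base_lr, then cosine decay back to 0.
    """
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


total_steps = 1000  # hypothetical step count, for illustration only
for step in (0, 25, 50, 525, 1000):
    print(step, f"{lr_at_step(step, total_steps):.2e}")
```

The rate peaks at the end of warmup and falls smoothly afterwards, so most of the run trains below the nominal 0.0002.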

LoRA configuration

| Parameter | Value |
| --- | --- |
| Rank (r) | 128 |
| Alpha | 16 |
| Dropout | 0.05 |
| Target modules | attn.proj, down_proj, gate_proj, in_proj_a, in_proj_b, in_proj_qkv, in_proj_z, k_proj, linear_fc1, linear_fc2, o_proj, out_proj, q_proj, qkv, up_proj, v_proj |
| Quantization | 4-bit (nf4) |
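
The rank and alpha above determine how strongly the adapter update is scaled. The sketch below assumes standard LoRA scaling (alpha / r). The card does not say whether a variant such as rsLoRA was used, so the second value is shown only for comparison. The layer dimensions passed to `lora_param_count` are hypothetical, not taken from the model.

```python
import math

lora_r = 128
lora_alpha = 16

# Standard LoRA applies delta_W = (alpha / r) * B @ A
standard_scale = lora_alpha / lora_r

# rsLoRA would use alpha / sqrt(r) instead; shown only for comparison
rslora_scale = lora_alpha / math.sqrt(lora_r)


def lora_param_count(d_in, d_out, r=lora_r):
    """Trainable parameters added to one linear layer: A is r x d_in, B is d_out x r."""
    return r * (d_in + d_out)


print(f"standard LoRA scale: {standard_scale}")
print(f"rsLoRA scale:        {rslora_scale:.3f}")
# Hypothetical 4096 x 4096 projection, for illustration only
print(f"adapter params:      {lora_param_count(4096, 4096):,}")
```

Under the standard formulation, a high rank paired with a low alpha gives a scale well below 1, which damps the adapter's contribution relative to the common alpha = 2r choice.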

Dataset statistics

| Dataset | Samples | Total tokens | Trainable tokens |
| --- | --- | --- | --- |
| rpDungeon/some-revised-datasets/rosier_inf_strict_text.parquet | 36,438 | 65,084,381 | 65,084,381 |
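
A few derived figures help put these statistics in context. The sketch below assumes that each of the 36,438 samples is one training sequence (no packing) and that training ran on a single device, so the effective batch size of 8 applies directly. Neither assumption is confirmed by this card.

```python
import math

samples = 36_438
total_tokens = 65_084_381
trainable_tokens = 65_084_381
max_length = 2048
effective_batch_size = 8  # 1 per device x 8 accumulation steps, single device assumed

avg_tokens = total_tokens / samples
fill_ratio = avg_tokens / max_length
trainable_fraction = trainable_tokens / total_tokens
steps_per_epoch = math.ceil(samples / effective_batch_size)

print(f"average tokens per sample: {avg_tokens:.0f}")
print(f"share of max length used:  {fill_ratio:.0%}")
print(f"trainable fraction:        {trainable_fraction:.0%}")
print(f"optimizer steps per epoch: {steps_per_epoch}")
```

Under these assumptions, samples average about 1,786 tokens, every token contributes to the loss, and one epoch takes roughly 4,555 optimizer steps.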

Training config

```
model_name_or_path: Qwen/Qwen3.5-9B
bf16: true
gradient_checkpointing: true
gradient_checkpointing_kwargs:
  use_reentrant: false
use_liger: true
max_length: 2048
learning_rate: 0.0002
warmup_ratio: 0.05
weight_decay: 0.01
lr_scheduler_type: cosine
label_smoothing_factor: 0.1
per_device_train_batch_size: 1
gradient_accumulation_steps: 8
optim: paged_ademamix_8bit
max_grad_norm: 1.0
use_peft: true
load_in_4bit: true
lora_r: 128
lora_alpha: 16
lora_dropout: 0.05
logging_steps: 1
disable_tqdm: true
save_strategy: steps
save_steps: 500
save_total_limit: 3
report_to: wandb
output_dir: output-fujin
data_config: data.yaml
prepared_dataset: prepared
num_train_epochs: 1
saves_per_epoch: 3
run_name: qwen35-9b-qlora
```
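
The config sets `label_smoothing_factor: 0.1` on top of the NLL loss. As a rough illustration, the sketch below computes a label-smoothed loss for a single token in plain Python, assuming the common formulation that mixes the NLL with the mean negative log-probability over the vocabulary. Whether the training framework (with `use_liger: true`) applies smoothing in exactly this way is an assumption, and `label_smoothed_nll` is a hypothetical helper.

```python
import math


def label_smoothed_nll(logits, target, epsilon=0.1):
    """Loss for one token: (1 - epsilon) * NLL + epsilon * uniform term."""
    # Numerically stable log-softmax
    max_logit = max(logits)
    log_z = max_logit + math.log(sum(math.exp(x - max_logit) for x in logits))
    log_probs = [x - log_z for x in logits]

    nll = -log_probs[target]
    # Mean negative log-probability over the whole vocabulary
    uniform = -sum(log_probs) / len(log_probs)
    return (1.0 - epsilon) * nll + epsilon * uniform


logits = [2.0, 0.5, -1.0, 0.0]  # toy 4-token vocabulary
print(label_smoothed_nll(logits, target=0, epsilon=0.0))
print(label_smoothed_nll(logits, target=0, epsilon=0.1))
```

With `epsilon=0.0` the function reduces to plain NLL. A positive epsilon penalizes very confident predictions, which is the usual motivation for smoothing.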

Data config

```
datasets:
- path: rpDungeon/some-revised-datasets
  data_files: rosier_inf_strict_text.parquet
  type: text
  truncation_strategy: split
shuffle_datasets: true
shuffle_combined: true
shuffle_seed: 42
eval_split: 0.0
split_seed: 42
assistant_only_loss: false
```
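
The `truncation_strategy: split` setting suggests that documents longer than the maximum sequence length are cut into several training sequences rather than truncated. The sketch below shows one plausible reading: consecutive, non-overlapping chunks of at most 2048 tokens. The framework's actual behavior (overlap, boundary handling) is not documented here, and `split_into_chunks` is a hypothetical helper.

```python
def split_into_chunks(token_ids, max_length=2048):
    """Split one tokenized document into consecutive chunks of at most max_length tokens."""
    return [
        token_ids[start:start + max_length]
        for start in range(0, len(token_ids), max_length)
    ]


document = list(range(5000))  # stand-in for a tokenized document
chunks = split_into_chunks(document)
print([len(chunk) for chunk in chunks])  # [2048, 2048, 904]
```

Splitting keeps every token available for training, which is consistent with the total and trainable token counts being equal in the dataset statistics.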

Framework versions

PEFT: 0.18.1
Loft: 0.1.0
Transformers: 5.2.0
PyTorch: 2.10.0
Datasets: 4.5.0
Tokenizers: 0.22.2

Safetensors

Model size: 10B params
Tensor type: BF16, F32

Model tree for Burnt-Toast/fujin-9b

Base model: Qwen/Qwen3.5-9B-Base
Finetuned: Qwen/Qwen3.5-9B
Adapter (170): this model
Adapters: 2 models