Meta Reward Modeling (MRM)

Overview

Meta Reward Modeling (MRM) is a personalized reward modeling framework designed to adapt to diverse user preferences with limited feedback.
Instead of learning a single global reward function, MRM treats each user as a separate learning task and applies a meta-learning approach to learn a shared initialization that enables fast, few-shot personalization.

MRM represents each user's reward as an adaptive combination of shared base reward functions and optimizes this structure through a bi-level meta-learning framework.
To improve robustness across heterogeneous users, MRM introduces a Robust Personalization Objective (RPO) that emphasizes hard-to-learn users during meta-training.
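
As a rough illustration of the two ideas above, the following sketch combines shared base rewards with per-user mixture weights and aggregates per-user losses so that harder users count more. This is a minimal sketch, not the released implementation: the softmax parameterization of the weights, the softmax-over-losses form of the robust aggregation, and the names `user_reward`, `robust_aggregate`, and `temperature` are all assumptions made for illustration. The exact form of RPO is not specified in this card.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def user_reward(base_rewards, user_logits):
    """Combine shared base-reward scores with user-specific mixture weights.

    base_rewards: scores from the shared base reward functions for one response.
    user_logits:  hypothetical per-user parameters (assumed softmax-normalized).
    """
    weights = softmax(user_logits)
    return sum(w * r for w, r in zip(weights, base_rewards))


def robust_aggregate(user_losses, temperature=1.0):
    """Aggregate per-user losses, up-weighting users with higher loss.

    Assumed stand-in for a robust objective: softmax weights over the losses.
    A large temperature approaches the plain mean; a small one approaches the max.
    """
    weights = softmax([loss / temperature for loss in user_losses])
    return sum(w * loss for w, loss in zip(weights, user_losses))
```

With uniform logits, `user_reward` reduces to the mean of the base rewards, and `robust_aggregate` always lies between the mean and the maximum of the per-user losses.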

This repository provides trained checkpoints for reward modeling and user-level preference evaluation.


Evaluation

The model is evaluated using user-level preference accuracy with few-shot personalization.
Inference follows the same adaptation procedure used during training: for each user, the reward weights start from the meta-learned initialization and are updated with a small number of gradient steps on that user's preference data.
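
The adaptation and scoring procedure described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the logic of `inference.py`: it uses a linear reward over base-reward scores (the released model takes a `--hidden_layers` argument, so it is presumably not linear), a Bradley-Terry style pairwise loss for the inner steps, and the hypothetical helper names `adapt_user_weights` and `user_accuracy`.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def adapt_user_weights(init_weights, support_pairs, inner_lr=1e-3, steps=1):
    """Few-shot personalization: gradient steps from the shared initialization.

    support_pairs: list of (chosen_scores, rejected_scores) for one user, where
    each element is a list of base-reward scores (hypothetical data layout).
    The loss is the assumed pairwise objective -log(sigmoid(margin)).
    """
    w = list(init_weights)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for chosen, rejected in support_pairs:
            diff = [c - r for c, r in zip(chosen, rejected)]
            margin = sum(wi * di for wi, di in zip(w, diff))
            # Gradient of -log(sigmoid(margin)) w.r.t. w is -(1 - sigmoid(margin)) * diff
            coeff = -(1.0 - sigmoid(margin))
            for i, di in enumerate(diff):
                grad[i] += coeff * di / len(support_pairs)
        w = [wi - inner_lr * gi for wi, gi in zip(w, grad)]
    return w


def user_accuracy(weights, query_pairs):
    """Fraction of held-out pairs where the chosen response scores higher."""
    correct = 0
    for chosen, rejected in query_pairs:
        margin = sum(w * (c - r) for w, c, r in zip(weights, chosen, rejected))
        if margin > 0:
            correct += 1
    return correct / len(query_pairs)
```

In this sketch, user-level accuracy would be computed by calling `adapt_user_weights` once per user on that user's few-shot pairs, then averaging `user_accuracy` over users on their held-out pairs. The `inner_lr` and `steps` parameters play the role that `--inner_lr` and `--eval_inner_epochs` appear to play in the command below.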

Example evaluation command

python inference.py \
  --embed_pt data/emb/prism/V1.pt \
  --meta_json data/emb/prism/V1.json \
  --ckpt path/to/checkpoint.pt \
  --dataset PRISM \
  --seen_train_limit -1 \
  --unseen_train_limit -1 \
  --hidden_layers 2 \
  --inner_lr 1e-3 \
  --eval_inner_epochs 1 \
  --val_ratio 0.9 \
  --score_threshold -1 \
  --seed 42 \
  --device cuda:0

Citation

If you use this model or code in your research, please cite:

@article{mrm2025,
  title   = {Meta Reward Modeling for Personalized Alignment},
  author  = {Author Names},
  journal = {arXiv preprint arXiv:XXXX.XXXXX},
  year    = {2025}
}

License

This model is released under the MIT License.
