Models

93

Active filters: reward-model

mradermacher/ThinkPRM-7B-GGUF

8B • Updated Jul 11 • 359

mradermacher/ThinkPRM-7B-i1-GGUF

8B • Updated Jul 11 • 662

Huanghz/align2llava-7b-lora-question

Updated May 21 • 4

Huanghz/align2llava-7b-lora-answer

Updated May 21 • 4

nvidia/Qwen-2.5-Nemotron-32B-Reward

Text Classification • 32B • Updated Jun 26 • 80 • 2

nvidia/Qwen-3-Nemotron-32B-Reward

Text Classification • 32B • Updated Jun 26 • 65 • 18

zhuohaoyu/RewardAnything-8B-v1

Text Generation • 8B • Updated Jun 5 • 33 • 3

mradermacher/RewardAnything-8B-v1-GGUF

8B • Updated Jul 11 • 174

WisdomShell/RewardAnything-8B-v1

Text Generation • 8B • Updated Jun 5 • 470 • 22

Skywork/Skywork-Reward-V2-Qwen3-8B

Text Classification • 8B • Updated Jul 6 • 6.76k • 18

ContextualAI/ctx-bird-reward-250121

Text Generation • 33B • Updated 4 days ago • 32 • 3

Bifrost-AI/Qwen-3-Nemotron-32B-Reward-F16

Text Classification • 32B • Updated Jul 11 • 11

tensorblock/WisdomShell_RewardAnything-8B-v1-GGUF

Text Generation • 8B • Updated Jul 18 • 133

ulab-ai/sotopia-rl-qwen2.5-7B-rm

Feature Extraction • Updated Aug 7 • 1

ilgee/Binary-Think-RM-3B

3B • Updated Nov 2 • 6 • 1

gandhiraketla277/demo-lora-reward-model

Text Generation • Updated Aug 10 • 1

Schrieffer/Llama-SARM-4B

Reinforcement Learning • 5B • Updated 25 days ago • 38 • 1

ykorkmaz/rfm_no_failure

4B • Updated Aug 30 • 11

abraranwar/spur_metaworld

4B • Updated Aug 31 • 3

ykorkmaz/rfm_progress_only

4B • Updated Sep 1 • 9

kewu93/skywork-medarena-lora-v1

Updated Sep 18 • 3

kewu93/skywork-medarena-lora-v2

Text Classification • Updated Sep 18 • 8

nabeelshan/rlhf-gpt2-pipeline

Text Generation • Updated Sep 24

Schrieffer/Llama-SARM-4B-PostSAEPretrain

Feature Extraction • 5B • Updated 24 days ago • 78 • 1

dongboklee/gPRM-14B

Text Generation • Updated Oct 6 • 16 • 1

dongboklee/gPRM-14B-merged

Text Generation • 15B • Updated Oct 6 • 153 • 2

dongboklee/gORM-14B

Text Generation • Updated Oct 6

dongboklee/gORM-14B-merged

Text Generation • 15B • Updated Oct 6 • 264 • 1

mradermacher/gPRM-14B-merged-GGUF

15B • Updated Oct 3 • 146 • 1

mradermacher/gORM-14B-merged-GGUF

15B • Updated Oct 3 • 224 • 1