SpectralPO
Organization Card
This organization hosts the models for the paper:
Spectral Policy Optimization: Coloring your Incorrect Reasoning in GRPO
https://arxiv.org/abs/2505.11595
If you use these models, please cite:
@inproceedings{chen2025spectral,
title = {Spectral Policy Optimization: Coloring your Incorrect Reasoning in {GRPO}},
author = {Peter Chen and Xiaopeng Li and Ziniu Li and Xi Chen and Tianyi Lin},
booktitle = {2nd AI for Math Workshop @ ICML 2025},
year = {2025},
url = {https://openreview.net/forum?id=IIBDElbi7s}
}
Models (27)
SpectralPO/DeepSeek-R1-Distill-Qwen-7B-SPO-QwQ-Ablation (8B)
SpectralPO/DeepSeek-R1-Distill-Qwen-32B-GRPO
SpectralPO/DeepSeek-R1-Distill-Qwen-32B-SPO
SpectralPO/DeepSeek-R1-Distill-Qwen-7B-SPO-Qwen3-235B (8B)
SpectralPO/DeepSeek-R1-Distill-Qwen-7B-SPO-QwQ (8B)
SpectralPO/DeepSeek-R1-Distill-Qwen-7B-SPO-DeepSeek-V3 (8B)
SpectralPO/DeepSeek-R1-Distill-Llama-8B-SPO (8B)
SpectralPO/DeepSeek-R1-Distill-Llama-8B-GRPO (8B)
SpectralPO/Qwen2.5-32B-Instruct-GRPO (33B)
SpectralPO/Qwen2.5-32B-Instruct-SPO (33B)
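The checkpoints listed above can presumably be loaded like any other Hub-hosted causal language model. The following is a minimal sketch, assuming the repositories follow the standard Hugging Face `transformers` layout of their base models (DeepSeek-R1-Distill and Qwen2.5); the page itself does not confirm this. The names `SPO_MODELS` and `load_checkpoint` are hypothetical and used only for illustration.

```python
# Repo IDs copied from the model listing; the list and helper names are
# illustrative and not part of any official API.
SPO_MODELS = [
    "SpectralPO/DeepSeek-R1-Distill-Qwen-7B-SPO-QwQ",
    "SpectralPO/DeepSeek-R1-Distill-Llama-8B-SPO",
]


def load_checkpoint(repo_id):
    """Download and load a tokenizer and model from the Hugging Face Hub.

    Assumption: the repository stores a standard causal-LM checkpoint
    that AutoModelForCausalLM can resolve from its config.
    """
    # Imported lazily so this module can be imported without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id)
    return tokenizer, model
```

Calling `load_checkpoint(SPO_MODELS[0])` downloads the full weights on first use, so it needs network access and substantial disk space and memory.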
Datasets (0)
None public yet.