Interplay-LM-Reasoning/extrapolation_rl

Text Generation

Clockz committed on Dec 10, 2025

Commit d95b6a3 · verified · 1 Parent(s): a1ca024

Create README.md

Files changed (1)

README.md +43 -0

README.md ADDED

	@@ -0,0 +1,43 @@

+---
+license: mit
+---
+<h1 align="center">
+On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models
+</h1>
+<div align="center">
+<a href="https://chenlong-clock.github.io">Charlie Zhang</a>, <a href="https://www.phontron.com">Graham Neubig</a>,
+<a href="https://xiangyue9607.github.io">Xiang Yue</a>
+Carnegie Mellon University, Language Technologies Institute
+</div>
+<div align="center">
+[![arXiv](https://img.shields.io/badge/arXiv-2512.07783-b31b1b.svg?logo=arxiv&logoColor=white)](https://arxiv.org/abs/2512.07783)
+[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
+![Python](https://img.shields.io/badge/python-3.9%2B-blue)
+</div>
+This repository contains the post-trained checkpoints for the extrapolation tasks.
+## 📚 Citation
+If you find this work or code useful, please consider citing:
+```bibtex
+@misc{zhang2025interplaypretrainingmidtrainingrl,
+      title={On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models},
+      author={Charlie Zhang and Graham Neubig and Xiang Yue},
+      year={2025},
+      eprint={2512.07783},
+      archivePrefix={arXiv},
+      primaryClass={cs.CL},
+      url={https://arxiv.org/abs/2512.07783},
+}
+```