---
license: mit
pipeline_tag: text-generation
---

<h1 align="center">
On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models
</h1>

<div align="center">

<a href="https://chenlong-clock.github.io">Charlie Zhang</a>, <a href="https://www.phontron.com">Graham Neubig</a>, 
<a href="https://xiangyue9607.github.io">Xiang Yue</a>

Carnegie Mellon University, Language Technologies Institute

</div>

<div align="center">

[![arXiv](https://img.shields.io/badge/arXiv-2512.07783-b31b1b.svg?logo=arxiv&logoColor=white)](https://arxiv.org/abs/2512.07783)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
![Python](https://img.shields.io/badge/python-3.9%2B-blue)

</div>


This repository contains post-training checkpoints from the extrapolation tasks studied in the paper.

**Code:** [GitHub Repository](https://github.com/Interplay-LM-Reasoning/Interplay-LM-Reasoning)
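
## 🚀 Usage

The checkpoints should load with the standard 🤗 Transformers text-generation API. Below is a minimal sketch; the repository ID is a placeholder, substitute the name of the checkpoint you want to use from this collection.

```python
# Minimal sketch: load a checkpoint and run greedy decoding with Transformers.
# NOTE: the model ID below is a placeholder, not an actual released checkpoint name.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Interplay-LM-Reasoning/placeholder-checkpoint"  # hypothetical ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the dtype stored in the checkpoint
    device_map="auto",    # requires `accelerate` for automatic device placement
)

prompt = "Question: What is 17 * 24? Answer:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```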

## 📚 Citation

If you find this work or code useful, please consider citing:

```bibtex
@misc{zhang2025interplaypretrainingmidtrainingrl,
      title={On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models}, 
      author={Charlie Zhang and Graham Neubig and Xiang Yue},
      year={2025},
      eprint={2512.07783},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2512.07783}, 
}
```