SPES Collection
Pretrained models for the paper "Pretraining A Large Language Model using Distributed GPUs: A Memory-Efficient Decentralized Paradigm" (3 items).
SPES-7B is a pretrained language model released as part of the paper "Pretraining A Large Language Model using Distributed GPUs: A Memory-Efficient Decentralized Paradigm".
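Below is a minimal sketch of loading the checkpoint with the Hugging Face transformers library, assuming the weights are published on the Hub; the repo id used here is a placeholder, not a confirmed path.

```python
# Minimal sketch: loading SPES-7B as a causal LM with transformers.
# NOTE: "SPES/SPES-7B" is a hypothetical repo id; substitute the actual Hub path.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "SPES/SPES-7B"  # hypothetical repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Generate a short continuation from a prompt.
prompt = "Decentralized pretraining makes it possible to"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```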
This model is intended for:
If you use this model, please cite the SPES paper:
@article{zhang2026spes,
  title={Pretraining A Large Language Model using Distributed GPUs: A Memory-Efficient Decentralized Paradigm},
  author={Zhang, Jinrui and Xiao, Chaodong and Wu, Aoqi and Zhang, Xindong and Zhang, Lei},
  year={2026}
}