SPES Collection
Pretrained models for the paper "Pretraining A Large Language Model using Distributed GPUs: A Memory-Efficient Decentralized Paradigm" (3 items).
SPES-7B is a pretrained language model released as part of the paper "Pretraining A Large Language Model using Distributed GPUs: A Memory-Efficient Decentralized Paradigm".
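Below is a minimal sketch of loading the checkpoint with the Hugging Face transformers library, assuming the weights are published on the Hub; the repo id used here is a placeholder, not a confirmed path.

```python
# Minimal sketch: loading SPES-7B as a causal LM with transformers.
# NOTE: "SPES/SPES-7B" is a hypothetical repo id; substitute the actual Hub path.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "SPES/SPES-7B"  # hypothetical repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Generate a short continuation from a prompt.
prompt = "Decentralized pretraining makes it possible to"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```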
This model is intended for:
If you use this model, please cite the SPES paper:
@article{zhang2026spes,
  title={Pretraining A Large Language Model using Distributed GPUs: A Memory-Efficient Decentralized Paradigm},
  author={Zhang, Jinrui and Xiao, Chaodong and Wu, Aoqi and Zhang, Xindong and Zhang, Lei},
  year={2026}
}