ROOT: Robust Orthogonalized Optimizer for Neural Network Training Paper • 2511.20626 • Published 13 days ago • 169 • 4
DenseMamba: State Space Models with Dense Hidden Connection for Efficient Large Language Models Paper • 2403.00818 • Published Feb 26, 2024 • 19 • 2