RADLADS: Rapid Attention Distillation to Linear Attention Decoders at Scale
Paper: arXiv:2505.03005
Hugging Face space for RWKV-X related developments, including build assets, etc. Nothing in here is considered an "official release": we treat this as a giant file dump.