QuickSilver -- Speeding up LLM Inference through Dynamic Token Halting, KV Skipping, Contextual Token Fusion, and Adaptive Matryoshka Quantization Paper • 2506.22396 • Published Jun 27
ObfusQAte: A Proposed Framework to Evaluate LLM Robustness on Obfuscated Factual Question Answering Paper • 2508.07321 • Published Aug 10