- XQuant: Breaking the Memory Wall for LLM Inference with KV Cache Rematerialization
  Paper • 2508.10395 • Published • 42
- Speed Always Wins: A Survey on Efficient Architectures for Large Language Models
  Paper • 2508.09834 • Published • 53
- Causal Attention with Lookahead Keys
  Paper • 2509.07301 • Published • 21
Tanmay Gangwani (tgangs)
AI & ML interests: None yet
Recent Activity
- updated a collection about 2 months ago: Agents
Organizations: None yet