H-Net Collection The family of hierarchical networks (H-Nets) from https://arxiv.org/abs/2507.07955 • 8 items • Updated Jul 11 • 20
Encoders vs Decoders: the Ettin Suite Collection A collection of SOTA, open-data, paired encoder-only and decoder-only models ranging from 17M to 1B parameters. See the paper at https://arxiv.org/abs/250 • 32 items • Updated Jul 16 • 25
Article NVIDIA Releases 3 Million Sample Dataset for OCR, Visual Question Answering, and Captioning Tasks Aug 11 • 75
Optimizing Length Compression in Large Reasoning Models Paper • 2506.14755 • Published Jun 17 • 10
Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems Paper • 2504.01990 • Published Mar 31 • 299
RLVR Collection Model and data for 'Expanding RL with Verifiable Rewards Across Diverse Domains' • 3 items • Updated Mar 31 • 13
IOI Collection Resources related to International Olympiad in Informatics (IOI) problems • 5 items • Updated May 13 • 7
Article ColPali: Efficient Document Retrieval with Vision Language Models Jul 5, 2024 • 303
From Words to Numbers: Your Large Language Model Is Secretly A Capable Regressor When Given In-Context Examples Paper • 2404.07544 • Published Apr 11, 2024 • 20
ShortGPT: Layers in Large Language Models are More Redundant Than You Expect Paper • 2403.03853 • Published Mar 6, 2024 • 66
The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits Paper • 2402.17764 • Published Feb 27, 2024 • 626
User-LLM: Efficient LLM Contextualization with User Embeddings Paper • 2402.13598 • Published Feb 21, 2024 • 20
STEVE-1: A Generative Model for Text-to-Behavior in Minecraft Paper • 2306.00937 • Published Jun 1, 2023 • 9