Abhranil Chandra PRO
abhranil14
AI & ML interests
Reinforcement Learning, Deep Unsupervised Learning, NLP and Bayesian Deep Learning
Recent Activity
updated a model 5 days ago
abhranil14/L8B_on_MBPP_Code_G27B_IT_H_Paraphrased_subset_W_354_BS_64_lr_2e-5_epoch10_linear_schedule
published a model 5 days ago
abhranil14/L8B_on_MBPP_Code_G27B_IT_H_Paraphrased_subset_W_354_BS_64_lr_2e-5_epoch10_linear_schedule
updated a model 13 days ago
abhranil14/G2B_on_CODE_MBPP_G_601_subset_wrt_G_601_BS_64_lr_2e-5_epoch10_linear_schedule
Organizations
FM4 EmbodiedAI/Robotics/DecisionMaking
Foundation Models Empirical Analysis
- An Empirical Study of Autoregressive Pre-training from Videos
  Paper • 2501.05453 • Published • 41
- LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs
  Paper • 2501.06186 • Published • 65
- Multiagent Finetuning: Self Improvement with Diverse Reasoning Chains
  Paper • 2501.05707 • Published • 20
- The Lottery LLM Hypothesis, Rethinking What Abilities Should LLM Compression Preserve?
  Paper • 2502.17535 • Published • 8
RL
RL/FM/Agent Data/Benchmark
FM_Training_Infra
Survey LLM/VLM/MLM
Reasoning/System2
- Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking
  Paper • 2403.09629 • Published • 78
- Agent Q: Advanced Reasoning and Learning for Autonomous AI Agents
  Paper • 2408.07199 • Published • 22
- Let's Verify Step by Step
  Paper • 2305.20050 • Published • 11
- V-STaR: Training Verifiers for Self-Taught Reasoners
  Paper • 2402.06457 • Published • 9
Augmenting Pretrained FMs with Post-Training/RL
- AlphaMaze: Enhancing Large Language Models' Spatial Intelligence via GRPO
  Paper • 2502.14669 • Published • 15
- R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning
  Paper • 2503.05592 • Published • 27
- Offline Reinforcement Learning for LLM Multi-Step Reasoning
  Paper • 2412.16145 • Published • 38
- OpenVLThinker: An Early Exploration to Complex Vision-Language Reasoning via Iterative Self-Improvement
  Paper • 2503.17352 • Published • 24