When Reasoning Beats Scale: A 1.5B Reasoning Model Outranks 13B LLMs as Discriminator • Paper 2505.03786 • Published Apr 30, 2025
Polish Linguistic and Cultural Competency Benchmark • Display evaluation results in a leaderboard
Open LLM Leaderboard • Track, rank and evaluate open LLMs and chatbots