# KULLM-R
Introducing KULLM-R, a large language model specialized for high-level reasoning queries in Korean, with a particular focus on complex mathematical problems. The model is designed to provide both correct reasoning paths and answers for such queries, offering better reasoning efficiency and language transfer to Korean than general-purpose reasoning models. A reinforcement learning strategy is employed for efficient exploration of reasoning paths and for Korean-specific generation.
## Model Details

KULLM-R is distinguished from standard reasoning LLMs based on Qwen3-8B by the following features:

- **Reasoning Efficiency Aware Reinforcement Learning**: Introduces RL techniques considering both reasoning path efficiency and answer correctness, reducing unnecessary steps while maintaining answer quality.
- **Reasoning Path Pruning**: Specialized for high-difficulty reasoning problems by pruning ineffective paths and emphasizing transparency and readability in generated answers.
- **Support for High Readability in Korean**: Enhances both logical reasoning and natural Korean expression in generated answers.
- **Adaptive Length Penalty**: Adaptive penalties optimize the reasoning process according to the question's complexity and difficulty, ensuring efficient solutions for various math problems.
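
This model card does not spell out the exact reward formulation, but the idea behind an efficiency-aware reward with an adaptive length penalty can be sketched as follows. All function names, constants, and the reward shape here are purely illustrative assumptions, not the actual training code:

```python
def adaptive_length_budget(difficulty: float, base_tokens: int = 1024, scale: float = 2.0) -> float:
    """Token budget that grows with problem difficulty (difficulty in [0, 1])."""
    return base_tokens * (1.0 + scale * difficulty)

def efficiency_aware_reward(correct: bool, reasoning_tokens: int, difficulty: float) -> float:
    """Toy reward: answer correctness minus a penalty for exceeding the adaptive budget."""
    budget = adaptive_length_budget(difficulty)
    overshoot = max(0.0, reasoning_tokens - budget) / budget
    penalty = min(1.0, overshoot)  # cap the penalty so it never dominates correctness
    return (1.0 if correct else 0.0) - 0.5 * penalty
```

Under a scheme like this, a concise correct solution to an easy problem scores higher than the same answer padded with unnecessary reasoning steps, while hard problems get a larger budget before any penalty applies.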
## Data & Training Process
> [!NOTE]
> As mentioned in Qwen3, use `Temperature=0.6`, `TopP=0.95`, `TopK=20`, and `MinP=0` (the default setting in `generation_config.json`). **DO NOT use greedy decoding**, as it can lead to performance degradation and endless repetitions.
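
To make these settings concrete, here is a toy, self-contained illustration of how temperature, top-k, top-p, and min-p jointly restrict which tokens remain sampleable. This is an explanatory sketch, not the model's or any library's actual implementation:

```python
import math

def sampleable_tokens(logits, temperature=0.6, top_k=20, top_p=0.95, min_p=0.0):
    """Return the token indices that survive temperature + top-k + top-p + min-p filtering."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]          # numerically stable softmax
    total = sum(exps)
    probs = [e / total for e in exps]

    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    p_max = probs[order[0]]

    kept, cumulative = [], 0.0
    for rank, i in enumerate(order):
        if rank >= top_k:                 # top-k: at most k candidates
            break
        if probs[i] < min_p * p_max:      # min-p: relative probability floor
            break
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:           # top-p: smallest nucleus covering top_p mass
            break
    return kept
```

With a sharply peaked distribution the nucleus can shrink to a single token; always taking that single most likely token is exactly the greedy decoding the note above warns against.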
## Evaluation
## Citation
```
@misc{KULLM-R2025,
title = {KULLM-R: Korea University Large Language Model for Reasoning},
```