Unofficial LLaDA2 Evaluation based on lm-eval

#4
by Lucasoppem - opened

Hi everyone, I'm also doing research on diffusion LLMs (dLLMs).

Here is an unofficial LLaDA2 evaluation based on lm-eval, which has been tested on an A100. I hope it is helpful.
The code is open source at https://github.com/preordinary/LLaDA2.

Issues and discoveries:

  • Parameter changes: The definition of the steps parameter in version 2.0 differs from version 1.0: it now refers to the number of steps within a block. Keep this in mind when reproducing results.
  • Length sensitivity: I tested generation lengths of 256, 512, and 1024 and found that accuracy dropped significantly at 256 (HumanEval scored only 5.5). I suspect this is because the chain of thought in version 2.0 is longer, so a short window easily truncates the answer.
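To make the first point concrete, here is a rough sketch of how the same steps value can imply a very different amount of denoising work depending on how it is interpreted. It is only an illustration, not code from the repository: the function name, the argument names, and the example numbers (1024 tokens, block length 32, steps 32) are all hypothetical, and the "total steps" reading of version 1.0 is my assumption about the older convention.

```python
def total_denoising_steps(gen_length: int, block_length: int, steps: int,
                          steps_are_per_block: bool) -> int:
    """Return the total number of denoising steps for one generation.

    Hypothetical helper for illustration only.
    steps_are_per_block=True  -> `steps` is applied inside every block
                                 (the version 2.0 reading described above).
    steps_are_per_block=False -> `steps` is the budget for the whole
                                 generation (assumed version 1.0 reading).
    """
    if block_length <= 0 or gen_length % block_length != 0:
        raise ValueError("gen_length must be a positive multiple of block_length")
    num_blocks = gen_length // block_length
    if steps_are_per_block:
        return steps * num_blocks
    return steps


# Hypothetical settings: 1024 generated tokens, blocks of 32 tokens, steps=32.
per_block = total_denoising_steps(1024, 32, 32, steps_are_per_block=True)
whole_run = total_denoising_steps(1024, 32, 32, steps_are_per_block=False)
print(per_block, whole_run)
```

Under these assumed numbers the two readings differ by a factor equal to the number of blocks, so a config copied from a 1.0 setup may need its steps value adjusted before comparing scores.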

The code is relatively simple. Everyone is welcome to try it out, open issues or pull requests, and share it! If you find it useful, please give it a star ⭐️! Thank you!

inclusionAI org

Thanks so much for this detailed evaluation and for sharing your findings!

Dear Official Author,

Could this GitHub link be added to the README? I hope it will help more people find it.

If there's anything I can help you with (such as merging it to your repository), I will do my best to assist~ Thank you for your time!

inclusionAI org

Yes, any pull request is welcome~ thank you for contributing.

Dear Official Author,

Could you add me as a member of inclusionAI so that I can transfer this repository to the organization?

If there is another way to make a pull request, please let me know. I am happy to submit this repository as a PR. Thank you for your patience!
