qanthony committed
Commit 9384d6b · verified · 1 Parent(s): 9a6b5e0

add blog links

Files changed (1): README.md +5 -1
README.md CHANGED
@@ -2,7 +2,11 @@
 license: apache-2.0
 ---
 
-[GPT-NeoX](https://github.com/EleutherAI/gpt-neox) now supports preference learning (SFT, DPO, KTO)!
+[GPT-NeoX](https://github.com/EleutherAI/gpt-neox) now supports preference learning (SFT, DPO, KTO)! For more information on this joint effort between EleutherAI and SynthLabs, view our associated blog posts:
+
+SynthLabs: https://www.synthlabs.ai/blog/rlhf-and-rlaif-in-gpt-neox
+
+EleutherAI: https://www.eleuther.ai/rlhf-and-rlaif-in-gpt-neox
 
 This is a direct preference optimization (DPO) model produced by:
 1. Taking the ultrachat SFT checkpoint from https://huggingface.co/HuggingFaceH4/mistral-7b-sft-beta
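The README describes a DPO model trained on top of an SFT checkpoint. As a rough illustration of what the DPO objective computes, here is a minimal sketch of the per-pair loss: `-log σ(β·[(log π(y_w|x) − log π_ref(y_w|x)) − (log π(y_l|x) − log π_ref(y_l|x))])`. This is a generic sketch of the standard DPO formula, not code from the GPT-NeoX repository; the function name and example log-probabilities are hypothetical.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair (hypothetical helper, not from GPT-NeoX).

    Inputs are summed log-probabilities of the chosen/rejected responses
    under the policy being trained and under the frozen reference model.
    """
    chosen_ratio = policy_chosen_logp - ref_chosen_logp      # log π/π_ref on the preferred response
    rejected_ratio = policy_rejected_logp - ref_rejected_logp  # same on the dispreferred one
    margin = beta * (chosen_ratio - rejected_ratio)
    # -log(sigmoid(margin)) written as softplus(-margin); guard against overflow
    return math.log1p(math.exp(-margin)) if margin > -30 else -margin

# A positive margin (policy prefers the chosen response more strongly than
# the reference does) yields a small loss; log-prob values are made up.
loss = dpo_loss(-10.0, -14.0, -11.0, -12.0, beta=0.1)
```

Minimizing this loss pushes the policy to widen the log-probability gap between preferred and dispreferred responses relative to the reference model, which is the training step applied to the SFT checkpoint above.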