Several models trained to compare the differences between methods. Each model comes with a complete description of its hyperparameters and wandb reports.
G-reen
AI & ML interests
SFT, DPO, ORPO, LLMs, text-generation
Recent Activity
Updated a model about 16 hours ago: G-reen/gemma-2-2b-alpaca-dpo-finetuned
Published a model about 16 hours ago: G-reen/gemma-2-2b-alpaca-dpo-finetuned
Updated a dataset 1 day ago: G-reen/sumthink_fixed_cleaned
Organizations
None yet