Tatbooy/llm-course-hw2-reward-model

Text Classification

Generated from Trainer

text-generation-inference

Tatbooy committed on Mar 28

Commit dc54a7b · verified · 1 Parent(s): 0a33a93

Files changed (3)

README.md +2 -1
model.safetensors +1 -1
training_args.bin +1 -1

README.md CHANGED

@@ -1,5 +1,6 @@
 ---
 base_model: HuggingFaceTB/SmolLM-135M-Instruct
+datasets: HumanLLMs/Human-Like-DPO-Dataset
 library_name: transformers
 model_name: trainer_output
 tags:
@@ -11,7 +12,7 @@ licence: license
 # Model Card for trainer_output
-This model is a fine-tuned version of [HuggingFaceTB/SmolLM-135M-Instruct](https://huggingface.co/HuggingFaceTB/SmolLM-135M-Instruct).
+This model is a fine-tuned version of [HuggingFaceTB/SmolLM-135M-Instruct](https://huggingface.co/HuggingFaceTB/SmolLM-135M-Instruct) on the [HumanLLMs/Human-Like-DPO-Dataset](https://huggingface.co/datasets/HumanLLMs/Human-Like-DPO-Dataset) dataset.
 It has been trained using [TRL](https://github.com/huggingface/trl).
 ## Quick start

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:fab62be6f370f941d35c265fc261b8cb2d3a5efae857087f9ab9480edf21d847
+oid sha256:ff9977982cead3f03e3ae5a7393c307b69623f93e282dc7b9b48215baa26b8b5
 size 269061784

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:088acdd8070837a24f1e3bf82b2c67f0789629bef4d38e1222f8a259f05ba631
+oid sha256:6251355e49b3905f8376993707034959b89dc4c566d5ede9a82c3f5325ecd581
 size 5432
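
The two binary files above are stored as Git LFS pointers, so the diff shows only a changed `oid sha256:` digest with an unchanged `size`. As a rough sketch (not part of this commit), a downloaded file could be checked against its pointer as follows. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, introduced here for illustration only, and the sketch assumes the three-field pointer layout shown in the diff.

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file (version / oid / size lines) into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line is "<key> <value>", separated by a single space.
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    # The oid value has the form "<algorithm>:<hex digest>", e.g. "sha256:6251...".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: str, pointer: dict) -> bool:
    """Return True if the file at `path` has the size and digest the pointer records."""
    hasher = hashlib.new(pointer["algo"])
    size = 0
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so large weight files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


# New-side pointer for training_args.bin, copied from the diff above.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:6251355e49b3905f8376993707034959b89dc4c566d5ede9a82c3f5325ecd581
size 5432
"""

pointer = parse_lfs_pointer(POINTER_TEXT)
# Assumed usage: matches_pointer("training_args.bin", pointer) after downloading the file.
```

Because `size` is identical on both sides of each hunk, only the digest comparison distinguishes the old file from the new one.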