Tatbooy committed on
Commit dc54a7b · verified · 1 Parent(s): 0a33a93

Tatbooy/llm-course-hw2-reward-model

Files changed (3)
  1. README.md +2 -1
  2. model.safetensors +1 -1
  3. training_args.bin +1 -1
README.md CHANGED
@@ -1,5 +1,6 @@
  ---
  base_model: HuggingFaceTB/SmolLM-135M-Instruct
+ datasets: HumanLLMs/Human-Like-DPO-Dataset
  library_name: transformers
  model_name: trainer_output
  tags:
@@ -11,7 +12,7 @@ licence: license
 
  # Model Card for trainer_output
 
- This model is a fine-tuned version of [HuggingFaceTB/SmolLM-135M-Instruct](https://huggingface.co/HuggingFaceTB/SmolLM-135M-Instruct).
+ This model is a fine-tuned version of [HuggingFaceTB/SmolLM-135M-Instruct](https://huggingface.co/HuggingFaceTB/SmolLM-135M-Instruct) on the [HumanLLMs/Human-Like-DPO-Dataset](https://huggingface.co/datasets/HumanLLMs/Human-Like-DPO-Dataset) dataset.
  It has been trained using [TRL](https://github.com/huggingface/trl).
 
  ## Quick start
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:fab62be6f370f941d35c265fc261b8cb2d3a5efae857087f9ab9480edf21d847
+ oid sha256:ff9977982cead3f03e3ae5a7393c307b69623f93e282dc7b9b48215baa26b8b5
  size 269061784
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:088acdd8070837a24f1e3bf82b2c67f0789629bef4d38e1222f8a259f05ba631
+ oid sha256:6251355e49b3905f8376993707034959b89dc4c566d5ede9a82c3f5325ecd581
  size 5432
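
The model.safetensors and training_args.bin entries above are Git LFS pointer files, not the binaries themselves: each holds a `version` line, an `oid` carrying a SHA-256 digest, and a `size` in bytes. As a minimal sketch, the new model.safetensors pointer from this commit could be read as follows. The `parse_lfs_pointer` helper is hypothetical and written only for illustration; it is not part of Git LFS or any Hugging Face library.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the key/value lines of a Git LFS pointer file.

    Hypothetical helper for illustration only.
    """
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line is "<key> <value>", separated by the first space.
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    # The oid has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# New pointer for model.safetensors, copied from the diff in this commit.
NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:ff9977982cead3f03e3ae5a7393c307b69623f93e282dc7b9b48215baa26b8b5
size 269061784
"""

pointer = parse_lfs_pointer(NEW_POINTER)
print(pointer["hash_algo"], pointer["size"])
```

In both binary files the `oid` changes while the `size` stays the same, which is consistent with the file contents being rewritten without a change in byte length.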