Update README.md
README.md CHANGED
@@ -90,6 +90,12 @@ you built your model off of, and at least two other comparison models of similar
In a text paragraph, as you did in your second project check in, describe the benchmark evaluation tasks you chose and why you chose them. Next, briefly state why you
chose each comparison model. Last, include a summary sentence(s) describing the performance of your model relative to the comparison models you chose.

+| Model | HumanEval | SQuADv2 | E2E NLG Challenge | Testing Split of Training Dataset |
+|-------|-----------|---------|-------------------|-----------------------------------|
+| Base Model: Qwen/Qwen2.5-7B-Instruct | 0.652 | 9.81 | 6.68 | BERTScore mean precision: 0.829, mean recall: 0.852, mean F1: 0.841 |
+| My Model: Qwen/Qwen2.5-7B-Instruct Trained and Finetuned | 0.598 | 21.57 | 5.04 | BERTScore mean precision: 0.813, mean recall: 0.848, mean F1: 0.830 |
+| Similar Size Model: meta-llama/Meta-Llama-3-8B-Instruct | 0.280 | 20.33 | 2.26 | BERTScore mean precision: 0.814, mean recall: 0.848, mean F1: 0.830 |
+| Similar Size Model: deepseek-ai/DeepSeek-R1-0528-Qwen3-8B | 0.634 | 5.81 | 3.63 | BERTScore mean precision: 0.803, mean recall: 0.831, mean F1: 0.817 |

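The BERTScore figures in the last column can be computed with the Hugging Face `evaluate` library. The snippet below is a minimal sketch of that kind of scoring loop; the `predictions` and `references` lists are placeholders for the held-out test-split generations and gold answers, and the actual evaluation script used for this table may differ.

```python
# Sketch: score held-out test-split generations against references with BERTScore,
# then report the mean precision / recall / F1 as in the table above.
import evaluate

# Hypothetical model outputs and gold references from the held-out test split.
predictions = ["The model predicts X.", "Answer B."]
references = ["The model should predict X.", "Answer B is correct."]

bertscore = evaluate.load("bertscore")
scores = bertscore.compute(predictions=predictions, references=references, lang="en")

# Each field is a list with one score per example; average them for the summary row.
for key in ("precision", "recall", "f1"):
    mean = sum(scores[key]) / len(scores[key])
    print(f"BERTScore mean {key}: {mean:.3f}")
```
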
# Usage and Intended Use