---
license: apache-2.0
language:
- en
base_model:
- google-t5/t5-base
pipeline_tag: summarization
---

**Model Name:** LoRA Fine-Tuned Model for Dialogue Summarization
**Model Type:** Seq2Seq with Low-Rank Adaptation (LoRA)
**Base Model:** `google-t5/t5-base`

## Model Details
- **Architecture**: T5-base
- **Fine-Tuning Technique**: LoRA (Low-Rank Adaptation), a Parameter-Efficient Fine-Tuning (PEFT) method
- **Data**: SAMSum dialogue summarization dataset (`samsum`)
- **Metrics**: Evaluated using ROUGE (ROUGE-1, ROUGE-2, ROUGE-L, ROUGE-Lsum); see the evaluation sketch below
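
The ROUGE metrics listed above can be computed with the Hugging Face `evaluate` library. The snippet below is a minimal sketch: it assumes `model` and `tokenizer` have been loaded as in the usage example under "Intended Use", and the 100-example sample is an arbitrary size for a quick check.

```python
import evaluate
from datasets import load_dataset

rouge = evaluate.load("rouge")
test_set = load_dataset("samsum", split="test")

def summarize(dialogue):
    # T5 is text-to-text, so the task is expressed as an input prefix.
    inputs = tokenizer("summarize: " + dialogue, return_tensors="pt",
                       truncation=True, max_length=512)
    ids = model.generate(**inputs, max_new_tokens=64, num_beams=4)
    return tokenizer.decode(ids[0], skip_special_tokens=True)

sample = test_set.select(range(100))
scores = rouge.compute(predictions=[summarize(r["dialogue"]) for r in sample],
                       references=[r["summary"] for r in sample])
print(scores)  # rouge1, rouge2, rougeL, rougeLsum F-measures
```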

## Intended Use
This model is designed for summarizing dialogues, such as conversations between individuals in a chat or messaging context (see the usage sketch below). It is suitable for applications in:
- **Customer Service**: Summarizing chat logs for quality monitoring or training.
- **Messaging Apps**: Generating conversation summaries for user convenience.
- **Content Creation**: Assisting writers by summarizing character dialogues.
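
Below is a minimal inference sketch. The adapter repository id `your-username/t5-base-samsum-lora` is a placeholder for this model's actual Hub id; substitute accordingly. The example dialogue is illustrative.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from peft import PeftModel

# Load the frozen base model, then attach the LoRA adapter weights.
base_model = AutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-base")
tokenizer = AutoTokenizer.from_pretrained("google-t5/t5-base")
model = PeftModel.from_pretrained(base_model, "your-username/t5-base-samsum-lora")  # placeholder id

dialogue = (
    "Amanda: I baked cookies. Do you want some?\n"
    "Jerry: Sure!\n"
    "Amanda: I'll bring you some tomorrow :-)"
)
# T5 is text-to-text, so the summarization task is given as a prefix.
inputs = tokenizer("summarize: " + dialogue, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```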

## Training Process
- **Optimizer**: AdamW with learning rate 3e-5
- **Batch Size**: 4, with gradient accumulation steps of 2 (effective batch size 8)
- **Training Epochs**: 2
- **Evaluation Metrics**: ROUGE-1, ROUGE-2, ROUGE-L, ROUGE-Lsum
- **Hardware**: A single GPU, using mixed precision to optimize performance

The model was trained using the `Seq2SeqTrainer` class from `transformers`, with LoRA adapters applied to selected attention layers to cut the number of trainable parameters without compromising accuracy.
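
For reference, the sketch below reproduces this setup. The stated optimizer settings, batch size, gradient accumulation, epochs, and mixed precision are wired in as given; the LoRA hyperparameters (`r=8`, `lora_alpha=32`, `lora_dropout=0.1`) and the `["q", "v"]` target modules are assumptions, since the card only says that selected attention layers were adapted.

```python
from datasets import load_dataset
from peft import LoraConfig, TaskType, get_peft_model
from transformers import (AutoModelForSeq2SeqLM, AutoTokenizer,
                          DataCollatorForSeq2Seq, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments)

model_name = "google-t5/t5-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Attach LoRA adapters to the attention query/value projections
# ("q"/"v" are T5's module names); rank, alpha, and dropout are assumed values.
lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=8, lora_alpha=32, lora_dropout=0.1,
    target_modules=["q", "v"],
)
model = get_peft_model(model, lora_config)

def preprocess(batch):
    # Prefix each dialogue for T5 and tokenize summaries as labels.
    inputs = tokenizer(["summarize: " + d for d in batch["dialogue"]],
                       max_length=512, truncation=True)
    labels = tokenizer(text_target=batch["summary"], max_length=128, truncation=True)
    inputs["labels"] = labels["input_ids"]
    return inputs

# Dataset id may vary with your `datasets` version/mirror.
dataset = load_dataset("samsum").map(preprocess, batched=True,
                                     remove_columns=["id", "dialogue", "summary"])

args = Seq2SeqTrainingArguments(
    output_dir="t5-base-samsum-lora",
    learning_rate=3e-5,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=2,
    num_train_epochs=2,
    fp16=True,  # mixed precision on a single GPU
    predict_with_generate=True,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```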