---
license: cc-by-4.0
language: ti
widget:
- text: "<text-to-classify>"
datasets:
- fgaim/tigrinya-abusive-language-detection
metrics:
- accuracy
- f1
- precision
- recall
model-index:
- name: tiroberta-tiald-all-tasks
  results:
  - task:
      name: Text Classification
      type: text-classification
    metrics:
    - name: Abu Accuracy
      type: accuracy
      value: 0.8611111111111112
    - name: F1
      type: f1
      value: 0.8611109396431353
    - name: Precision
      type: precision
      value: 0.8611128943846637
    - name: Recall
      type: recall
      value: 0.8611111111111112
---
# TiRoBERTa Fine-tuned for Multi-task Abusiveness, Sentiment, and Topic Classification
This model is a fine-tuned version of [TiRoBERTa](https://huggingface.co/fgaim/tiroberta-base) on the [TiALD](https://huggingface.co/datasets/fgaim/tigrinya-abusive-language-detection) dataset.
**Tigrinya Abusive Language Detection (TiALD) Dataset** is a large-scale, multi-task benchmark dataset for abusive language detection in the Tigrinya language. It consists of **13,717 YouTube comments** annotated for **abusiveness**, **sentiment**, and **topic** tasks. The dataset includes comments written in both the **Ge’ez script** and prevalent non-standard Latin **transliterations** to mirror real-world usage.
> ⚠️ The dataset contains explicit, obscene, and potentially hateful language. It should be used for research purposes only. ⚠️
This work accompanies the paper ["A Multi-Task Benchmark for Abusive Language Detection in Low-Resource Settings"](https://arxiv.org/abs/2505.12116).
## Model Usage
```python
from transformers import pipeline
tiald_multitask = pipeline("text-classification", model="fgaim/tiroberta-tiald-all-tasks", top_k=11)
tiald_multitask("<text-to-classify>")
```
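Since `top_k=11` returns a score for every label across the three tasks, the raw output needs to be grouped per task. The sketch below shows one way to do that; the label names used here are hypothetical placeholders (inspect `model.config.id2label` or the pipeline output itself for the actual names before relying on them):

```python
# Hedged post-processing sketch: with top_k=11 the pipeline returns a score for
# every label across the three tasks (abusiveness, sentiment, topic).
# NOTE: the label names below are HYPOTHETICAL placeholders, not the model's
# real label set; check model.config.id2label for the actual names.
def best_per_task(scores):
    """Keep the highest-scoring label for each task, grouping by label prefix."""
    best = {}
    for item in scores:
        task = item["label"].split("_", 1)[0]  # e.g. "abu_abusive" -> "abu"
        if task not in best or item["score"] > best[task]["score"]:
            best[task] = item
    return best

# Simulated pipeline output; scores and names are made up for illustration.
example = [
    {"label": "abu_abusive", "score": 0.91},
    {"label": "abu_not-abusive", "score": 0.09},
    {"label": "sent_negative", "score": 0.70},
    {"label": "sent_positive", "score": 0.12},
    {"label": "topic_political", "score": 0.55},
    {"label": "topic_other", "score": 0.20},
]
per_task = best_per_task(example)
```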
### Performance Metrics
This model achieves the following results on the TiALD test set:
```json
{
  "abusiveness_metrics": {
    "accuracy": 0.8611111111111112,
    "macro_f1": 0.8611109396431353,
    "macro_recall": 0.8611111111111112,
    "macro_precision": 0.8611128943846637,
    "weighted_f1": 0.8611109396431355,
    "weighted_recall": 0.8611111111111112,
    "weighted_precision": 0.8611128943846637
  },
  "topic_metrics": {
    "accuracy": 0.6155555555555555,
    "macro_f1": 0.5491185274678864,
    "macro_recall": 0.5143416011263588,
    "macro_precision": 0.7341640739780486,
    "weighted_f1": 0.5944096153417657,
    "weighted_recall": 0.6155555555555555,
    "weighted_precision": 0.6870800624645906
  },
  "sentiment_metrics": {
    "accuracy": 0.6533333333333333,
    "macro_f1": 0.5340845253007789,
    "macro_recall": 0.5410170159158625,
    "macro_precision": 0.534652401599494,
    "weighted_f1": 0.6620101614004723,
    "weighted_recall": 0.6533333333333333,
    "weighted_precision": 0.6750245466592532
  }
}
```
## Training Hyperparameters
The following hyperparameters were used during training:
- learning_rate: 3e-05
- train_batch_size: 8
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 7.0
- seed: 42
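As a minimal sketch, the hyperparameters above can be expressed as keyword arguments for Hugging Face's `TrainingArguments` (this assumes the standard `Trainer` API; the original training script is not part of this card, so treat it as an approximation rather than the authors' exact setup):

```python
# Hedged sketch: mapping the listed hyperparameters onto TrainingArguments
# keyword arguments. The exact training script is not included in this card.
hyperparams = {
    "learning_rate": 3e-5,             # learning_rate
    "per_device_train_batch_size": 8,  # train_batch_size
    "adam_beta1": 0.9,                 # Adam betas
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 7.0,
    "seed": 42,
}

# With transformers installed, these plug into the Trainer API, e.g.:
# from transformers import TrainingArguments
# args = TrainingArguments(output_dir="tiroberta-tiald", **hyperparams)
```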
## Intended Usage
The TiALD dataset and associated models are designed to support:
- Research in abusive language detection in low-resource languages
- Context-aware abuse, sentiment, and topic modeling
- Multi-task and transfer learning with digraphic scripts
- Evaluation of multilingual and fine-tuned language models
Researchers and developers should avoid using this dataset for direct moderation or enforcement tasks without human oversight.
## Ethical Considerations
- **Sensitive content**: Contains toxic and offensive language. Use for research purposes only.
- **Cultural sensitivity**: Abuse is context-dependent; annotations were made by native speakers to account for cultural nuance.
- **Bias mitigation**: Data sampling and annotation were carefully designed to minimize reinforcement of stereotypes.
- **Privacy**: All the source content for the dataset is publicly available on YouTube.
- **Respect for expression**: The dataset should not be used for automated censorship without human review.
This research received IRB approval (Ref: KH2022-133) and followed ethical data collection and annotation practices, including informed consent of annotators.
## Citation
If you use this model or the `TiALD` dataset in your work, please cite:
```bibtex
@misc{gaim-etal-2025-tiald-benchmark,
title = {A Multi-Task Benchmark for Abusive Language Detection in Low-Resource Settings},
author = {Fitsum Gaim and Hoyun Song and Huije Lee and Changgeon Ko and Eui Jun Hwang and Jong C. Park},
year = {2025},
eprint = {2505.12116},
archiveprefix = {arXiv},
primaryclass = {cs.CL},
url = {https://arxiv.org/abs/2505.12116}
}
```
## License
This model and the accompanying TiALD dataset are released under the [Creative Commons Attribution 4.0 International License (CC BY 4.0)](https://creativecommons.org/licenses/by/4.0/).