metadata
title: COMPASS-Inspired Semantic Sampling for Sudanese Arabic Dialect Understanding
emoji: 🎯
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 4.36.0
app_file: app.py
pinned: false
license: mit
COMPASS-Inspired Semantic Sampling Demo
This Space demonstrates adaptive semantic sampling for low-resource dialect understanding, inspired by the COMPASS paper (arXiv:2604.20720). It clusters Sudanese Arabic text by semantic similarity, then samples preferentially from under-represented clusters so that a small sample still spans the range of meanings in the data.
Hypothesis
Combining semantic clustering with distribution-aware sampling gives better coverage of the Sudanese Arabic dialect than random sampling, and reaches equivalent semantic diversity with fewer examples.
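One way to make "coverage" and "semantic diversity" measurable is sketched below. The two metrics (fraction of clusters hit, and normalized Shannon entropy over cluster labels) are assumptions chosen for illustration; the function names and the toy labels are hypothetical and are not taken from this Space's code.

```python
import math
from collections import Counter

def cluster_coverage(sampled_labels, n_clusters):
    """Fraction of the n_clusters that appear at least once in the sample."""
    return len(set(sampled_labels)) / n_clusters

def normalized_entropy(sampled_labels, n_clusters):
    """Shannon entropy of the sample's cluster distribution, scaled to [0, 1]."""
    if n_clusters < 2 or not sampled_labels:
        return 0.0
    counts = Counter(sampled_labels)
    total = len(sampled_labels)
    h = -sum((c / total) * math.log(c / total) for c in counts.values())
    return h / math.log(n_clusters)

# Toy cluster labels (hypothetical, not data from this Space).
skewed = [0, 0, 0, 0, 0, 0, 1, 1]    # only 2 of 4 clusters appear
balanced = [0, 0, 1, 1, 2, 2, 3, 3]  # all 4 clusters appear equally often

print(cluster_coverage(skewed, 4), normalized_entropy(skewed, 4))
print(cluster_coverage(balanced, 4), normalized_entropy(balanced, 4))
```

Under these definitions, the hypothesis predicts that an adaptive sample scores higher on both metrics than a random sample of the same size.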
Method
- Generate sample Sudanese dialect sentences and embed them
- Cluster by semantic similarity
- Compare random vs COMPASS-style adaptive sampling
- Measure cluster coverage and semantic diversity
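The steps above can be sketched end to end with the standard library alone. This is a minimal illustration under stated assumptions, not the Space's actual `app.py`: random 2-D points stand in for sentence embeddings, a plain k-means replaces whatever clustering the Space uses, and inverse-cluster-frequency weighting stands in for COMPASS-style adaptive sampling. All function names, group sizes, and the budget are hypothetical.

```python
import random
from collections import Counter

rng = random.Random(0)

# Step 1: toy 2-D points stand in for sentence embeddings (hypothetical data).
# Four semantic groups with deliberately imbalanced sizes.
centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
group_sizes = [200, 30, 10, 5]
points = [(cx + rng.gauss(0, 1), cy + rng.gauss(0, 1))
          for (cx, cy), n in zip(centers, group_sizes) for _ in range(n)]

def nearest(p, centroids):
    """Index of the centroid closest to point p (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda k: (p[0] - centroids[k][0]) ** 2 + (p[1] - centroids[k][1]) ** 2)

def kmeans(points, init_centroids, n_iter=10):
    """Plain k-means; returns one cluster label per point."""
    centroids = list(init_centroids)
    labels = [nearest(p, centroids) for p in points]
    for _ in range(n_iter):
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = (sum(m[0] for m in members) / len(members),
                                sum(m[1] for m in members) / len(members))
        labels = [nearest(p, centroids) for p in points]
    return labels

# Step 2: cluster. Centroids start at the known group centers to keep the toy run stable.
labels = kmeans(points, centers)
cluster_sizes = Counter(labels)

def random_sample(n_items, budget):
    """Baseline: uniform sampling without replacement."""
    return rng.sample(range(n_items), budget)

def adaptive_sample(labels, budget):
    """Inverse-frequency sampling without replacement: rarer clusters get larger weights."""
    size_of = Counter(labels)
    remaining = list(range(len(labels)))
    chosen = []
    for _ in range(budget):
        weights = [1.0 / size_of[labels[i]] for i in remaining]
        pick = rng.choices(remaining, weights=weights, k=1)[0]
        remaining.remove(pick)
        chosen.append(pick)
    return chosen

# Steps 3 and 4: draw both samples with the same budget and compare cluster coverage.
budget = 40
random_idx = random_sample(len(points), budget)
adaptive_idx = adaptive_sample(labels, budget)

for name, idx in [("random", random_idx), ("adaptive", adaptive_idx)]:
    per_cluster = Counter(labels[i] for i in idx)
    print(name, "clusters covered:", len(per_cluster), "counts:", dict(per_cluster))
```

Because every cluster starts with the same total weight, the adaptive sampler draws from small clusters about as often as from large ones, while uniform sampling mostly returns items from the largest cluster. Swapping the toy points for real sentence embeddings would leave the sampling and measurement steps unchanged.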
Paper Reference
COMPASS: COntinual Multilingual PEFT with Adaptive Semantic Sampling (arXiv:2604.20720)