---
title: "COMPASS-Inspired Semantic Sampling for Sudanese Arabic Dialect Understanding"
emoji: "🎯"
colorFrom: "blue"
colorTo: "indigo"
sdk: gradio
sdk_version: 4.36.0
app_file: app.py
pinned: false
license: mit
---

# COMPASS-Inspired Semantic Sampling Demo

This Space demonstrates adaptive semantic sampling for low-resource dialect understanding, inspired by the COMPASS paper (2604.20720). It clusters Sudanese Arabic text by semantic meaning and prioritizes under-represented clusters for more efficient learning.

## Hypothesis

Semantic clustering with distribution-aware sampling improves Sudanese Arabic dialect coverage compared to random sampling, requiring fewer examples to achieve equivalent semantic diversity.

## Method

1. Generate/embed sample Sudanese dialect sentences
2. Cluster by semantic similarity
3. Compare random vs COMPASS-style adaptive sampling
4. Measure cluster coverage and semantic diversity

## Paper Reference

COMPASS: COntinual Multilingual PEFT with Adaptive Semantic Sampling (arXiv:2604.20720)
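
## Illustrative Sketch

The comparison in method steps 3 and 4 can be sketched roughly as follows. This is a minimal illustration, not the code in `app.py` and not the procedure from the COMPASS paper. It assumes cluster labels have already been assigned (for example by clustering sentence embeddings), uses hypothetical imbalanced cluster sizes, and swaps in plain inverse-cluster-frequency weighting as a stand-in for adaptive sampling. All function names are illustrative. Diversity is approximated here by the Shannon entropy of the sampled cluster distribution.

```python
import math
import random
from collections import Counter


def make_labels(cluster_sizes):
    """Return one cluster label per item, e.g. [2, 1] -> [0, 0, 1]."""
    return [c for c, n in enumerate(cluster_sizes) for _ in range(n)]


def random_sample(labels, k, rng):
    """Baseline: uniform sampling of k item indices without replacement."""
    return rng.sample(range(len(labels)), k)


def adaptive_sample(labels, k, rng):
    """Assumed stand-in for adaptive sampling (not the paper's method).

    Each item gets weight 1 / (size of its cluster), so every cluster has
    the same total weight. Weighted sampling without replacement is done
    with exponential keys: draw E ~ Exp(1), use key E / weight, and keep
    the k smallest keys.
    """
    counts = Counter(labels)
    keys = []
    for i, c in enumerate(labels):
        weight = 1.0 / counts[c]
        keys.append((rng.expovariate(1.0) / weight, i))
    keys.sort()
    return [i for _, i in keys[:k]]


def coverage(labels, picked, n_clusters):
    """Fraction of clusters that appear at least once in the sample."""
    return len({labels[i] for i in picked}) / n_clusters


def entropy(labels, picked):
    """Shannon entropy (nats) of the cluster distribution in the sample."""
    counts = Counter(labels[i] for i in picked)
    total = len(picked)
    return -sum((n / total) * math.log(n / total) for n in counts.values())


def mean_metrics(sampler, labels, n_clusters, k, trials, seed):
    """Average coverage and entropy of a sampler over repeated trials."""
    rng = random.Random(seed)
    cov = ent = 0.0
    for _ in range(trials):
        picked = sampler(labels, k, rng)
        cov += coverage(labels, picked, n_clusters)
        ent += entropy(labels, picked)
    return cov / trials, ent / trials


# Hypothetical, heavily imbalanced cluster sizes (not real dialect data).
SIZES = [500, 300, 100, 50, 20, 10, 5, 5, 5, 5]
LABELS = make_labels(SIZES)

for name, sampler in [("random", random_sample), ("adaptive", adaptive_sample)]:
    cov, ent = mean_metrics(sampler, LABELS, len(SIZES), k=30, trials=50, seed=0)
    print(f"{name:>8}: coverage={cov:.2f}  entropy={ent:.2f}")
```

Because every cluster carries equal total weight under this scheme, small clusters are far more likely to appear in a fixed-size sample than under uniform sampling. A real adaptive scheme could use different weights, for instance ones that change as the sample grows.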