O96a (Aamer Mihaysi)

metadata
title: COMPASS-Inspired Semantic Sampling for Sudanese Arabic Dialect Understanding
emoji: 🎯
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 4.36.0
app_file: app.py
pinned: false
license: mit

COMPASS-Inspired Semantic Sampling Demo

This Space demonstrates adaptive semantic sampling for low-resource dialect understanding, inspired by the COMPASS paper (arXiv:2604.20720). It clusters Sudanese Arabic text by semantic similarity and prioritizes under-represented clusters when sampling, with the aim of more efficient learning.

Hypothesis

Semantic clustering with distribution-aware sampling improves Sudanese Arabic dialect coverage compared to random sampling, requiring fewer examples to achieve equivalent semantic diversity.

Method

  1. Generate sample Sudanese dialect sentences and embed them
  2. Cluster by semantic similarity
  3. Compare random vs COMPASS-style adaptive sampling
  4. Measure cluster coverage and semantic diversity
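
The comparison in steps 3 and 4 can be sketched as follows. This is a minimal illustration, not the code in app.py and not the sampling rule from the COMPASS paper. It assumes cluster labels are already available from step 2 and uses inverse-cluster-size weighting as a stand-in for the adaptive strategy. The function names, the toy label distribution, and the two metrics (fraction of clusters hit, Shannon entropy of sampled cluster labels) are all hypothetical choices made for this example.

```python
import math
import random
from collections import Counter


def random_sample(labels, k, rng):
    """Baseline: draw k item indices uniformly, without replacement."""
    return rng.sample(range(len(labels)), k)


def adaptive_sample(labels, k, rng):
    """Draw k indices without replacement, weighting each item by the
    inverse of its cluster's original size (an assumed weighting rule),
    so items from small clusters are more likely to be picked."""
    sizes = Counter(labels)
    remaining = list(range(len(labels)))
    weights = [1.0 / sizes[labels[i]] for i in remaining]
    chosen = []
    for _ in range(k):
        pos = rng.choices(range(len(remaining)), weights=weights, k=1)[0]
        chosen.append(remaining.pop(pos))
        weights.pop(pos)
    return chosen


def coverage(labels, sample):
    """Fraction of all clusters that appear at least once in the sample."""
    return len({labels[i] for i in sample}) / len(set(labels))


def entropy(labels, sample):
    """Shannon entropy (nats) of the cluster labels in the sample."""
    counts = Counter(labels[i] for i in sample)
    n = len(sample)
    return -sum(c / n * math.log(c / n) for c in counts.values())


# Hypothetical, deliberately imbalanced cluster assignments for 100 sentences.
labels = [0] * 80 + [1] * 12 + [2] * 5 + [3] * 3

rng = random.Random(0)
trials, k = 200, 8
cov_random, cov_adaptive, ent_random, ent_adaptive = [], [], [], []
for _ in range(trials):
    r = random_sample(labels, k, rng)
    a = adaptive_sample(labels, k, rng)
    cov_random.append(coverage(labels, r))
    cov_adaptive.append(coverage(labels, a))
    ent_random.append(entropy(labels, r))
    ent_adaptive.append(entropy(labels, a))

mean_cov_random = sum(cov_random) / trials
mean_cov_adaptive = sum(cov_adaptive) / trials
mean_ent_random = sum(ent_random) / trials
mean_ent_adaptive = sum(ent_adaptive) / trials

print(f"random:   coverage={mean_cov_random:.2f} entropy={mean_ent_random:.2f}")
print(f"adaptive: coverage={mean_cov_adaptive:.2f} entropy={mean_ent_adaptive:.2f}")
```

On a label distribution this skewed, the weighted sampler should reach the small clusters far more often than the uniform baseline at the same sample size, which is the effect the hypothesis predicts. Real embeddings and a real clustering step would replace the hard-coded `labels` list.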

Paper Reference

COMPASS: COntinual Multilingual PEFT with Adaptive Semantic Sampling (arXiv:2604.20720)