Spaces: RobotsMali/RobotsMali_ASR_DEMO (Runtime error)
AI Assistant committed 21 days ago

Commit a3e5374 (1 parent: 6ed534f)

Fix: revert to robust Docker SDK, add agents.md, and finalize app.py logic

Files changed (3)

Dockerfile ADDED

+FROM python:3.10
+ENV PYTHONDONTWRITEBYTECODE=1
+ENV PYTHONUNBUFFERED=1
+RUN useradd -m -u 1000 user
+WORKDIR /home/user/app
+# Install system dependencies
+RUN apt-get update && apt-get install -y \
+    ffmpeg \
+    libsndfile1 \
+    cmake \
+    g++ \
+    git \
+    build-essential \
+    && rm -rf /var/lib/apt/lists/*
+# Install build tools manually to ensure they are available for youtokentome
+RUN pip install --no-cache-dir --upgrade pip && \
+    pip install --no-cache-dir \
+        Cython \
+        packaging \
+        setuptools \
+        wheel
+# Install youtokentome without isolation for NeMo compatibility
+RUN pip install --no-cache-dir --no-build-isolation youtokentome
+# Copy requirements and install
+COPY --chown=user requirements.txt .
+RUN pip install --no-cache-dir -r requirements.txt
+# Copy everything else
+COPY --chown=user . .
+# Permissions
+RUN chmod 777 /home/user/app
+USER user
+EXPOSE 7860
+CMD ["python", "app.py"]
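Under the Docker SDK, `app.py` has to start its own server on the port the Dockerfile exposes (7860, which also matches `app_port` in README.md). This commit does not show `app.py`, so the following is only a minimal sketch of how that launch might look, assuming the app is built with Gradio. The names `resolve_bind`, `main`, and the placeholder `echo` function are hypothetical and not taken from the repository.

```python
import os


def resolve_bind(env=None):
    """Return the (host, port) the server should listen on inside the container."""
    env = os.environ if env is None else env
    # 0.0.0.0 makes the server reachable from outside the container;
    # 7860 matches EXPOSE in the Dockerfile and app_port in README.md.
    host = env.get("GRADIO_SERVER_NAME", "0.0.0.0")
    port = int(env.get("GRADIO_SERVER_PORT", "7860"))
    return host, port


def main():
    # Imported lazily so resolve_bind() can be used without Gradio installed.
    import gradio as gr

    def echo(path):
        # Placeholder (assumption): the real app.py would run ASR inference here.
        return f"received: {path}"

    demo = gr.Interface(fn=echo, inputs=gr.Audio(type="filepath"), outputs="text")
    host, port = resolve_bind()
    demo.launch(server_name=host, server_port=port)
```

With `CMD ["python", "app.py"]`, the file would call `main()` under an `if __name__ == "__main__":` guard. Binding to `127.0.0.1` instead of `0.0.0.0` would leave the server unreachable from outside the container.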

README.md CHANGED

@@ -3,7 +3,7 @@ title: RobotsMali ASR Demo
 emoji: 🎙️
 colorFrom: blue
 colorTo: green
-sdk: gradio
+sdk: docker
 app_port: 7860
 pinned: false
 ---

agents.md ADDED

+# RobotsMali ASR Agent
+This agent provides Automatic Speech Recognition (ASR) for Bambara using state-of-the-art models from RobotsMali.
+## Description
+This space implements several models optimized for Bambara language transcription, including:
+- **Soloni V3 (TDT-CTC)**: A hybrid architecture for fast and accurate transcription.
+- **Soloba V3 (CTC)**: A robust CTC-based model.
+The agent can take audio input (via upload or internal path) and return a full text transcription.
+## Tools
+### transcription
+Transcribes an audio file into text.
+- **audio**: The audio file to transcribe (Path or URL).
+- **model_name**: The name of the model to use (default: "Soloni V3 (TDT-CTC)").
+Returns: String containing the transcription.
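The `transcription` tool described in agents.md can be pictured as a single function with the two documented parameters. The sketch below is an illustration only: the model display names come from agents.md, but the checkpoint identifiers are placeholders, and the `backend` callable is a hypothetical stand-in for the real inference code, which this commit does not show.

```python
from typing import Callable, Dict

DEFAULT_MODEL = "Soloni V3 (TDT-CTC)"

# Display names are from agents.md; the checkpoint identifiers are placeholders.
MODEL_REGISTRY: Dict[str, str] = {
    "Soloni V3 (TDT-CTC)": "<soloni-v3-checkpoint>",
    "Soloba V3 (CTC)": "<soloba-v3-checkpoint>",
}


def transcription(
    audio: str,
    model_name: str = DEFAULT_MODEL,
    *,
    backend: Callable[[str, str], str],
) -> str:
    """Transcribe `audio` (a path or URL) with the chosen model and return the text."""
    if not audio:
        raise ValueError("audio must be a non-empty path or URL")
    if model_name not in MODEL_REGISTRY:
        known = ", ".join(sorted(MODEL_REGISTRY))
        raise ValueError(f"Unknown model {model_name!r}; expected one of: {known}")
    # `backend` (hypothetical) receives the checkpoint identifier and the audio
    # location and returns raw text; surrounding whitespace is trimmed.
    return backend(MODEL_REGISTRY[model_name], audio).strip()
```

Passing the backend in as an argument keeps the validation logic testable without loading a model. A real implementation would more likely load and cache the selected model once per `model_name`.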