AI Assistant commited on
Commit
a3e5374
·
1 Parent(s): 6ed534f

Fix: revert to robust Docker SDK, add agents.md, and finalize app.py logic

Browse files
Files changed (3) hide show
  1. Dockerfile +43 -0
  2. README.md +1 -1
  3. agents.md +20 -0
Dockerfile ADDED
@@ -0,0 +1,43 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ FROM python:3.10
2
+
3
+ ENV PYTHONDONTWRITEBYTECODE=1
4
+ ENV PYTHONUNBUFFERED=1
5
+
6
+ RUN useradd -m -u 1000 user
7
+ WORKDIR /home/user/app
8
+
9
+ # Install system dependencies
10
+ RUN apt-get update && apt-get install -y \
11
+ ffmpeg \
12
+ libsndfile1 \
13
+ cmake \
14
+ g++ \
15
+ git \
16
+ build-essential \
17
+ && rm -rf /var/lib/apt/lists/*
18
+
19
+ # Install build tools manually to ensure they are available for youtokentome
20
+ RUN pip install --no-cache-dir --upgrade pip && \
21
+ pip install --no-cache-dir \
22
+ Cython \
23
+ packaging \
24
+ setuptools \
25
+ wheel
26
+
27
+ # Install youtokentome without isolation for NeMo compatibility
28
+ RUN pip install --no-cache-dir --no-build-isolation youtokentome
29
+
30
+ # Copy requirements and install
31
+ COPY --chown=user requirements.txt .
32
+ RUN pip install --no-cache-dir -r requirements.txt
33
+
34
+ # Copy everything else
35
+ COPY --chown=user . .
36
+
37
+ # Permissions
38
+ RUN chmod 777 /home/user/app
39
+
40
+ USER user
41
+ EXPOSE 7860
42
+
43
+ CMD ["python", "app.py"]
README.md CHANGED
@@ -3,7 +3,7 @@ title: RobotsMali ASR Demo
3
  emoji: 🎙️
4
  colorFrom: blue
5
  colorTo: green
6
- sdk: gradio
7
  app_port: 7860
8
  pinned: false
9
  ---
 
3
  emoji: 🎙️
4
  colorFrom: blue
5
  colorTo: green
6
+ sdk: docker
7
  app_port: 7860
8
  pinned: false
9
  ---
agents.md ADDED
@@ -0,0 +1,20 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # RobotsMali ASR Agent
2
+
3
+ This agent provides Automatic Speech Recognition (ASR) for Bambara using state-of-the-art models from RobotsMali.
4
+
5
+ ## Description
6
+
7
+ This space implements several models optimized for Bambara language transcription, including:
8
+ - **Soloni V3 (TDT-CTC)**: A hybrid architecture for fast and accurate transcription.
9
+ - **Soloba V3 (CTC)**: A robust CTC-based model.
10
+
11
+ The agent can take audio input (via upload or internal path) and return a full text transcription.
12
+
13
+ ## Tools
14
+
15
+ ### transcription
16
+ Transcribes an audio file into text.
17
+ - **audio**: The audio file to transcribe (Path or URL).
18
+ - **model_name**: The name of the model to use (default: "Soloni V3 (TDT-CTC)").
19
+
20
+ Returns: String containing the transcription.