---
title: CantusSVS
emoji: 🕊️
colorFrom: gray
colorTo: blue
sdk: streamlit
sdk_version: 1.32.2
app_file: app.py
pinned: false
---
# CantusSVS
## About CantusSVS
CantusSVS is a singing voice synthesis tool that automatically generates audio playback for the Latin chants in Cantus. For training and inferencing, we use DiffSinger, a diffusion-based singing voice synthesis model described in the paper below:
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism
Liu, Jinglin, Chengxi Li, Yi Ren, Feiyang Chen, and Zhou Zhao. 2022. "DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism." In Proceedings of the AAAI Conference on Artificial Intelligence 36 (10): 11020–11028. https://arxiv.org/abs/2105.02446.
Training was done on Cedar, a cluster provided by the Digital Research Alliance of Canada. To set up training locally, follow this tutorial by tigermeat.
For general help with training and building a dataset, this tutorial by PixPrucer is an excellent guide. For further questions, join the DiffSinger Discord server.
The dataset used for this project was built using Adventus: Dominica prima adventus Domini, the first track from Psallentes' album Salzinnes Saints. Psallentes is a Belgian women's chorus that specializes in Late Medieval and Renaissance music. Salzinnes Saints is an album of music from the Salzinnes Antiphonal, a mid-sixteenth century choirbook with the music and text for the Liturgy of the Hours.
## Preparing Your Input
- Input format must be `.mei` (Music Encoding Initiative).
- Most commercial music composition software can export `.mei` files. MuseScore 4 is free to use.
- Only monophonic scores are supported (one staff, one voice).
- Lyrics must be embedded in the MEI file and aligned with the notes.
Validation tool:

```shell
python scripts/validate_mei.py your_song.mei
```
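The kind of structural checks such a validator performs can be sketched as follows. This is a minimal illustration of the input requirements above (one staff, lyrics attached to notes), assuming standard MEI XML markup; it is not the actual logic of `scripts/validate_mei.py`:

```python
# Sketch: structural checks for a monophonic MEI score with embedded lyrics.
# Assumes standard MEI markup; the real scripts/validate_mei.py may differ.
import xml.etree.ElementTree as ET

MEI_NS = "{http://www.music-encoding.org/ns/mei}"

def check_mei(xml_text: str) -> list[str]:
    """Return a list of problems found in the MEI document (empty = OK)."""
    root = ET.fromstring(xml_text)
    problems = []

    # "One staff, one voice": exactly one <staffDef> is expected.
    staff_defs = root.findall(f".//{MEI_NS}staffDef")
    if len(staff_defs) != 1:
        problems.append(f"expected 1 staffDef, found {len(staff_defs)}")

    # Lyrics aligned with notes: every <note> should carry a <syl> syllable.
    for i, note in enumerate(root.findall(f".//{MEI_NS}note")):
        if note.find(f".//{MEI_NS}syl") is None:
            problems.append(f"note #{i} has no lyric syllable")

    return problems

# Tiny hand-written example: one staff, one note with a syllable attached.
sample = """<mei xmlns="http://www.music-encoding.org/ns/mei">
  <staffDef n="1"/>
  <layer><note pname="c" oct="4"><verse><syl>Do</syl></verse></note></layer>
</mei>"""

print(check_mei(sample))  # an empty list means the score passed both checks
```

Running the bundled validator remains the authoritative check; this sketch only illustrates why monophony and embedded lyrics are required.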
## FAQ
Q: Can I synthesize polyphonic (multi-voice) chants?
A: No, only monophonic scores are supported currently. However, in the future, polyphonic chants could be synthesized by layering multiple monophonic voices.
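Such layering amounts to summing the synthesized audio tracks sample by sample. A minimal sketch using only the Python standard library, assuming 16-bit mono WAV renders at the same sample rate (the file names here are hypothetical):

```python
# Sketch: layering monophonic renders into a polyphonic mix by summing samples.
# Assumes 16-bit mono PCM WAV files with a common sample rate.
import array
import wave

def mix_wavs(paths: list[str], out_path: str) -> None:
    """Sum several mono WAV files into one, clipping to the 16-bit range."""
    tracks = []
    params = None
    for p in paths:
        with wave.open(p, "rb") as w:
            params = w.getparams()
            tracks.append(array.array("h", w.readframes(w.getnframes())))

    n = max(len(t) for t in tracks)  # mix runs as long as the longest voice
    mixed = array.array("h", [0] * n)
    for i in range(n):
        s = sum(t[i] for t in tracks if i < len(t))
        mixed[i] = max(-32768, min(32767, s))  # clip to 16-bit sample range

    with wave.open(out_path, "wb") as w:
        w.setparams(params)
        w.writeframes(mixed.tobytes())

# Hypothetical usage with two monophonic renders:
# mix_wavs(["cantus.wav", "organum.wav"], "polyphony.wav")
```

In practice each voice would be synthesized from its own monophonic MEI score, and scaling each track down before summing avoids clipping.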
Q: Can I change the voice timbre?
A: In the webapp, only the provided pre-trained model is available. However, DiffSinger learns the timbre of its training dataset, so if you train your own model you can control the timbre that way.