Frequent dropouts during voice generation

by ziozzang - opened 22 days ago

22 days ago

This issue becomes more apparent when the sentences get longer. In a text of about 100-200 characters, the audio skips at least one or two characters. It also frequently jumps over or omits numbers, especially in the thousands or ten-thousands range.

Although the overall voice quality is clean and consistent, this skipping problem is quite severe. A dropout seems almost guaranteed to occur at least once per generation.

This is especially noticeable when generating Korean.

anlgboy-cream

Supertone org 19 days ago

Hello! Reading numbers can be challenging for the current model, as it does not use a dedicated text normalizer and the training data volume is not yet sufficient to handle these cases robustly. As a workaround, you can apply your own text normalization for numbers if needed. Also, the model can occasionally exhibit skip/repeat issues. We're aware of these limitations and are working on improvements. We hope to release an improved model as soon as possible.
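For anyone trying the number-normalization workaround with Korean text, here is a minimal sketch, not an official Supertone normalizer. It assumes plain integers should be read with Sino-Korean numerals, and the names `sino_korean` and `normalize_numbers` are hypothetical helpers made up for this example. It does not handle native Korean counters (such as 3개 read as 세 개), decimals, or digit-by-digit strings like phone numbers.

```python
import re

# Sino-Korean digit and unit names. Index 0 of DIGITS is unused.
DIGITS = ["", "일", "이", "삼", "사", "오", "육", "칠", "팔", "구"]
SMALL_UNITS = ["천", "백", "십", ""]   # thousands, hundreds, tens, ones
BIG_UNITS = ["", "만", "억", "조"]     # 10^0, 10^4, 10^8, 10^12

# Matches comma-grouped integers like 12,000 or plain digit runs like 1234.
NUMBER = re.compile(r"\d{1,3}(?:,\d{3})+|\d+")


def read_group(n: int) -> str:
    """Spell out a value in the range 1..9999."""
    out = []
    for position, unit in enumerate(SMALL_UNITS):
        digit = (n // 10 ** (3 - position)) % 10
        if digit == 0:
            continue
        # A leading 1 is dropped before 천/백/십 (e.g. 십, not 일십).
        out.append(unit if digit == 1 and unit else DIGITS[digit] + unit)
    return "".join(out)


def sino_korean(n: int) -> str:
    """Spell out a non-negative integer below 10^16 in Sino-Korean."""
    if n < 0 or n >= 10 ** 16:
        raise ValueError("supported range is 0 <= n < 10**16")
    if n == 0:
        return "영"
    parts = []
    index = 0
    while n > 0:
        n, group = divmod(n, 10000)
        if group:
            text = read_group(group)
            # 10000 is normally read as 만 rather than 일만.
            if group == 1 and BIG_UNITS[index] == "만":
                text = ""
            parts.append(text + BIG_UNITS[index])
        index += 1
    return "".join(reversed(parts))


def normalize_numbers(text: str) -> str:
    """Replace integers in text with their Sino-Korean spelling."""
    def replace(match: re.Match) -> str:
        try:
            return sino_korean(int(match.group(0).replace(",", "")))
        except ValueError:
            # Leave numbers outside the supported range untouched.
            return match.group(0)
    return NUMBER.sub(replace, text)
```

With this sketch, `normalize_numbers("가격은 12,000원입니다")` returns `"가격은 만이천원입니다"`, which can then be passed to the model in place of the original text.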
