ByteDance's speech and voice synthesis with strong Asian language support
Doubao, from ByteDance, offers text-to-speech (TTS), voice cloning, and real-time speech recognition tuned for Mandarin and other Asian languages. It is a popular choice when you need voice pipelines that go beyond English-first models.
Use cases
- Voice assistants
- Audiobook narration
- Localization pipelines
Key features
- Text-to-speech
- Voice cloning
- Speech recognition
- Real-time translation
Related
ElevenLabs
ElevenLabs focuses on natural prosody, multilingual dubbing, and API-driven voice pipelines, and is commonly used for podcasts, localized marketing, and real-time avatar backends.
Descript
Descript treats audio and video like a document: edit the transcript, clean the voice track, add captions, and turn long recordings into shareable clips without opening a heavy timeline editor.
Krisp
Krisp sits in the meeting audio path to remove background noise, echo, and cross-talk while also handling transcripts, recordings, and action items for teams that live on calls.