ChatTTS is a text-to-speech (TTS) generation model designed for conversational scenarios, with the following features:
- Multi-language Support: Synthesizes speech in Chinese and English, making it usable in multilingual environments.
- Large-scale Data Training: Trained on approximately 100,000 hours of Chinese and English data, which supports high-quality, natural-sounding speech synthesis.
- Conversational Task Compatibility: Particularly well suited to the dialogue tasks of large language model (LLM) assistants, providing a natural and fluid interactive experience.
- Open Source Plan: The project team plans to open source a trained base model to promote academic research and community development.
- Control and Security: Improves model controllability and adds watermarks to support secure and reliable use of the model.
- Usability: Users simply input text to generate the corresponding audio file.
ChatTTS is suited to a range of applications, including conversational tasks for large language model assistants, dialogue-style audio and video introductions, and speech synthesis for educational and training content.
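
To make the "input text, get an audio file" workflow from the Usability point concrete, here is a minimal sketch of that pipeline using only the Python standard library. It does not call ChatTTS: `placeholder_synthesize` is a hypothetical stand-in that returns a plain tone, and `text_to_audio_file`, `save_wav`, and the 24 kHz `SAMPLE_RATE` are illustrative assumptions rather than part of the project's API. A real model call would be passed in place of the placeholder.

```python
import math
import struct
import wave

SAMPLE_RATE = 24_000  # assumed output rate in Hz; check the model's documentation


def placeholder_synthesize(text: str) -> list[float]:
    """Hypothetical stand-in for a TTS model call (not ChatTTS's API).

    Returns a 220 Hz tone lasting 50 ms per input character,
    as float samples in the range [-1.0, 1.0].
    """
    n_samples = SAMPLE_RATE * len(text) // 20  # 1/20 s = 50 ms per character
    return [
        0.3 * math.sin(2 * math.pi * 220 * i / SAMPLE_RATE)
        for i in range(n_samples)
    ]


def save_wav(samples: list[float], path: str, sample_rate: int = SAMPLE_RATE) -> None:
    """Write mono float samples to a 16-bit PCM WAV file."""
    # Clip to [-1, 1], then scale to the signed 16-bit integer range.
    pcm = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 2 bytes = 16 bits per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(struct.pack(f"<{len(pcm)}h", *pcm))


def text_to_audio_file(text: str, path: str, synthesize=placeholder_synthesize) -> str:
    """Turn text into a WAV file using the given synthesizer callable."""
    save_wav(synthesize(text), path)
    return path
```

Calling `text_to_audio_file("Hello there", "hello.wav")` writes a short WAV file. Passing the synthesizer as a callable keeps the file-writing step independent of whichever model produces the waveform.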