Project Introduction
FishSpeech is a text-to-speech (TTS) tool developed by the FishAudio team. Along with ChatTTS, it was one of the most popular open-source TTS projects of June-July 2024. Its team members are experienced SVC (singing voice conversion) developers on GitHub and were among the pioneers of AI voice cloning.
Main Features
• Zero-shot & Few-shot TTS: Only 10-30 seconds of voice samples are needed to generate high-quality speech, which makes it well suited to voice cloning.
• Strong generalization without phoneme dependency: The Fish Speech model does not rely on phonemes and can handle any language that can be written as text, which broadens the range of TTS applications.
• High accuracy: On a 5-minute English text, both the character error rate (CER) and the word error rate (WER) are about 2%.
• User-friendly multi-interface support:
  • WebUI: A web user interface based on Gradio, compatible with mainstream browsers (Chrome, Firefox, Edge).
  • GUI Inference: Provides a PyQt6 graphical interface that works seamlessly with the API server.
• Easy deployment: Supports quick deployment both locally and in the cloud with minimal loss of inference speed, which is convenient for developers.
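For readers unfamiliar with the CER and WER figures quoted in the feature list, the following is a minimal sketch of how these two metrics are usually defined: the edit distance between a reference text and a transcript, divided by the length of the reference, counted in characters for CER and in words for WER. For TTS, the transcript is commonly obtained by running the generated audio through a speech recognizer, but the source does not say how the FishSpeech numbers were measured. The function names and sample sentences below are illustrative only and are not part of FishSpeech.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))  # distance from an empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance / reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical example: the text given to the TTS model vs. a transcript of its audio.
reference = "the quick brown fox"
transcript = "the quick brown box"
print(f"WER = {wer(reference, transcript):.2%}")
print(f"CER = {cer(reference, transcript):.2%}")
```

One wrong word out of four gives a WER of 25%, while the same mistake is a single wrong character out of 19, so the CER is much lower. A figure of about 2% on both metrics therefore means only a small fraction of words and characters are wrong.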
Official Homepage: https://fish.audio
GitHub Project Address: https://github.com/fishaudio/fish-speech
HF Demo: https://huggingface.co/spaces/fishaudio/fish-speech-1