An open-source human animation model that generates realistic, audio-driven animations of half-body characters. It delivers expressive motion through a simplified generation pipeline, coordinating speech, facial expressions, and body movements from a single audio input.
The V1 model focused on facial (talking-head) animation, while V2 generates half-body character videos. It uses a novel audio-pose dynamic coordination strategy, combining pose sampling and audio diffusion, to enhance half-body detail and facial and gesture expressiveness while reducing condition redundancy. Applications include digital-human live streaming, virtual anchors, video editing, AI voiceover, and similar projects.
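To make the audio-pose coordination idea concrete, here is a minimal conceptual sketch in Python/NumPy. It is not the project's actual code: the function names, shapes, and the simple concatenation fusion are all illustrative assumptions. The sketch only shows the general pattern of sampling a pose sub-sequence and aligning it frame-by-frame with audio embeddings to form a conditioning tensor for a diffusion denoiser.

```python
import numpy as np

def sample_pose_sequence(pose_bank: np.ndarray, num_frames: int, rng) -> np.ndarray:
    """Hypothetical pose sampling: draw a contiguous sub-sequence of body
    keypoints from a reference pose bank to condition video generation."""
    start = rng.integers(0, len(pose_bank) - num_frames + 1)
    return pose_bank[start:start + num_frames]

def fuse_conditions(audio_feats: np.ndarray, pose_feats: np.ndarray) -> np.ndarray:
    """Toy audio-pose coordination: concatenate per-frame audio embeddings
    with flattened pose keypoints into one conditioning tensor."""
    assert audio_feats.shape[0] == pose_feats.shape[0], "must be frame-aligned"
    flat_pose = pose_feats.reshape(len(pose_feats), -1)
    return np.concatenate([audio_feats, flat_pose], axis=-1)

rng = np.random.default_rng(0)
pose_bank = rng.standard_normal((100, 17, 2))  # 100 frames of 17 2-D keypoints (assumed layout)
audio = rng.standard_normal((24, 128))         # 24 frames of 128-d audio embeddings (assumed)
poses = sample_pose_sequence(pose_bank, 24, rng)
cond = fuse_conditions(audio, poses)
print(cond.shape)  # (24, 162): 128 audio dims plus 34 flattened pose dims
```

In the real model the fusion is learned inside the diffusion network rather than a raw concatenation; this toy only illustrates how pose and audio conditions can be kept frame-aligned while remaining separate signals.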
GitHub repository: https://github.com/antgroup/echomimic_v2
Online Demo: https://huggingface.co/spaces/fffiloni/echomimic-v2