The IIAU team from Dalian University of Technology has proposed CharacterFactory, which trains in about 10 minutes using only 2.5 GB of VRAM and needs no reference images. Once trained, it samples new consistent characters end-to-end in about 3 seconds, without restriction, and each character can be combined with text prompts for action, background, style, and more to generate identity-consistent images. It can also be plugged into video and 3D generation without fine-tuning.
Related Links
Demo: https://huggingface.co/spaces/DecoderWQH666/CharacterFactory
GitHub: https://github.com/qinghew/CharacterFactory (code is open source)
Project Page: https://qinghew.github.io/CharacterFactory/