If you are interested, here is the link: https://audiobox.metademolab.com/ (no VPN required)
Through Audiobox Maker, even novice users can design and generate voice files for different characters (such as Little Red Riding Hood, the Big Bad Wolf, and Grandma), while adding different sound effects. By dragging and arranging various files (like building with Lego), you can create and direct your own story.
In fact, you can think of Audiobox as a "model family" that brings together six AI tools, including voice cloning, text-to-speech, text-to-sound-effects (such as applause, a dog barking, car horns, or thunder), and adding or removing sound effects at specific points in a recording.
The results are as follows:
Meta claims that, compared with the previous state of the art, Audiobox reduces FAD (Fréchet Audio Distance; the lower the value, the better) by 50%, making its output comparable to real audio in quality and fidelity.
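For readers curious about what FAD actually measures, here is a minimal sketch of the underlying Fréchet distance. It fits a Gaussian to each set of audio embeddings and compares the means and covariances. This is not Meta's evaluation code: the function name `frechet_distance` is hypothetical, and the random arrays below only stand in for embeddings that would normally come from a pretrained audio model (the original FAD metric used VGGish).

```python
import numpy as np

def frechet_distance(real_emb, gen_emb):
    """Frechet distance between Gaussians fitted to two embedding sets.

    Each input has shape (num_clips, embedding_dim). Lower means the
    generated set is statistically closer to the real set.
    """
    real_emb = np.asarray(real_emb, dtype=np.float64)
    gen_emb = np.asarray(gen_emb, dtype=np.float64)

    # Mean and covariance of each embedding set.
    mu_r, mu_g = real_emb.mean(axis=0), gen_emb.mean(axis=0)
    cov_r = np.atleast_2d(np.cov(real_emb, rowvar=False))
    cov_g = np.atleast_2d(np.cov(gen_emb, rowvar=False))

    # Tr((cov_r cov_g)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_r @ cov_g; clip tiny negatives caused by round-off.
    eigvals = np.linalg.eigvals(cov_r @ cov_g)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_r - mu_g
    return float(diff @ diff + np.trace(cov_r) + np.trace(cov_g) - 2.0 * tr_sqrt)

# Stand-in data: 200 "clips" with 4-dimensional embeddings (an assumption).
rng = np.random.default_rng(0)
real = rng.normal(size=(200, 4))

same = frechet_distance(real, real)           # approximately 0
shifted = frechet_distance(real, real + 2.0)  # approximately 16 (4 dims x 2^2)
print(same, shifted)
```

Under this definition, a 50% reduction means the generated audio's embedding statistics sit half as far from those of real audio as the earlier system's did.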