Synthesia's deepfake avatars will get full bodies

batasakas
Posts: 89
Joined: Thu Dec 12, 2024 3:24 am


Post by batasakas »

Synthesia is preparing to introduce a new version of its avatars with moving bodies and arms. The deepfakes will be able to sing and wave a microphone while dancing, get up from a table and walk around the room, and express more complex emotions such as excitement, fear, or nervousness. The update is expected towards the end of the year.

“It’s quite impressive. No one else can do this,” says Jack Saunders, a scientist at the University of Bath who was not involved in the Synthesia work.

He says the full-length avatars he's seen are very good, despite some minor flaws, like the arms sometimes crossing over each other. But "you probably won't be looking closely enough to notice it," Saunders says.

In April, Synthesia released the first version of its hyper-realistic avatars. They use large language models to ensure that facial expressions and tone of voice match the content of the text being spoken. The avatars' visuals are generated by diffusion models, the class of AI system widely used to create images and videos. However, this generation of avatars is only rendered from the waist up, which limits their realism.


To create full-bodied avatars, Synthesia is developing an even larger AI model. Users will have to go to a studio to record their body movements. But before full-length avatars become available, another version will be released, with hands and the ability to record from different angles. Their predecessors were only available in portrait mode and were only visible from the front.

Other startups, such as Hour One, have launched similar avatars with hands. Synthesia's version, due out in late July, features somewhat more realistic hand movements and lip syncing.

With the update, creating your own avatar will be much easier: whereas previously a user had to go to a studio and spend a couple of hours recording their face and voice, the new version requires only 10 minutes of material, and the necessary equipment is just a digital camera, a portable microphone, and a laptop. In most cases, even a laptop's built-in camera will do.

And while facial movements and voice previously had to be recorded separately, the data is now collected in a single session. Users also read aloud a statement consenting to the recording, along with a randomly generated security password.

According to the company's CEO, Victor Riparbelli, these changes will let the AI models that power the avatars do more with less data. They also speed up the process: where creating an avatar previously took weeks, it is now ready the next day.