Microsoft has introduced VASA-1, a new AI model designed to generate lifelike talking faces for virtual characters with appealing visual affective skills (VAS).
With just a single static image and a speech audio clip, the company says VASA-1 can create lifelike short videos. The model also offers several options for adjusting the generated video. Here is everything you need to know.
In a post on its Research announcement page, Microsoft revealed its new AI model that can synchronise lip movements with audio and also capture a wide spectrum of facial nuances and natural head motions.
Microsoft claims that VASA-1 can deliver high-quality video with realistic facial and head dynamics. The model supports online generation of 512 x 512 videos at up to 40 fps with negligible starting latency.
VASA-1 will essentially turn a static selfie into a talking clip of you. All you have to do is upload a photo along with a voice note and let the AI model do the talking for you.
VASA-1 will be able to render a minute-long clip at a resolution of 512 x 512 pixels and at up to 40 frames per second, without compromising on the quality of your image.
Source: Qatar News Agency