On Thursday, Naver unveiled natural end-to-end speech synthesis (NES) through the website of Clova, an AI voice assistant service. "Everyone can make voice fonts easily and conveniently," Naver Clova Voice research head Kim Jae-min was quoted as saying. Naver plans to add more features such as the voices of popular figures and various emotions to NES.
Naver said that AI-synthesized voices can be created from about 40 minutes of voice recordings (about 400 sentences). Similar technologies developed by tech companies have so far needed at least 40 hours of actual voice recordings to create an artificial voice.
NES can control the emotions of the artificial voice to make it sound happy or sad. Synthesized voice technology can be useful in services, electronic books and other sectors. In November last year, Naver released an audiobook service featuring the synthesized voice of Yoo In-na, a 36-year-old actress who served as a radio DJ. The service was built with Hybrid DNN Text-to-Speech (HDTS), a technology that converts text into synthesized voices.
Copyright ⓒ Aju Press All rights reserved.