Shohei Iwase, Takuya Kato, Shugo Yamaguchi, Yukitaka Tsuchiya, Shigeo Morishima

Song2Face: Synthesizing Singing Facial Animation from Audio


We present Song2Face, a deep neural network capable of producing singing facial animation from an input of singing voice and singer label. The network architecture is built upon our insight that, although facial expression when singing varies between different individuals, singing voices store valuable information such as pitch, breathe, and vibrato that expressions may be attributed to 꾸러기 훈민정음 다운로드. Therefore, our network consists of an encoder that extracts relevant vocal features from audio, and a regression network conditioned on a singer label that predicts control parameters for facial animation Two popes downloaded. In contrast to prior audio-driven speech animation methods which initially map audio to text-level features, we show that vocal features can be directly learned from singing voice without any explicit constraints 슈퍼 혼두라. Our network is capable of producing movements for all parts of the face and also rotational movement of the head itself. Furthermore, stylistic differences in expression between different singers are captured via the singer label, and thus the resulting animations singing style can be manipulated at test time 어린이 영어 동화.