2020-12-10 | peer-reviewed conference papers

Song2Face: Synthesizing Singing Facial Animation from Audio

We present Song2Face, a deep neural network capable of producing singing facial animation from an input of singing voice and singer label. The network architecture is built upon our insight that, although facial expression when singing varies between different individuals, singing voices store valuable information such as pitch, breath, and vibrato that expressions may be attributed to. Therefore, our network consists of an encoder that extracts relevant vocal features from audio, and a regression network conditioned on a singer label that predicts control parameters for facial animation. In contrast to prior audio-driven speech animation methods which initially map audio to text-level features, we show […]

2020-11-30 | peer-reviewed conference papers

Do We Need Sound for Sound Source Localization?

In sound source localization that uses both visual and aural information, it remains unclear how much the image and sound modalities each contribute to the result, i.e., do we need both image and sound for sound source localization? To address this question, we develop an unsupervised learning system that solves sound source localization by decomposing this task into two steps: (i) “potential sound source localization”, a step that localizes possible sound sources using only visual information, and (ii) “object selection”, a step that identifies which objects are actually sounding using aural information. Our overall system achieves state-of-the-art performance in sound source […]

2020-10-20 | peer-reviewed conference papers

LinSSS: linear decomposition of heterogeneous subsurface scattering for real-time screen-space rendering

Screen-space subsurface scattering is currently the most common approach to represent translucent materials in real-time rendering. However, most of the current approaches approximate the diffuse reflectance profile of translucent materials as a symmetric function, whereas the profile has an asymmetric shape in nature. To address this problem, we propose LinSSS, a numerical representation of heterogeneous subsurface scattering for real-time screen-space rendering. Although our representation is built upon a previous method, it makes two contributions. First, LinSSS formulates the diffuse reflectance profile as a linear combination of radially symmetric Gaussian functions. Nevertheless, it can also represent the spatial variation and the radial asymmetry of the […]

2020-10-20 | peer-reviewed conference papers

Audio-Visual Object Removal in 360-Degree Videos

We present a novel concept, audio-visual object removal in 360-degree videos, in which a target object in a 360-degree video is removed in both the visual and auditory domains synchronously. Previous methods have focused solely on the visual aspect of object removal using video inpainting techniques, resulting in videos in which sounds corresponding to the removed objects implausibly remain. We propose a solution which incorporates direction acquired during the video inpainting process into the audio removal process. More specifically, our method identifies the sound corresponding to the visually tracked target object and then synthesizes a three-dimensional sound field by subtracting the identified sound from the input 360-degree […]

2020-10-13 | peer-reviewed conference papers

Resolving Hand-Object Occlusion for Mixed Reality with Joint Deep Learning and Model Optimization

By overlaying virtual imagery onto the real world, mixed reality facilitates diverse applications and has drawn increasing attention. Enhancing physical in-hand objects with a virtual appearance is a key component for many applications that require users to interact with tools, such as surgery simulations. However, due to complex hand articulations and severe hand-object occlusions, resolving occlusions in hand-object interactions is a challenging topic. Traditional tracking-based approaches are limited by strong ambiguities from occlusions and changing shapes, while reconstruction-based methods handle dynamic scenes poorly. In this paper, we propose a novel real-time optimization system to resolve hand-object occlusions by […]

2020-07-17 | peer-reviewed conference papers

Multi-Instrument Music Transcription Based on Deep Spherical Clustering of Spectrograms and Pitchgrams

This paper describes a clustering-based music transcription method that estimates the piano rolls of arbitrary musical instrument parts from multi-instrument polyphonic music signals. If target musical pieces are always played by particular kinds of musical instruments, a straightforward way to obtain piano rolls is to compute the pitchgram (pitch saliency spectrogram) of each musical instrument by using a deep neural network (DNN). However, this approach has a critical limitation: it has no way to deal with musical pieces that include undefined musical instruments. To overcome this limitation, we estimate a condensed pitchgram with an instrument-independent neural multi-pitch estimator and then separate the pitchgram into a specified […]

2020-05-25 | peer-reviewed conference papers

Asynchronous Eulerian Liquid Simulation

We present a novel method for simulating liquid with asynchronous time steps on Eulerian grids. Previous approaches focus on Smoothed Particle Hydrodynamics (SPH), Material Point Method (MPM) or tetrahedral Finite Element Method (FEM), but methods for simulating liquid purely on Eulerian grids have not yet been investigated. We address several challenges specifically arising from the Eulerian asynchronous time integrator such as regional pressure solve, asynchronous advection, interpolation, regional volume preservation, and dedicated segregation of the simulation domain according to the liquid velocity. We demonstrate our method on top of staggered grids combined with the level set method and the semi-Lagrangian scheme. We run several […]

2020-05-18 | peer-reviewed conference papers

Foreground-aware Dense Depth Estimation for 360 Images

With 360 imaging devices becoming widely accessible, omnidirectional content has gained popularity in multiple fields. The ability to estimate depth from a single omnidirectional image can benefit applications such as robotics navigation and virtual reality. However, existing depth estimation approaches produce sub-optimal results on real-world omnidirectional images with dynamic foreground objects. On the one hand, capture-based methods cannot obtain the foreground due to the limitations of the scanning and stitching schemes. On the other hand, it is challenging for synthesis-based methods to generate highly realistic virtual foreground objects that are comparable to the real-world ones. In this paper, we propose to […]

2020-04-25 | peer-reviewed conference papers

Smartphone-Based Assistance for Blind People to Stand in Lines

We present a system that allows blind people to stand in line in public spaces using only an off-the-shelf smartphone. The technologies to navigate blind pedestrians in public spaces are rapidly improving, but tasks that require understanding surrounding people’s behavior are still difficult to assist. Standing in line at shops, stations, and other crowded places is one such task. Therefore, we developed a system that continuously detects the distance to the person in front and notifies the user of it, using a smartphone with an RGB camera and an infrared depth sensor. The system signals three levels of distance via vibration patterns to allow users […]

2020-04-25 | peer-reviewed conference papers

BlindPilot: A Robotic Local Navigation System that Leads Blind People to a Landmark Object

Blind people face various local navigation challenges in their daily lives, such as identifying empty seats in crowded stations, navigating toward a seat, and stopping and sitting at the correct spot. Although voice navigation is a commonly used solution, it requires users to carefully follow frequent navigational sounds over short distances. Therefore, we present an assistive robot, BlindPilot, which guides blind users to landmark objects using an intuitive handle. BlindPilot employs an RGB-D camera to detect the positions of target objects and uses LiDAR to build a 2D map of the surrounding area. On the basis of the sensing results, BlindPilot then generates a path to […]

2020-02-27 | peer-reviewed conference papers, awards

Single Sketch Image based 3D Car Shape Reconstruction with Deep Learning and Lazy Learning

Efficient car shape design is a challenging problem in both the automotive industry and the computer animation/games industry. In this paper, we present a system to reconstruct the 3D car shape from a single 2D sketch image. To learn the correlation between 2D sketches and 3D cars, we propose a Variational Autoencoder deep neural network that takes a 2D sketch and generates a set of multi-view depth and mask images, which form a more effective representation compared to 3D meshes, and can be effectively fused to generate a 3D car shape. Since global models like deep learning have limited capacity to reconstruct fine-detail features, we propose a local lazy […]

2020-02-27 | peer-reviewed conference papers

Audio-guided Video Interpolation via Human Pose Features

This paper describes a method that generates in-between frames of two videos of a musical instrument being played. While image generation has achieved successful outcomes in recent years, there is ample scope for improvement in video generation. The keys to improving the quality of video generation are the high resolution and temporal coherence of videos. We addressed these requirements by using not only visual information but also aural information. The critical point of our method is using two-dimensional pose features to generate high-resolution in-between frames from the input audio. We constructed a deep neural network with a recurrent structure […]

2020-02-27 | peer-reviewed conference papers

Melody Slot Machine: Audio-guided Video Interpolation via Human Pose Features

We developed an interactive music system called the “Melody Slot Machine,” which provides an experience of manipulating and controlling a music performance. The melodies used in the system are divided into multiple segments, and each segment has multiple variations of melodies. Users can create a variety of music by operating the touch panel and slot lever. By turning the dials displayed on the panel manually, a user can switch the variations of melodies freely. When a user pulls the slot lever, the melodies of all segments rotate, and melody segments are randomly selected. There are multiple melodies, from intense ones with […]
