LineChaser: A Smartphone-Based Navigation System for Blind People to Stand in Lines

Standing in line is one of the most common social behaviors in public spaces but can be challenging for blind people. We propose an assistive system named LineChaser, which navigates a blind user to the end of a line and continuously reports the distance and direction to the last person in the line so that they can be followed. LineChaser uses the RGB camera in a smartphone to detect nearby pedestrians, and the built-in infrared depth sensor to estimate their position. Via pedestrian position estimations, LineChaser determines whether nearby pedestrians are standing in line, and uses audio and vibration signals to notify the user when they […]

MirrorNet: A Deep Reflective Approach to 2D Pose Estimation for Single-Person Images

This paper proposes a statistical approach to 2D pose estimation from human images. The main problems with the standard supervised approach, which is based on a deep recognition (image-to-pose) model, are that it often yields anatomically implausible poses, and its performance is limited by the amount of paired data. To solve these problems, we propose a semi-supervised method that can make effective use of images with and without pose annotations. Specifically, we formulate a hierarchical generative model of poses and images by integrating a deep generative model of poses from pose features with that of images from poses and image features. We […]

Do We Need Sound for Sound Source Localization?

During the performance of sound source localization which uses both visual and aural information, it presently remains unclear how much either image or sound modalities contribute to the result, i.e., do we need both image and sound for sound source localization? To address this question, we develop an unsupervised learning system that solves sound source localization by decomposing this task into two steps: (i) “potential sound source localization”, a step that localizes possible sound sources using only visual information; and (ii) “object selection”, a step that identifies which objects are actually sounding using aural information. Our overall system achieves state-of-the-art performance in sound source localization, and […]

Adversarial Knowledge Distillation for a Compact Generator

In this paper, we propose memory-efficient Generative Adversarial Nets (GANs) in line with knowledge distillation. Most existing GANs have shortcomings in terms of a large number of model parameters and low processing speed. Here, to tackle the problem, we propose Adversarial Knowledge Distillation for Generative models (AKDG) for highly efficient GANs, in terms of unconditional generation. Using AKDG, model size and processing time are substantially reduced. Through adversarial training with a distillation discriminator, a student generator successfully mimics a teacher generator with fewer model layers and fewer parameters and at a higher processing speed. Moreover, our AKDG is network architecture-agnostic. A […]

Song2Face: Synthesizing Singing Facial Animation from Audio

We present Song2Face, a deep neural network capable of producing singing facial animation from an input of singing voice and singer label. The network architecture is built upon our insight that, although facial expression when singing varies between different individuals, singing voices store valuable information such as pitch, breath, and vibrato that expressions may be attributed to. Therefore, our network consists of an encoder that extracts relevant vocal features from audio, and a regression network conditioned on a singer label that predicts control parameters for facial animation. In contrast to prior audio-driven speech animation methods which initially map audio to text-level features, we show […]

LinSSS: linear decomposition of heterogeneous subsurface scattering for real-time screen-space rendering

Screen-space subsurface scattering is currently the most common approach to represent translucent materials in real-time rendering. However, most of the current approaches approximate the diffuse reflectance profile of translucent materials as a symmetric function, whereas the profile has an asymmetric shape in nature. To address this problem, we propose LinSSS, a numerical representation of heterogeneous subsurface scattering for real-time screen-space rendering. Although our representation is built upon a previous method, it makes two contributions. First, LinSSS formulates the diffuse reflectance profile as a linear combination of radially symmetric Gaussian functions. Nevertheless, it can also represent the spatial variation and the radial asymmetry of the […]
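As a rough illustration of the first contribution only, the sketch below evaluates a radial profile written as a weighted sum of radially symmetric Gaussians. It is not the LinSSS implementation: the function names, the weights, and the variances are hypothetical values chosen for illustration, and the sketch does not attempt the spatial variation or radial asymmetry that the excerpt mentions.

```python
import math


def gaussian_2d(r, variance):
    """Normalized, radially symmetric 2D Gaussian evaluated at radius r."""
    return math.exp(-r * r / (2.0 * variance)) / (2.0 * math.pi * variance)


def reflectance_profile(r, weights, variances):
    """Weighted sum of Gaussians: R(r) = sum_i w_i * G(r; v_i)."""
    return sum(w * gaussian_2d(r, v) for w, v in zip(weights, variances))


# Hypothetical weights and variances, assumed here purely for illustration.
WEIGHTS = [0.6, 0.3, 0.1]
VARIANCES = [0.05, 0.5, 2.0]

# The profile is largest at the center and decays with distance.
profile_at_center = reflectance_profile(0.0, WEIGHTS, VARIANCES)
profile_far = reflectance_profile(3.0, WEIGHTS, VARIANCES)
```

Because the profile is linear in the weights, changing the appearance in this sketch only requires changing the weights while the Gaussian terms stay fixed.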

Audio-Visual Object Removal in 360-Degree Videos

We present a novel concept, audio-visual object removal in 360-degree videos, in which a target object in a 360-degree video is removed in both the visual and auditory domains synchronously. Previous methods have solely focused on the visual aspect of object removal using video inpainting techniques, resulting in videos with unreasonable remaining sounds corresponding to the removed objects. We propose a solution which incorporates direction acquired during the video inpainting process into the audio removal process. More specifically, our method identifies the sound corresponding to the visually tracked target object and then synthesizes a three-dimensional sound field by subtracting the identified sound from the input 360-degree […]

Resolving Hand-Object Occlusion for Mixed Reality with Joint Deep Learning and Model Optimization

By overlaying virtual imagery onto the real world, mixed reality facilitates diverse applications and has drawn increasing attention. Enhancing physical in-hand objects with a virtual appearance is a key component for many applications that require users to interact with tools, such as surgery simulations. However, due to complex hand articulations and severe hand-object occlusions, resolving occlusions in hand-object interactions is a challenging topic. Traditional tracking-based approaches are limited by strong ambiguities from occlusions and changing shapes, while reconstruction-based methods show a poor capability of handling dynamic scenes. In this paper, we propose a novel real-time optimization system to resolve hand-object occlusions by […]

3D Car Shape Reconstruction from a Contour Sketch using GAN and Lazy Learning

3D car models are heavily used in computer games, visual effects, and even automotive designs. As a result, producing such models with minimal labour costs is increasingly important. To tackle the challenge, we propose a novel system to reconstruct a 3D car using a single sketch image. The system learns from a synthetic database of 3D car models and their corresponding 2D contour sketches and segmentation masks, allowing effective training with minimal data collection cost. The core of the system is a machine learning pipeline that combines the use of a Generative Adversarial Network (GAN) and lazy learning. GAN, being a deep learning method, is capable […]

Guiding Blind Pedestrians in Public Spaces by Understanding Walking Behavior of Nearby Pedestrians

We present a guiding system to help blind people walk in public spaces while making their walking seamless with nearby pedestrians. Blind users carry a rolling suitcase-shaped system that has two RGB-D cameras, an inertial measurement unit (IMU) sensor, and a light detection and ranging (LiDAR) sensor. The system senses the behavior of surrounding pedestrians, predicts risks of collisions, and alerts users to help them avoid collisions. It has two modes: the “on-path” mode that helps users avoid collisions without changing their path by adapting their walking speed; and the “off-path” mode that navigates an alternative path to go around pedestrians standing in the way […]
