Authors:

Kevin Yu*, Gleb Gorbachev*, Ulrich Eck, Frieder Pankratz, Nassir Navab, Daniel Roth (*shared authorship)

Abstract: 

A 3D Telepresence system allows users to interact with each other in a virtual, mixed, or augmented reality (VR, MR, AR) environment, creating a shared space for collaboration and communication. There are two main methods for representing users within these 3D environments. Users can be represented either as point cloud reconstruction-based avatars that resemble a physical user or as virtual character-based avatars controlled by tracking the users' body motion. This work compares both techniques to identify the differences between user representations and their fit in the reconstructed environments regarding the perceived presence, uncanny valley factors, and behavior impression. Our study uses an asymmetric VR/AR teleconsultation system that allows a remote user to join a local scene using VR. The local user observes the remote user with an AR head-mounted display, leading to facial occlusions in the 3D reconstruction. Participants perform a warm-up interaction task followed by a goal-directed collaborative puzzle task, pursuing a common goal. The local user was represented either as a point cloud reconstruction or as a virtual character-based avatar, in which case the point cloud reconstruction of the local user was masked. Our results show that the point cloud reconstruction-based avatar was superior to the virtual character avatar regarding perceived co-presence, social presence, behavioral impression, and humanness. Further, we found that the task type partly affected the perception. The point cloud reconstruction-based approach led to higher usability ratings, while objective performance measures showed no significant difference. We conclude that despite partly missing facial information, the point cloud-based reconstruction resulted in better conveyance of the user behavior and a more coherent fit into the simulation context.

Media:

Supplementary Video