Smart speakers with voice agents are becoming increasingly common. However, the agent’s voice always emanates from the device, even when that information is contextually and spatially relevant elsewhere. Digital Ventriloquism allows smart speakers to render sound onto everyday objects, such that it appears they are speaking and are interactive. This can be achieved without any modification of objects or the environment. For this, we used a highly directional pan-tilt ultrasonic array. By modulating a 40 kHz ultrasonic signal, we can emit sound that is inaudible “in flight” and demodulates to audible frequencies when impacting a surface through acoustic parametric interaction. This makes it appear as though the sound originates from an object and not the speaker. We ran a study in which we projected speech onto five objects in three environments, and found that participants were able to correctly identify the source object 92% of the time and correctly repeat the spoken message 100% of the time, demonstrating our digital ventriloquy is both directional and intelligible.
Previous Sozu Presented at UIST