Multimodal Error Correction for Speech-to-Text @ IUI’22

I am glad to announce that the “last core study” of my PhD thesis on a mobile-office conditionally automated vehicle was accepted as a full paper at the 2022 ACM Intelligent User Interfaces conference (IUI’22). Everyone is welcome to attend the remote talk (the conference was originally set in Helsinki, Finland) on Thursday, 24th March, 14:30 – 15:55 (UTC+2). More information is available in the conference program.

The paper investigates how to tackle the “repair problem” of speech-to-text entry in a mobile context by adding existing but underutilized in-vehicle interaction modalities, such as touch-sensitive areas or mid-air hand gestures. For the study, we used a low-fidelity Wizard-of-Oz (WoZ) user study approach, combining Zoom.us’ “click-through” option with an Adobe Axure click prototype.

Multimodal Error Correction for Speech-to-Text in a Mobile Office Automated Vehicle: Results From a Remote Study

ACM Reference Format

Clemens Schartmüller and Andreas Riener. 2022. Multimodal Error Correction for Speech-to-Text in a Mobile Office Automated Vehicle: Results From a Remote Study. In 27th International Conference on Intelligent User Interfaces (IUI ’22), March 22–25, 2022, Helsinki, Finland. ACM, New York, NY, USA, 15 pages. https://doi.org/10.1145/3490099.3511131

The paper will most likely be available in the ACM Digital Library and on my ResearchGate profile as soon as the conference starts, on March 21st, 2022.

Abstract

Future users of automated vehicles will demand the ability to perform diverse and extensive non-driving related tasks. However, prevailing restrictions in the car require new interaction concepts to enable productive office work. Intelligent voice-based interfaces may be a solution to facilitate productivity while at the same time keeping the driver “in the loop” and thereby maintaining safety. In this work, we investigated the repair problem of productive speech-to-text input in a highly automated vehicle. We examined the user experience of selecting/navigating to an incorrectly recognized word using only speech, pointing and clicking on a touchpad, and using mid-air hand gestures. Results indicate that hand gestures (condition VaG) have high hedonic quality but are not considered viable for error correction in productive text input. On the other hand, the unimodal (Voice-only; baseline) and touchpad-based point-and-click (VaT) approaches to error correction were rated equally well in the hypothesized “mobile office” automated vehicle. The utilized remote study execution methodology proved to be a useful intermediary tool between pure online surveys and on-site studies for qualitative research during a pandemic but suffered from a lack of fidelity and options for objective usability and safety evaluation.