Voice devices are now everywhere, whether you like them or not. Amazon's Alexa, Google's Assistant, and Apple's Siri have proved that voice interactions are no longer the stuff of science fiction films but part of our everyday reality.
Just as touch screens did, voice interaction will completely change how we interact with our computers, smartphones, and watches (and even cars and houses) in the coming years. But you might ask yourself: why is it evolving so quickly? There are many reasons.
Artificial intelligence is advancing fast, allowing machines to understand a wider variety of human speech. Without AI, computers would struggle with idioms, accents, and the other aspects of language that have made it so hard for machines to understand us. It wasn't long ago that people would get frustrated with Siri because it kept misunderstanding them. Now, things have advanced to such a degree that people use voice interactions for their daily tasks.
Voice interaction presents a huge opportunity to build a better relationship between humans and machines, providing an enjoyable user experience that can feel more natural and human than typing everything on a screen. With voice, not only can people with disabilities interact with their devices more easily, but systems and brands can also use idioms, tone, and ways of speaking to project a persona that people identify with.
Voice is set to be such a game-changer that there will be thousands of job openings for voice design specialists, and at the moment, not many people have these kinds of skills. So whether you are a UX designer or simply curious about this up-and-coming area, let's dive a little deeper into what it is and how it works.
What exactly is it?
Voice user interface, voice service, voice assistant, voice-first device: these and many other names are used interchangeably, and they all describe the same thing, the voice user interface.
Just as a standard graphical interface needs a keyboard, mouse, touchscreen, or some other way to interact with the machine, in voice interfaces the voice itself is the interactive medium between user and device. The device responds to voice commands either by speaking back or by showing a visual response, creating two different types of interaction: voice-only or multimodal.
Voice-Only Interactions
As the name indicates, voice-only interactions are those that are commanded and responded to using voice alone. For example, a voice search on Alexa with a spoken response is a voice-only interaction. Although these are useful and people do use them, multimodal interactions are widely believed to be the future, mainly because of the limitations that voice-only interactions entail: for example, cognitive overload, and the need to control the pace at which information is delivered.
Multimodal Interactions
A voice-controlled TV is the perfect example of a multimodal interface. The user does not have to reach for the remote control, yet they see the results of their voice commands playing on the screens. This multimodal interaction allows for more information to be conveyed to the user than a voice-only interface. You can show multiple options on the screen and let the user decide (with a voice command) which option to go for.
Multimodal and voice-only interfaces aren't just what the future looks (or sounds) like, making things faster and more comfortable. They also help protect the wellbeing of users by avoiding repetitive strain injuries from handheld devices or even from desk arrangements.
Multimodal and voice-only interactions are the future, but experts still don't know which one will become the dominant way of interacting. What we do know is that over half of US teens use voice search daily, yet we are still far from a voice-only future.
How to Design Voice Interfaces
But how exactly will UX designers adapt human-machine interactions to voice? We do know that this is not just a technological advancement but also a branding one. Companies have realized that they can increase their personal touch, add personality to their brand, and boost brand loyalty through voice.
Since it is still UX design, just through another medium, many of the same principles will apply: user research, constant and rigorous testing with users from the target audience, continuous iteration, simplicity, functionality, and so on.
However, there are some differences that we will look at:
1. Typing and Talking Styles Differ
The way we type and the way we talk are very different. Although we might search online for something like 'Keanu Reeves films,' when we speak we might ask, 'Which films has Keanu Reeves been in?'. When designing voice-commanded applications, we need to understand these differences. Voice services will need to understand the different ways we phrase a command, accounting for how direct the command is as well as the accents, tones, idioms, and more that users will bring to it.
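One way to picture this difference is intent matching: several spoken phrasings, from a terse typed-style query to a full conversational question, all map to a single underlying intent. The sketch below is purely illustrative; the intent name, patterns, and function are invented for this example and don't reflect any real voice platform's API.

```python
import re

# Illustrative intent table: each intent lists several spoken phrasings.
# Real voice services use trained language models, not regexes, but the
# mapping idea is the same.
INTENT_PATTERNS = {
    "FilmographySearch": [
        r"which films? (?:has|have) (?P<actor>.+?) been in",
        r"what movies? did (?P<actor>.+?) star in",
        r"(?P<actor>.+?) films?",  # terse, typed-style query
    ],
}

def match_intent(utterance: str):
    """Return (intent_name, slots) for the first pattern that matches."""
    text = utterance.lower().strip(" ?.!")
    for intent, patterns in INTENT_PATTERNS.items():
        for pattern in patterns:
            match = re.fullmatch(pattern, text)
            if match:
                return intent, match.groupdict()
    return None, {}
```

Here both `match_intent("Keanu Reeves films")` and `match_intent("Which films has Keanu Reeves been in?")` resolve to the same `FilmographySearch` intent, which is exactly the flexibility a voice service needs.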
2. System Status Will Change
By their nature, voice interactions resemble human conversation more than the human-machine interactions we know and experience today. How will a computer communicate, in a voice interaction, that it is working and listening? In human conversation, there is nodding, smiling, repeating things back, and asking relevant questions. Just as these cues serve a purpose, new ways to show system status will need to be designed for the voice-interactive world.
3. System Personality
Personality is attractive, both in humans and in systems. People love to see brand personas be funny and quirky. But above all, users want to encounter a unique personality, not something that feels utterly scripted and desperate to be liked. So although right now the focus is on developing the most intelligent voice services, companies also have their sights on creating the most likable and genuine brand personas through voice.
4. Information Architecture and Website Navigation
The way we build information architecture and website navigation will also need rethinking in the voice-interactive world. On a website, you might go through multiple windows and buttons until you reach your destination, but with voice, you won't say those steps out loud. You will simply state your request.
For example, instead of going through the homepage, then your account section, then your shopping history, and then opening an item to reorder it, with a voice command you might simply say, 'Alexa, reorder X item from Amazon.'
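That collapse of a multi-step GUI flow into one utterance can be sketched as a single intent handler. Everything here, the order-history data, the handler name, and the response wording, is a hypothetical illustration, not a real shopping skill.

```python
# Hypothetical order history, keyed by user. In a real skill this would
# come from the store's account service, not an in-memory dict.
ORDER_HISTORY = {
    "alice": [{"item": "coffee beans", "order_id": 1042}],
}

def handle_reorder_intent(user: str, item: str) -> str:
    """One voice intent replaces the whole home -> account -> history ->
    item -> reorder click path: find the past order and place it again."""
    for order in ORDER_HISTORY.get(user, []):
        if order["item"] == item:
            # A real skill would call the store's reorder API here.
            return f"Okay, reordering {item} (previous order #{order['order_id']})."
    return f"I couldn't find {item} in your order history."
```

The design point is that the voice interface's 'navigation' is flat: the user names the goal, and the system does the traversal that a GUI would have made the user perform.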
5. Security and Privacy
With voice interaction, multiple new opportunities are within reach of your voice, but security and privacy still need to be taken into account. It would not be wise to have the user say a password or confirm a credit card number aloud. Although we should give users credit and assume they understand basic security concepts, designers need to keep these risks in mind and design accordingly.
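One common design response is to route sensitive requests to an out-of-band confirmation (for example, a push notification to the user's phone) instead of ever prompting for a secret by voice. The intent names and responses below are illustrative assumptions, not any platform's real behavior.

```python
# Intents that should never be completed by speaking a secret aloud.
# These names are invented for illustration.
SENSITIVE_INTENTS = {"ConfirmPayment", "ChangePassword", "PlaceLargeOrder"}

def respond(intent: str) -> str:
    """Handle sensitive intents out of band; answer everything else by voice."""
    if intent in SENSITIVE_INTENTS:
        # Never ask the user to say a password or card number out loud.
        return "I've sent a confirmation request to your phone to approve this."
    return "Okay, doing that now."
```

This keeps the convenience of voice for everyday tasks while moving anything secret onto a private channel.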
Although this is an exciting new technology with many new opportunities, there is also room for improvement. Its advances always need to be user-centered: understanding in which contexts a voice interaction will add to the user's experience, and when it is best to stay with a visual approach, will be paramount. The future will have a significant component of voice interactions, and as the technology evolves, so will our designs. The possibilities are endless, and the benefits are many.