The Internet of Talking Things: Why Voice Interactions Catalyze Consumer IoT Adoption

Eric Suliga, Creative Director

By Jessica Groopman

Jessica Groopman is an independent industry analyst and IoT advisor specializing in the consumer-side Internet of Things, as well as AI and blockchain. Her research and analyst practice concentrates on the application of sensors, machine learning, automation, and consumer protections in B2C and B2B2C businesses. Based in the San Francisco Bay Area, she supports clients across Retail, Smart Home, Wearable, and Tech verticals. To learn more about Jessica, visit

Although internet-connected devices have been on the market for years now, the reality is that consumer IoT has struggled to achieve mainstream adoption. While an estimated 15 billion IoT devices are forecast to be in consumer hands by 2020, the market has been scratching its head as to what will inspire such a significant growth trajectory.

Meanwhile, another technology is emerging: the advent of reliable voice recognition is redefining how we interact with machines. Thanks to advancements in machine learning, deep learning, and natural language processing, chatbots, voice-enabled controls, and conversational interfaces are proving more productive and less frustrating. Already, the consumer sector is the largest market for voice technologies, ahead of industries such as financial services, healthcare, and automotive. Consumer applications represent some 80% of the entire voice recognition market in 2017, according to Tractica.

We know the most powerful technological advancements occur when multiple existing technologies converge, but why IoT and voice? The answer lies in three essential ways voice removes friction in user experiences.

First, it’s easier.

Speaking is simpler than writing, not only because it evades the formalities of writing but also because it is hands-free. Indeed, some 87% of consumers considering buying a smart speaker said they wanted one in order to ask it questions without having to type them, according to a recent study by Edison Research. Just as talking to another person tends to be easier than writing or typing to them, so too does voice remove friction in our interactions with devices.

From a technological adoption standpoint, this doesn’t just improve the interface (speaking instead of typing or tapping); it unlocks entirely new markets. Consider, for instance, how voice enables elderly people to use home care apps; helps people with disabilities enjoy internet services, play games, and live more independently; and offers kids enhanced story time or an easy way to engage with parents or family members remotely.

Second, it’s everywhere.

Although consumers rarely use the term ‘Internet of Things’ to describe their devices, the internet and sensors now pervade virtually all consumer electronics, appliances, and the machines and infrastructure we use every day. While not all consumers have connected coffee-makers or connected cars, consider how modern consumers are empowered with devices and data-driven services in each realm of life—many of which are already benefiting from voice interactions. For example:
  • Our selves: Smartphones and wearables pioneered voice for millions of users. In the not-so-distant future, we will see hearables (voice-controlled in-ear computers) supplement and possibly replace smartphone features like navigation, social media, and calendar notifications.
  • Our home: Smart speakers, TVs, cameras, remote controllers, refrigerators, thermostats, even social robots aren’t just popping up in homes and apartments everywhere, they’re all using voice recognition to allow people to benefit from in-home technologies without constantly staring at screens or swiping through apps.
  • In-transport: From cars to motorcycles, and even in public transportation, speech recognition is already being used to control music and media, make and take calls, navigate, even securely authenticate identity.
  • In-store: From personalized digital signage to augmented reality to customer service robots, brands and retailers are using all manner of in-store touchpoints to improve and streamline the shopper experience without losing the ‘human touch.’
  • In life: Whether at work in voice-enabled conference rooms, checking in at the doctor’s office, authenticating by voice for digital banking, at a baseball game, or just about anywhere else, the ability to simply ask for anything at any time drives efficiencies for both consumers and the institutions with which we interact.

Third, it’s human.

Since the dawn of civilization, speaking has been our most natural form of communication. We are innately wired to learn and produce language with relatively little effort. Moreover, voice conveys umpteen unseen elements of communication—emotion, tone, cultural nuance, etc.—which we naturally absorb and use to gauge interactions. This is the inherent personalization we employ for social interaction.

Another long-held human trait is imbuing our objects with human characteristics; we trust and identify with things when we anthropomorphize them. When devices talk to us, however robotic they may sound, they are immediately endowed with personality. Recent advances in processing speed and algorithms enable devices to respond dynamically, incorporating context from diverse data sets, customer preferences, and real-time contextual signals. This unlocks a new era of personalization wherein our devices become trusted partners, concierges, media curators, maybe even… friends?

Given the inherently heads-up, hands-free applications where voice interactions thrive (cooking, adjusting in-home environments, driving, etc.), audio is an immediate opportunity. Some 70% of smart speaker owners report listening to more audio in the home since purchasing these voice-enabled devices. Numerous studies have found that music and entertainment in particular are the most requested actions among people using voice controls (Coldwell Banker/Vivint study, Edison Research/NPR study, VoiceLabs study). This is perhaps unsurprising given that music is inherently personal, culturally shared, and of course personalizable.

As consumer IoT expands, brand persona must be ubiquitous yet personalized

Advancements in interface are more transformational than any single technology. Conversational interfaces reduce the learning and adoption curve while simultaneously fostering a bond between product and owner. Now that we can talk to machines and they can reply with accuracy, personality, even personalized and contextually relevant responses and media, a new baseline expectation is set. Consumers are unlikely to ever care about IoT; instead they value being able to intuitively access just what they want, just when they want it, in the simplest way possible. Such expectations demand that companies of all types, from device manufacturers to advertisers and service providers, ‘just be there’: to respond, to entertain, to be more human.

Interested in diving deeper? To learn more about how Voice is the New Touch or about what consumers are doing with their voice-enabled smart speakers, check out Pandora’s latest research here.

To learn more about Jessica and her research, check out her website at: You can also access her new research report, The User Experience of Things: Why Ubiquitous Sensing and Software Require a New Approach to Experience Design, at no cost.