How tech firms are racing to perfect conversational artificial intelligence

The Internet of Things is struggling to break into the mainstream because it’s hard to set up. Simply talking to your devices may be the solution.

 
Amazon Echo and other Internet of Things devices

(Photoillustration by Justin Poulsen)

It sounds like the sort of trivial debate you might have with a friend over drinks at a pub, but here’s an actual back and forth I had with my phone the other day while driving.

Me: “Who invented pizza?”

Phone: “Legend has it that [Raffaele] Esposito was called upon to make a pizza for Italian King Umberto I and Queen Margherita when they visited Naples in 1889.”

“Okay, well, who invented Hawaiian pizza?”

“It appears that Hawaiian pizza is not from Hawaii, but from Chatham, Ont., where retired pizza cook Sam Panopoulos says he invented the dish at Satellite Restaurant in 1962.”

“Why do people hate it?”

“Pizza is an Italian thing. Pineapple is a Hawaiian thing. Some people feel the two shouldn’t be combined and many don’t feel the sweet pineapple belongs or goes along with the flavours that a pizza should convey.”

My curiosity about pizza’s origins had been spurred innocuously enough. I’d just driven by an Italian restaurant and got to wondering about the food’s history. Hawaiian pizza and some people’s distaste for it had also been in the news recently.

Rather than pull over and do an Internet search, I looked at my phone sitting in its mount and activated Google Assistant by uttering the magic phrase, “Okay Google…”

In response, a friendly female voice quickly and accurately answered my queries, drawing from Wonderopolis, the National Post and Quora, respectively.

The conversation made me hungry, so I asked where I could get the best pizza—non-Hawaiian, of course. My voice assistant suggested a few of the best-reviewed places nearby, based on my location.

I gulped down lunch a few minutes later, feeling as though both my hunger and my thirst for useless knowledge had been satiated.

Pizza minutiae aside, Google Assistant and its kin are quickly becoming the next big thing in technology, and for good reason. Not only can these artificially intelligent voice assistants (including Amazon’s Alexa, Microsoft’s Cortana and Apple’s Siri) provide us with trivia, they can also perform many genuinely useful tasks.

Voice assistants are increasingly able to manage our calendars, send messages and book tables at restaurants. They can learn our habits and anticipate what we might want: suggesting when we should leave for work, for instance, or automatically playing our favourite songs on the nearest speaker.

They’re also moving beyond smartphones. Soon, they’ll be found in everything from toasters and coffee makers to clothing and cars.

Alexa, for one, was omnipresent at this year’s Consumer Electronics Show in Las Vegas. Amazon’s voice AI, which works much like Google’s, was demonstrated in lamps, speakers, vacuums, cars and even refrigerators.

But the promise of intelligent voice assistants goes well beyond bringing inanimate objects to life; it will change how and when we use technology, and perhaps more importantly, who uses it. “There’s still a huge portion of the population that doesn’t know how to interact with technology,” says Alexander Wong, co-director of the Vision and Image Processing Research Group at the University of Waterloo. “When you get technology to the point where it can interact verbally, pretty much anybody can use it at any time.”

The underlying technology in voice assistants has been developing for decades, but it arguably made its first splash on Apple’s iPhone 4S in 2011. Siri wowed users, not only with “her” dictation abilities, but with her sense of humour. Siri, it turns out, was programmed with a large database of jokes.

The initial novelty wore off, though, and Siri became better known for her mistakes—confusing words like “goodbye” and “Dubai” for example—than her successes. Apple’s competitors, however, took notice. In 2014, Microsoft released Cortana for Windows, while Amazon’s Echo speaker—with Alexa built in—really propelled the market forward.

What set Alexa apart was that the technology—or rather the confluence of technologies that allow AI voice assistants to perform—had finally matured.

Voice AI requires four basic components: speech recognition (to understand what is being said), processing power (to crunch the information), a fast Internet connection (to relay data back and forth between the cloud and the device) and natural playback (so it doesn’t sound like a robot).
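As a rough illustration, the four components can be pictured as stages that a single request passes through. The Python sketch below is a toy, not any vendor's actual design: every function name is hypothetical, and each stage is a stub standing in for what would in practice be a trained model or a network call.

```python
# Toy sketch of the four components; all names here are hypothetical stand-ins.

def recognize_speech(audio: bytes) -> str:
    """Speech recognition: turn audio into text (stubbed as UTF-8 decoding)."""
    return audio.decode("utf-8").strip().lower()


def query_cloud(text: str) -> str:
    """Processing power plus the Internet connection: a dictionary lookup
    stands in for the round trip to a remote service that computes the answer."""
    answers = {
        "who invented hawaiian pizza?": "Sam Panopoulos says he did, in 1962.",
    }
    return answers.get(text, "Sorry, I don't know that one.")


def speak(reply: str) -> str:
    """Natural playback: a real assistant would synthesize audio here."""
    return f"[spoken] {reply}"


def handle_utterance(audio: bytes) -> str:
    """Run one request through all the stages in order."""
    return speak(query_cloud(recognize_speech(audio)))


print(handle_utterance(b"Who invented Hawaiian pizza?"))
```

The point of the sketch is only the ordering: nothing can be answered until the speech is understood, and nothing can be spoken until the answer comes back.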

The algorithms that drive assistants also require reams of data. The more they have, the better they get. Listen to a song once, for example, and an AI won’t know if you like it. But listen to it dozens of times and the AI will be able to reasonably identify it as one of your favourites. It’s the same basic principle for pretty much everything the assistants do.
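The song example can be sketched in a few lines of Python. This is a deliberately simplified illustration, assuming a plain play-count threshold; real assistants presumably weigh many more signals, and the threshold value and function name here are hypothetical.

```python
from collections import Counter

# Hypothetical cut-off standing in for "dozens of times".
FAVOURITE_THRESHOLD = 24


def favourites(play_log: list[str], threshold: int = FAVOURITE_THRESHOLD) -> set[str]:
    """Return the songs played at least `threshold` times."""
    counts = Counter(play_log)
    return {song for song, plays in counts.items() if plays >= threshold}


# One play says little; thirty plays is a strong hint.
log = ["Song A"] * 30 + ["Song B"]
print(favourites(log))
```

With a single play of "Song B" the function has nothing to go on, which is the article's point: the more data, the more confident the guess.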

Consumers, it seems, are intrigued. Echo speakers have been a hit for Amazon. The online retailer doesn’t disclose sales numbers, but it has identified them as among the most popular items on its website. External estimates suggest there are as many as 14 million Echo users in the United States alone, and that’s before Alexa makes its way into other appliances and cars.

Google, meanwhile, dabbled in different voice AI applications for years before finally settling on Google Assistant last fall. Launched in its Pixel phones and Google Home speakers, the assistant has received rave reviews for its accuracy and is seen as Alexa’s strongest competitor.

Refusing to sit on the sidelines, Samsung last fall acquired Viv, the San Jose, Calif.-based start-up launched by the creators of Siri. The electronics giant has plans to incorporate Viv into its many products, including phones, TVs and washing machines.

The global market for voice AI speakers is expected to grow at a compound annual rate of 43 per cent to reach US$2.1 billion by 2020, according to analysis firm Gartner. By then, 3.3 per cent of global households will have a speaker, and of those, a quarter will have more than one. A report from Global Market Insights expects AI voice assistants overall to be worth US$11 billion by 2024.
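To get a feel for what a 43 per cent compound annual rate means, the arithmetic can be worked backwards from the 2020 figure. The sketch below assumes a five-year forecast window, which the article does not state, so the implied starting value is illustrative only; the function name is likewise hypothetical.

```python
def implied_start_value(end_value: float, annual_rate: float, years: int) -> float:
    """Invert compound growth: start = end / (1 + rate) ** years."""
    return end_value / (1 + annual_rate) ** years


END_VALUE_BN = 2.1  # from the forecast: US$2.1 billion by 2020
RATE = 0.43         # from the forecast: 43 per cent a year
YEARS = 5           # assumption: the length of the window is not given in the article

start = implied_start_value(END_VALUE_BN, RATE, YEARS)

# 3.3 per cent of households own a speaker; a quarter of those own more than one.
multi_speaker_share = 0.033 * 0.25

print(f"Implied starting market: US${start:.2f} billion")
print(f"Households with more than one speaker: {multi_speaker_share:.3%}")
```

Under that five-year assumption the market would be growing roughly sixfold, while households with more than one speaker would still be well under one per cent of the global total.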

The next frontier, according to Gartner, will be cars. Ford, Volkswagen, Volvo, Hyundai, Fiat Chrysler, Nissan and BMW are just a few manufacturers who have announced plans to incorporate Alexa, Google Assistant and Cortana into their vehicles. Siri is also a core feature of Apple’s CarPlay, which is rolling out in many new vehicles.

Voice AI, it turns out, may be the ideal way to curb the proliferation of dashboard controls in cars. From GPS navigation to adjusting cabin temperature and music preferences, voice commands will likely be safer than manual buttons. “If you think about it, the automotive human-machine interface is kind of predestined for voice,” says Gartner research director Werner Goertz.

The same applies to many Internet-connected items that have hit the market in recent years, from appliances to wearable gadgets. Such products typically require a go-between such as a phone or tablet app for users to manage settings or glean results. Voice AI could, in the near future, obviate the need for those apps.

Scott Huffman, vice-president of engineering on Google Assistant, brings up his connected lights as an example. “If I want to change the colour to blue for dinnertime, I have to pull out my phone, open the app… It’s a big pain in the butt, so I’m not going to do it,” he says. “I wonder if voice and that ubiquitous ability to access it is going to cause [a] shift where I do more things because it’s easy.”

The technology’s promise won’t be realized without overcoming some obstacles, however. As with any data-hungry application, voice assistants are constantly creating new packets of information, highly personal in some cases, that could be accessed or exploited by authorities or hackers. And voice AI devices are always listening for their activation phrase, which makes them ripe for eavesdropping.

To underscore the point, investigators in Arkansas recently demanded that Amazon provide them with Echo voice records in a murder case. The company turned the data over after the suspect in question gave his permission to release it.

Understandably, two-thirds of consumers polled in a recent study by Gartner said they worried that their home devices could be used to listen in on their private conversations.

The technology has a trust issue to overcome, which means manufacturers must be clear about how user data is gathered, stored and used. “We hope that when we’re talking to this disembodied assistant that we have the same amount of privacy as we would if we were hiring our own personal assistant,” says Christopher Parsons, research associate at the University of Toronto’s Citizen Lab. “In order for that to have any hope of holding, these companies need to be more transparent.”

Wong’s team at the University of Waterloo is working on a potentially novel solution. Their approach centres around “operational” AI, or intelligent voice assistants that reside mostly on the devices themselves. Such localization would limit the need to send large amounts of personal user data over the Internet, where it can be intercepted. “We’re trying to take these giant brains and cram them down,” he says. “That would help mitigate some of the risk.”

AI experts say the idea has merit but also point out that assistants will always work better when connected to the Internet, where they can draw on massive processing and data resources in real time.

Google’s Huffman meanwhile accepts that companies could be doing more to educate the public on how to protect themselves. Google Assistant, for instance, allows users to view their usage logs and delete them if necessary. But because the option is new, many users aren’t making use of it. “We follow all the same sort of logging principles and controls that we do for Google Search,” Huffman says.

With adequate safeguards in place, AI voice assistants have the potential to expand and equalize access to information and tools, especially for those who have less experience with technology, including the elderly and people with mental or physical disabilities. “It’s a very different thing from installing an app or learning how to use a site,” Huffman says. “That’s potentially very powerful.” Isn’t that, after all, the whole point?

