“Dirty Water Restaurant has a table for 3 on Friday night, is that okay?” said the virtual assistant to the human.
No, this isn’t the start of a weird joke. It’s a response from one of Nuance’s conversational agents, which was demoed to me at Nuance Communications’ Sunnyvale headquarters this past August, and it has a unique talent for a virtual agent—it can respond to shifts in topic and continue a logical conversation. These shifts are still largely domain-based, but they nonetheless allow a human to carry on an active conversation when trying to decide, for example, where to make a dinner reservation in San Francisco.
Change your mind about the type of restaurant? That’s exactly how this conversation got started. The demonstrator decided that he was interested in eating out on Friday evening instead of over the weekend, and wanted Chinese instead of an upscale American restaurant. Instead of getting flustered, the agent switched gears, and its response echoed a couple of important cues. “How many people is the reservation for?” He hadn’t mentioned anything about reservations, but the agent inferred that a reservation would likely be needed at an upscale restaurant on a busy Friday night, and knew that it also needed to find out how many people would be dining.
The system doesn’t have to start from scratch each time you ask it a question. It knows what information to keep and what to update, applying those changing details to the ongoing conversation between virtual agent and human.
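To make the "keep some details, update others" idea concrete, here is a minimal sketch in Python. It is not Nuance's implementation; the `DialogueState` class, the slot names, and the `REQUIRED` list are all hypothetical, chosen only to mirror the restaurant example above.

```python
from dataclasses import dataclass, field


@dataclass
class DialogueState:
    """Details the agent remembers between turns (slot names are hypothetical)."""
    slots: dict = field(default_factory=dict)

    def update(self, **changes):
        """Overwrite only the slots mentioned in the latest turn; keep the rest."""
        self.slots.update(changes)
        return self

    def missing(self, required):
        """Return the required slots the agent still has to ask about."""
        return [name for name in required if name not in self.slots]


# Hypothetical list of details needed before a table can be booked.
REQUIRED = ["city", "cuisine", "day", "party_size"]

state = DialogueState()
# Turn 1: the user asks about an upscale American place over the weekend.
state.update(city="San Francisco", cuisine="American", day="weekend")
# Turn 2: the user changes their mind; only the changed details are overwritten.
state.update(cuisine="Chinese", day="Friday")

print(state.slots)
print(state.missing(REQUIRED))  # what the agent still needs to ask for
```

The city survives the change of plans untouched, and the one detail still missing is the party size, which is the question the agent asked in the demo.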
We humans (mostly) take our conversational abilities for granted, and we also don’t think about our questions and responses in a logical “if-then” fashion. Instead, we use context clues and make assumptions. Language is powerful; if it weren’t, we’d be inefficient and far less intelligent beings at best.
If our emerging chatbots and virtual agents want to keep up, they have to be smart enough to understand what details to include in context, which to ignore, and what other types of cues to use (such as slang expression, tone of voice, facial expression, etc.) in deciding the course of a conversation. “When people communicate, they’re trying to solve a problem together, and with a virtual system, it’s similar, we have to work together,” says Charlie Ortiz, director of the AI and Natural Language (NL) Processing Lab at Nuance.
A lot of progress has been made in AI and natural language processing (NLP) with “narrow” chatbots and agents. Nina was Nuance’s first implementation of a “smart virtual assistant”, one that Mark Hanson, Nuance’s Director of Design and Development, says “is able to do all of the things that an enterprise needs it to do...you can just ask your questions and tell the virtual assistant what you want.” Since then, Hanson says Nuance has seen some really cool implementations made with its technology. Domino’s Dom, for example, when launched in late 2014, was the first voice-ordering app of its kind in the food industry. Dom relays personalized orders, suggests other items to order or where to find coupons, and even injects some pizza-related humor while getting it done.
Jetstar’s Jess is another virtual assistant use case, one that’s been active since early 2014. The ‘Ask Jess’ service draws on Nuance’s NLP technology to get its assistant to understand ‘intent’ and not just words. “It’s really a seamless conversational experience that we’re trying to drive that simplifies the systems that we’re providing the user,” says Hanson. Jess was designed as a booking system and is versed in providing responses across more than 20 different topics. Assuming that Jess functions as planned, it seems easy to draw the correlation between its implementation and improved customer service. But that’s just one tiny slice of the whole pie of possibilities for both consumers and businesses.
As Hanson explains it, the more questions customers ask, the more Jess starts to “learn” what they’re asking. Customers can access Jess on the web or on mobile and ask very specific questions. Everyone is talking to the same virtual assistant (think of it as a centralized brain), so each conversation is with the same intelligent system rather than a separate agent. “We can start to centralize the intelligence and learn all sorts of new things about our customers; we get this learning all the time, which is, ‘Hey, I didn’t know my customers actually wanted to know that piece of information,’” says Hanson.
Take seat reservations, for example. One of the more interesting questions that people asked Jess was whether they could purchase a seat and not sit in it. It seems strange, maybe even a bit suspicious, at first glance. But based on the data collected through Jess, it turns out that this could be attributed to many brides who were interested in purchasing an extra seat for their wedding gowns when traveling to a destination wedding.
This type of specific information, which is not easily found or answered on the web (known as sparse data), is extremely useful and potentially very valuable to companies making operations and marketing decisions, but it was largely invisible before virtual chat agents like Jess came along. Virtual assistants can now be explicitly taught to respond to such unique questions, the ones that do not exist as labeled text on the web; scaling such a system across an enterprise, however, is a challenge.
“You come back to this question, well then how do we teach a system to do new things? Today, the answer is we build it, we explicitly teach it; the answer tomorrow is, it’s going to learn, but the question is how is it going to learn?” Hanson asks. Nuance’s AI and NLP lab has been hard at work on that problem, developing a way for a virtual chat agent to learn by observing human agents.
So, instead of having a human explicitly program ‘here’s an intent…and here’s how to resolve that intent’, Nuance is trying to have the agents emulate the intelligence of the humans who work inside a company’s contact center. “We have tooling today…so that when a virtual agent is asked something it doesn’t know the answer to…we actually capture the information, and we apply NLP to that so that we can start to group those unknowns together,” explains Hanson.
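As a rough illustration of what "grouping those unknowns together" could look like, the sketch below clusters captured questions by simple word overlap (Jaccard similarity). This is a stand-in, not Nuance's tooling: a production system would presumably use far richer NLP, and the function names, the sample questions, and the 0.2 threshold are all assumptions made for the example.

```python
import re


def tokens(text):
    """Lowercase a question and split it into a set of words."""
    return set(re.findall(r"[a-z']+", text.lower()))


def jaccard(a, b):
    """Share of words two questions have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_unknowns(questions, threshold=0.2):
    """Greedily place each question in the first group whose first
    question is similar enough; otherwise start a new group."""
    groups = []
    for question in questions:
        for group in groups:
            if jaccard(tokens(question), tokens(group[0])) >= threshold:
                group.append(question)
                break
        else:
            groups.append([question])
    return groups


# Hypothetical questions the agent could not answer.
unknowns = [
    "Can I buy a seat and not sit in it?",
    "Can I buy an extra seat for my wedding dress?",
    "Do you sell gift cards?",
]

for group in group_unknowns(unknowns):
    print(group)
```

With these inputs, the two seat questions end up in one group and the gift card question in another, which is the kind of pattern a company could then review and answer once.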
Here’s a simplified overview of how that gradual learning process works:
It’s all about the confidence threshold, and it’s a learning process that sounds a whole lot like a modified version of educational scaffolding.
Patterns of unique question “clusters” and answers are also delivered back to the company, which can then decide the best answer or action needed to resolve a particular pattern of intent. As with all technologies, it’s important for humans to maintain control, and a built-in auditing process allows humans to go into the system, see which answers it has learned, and make any necessary edits.
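The confidence threshold and the audit step can be sketched together in a few lines of Python. This is only a toy model under stated assumptions: the `LearningAgent` class, the 0.75 threshold, the 0.25 confidence step, and the intent name are all hypothetical, and the article does not describe how Nuance actually computes confidence.

```python
CONFIDENCE_THRESHOLD = 0.75  # hypothetical value, not from Nuance


class LearningAgent:
    """Toy agent that defers to a human until it is confident enough."""

    def __init__(self, threshold=CONFIDENCE_THRESHOLD):
        self.threshold = threshold
        self.answers = {}    # intent -> (answer, confidence)
        self.audit_log = []  # every learning step, for human review

    def respond(self, intent, ask_human):
        """Answer directly if confident; otherwise hand off and learn."""
        answer, confidence = self.answers.get(intent, (None, 0.0))
        if answer is not None and confidence >= self.threshold:
            return answer, "agent"
        human_answer = ask_human(intent)
        self.observe(intent, human_answer)
        return human_answer, "human"

    def observe(self, intent, human_answer, step=0.25):
        """Raise confidence when the human repeats the known answer;
        start over when the human gives a different one."""
        known, confidence = self.answers.get(intent, (None, 0.0))
        if known == human_answer:
            confidence = min(1.0, confidence + step)
        else:
            known, confidence = human_answer, step
        self.answers[intent] = (known, confidence)
        self.audit_log.append((intent, known, confidence))


agent = LearningAgent()


def human(intent):
    """Stand-in for a contact-center employee answering the question."""
    return "Yes, you can book an extra seat."


sources = [agent.respond("extra_seat", human)[1] for _ in range(4)]
print(sources)
print(agent.audit_log)
```

In this sketch the human handles the first three requests while the agent watches, and the agent takes over on the fourth. The audit log is what a reviewer would inspect to check or correct what the agent has picked up.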
Finding ways to efficiently scale this learning process is a current obstacle, as is creating a more “general” virtual assistant that can answer a wide variety of questions across domains and accomplish a range of different tasks, from ordering pizza to buying a plane ticket to sending a personalized card to a friend. “The main advance is going to be conversational systems, dialogue-based systems, that can engage with a user in a multi-utterance interaction, because that’s the way you and I talk, and we don’t want to try to teach people to talk in a different way,” says Ortiz. Related to that project focus is making these systems more “worldly”. In Ortiz’s words, “Imbuing systems with more and more of this world knowledge, common sense knowledge, that’s going to make systems slowly, but hopefully surely, more robust.”
It’s difficult to say when we’ll have a confident virtual agent that can follow our shifts in thought and talk about anything under the sun, but it’s no secret that Nuance and other similar companies in the field are hard at work on making this a reality.