This startup raised $30 million for a robot secretary (guess how they programmed him/her to make small talk)

04/25/2016 12:37 pm ET Updated Dec 06, 2017

Robots are taking our jobs. Or are they?

Dennis Mortensen, the CEO of x.ai, is making one of those artificial intelligence bots. He sat down with Lars Gaede to preview the topics he will cover at the Work Awesome conference in June.

Building artificial intelligence: Dennis Mortensen (CEO x.ai)

Lars Gaede: Dennis, your company x.ai offers an artificially intelligent personal assistant that makes, changes and cancels appointments via e-mail. Your customers tell the virtual assistant, Andrew (or Amy), in an e-mail what to do, and the assistant immediately contacts the right people and converses with them until the appointment is set. The e-mails the assistant sends out sound absolutely natural, as if they were written by a real human. How did you decide to humanize your technology and make people believe they are dealing with an actual human being?
Dennis Mortensen: Whenever you are designing an artificially intelligent agent, that really is the first decision to make: Do you want to humanize the agent as Apple did with Siri, as Microsoft did with Cortana and as we do with Amy and Andrew - or do you prefer to make it feel like software and call it Google Now, for example? And then you have to go all the way, one way or the other. Because if the interface you use is natural language, making the assistant sound only a little bit human is very weird.

Lars: Because then it feels like a clumsy robot is writing to you?
Dennis: Something like that. And that serves no purpose. We decided to humanize the assistant because we think there is huge value in that: People build relationships with the assistants when they appear as a real persona. Our assistant has a full name - Amy or Andrew Ingram - it remembers who you are and it will be the same persona when you get back to her the next day. And it works. Our customers almost always refer to Amy and Andrew as "she" or "he", not as "it". That I find very fascinating. It means that even if people know they are dealing with a machine, they decide not to treat it as one. They speak with their assistant like they would speak with a human.

Lars: How did you teach the software to sound human?
Dennis: We spent over 2 years trying to bring Amy and Andrew to life. We looked at tens of thousands of mails that were sent to set up meetings and we tried to map out the dialog flow. Not so much the individual words and sentences themselves. What does one person say? What does the other person answer? How do they conclude? How does the whole negotiation work? And then we tried to replicate that.

Lars: How? By writing a ton of templates?
Dennis: Yes, but here is the challenge: templates can quickly turn into something very robotic, like on some phone system. You press one and a human voice says something. Then you press two and the human voice says something else. But at no point do you feel like you're in a conversation with another individual. So we've invested heavily in creating dynamic templates that do not stand on their own, but are part of a bigger whole. Meaning that if you speak to Amy on a Monday, and you speak to her again on a Wednesday or three weeks from now, you should not see that as three distinct templates, but as one whole being. Achieving that is super difficult. And interestingly, for this particular challenge we ended up hiring not engineers, but somebody who has written plays and whose educational background is in mythology. Somebody who knows how to study and create characters.

Lars: Interesting! Why was this important?
Dennis: Just like when you go see a play on Broadway, you expect to build some relationship with the character. Between the moment he walks on stage and the moment he walks off two hours later, he should have moved something within you. It's not just that he said a lot of words and you took note of those words. There is more to it than that. We have tried to apply some of the same philosophy to how we've built Amy and Andrew.

Lars: How does all this work on the more technical side? How exactly do you make the assistant understand what's written in the emails? And then write answers that make sense?
Dennis: The part that makes it hard is that we need to have near-100% accuracy. When you search on Google and they don't completely understand what you ask for, they'll still give you a set of web results. They don't need 100% accuracy. For us, we need to make sure that we understood that you want a meeting with Dennis, for example. It needs to be this week. It has to happen over Skype. Dennis is in a different time zone and so on. We cannot not understand that. That puts a lot of stress on the system. What makes it easier, and I'm putting that in big quotes here, is that we work within a specific domain, that of scheduling meetings.

Lars: Why does that make it easier?
Dennis: It means that just because you start to talk about how awesome your favourite team played on Saturday, Amy can choose to somewhat disregard that. She's not into the chit-chat type of conversation. She's only into chit-chat about "So when are we going to meet up with Dennis?".
So the first thing you have to do, if you want to go create an intelligent agent, is come up with a complete conceptual description of the universe in which you operate. You need to know all the boundaries of that universe. If you don't, you cannot predict all outcomes within it. So you have to define: what is a meeting? Then you have to come up with some kind of a scheme for how to assign data into that model. And then the next question is: on what data set do you model? There was no meeting scheduling data set that we could acquire and start to work on. Just like there's no self-driving car data set that you can go acquire and then start to build your own self-driving car. If you want to do a self-driving car, you've got to buy a Toyota and put a camera on top of it and drive around the parking lot a little bit to get some data.
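
To make the idea of "defining what a meeting is" concrete, here is a minimal sketch in Python. It is not x.ai's actual model; every name in it (`MeetingRequest`, `Participant`, `Medium`, `missing_fields`) is a hypothetical illustration, and the time zone is reduced to a fixed UTC offset for brevity. The point is only that a bounded domain lets you list what must be known before a meeting can be scheduled.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Medium(Enum):
    """How the meeting takes place (illustrative subset)."""
    IN_PERSON = "in_person"
    SKYPE = "skype"
    PHONE = "phone"


@dataclass
class Participant:
    name: str
    email: str
    utc_offset_hours: int  # assumption: fixed offset instead of a real time zone


@dataclass
class MeetingRequest:
    host: Participant
    guests: List[Participant]
    medium: Medium = Medium.IN_PERSON
    timeframe: Optional[str] = None  # e.g. "this week", left unparsed here
    location: Optional[str] = None

    def missing_fields(self) -> List[str]:
        """Return the pieces of information still needed to schedule."""
        missing = []
        if not self.guests:
            missing.append("guests")
        if self.timeframe is None:
            missing.append("timeframe")
        # Only an in-person meeting needs a physical location.
        if self.medium is Medium.IN_PERSON and self.location is None:
            missing.append("location")
        return missing


request = MeetingRequest(
    host=Participant("Lars", "lars@example.com", 1),
    guests=[Participant("Dennis", "dennis@example.com", -5)],
    medium=Medium.SKYPE,
    timeframe="this week",
)
print(request.missing_fields())  # prints []
```

Because the Skype request already names a guest and a timeframe, nothing is missing; the same request as an in-person meeting would still need a location.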

Lars: So I guess you scheduled a lot of meetings.
Dennis: Exactly. Two years ago, we scheduled a first meeting to see what data came back, then we scheduled another one, then another one. Millions of meetings later, we now have a data set so well annotated and labeled that we can build models on top of it and start to fully emulate what you would otherwise have done as a human. We basically handcrafted models for each one of the skills we needed Amy to have. We wanted Amy to understand that someone is running late, so we built a model for that. We wanted Amy to understand that a Skype meeting does not require the people to show up in person. So we built a model for that. That is the only way we can reach the level of accuracy that we're hoping for. Then we start to put these models into the world. We figure out when we make mistakes. We re-annotate the data. We re-model and do it over, and over, and over again. That's called supervised learning. And that's how you build many intelligent systems.
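
The loop Dennis describes (train on labeled data, deploy, find mistakes, re-annotate, retrain) can be sketched in a few lines. This is an illustration under stated assumptions, not x.ai's system: the classifier is a toy bag-of-words Naive Bayes chosen only because it fits in a short example, the e-mails and the labels `running_late` and `other` are invented, and the helpers `tokenize`, `train` and `predict` are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Set, Tuple

Example = Tuple[str, str]  # (e-mail text, human-assigned label)
Model = Tuple[Dict[str, Counter], Counter, Set[str]]


def tokenize(text: str) -> List[str]:
    return text.lower().replace(",", " ").replace(".", " ").split()


def train(examples: List[Example]) -> Model:
    """Count words per label; that is all this toy model 'learns'."""
    word_counts: Dict[str, Counter] = {}
    label_counts: Counter = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocab


def predict(model: Model, text: str) -> str:
    """Pick the label with the highest Laplace-smoothed log-probability."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = "", -math.inf
    for label in sorted(label_counts):
        counts = word_counts[label]
        denom = sum(counts.values()) + len(vocab)
        score = math.log(label_counts[label] / total)
        for word in tokenize(text):
            if word in vocab:  # words never seen in training are ignored
                score += math.log((counts[word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


labeled: List[Example] = [
    ("I am running late, sorry", "running_late"),
    ("Stuck in traffic, running ten minutes late", "running_late"),
    ("Can we meet on Tuesday", "other"),
    ("Skype works for me this week", "other"),
    ("Does Thursday noon work for you", "other"),
]
model = train(labeled)

# In production the model meets wording it has never seen and gets it wrong.
new_email = "Held up at the airport, be there soon"
first_guess = predict(model, new_email)

# A human annotates the mistake, the example joins the data set, we retrain.
labeled.append((new_email, "running_late"))
model = train(labeled)
second_guess = predict(model, new_email)

print(first_guess, second_guess)  # prints: other running_late
```

The first guess is wrong because none of the words in the new e-mail appear in the training data, so the model falls back on the more common label. Once the corrected example is added and the model is retrained, the same e-mail is classified as intended.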

Lars: What are the boundaries of that technology? What happens, when it gets too complicated for the assistant?
Dennis: There are multiple worst-case scenarios here. One is that we can misunderstand what you ask us to do and all of a sudden you end up heading to mid-town Manhattan and nobody is waiting for you. That's very unfortunate. It could also be that Amy sets something up and you have to tell her, "Oh, no, no, no, Lars is actually in Europe. Don't do this at 5:00 PM EST, because Lars is about to go to bed by then." There's certainly room for the assistant to make errors that need to be corrected. We just need to make sure that this happens less often than it would with a human assistant, who, by the way, sometimes also needs to be corrected because you forgot to tell her that your friend is in Singapore or Lars is in Germany. Plus, in other respects, Amy and Andrew are so much better than humans.

Lars: For example?
Dennis: They have total memory: once you have told them Lars is in Berlin, they will never forget that. They work 7 days a week. They do not ask for $50,000 a year, they don't ask for a coffee break, they never complain. Instead, they can do things that are difficult for humans, like setting up a meeting across four different time zones in multiple languages and keeping control of that. And think of the network effects! When many people have the same assistant - the same Amy - making appointments will be done in a split second.

Lars: That may sound great to you. But perhaps not to the people whose jobs go away - the human personal assistants out there.
Dennis: Well, you and I might be in industries where we see personal assistants, and if some of those people who used to have a human assistant get access to Amy and Andrew, good for them. But in the rest of the world today, personal assistants almost don't exist. The assistant segment, the people who have human assistants basically, is tiny. The majority of all meetings are being set up by individuals themselves: you and me, your mom, your dad. We want to democratize the idea of the personal assistant so that it's not for a few select people but for everybody. In addition to that: you should do a quick study and ask any personal assistant: "Don't you just love doing email ping-pong during the week with this set of people who don't respond to your questions, so that you have to call them up?" No, they don't. They hate it. And they would be happy to do other things.

Lars: What do you think is the more general impact of A.I. at the workplace? Amy and Andrew are not the only smart tools that have the potential to replace humans.
Dennis: You are referring to the idea that there could be a future, a decade or two from now, where there is mass unemployment. I am very optimistic that that will not be the case. I'm very much a believer in the fact that technology in its wildest interpretation is all for the better. The things that disappear are things that we probably shouldn't be doing to begin with or aren't even worthy of a human doing them. I can't play out the scenario in which I wake up one morning, 14 years from now, and we have 30 million unemployed people in the US. It just doesn't compute for me. What we're including in that idea is that humans at some point reach the end of their imagination.

Lars and Dennis will be with me at Work Awesome on June 23rd in New York. Will you be there?