On May 7th, 2013, Ron Kaplan spoke at the CMU Silicon Valley Campus and claimed that now is the time for the “Conversational User Interface (CUI).” He claimed the Graphical User Interface (GUI) has “topped out” and showed an old cluttered and complex version of Expedia’s GUI to prove his point.
Forty years ago, Ron and colleagues at Xerox PARC published a paper about an early intelligent assistant called GUS (Genial Understander System) that helped a user plan a flight from Palo Alto to San Diego. However, it is only recently that the Conversational User Interface has become possible. Forty years ago, the quality of speech recognition was poor, and progress was slowed by ambiguity in natural language, by the sheer complexity of language, by the telegraphic nature of conversation, and by the fact that so much of what is communicated is unspoken. Ron argued that advances in speech recognition and in computational linguistics will soon make Conversational User Interfaces a reality.
Siri has already brought us out of the AI Winter and into a NL Spring. Ron said this is the time to be in Natural Language.
The proliferation of smartphones, the ubiquity of computing, and the emerging Internet of Things are introducing more complexity and confusion into our environment. As we are surrounded by ever more complex gadgets, from remote controls to thermostats, we need to be able to say simple things to them and have them understand what we want and do it.
For most applications, perfection is not required. Even people frequently fail to understand each other, but these failures can be overcome.
Ron distinguished between two approaches, which are valuable in different situations and should be combined. Learning by observation (the data-driven approach) is good for classification and correlation problems, including search and machine translation. Interpretation and transformation problems, on the other hand, will need learning by instruction. The trend has been toward throwing data at every problem, but for some things, like subject-verb agreement in English, it is better to do something closer to knowledge engineering, using computational linguistics formalisms rather than zillions of examples to train the system.
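To make the "learning by instruction" point concrete, here is a minimal, hypothetical sketch of what a rule-based treatment of English subject-verb agreement might look like: a handful of explicit rules rather than a model trained on millions of examples. The function name, word lists, and heuristics are all illustrative, not part of any actual system Ron described.

```python
def agrees(subject: str, verb: str) -> bool:
    """Check present-tense subject-verb agreement for a toy fragment of English."""
    singular_pronouns = {"he", "she", "it"}
    plural_pronouns = {"we", "they", "you", "i"}

    if subject.lower() in singular_pronouns:
        subject_is_singular = True
    elif subject.lower() in plural_pronouns:
        subject_is_singular = False
    else:
        # Crude heuristic for nouns: a trailing "s" suggests plural.
        subject_is_singular = not subject.lower().endswith("s")

    # In the present tense, a third-person singular subject takes a verb
    # ending in "s" ("the dog runs"); a plural subject does not ("the dogs run").
    verb_is_singular = verb.lower().endswith("s")
    return subject_is_singular == verb_is_singular

print(agrees("dog", "runs"))   # True
print(agrees("dogs", "runs"))  # False
```

Even this toy version shows the trade-off: a few lines of encoded linguistic knowledge cover the common cases directly, while a purely data-driven system would need many training examples to learn the same regularity.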
Ron is working on developing Conversational User Interfaces at a new laboratory that he founded: Nuance’s Silicon Valley Laboratory for Natural Language Understanding. Nuance started with speech recognition, but they recognize that speech is not the end; it is just the beginning, and there is more value in going beyond simply transcribing speech to text.
Nuance’s NLU lab is selecting “best of breed” components and assembling them into an integrated system. The modules should be aware of ambiguity and should work together to resolve it. They should avoid premature resolution, which can eliminate the correct interpretation, and they should also avoid a proliferation of ambiguity.
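The pipeline design described above can be sketched in a few lines. In this hypothetical example, an early module returns all plausible interpretations instead of committing to one, and a later module prunes the set only when context gives it evidence to do so. All module names and data here are illustrative, not Nuance's actual architecture.

```python
def tokenize(utterance):
    # A real tokenizer might itself introduce ambiguity; here it is deterministic.
    return utterance.lower().split()

def parse(tokens):
    # Return ALL plausible interpretations rather than picking one,
    # avoiding premature resolution that could drop the correct reading.
    interpretations = []
    if "book" in tokens:
        interpretations.append({"intent": "reserve"})          # "book" as a verb
        interpretations.append({"intent": "reading_material"}) # "book" as a noun
    return interpretations

def resolve(interpretations, context):
    # A later module uses context to prune candidates, keeping the set
    # small and preventing ambiguity from proliferating downstream.
    if context.get("domain") == "restaurants":
        survivors = [i for i in interpretations if i["intent"] == "reserve"]
    else:
        survivors = interpretations
    return survivors or interpretations  # never prune the set to nothing

candidates = parse(tokenize("Book a table for two"))
final = resolve(candidates, context={"domain": "restaurants"})
print(final)  # [{'intent': 'reserve'}]
```

The key design choice is that disambiguation is deferred: each module passes a set of candidates forward, so the correct interpretation survives until enough evidence arrives to select it.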
The lab is a year old and has grown rapidly from 0 to 20 people. There are three teams: a core natural language team; an AI, reasoning, discourse, and dialog team; and a third team that “puts pressure on the first two teams.” The first two teams will solve general problems, while the third team will pick specific domains and show that things work in specific applications. He described a restaurant reservation assistant and gave examples of some of the kinds of general problems the lab is working on.
Ron ended by saying “the killer app for natural language and AI is the Conversational User Interface.”