If, like 30% of Macworld readers who own an iPhone 4S, you think that Siri is rubbish, you may be surprised to hear that industry experts are predicting that in a few years’ time we will all be talking to lifts and our stereo systems, and feeling puzzled if they don’t respond to us.
Asymco analyst Horace Dediu thinks that the next new input method “will be voice” rather than the touch screen, and he suggests that while it’s still early days for Siri, the technology has the potential to disrupt the industry.
Not surprisingly, William Tunstall-Pedoe, the Cambridge-based entrepreneur and businessman who founded Siri-competitor Evi, agrees. “In the future absolutely everything will be controlled by voice,” he told Macworld UK. “You’ll be surprised in the future when you talk to your television set and it doesn’t do anything. It will be the way that everything is controlled.”
“Right now you see kids trying to swipe picture frames and wondering why it doesn't move. I think in the future people will be surprised when we stand in front of a piece of technology and ask it to do something and it doesn't respond. That will be odd,” he added.
“The future has always been about speech. Speech is the natural way to communicate. It’s a totally natural human way of asking for information, communicating information, and making things happen,” he continued.
Breakthrough speech interpretation technology
It was only a matter of time before we would start to communicate with computers using voice; the only thing stopping us was the inability of speech recognition technology to understand our meaning. Until now, that is. Tunstall-Pedoe revealed that we are at an exciting time when the technologies that enable us to have a natural conversation with a computer have turned the corner in terms of being “practical and useful”.
Tunstall-Pedoe explained: “The only reason why computers aren't controlled by speech is because the technology up until very recently hasn’t been good enough. That's the only reason. And what’s happened in the last few years is that speech recognition - the technology that turns sound into text - has improved, and also natural language understanding - the piece that takes that text and makes sense of it and answers questions (which is the technology that Siri and our company, Evi, uses) has also become available.”
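The two stages Tunstall-Pedoe describes - speech recognition turning sound into text, then natural language understanding making sense of that text - can be sketched roughly as follows. This is a toy illustration only; the function names and the pattern-matching logic are our own assumptions, not code from Siri, Evi, or Nuance.

```python
# A minimal sketch of the two-stage voice pipeline described above.
# Both stages are hypothetical stand-ins for real services.

def recognise_speech(audio: bytes) -> str:
    """Stage 1: speech recognition - turn sound into text.
    (In practice this is a cloud service such as Nuance's; stubbed here.)"""
    return "how tall is the eiffel tower"

def understand(text: str) -> dict:
    """Stage 2: natural language understanding - make sense of the text.
    A real system maps millions of phrasings onto one meaning;
    this toy version only spots a single question pattern."""
    if text.startswith("how tall is "):
        return {"intent": "lookup_fact",
                "attribute": "height",
                "entity": text.removeprefix("how tall is ").strip()}
    return {"intent": "unknown", "query": text}

meaning = understand(recognise_speech(b"...audio..."))
print(meaning["intent"])  # lookup_fact
```

The point of the sketch is the division of labour: stage one is a commodity both products buy in, while stage two - turning text into a structured meaning - is where each company’s own technology lives.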
Both Evi and Siri use Nuance’s voice recognition technology, but it’s the technology that understands the meaning of what you say that is the breakthrough. Tunstall-Pedoe told us: “Our technology is about understanding the world, understanding what the user means, making sense of it, and responding directly. There are questions you can ask that can be phrased in millions of ways. You have to intuitively understand what a question means, no matter how it’s phrased. It’s very difficult to get a computer to do that. And similarly you and I have got lots and lots of knowledge in our brains, common sense knowledge about the world, and we can draw on that knowledge, but that’s extremely challenging for a computer; computers don’t usually store knowledge in a way that they can process and understand.”
Evi’s technology “can store knowledge and respond directly to questions,” explained Tunstall-Pedoe. Evi is “improving constantly”, he added. “The server, what she knows, what’s in her brain, is improving every day. So she is constantly getting better and better. She constantly knows more and more and is able to do more and more,” he claimed.
“Evi’s core technology is very much about understanding what the user means to a very high level, and often answering directly,” explained Tunstall-Pedoe. “Evi knows 700,000,000 facts and she can use that knowledge to answer directly, to give answers directly back to the user, and have a conversation with the user. And she can combine facts to produce answers as well.”
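Tunstall-Pedoe’s description of storing facts and combining them to produce answers can be illustrated with a toy knowledge base. The triple structure, the sample facts, and the helper names below are purely illustrative assumptions, not Evi’s actual data model.

```python
# A toy knowledge base in the spirit of "storing facts and combining
# them to answer questions". Entirely illustrative.

FACTS = {
    ("Paris", "is_capital_of"): "France",
    ("Eiffel Tower", "located_in"): "Paris",
}

def lookup(entity: str, relation: str):
    """Answer a direct question from a single stored fact."""
    return FACTS.get((entity, relation))

def country_of(landmark: str):
    """Combine two facts: find the landmark's city, then the
    country that city is the capital of."""
    city = lookup(landmark, "located_in")
    return lookup(city, "is_capital_of") if city else None

print(country_of("Eiffel Tower"))  # France
```

No single stored fact says which country the Eiffel Tower is in; the answer comes from chaining two facts together, which is the kind of combination the quote describes, at vastly smaller scale.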
Siri versus Evi
So how is that different from Apple’s Siri, we wondered. One big difference is that Evi users can give direct feedback via thumbs-up and thumbs-down buttons, noted Tunstall-Pedoe.
That’s in addition to Evi learning as people use her, building up her database, just as is the case with Siri. User input is “very helpful in terms of her being able to learn and grow,” explained Tunstall-Pedoe.
The other big difference between Evi and Siri is that “Evi is available on every iPhone, and every iPad, and all Android phones,” he noted, unlike Siri, which is currently only on the iPhone 4S, though it is coming to the new iPad with iOS 6. “Siri is an Apple product so she will never be available on anything other than Apple products, and top-end Apple products only,” Tunstall-Pedoe speculated.
Will this limit the Siri database’s ability to learn, we asked? “I’m not sure about that,” said Tunstall-Pedoe, “but it’s a difference between the two products.”
Another difference, far more important according to Tunstall-Pedoe, is “what drives both products”. He explained: “Siri is very much about finding an external service to call. There are a number of things that Siri can do, and each of those involves a partner for Apple to do them: the weather, stock prices, local search. Each of those is a vertical that Siri can do.”
“We obviously call out to external APIs, and external partners as well. But our core technology is different and if you use both products you see that. You get more of a feeling of understanding from Evi because she understands what the questions are about. There’s a difference in the personality and the feel of the products that comes from the underlying tech,” claimed Tunstall-Pedoe.
What if Apple opened up its API so developers could use Siri, we asked? Would that improve Siri and be a limitation for Evi? “Obviously there is speculation about what Apple might do for developers with Siri; obviously that hasn’t been confirmed. But that isn’t going to change the core technology of Siri. Allowing developers to get their apps into Siri isn’t going to change the way the core technology works,” was Tunstall-Pedoe’s answer.
We noted that most of the time Siri can’t answer your question and sends you to Google. “Exactly. It’s saying, I can’t do it, why don’t you try a search,” he replied.
We speculated that while some people suggest that Siri shipped too soon, the service needed to ship in beta form in order to build up its data banks. “I agree with that completely,” answered Tunstall-Pedoe.
However, there has been speculation that the version of Siri Apple shipped on the iPhone 4S wasn’t as good as the original Siri app that Apple purchased. In fact, Apple’s co-founder Steve Wozniak criticised Siri for not being as good since Apple bought it.
Woz claimed that before Apple bought Siri the service would return useful results and “that was pretty incredible”. “This was the future: speaking things in normal ways, feeling like you’re talking to a human and how Siri was the greatest program,” Wozniak said.
Tunstall-Pedoe explained the background: “The Siri app that the start-up Siri Inc had is a different product to the Siri that’s in the iPhone. In many ways it’s now more limited, but it’s embedded in the operating system. So the Siri that you currently see is part of iOS, while the original Siri Inc product was an app like any other app, which actually did a lot more. When Apple bought it they essentially started again. I can’t comment for Apple on why that was. But there is some truth to that. The technology is still the same,” he added.
Cloud security and voice recognition
Another issue that has emerged is whether there is a security risk associated with this new world of voice interpretation, which requires our voice requests to be sent to servers in the cloud for interpretation. F-Secure issued a warning back in June that when your voice is sent out to servers for interpretation, it could allow for phishing scams and security breaches. IBM has already banned its staff from using Siri for that very reason.
Why is this a problem, we asked? Tunstall-Pedoe explained: “At the moment speech recognition needs to happen in the cloud; it needs big computers for processing. With Siri your voice gets sent off to the cloud for processing. So if you are very security conscious it’s not possible to confine that within the building, which is what I suspect is behind IBM’s concerns.”
“But this is a general problem with the cloud,” he added: “Just think of the amount of information that Google has. Search details, along with the IP address, emails, all stored on Google servers. This is a fundamental problem with the cloud. If you use any kind of cloud service, the data is transmitted to the cloud and logged there in order for it to work.”
“Google has phenomenal amounts of personal information,” added Tunstall-Pedoe. “They know more about you than you do. Facebook is the same. Just think how much information there is on the Google servers for Gmail. The comforting thing is that obviously their reputation depends on not breaching that trust; if there was a scandal about it that would be absolutely disastrous for them. So they have every incentive to maintain their users’ trust.”
However, Tunstall-Pedoe agreed: “The fact is that we are going to have to find ways of solving that, perhaps by keeping the data separate from the service providers, but the service providers need the data in order to provide the service. This isn’t a problem that’s specific to Siri; it’s a problem with cloud computing.”