With the launch of Bixby and reports that Samsung is building its own competitor to Amazon’s Echo, the consumer electronics giant has now made an acquisition that could help power its next generation of voice-powered services. Samsung has acquired Innoetics, a startup out of Greece that has developed text-to-speech and voice-to-speech technology that can, among other things, listen to a person speaking, train on what that person is saying, and then read out a piece of completely unrelated text in that same voice.
Innoetics had been working primarily on B2B services up to now, with telcos and other businesses using its tech by way of a set of APIs. Innoetics has now posted a note on the homepage of its website announcing that these B2B services have been discontinued.
It’s not clear yet what Samsung plans to do with the tech, but according to one person, “it is perfectly suited for consumer services.” In other words, we could see it working with Bixby, or a new piece of hardware, or something for Samsung’s extensive mobile handset business, or all of the above. Or it could be something else entirely, given Samsung’s reach into so many other areas of consumer electronics. In any case, Samsung plans to keep Innoetics and its 8-10 employees (the higher number includes contractors) based in Athens as a subsidiary of its wider business.
Terms of the deal — which officially closed last Friday — have not been disclosed, but we understand that it’s one of the bigger exits for a tech startup in Greece. Sources tell us that Innoetics went for less than the amount Daimler paid for Taxibeat, an Uber rival that it acquired earlier this year for around €40 million ($43 million).
Samsung acquiring Innoetics follows other acquisitions it has made in the area of voice-based technology — namely, in October last year, Samsung bought the personal assistant startup Viv, which it used to help build Bixby.
(Samsung has incubated and acquired other kinds of tech, too, such as its recent move to pick up VRB, a VR startup that it funded and incubated in Samsung Next.)
Innoetics started as a spinout from the Athena Research and Innovation Center, a research institute in Athens that includes a department focused on speech and language processing. The Athena RIC announced the acquisition itself. We have also contacted Samsung for a comment.
Notably, Innoetics was completely bootstrapped since being founded in 2006 by Aimilios Chalamandaris, Pirros Tsiakoulis, Sotiris Karabetsos, and Spyros Raptis. The company had actually been in the process of getting rebooted and was seeking VC funding when it first started talking to Samsung. Initially, conversations started around a potential partnership before the two entered into acquisition talks.
We have seen a huge boom in voice-powered technology, from personal assistants (not just Bixby, but Apple’s Siri, Microsoft’s Cortana and many more) to hardware like Google Home and Amazon’s Echo range, all of which are using innovations in machine learning and other AI tools, as well as advances in natural language processing, to become more lifelike and useful.
Innoetics’ technology is an interesting complement to all of this. Many services today are built around single languages before getting adapted, slowly, to more; and they are all basically built around one familiar voice (recall the hot pursuit that finally found the “voice of Siri”). Innoetics currently supports not one but 19 different languages, including English and (naturally) Greek, German and several dialects of Hindi.
“The team has amazing foundational technology in text-to-speech,” says Mallios Kostas, an ex-Microsoftie from Seattle who had started working with the company six months ago as an advisor and ended up helping lead the sale to Samsung. He said Innoetics has “huge capabilities” to increase the languages covered by its tech, and was on track to double or even triple the base. “Their synthesized voices are so accurate you almost can’t tell the difference between it and the real voice.”
Longer term, this may also raise security questions, of course: the smarter AI gets, the more likely it is that malicious hackers and others might use it for nefarious ends, and one of those ends could be in areas like identity theft. Tracking and mimicking people’s voices could be an obvious component of that.
“As synthesized voice has become more human sounding, security is something that will need to be dealt with,” Mallios said. “We’re not quite there yet but I can guarantee that large companies are thinking about how to address that, too.”