Bing Upgrades Text-To-Speech, Expands Intelligent Answers, Improves Visual Search

Bing has announced major upgrades to its text-to-speech capabilities in voice search, its visual search capabilities, and its intelligent answers.

According to the company, Bing’s intelligent answers have become smarter by leveraging its own deep learning models. Bing says that advancements in GPU technology allow it to answer “harder questions than ever before.”

“Instead of the relatively simple answer to ‘what is the capital of Bangladesh’, Bing can now provide answers to more complex questions, such as ‘what are different types of lighting for a living room’, quicker than before,” the company said.

When responding to a user query, the voice that Bing uses sounds more human-like than ever before. Bing said it “can speak answers to your queries back to you with a voice that’s nearly indistinguishable from a human’s.” According to Bing, this is achieved through its AI’s human-like intonation and clear articulation of words.

You can hear this in action on the Bing blog.

Bing says it has also made “huge strides in efficiency and coverage” in its visual search capabilities.

The example given by Bing is “if you see an image of an accent light you like, Bing can show visually-similar decor and even show purchase options at different price points if the item is available online. To save you time, visual search also automatically detects and places clickable hotspots over important objects you may want to search for next.”

According to Bing, all of these improvements were possible only because of Azure N-series virtual machines running NVIDIA GPUs.

