Yandex has announced on its Russian blog the launch of a new algorithm aimed at improving how it handles long-tail queries. The new algorithm is called Palekh, after a world-famous Russian town that has a firebird on its coat of arms.

It’s no coincidence that Yandex chose the name Palekh: the firebird has a long tail, and Yandex, the largest Russian search engine, used it as a code name for long-tail queries. Long-tail queries are searches made up of several words entered into the search box, and they are seen more often these days in voice queries. About 100 million queries per day fall under “long-tail” within the Yandex search engine.

With Palekh, Yandex began using neural networks as one of its 1,500 ranking factors. A Yandex spokesperson said that they “managed to teach our neural networks to see the connections between a query and a document even if they don’t contain common words.” This was done by “converting the words from billions of search queries into numbers (with groups of 300 each) and putting them in 300-dimensional space — now every document has its own vector in that space.” They added, “If the numbers of a query and numbers of a document are near each other in that space, then the result is relevant.”
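
To make the idea of "nearness" in a 300-dimensional space more concrete, here is a minimal sketch in Python. It is not Yandex's implementation: the use of cosine similarity as the distance measure, the function names, and the randomly generated toy vectors are all assumptions made for illustration. The only detail taken from the quote is the 300-dimensional space.

```python
import math
import random

DIM = 300  # dimensionality mentioned in the Yandex quote


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (assumed measure of 'nearness')."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in doc_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy data: hypothetical vectors standing in for what a trained network would produce.
rng = random.Random(0)
query = [rng.gauss(0, 1) for _ in range(DIM)]
near_doc = [q + 0.1 * rng.gauss(0, 1) for q in query]  # small perturbation of the query
far_doc = [rng.gauss(0, 1) for _ in range(DIM)]        # unrelated random vector

ranking = rank_documents(query, {"near_doc": near_doc, "far_doc": far_doc})
for doc_id, score in ranking:
    print(doc_id, round(score, 3))
```

In this toy setup the document whose vector lies close to the query vector ranks first, even though no words are compared at all, which is the property the spokesperson describes.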

When asked whether they use machine learning, Yandex confirmed that they do, and explained that teaching their “neural network based on these queries will lead to some advancements in answering conversational based queries in the future.” They added that they “also have many targets (long click prediction, CTR, “click or not click” models and so on) that are teaching our neural network — our research has showed that using more targets is more effective.”
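
One common way to train a single model against several targets at once is to add up a separate loss for each target. The sketch below shows that general technique in Python. Yandex has not said how it combines its targets, so the weighted sum, the choice of binary cross-entropy, the target names, and the weights are all hypothetical.

```python
import math


def binary_cross_entropy(predicted, actual):
    """Loss for one yes/no target; predicted is a probability, actual is 0 or 1."""
    eps = 1e-12  # keep the probability away from 0 and 1 so log() stays finite
    p = min(max(predicted, eps), 1.0 - eps)
    return -(actual * math.log(p) + (1 - actual) * math.log(1.0 - p))


def multi_target_loss(predictions, labels, weights):
    """Weighted sum of per-target losses (an assumed way of combining targets)."""
    return sum(weights[name] * binary_cross_entropy(predictions[name], labels[name])
               for name in labels)


# Hypothetical example: one query-document pair with two of the targets named in the quote.
predictions = {"long_click": 0.8, "click": 0.6}  # model's predicted probabilities
labels = {"long_click": 1, "click": 0}           # what the user actually did
weights = {"long_click": 1.0, "click": 0.5}      # illustrative weights, not Yandex's

loss = multi_target_loss(predictions, labels, weights)
print(round(loss, 4))
```

Each target contributes its own error signal, so a model trained this way is pushed to be right about several kinds of user behavior rather than just one.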
