A new research paper published by Google describes a dramatically new way to improve how web pages are ranked. This algorithm claims significant improvements to deep neural network algorithms that calculate relevance.
While without confirmation we can’t know for certain if it is in use, because significant improvements are claimed by the researchers, in my opinion it is not far fetched to consider that this algorithm may be in use by Google.
The new algorithm introduces a method of ranking web pages called, Groupwise Scoring Functions.
Does Google Use Published Algorithms?
Google has stated in the past that “Google research papers in general shouldn’t be assumed to be something that’s actually happening in search.” Google rarely discusses which algorithm patents or papers are in use.
Is this Algorithm Part of the March 2019 Core Update?
This research paper shows how Google is focusing on understanding search queries and understanding what web pages are about. This is typical of recent Google research.
Google has recently introduced a broad core update that is reported to be among the biggest in years. Is this algorithm a part of that change? We don’t know and we will likely never know. Google rarely discusses specific algorithms.
In my opinion, it’s possible that something like this could be one part of a multi-part update of Google’s search ranking algorithm. I don’t believe it’s the only one. I believe the March 2019 Core Ranking Algorithm consists of a series of improvements.
Why this Algorithm is Important
The research paper begins by noting that machine learning algorithms label and give values to web pages individually, each web page in isolation from other web pages. Then the algorithms score the web pages in competition with the other web pages to find out which web page is most relevant.
Here’s how the research paper describes how current algorithms work:
“While in a classification or a regression setting a label or a value is assigned to each individual document, in a ranking setting we determine the relevance ordering of the entire input document list.”
The research paper then proposes that considering the age of all of the relevant web pages can give a clue as to what users want. So instead of ranking all the web pages one against the other, by reviewing the age of the web pages first, the ranking algorithm can better understand what a user wants and choose a better web page.
This is how the research paper describes the new algorithm:
“The majority of the existing learning-to-rank algorithms model such relativity at the loss level using pairwise or listwise loss functions. However, they are restricted to pointwise scoring functions, i.e., the relevance score of a document is computed based on the document itself, regardless of the other documents in the list.
…the relevance score of a document to a query is computed independently of the other documents in the list. This setting could be less optimal for ranking problems for multiple reasons.”
The research paper then shows how the current method of ranking web pages is missing an opportunity to improve the relevance of search results.
This is the example the research paper uses to illustrate the problem and the solution:
“Consider a search scenario where a user is searching for a name of a musical artist. If all the results returned by the query (e.g., calvin harris) are recent, the user may be interested in the latest news or tour information.
If, on the other hand, most of the query results are older (e.g., frank sinatra), it is more likely that the user wants to learn about artist discography or biography. Thus, the relevance of each document depends on the distribution of the whole list.”
In this example, the age of the web pages that are relevant to the search query can help to refine which answer is the best answer.
Modeling Human Behavior for Better Accuracy
The research paper then notes that search engine users tend to compare search results relative to other web pages. They then suggest that a ranking model that does the same thing is more accurate.
“…user interaction with search results shows strong comparison patterns. Prior research suggests that preference judgments by comparing a pair of documents are faster to obtain, and are more consistent than the absolute ratings.”
Also, better predictive capability is achieved when user actions are modeled in a relative fashion… These indicate that users compare the clicked document to its surrounding documents prior to a click, and a ranking model that uses the direct comparison mechanism can be more effective as it mimics the user behavior more faithfully.”
The New Algorithm Works
When considering algorithm research, it’s important to note whether the researchers stated that it improved and advanced the state of the art.
Some research papers note that the improvements are minimal and the cost of achieving these gains are significant (time and hardware). I consider less successful research to not be a good candidate for inclusion in Google’s search algorithms.
When a research paper reports significant improvements coupled to minimal cost, then in my opinion these kinds of algorithms have a higher likelihood of being included into Google’s algorithms.
The researchers concluded that this new method improves Deep Neural Network and tree-based models. In other words, this is useful. Google never says if an algorithm is used or how it is used. But knowing that an algorithm provides significant improvements and can scale improves the likelihood that the algorithm may be used by Google, if not currently then at some point in the future.
This is the value of knowing about information retrieval research. You can know what is possible. Understanding that something has not been studied is a strong clue that a theory about what Google is doing is not likely.
For example, correlation studies caused the SEO community to believe that Facebook likes were a ranking factor. But if those SEOs had bothered to read research papers they would have known that such a thing was highly unlikely.
In this case, the researchers state that this method is highly successful. In the following quote, please note that DNN means Deep Neural Networks. GSF means Groupwise Scoring Function.
Here is the conclusion:
“Experimental results show that GSFs significantly benefit several state-of-the-art DNN and tree-based models…”
How this Can Help Your SEO
Ranking in Google is increasingly less about traditional ranking factors. Twenty year old ranking factors like anchor text, heading tags, and links are decreasing in importance.
This research paper shows how considering commonalities between relevant pages may provide clues to what users want. Even if Google isn’t using this algorithm to rank web pages, the concept is still useful to you.
Knowing what users want can help you better understand the user’s information needs and to create web pages that better meets those needs.
And that may increase your ability to rank. Chase the carrot, not the stick.
Read the research paper here:Learning Groupwise Scoring Functions Using Deep Neural Networks (PDF)
Images by Shutterstock, Modified by Author