Project Backrub is our implementation of a PageRank-like algorithm that accounts for the number of back-links to a document (that is, pages that link to a given page), which is a measure of the document's popularity.
The graph concept could also be extended and used by centillion, for example to enhance the ranking system (in a graph where nodes are documents and edges are links between documents, highly linked-to documents would receive higher weighting in centillion)...
This helps centillion transition to a high-level view of documents in the Data Commons, and should improve its ability to retrieve the most relevant results based on the number of back-links to a document. (The search engine project that introduced PageRank was originally called BackRub.)
However, linking this idea of the graph structure to centillion... would be centered on the documents indexed by centillion (i.e., it would be restricted to a particular folder hierarchy).
The idea is to assemble a graph of documents in the search index (node = document indexed by centillion, directed edge = link from document A to document B), compute the in-degree of each node in the graph (number of documents that link to a given document), and store this in the search index, for use in the scoring mechanism.
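As a rough illustration of the in-degree step, here is a minimal Python sketch. It assumes the links have already been extracted into a plain dictionary mapping each indexed document to the documents it links to; the function name `compute_in_degrees`, the document identifiers, and the choice to ignore self-links and links to non-indexed documents are all assumptions made for this example, not part of centillion.

```python
def compute_in_degrees(links):
    """Count, for each indexed document, how many other indexed documents link to it.

    `links` is a hypothetical structure: a dict mapping a document id to an
    iterable of document ids that it links to. Only the dict's keys are
    treated as "indexed" documents.
    """
    indexed = set(links)
    in_degree = {doc: 0 for doc in indexed}
    for source, targets in links.items():
        # Deduplicate targets so a document that links to the same page
        # several times still counts as a single back-link.
        for target in set(targets):
            # Assumption: skip self-links and links that leave the index.
            if target in indexed and target != source:
                in_degree[target] += 1
    return in_degree


# Example with made-up document ids.
example_links = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a", "c", "x"],  # "c" links to itself and to a non-indexed "x"
    "d": ["c", "c"],       # repeated link counts once
}
example_in_degrees = compute_in_degrees(example_links)
```

The resulting counts could then be written to a numeric field on each document in the search index and combined with the text relevance score at query time; how that field is stored and weighted is left open here.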
See the related comment in dib-lab/copper#305.