VisualRank

From HandWiki
Short description: Image ranking system

VisualRank is a system for finding and ranking images by analysing and comparing their content, rather than searching image names, Web links or other text. Google scientists made their VisualRank work public in a paper describing applying PageRank to Google image search at the International World Wide Web Conference in Beijing in 2008. [1]

Methods

Both computer vision techniques and locality-sensitive hashing (LSH) are used in the VisualRank algorithm. Consider an image search initiated by a text query. An existing search technique based on image metadata and surrounding text is used to retrieve the initial result candidates (PageRank), which along with other images in the index are clustered in a graph according to their similarity (which is precomputed). Centrality is then measured on the clustering, which will return the most canonical image(s) with respect to the query. The idea here is that agreement between users of the web about the image and its related concepts will result in those images being deemed more similar. VisualRank is defined iteratively by [math]\displaystyle{ VR = S^* \times VR }[/math], where [math]\displaystyle{ S^* }[/math] is the image similarity matrix. As matrices are used, eigenvector centrality will be the measure applied, with repeated multiplication of [math]\displaystyle{ VR }[/math] and [math]\displaystyle{ S^* }[/math] producing the eigenvector we're looking for. Clearly, the image similarity measure is crucial to the performance of VisualRank since it determines the underlying graph structure.

The main VisualRank system begins with local feature vectors being extracted from images using scale-invariant feature transform (SIFT). Local feature descriptors are used instead of color histograms as they allow similarity to be considered between images with potential rotation, scale, and perspective transformations. Locality-sensitive hashing is then applied to these feature vectors using the p-stable distribution scheme. In addition to this, LSH amplification using AND/OR constructions are applied. As part of the applied scheme, a Gaussian distribution is used under the [math]\displaystyle{ \ell 2 }[/math] norm.

References

  1. Yushi Jing and Baluja, S. (2008). "VisualRank: Applying PageRank to Large-Scale Image Search". IEEE Transactions on Pattern Analysis and Machine Intelligence 30 (11): 1877–1890. doi:10.1109/TPAMI.2008.121. ISSN 0162-8828. PMID 18787237. .

External links