Link-Based Similarity Measures Using Reachability Vectors


The Scientific World Journal, Volume 2014 (2014), Article ID 741608, 13 pages

Research Article

Department of Electronics and Computer Engineering, Hanyang University, Seoul 133-791, Republic of Korea

Department of Computer and Software, Hanyang University, Seoul 133-791, Republic of Korea

Department of Computer Science, KAIST, Daejeon 305-701, Republic of Korea

Received 31 August 2013; Accepted 28 November 2013; Published 18 February 2014

Academic Editors: S. Amat, L. Martínez, and J. Zhang

Copyright © 2014 Seok-Ho Yoon et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


We present a novel approach for accurately computing link-based similarities among objects by utilizing the link information pertaining to those objects. We discuss the problems with previous link-based similarity measures and propose an approach that does not suffer from them. In the proposed approach, each target object is represented by a vector whose elements correspond to all the objects in the given data; the value of each element denotes the weight of the corresponding object. For this weight, we propose to use the probability of reaching the specific object from the target object, computed using the “Random Walk with Restart” strategy. We then define the similarity between two objects as the cosine similarity of their two vectors. We provide examples showing that our approach does not suffer from the aforementioned problems. We also evaluate the proposed methods against existing link-based measures, qualitatively and quantitatively, on two kinds of data sets: scientific papers and Web documents. Our experimental results indicate that the proposed methods significantly outperform the existing measures.
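
The idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the adjacency-list graph format, the restart probability of 0.15, the use of power iteration, the handling of nodes without out-links, and the function names `rwr_vector`, `cosine`, and `similarity` are all assumptions made for illustration. Each object's reachability vector is approximated by Random Walk with Restart, and two objects are then compared by the cosine of their vectors.

```python
import math


def rwr_vector(graph, start, restart=0.15, iters=200, tol=1e-12):
    """Approximate Random Walk with Restart scores from `start`.

    graph: dict mapping every node to a list of its linked nodes.
    restart: probability of jumping back to `start` (assumed value).
    Returns one score per node, ordered by sorted node name.
    """
    nodes = sorted(graph)
    p = {n: 0.0 for n in nodes}
    p[start] = 1.0
    for _ in range(iters):
        nxt = {n: 0.0 for n in nodes}
        for u in nodes:
            mass = p[u]
            if mass == 0.0:
                continue
            out = graph[u]
            if out:
                # Follow a link chosen uniformly at random.
                share = (1.0 - restart) * mass / len(out)
                for v in out:
                    nxt[v] += share
            else:
                # Assumption: a walker with no out-links returns to start.
                nxt[start] += (1.0 - restart) * mass
        # Restart step: total mass is 1, so `restart` of it jumps back.
        nxt[start] += restart
        diff = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if diff < tol:
            break
    return [p[n] for n in nodes]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def similarity(graph, x, y):
    """Cosine similarity of the reachability vectors of x and y."""
    return cosine(rwr_vector(graph, x), rwr_vector(graph, y))


# Hypothetical toy graph: a and b share neighbors; e and f are isolated from them.
graph = {
    "a": ["c", "d"],
    "b": ["c", "d"],
    "c": ["a", "b"],
    "d": ["a", "b"],
    "e": ["f"],
    "f": ["e"],
}
print(similarity(graph, "a", "b"))  # between 0 and 1
print(similarity(graph, "a", "e"))  # 0.0: no object is reachable from both
```

In this toy graph, `a` and `b` link to the same objects and so receive a positive similarity, while `a` and `e` lie in disconnected components, so their vectors share no nonzero element and the similarity is zero.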

Authors: Seok-Ho Yoon, Ji-Soo Kim, Jiwoon Ha, Sang-Wook Kim, Minsoo Ryu, and Ho-Jin Choi


