1.Harbin University of Science and Technology;2.Northeastern University
National Natural Science Foundation of China
针对已有社区搜索算法采用高维稀疏向量表示节点, 时间复杂度高的问题, 提出一种基于节点嵌入表示学习的社区搜索算法 CSNERL. 节点嵌入技术能够直接从网络结构中学习节点的低维实值向量表示, 为社区搜索提供了新思路. 首先, 针对已有节点嵌入算法存在着较高的概率在最亲近邻居间来回游走的问题, 提出基于最亲近邻居但不立即回访随机游走的节点嵌入模型 NECRWNR, 采用 NECRWNR 模型学习节点的特征向量表示.然后, 采用社区内所有节点的向量均值作为社区的向量表示, 通过选择与当前社区距离最近的节点加入社区的方法, 实现了一种新的社区搜索算法. 在真实网络和模拟网络数据集上分别与相关的社区搜索算法进行了实验对比, 实验结果表明所提社区搜索算法 CSNERL 具有更高的准确性.
Considering that the existing community search algorithms represent nodes as high-dimensional sparse vectors and have high time complexity, a novel community search algorithm based on node embedding representation learning (CSNERL) is proposed. Node embedding techniques can learn low-dimensional vectorial representation of nodes from network structure directly, and provide a new solution to community search problem. Firstly, in view of the problem that the existing node embedding algorithm has a high probability to walk back and forth between the closest neighbors, a node embedding model based on closest-neighbor biased random walk with non-immediately revisiting (NECRWNR) is proposed. Based on this model, vectorial representation of nodes is learned and used as feature vectors of nodes in the downsteam data mining task. Then, vectorial representation of a community is defined as the average of the vectors for nodes in the community, and a new community search algorithm is designed by choosing those nodes which are nearest with the current community. The proposed algorithm is tested on both real-world and synthetic network datasets with the related community search algorithms. The experimental results show that CSNERL is more effective at community search than baselines.