Recent Patents on Computer Science

Author(s): Bingxian Chen, Lianggui Liu*, Huiling Jia and Yu Zhang

DOI: 10.2174/2213275911666180403110851

Reducing Repetition Rate: Unbiased Delay Sampling in Online Social Networks

Pages: 308-314 (7 pages)

Abstract

Background: The large scale of online social networks (OSNs) makes it hard to collect complete data from them, and the large number of nodes and links makes network data analysis time-consuming. Sampling a large-scale OSN while preserving the topological properties of the original network has therefore become an important problem. This paper studies an unbiased sampling method that can extract a representative sample from the social graph.

Methods: We propose an improved algorithm based on the Metropolis-Hastings random walk (MHRW), called Unbiased Delay sampling (the UD algorithm). We then evaluate it by comparing it with sampling methods described in recent patents.
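
For orientation, the baseline that UD builds on can be sketched as follows. This is a minimal illustration of the standard MHRW on an undirected graph, not the UD algorithm itself: the abstract does not specify the delay mechanism or how α enters the walk, so neither is modeled here. The names mhrw_sample and repetition_rate, the adjacency-list input format, and the definition of repetition rate as the share of non-unique entries in the sample are assumptions made for this example only.

```python
import random


def mhrw_sample(adj, start, num_steps, seed=None):
    """Standard Metropolis-Hastings random walk (baseline sketch, not UD).

    adj maps each node to a list of its neighbors (undirected graph,
    no isolated nodes). Returns the visited nodes, including repeats.
    """
    rng = random.Random(seed)
    current = start
    samples = [current]
    for _ in range(num_steps):
        # Propose a neighbor uniformly at random.
        candidate = rng.choice(adj[current])
        # Accept with probability min(1, deg(current) / deg(candidate)).
        # This correction makes the stationary distribution uniform
        # over nodes instead of proportional to degree.
        if rng.random() < len(adj[current]) / len(adj[candidate]):
            current = candidate
        # A rejected proposal keeps the walk at the same node, so that
        # node is recorded again; this is one source of repeated samples.
        samples.append(current)
    return samples


def repetition_rate(samples):
    """Share of sample entries that duplicate an earlier entry
    (illustrative definition, assumed for this sketch)."""
    return 1 - len(set(samples)) / len(samples)


if __name__ == "__main__":
    # Small hypothetical graph: node 0 is a hub, node 4 hangs off node 3.
    graph = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1], 3: [0, 4], 4: [3]}
    walk = mhrw_sample(graph, start=0, num_steps=1000, seed=1)
    print(len(walk), repetition_rate(walk))
```

Because a rejected proposal records the current node again, a plain MHRW sample tends to contain many duplicates, which is the behavior the repetition-rate discussion in this abstract refers to.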

Results: Different sampling methods extract subnetworks with different topological properties. We find that UD adapts to networks with different levels of connectivity. When repeated nodes are excluded from the sample, UD yields a better degree distribution. In addition, UD reduces the probability that an already-sampled node is selected again, which improves its ability to discover the network.

Conclusion: To the best of our knowledge, this is the first unbiased sampling method that achieves a good degree distribution when the sample set contains no duplicate nodes. Specifically, we add a parameter α to the sampling process; its value controls the repetition rate of the sample set.

Keywords: Social network, MHRW, Twitter, degree distribution, independent sample, unbiased sampling.
