Peer Sharing Behaviour in the eDonkey Network, and Implications for the Design of Server-less File Sharing Systems

1 LPD - Distributed Programming Laboratory
2 PARIS - Programming distributed parallel systems for large scale numerical simulation (IRISA - Institut de Recherche en Informatique et Systèmes Aléatoires, ENS Cachan - École normale supérieure - Cachan, Inria Rennes – Bretagne Atlantique)
3 LIX - Laboratoire d'informatique de l'École polytechnique, Palaiseau
4 INRIA Futurs
5 Microsoft - Microsoft Research Cambridge
6 UNIBO - Università di Bologna, Bologna

Abstract: Peer-to-peer file sharing systems have grown to the extent that they now generate most of the Internet traffic, way ahead of Web traffic. Understanding workload properties of peer-to-peer systems is necessary to optimize their performance. In this paper we present an empirical study of a workload gathered by crawling the eDonkey network, a dominant file sharing system, for over 50 days. Besides confirming the presence of some well-known features, such as the prevalence of free-riding and the Zipf-like distribution of file popularity, we also analyze several previously ignored aspects of such workloads. More specifically, we measure the geographical clustering of peers offering a given file. We find that most files are offered predominantly by peers of a single country, although popular files do not have such a clear home country. We also analyze the overlap between contents offered by different peers. We find that peer contents tend to be clustered, which may be taken as evidence that peers possess specific interests. We leverage this and allow peers to search for content without any server support, by maintaining a list of semantic neighbours, i.e. peers with similar interests. Simulation results confirm the clustering property of the trace and show that a high hit ratio is achieved by querying the most recently discovered peers, even after removing the top 15% most generous peers. Results also indicate that the clustering is much higher for rare files.
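
The semantic-neighbour idea described in the abstract can be illustrated with a short sketch. This is not the authors' simulator: the abstract does not specify list size, replacement policy, or how a file is located on a miss, so the names `SemanticNeighbours` and `hit_ratio`, the `capacity` parameter, the most-recently-used ordering, and the fallback lookup are all assumptions made for illustration. Peer contents are treated as static, which is also an assumption.

```python
from collections import OrderedDict


class SemanticNeighbours:
    """Bounded list of peers that recently served us (hypothetical helper)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._peers = OrderedDict()  # insertion order: oldest first

    def record(self, peer):
        # Move the peer to the "most recent" end, evicting the oldest if full.
        self._peers.pop(peer, None)
        self._peers[peer] = None
        while len(self._peers) > self.capacity:
            self._peers.popitem(last=False)

    def peers(self):
        # Most recently discovered peers are queried first.
        return list(reversed(self._peers))


def hit_ratio(requests, caches, capacity):
    """Fraction of requests answered by a semantic neighbour.

    requests: iterable of (requester, file_id) pairs
    caches:   dict mapping peer -> set of file ids it offers (static here)
    """
    lists = {}
    hits = total = 0
    for requester, file_id in requests:
        neighbours = lists.setdefault(requester, SemanticNeighbours(capacity))
        total += 1
        provider = next(
            (p for p in neighbours.peers() if file_id in caches.get(p, ())),
            None,
        )
        if provider is not None:
            hits += 1
        else:
            # Assumed fallback: any other peer holding the file, standing in
            # for whatever lookup mechanism is used when neighbours miss.
            provider = next(
                (p for p, files in caches.items()
                 if p != requester and file_id in files),
                None,
            )
        if provider is not None:
            neighbours.record(provider)
    return hits / total if total else 0.0


caches = {"a": {1, 2, 3}, "b": {4}, "c": set()}
requests = [("c", 1), ("c", 2), ("c", 3), ("c", 4)]
ratio = hit_ratio(requests, caches, capacity=2)
```

In this toy trace, peer `c` first finds file 1 through the fallback, then reuses peer `a` for files 2 and 3. That reuse is the kind of interest-based clustering the abstract reports.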


Authors: S. Handurukande, Anne-Marie Kermarrec, Fabrice Le Fessant, Laurent Massoulié, S. Patarin


