ABSTRACT
Almost ten years have passed since the expression "peer-to-peer" reached the large internet
population. It has had considerable success, but there still seems to be a long way ahead.
In these ten years, a wealth of peer-to-peer systems has been proposed by the research
institutions, but only few have made it commercially. While the research community in-
troduces ever more advanced structured peer-to-peer networks, the public seems to prefer
the simplicity and elasticity of unstructured peer-to-peer networks, despite their numerous
and obvious flaws. Even developers or system designers seem to prefer either unstructured
networks, or, at best, hybrid systems. Why?
One of the important factors, business considerations aside, is that designing, imple-
menting and testing one such system before deployment is virtually impossible for some-
one without a PhD in the area. Even comparing existing systems is difficult, because there
is no unifying benchmarks or models.
This thesis addresses this problem by introducing theoretical models for both struc-
tured and unstructured networks, by implementing simulators to help system designers get
a better idea of the performance before going to a real implementation and, finally, by
showcasing an image retrieval application on structured peer-to-peer networks.
We organize this work in 4 chapters, in addition to the introductory and concluding
remarks. First, we make a thorough overview of the state of the art in peer-to-peer networks,
both structured and unstructured, as well as existing attempts at providing a generic model
for such networks. In the same chapter, we also review techniques in information retrieval
on peer-to-peer networks, as well as image retrieval techniques that we will subsequently
adapt for our application environment. Afterwards, we dedicate one chapter for each of the
three important contributions of the thesis: the structured network model, the unstructured
network model and the image retrieval application.
This thesis demonstrates the possibility of defining new peer-to-peer networks or mod-
eling existing designs easily, as well as the applicability of image retrieval techniques in apeer-to-peer environment with scarce resources.
CONTENTS
1 Introduction 15
11 Contributions 19
12 Organization of this report
19
2 State of the art in peer-to-peer networks, information retrieval on peer-to-peer,
and image retrieval 21
21 Peer-to-Peer Networks 21
211 Query Flooding 21
212 Selective Forwarding 22
213 Centralized Indexes 23
214 Distributed Indexes 23
22 Generalized Architectures for Peer-to-Peer Overlays 25
221 The Early Works 26
222 The Recent Works 27
23 Information Retrieval on P2P Networks 28
24 Content Based Image Retrieval 30
241 Histograms of Color 31
242 Adding Position into the Equation 32
243 Vector Space Transformations 33
244 Summary 40
3 Generalized Architecture for Structured Peer-to-Peer Overlays 43
31 Introduction 44
311 The Case for Peer-to-Peer Analysis in the Context of PDMS 47
312 Specific Contributions of This Chapter 49
32 Terminology and Notation 51
33 Related Work 53
34 Structured Overlays as Cayley Graphs 55
341 Cayley Graph Definitions 55
35 Initial Assessment of Existing Cayley Graph Overlays 59
351 Node Degree 59
352 Vertex Transitivity 59
353 Edge Transitivitiy 60
354 Optimal Fault Tolerance 60
36 Alternative Fault Tolerance Measures 61
361 Diameter Resilience 62
362 Bisection Width 64
363 Hierarchy 70
364 Hamiltonicity 73
37 Experimental Results 75
371 Simulation Framework 75
372 Routing 77
373 Non-Cayley Overlays 79
38 Summary 80
4 A Model for Unstructured P2P Networks 83
41 Introduction 84
411 The Case for Guiding Random Walks 85
412 Specific Contributions of This Chapter 86
42 Related Work 87
421 Formal Approaches to Random Walks 87
422 Semantically-enhanced Random Walks 89
43 Preliminaries 90
431 Markov Chains And Their Stationary Distributions 91
44 Peer and Network Model 92
441 Peers as States in a Continuous State Space 95
442 Stationary Distribution 99
443 Transition Function 101
45 Retrieval Speed 103
451 Mixing Time 103
452 Proposal Density 104
46 Experiments 106
461 P2P Network Implementation 107
462 Consistency of the Model’s Behavior 108
463 Additional Observations 112
47 Summary 114
5 Application in Image Retrieval on P2P Mobile Ad-Hoc Networks 117
51 The Case for Image Retrieval on Mobile Phones 119
52 Specific Contributions of This Chapter 120
53 Hyper-M 121
531 Cluster Representation 123
532 Peer Relevance Score 125
54 Query Processing 126
541 Range queries 126
542 K-nn Queries 128
55 Experiments on Data Dissemination 131
551 Synthetic Data 132
552 Speed of Insertion 133
553 Data Distribution in the CAN Network 136
56 Effectiveness of Retrieval 137
561 Retrieval Effectiveness on Real Data 138
57 Considerations on an unstructured network overlay 141
571 Range and KNN queries 142
58 Summary 143
6 Conclusions 145
61 Summary of Contributions 145
611 Quality of Structured Peer-to-Peer Networks 146
612 Efficiency of Unstructured Peer-to-Peer Networks 147
613 Proof of Concept 148
62 Future Research Directions 149
621 Poly-morphic Structured Peer-to-Peer Networks 149
622 Logarithmic-length Paths in Unstructured Peer-to-Peer Networks 149
623 Sharing More Meaningful Data 150
63 Closing 150
A P3N: Profiling the Potential of a Peer-based Data Management System 151
A1 Introduction 151
A2 P
3N 152
A21 User Interface 153
A3 Sample Network Implementation 154
A4 Summary 158
B Mobile implementation 161
B1 Peer-to-Peer Overlay 161
B2 Data Processing 162
B3 Summary 163