Design and Implementation of a Peer-to-Peer based System to enable the Share Functionality in a Platform-independent Cloud Storage Overlay

By

Author

Presented To

Department of Surveying and Geoinformatics

ABSTRACT
Cloud storage services have aroused great interest in modern information technology,
providing a simple way of storing and sharing files. However, they are mostly built on
centralized architectures and hence subject to drawbacks. Peer-to-Peer (P2P) systems are
a good alternative to deal with the limitations of centralized systems in the scope of file
sharing, due to shared usage of distributed resources and higher fault tolerance. Therefore,
this thesis designs, implements, and evaluates a P2P-based file sharing system to be
integrated into PiCsMu, a novel platform-independent cloud storage application. The
proposed system features a modular design which relies on well-defined interfaces, high
security, as well as a P2P storage concept with good performance. Additional features that
try to distinguish this approach from other P2P file sharing systems include an integrated
file search and the possibility to privately share files. The results of a qualitative and
quantitative evaluation show that the proposed system performs well and as designed.
Considering that downloading 750-megabyte files takes on average 154.65 seconds, the
overhead added by the proposed system (4.21 seconds) is minimal. Nevertheless, some
aspects of the system remain to be optimized.
Contents
Abstract
Zusammenfassung
Acknowledgments
1 Introduction
1.1 Description of Work and Thesis Goals
1.2 Thesis Outline
2 Terminology and Related Work
2.1 Terminology
2.2 Cloud Storage Services
2.3 A Brief History of Peer-to-Peer File Sharing
2.3.1 Napster
2.3.2 Gnutella
2.3.3 BitTorrent
2.3.4 Anonymous P2P and Freenet
2.3.5 Comparison
3 Design
3.1 File Upload and Download Protocol
3.2 Index
3.3 System Architecture
3.3.1 Application
3.3.2 Central Server
3.3.3 Peer-to-Peer Network
3.3.4 Cloud Services
3.4 Distributed Hash Table
3.4.1 Storage Concept using a Distributed Hash Table
3.4.2 Enabling a Content-Based Search
3.5 Privacy and Secure Data Exchange
3.5.1 Cryptography
3.6 Share Functionality
3.6.1 Public Sharing
3.6.2 Share Notification
3.6.3 Private Sharing
4 Implementation
4.1 An Overview of the P2P Code Architecture
4.2 Data Exchange using Beans
4.3 File-Sharing Protocols
4.4 Asynchronous and Non-Blocking Communication
4.4.1 Interacting with the DHT using Futures
4.4.2 Asynchronous Callbacks and Event Handling
4.5 Implementing Security Measures
4.6 File Search Functionality
4.6.1 Keyword Generation
4.6.2 Fast Similarity Search
5 Evaluation
5.1 Testbed and Evaluation Setup
5.1.1 Test Cases
5.2 Qualitative Evaluation
5.2.1 Code Coverage using JUnit Tests
5.2.2 Functional Testing
5.3 Quantitative Evaluation
5.4 Additional Issues and Problems
6 Summary and Conclusions
Bibliography
Abbreviations
Glossary
List of Figures
List of Tables
List of Listings
A Installation Guidelines
B Contents of the CD


Chapter 1
Introduction
Modern Information Technology (IT) strives to provide computing as a utility. Utility
computing is a concept based on delivering services and resources (e.g., processing,
storage, software) to end-users while hiding their internal mechanisms. These offerings
should be obtainable at minimal cost, which makes them affordable for customers ranging
from private users to large IT companies. As an example, developers can easily launch
new Internet services without first having to invest a large amount of money in
infrastructure. Thus, organizations are able to deploy services first and check later
whether demand meets predictions, e.g., the number of users simultaneously accessing
the service [21]. Multiple distributed technologies such as cluster, grid, and cloud
computing have emerged from this paradigm. They allow access to large amounts of
computing power in a fully virtualized manner, by aggregating resources and offering a
single system view [25].
Cloud computing (or simply cloud) has been one of the most used buzzwords in IT over
recent years. The name emerged as a metaphor for the Internet, which has typically been
represented as a cloud symbol in network diagrams [59]. The symbol is an abstraction
for the complex underlying mechanisms. A variety of cloud providers (i.e., companies
offering cloud computing) have entered the market over the last couple of years,
including big names such as Amazon, Google, and Microsoft. They offer cloud services
that can be classified into three types: Infrastructure-as-a-Service (IaaS),
Platform-as-a-Service (PaaS), and Software-as-a-Service (SaaS) [59]. Well-known examples
of cloud services are, among others, Amazon Web Services [1], Google Picasa [9], and
Microsoft Office 365 [11]. Section 2.2 introduces cloud services relevant for this
thesis. In the scope of SaaS, cloud providers offer applications with a specific purpose
and responsibility. Consequently, data that is uploaded to the services is restricted by
the corresponding cloud providers’ data validation process. For example, Google Picasa
only allows hosting images of a certain file type, resolution, and size. Machado et al.
[42] investigated how to bypass the data validation process of cloud providers to store
arbitrary data without restrictions. By applying different encoders (i.e.,
Steganography-related, FileFormatHeader-related, Appending-related), data is transformed
into a form accepted by the cloud provider. A weak data validation process does not
recognize this change in data. Therefore, it is possible to upload any kind of data to a
cloud service without the violation of service-specific restrictions being noticed by
the cloud provider, which has implications for security, accounting, and charging.
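To make the idea of an appending-related encoder more concrete, the following Java
sketch shows one way such an encoder could work. It is an illustration under stated
assumptions and not the encoder used by Machado et al. or by PiCsMu: the class name
AppendingEncoderSketch, the methods encode and decode, and the 4-byte length trailer
are all hypothetical. The carrier stands in for a file the cloud provider accepts;
whether a real provider's validation would tolerate the appended bytes is not claimed
here.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch of an appending-style encoder (not the PiCsMu implementation).
final class AppendingEncoderSketch {

    // Appends the payload and a 4-byte big-endian length trailer to the carrier bytes.
    static byte[] encode(byte[] carrier, byte[] payload) {
        ByteBuffer out = ByteBuffer.allocate(carrier.length + payload.length + Integer.BYTES);
        out.put(carrier).put(payload).putInt(payload.length);
        return out.array();
    }

    // Reads the length trailer and cuts the payload back out of the encoded bytes.
    static byte[] decode(byte[] encoded) {
        int trailerStart = encoded.length - Integer.BYTES;
        int payloadLength = ByteBuffer.wrap(encoded, trailerStart, Integer.BYTES).getInt();
        return Arrays.copyOfRange(encoded, trailerStart - payloadLength, trailerStart);
    }

    public static void main(String[] args) {
        // Placeholder bytes standing in for a file the provider would accept (assumption).
        byte[] carrier = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF, (byte) 0xD9};
        byte[] payload = "arbitrary data".getBytes(StandardCharsets.UTF_8);

        byte[] restored = decode(encode(carrier, payload));
        System.out.println(new String(restored, StandardCharsets.UTF_8));
    }
}
```

The length trailer is a design choice of this sketch: it lets the decoder locate the
payload without knowing anything about the carrier format.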
On the basis of the idea presented by Machado et al., its authors created a
Platform-independent Cross Storage System for Multi-Usage (PiCsMu) [15]. PiCsMu is a
prototypical cloud-based overlay (i.e., a network on top of another network) offering
file type independent cloud storage using file type dependent cloud services. By
applying encoding, the application can store arbitrary files on any cloud service. The
way PiCsMu stores files is divided into three major steps: (1) fragmentation,
(2) encoding, and (3) upload. Step (1) splits the file to be uploaded into multiple file
parts. In step (2), each file part is encoded specifically for the service to which it
will be uploaded. Finally, step (3) uploads all encoded file parts to their
corresponding cloud services. Advantages of this method are discussed in Section 2.2,
and the individual steps are explained in more detail in Section 3.1. In order to
retrieve the file, the process is applied in reverse. PiCsMu itself does not store file
data, but index data (i.e., metadata, or data about data), which contains information
about the upload process, such as (1) number and order of fragmented file parts,
(2) applied encoders, and (3) used cloud services. Using the index data, PiCsMu is able
to retrieve and reconstruct a previously stored file. Section 3.2 describes how the
index data is composed. PiCsMu is called an overlay because it runs on top of multiple
cloud services, using indexes to manage the files distributed across them.
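The three storing steps and the role of the index data can be sketched in Java as
follows. This is a minimal illustration, not PiCsMu's implementation: the class
StoreRetrieveSketch, the IndexEntry fields, the byte-reversing placeholder encoder, the
service names, and the in-memory map that stands in for the cloud services are all
assumptions made for this example.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of fragmentation, encoding, upload, and index-based retrieval.
final class StoreRetrieveSketch {

    // One index entry per file part: its order, the encoder applied, the service used.
    static final class IndexEntry {
        final int order;
        final String encoder;
        final String service;

        IndexEntry(int order, String encoder, String service) {
            this.order = order;
            this.encoder = encoder;
            this.service = service;
        }
    }

    // Step (1): split the file into parts of at most partSize bytes.
    static List<byte[]> fragment(byte[] file, int partSize) {
        List<byte[]> parts = new ArrayList<>();
        for (int from = 0; from < file.length; from += partSize) {
            parts.add(Arrays.copyOfRange(file, from, Math.min(file.length, from + partSize)));
        }
        return parts;
    }

    // Placeholder encoder: reverses the bytes, so applying it twice restores the part.
    static byte[] reverse(byte[] part) {
        byte[] out = new byte[part.length];
        for (int i = 0; i < part.length; i++) {
            out[i] = part[part.length - 1 - i];
        }
        return out;
    }

    // Steps (2) and (3): encode each part, "upload" it to the map, and record index data.
    static List<IndexEntry> store(byte[] file, int partSize, List<String> services,
                                  Map<String, byte[]> cloud) {
        List<IndexEntry> index = new ArrayList<>();
        List<byte[]> parts = fragment(file, partSize);
        for (int i = 0; i < parts.size(); i++) {
            String service = services.get(i % services.size()); // round-robin choice
            cloud.put(service + "/" + i, reverse(parts.get(i)));
            index.add(new IndexEntry(i, "reverse", service));
        }
        return index;
    }

    // Reversed process: fetch each part named in the index, decode it, and concatenate.
    // The index entries are expected to be listed in part order.
    static byte[] retrieve(List<IndexEntry> index, Map<String, byte[]> cloud) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (IndexEntry entry : index) {
            byte[] decoded = reverse(cloud.get(entry.service + "/" + entry.order));
            out.write(decoded, 0, decoded.length);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] file = "hello PiCsMu overlay".getBytes(StandardCharsets.UTF_8);
        Map<String, byte[]> cloud = new HashMap<>();
        List<IndexEntry> index = store(file, 8, Arrays.asList("serviceA", "serviceB"), cloud);
        System.out.println(new String(retrieve(index, cloud), StandardCharsets.UTF_8));
    }
}
```

Note that only the index is needed to reconstruct the file, which is why handing the
index data to another user is sufficient to share the file.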
Since the file parts are stored in publicly available services, any authorized person
can access and download them. Hence, when the index data of a file is shared with
another PiCsMu user, that user can obtain all necessary file parts from the
corresponding cloud services and reconstruct the file using the index data. As a
client-server (C/S) application, PiCsMu offers a centralized way of sharing files.
However, having one central authority has drawbacks for file sharing. High network
traffic from many concurrent users may rapidly lead to congestion at the server, causing
a temporary denial of service or a complete breakdown. Further, the identity of the
users (i.e., Internet Protocol (IP) address) is known to the server and can be requested
by governmental authorities for prosecution in case of sharing copyrighted or sensitive
files. Moreover, it is easy for an authority to shut down the entire system by disabling
the central server.
Peer-to-Peer (P2P) systems have become popular on the Internet, especially in the field
of file sharing and media streaming. Steinmetz et al. [54] provide the following
definition: "A Peer-to-Peer system is a self-organizing system of equal, autonomous
entities (peers) which aims for the shared usage of distributed resources in a networked
environment avoiding central services." The increasing availability of high-speed
broadband connections made the concept of globally applicable P2P systems feasible.
Average users have the computational capacity and resources to act as server and client
simultaneously (i.e., the definition of a peer), providing and consuming data. P2P
systems are designed to overcome the limitations and drawbacks of classical C/S systems.
They are (1) extensible (easy to add new resources), (2) more fault tolerant,
(3) scalable (the system can grow without loss in performance), and (4) resistant to
lawsuits. Hence, this thesis investigates, implements, evaluates, and discusses the
design of a P2P-based system to enable a distributed share functionality in PiCsMu to
overcome its current architectural limitations.
1.1 Description of Work and Thesis Goals
The objective of this thesis is to design, implement, and evaluate a P2P-based system
to be integrated into PiCsMu to extend and enhance its current functionality. This
primarily includes the introduction of a distributed and decentralized file sharing and
storing mechanism. Since PiCsMu already exists as a prototypical implementation, its
system architecture and code design are analyzed in order to integrate new functions in
the best possible way, minimizing changes to the original PiCsMu implementation.
Furthermore, related work in the field of cloud storage services is investigated and
compared to see how PiCsMu differs in features and how the work of this thesis can make
a contribution. The proposed P2P solution is compared side by side with well-known P2P
file-sharing systems in order to analyze advantages and drawbacks. The examination of
the current implementation of PiCsMu, together with the analysis of related work,
results in a design for the new PiCsMu with a P2P extension. The design is intended to
be modular, in order to make future adaptations (e.g., new features, a different P2P
architecture) possible with minimum effort. In addition, design decisions are based on
how to make best use of the functionality that PiCsMu already offers. The implementation
realizes all aspects of the design and is done in the Java programming language. The
proposed solution is then evaluated to assess the design’s validity and feasibility. The
evaluation process consists of defining different scenarios in which the system’s
functionality and performance are measured, and discussing the results obtained.
Finally, the work as a whole is analyzed, and open issues as well as future work are
presented.
1.2 Thesis Outline
The remainder of this thesis is structured as follows: Chapter 2 presents the
terminology and the related work in the area of cloud services and P2P file-sharing
systems. Chapter 3 presents the design of the proposed system architecture and explains
the system components thoroughly. Chapter 4 focuses on the technical part of the thesis
and explains how the design decisions are implemented into PiCsMu. Chapter 5 presents
and discusses results obtained from the evaluation. Finally, Chapter 6 summarizes and
concludes what was achieved and presents future work.

