View this notebook on a map
![]() |
|
CS652, Spring 2007 John H. Hartman Department of Computer Science University of Arizona
This is the tentative reading list for CS652, Spring 2007. I haven't read many of these papers so the list is likely to change as the seminar progresses. The non-uniform formatting is because I used Google Notebook to assemble this list, mostly as an experiment. I'm hoping it's easier to assemble and maintain than a regular web page.
Date: 1/16/07 Presenter: Jon
Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications
Abstract
A fundamental problem that confronts peer-to-peer applications is to
efficiently locate the node that stores a particular data item. This
paper presents Chord, a distributed lookup protocol that
addresses this problem. Chord provides support for just one operation:
given a key, it maps the key onto a node. Data location can be easily
implemented on top of Chord by associating a key with each data item,
and storing the key/data item pair at the node
to which the key maps.
Chord adapts efficiently as nodes join
and leave the system,
and can answer queries even if the system is
continuously changing.
Results from theoretical analysis, simulations, and experiments show
that Chord is scalable, with communication cost and the state
maintained by each node scaling logarithmically with the number of
Chord nodes.
In the Proceedings of ACM SIGCOMM 2001, San Deigo, CA, August 2001. (BibTeX entry)
Paper text: PDF,
PS,
gzipped PS |
Date: 1/23/07 Presenter: Mohamed 1
|
|
| A scalable content-addressable network
Sylvia Ratnasamy, Paul Francis, Mark Handley, Richard Karp, Scott Schenker
|
|
August 2001
|
|
ACM SIGCOMM Computer Communication Review , Proceedings of the 2001 conference on Applications, technologies, architectures, and protocols for computer communications SIGCOMM '01,
Volume 31
Issue 4
|
|
Publisher: ACM Press
|
|
|
|
Hash tables - which map "keys" onto "values" - are an essential building block in modern software systems. We believe a similar functionality would be equally valuable to large distributed systems. In this paper, we introduce the concept of a Content-Addressable Network (CAN) as a distributed infrastructure that provides hash table-like functionality on Internet-like scales. The CAN is scalable, fault-tolerant and completely self-organizing, and we demonstrate its scalability, robustness and low
...
|
|
Date: 1/23/07 Presenter: Swami | OpenDHT: a public DHT service and its uses |
| Full text |
Pdf
(536 KB)
|
| Source |
Applications, Technologies, Architectures, and Protocols for Computer Communication
archive
Proceedings of the 2005 conference on Applications, technologies, architectures, and protocols for computer communications
table of contents
Philadelphia, Pennsylvania, USA
Pages: 73 - 84
Year of Publication: 2005
ISBN:1-59593-009-4
Also published in ...
|
|
Authors
|
|
Date: 1/25/07 Presenter: Ravi
|
Towards a Common API for Structured Peer-to-Peer Overlays
Frank Dabek and Ben Zhao and Peter Druschel and John Kubiatowicz and Ion Stoica
Abstract
In this paper, we describe an ongoing effort to define common APIs for
structured peer-to-peer overlays and the key abstractions that can be built
on them. In doing so, we hope to facilitate independent innovation
in overlay protocols, services, and applications, to allow direct
experimental comparisons, and to encourage application development by
third parties. We provide a snapshot of our efforts and discuss open
problems in an effort to solicit feedback from the research community.
In the Proceedings of the
2nd International
Workshop on Peer-to-Peer Systems (IPTPS '03), Berkeley, CA, 2003.
(BibTeX entry)
Paper text: PDF, PS,
gzipped PS |
Date: 1/25/07 Presenter: Manish
Ivy: A Read/Write Peer-to-peer File System
Abstract
Ivy is a multi-user read/write peer-to-peer file system. Ivy has no
centralized or dedicated components, and it provides useful integrity
properties without requiring users to fully trust either the
underlying peer-to-peer storage system or the other users of the file
system.
An Ivy file system consists solely of a set of logs, one log per
participant. Ivy stores its logs in the DHash distributed hash table.
Each participant finds data by consulting all logs, but performs
modifications by appending only to its own log. This arrangement
allows Ivy to maintain meta-data consistency without locking. Ivy
users can choose which other logs to trust, an appropriate arrangement
in a semi-open peer-to-peer system.
Ivy presents applications with a conventional file system interface.
When the underlying network is fully connected, Ivy provides NFS-like
semantics, such as close-to-open consistency. Ivy detects conflicting
modifications made during a partition, and provides relevant version
information to application-specific conflict resolvers. Performance
measurements on a wide-area network show that Ivy is two to three
times slower than NFS.
To appear in the Proceedings of the
5th Symposium on Operating Systems Design and Implementation,
Boston, MA, December 2002. (BibTeX entry)
Paper text: PDF, PS, gzipped PS
Ivy project homepage |
Date: 1/30/07 Presenter: Srini
Shark: Scaling File Servers via Cooperative Caching
Siddhartha Annapureddy, Michael J. Freedman, and David Mazières
In Proc. 2nd USENIX/ACM Symposium on Networked Systems Design and
Implementation
(NSDI '05) Boston, MA, May 2005.
[ ps ]
[ ps.gz ]
[ pdf ]
Slides:
[ pdf ]
Date: 2/6/07 Presenter: Chinmaya
| OceanStore: an architecture for global-scale persistent storage |
| Full text |
Pdf
(1.47 MB)
|
| Source |
ACM SIGPLAN Notices
archive
Volume 35 , Issue 11 (November 2000)
table of contents
Pages: 190 - 201
Year of Publication: 2000
ISSN:0362-1340
|
|
Authors
|
|
John Kubiatowicz
|
University of California, Berkeley
|
|
David Bindel
|
University of California, Berkeley
|
|
Yan Chen
|
University of California, Berkeley
|
|
Steven Czerwinski
|
University of California, Berkeley
|
|
Patrick Eaton
|
University of California, Berkeley
|
|
Dennis Geels
|
University of California, Berkeley
|
|
Ramakrishan Gummadi
|
University of California, Berkeley
|
|
Sean Rhea
|
University of California, Berkeley
|
|
Hakim Weatherspoon
|
University of California, Berkeley
|
|
Westley Weimer
|
University of California, Berkeley
|
|
Chris Wells
|
University of California, Berkeley
|
|
Ben Zhao
|
University of California, Berkeley | |
Date: 2/13/07 Presenter: Anshul SplitStream: High-bandwidth content distribution in a cooperative environment - group of 38 » M Castro, P Druschel, AM Kermarrec, A Nandi, A … - Proceedings of SOSP, 2003 - Springer ... Miguel Castro 1 , Peter Druschel 2 , Anne-Marie Kermarrec 1 , Animesh Nandi 2 , ... relies
on a structured peer-to-peer overlay network called Pastry [21], and on ...
Cited by 136 - Related Articles - Web Search - Resources @ UofA Library - BL Direct
Date: 2/8/07 Presenter: Ameya Scale and Performance in the CoBlitz Large-File Distribution Service
KyoungSoo Park and Vivek S. Pai, Princeton University
Abstract
Scalable distribution of large files has been the area of much
research and commercial interest in the past few years. In this paper,
we describe the CoBlitz system, which efficiently distributes large
files using a content distribution network (CDN) designed for HTTP. As
a result, CoBlitz is able to serve large files without requiring any
modifications to standard Web servers and clients, making it an
interesting option both for end users as well as infrastructure
services. Over the 18 months that CoBlitz and its partner service,
CoDeploy, have been running on PlanetLab, we have had the opportunity
to observe its algorithms in practice, and to evolve its design. These
changes stem not only from observations on its use, but also from a
better understanding of their behavior in real-world conditions. This
utilitarian approach has led us to better understand the effects of
scale, peering policies, replication behavior, and congestion, giving
us new insights into how to better improve their performance. With
these changes, CoBlitz is able to deliver in excess of 1 Gbps on
PlanetLab, and to outperform a range of systems, including research
systems as well as the widely-used BitTorrent.
Date: 2/8/07 Presenter: Robert Scribe: a large-scale and decentralized application-level multicast infrastructure
Castro, M.
Druschel, P.
Kermarrec, A.-M.
Rowstron, A.I.T.
Microsoft Res. Ltd., Cambridge;
This paper appears in: Selected Areas in Communications, IEEE Journal on
Publication Date: Oct 2002
Volume: 20,
Issue: 8
On page(s): 1489- 1499
ISSN: 0733-8716
INSPEC Accession Number: 7414454
Digital Object Identifier: 10.1109/JSAC.2002.803069
Posted online: 2002-12-10 17:12:25.0
|
 |
 |
 |
Abstract
This paper presents Scribe, a scalable application-level multicast infrastructure. Scribe supports large numbers of groups, with a potentially large number of members per group. Scribe is built on top of Pastry, a generic peer-to-peer object location and routing substrate overlayed on the Internet, and leverages Pastry's reliability, self-organization, and locality properties. Pastry is used to create and manage groups and to build efficient multicast trees for the dissemination of messages to each group. Scribe provides best-effort reliability guarantees, and we outline how an application can extend Scribe to provide stronger reliability. Simulation results, based on a realistic network topology model, show that Scribe scales across a wide range of groups and group sizes. Also, it balances the load on the nodes while achieving acceptable delay and link stress when compared with Internet protocol multicast. |
Date: 2/22/07 Presenter: Swami
| Bullet: high bandwidth data dissemination using an overlay mesh |
| Full text |
Pdf
(416 KB)
|
| Source |
ACM Symposium on Operating Systems Principles
archive
Proceedings of the nineteenth ACM symposium on Operating systems principles
table of contents
Bolton Landing, NY, USA
Pages: 282 - 297
Year of Publication: 2003
ISBN:1-58113-757-5
Also published in ...
|
|
Authors
|
|
Date: 2/27/07 Presenter: Ravi
BAR Gossip," H. Li, A. Clement, E. Wong, J. Napper, I. Roy, L. Alvisi, M. Dahlin, Proceedings of the 2006 USENIX Operating Systems Design and Implementation (OSDI), Nov 2006 . pdf bibtex
Date: 2/27/07 Presenter: Ameya
| A digital fountain approach to reliable distribution of bulk data |
| Full text |
Pdf
(1.65 MB)
|
| Source |
ACM SIGCOMM Computer Communication Review
archive
Volume 28 , Issue 4 (October 1998)
table of contents
Pages: 56 - 67
Year of Publication: 1998
ISBN:1-58113-003-1
Also published in ...
|
|
Authors
|
|
John W. Byers
|
UC Berkeley and International Computer Science Institute, Berkeley, California
|
|
Michael Luby
|
International Computer Science Institute, Berkeley, California
|
|
Michael Mitzenmacher
|
Digital Systems Research Center, Palo Alto, California
|
|
Ashutosh Rege
|
International Computer Science Institute, Berkeley, California | |
Date: 3/1/07 Presenter: Waj
Serving DNS Using Chord
Abstract
The current domain name system (DNS) couples ownership
of domains with the responsibility of serving data for them.
The DNS security extensions (DNSSEC) allow verificaton of
records obtained by alternate means, opening exploration
of alternative storage systems for DNS records.
We explore one such alternative using DHash, a peer-to-peer
distributed hash table built on top of Chord.
Our system inherits Chord's fault-tolerance and load balance
properties, at the same time eliminating many administrative
problems with the current DNS.
Still, our system has significantly higher latencies and
other disadvantages in comparison with conventional DNS.
We use this comparison to draw conclusions about
general issues that still need to be addressed in peer-to-peer
systems and distributed hash tables in particular.
In the Proceedings of the
1st International
Workshop on Peer-to-Peer Systems (IPTPS '02), Cambridge, Massachusetts,
March 2002.
(BibTeX entry)
Paper text: PDF, PS,
gzipped PS |
Date: 3/1/07 Presenter: Manish
Corona: A High Performance Publish-Subscribe System for the World
Wide Web
Venugopalan Ramasubramanian, Ryan Peterson, and Emin Gün Sirer,
Cornell University
Abstract
Despite the abundance of frequently changing information, the Web
lacks a publish-subscribe interface for delivering updates to
clients. The use of naïve polling for detecting updates leads to
poor performance and limited scalability as clients do not detect
updates quickly and servers face high loads imposed by active
polling. This paper describes a novel publish-subscribe system
for the Web called Corona, which provides high performance and
scalability through optimal resource allocation. Users register
interest in Web pages through existing instant messaging services.
Corona monitors the subscribed Web pages, detects updates
efficiently by allocating polling load among cooperating peers, and
disseminates updates quickly to users. Allocation of resources
for polling is driven by a distributed optimization engine that
achieves the best update performance without exceeding load limits
on content servers. Large-scale simulations and measurements from
PlanetLab deployment demonstrate that Corona achieves orders of
magnitude improvement in update performance at a modest cost.
Date: 3/6/07 Presenter: Chinmaya
Pastwatch: A Distributed Version Control System
Alexander Yip, Benjie Chen, and Robert Morris, MIT Computer Science and AI Laboratory
Abstract
Pastwatch is a version control system that acts like a
traditional client-server system when users are connected to the
network;
users can see each other's changes immediately after the changes are
committed.
When a user is not connected, Pastwatch
also allows users to read revisions
from the repository, commit new revisions and share
modifications directly between users, all without access to
the central repository. In
contrast, most existing version control systems require connectivity to
a centralized server in order to read or update the repository.
Each Pastwatch user's host keeps its own writable replica
of the repository, including historical revisions. Users can
synchronize their local replicas with each other or with one or more
servers. Synchronization must handle inconsistency between
replicas because users may commit concurrent and conflicting changes
to their local replicas. Pastwatch represents its repository as a
"revtree" data structure which tracks the relationships among these
conflicting changes, including any reconciliation.
The revtree also ensures that the replicas eventually converge to
identical images after sufficient synchronization.
We have implemented Pastwatch and evaluate it in a setting distributed
over North America. We have been using it actively for more than a
year. We show that the system is scalable beyond 190 users per
project and that commit and update operations only take 2-4 seconds.
Currently, five users and six different projects regularly use the
system; they find that the system is easy to use and that the system's
replication has masked several network and storage failures.
Date: 3/20/07 Presenter: Sulu
A performance vs. cost framework for evaluating DHT design tradeoffs under churn
Jinyang Li, Jeremy Stribling, Robert Morris, M. Frans Kaashoek and Thomer M. Gil
Abstract
Protocols for distributed hash tables (DHTs) incorporate features to
achieve low latency for lookup requests in the face of churn,
continuous changes in membership. These protocol features can include a
directed identifier space, parallel lookups,
pro-active flooding of membership changes,
and stabilization protocols for maintaining accurate routing.
In addition, DHT protocols have parameters that can be
tuned to achieve different tradeoffs between lookup latency and
communication cost due to maintenance traffic. The relative importance of the
features and parameters is not well understood, because most previous
work evaluates protocols on static networks.
This paper presents a performance versus cost framework (PVC) that
allows designers to compare the effects of different protocol features
and parameter values. PVC views a protocol as consuming a certain
amount of network bandwidth in order to achieve a certain lookup
latency, and helps reveal the efficiency with which protocols
use additional network resources to improve latency. To demonstrate
the value of PVC, this paper simulates Chord, Kademlia, Kelips,
OneHop, and Tapestry under different workloads and uses PVC to
understand which features are more important under churn. PVC
analysis shows that the key to efficiently using additional bandwidth
is for a protocol to adjust its routing table size. It also
shows that routing table stabilization is wasteful and can be
replaced with opportunistic learning through normal lookup traffic.
These insights combined demonstrate that PVC
is a valuable tool for DHT designers.
In the Proceedings of the
24th Infocom, Miami, FL, 2005.
(BibTeX entry)
Paper text: PDF, PS,
gzipped PS |
Date: 3/8/07 Presenter: Jon
Non-Transitive Connectivity and DHTs
Michael J. Freedman, Karthik Lakshminarayanan, Sean Rhea, and Ion Stoica
In Proc. 2nd Workshop on Real, Large, Distributed Systems
(WORLDS '05) San Francisco, CA, December 2005.
[ ps ]
[ ps.gz ]
[ pdf ]
Date: 3/20/07 Presenter: Prakash
|
Bandwidth-efficient management of DHT routing tables
Jinyang Li, Jeremy Stribling, Robert Morris and M. Frans Kaashoek
Abstract
Today an application developer using a distributed hash table (DHT)
with n nodes must choose a DHT protocol from the spectrum between
O(1) lookup protocols and O(log n)
protocols. O(1) protocols achieve low latency lookups on small
or low-churn networks because lookups take only a few hops, but incur
high maintenance traffic on large or high-churn networks. O(log n)
protocols incur less maintenance traffic on large or high-churn
networks but require more lookup hops in small networks. Accordion is a
new routing protocol that does not force the developer to make this
choice: Accordion adjusts itself to provide the best performance across
a range of network sizes and churn rates while staying within a
bounded bandwidth budget.
The key challenges in the design of Accordion are the algorithms that
choose the routing table's size and content. Each Accordion node learns
of new neighbors opportunistically, in a way that causes the density
of its neighbors to be inversely proportional to their distance in ID
space from the node. This distribution allows Accordion to vary the
table size along a continuum while still guaranteeing at most O(log
n) lookup hops. The user-specified bandwidth budget controls the
rate at which a node learns about new neighbors. Each node limits its
routing table size by evicting neighbors that it judges likely to have
failed. High churn (i.e., short node lifetimes) leads to a high
eviction rate. The equilibrium between the learning and eviction
processes determines the table size.
Simulations show that Accordion maintains an efficient lookup latency versus
bandwidth tradeoff over a wider range of operating conditions
than existing DHTs.
In the Proceedings of the
2nd USENIX Symposium on Networked Systems Design and Implementation (NSDI '05), Boston, MA, 2005.
(BibTeX entry)
Paper text: PDF, PS,
gzipped PS |
Date: 3/27/07 Presenter: John
Robust Incentive Techniques for Peer-to-Peer networks. (PDF)
by Michal Feldman, Kevin Lai, Ion Stoica, and John Chuang.
In the ACM Conference on Electronic Commerce (EC'04), new york, new york, 2004. (BibTeX entry)
Date: 4/3/07 Presenter: Sulu
Experience with an Object Reputation System for Peer-to-Peer Filesharing
Kevin Walsh and Emin Gün Sirer, Cornell University
Abstract
In this paper, we describe Credence, a decentralized
object reputation and ranking system for large-scale peer-to-peer filesharing networks. Credence
counteracts pollution in these networks by allowing honest peers to assess the
authenticity of online content through secure tabulation and
management of endorsements from other peers. Our system enables peers
to learn relationships even in the absence of direct observations or
interactions through a novel, flow-based trust computation to discover
trustworthy peers. We have deployed Credence as an
overlay on top of the Gnutella filesharing network, with more than
10,000 downloads of our client software to date. We describe the system design, our
experience with its deployment, and results from a long-term study of
the trust network built by users. Data from the
live deployment shows that Credence's flow-based trust computation
enables users to avoid
undesirable content. Honest Credence clients can identify
three quarters of the decoys encountered when querying the Gnutella network.
Date: 4/3/07 Presenter: Mohammad Security Considerations for Peer-to-Peer Distributed Hash Tables
Abstract
Recent peer-to-peer research has focused on providing efficient hash
lookup systems that can be used to build more complex systems. These
systems have good properties when their algorithms are executed
correctly but have not generally considered how to handle misbehaving
nodes. This paper looks at what sorts of security problems are
inherent in large peer-to-peer systems based on distributed hash
lookup systems. We examine the types of problems that such systems
might face, drawing examples from existing systems, and propose some
design principles for detecting and preventing these problems.
In the Proceedings of the
1st International
Workshop on Peer-to-Peer Systems (IPTPS '02), Cambridge, Massachusetts,
March 2002.
(BibTeX entry)
Paper text: PDF, PS,
gzipped PS |
Date: 4/12/07 Presenter: Srini
Denial-of-Service Resilience in Peer-to-Peer File Sharing Systems. (PDF)
by A. Kuzmanovic, D. Dumitriu, E. Knightly, I. Stoica, and W. Zwaenepoel.
In the ACM SIGMETRICS'05, 2005. (BibTeX entry)
|