BRISA: Combining Efficiency and Reliability in Epidemic Data Dissemination
2012-5-21, Matos, Miguel, Schiavoni, Valerio, Felber, Pascal, Oliveira, Rui, Rivière, Etienne
There is an increasing demand for efficient and robust systems able to cope with today's global needs for intensive data dissemination, e.g., media content or news feeds. Unfortunately, traditional approaches tend to focus on one end of the efficiency/robustness design spectrum, by either leveraging rigid structures such as trees to achieve efficient distribution, or using loosely-coupled epidemic protocols to obtain robustness. In this paper we present BRISA, a hybrid approach combining the robustness of epidemic-based dissemination with the efficiency of tree-based structured approaches. This is achieved by having dissemination structures such as trees implicitly emerge from an underlying epidemic substrate by a judicious selection of links. These links are chosen with local knowledge only and in such a way that the completeness of data dissemination is not compromised, i.e., the resulting structure covers all nodes. Failures are treated as an integral part of the system as the dissemination structures can be promptly compensated and repaired thanks to the underlying epidemic substrate. Besides presenting the protocol design, we conduct an extensive evaluation in a real environment, analyzing the effectiveness of the structure creation mechanism and its robustness under faults and churn. Results confirm BRISA as an efficient and robust approach to data dissemination in the large scale.
Lightweight, Efficient, Robust Epidemic Dissemination
2013-1-13, Matos, Miguel, Schiavoni, Valerio, Felber, Pascal, Oliveira, Rui, Rivière, Etienne
Gossip-based protocols provide a simple, scalable, and robust way to disseminate messages in large-scale systems. In such protocols, messages are spread in an epidemic manner. Gossiping may take place between nodes using push, pull, or a combination. Push-based systems achieve reasonable latency and high resilience to failures but may impose an unnecessarily large redundancy and overhead on the system. At the other extreme, pull-based protocols impose a lower overhead on the network at the price of increased latencies. A few hybrid approaches have been proposed—typically pushing control messages and pulling data—to avoid the redundancy of high-volume content and single-source streams. Yet, to the best of our knowledge, no other system intermingles push and pull in a multiple-senders scenario, in such a way that data messages of one help in carrying control messages of the other and in adaptively adjusting its rate of operation, further reducing overall cost and improving both on delays and robustness. In this paper, we propose an efficient generic push-pull dissemination protocol, Pulp, which combines the best of both worlds. Pulp exploits the efficiency of push approaches, while limiting redundant messages and therefore imposing a low overhead, as pull protocols do. Pulp leverages the dissemination of multiple messages from diverse sources: by exploiting the push phase of messages to transmit information about other disseminations, Pulp enables an efficient pulling of other messages, which themselves help in turn with the dissemination of pending messages. We deployed Pulp on a cluster and on PlanetLab. Our results demonstrate that Pulp achieves an appealing trade-off between coverage, message redundancy, and propagation delay.
TOPiCo: Detecting Most Frequent Items from Multiple High-Rate Event Streams
2015-6-29, Schiavoni, Valerio, Rivière, Etienne, Sutra, Pierre, Felber, Pascal, Matos, Miguel, Oliveira, Rui
Systems such as social networks, search engines or trading platforms operate geographically distant sites that continu- ously generate streams of events at high-rate. Such events can be access logs to web servers, feeds of messages from participants of a social network, or financial data, among others. The ability to timely detect trends and popularity variations is of paramount importance in such systems. In particular, determining what are the most popular events across all sites allows to capture the most relevant informa- tion in near real-time and quickly adapt the system to the load. This paper presents TOPiCo, a protocol that com- putes the most popular events across geo-distributed sites in a low cost, bandwidth-efficient and timely manner. TOPiCo starts by building the set of most popular events locally at each site. Then, it disseminates only events that have a chance to be among the most popular ones across all sites, significantly reducing the required bandwidth. We give a correctness proof of our algorithm and evaluate TOPiCo using a real-world trace of more than 240 million events spread across 32 sites. Our empirical results shows that (i) TOPiCo is timely and cost-efficient for detecting popular events in a large-scale setting, (ii) it adapts dynamically to the distribution of the events, and (iii) our protocol is particularly efficient for skewed distributions.
Have a Seat on the ErasureBench: Easy Evaluation of Erasure Coding Libraries for Distributed Storage Systems
2016-9-26, Vaucher, Sébastien, Mercier, Hugues, Schiavoni, Valerio
We present ErasureBench, an open-source framework to test and benchmark erasure coding implementations for distributed storage systems under realistic conditions. ErasureBench automatically instantiates and scales a cluster of storage nodes, and can seamlessly leverage existing failure traces. As a first example, we use ErasureBench to compare three coding implementations: a (10,4) Reed-Solomon (RS) code, a (10,6,5) locally repairable code (LRC), and a partition of the data source in ten pieces without error-correction. Our experiments show that LRC and RS codes require the same repair throughput when used with small storage nodes, since cluster and network management traffic dominate at this regime. With large storage nodes, read and write traffic increases and our experiments confirm the theoretical and practical tradeoffs between the storage overhead and repair bandwidth of RS and LRC codes.
Evaluating the Cost and Robustness of Self-organizing Distributed Hash Tables
2016-6-5, Krasikova, Sveta, Carvajal Gómez, Raziel, Borba Ribeiro, Heverson, Rivière, Etienne, Schiavoni, Valerio
Self-organizing construction principles are a natural fit for large-scale distributed system in unpredictable deployment environments. These principles allow a system to systematically converge to a global state by means of simple, uncoordinated actions by individual peers. Indexing services based on the distributed hash table (DHT) abstraction have been established as a solid foundation for large-scale distributed applications. For most DHTs, the creation and maintenance of the overlay structure relies on the exploration and update of an already stabilized structure. We evaluate in this paper the practical interest of self-organizing principles, and in particular gossip-based overlay construction protocols, to bootstrap and maintain various DHT implementations. Based on the seminal work on T-Chord, a self-organizing version of Chord using the T-Man overlay construction service, we contribute three additional self-organizing DHTs: T-Pastry, T-Kademlia and T-Kelips. We conduct an experimental evaluation of the cost and performance of each of these designs using a prototype implementation. Our conclusion is that, while providing equivalent performance in a stabilized system, self-organizing DHTs are able to sustain and recover from higher level of churn than their explicitly-created counterparts, and should therefore be considered as a method of choice for deploying robust indexing layers in adverse environments.