Showing items 1 - 2 of 2
  • Publication
    Metadata only
    TOPiCo: Detecting Most Frequent Items from Multiple High-Rate Event Streams
    (ACM, 2015-06-29)
    Matos, Miguel
    Oliveira, Rui
    Systems such as social networks, search engines or trading platforms operate geographically distant sites that continuously generate streams of events at a high rate. Such events can be access logs to web servers, feeds of messages from participants of a social network, or financial data, among others. The ability to detect trends and popularity variations in a timely manner is of paramount importance in such systems. In particular, determining the most popular events across all sites makes it possible to capture the most relevant information in near real-time and to quickly adapt the system to the load. This paper presents TOPiCo, a protocol that computes the most popular events across geo-distributed sites in a low-cost, bandwidth-efficient and timely manner. TOPiCo starts by building the set of most popular events locally at each site. Then, it disseminates only events that have a chance to be among the most popular ones across all sites, significantly reducing the required bandwidth. We give a correctness proof of our algorithm and evaluate TOPiCo using a real-world trace of more than 240 million events spread across 32 sites. Our empirical results show that (i) TOPiCo is timely and cost-efficient for detecting popular events in a large-scale setting, (ii) it adapts dynamically to the distribution of the events, and (iii) our protocol is particularly efficient for skewed distributions.
  • Publication
    Open access
    Topology-aware protocols, tools and applications for large-scale distributed systems
    Large-scale distributed systems offer scalable solutions to the ever-increasing demand for efficient online services. Examples of such services include data dissemination, group and membership management, distributed indexing and storage, and data streaming. The internal mechanisms of these large-scale systems rely on cooperation among thousands of host machines deployed at geographically distant sites. The cooperation is typically implemented by message-passing (MP). Pragmatically speaking, MP consists of the exchange of sequences of bytes through physical and logical routing layers. The physical and logical interconnections between the hosts, i.e., their topology, define the routes of the messages. These topologies consistently affect the routing behaviors of the application-level messages. They expose physical properties (e.g., delays, available bandwidth, loss rate) as well as dynamic characteristics (e.g., number of hops, connectivity, contention on a specific link, failure of the end nodes). The proper design of distributed systems requires taking into account the underlying topologies.
    This thesis presents protocols, tools and applications that consider adapting to the routing topology substrate as a key design aspect for large-scale distributed systems.
    First, we address the problem of creating anonymous and confidential communication channels on large-scale networks. These networks make the design of such confidential communication systems challenging from many perspectives: their scale, the unpredictable crashes of nodes, the inability to establish direct node-to-node communication channels, etc. We present Whisper, a protocol for establishing anonymous and confidential communication channels under such challenging network topology conditions, together with its possible applications.
    Then, we observe the need to easily evaluate distributed systems under varying network topology conditions. Despite the vast literature on the topic, we still lack an integrated tool for topology emulation that is easy to use and scalable, and that features multi-user support, concurrent deployments, non-dedicated access, and platform portability. This thesis contributes SplayNet, an integrated tool to support rapid development and evaluation of distributed systems under different network topology conditions.
    Finally, this thesis presents Brisa and LayStream, respectively a data-dissemination protocol and a video-streaming application. The two share the common goal of providing reliable dissemination for large-scale networks. Brisa efficiently organizes the nodes to quickly react to failures of nodes or of the underlying routing topology. LayStream presents the lessons learnt in supporting a demanding distributed system, such as video streaming, on top of a principled composition of gossip protocols.
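
The TOPiCo abstract above describes a two-step idea: each site first builds its local set of most popular events, then only events that could still rank among the global most popular are exchanged. The listing gives no details of the protocol itself, so the following is only an illustrative sketch of that general idea using a simplified threshold-based scheme in the style of the well-known TPUT approach, not the TOPiCo algorithm. The function name `top_k_across_sites`, the centralized aggregation, and the `tau / m` threshold are all assumptions made for illustration.

```python
from collections import Counter
from typing import List, Tuple


def top_k_across_sites(site_counts: List[Counter], k: int) -> List[Tuple[str, int]]:
    """Exact global top-k from per-site counts (illustrative, TPUT-style).

    Assumption: a single aggregator collects the reports; this is NOT the
    TOPiCo protocol, only a sketch of "local top-k, then filtered exchange".
    """
    m = len(site_counts)

    # Phase 1: every site reports only its local top-k events.
    partial = Counter()
    for counts in site_counts:
        for item, count in counts.most_common(k):
            partial[item] += count

    # The k-th largest partial sum is a lower bound on the k-th global count.
    ranked = sorted(partial.values(), reverse=True)
    tau = ranked[k - 1] if len(ranked) >= k else 0

    # Phase 2: an event whose global count reaches tau must have a local
    # count of at least tau / m on some site, so only those are reported.
    threshold = tau / m
    candidates = set(partial)
    for counts in site_counts:
        candidates.update(item for item, c in counts.items() if c >= threshold)

    # Phase 3: fetch exact totals for the surviving candidates only.
    totals = {item: sum(counts[item] for counts in site_counts)
              for item in candidates}

    # Highest count first; ties broken by event name for determinism.
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


if __name__ == "__main__":
    # "c" is never a local top-1 event, yet it is the global top-1.
    sites = [Counter(a=5, c=4), Counter(b=5, c=4)]
    print(top_k_across_sites(sites, 1))  # [('c', 8)]
```

In the usage example, neither site ranks "c" first locally, but the threshold step still surfaces it, which shows why exchanging only the local winners would not be enough. Events whose local counts fall below the threshold on every site are never sent, which is where the bandwidth saving comes from for skewed distributions.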