Voici les éléments 1 - 6 sur 6
- PublicationMétadonnées seulementElastic Scaling of a High-Throughput Content-Based Publish/Subscribe EnginePublish/subscribe (pub/sub) infrastructures running as a service on cloud environments offer simplicity and flexibility for composing distributed applications. Provisioning them appropriately is however challenging. The amount of stored subscriptions and incoming publications varies over time, and the computational cost depends on the nature of the applications and in particular on the filtering operation they require (e.g., content-based vs. topic-based, encrypted vs. non-encrypted filtering). The ability to elastically adapt the amount of resources required to sustain given throughput and delay requirements is key to achieving cost-effectiveness for a pub/sub service running in a cloud environment. In this paper, we present the design and evaluation of an elastic content-based pub/sub system: E-STREAMHUB. Specific contributions of this paper include: (1) a mechanism for dynamic scaling, both out and in, of stateful and stateless pub/sub operators, (2) a local and global elasticity policy enforcer maintaining high system utilization and stable end-to-end latencies, and (3) an evaluation using real-world tick workload from the Frankfurt Stock Exchange and encrypted content-based filtering.
- PublicationMétadonnées seulementExploiting Concurrency in Domain-Specific Data Structures: A Concurrent Order Book and Workload Generator for Online TradingConcurrent programming is essential to exploit parallel processing capabilities of modern multi-core CPUs. While there exist many languages and tools to simplify the development of concurrent programs, they are not always readily applicable to domain-specific problems that rely on complex shared data structures associated with various semantics (e.g., priorities or consistency). In this paper, we explore such a domain-specific application from the financial field, where a data structure—an order book —is used to store and match orders from buyers and sellers arriving at a high rate. This application has interesting characteristics as it exhibits some clear potential for parallelism, but at the same time it is relatively complex and must meet some strict guarantees, notably w.r.t. the ordering of operations. We first present an accurate yet slightly simplified description of the order book problem and describe the challenges in paral- lelizing it. We then introduce several approaches for introducing concurrency in the shared data structure, in increasing order of sophistication starting from lock-based techniques to partially lock-free designs. We propose a comprehensive workload generator for constructing histories of orders according to realistic models from the financial domain. We finally perform an evaluation and comparison of the different concurrent designs.
- PublicationMétadonnées seulementInfrastructure Provisioning for Scalable Content-based Routing: Framework and AnalysisContent-based publish/subscribe is an attractive paradigm for designing large-scale systems, as it decouples producers of information from consumers. This provides extensive flexibility for applications, which can use a modular architecture. Using this architecture, each participant expresses its interest in events by means of filters on the content of those events instead of using pre-established communication channels. However, matching events against filters has a non-negligible processing cost. Scaling the infrastructure with the number of users or events requires appropriate provisioning of resources for each of the operations involved: routing and filtering. In this paper, we propose and describe a generic, modular, and scalable infrastructure for supporting high-performance content-based publish/subscribe. We analyze its properties and show how it dynamically scales in a realistic setting. Our results provide valuable insights into the design and deployment of scalable content-based routing infrastructures.
- PublicationMétadonnées seulementStreamHub: A Massively Parallel Architecture for High-Performance Content-Based Publish/SubscribeBy routing messages based on their content, publish/subscribe (pub/sub) systems remove the need to establish and maintain fixed communication channels. Pub/sub is a natural candidate for designing large-scale systems, composed of applications running in different domains and communicating via middleware solutions deployed on a public cloud. Such pub/sub systems must provide high throughput, filtering thousands of publications per second matched against hundreds of thousands of registered subscriptions with low and predictable delays, and must scale horizontally and vertically. As large-scale application composition may require complex publications and subscriptions representations, pub/sub system designs should not rely on the specific characteristics of a particular filtering scheme for implementing scalability. In this paper, we depart from the use of broker overlays, where each server must support the whole range of operations of a pub/sub service, as well as overlay management and routing functionality. We propose instead a novel and pragmatic tiered approach to obtain high-throughput and scalable pub/sub for clusters and cloud deployments. We separate the three operations involved in pub/sub and leverage their natural potential for parallelization. Our design, named StreamHub, is oblivious to the semantics of subscriptions and publications. It can support any type and number of filtering operations implemented by independent libraries. Experiments on a cluster with up to 384 cores indicate that StreamHub is able to register 150 K subscriptions per second and filter next to 2 K publications against 100 K stored subscriptions, resulting in nearly 400 K notifications sent per second. Comparisons against a broker overlay solution shows an improvement of two orders of magnitude in throughput when using the same number of cores.
- PublicationMétadonnées seulementEfficient and Confidentiality-Preserving Content-Based Publish/Subscribe with PrefilteringContent-based publish/subscribe provides a loosely-coupled and expressive form of communication for large-scale distributed systems. Confidentiality is a major challenge for publish/subscribe middleware deployed over multiple administrative domains. Encrypted matching allows confidentiality-preserving content-based filtering but has high performance overheads. It may also prevent the use of classical optimizations based on subscriptions containment. We propose a support mechanism that reduces the cost of encrypted matching, in the form of a prefiltering operator using Bloom filters and simple randomization techniques. This operator greatly reduces the amount of encrypted subscriptions that must be matched against incoming encrypted publications. It leverages subscription containment information when available, but also ensures that containment confidentiality is preserved otherwise. We propose containment obfuscation techniques and provide a rigorous security analysis of the information leaked by Bloom filters in this case. We conduct a thorough experimental evaluation of prefiltering under a large variety of workloads. Our results indicate that prefiltering is successful at reducing the space of subscriptions to be tested in all cases. We show that while there is a tradeoff between prefiltering efficiency and information leakage when using containment obfuscation, it is practically possible to obtain good prefiltering performance while securing the technique against potential leakages.
- PublicationMétadonnées seulementThrifty Privacy: Efficient Support for Privacy-Preserving Publish/SubscribeContent-based publish/subscribe is an appealing paradigm for building large-scale distributed applications. Such applications are often deployed over multiple administrative domains, some of which may not be trusted. Recent attacks in public clouds indicate that a major concern in untrusted domains is the enforcement of privacy. By routing data based on subscriptions evaluated on the content of publications, publish/subscribe systems can expose critical information to unauthorized parties. Information leakage can be avoided by the means of privacy-preserving filtering, which is supported by several mechanisms for encrypted matching. Unfortunately, all existing approaches have in common a high performance overhead and the difficulty to use classical optimization for content-based filtering such as per-attribute containment. In this paper, we propose a novel mechanism that greatly reduces the cost of supporting privacy-preserving filtering based on encrypted matching operators. It is based on a pre-filtering stage that can be combined with containment graphs, if available. Our experiments indicate that pre-filtering is able to significantly reduce the number of encrypted matching for a variety of workloads, and therefore the costs associated with the cryptographic mechanisms. Furthermore, our analysis shows that the additional data structures used for pre-filtering have very limited impact on the effectiveness of privacy preservation.