SafeCloud: Secure and Resilient Cloud Architecture
SafeCloud will re-architect cloud infrastructures to ensure that data transmission, storage, and processing can be (1) partitioned across multiple administrative domains that are unlikely to collude, so that sensitive data is protected by design, and (2) entangled with inter-dependencies that make it impossible for any single domain to tamper with data integrity. These two principles (partitioning and entanglement) are applied holistically across the entire data management stack, from communication to storage and processing.
Users will control the choice of non-colluding domains for partitioning and the trade-offs between entanglement and performance, and thus will have full control over what happens to their data. This will make users less reluctant to manage their personal data online due to privacy concerns, and will deliver concrete benefits to privacy-sensitive online applications such as distributed cloud infrastructures and medical-record storage platforms.
Block placement strategies for fault-resilient distributed tuple spaces: an experimental study
2017-6-19, Barbi, Roberta, Buravlev, Vitaly, Antares Mezzina, Claudio, Schiavoni, Valerio
The tuple space abstraction provides an easy-to-use programming paradigm for distributed applications. Intuitively, it behaves like a distributed shared memory, where applications write and read entries (tuples). When deployed over a wide area network, the tuple space needs to cope efficiently with faults of links and nodes. Erasure coding techniques are increasingly popular for dealing with such catastrophic events, in particular due to their storage efficiency compared to replication. When a client writes a tuple into the system, it is first striped into k blocks and encoded into n > k blocks in a fault-redundant manner. Any k out of the n blocks are then sufficient to reconstruct and read the tuple. This paper presents several strategies for placing those blocks across the set of nodes of a wide area network that together form the tuple space. We present the performance trade-offs of different placement strategies by means of simulations and a Python implementation of a distributed tuple space. Our results reveal important differences in the efficiency of the different strategies, for example in terms of block-fetching latency, and show that having some knowledge of the underlying network graph topology is highly beneficial.
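The write path above (stripe a tuple into k blocks, encode into n > k, then choose n nodes to hold them) can be sketched in Python. The two strategies below, `random` and `latency`, are illustrative assumptions of ours, not the paper's actual placement strategies, and the latency map stands in for the topology knowledge the abstract mentions:

```python
import random

def place_blocks(nodes, latencies, n, strategy="random"):
    """Pick n nodes to hold the n encoded blocks of one tuple.

    nodes: list of node identifiers forming the tuple space
    latencies: dict mapping node -> round-trip latency from the
               writing client (hypothetical topology knowledge)
    strategy: 'random' picks nodes uniformly at random;
              'latency' prefers the closest nodes
    """
    if strategy == "random":
        return random.sample(nodes, n)
    if strategy == "latency":
        # Topology-aware placement: keep blocks on the n closest nodes.
        return sorted(nodes, key=lambda node: latencies[node])[:n]
    raise ValueError(f"unknown strategy: {strategy}")

def nodes_to_read(placed, latencies, k):
    # Any k of the n blocks suffice to decode, so a reader can
    # fetch from the k lowest-latency holders.
    return sorted(placed, key=lambda node: latencies[node])[:k]
```

A latency-aware reader then only contacts the k cheapest of the n holders, which is one way the abstract's observed differences in block-fetching latency can arise.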
Have a Seat on the ErasureBench: Easy Evaluation of Erasure Coding Libraries for Distributed Storage Systems
2016-9-26, Vaucher, Sébastien, Mercier, Hugues, Schiavoni, Valerio
We present ErasureBench, an open-source framework to test and benchmark erasure coding implementations for distributed storage systems under realistic conditions. ErasureBench automatically instantiates and scales a cluster of storage nodes, and can seamlessly leverage existing failure traces. As a first example, we use ErasureBench to compare three coding implementations: a (10,4) Reed-Solomon (RS) code, a (10,6,5) locally repairable code (LRC), and a partition of the data source into ten pieces without error correction. Our experiments show that LRC and RS codes achieve the same repair throughput when used with small storage nodes, since cluster and network management traffic dominate in this regime. With large storage nodes, read and write traffic increases, and our experiments confirm the theoretical and practical trade-offs between the storage overhead and repair bandwidth of RS and LRC codes.
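The storage-overhead/repair-bandwidth trade-off the abstract alludes to can be worked out with simple arithmetic. As an assumption on our part, we read (10,4) RS as 10 data plus 4 parity blocks, and (10,6,5) LRC as 10 data blocks, 6 parity blocks, and repair locality 5 (a lost data block is rebuilt from its local group of 5 blocks); the functions below are an illustrative sketch, not part of ErasureBench:

```python
def storage_overhead(data_blocks, parity_blocks):
    # Raw bytes stored per byte of user data: n / k.
    return (data_blocks + parity_blocks) / data_blocks

def rs_repair_fanin(k):
    # Reed-Solomon must read k surviving blocks to rebuild any one block.
    return k

def lrc_repair_fanin(locality):
    # An LRC rebuilds a lost data block from its local group alone.
    return locality
```

Under these assumptions, RS(10,4) stores 1.4x the user data but reads 10 blocks per repair, while the LRC stores 1.6x but reads only 5: the LRC pays extra storage to halve repair bandwidth, which matters precisely when nodes are large and read/write traffic dominates.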