Thanks to the substantial involvement of many PhD, Master's, and undergraduate engineering students, the DataCloud@work Associate Team made significant progress on all three tasks defined in our initial proposal.
Goals. The goal of this research direction is to enable autonomic storage for cloud services. As a first milestone, we introduce self-management and self-adaptation facilities in BlobSeer. We target several features: automatic management of the replication degree used by the storage (data) providers, automatic load balancing through data migration from overloaded to underloaded data providers, removal of providers with poor communication links or poor performance, and automatic replacement of failed data providers.
Results. We enhanced BlobSeer with self-adaptive features by dynamically changing and maintaining the replication factors of the data. When a specific BLOB is under a heavy load (in terms of read operations), the system automatically increases its replication factor and handles all the necessary data transfers. In contrast, when some data is less (or never) used, its replication factor is transparently reduced. Moreover, we developed a component able to dynamically contract and expand the pool of storage providers based on the system's load, so as to adapt the resource usage to the needs of the clients accessing the data.
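The replication behaviour described above can be illustrated with a minimal sketch. It is not BlobSeer code: the `ReplicationPolicy` class, the `target_replication` function, and all threshold values are hypothetical and chosen only to show the idea of raising the replication factor under heavy read load and lowering it when data is rarely read.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReplicationPolicy:
    # All values are illustrative assumptions, not BlobSeer defaults.
    min_replicas: int = 1
    max_replicas: int = 8
    high_reads_per_replica: float = 100.0  # above this, add a replica
    low_reads_per_replica: float = 10.0    # below this, drop a replica


def target_replication(current: int, reads_per_sec: float,
                       policy: ReplicationPolicy = ReplicationPolicy()) -> int:
    """Return the replication factor a BLOB should move to, one step at a time."""
    if current < 1:
        raise ValueError("a BLOB needs at least one replica")
    load = reads_per_sec / current  # average read load per replica
    if load > policy.high_reads_per_replica:
        return min(current + 1, policy.max_replicas)
    if load < policy.low_reads_per_replica:
        return max(current - 1, policy.min_replicas)
    return current
```

Changing the factor by one step per evaluation is a design choice of this sketch: it keeps the amount of data transferred per adjustment bounded, at the cost of reacting more slowly to sudden load spikes.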
Two Bachelor theses at PUB focused on this task:
Goals. Analysis of the stored user activity logs can reveal the following situations: users breaking existing policies, abnormal client activity, or incorrect client requests. The provider's restrictions must be enforced, so every attempt to break them must be detected. These restrictions can take various forms, for example limiting each client to certain resources or restricting bandwidth during certain time periods. By strictly monitoring client activity, the system can detect when a client's actions fall outside these restrictions, and can then restrict that user's actions or temporarily suspend their access rights.
Results. We have developed a generic Security Management Framework that allows providers of Cloud data management systems to define and enforce complex security policies. This security framework is designed to detect and stop a large array of attacks through an expressive policy description language. We integrated our security framework with BlobSeer and we showed that we can provide a secure environment for data management systems without a significant overhead, while being able to define and detect complex attack scenarios. Moreover, we developed a specific security mechanism which continually monitors and analyzes the client activity and the state of the system to detect security threats, malicious activity or other kinds of intrusions. Through monitoring, the security system defines (and continuously refines) a trust level for each client.
Additionally, we addressed the problem of securely running web services on top of BlobSeer. We provided a secure environment for such services by implementing mechanisms for the authentication and authorization of users, as well as secure data transfers for web services that use BlobSeer as a storage back end.
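The per-client trust level mentioned in the results above could be maintained along the following lines. This is a hedged sketch, not the framework's actual mechanism: the `ClientTrust` class, the event names, the integer 0 to 100 scale, and every penalty and threshold value are assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical event kinds and penalties (on a 0..100 trust scale).
PENALTIES = {"policy_violation": 30, "malformed_request": 10}
RECOVERY = 1          # trust regained per well-behaved event (assumed)
RESTRICT_BELOW = 70   # below this, limit the client's actions (assumed)
SUSPEND_BELOW = 30    # below this, suspend access temporarily (assumed)


@dataclass
class ClientTrust:
    level: int = 100  # new clients start fully trusted in this sketch

    def record(self, event: str) -> None:
        """Refine the trust level after one monitored event."""
        if event in PENALTIES:
            delta = -PENALTIES[event]
        else:
            delta = RECOVERY
        # Clamp the level to the 0..100 range.
        self.level = max(0, min(100, self.level + delta))

    def action(self) -> str:
        """Map the current trust level to an enforcement decision."""
        if self.level < SUSPEND_BELOW:
            return "suspend"
        if self.level < RESTRICT_BELOW:
            return "restrict"
        return "allow"
```

In this sketch a client that commits two policy violations in a row drops from 100 to 40 and is restricted; a third violation brings it to 10 and suspends it, while ordinary requests slowly restore trust.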
Several Master research internships and Bachelor theses at PUB focused on this task:
Goals. This task aims to enable BlobSeer as a storage service for sharing the data of applications running in a Nimbus-enabled IaaS. Two main goals need to be reached. First, we need to design and implement an IaaS client access interface that supports the deployment and management of a BlobSeer instance. Second, we need to design and implement an interface through which applications running inside a VM can access the BlobSeer data-sharing service. This access interface must reach the same BlobSeer instance from within any VM, regardless of the physical machine on which the VM is deployed.
Results. We integrated the BlobSeer distributed storage system with the Nimbus Cloud and made it available as a storage service on the Cloud. We set up a Nimbus environment, installed BlobSeer entities inside virtual machines deployed into the cloud, and implemented a new feature that allows BlobSeer to restart from a consistent state, using a model of incremental checkpoints. We added mechanisms for bringing BlobSeer to a consistent state before stopping it, and for starting, stopping, and restarting BlobSeer inside the Nimbus Cloud while preserving the data it stored during previous runs.
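As a rough illustration of what an incremental checkpoint model involves, the sketch below records only the entries that changed since the previous checkpoint and rebuilds the state by replaying those deltas in order. It is a generic, in-memory example under stated assumptions: the `IncrementalCheckpointer` class and its key/value state are hypothetical and do not reflect how BlobSeer represents or persists its state.

```python
from typing import Dict, List, Optional

# A delta maps each changed key to its new value; None marks a deleted key.
Delta = Dict[str, Optional[bytes]]


class IncrementalCheckpointer:
    def __init__(self) -> None:
        self._deltas: List[Delta] = []    # one delta per checkpoint, in order
        self._last: Dict[str, bytes] = {}  # state as of the last checkpoint

    def checkpoint(self, state: Dict[str, bytes]) -> Delta:
        """Record only what changed since the previous checkpoint."""
        delta: Delta = {k: v for k, v in state.items()
                        if self._last.get(k) != v}
        # Keys present before but absent now are recorded as deletions.
        delta.update({k: None for k in self._last if k not in state})
        self._deltas.append(delta)
        self._last = dict(state)
        return delta

    def restore(self) -> Dict[str, bytes]:
        """Rebuild the latest consistent state by replaying all deltas."""
        state: Dict[str, bytes] = {}
        for delta in self._deltas:
            for key, value in delta.items():
                if value is None:
                    state.pop(key, None)
                else:
                    state[key] = value
        return state
```

A real system would write each delta to stable storage before acknowledging the checkpoint; the in-memory list here only stands in for that log.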
In the context of providing efficient Virtual Machine management for IaaS clouds, we used BlobSeer as a storage system for checkpointing images of the virtual machines. To this end, we stored the virtual-machine instances as binary large objects (BLOBs) using a globally shared namespace built using BlobSeer. Furthermore, we studied live migration of virtual machines between clouds as a way to adapt to resource dynamicity. We used the Nimbus cloud toolkit to manage several virtual clusters and we implemented inter-cloud live virtual machine migration on top of them.
Several PhD and Master's students were involved in sub-tasks of this task: