Storage systems

Overview: 


Storage for large installations (i.e., large clusters or the Grid) is one of the key issues in current computing, as far as performance and usability are concerned. On the one hand, the size of data grows constantly, and large systems have to be ready to store this huge amount of data. On the other hand, access to this data has to be efficient. In addition, storage systems should offer easy mechanisms to store, retrieve, and manage this data. None of these issues has found a satisfactory solution in today's systems, and thus BSC-CNS works towards better solutions to them.

Objectives: 
  • Study how applications use storage. Most of the research done on file systems is based on assumptions about storage use that were studied at least a decade ago. We work on updating this information and either show how the behavior has changed or prove that these older studies are still valid for today's applications.
  • Find mechanisms to make storage more efficient for applications. To make storage more efficient, we study the scalability and usability problems that appear in large installations and propose solutions based mainly on autonomic storage, parallel data access, and the proper use of heterogeneity.
Projects/Areas: 

The team structures its research activities around three areas:

  • I/O for wide-area systems: In this area we work on mechanisms that simplify access to data in grid systems and make these accesses much more efficient.
  • Storage-system scalability: In this area we study the problems found in cluster file systems when running on large cluster installations with thousands of clients.
  • File systems: In this area we investigate how to improve file systems, not from the scalability point of view, but by adapting them to the new needs raised by new access paradigms.

In addition to these three areas, BSC-CNS has the mission of cooperating with other institutions. For this reason, the team also carries out research in the following areas together with external partners:

  • Disk caching and scheduling. In this project we investigate the relationship between disk scheduling, the segments in the disk cache, and host memory caches, with the aim of improving the performance of disk accesses through tight cooperation between these three levels. Furthermore, we investigate how simple disk modeling can also help in this task. This work is done in cooperation with the Universidad de Murcia.
  • Deduplication at supercomputing centers. Deduplication is widely used in cloud storage datacenters and in archival systems, but not much work has been done to see whether this technique could be useful in supercomputing centers. BSC, in cooperation with the Paderborn Parallel Computing Center (PC2), is carrying out such a study.
