Computer Architecture Operating System Interface

Overview: 

Multi-core and/or multi-threaded architectures now dominate the market, from embedded systems to supercomputers. However, achieving high performance on these systems has become a complex task: as the number of cores per chip and/or the number of hardware threads per core continues to increase, new challenges arise in scheduling, power, temperature, scalability, analyzability, design complexity, efficiency, throughput, and heterogeneity. Performance is no longer the only important metric; others, such as security, power, total throughput, and Quality of Service, are becoming increasingly important. It seems clear that neither hardware nor software alone can deliver the desired performance while also complying with these constraints. The answer to these challenges lies in hardware-software co-design: Computer Architectures (CA) and Operating Systems (OS) should interact through a well-defined interface, exchanging run-time information, monitoring application progress and needs, and enforcing resource management.

Objectives: 

The Computer Architecture/Operating Systems group focuses mainly on real-time and high-performance computing. Our objectives are:

  • Proposing complexity-effective, low-power processor architectures, with special emphasis on multicore/multithreaded architectures and, in particular, on on-chip resources.
  • Developing tools for evaluating different alternatives at the hardware and software levels. Powerful and trustworthy simulation tools allow us to explore the design space and, hence, to make fair comparisons between different hardware/software designs.
  • Developing methodologies to fairly evaluate different processor designs. Understanding the bottlenecks of current processors is key to making proposals that are of interest to industry.
  • Designing and implementing power- and temperature-aware OS solutions for real-time and high-performance systems (scheduling, load balancing).
  • Improving the interaction between hardware (processor) and software (operating system and runtime environment).
  • Deploying time-analyzable and low-power processor designs for the real-time arena.

Projects/Areas: 

  • Architectures for hard real-time systems: The increasing demand for functionality in current and future real-time embedded systems is driving an increase in processor performance. More powerful processors offer the opportunity to schedule a larger number of applications, potentially co-hosting several safety and non-safety applications on a common platform and providing a better performance/Watt ratio than a single-core solution with similar performance. At the same time, however, developing safety-related real-time embedded systems requires proving that the timing requirements are met. We investigate new processor architectures that allow hard, soft, and non-real-time tasks to execute simultaneously, providing timing predictability for hard real-time tasks and high performance for non-real-time tasks, under the stringent low-power constraints of embedded systems.
  • Probabilistic real-time systems: Aggressive hardware acceleration features such as caches, deep memory hierarchies, and multicore processors need to be employed to respond to the increasing demand for performance and computation power, and to the need to reduce the number and cost of processing units, in Critical Real-Time Embedded (CRTE) systems. Even though most CRTE systems are deployed on comparatively simple and old processor technologies whose temporal behaviour is relatively easy to understand, static analysis and extensive testing efforts (which account for a large proportion of total production time and cost) yield far from perfect results. There have been significant advances in this domain, both in static analysis methods and in hybrid measurement-based methods and testing. However, they cannot keep pace with current hardware trends. As long as current analysis techniques and testing processes are unable to scale up to the challenge, increased hardware complexity will lead to a significant degradation in the quality of the resulting products. Our strategy is to introduce Architectural Design Principles that, by construction, result in temporal behaviour for which statistical independence (or a clearly defined notion of independence) can be assumed, and which therefore enables probabilistic analysis. This is done by moving away from deterministic behaviours towards more randomized ones.
  • Operating Systems and architectures for High-Performance Computing: Operating systems have historically been implemented as independent layers between hardware and applications. User programs communicate with the OS through a set of well-defined system calls, while the OS, in turn, communicates with the underlying architecture through control registers. Except for these interfaces, the three layers are practically independent and oblivious of each other. While this approach worked well in the past, the arrival of multicore/multithreaded architectures poses new challenges in terms of performance, power consumption, and system utilization. In this new scenario, the classic approach may not deliver optimal performance. High-performance systems are especially sensitive to these problems: the hardware, the operating system, and the applications can no longer remain isolated, and should instead communicate and cooperate to achieve high performance with minimal power consumption. The CAOS group addresses some of the problems of modern HPC systems, such as power- and temperature-aware scheduling, dynamic load balancing, resource utilization, and Quality of Service.
  • Low-Power and Complexity-Effective architectures: The evolution of embedded processors is leading to an increasing number of features being integrated into a single chip at lower power. This challenge calls for solutions that reduce the power consumption of those systems and that integrate heterogeneous systems into the same chip to reduce area, power, and delay. Because current techniques for saving power in multicore processors for real-time applications miss significant details, and because different tasks with diverse requirements must run on the same chip, the CAOS group addresses these issues from new perspectives. The group aims to propose new approaches to saving power in multicore processors by smartly controlling the resources of the chip, and to devise new hybrid processor designs capable of running some tasks at high performance in an energy-efficient manner and others reliably at ultra-low power on the same hardware.

Previous projects

  • SOW on POWER5: In this project, IBM and BSC pursued a research collaboration to enable BSC to analyze, understand, and evaluate the behavior of SMT/CMP processor architectures, including but not limited to IBM's POWER5 processor. In particular, we (1) analyzed the interaction between the operating system and the IBM POWER5 processor; (2) studied the effect of the IBM POWER5 hardware prioritization on performance; (3) characterized the SMT/CMP behavior of workloads frequently executed at BSC; and (4) explored the design space of current and future SMT/CMP architectures.
  • Real-time CMT architectures: In this project, BSC and Sun Microsystems Inc. collaborated in the area of Chip Multithreading (CMT) systems, using boards based on the UltraSPARC T1 and T2 processors. In particular, the project focused on (1) task scheduling of low-layer network applications, such as IP forwarding, and (2) analyzing the virtualization capabilities of the UltraSPARC T1 and T2 processors.

PUBLICATIONS AND COMMUNICATIONS

2013

Sazeides Y, Ozer E, Kershaw D, Nikolaou P, Kleanthous M, Abella J. Implicit-Storing and Redundant-Encoding-of-Attribute Information in Error-Correction-Codes. 46th IEEE/ACM International Symposium on Microarchitecture (MICRO). 2013:160–171.
Camarero C, Vallejo E, Martinez C, Moreto M, Beivide R. Task Mapping in Rectangular Twisted Tori. Proceedings of the High Performance Computing Symposium [Internet]. 2013:15:1–15:11. Available from: http://dl.acm.org/citation.cfm?id=2499968.2499983
Luque C, Moreto M, Cazorla FJ, Valero M. Fair CPU Time Accounting in CMP+SMT Processors. ACM Trans. Archit. Code Optim. [Internet]. 2013;9:50:1–50:25. Available from: http://doi.acm.org/10.1145/2400682.2400709
Maric B, Abella J, Valero M. Analyzing the Efficiency of L1 Caches for Reliable Hybrid-Voltage Operation Using EDC Codes. Very Large Scale Integration (VLSI) Systems, IEEE Transactions on. 2013;PP:1-1.
Schulz M, Bhatele A, Belak J, Bremer P-T, Casas M, Bronevetsky G, Gamblin T, Isaacs K, Laguna I, Levine J, et al. Performance Analysis Techniques for the Exascale Co-Design Process. Proceedings of International Conference on Parallel Computing (PARCO). 2013.
