Computer Architecture Operating System Interface

Overview: 

Multi-core and/or multi-threaded architectures are monopolizing the market, from embedded systems to supercomputers. However, achieving high performance with these modern systems has become a complex task: as the number of cores per chip and/or the number of hardware threads per core continue to increase, new challenges arise in terms of scheduling, power, temperature, scalability, analysability, design complexity, efficiency, throughput, heterogeneity, etc. Performance is not the only important metric anymore, and new metrics (such as security, power, total throughput, Quality of Service) are becoming more and more important. It seems clear that neither the hardware nor the software alone can achieve the desired performance and, at the same time, be compliant with these constraints. The answer to these new challenges comes from hardware-software co-design. Computer Architectures and System Software should interact through a well-defined interface, exchanging run-time information, monitoring application progress and needs, and enforcing resource management.

 

Objectives: 

The Computer Architecture/Operating Systems group focuses its research mainly on real-time and high-performance computing. Our objectives are:

  • Deploying time-analysable and low-power processor designs for the real-time arena.
  • Proposing complexity-effective, low-power processor architectures, with special emphasis on multicore/multithreaded architectures and, in particular, on on-chip resources for the high-performance and real-time domains.
  • Developing tools that allow different alternatives to be evaluated at the hardware and software levels. The use of powerful and trustworthy simulation tools allows us to perform design space explorations and, hence, fair comparisons between different hardware/software designs.
  • Developing methodologies to fairly evaluate different processor designs. Understanding the bottlenecks of current processors is a key factor in making proposals that are of interest to industry.
  • Designing and implementing power- and temperature-aware OS solutions for real-time and high-performance systems (scheduling, load balancing).
  • Improving the interaction between hardware (processor) and software (operating system and run time environment).
  • Facilitating the certification of critical real-time embedded systems against safety standards, both in terms of timing and functional correctness.
  • Developing new parallel programming models and run-time systems applicable to both computing domains, i.e. high performance computing (HPC) and real-time embedded computing (EC). By doing so, programs can exploit the performance opportunities of the newest many-core processor architectures and provide timing guarantees, thus achieving a true convergence of HPC and EC.
Projects/Areas: 
  • Architectures for hard real-time systems: The increasing demand for functionality in current and future real-time embedded systems is driving an increase in processor performance. More powerful processors offer the opportunity to schedule a larger number of applications, potentially co-hosting several safety and non-safety applications on a common powerful platform, and provide a better performance/Watt ratio than a single-core solution with similar performance. At the same time, however, when developing safety-related and mission-critical real-time embedded systems, there is a need to prove that the timing requirements are met. We investigate new processor architectures that allow mixed-criticality tasks to run simultaneously, providing timing predictability and high performance under the stringent low-power constraints of embedded systems.
  • Probabilistic real-time systems: Aggressive hardware acceleration features such as caches, deep memory hierarchies, and multicore processors need to be employed to respond to the increasing demand for performance and computation power, and to the need to reduce the number and cost of processing units, in Critical Real-Time Embedded (CRTE) systems. Although most CRTE systems are deployed on comparatively simple and old processor technologies whose temporal behaviour is relatively easy to understand, static analysis and extensive testing efforts (which account for a large portion of total production time and cost) yield far from perfect results. There have been significant advances in this domain, both in static analysis methods and in hybrid measurement-based methods and testing. However, they cannot keep pace with current hardware trends. As long as current analysis techniques and testing processes are unable to scale up to the challenge, increased hardware complexity will lead to a significant degradation of the quality of the resulting products. Our strategy is to introduce Architectural Design Principles that, by construction, result in temporal behaviour for which the hypothesis of statistical independence (or a clear notion of independence) can be made, and which therefore enables probabilistic analysis. This is done by moving away from deterministic behaviours towards more randomised behaviours.
  • Timing Analysis of COTS multicore processors: Many of the current processors used in real-time domains do not have a domain-specific design but rather are commercial off-the-shelf (COTS). In this area we are investigating new ways to provide bounds on task execution times, as well as to provide evidence about the trustworthiness of those bounds on different COTS multicore processors.
  • High-performance, time-predictable parallel applications on multicore processors: The advent of next-generation many-core embedded platforms offers the chance to meet a converging need for predictable high performance coming from both the HPC and EC domains. On one side, new kinds of HPC applications are being required by markets that need huge amounts of information to be processed within a bounded amount of time. On the other side, EC systems are increasingly concerned with providing higher performance in real time, challenging the performance capabilities of current architectures. This converging demand, however, raises the problem of how to guarantee timing requirements in the presence of parallel execution. In this regard, we investigate novel parallel programming models and run-time systems capable of meeting the requirements of both computing domains, i.e. HPC and EC, in a single system.
  • Functional Verification and Certification: Functional certification of CRTE systems against safety standards is a complex, expensive and time-consuming process needed to deploy systems whose failure may have catastrophic consequences, either in terms of human lives (e.g. a plane crash) or in economic terms (e.g. loss of control of a satellite). In this regard, we investigate new methods to verify the robustness of hardware designs in the safety-critical domain in the early stages of the design phase, when verification is cheap and fast. We also devise hardware/software solutions for the early detection of failures, so that the chances of recovering in time are higher.
  • Low-Power and Complexity-Effective architectures: The evolution of embedded processors leads to an increasing number of features being integrated into a single chip at lower power. This challenge requires a set of solutions to reduce the power consumption of those systems and to integrate heterogeneous systems into the same chip to reduce area, power and delay. Given that current techniques to save power in multicore processors for real-time applications miss significant details, and that different tasks with diverse requirements must run on the same chip, the CAOS group addresses these issues from new perspectives. The group aims to propose new approaches to save power in multicore processors by smartly controlling the resources of the chip, as well as to devise new hybrid processor designs capable of running some tasks at high performance in an energy-efficient manner and others reliably at ultra-low power on the same hardware.
Additional Information: 
Past Projects
  • parMERASA (www.parmerasa.eu): The motivation for the parMERASA project was the industry’s demands for new functionality and higher levels of performance of embedded hard real-time systems. The parMERASA project developed a many-core processor architecture that provides a predictable timing behaviour, a suitable system-level software, software design guidelines for parallelising hard real-time applications, and tools for estimating and verifying the timing behaviour of such parallel applications.
  • Multicore Benchmarks (http://microelectronics.esa.int/ngmp/): In this activity, carried out in collaboration with the European Space Agency, the CAOS group analysed the real-time properties of the Aeroflex Gaisler NGMP multicore processor.
  • MERASA (www.merasa.org): MERASA was one of the first European projects to tackle the challenge of enabling multicore processors in real-time domains. As an outcome, MERASA provided a set of processor designs together with the corresponding techniques and tools for static timing analysis of multicore processors.
  • PROARTIS (www.proartis-project.eu): PROARTIS addressed the challenge of analysing the timing behaviour of processors by means of probabilistic techniques. As an outcome, PROARTIS provided a set of processor, compilation and operating system designs, together with timing analysis techniques, that enable probabilistic timing analysis.
  • SOW on POWER5: In this project IBM and BSC pursued a research collaboration to enable BSC to analyse, understand and evaluate the behaviour of SMT/CMP processor architectures, including but not limited to IBM's POWER5 processor. In particular, (1) we analysed the interaction between the operating system and the IBM POWER5 processor; (2) we studied the effect of the IBM POWER5 hardware prioritization on performance; (3) we characterised the SMT/CMP behaviour of workloads frequently executed at BSC; and (4) we explored the design space of current and future SMT/CMP architectures.
  • Real-time CMT architectures: In this project BSC and Sun Microsystems Inc. collaborated in the area of Chip Multithreading (CMT) systems. As CMT systems we used boards based on the UltraSPARC T1 and T2 processors. In particular, the project focused on (1) task scheduling of low-layer network applications, such as IP forwarding, and (2) analysing the virtualization capabilities of the UltraSPARC T1 and T2 processors.

PUBLICATIONS AND COMMUNICATIONS

2014

Jalle J, Kosmidis L, Abella J, Quiñones E, Cazorla FJ. Bus Designs for Time-probabilistic Multicore Processors. Proceedings of the Conference on Design, Automation & Test in Europe [Internet]. 2014:50:1–50:6. Available from: http://dl.acm.org/citation.cfm?id=2616606.2616668
Hernandez C, Abella J. LiVe: Timely Error Detection in Light-Lockstep Safety Critical Systems. Proceedings of the 51st Annual Design Automation Conference [Internet]. 2014:25:1–25:6. Available from: http://doi.acm.org/10.1145/2593069.2593155
Jalle J, Abella J, Quiñones E, Fossati L, Zulianello M, Cazorla FJ. AHRB: A High-Performance Time-Composable AMBA AHB Bus. 20th IEEE Real-Time and Embedded Technology and Applications Symposium. 2014.
Maric B, Abella J, Valero M. Analyzing the Efficiency of L1 Caches for Reliable Hybrid-Voltage Operation Using EDC Codes. IEEE Transactions on Very Large Scale Integration (VLSI) Systems. 2014;22(10):2211–2215.
Panic M, Kehr S, Quiñones E, Boddecker B, Abella J, Cazorla FJ. RunPar: An Allocation Algorithm for Automotive Applications Exploiting Runnable Parallelism in Multicores. Proceedings of the 2014 International Conference on Hardware/Software Codesign and System Synthesis [Internet]. 2014:29:1–29:10. Available from: http://doi.acm.org/10.1145/2656075.2656096

2013

Panic M, Rodríguez G, Quiñones E, Abella J, Cazorla FJ. On-Chip Ring Network Designs for Hard-Real Time Systems. 21st International Conference on Real-Time Networks and Systems. 2013:23–32.
