Cell Superscalar
OVERVIEW

Cell Superscalar (CellSs) addresses the automatic exploitation of the functional parallelism of a sequential program across the different processing elements of the Cell BE architecture. The focus is on the simplicity and flexibility of the programming model. Based on simple annotations of the source code, a source-to-source compiler generates the necessary code, and a runtime library exploits the existing parallelism by building a task dependency graph at run time. The runtime takes care of task scheduling and data handling between the different processors of this heterogeneous architecture. In addition, locality-aware task scheduling has been implemented to reduce the overhead of data transfers.

OBJECTIVES

In this project we focus on the Cell BE processor, composed of a 64-bit multithreaded PowerPC processor element (PPE) and eight synergistic processor elements (SPEs). Besides the complexity of dealing with several processors, programming this architecture is made harder by the lack of coherence between the PPE main memory and the local memories of the SPEs. Data transfers from main memory to the small (only 256 KB) local memories of the SPEs must be explicitly programmed in the user application.

To address this, we propose the Cell Superscalar framework (CellSs), which is based on a source-to-source compiler and a runtime library. The supported programming model allows programmers to write sequential applications; the framework exploits the existing concurrency and uses the different components of the Cell BE (PPE and SPEs) by means of automatic parallelization at execution time. The only requirement we place on the programmer is that annotations (somewhat similar to OpenMP ones) are written before the declaration of some of the functions used in the application. As in OpenMP, an annotation (or directive) before a piece of code indicates that this part of the code will be executed on the SPEs.
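The model described so far (annotated functions called from a sequential program, with a task dependency graph built at run time) can be illustrated with a small sketch. It is only a Python analogy under stated assumptions: CellSs itself annotates C code, and the names `Runtime`, `task`, `waves` and `build_example` are hypothetical and not part of CellSs. The sketch records each call as a task, derives ordering constraints from the data each call reads and writes, and groups tasks that could run at the same time. It conservatively keeps read-after-write, write-after-write and write-after-read orderings.

```python
from dataclasses import dataclass, field


@dataclass
class TaskInstance:
    """One call of an annotated function: a node of the dependency graph."""
    ident: int
    name: str
    deps: set = field(default_factory=set)  # ids of tasks that must finish first


class Runtime:
    """Hypothetical stand-in for a runtime library; it records tasks only."""

    def __init__(self):
        self.tasks = []
        self.last_writer = {}  # datum name -> id of the last task writing it
        self.readers = {}      # datum name -> ids reading it since that write

    def submit(self, name, reads, writes):
        task = TaskInstance(len(self.tasks), name)
        for datum in reads:                      # read-after-write
            if datum in self.last_writer:
                task.deps.add(self.last_writer[datum])
        for datum in writes:
            if datum in self.last_writer:        # write-after-write
                task.deps.add(self.last_writer[datum])
            task.deps.update(self.readers.get(datum, ()))  # write-after-read
        for datum in reads:
            self.readers.setdefault(datum, []).append(task.ident)
        for datum in writes:
            self.last_writer[datum] = task.ident
            self.readers[datum] = []
        self.tasks.append(task)
        return task


def task(runtime, inputs=(), outputs=()):
    """Hypothetical annotation: a call records a task instead of running it."""
    def wrap(func):
        def call(**data):
            reads = [data[p] for p in inputs]
            writes = [data[p] for p in outputs]
            return runtime.submit(func.__name__, reads, writes)
        return call
    return wrap


def waves(runtime):
    """Group task ids into waves whose members do not depend on each other."""
    level = {}
    for t in runtime.tasks:  # submission order, so every dep is already placed
        level[t.ident] = 1 + max((level[d] for d in t.deps), default=-1)
    grouped = [[] for _ in range(max(level.values(), default=-1) + 1)]
    for ident, lvl in level.items():
        grouped[lvl].append(ident)
    return grouped


def build_example():
    rt = Runtime()

    @task(rt, inputs=("src",), outputs=("dst",))
    def scale(src, dst):
        pass  # body would run on a worker; omitted in this sketch

    @task(rt, inputs=("lhs", "rhs"), outputs=("dst",))
    def add(lhs, rhs, dst):
        pass

    # The "sequential program": plain calls in program order.
    scale(src="a", dst="b")          # task 0
    scale(src="c", dst="d")          # task 1, independent of task 0
    add(lhs="b", rhs="d", dst="e")   # task 2, needs the results of 0 and 1
    scale(src="e", dst="a")          # task 3, reads e; overwrites a read by 0
    return rt
```

Calling `waves(build_example())` places tasks 0 and 1 in the same group, because neither touches data the other writes, while tasks 2 and 3 each wait for earlier results. In CellSs itself, the annotated part is C code offloaded to the SPEs rather than a Python callable.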
The annotated code is separated from the main code, and a manager program running on the SPEs is able to call it. An annotation before a function does not indicate a parallel region; it only indicates that the function can be run on an SPE. To exploit the parallelism, the CellSs runtime builds a data dependency graph in which each node represents an instance of an annotated function and edges between nodes denote data dependencies. From this graph, the runtime can schedule independent nodes for execution on different SPEs at the same time. Techniques imported from computer architecture, such as data dependency analysis, data renaming and data locality exploitation, are applied to increase the performance of the application.

In summary, we focus on offering tools that enable a flexible, high-level programming model for the Cell BE, while relying on the Octopiler, or other compilers that may appear, for code SIMDization and other lower-level code optimizations.

OTHER INFORMATION

Talks
Contact: rosa.m.badia.at.bsc.es
Barcelona Supercomputing Center, 2010