AYGUADE PARRA, EDUARD
Analyzing performance improvements and energy savings in Infiniband Architecture using network compression. SBAC-PAD 2014 (2014).
Runtime-Aware Architectures: A First Approach. International Journal on Supercomputing Frontiers and Innovations 1, 29–44 (2014).
Software-Managed Power Reduction in Infiniband Links. ICPP 2014 (2014).
Aeneas: A Tool to Enable Applications to Effectively Use Non-Relational Databases. 2013 International Conference on Computational Science 2561–2564 (2013).
Counter-Based Power Modeling Methods: Top-Down vs. Bottom-Up. Computer Journal 56, 198–213 (2013).
Deadline-based MapReduce Workload Management. IEEE Transactions on Network and Service Management 10, 231–244 (2013).
Enabling Distributed Key-Value Stores with Low Latency-Impact Snapshot Support. 12th IEEE International Symposium on Network Computing and Applications 65–72 (2013).
Implementing OmpSs Support for Regions of Data in Architectures with Multiple Address Spaces. 27th International Conference on Supercomputing (ICS) 359–368 (2013).
Loop Level Speculation in a Task Based Programming Model. IEEE Conference on High Performance Computing (HiPC 2013) 1–10 (2013).
Programmability and portability for exascale: Top down programming methodology and tools with StarSs. Journal of Computational Science 4, 450–456 (2013).
Self-Adaptive OmpSs Tasks in Heterogeneous Environments. The 27th IEEE International Parallel and Distributed Processing Symposium (IPDPS 2013) 138–149 (2013). doi:10.1109/IPDPS.2013.53
Assessing the impact of network compression on Molecular Dynamics and Finite Element Methods. 14th International Conference on High-Performance Computing and Communications (HPCC-2012) (2012).
Autonomic Placement of Mixed Batch and Transactional Workloads. IEEE Transactions on Parallel and Distributed Systems 23, 219–231 (2012).
BSArc: Blacksmith Streaming Architecture for HPC Accelerators. ACM International Conference on Computing Frontiers (2012).
Counter-Based Power Modeling Methods: Top-Down vs. Bottom-Up. The Computer Journal (2012). doi:10.1093/comjnl/bxs116
DMA++: On the Fly Data Realignment for On-Chip Memories. IEEE Transactions on Computers 61, 237–250 (2012).
DMA-circular: an enhanced high level programmable DMA controller for optimized management of on-chip local memories. Proceedings of the 9th conference on Computing Frontiers 113–122 (2012). doi:10.1145/2212908.2212925
Energy accounting for shared virtualized environments under DVFS using PMC-based power models. Future Generation Computer Systems 28, 457–468 (2012).
Hardware-software coherence protocol for the coexistence of caches and local memories. Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis 89:1–89:11 (2012). at <http://dl.acm.org/citation.cfm?id=2388996.2389117>
Applications, Tools and Techniques on the Road to Exascale Computing (IOS Press, 2012).
Programming Multi-Core and Many-Core Computing Systems (Wiley Series on Parallel and Distributed Computing) (John Wiley & Sons, Inc., 2012). at <http://www.par.univie.ac.at/~pllana/manycore_book/>
On the Instrumentation of OpenMP and OmpSs Tasking Constructs. Euro-Par 2012: Parallel Processing Workshops. Lecture Notes in Computer Science 7640, 414–428 (2012).
Integrating Dataflow Abstractions into the Shared Memory Model. 2012 IEEE 24th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD) 243–251 (2012).
Introducing Speculative Optimizations in Task Dataflow with Language Extensions and Runtime Support. 2nd Workshop on Data-Flow Execution Models for Extreme Scale Computing (DFM 2012), In conjunction with 21st International Conference on Parallel Architectures and Compilation Techniques (PACT-2012) (2012).
A methodology for the evaluation of high response time on E-commerce users and sales. Information Systems Frontiers (2012). doi:10.1007/s10796-012-9387-4
OmpSs-OpenCL Programming Model for Heterogeneous Systems. 25th International Workshop on Languages and Compilers for Parallel Computing (LCPC2012) (2012).
PPMC: A Programmable Pattern based Memory Controller. Applied Reconfigurable Computing (ARC) 1–12 (2012).
PPMC: Hardware Scheduling and Memory Management Support for Multi Accelerators. 22nd International Conference on Field Programmable Logic and Applications (FPL-2012) (2012).
Productive Programming of GPU Clusters with OmpSs. 26th IEEE International Parallel and Distributed Processing Symposium (IPDPS 2012) 557–568 (2012). doi:10.1109/IPDPS.2012.58
Supporting Stateful Tasks in a Dataflow Graph. Proceedings of the 21st international conference on Parallel architectures and compilation techniques 435–436 (2012).
A Systematic Methodology to Generate Decomposable and Responsive Power Models for CMPs. IEEE Transactions on Computers PP, 1 (2012).
Transactional Access to Shared Memory in StarSs, a Task Based Programming Model. 18th International Conference, Euro-Par 2012 514–525 (2012).
The Abstract Streaming Machine: Compile-Time Performance Modelling of Stream Programs on Heterogeneous Multiprocessors. Transactions on HiPEAC 5 (2011).
Assessing Accelerator-based HPC Reverse Time Migration. Transactions on Parallel and Distributed Systems, Special Issue on Accelerators 22(1), 147–162 (2011).
A CellBE-based HPC Application for the Analysis of Vulnerabilities in Cryptographic Hash Functions. (2011). at <http://dx.doi.org/10.1109/HPCC.2010.113>
Design space exploration for aggressive core replication schemes in CMPs. Proceedings of the 20th international symposium on High performance distributed computing 269–270 (2011). doi:10.1145/1996130.1996169
Energy accounting for shared virtualized environments under DVFS using PMC-based power models. Future Generation Computer Systems 28, 457–468 (2011).
Implementation of a Reverse Time Migration Kernel using the HCE High Level Synthesis Tool. Field Programmable Technology 1–8 (2011).
Integrating Dataflow Abstractions into Transactional Memory. First Workshop on Systems for Future Multi-Core Architectures (SFMA'11) 1–6 (2011).
Mercurium: Design Decisions for a S2S Compiler. Cetus Users and Compiler Infrastructure Workshop in conjunction with PACT 2011 (2011).
Non-intrusive Estimation of QoS Degradation Impact on E-commerce User Satisfaction. 2011 10th IEEE International Symposium on Network Computing and Applications (NCA) 179–186 (2011).
OmpSs: A Proposal for Programming Heterogeneous Multi-Core Architectures. Parallel Processing Letters 21, 173–193 (2011).
Optimizing the Exploitation of Multicore Processors and GPUs with OpenMP and OpenCL. Lecture Notes in Computer Science 6548, 215–229 (2011).
Poster: programming clusters of GPUs with OmpSs. Proceedings of the international conference on Supercomputing 378–378 (2011). doi:10.1145/1995896.1995961
Productive Cluster Programming with OmpSs. Euro-Par 2011 Parallel Processing 6852, 555–566 (2011).
Reconfigurable Memory Controller with Programmable Pattern Support. 5th HiPEAC Workshop on Reconfigurable Computing (WRC 2011) 55–64 (2011).
Resource-aware Adaptive Scheduling for MapReduce Clusters. ACM/IFIP/USENIX 12th International Middleware Conference (Middleware 2011) (2011).
TARCAD: A template architecture for reconfigurable accelerator designs. IEEE Symposium on Application Specific Processors (SASP) 8–15 (2011).
A Template System for the Efficient Compilation of Domain Abstractions onto Reconfigurable Computers. 5th HiPEAC Workshop on Reconfigurable Computing (WRC 2011) 65–74 (2011).