Author: Nacho Navarro
Comparison based sorting for systems with multiple GPUs. Proceedings of the 6th Workshop on General Purpose Processor Using Graphics Processing Units 1–11 (2013). doi:10.1145/2458523.2458524
Assessing the impact of network compression on Molecular Dynamics and Finite Element Methods. 14th International Conference on High-Performance Computing and Communications (HPCC-2012) (2012).
BSArc: Blacksmith Streaming Architecture for HPC Accelerators. ACM International Conference on Computing Frontiers (2012).
Counter-Based Power Modeling Methods: Top-Down vs. Bottom-Up. The Computer Journal (2012). doi:10.1093/comjnl/bxs116
Energy accounting for shared virtualized environments under DVFS using PMC-based power models. Future Generation Computer Systems 28, 457–468 (2012).
Hardware-software coherence protocol for the coexistence of caches and local memories. Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis 89:1–89:11 (2012). Available at <http://dl.acm.org/citation.cfm?id=2388996.2389117>
PPMC: Hardware Scheduling and Memory Management Support for Multi Accelerators. 22nd International Conference on Field Programmable Logic and Applications (FPL-2012) (2012).
Assessing Accelerator-based HPC Reverse Time Migration. Transactions on Parallel and Distributed Systems, Special Issue on Accelerators 22(1), 147–162 (2011).
Design space exploration for aggressive core replication schemes in CMPs. Proceedings of the 20th International Symposium on High Performance Distributed Computing 269–270 (2011). doi:10.1145/1996130.1996169
DiDi: Mitigating The Performance Impact of TLB Shootdowns Using A Shared TLB Directory. Parallel Architectures and Compilation Techniques (PACT) (2011).
Energy accounting for shared virtualized environments under DVFS using PMC-based power models. Future Generation Computer Systems 28, 457–468 (2011).
FELI: HW/SW support for On-Chip Distributed Shared Memory in Multicores. Euro-Par (2011).
TARCAD: A template architecture for reconfigurable accelerator designs. IEEE Symposium on Application Specific Processors (SASP) 8–15 (2011).
An Asymmetric Distributed Shared Memory Model for Heterogeneous Parallel Systems. ACM SIGARCH Computer Architecture News - ASPLOS '10 (2010). Available at <http://doi.acm.org/10.1145/1735970.1736059>
Decomposable and Responsive Power Models for Multicore Processors using Performance Counters. (2010). Available at <http://doi.acm.org/10.1145/1810085.1810108>
High-Performance Reverse Time Migration on GPU. XXVIII International Conference of the Chilean Computer Society - XIII Workshop on Parallel and Distributed Systems (WSDP) (2009).
CUBA: an Architecture for Efficient CPU/co-processor Data Communication. 22nd International Conference on Supercomputing (2008).
Memory Management on Chip-MultiProcessors with on-chip Memories. Workshop on the Interaction between Operating Systems and Computer Architecture (WIOSCA'08) 1–7 (2008).
On-Chip memories, the OS perspective. 5th HiPEAC Industrial Workshop (2008).
Strategies for Efficient Exploitation of Loop-Level Parallelism in Java. Concurrency and Computation: Practice and Experience 13(8-9), 663–680 (2001).