Publications
Export 342 results:
Sort by: Author Title [ Type
] Year Filters: Author is Mateo Valero [Clear All Filters]
Assessing Accelerator-based HPC Reverse Time Migration. Transactions on Parallel and Distributed Systems, Special Issue on Accelerators 22(1), 147-162 (2011).
A Case for Energy-Aware Accounting and Billing in Large-Scale Computing Facilities Cost Metrics and Design Implications. IEEE Micro (2011).
Characterizing Power and Temperature Behavior of POWER6-Based System. (invited paper). IEEE Journal of Emerging and Selected Topics in Circuits and Systems (2011).
CPU accounting in CMP Processors. (2009).
DIA: A Complexity-Effective Decoding Architecture. IEEE Transactions on Computers 58, 448-462 (2009).
Dynamic Memory Instruction Bypassing. (2004).
Dynamic Memory Interval Test vs. Interprocedural Pointer Analiysis in Multimedia Applications. (2005).
Feasibility of QoS for SMT by Resource Allocation. Lecture Notes in Computer Science (LNCS) 3149/2004, (2004).
FlexDCP: a QoS framework for CMP architectures. ACM SIGOPS Operating System Review, Special Issue on the Interaction among the OS, Compilers, and Multicore Processors 43, 0163-5980 (2009).
FlexDCP: a QoS framework for CMP architectures. ACM Operating Systems Review, Special Issue on the Interaction among the OS, Compilers, and Multicore Processors 43, 86-96 (2010).
A Highly Scalable Parallel Implementation of H.264. Transactions on High-Performance Embedded Architectures and Compilers 4, (2009).
Instruction Fetch Architectures and Code Layout Optimizations. Proceedings of the IEEE 89, 1588-1609 (2001).
A latency conscious SMT branch predictor architecture. International Journal of High Performance Computing and Networking (IJHPCN) 2, 11-21 (2004).
A Low Complexity Fetch Architecture for High Performance Superscalar Processors. ACM Transactions on Architecture and Compiler Optimizations (TACO) 1, 220-245 (2004).
A Low-Complexity Fetch Architecture for High-Performance Superscalar Processors. ACM Transactions on Architecture and Code Optimization 1, 220-245 (2004).
Optimizing Long-Latency-Load-Aware Fetch Policies for SMT Processors. International Journal of High Performance Computing and Networking (IJHPCN) 2, (2004).
Performance Evaluation of Macroblock-level Parallelization of H.264 Decoding on a cc-NUMA Multiprocessor Architecture. Avances en Sistemas e Informática 6, 219-228 (2009).
Predictable Performance in SMT processors: Synergy Between the OS and SMTs. IEEE Transactions on Computers 55, 785-799 (2006).
On the Problem of Evaluating the Performance of Multiprogrammed Workloads. . IEEE Transactions on Computers 59, (2010).
Scalable multicore architectures for long DNA sequence comparison. Concurrency and Computation Practice and Experience 23, (2011).
On the Simulation of Large-scale Architectures Using Multiple Application Abstraction Levels. ACM Transactions on Architecture and Code Optimization 8, 36 (2012).
Software Trace Cache for Commercial Applications. International Journal of Parallel Programming 30, 373-395 (2002).
A Stream Processor Front-end. IEEE Technical Committee on Computer Architecture Newsletter 10-13 (2000).


