Publications

Filters: Author is Mateo Valero
International Conferences
Ramirez, A., Larriba-Pey, J. L. & Valero, M. Trace Cache Redundancy: Red & Blue Traces. Sixth International Symposium on High-Performance Computer Architecture (HPCA'2000) 325-333 (2000).
Rico, A., Ramirez, A. & Valero, M. Trace Filtering of Multithreaded Applications for CMP Memory Simulation. IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS-2013) 134–135 (2013).
Rico, A. et al. Trace-driven simulation of multithreaded applications. 2011 IEEE International Symposium on Performance Analysis of Systems and Software 87–96 (2011).
Jiménez, V. J. et al. Trends and techniques for energy efficient architectures. (2010).
Pericas, M. et al. A two-level Load/Store Queue based on Execution Locality. (2008).
Armejach, A. et al. Using a Reconfigurable L1 Data Cache for Efficient Version Management in Hardware Transactional Memory. Parallel Architectures and Compilation Techniques (PACT) 360–370 (2011).
Hayes, T., Palomar, O., Unsal, O., Cristal, A. & Valero, M. Vector Extensions for Decision Support DBMS Acceleration. The 45th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO45) 166-176 (2012). doi:10.1109/MICRO.2012.24
Pericas, M., Chaves, R., Gaydadjiev, G. N., Valero, M. & Vassiliadis, S. Vectorized AES Core for high-throughput secure environments. (2008).
Salami, E. & Valero, M. A Vector-uSIMD-VLIW Architecture for Multimedia Applications. (2005).
Journal
Paolieri, M., Quiñones, E., Cazorla, F. & Valero, M. An Analyzable Memory Controller for Hard Real-Time CMPs. IEEE Embedded Systems Letters 1, (2009).
Maric, B., Abella, J. & Valero, M. Analyzing the Efficiency of L1 Caches for Reliable Hybrid-Voltage Operation Using EDC Codes. Very Large Scale Integration (VLSI) Systems, IEEE Transactions on 22, 2211-2215 (2014).
Araya-Polo, M. et al. Assessing Accelerator-based HPC Reverse Time Migration. Transactions on Parallel and Distributed Systems, Special Issue on Accelerators 22(1), 147-162 (2011).
Rico, A., Ramirez, A. & Valero, M. Available task-level parallelism on the Cell BE. Scientific Programming 17, 59-76 (2009).
Falcón, A., Stark, J., Ramirez, A., Lai, K. & Valero, M. Better branch prediction through prophet/critic hybrids. IEEE Micro 25, 80-89 (2005).
Jimenez, V. et al. A Case for Energy-Aware Accounting and Billing in Large-Scale Computing Facilities: Cost Metrics and Design Implications. IEEE Micro (2011).
Jimenez, V. et al. Characterizing Power and Temperature Behavior of POWER6-Based System. (invited paper). IEEE Journal of Emerging and Selected Topics in Circuits and Systems (2011).
Luque, C. et al. CPU Accounting for Multicore Processors. IEEE Transactions on Computers 61, 251–264 (2012).
Luque, C. et al. CPU accounting in CMP Processors. (2009).
Santana, O. J., Falcón, A., Ramirez, A. & Valero, M. DIA: A Complexity-Effective Decoding Architecture. IEEE Transactions on Computers 58, 448-462 (2009).
Vidal, J., March, M., Cerdá, L., Corbal, J. & Valero, M. A DRAM/SRAM Memory Scheme for Fast Packet Buffers. (2006).
Moreto, M., Cazorla, F., Ramirez, A. & Valero, M. Dynamic Cache Partitioning Based on the MLP on Cache Misses. Transactions on HiPEAC 3, 1-21 (2008).
Ortega, D., Valero, M. & Ayguadé, E. Dynamic Memory Instruction Bypassing. (2004).
Salami, E. & Valero, M. Dynamic Memory Interval Test vs. Interprocedural Pointer Analysis in Multimedia Applications. (2005).
Santana, O. J., Ramirez, A. & Valero, M. Enlarging Instruction Streams. IEEE Transactions on Computers 56, 1342-1357 (2007).
Moreto, M., Cazorla, F., Ramirez, A. & Valero, M. Explaining Dynamic Cache Partitioning Speed Ups. IEEE Computer Architecture Letters 6, 1-12 (2007).
Luque, C., Moreto, M., Cazorla, F. J. & Valero, M. Fair CPU Time Accounting in CMP+SMT Processors. ACM Trans. Archit. Code Optim. 9, 50:1–50:25 (2013).
Cazorla, F. et al. Feasibility of QoS for SMT by Resource Allocation. Lecture Notes in Computer Science (LNCS) 3149/2004, (2004).
Moreto, M., Cazorla, F., Ramirez, A., Sakellariou, R. & Valero, M. FlexDCP: a QoS framework for CMP architectures. ACM SIGOPS Operating System Review, Special Issue on the Interaction among the OS, Compilers, and Multicore Processors 43, 0163-5980 (2009).
Moreto, M., Cazorla, F., Ramirez, A., Sakellariou, R. & Valero, M. FlexDCP: a QoS framework for CMP architectures. ACM Operating Systems Review, Special Issue on the Interaction among the OS, Compilers, and Multicore Processors 43, 86-96 (2010).
Álvarez, C., Corbal, J. & Valero, M. Fuzzy Memoization for Floating Point Multimedia Applications. (2005).
Liu, Q. et al. Hardware Support for Accurate Per-task Energy Metering in Multicore Systems. ACM Trans. Archit. Code Optim. 10, 34:1–34:27 (2013).
Monreal, T., Viñals, V., González, A. & Valero, M. Hardware Support for Early Register Release. (2005).
Pericas, M., Ayguadé, E., Zalamea, J., Llosa, J. & Valero, M. High Performance and Low Power VLIW for Numerical Applications. (2004).
Azevedo, A. et al. A Highly Scalable Parallel Implementation of H.264. Transactions on High-Performance Embedded Architectures and Compilers 4, (2009).
Salami, E. & Valero, M. Initial Evaluation of Multimedia Extensions on VLIW Architectures. (2004).
Ramirez, A., Larriba-Pey, J. L. & Valero, M. Instruction Fetch Architectures and Code Layout Optimizations. Proceedings of the IEEE 89, 1588-1609 (2001).
Cristal, A. et al. Kilo-instruction Processors: Overcoming the Memory Wall. IEEE Micro 25, 48–57 (2005).
Monreal, T., Viñals, V., González, J., González, A. & Valero, M. Late Allocation and Early Release of Physical Registers. (2004).
Falcón, A., Santana, O. J., Ramirez, A. & Valero, M. A latency conscious SMT branch predictor architecture. International Journal of High Performance Computing and Networking (IJHPCN) 2, 11-21 (2004).
Valero, M., Santana, O. J., Ramirez, A. & Larriba-Pey, J. L. A Low Complexity Fetch Architecture for High Performance Superscalar Processors. ACM Transactions on Architecture and Compiler Optimizations (TACO) 1, 220-245 (2004).
Santana, O. J., Ramirez, A., Larriba-Pey, J. L. & Valero, M. A Low-Complexity Fetch Architecture for High-Performance Superscalar Processors. ACM Transactions on Architecture and Code Optimization 1, 220-245 (2004).
Nesbit, K. J. et al. Multicore Resource Management. IEEE Micro 28, 6-16 (2008).
Cazorla, F., Fernández, E., Ramirez, A. & Valero, M. Optimizing Long-Latency-Load-Aware Fetch Policies for SMT Processors. International Journal of High Performance Computing and Networking (IJHPCN) 2, (2004).
Pericas, M., Ayguadé, E., Zalamea, J., Llosa, J. & Valero, M. Performance and Power Evaluation of Clustered VLIW Processors with Functional Units. (2004).
Álvarez, M. et al. Performance Evaluation of Macroblock-level Parallelization of H.264 Decoding on a cc-NUMA Multiprocessor Architecture. Avances en Sistemas e Informática 6, 219-228 (2009).
Morad, T., Weiser, U., Kolodny, A., Valero, M. & Ayguadé, E. Performance, Power Efficiency and Scalability of Asymmetric Cluster Chip Multiprocessors. (2006).
Pericas, M., Ayguadé, E., Zalamea, J., Llosa, J. & Valero, M. Power and Performance Evaluation of Widened and Clustered VLIW Cores. (2005).
Cazorla, F. et al. Predictable Performance in SMT processors: Synergy Between the OS and SMTs. IEEE Transactions on Computers 55, 785-799 (2006).
Cazorla, F., Pajuelo, A., Santana, O. J., Fernandez, E. & Valero, M. On the Problem of Evaluating the Performance of Multiprogrammed Workloads. IEEE Transactions on Computers 59, (2010).
Subotic, V. et al. Programmability and portability for exascale: Top down programming methodology and tools with StarSs. Journal of Computational Science 4, 450–456 (2013).
