Antonio Pena

Biography

[Please see related websites and full CV in separate sections at the end of this webpage.]

I am currently a Senior Researcher at Barcelona Supercomputing Center (BSC), Computer Sciences Department. At BSC, I work within the Programming Models group, where I am Activity Leader for the "Accelerators and Communications for HPC" team. I'm the Manager of the BSC/UPC NVIDIA GPU Center of Excellence and a member of the Outreach Working Group. I'm also Teaching and Research Staff at Universitat Politècnica de Catalunya. I'm a Juan de la Cierva Fellow and prospective Marie Curie Fellow. I'm a recipient of the 2017 IEEE TCHPC Award for Excellence for Early Career Researchers in High Performance Computing. My research interests in the area of runtime systems and programming models for high performance computing include resource heterogeneity and communications.

I was previously at Argonne National Laboratory, Mathematics and Computer Science Division, as a Postdoctoral Appointee (2012-2015). I was driving the heterogeneous memory and accelerator computing areas of research within the Programming Models and Runtime Systems group led by Dr. Pavan Balaji, where I was the technical lead of the DMEM and VOCL projects. I was also part of the core MPICH R&D team.

I hold a BS + MS degree in Computer Engineering (2006), and MS and PhD degrees in Advanced Computer Systems (2010, 2013), from Jaume I University of Castellón, Spain. I pursued my doctorate in Advanced Computer Systems, in a joint collaboration between the Universitat Jaume I of Castellón (Spain) and the Universitat Politècnica de València (Spain). My PhD dissertation, titled "GPU Virtualization for High Performance Clusters", was awarded the Cum Laude distinction and, more recently (Sep. 2015), the Extraordinary Doctoral Award from the Jaume I University. This work started the rCUDA project, of which I am the original developer and architect. Later, I acted as the Development Supervisor of the project.

Research

International Journals

  1. A. Castelló, A. J. Peña, K. Sala, P. Balaji, R. Mayo, and V. Beltran, "On the adequacy of lightweight thread approaches for high-level parallel programming models", Future Generation Computer Systems, Elsevier. Accepted.
  2. S. Chandrasekaran and A. J. Peña, "Special issue on topics on heterogeneous computing", Parallel Computing, Elsevier, vol. 68, pp. 1-2, Oct. 2017. Editorial.
  3. A. Castelló, A. J. Peña, R. Mayo, J. Planas, E. S. Quintana-Ortí, and P. Balaji, "Exploring the interoperability of remote GPGPU virtualization using rCUDA and directive-based programming models", Journal of Supercomputing, Springer, June 2016. [online] 10.1007/s11227-016-1791-y.
  4. A. M. Aji, A. J. Peña, P. Balaji, and W. Feng, "MultiCL: Enabling automatic scheduling for task-parallel workloads in OpenCL", Parallel Computing, Elsevier, vol. 58, pp. 37-55, Oct. 2016.
  5. A. J. Peña and P. Balaji, "A data-oriented profiler to assist in data partitioning and distribution for heterogeneous memory in HPC", Parallel Computing, Elsevier, vol. 51, pp. 46-55, Jan. 2016. http://dx.doi.org/10.1016/j.parco.2015.10.006.
  6. C. Reaño, F. Silla, A. Castelló, A. J. Peña, R. Mayo, E. S. Quintana-Ortí, and J. Duato, "Improving the user experience of the rCUDA remote GPU virtualization framework", Concurrency and Computation: Practice and Experience, Wiley, vol. 27, no. 14, pp. 3749-3770, Sep. 2015. DOI: 10.1002/cpe.3409.
  7. A. J. Peña, C. Reaño, F. Silla, R. Mayo, E. S. Quintana-Ortí, and J. Duato, "A complete and efficient CUDA-sharing solution for HPC clusters", Parallel Computing, Elsevier, vol. 40, no. 10, pp. 574-588, Dec. 2014. DOI: 10.1016/j.parco.2014.09.011.

International Conferences

  1. P. Valero-Lara, I. Martínez-Pérez, A. J. Peña, X. Martorell, R. Sirvent, and J. Labarta. "cuHinesBatch: Solving multiple Hines systems on GPUs", in 2nd HBP Student Conference (HBPSC), Ljubljana, Slovenia, Feb. 2018.
  2. P. Valero-Lara, I. Martinez-Perez, R. Sirvent, X. Martorell, and A. J. Peña. "NVIDIA GPUs scalability to solve multiple (batch) tridiagonal systems. Implementation of cuThomasBatch", in 12th International Conference on Parallel Processing and Applied Mathematics (PPAM), Lublin, Poland, Sep. 2017.
  3. H. Servat, A. J. Peña, G. Llort, E. Mercadal, H. C. Hoppe, and J. Labarta. "Automating the application data placement in hybrid memory systems", in IEEE Cluster, Hawaii, USA, Sep. 2017.
  4. A. Castelló, S. Seo, R. Mayo, P. Balaji, E. S. Quintana-Ortí, and A. J. Peña. "GLT: A unified API for lightweight thread libraries", in 23rd International European Conference on Parallel and Distributed Computing (Euro-Par), Santiago de Compostela, Spain, Aug. 2017.
  5. V. Garcia-Flores, E. Ayguade, and A. J. Peña. "Efficient data sharing on heterogeneous systems", in The 46th International Conference on Parallel Processing (ICPP), Bristol, UK, Aug. 2017.
  6. A. Castelló, S. Seo, R. Mayo, P. Balaji, E. S. Quintana-Orti, and A. J. Peña. "GLTO: On the adequacy of lightweight thread approaches for OpenMP implementations", in The 46th International Conference on Parallel Processing (ICPP), Bristol, UK, Aug. 2017.
  7. A. J. Peña, V. Beltran, C. Clauss, and T. Moschny. "Supporting automatic recovery in offloaded distributed programming models through MPI-3 techniques", in International Conference on Supercomputing (ICS), Chicago, USA, June 2017.
  8. P. Valero-Lara, I. Martínez-Pérez, A. J. Peña, X. Martorell, R. Sirvent, and J. Labarta. "cuHinesBatch: Solving multiple Hines systems on GPUs. Human Brain Project", in International Conference on Computational Science (ICCS), Zurich, Switzerland, June 2017.
  9. J. Gómez-Luna, I. El Hajj, L. Chang, V. Garcia-Flores, S. Garcia de Gonzalo, T. B. Jablin, A. J. Peña, and W. Hwu. "Chai: Collaborative heterogeneous applications for integrated-architectures", in IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), San Francisco, USA, Apr. 2017.
  10. V. Garcia, J. Gomez-Luna, T. Grass, A. Rico, E. Ayguade, and A. J. Peña. "Evaluating the effect of last-level cache sharing on integrated GPU-CPU systems with heterogeneous applications", in IEEE International Symposium on Workload Characterization (IISWC), Rhode Island, USA, Sep. 2016.
  11. A. Castelló, A. J. Peña, S. Seo, R. Mayo, P. Balaji, and E. S. Quintana-Orti. "A review of lightweight thread approaches for high performance computing", in IEEE Cluster, Taipei, Taiwan, Sep. 2016.
  12. S. Ghosh, J. Hammond, A. J. Peña, P. Balaji, A. Gebremedhin, and B. Chapman. "One-sided interface for matrix operations using MPI-3 RMA: A case study with Elemental", in International Conference on Parallel Processing (ICPP), Philadelphia, PA, USA, Aug. 2016.
  13. A. J. Peña, W. Bland, and P. Balaji. "VOCL-FT: Introducing techniques for efficient soft error coprocessor recovery", in The International Conference for High Performance Computing, Networking, Storage and Analysis (SC15), Austin, TX, USA, Nov. 2015.
  14. A. Aji, A. J. Peña, P. Balaji, and W. Feng. "Automatic command queue scheduling for task-parallel workloads in OpenCL", in IEEE Cluster, Chicago, IL, USA, Sep. 2015.
  15. A. Castelló, A. J. Peña, R. Mayo, P. Balaji, and E. S. Quintana-Ortí. "Exploring the suitability of remote GPGPU virtualization for the OpenACC programming model using rCUDA", in IEEE Cluster, Chicago, IL, USA, Sep. 2015.
  16. M. Si, A. J. Peña, J. Hammond, P. Balaji, M. Takagi, and Y. Ishikawa. "Casper: An asynchronous progress model for MPI RMA on many-core architectures", in 29th IEEE International Parallel & Distributed Processing Symposium (IPDPS), Hyderabad, India, May 2015.
  17. M. Si, A. J. Peña, J. Hammond, P. Balaji, and Y. Ishikawa. "Scaling NWChem with efficient and portable asynchronous communication in MPI RMA", in The 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), Shenzhen, Guangdong, China, May 2015. Scale Challenge Finalist.
  18. A. J. Peña and P. Balaji. "Toward the efficient use of multiple explicitly managed memory subsystems", in IEEE Cluster, Madrid, Spain, Sep. 2014.
  19. M. Si, A. J. Peña, P. Balaji, M. Takagi, and Y. Ishikawa. "MT-MPI: Multithreaded MPI for many-core environments", in ACM International Conference on Supercomputing (ICS), Munich, Germany, June 2014.
  20. A. Castelló, J. Duato, R. Mayo, A. J. Peña, E. S. Quintana-Ortí, V. Roca, and F. Silla. "On the use of remote GPUs and low-power processors for the acceleration of scientific applications", in The Fourth International Conference on Smart Grids, Green Communications and IT Energy-aware Technologies (ENERGY), Chamonix, France, Apr. 2014. Best Paper.
  21. A. J. Peña, R. G. Correa Carvalho, J. S. Dinan, P. Balaji, R. Thakur, and W. D. Gropp. “Analysis of topology-dependent MPI performance on Gemini networks”, in The Euro MPI Users’ Group Conference (EuroMPI), Madrid, Spain, Sep. 2013.
  22. C. Reaño, F. Silla, R. Mayo, E. S. Quintana-Ortí, J. Duato, and A. J. Peña. "Influence of InfiniBand FDR on the performance of remote GPU virtualization", in IEEE Cluster, Indianapolis, IN, USA, Sep. 2013. Best Technical Paper.
  23. A. J. Peña and S. Alam. “Evaluation of inter- and intra-node data transfer efficiencies between GPU devices and their impact on scalable applications”, in The 13th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), pp. 144-151, Delft, The Netherlands, May 2013.
  24. C. Reaño, A. J. Peña, F. Silla, J. Duato, R. Mayo, and E. S. Quintana-Ortí. “CU2rCU: towards the complete rCUDA remote GPU virtualization and sharing solution”, in Proceedings of the International Conference on High Performance Computing (HiPC), Pune, India, Dec. 2012.
  25. S. Alam, J. Poznanovic, U. Varetto, N. Bianchi, A. J. Peña, and N. Suvanphim. “Early experiences with the Cray XK6 hybrid CPU and GPU MPP platform”, in Cray User Group Conference (CUG), Stuttgart, Germany, Apr. 2012.
  26. J. Duato, J. C. Fernández, R. Mayo, A. J. Peña, E. S. Quintana, and F. Silla. “Enabling CUDA acceleration within virtual machines using rCUDA”, in High Performance Computing Conference (HiPC), Bangalore, India, Dec. 2011.
  27. J. Duato, R. Mayo, A. J. Peña, E. S. Quintana-Ortí, and F. Silla. “Performance of CUDA virtualized remote GPUs in high performance clusters”, in International Conference on Parallel Processing (ICPP), pp. 365-374, Taipei, Taiwan, Sep. 2011.

International Workshops

  1. M. Jordà, P. Valero-Lara, and A. J. Peña. "Convolutional Deep Learning (cuDNN) on NVIDIA GPUs", in Workshop on Optimization and Learning: Challenges and Applications (OLA), Alicante, Spain, Feb. 2018.
  2. S. Iserte, R. Mayo, E. S. Quintana-Orti, V. Beltran, and A. J. Peña, "Efficient scalable computing through flexible applications and adaptive workloads", in Tenth International Workshop on Parallel Programming Models and Systems Software for High-End Computing (P2S2), Bristol, UK, Aug. 2017.
  3. H. Servat, J. Labarta, H. Hoppe, J. Gimenez, and A. J. Peña, "Integrating memory perspective into the BSC performance tools", in Tenth International Workshop on Parallel Programming Models and Systems Software for High-End Computing (P2S2), Bristol, UK, Aug. 2017.
  4. S. Iserte, A. J. Peña, R. Mayo, E. S. Quintana-Ortí, and V. Beltrán, "Dynamic management of resource allocation for OmpSs jobs", in PhD Symposium on Sustainable Ultrascale Computing Systems (NESUS PhD), Timisoara, Romania, Feb. 2016.
  5. A. J. Peña and P. Balaji. "A framework for tracking memory accesses in scientific applications", in 43rd International Conference on Parallel Processing Workshops (ICPP-W), Minneapolis, MN, USA, Sep. 2014.
  6. J. Duato, A. J. Peña, F. Silla, R. Mayo, and E. S. Quintana-Ortí, “rCUDA: reducing the number of GPU-based accelerators in high performance clusters”, in Proceedings of the International Conference on High Performance Computing and Simulation (HPCS), Caen, France, June 2010.
  7. J. Duato, F. D. Igual, R. Mayo, A. J. Peña, E. S. Quintana-Ortí, and F. Silla, “An efficient implementation of GPU virtualization in high performance clusters”, in Euro-Par 2009, Parallel Processing – Workshops, 6043, pp. 385-394, Lecture Notes in Computer Science, Springer-Verlag, 2010.
  8. J. Duato, A. J. Peña, F. Silla, R. Mayo, and E. S. Quintana-Ortí, “Modeling the CUDA remoting virtualization behaviour in high performance networks”, in Workshop on Language, Compiler, and Architecture Support for GPGPU (LCA-GPGPU-I), Bangalore, India, Jan. 2010.
  9. M. F. Dolz, J. C. Fernández, E. S. Quintana-Ortí, R. Mayo, and A. J. Peña. “Research line on power-aware computing by the High Performance Computing and Architectures Group”, in COST Action IC0804 on Energy Efficiency in Large Scale Distributed Systems, pp. 32-36, Toulouse, France, Nov. 2009.

International Posters

  1. P. Valero-Lara, I. Martinez-Perez, A. J. Peña, X. Martorell, R. Sirvent, and J. Labarta, "Simulating the behavior of the human brain on NVIDIA GPUs (Human Brain Project)", in GPU Technology Conference (GTC), Silicon Valley, USA, May 2017.
  2. V. García, J. Gómez-Luna, T. Grass, A. Rico, A. J. Peña, and E. Ayguadé, "Analyzing the effect of last level cache sharing on integrated platforms with fine-grain CPU-GPU collaboration", in GPU Technology Conference Europe (GTC Europe), Amsterdam, Netherlands, Sep. 2016.
  3. A. Castelló, A. J. Peña, S. Seo, R. Mayo, P. Balaji, and E. S. Quintana-Ortí, “On the use of lightweight threads”, in Advanced Computer Architectures and Compilation for Embedded Systems (ACACES), pp. 83-86, HiPEAC Network of Excellence, Fiuggi, Italy, July 2016.
  4. A. J. Peña and P. Balaji, "Understanding data access patterns using object-differentiated memory profiling", in The 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), Shenzhen, Guangdong, China, May 2015.
  5. K. Raffenetti, A. J. Peña, and P. Balaji, "Toward implementing robust support for Portals 4 networks in MPICH", in The 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid), Shenzhen, Guangdong, China, May 2015.
  6. C. Reaño, F. Silla, A. J. Peña, G. Shainer, S. Schultz, A. Castelló, E. S. Quintana-Ortí, and J. Duato, "Boosting the performance of remote GPU virtualization using InfiniBand Connect-IB and PCIe 3.0", in IEEE Cluster, Madrid, Spain, Sep. 2014.
  7. J. Duato, A. J. Peña, F. Silla, R. Mayo, and E. S. Quintana-Ortí, “rCUDA InfiniBand performance”, in International Supercomputing Conference (ISC), Hamburg, Germany, June 2011.
  8. J. Duato, R. Mayo, A. J. Peña, E. S. Quintana-Ortí, and F. Silla. “Network influence on rCUDA”, in Advanced Computer Architectures and Compilation for Embedded Systems (ACACES), pp. 9-12, HiPEAC Network of Excellence, Terrassa (Barcelona), Spain, July 2010.
  9. J. Duato, F. D. Igual, R. Mayo, A. J. Peña, E. S. Quintana-Ortí, and F. Silla, “Virtualized remote GPUs”, in Advanced Computer Architectures and Compilation for Embedded Systems (ACACES), pp. 221-224, HiPEAC Network of Excellence, Terrassa (Barcelona), Spain, July 2009. 
  10. A. J. Peña, and J. Fabregat, “A robust bolid and fireball detection algorithm for all-sky sequential images”, in Meteoroids, Barcelona, Spain, June 2007.

International Oral Communications

  1. A. J. Peña, H. Servat, et al., "Use of the Folding profiler to assist on data distribution for heterogeneous memory systems", in 7th Joint Laboratory for Extreme-Scale Computing Workshop (JLESC), Urbana, USA, June 2017.
  2. A. J. Peña, H. Servat, G. Llort, J. Jiménez, and J. Labarta, “Use of the Folding profiler to assist on data distribution for heterogeneous memory systems”, in 6th Joint Laboratory for Extreme-Scale Computing Workshop (JLESC), Kobe, Japan, Dec. 2016.
  3. A. J. Peña and L. Oden, "Data distribution approaches for heterogeneous memory systems", in 5th Joint Laboratory for Extreme-Scale Computing Workshop (JLESC), Lyon, France, June 2016.
  4. A. J. Peña and H. Servat, "Data placement on heterogeneous memory systems in HPC", in 4th Joint Laboratory for Extreme-Scale Computing Workshop (JLESC), Bonn, Germany, Nov. 2015.
  5. H. Servat and A. J. Peña, "Study the use of the Folding hardware-based profiler to assist on data distribution for heterogeneous memory systems in HPC", in 3rd Joint Laboratory for Extreme-Scale Computing Workshop (JLESC), Barcelona, Spain, June 2015.
  6. A. J. Peña and P. Balaji, "The upcoming era of memory heterogeneity in compute nodes", in 2nd Joint Laboratory for Extreme-Scale Computing Workshop (JLESC), Chicago, IL, Nov. 2014.
  7. A. J. Peña, “Virtualization of accelerators in high performance clusters”, in The International Conference for High Performance Computing, Networking, Storage and Analysis (SC), Dissertation Research Showcase, Salt Lake City, UT, Nov. 2012.

Invited Talks and Keynotes

  1. A. J. Peña, "Programming models and heterogeneity in HPC", Foundation for Research and Technology - Hellas (FORTH), Heraklion, Greece, Nov. 2017.
  2. A. J. Peña, "The nightmare and power of heterogeneity in HPC", in Tenth International Workshop on Parallel Programming Models and Systems Software for High-End Computing (P2S2), Bristol, UK, Aug. 2017. Keynote.
  3. A. J. Peña, "Best GPU code practices combining OpenACC, CUDA, and OmpSs", in Workshop on Open Source Supercomputing (OpenSuCo), Frankfurt, Germany, June 2017.
  4. A. J. Peña, "Toward heterogeneous memory systems for HPC", in Enhancing Software Development for Emerging Platforms Using Algorithms and Performance Tools Minisymposium, SIAM CSE, Salt Lake City, UT, USA, Mar. 2015.
  5. A. J. Peña, “Virtualization of accelerators in high performance clusters”. Argonne National Laboratory, Argonne, IL, USA, Oct. 2012.
  6. A. J. Peña and R. Mayo, “rCUDA 4: GPGPU as a service in HPC clusters”, in HPC Advisory Council Spain Conference, Málaga, Spain, Sep. 2012.
  7. F. Silla and A. J. Peña, “rCUDA, an approach to provide remote access to GPU computational power”, in HPC Advisory Council Switzerland Conference, Lugano, Switzerland, Mar. 2012.
  8. A. J. Peña, “Astroadapt: free software for persons suffering mobile disability”. Astronomical Observatory, University of Valencia, Valencia, Spain, Feb. 2009.
  9. A. J. Peña, “Web-based remote telescope control”. Astronomical Observatory, University of Valencia, Valencia, Spain, Mar. 2009.

Spanish Conferences

  1. S. Iserte, R. Mayo, E. S. Quintana-Orti, V. Beltran, and A. J. Peña. "El camino desde la maleabilidad MPI hasta las cargas de trabajo adaptativas", in XXVIII Jornadas de Paralelismo. Malaga, Spain, Sep. 2017.
  2. A. Castelló, J. Duato, R. Mayo, A. J. Peña, E. S. Quintana-Ortí, V. Roca, and F. Silla. "Acelerando aplicaciones científicas con GPUs remotas y procesadores de bajo consumo", in XXV Jornadas de Paralelismo. Valladolid, Spain, Sep. 2014.
  3. S. Iserte, A. Castelló, A. J. Peña, C. Reaño, J. Prades, F. Silla, R. Mayo, E. S. Quintana-Ortí, and J. Duato. "Extendiendo SLURM con soporte para el uso de GPUs remotas", in XXV Jornadas de Paralelismo. Valladolid, Spain, Sep. 2014.
  4. C. Reaño, A. Castelló, S. Iserte, A. J. Peña, F. Silla, R. Mayo, E. S. Quintana-Ortí, and J. Duato. “Virtualización remota de GPUs: evaluación de soluciones disponibles para CUDA”, in XXIV Jornadas de Paralelismo. Madrid, Spain, Sep. 2013.
  5. S. Iserte, A. Castelló, C. Reaño, A. J. Peña, F. Silla, R. Mayo, E. S. Quintana-Ortí, and J. Duato. “Un planificador de GPUs remotas en clusters HPC”, in XXIV Jornadas de Paralelismo. Madrid, Spain, Sep. 2013.
  6. C. Reaño, A. J. Peña, F. Silla, J. Duato, R. Mayo, and E. S. Quintana-Ortí. “CU2rCU: a CUDA-to-rCUDA converter”, in XXIII Jornadas de Paralelismo, pp. 44-49. Elche, Spain, Sep. 2012.
  7. J. Duato, A. J. Peña, F. Silla, J. C. Fernández, R. Mayo, and E. S. Quintana-Ortí. “A new approach to rCUDA”, XXII Jornadas de Paralelismo, pp. 305-310, La Laguna, Spain, Sep. 2011.
  8. C. Reaño, A. J. Peña, F. Silla, R. Mayo, E. S. Quintana-Ortí, and J. Duato. “rCUDA: Uso concurrente de dispositivos compatibles con CUDA de forma remota. Adaptación a CUDA 4”, in XXII Jornadas de Paralelismo, pp. 311-316, La Laguna, Spain, Sep. 2011.
  9. J. Duato, A. J. Peña, F. Silla, R. Mayo, and E. S. Quintana-Ortí. “rCUDA: a framework to perform remote CUDA calls”, in XXI Jornadas de Paralelismo, pp. 519-526, Valencia, Spain, Sep. 2010.
  10. J. Duato, F. D. Igual, R. Mayo, A. J. Peña, E. S. Quintana-Ortí, and F. Silla, “CUDA remoto para clusters de altas prestaciones”, in II Workshop en Aplicaciones de Nuevas Arquitecturas de Consumo y Altas Prestaciones (ANACAP), Móstoles (Madrid), Spain, Nov. 2009.
  11. J. Duato, A. J. Peña, F. Silla, F. D. Igual, R. Mayo, and E. S. Quintana-Ortí, “Accelerating computing through virtualized remote GPUs”, in XX Jornadas de Paralelismo, pp. 635-639, A Coruña, Spain, Sep. 2009.
  12. A. J. Peña, J. M. Claver, A. Sanjuan, and V. Arnau, “Análisis paralelo de secuencias de ADN mediante el uso de GPU y CUDA”, in Workshop de Aplicaciones de Nuevas Arquitecturas de Consumo y Altas Prestaciones (ANACAP), Móstoles (Madrid), Spain, Nov. 2008.
  13. R. Rodríguez, J. M. Claver, G. Fernández, A. J. Peña, and J. L. Sánchez, “Aceleración de la estimación de movimiento en la codificación H.264/AVC mediante GPUs”, in Workshop de Aplicaciones de Nuevas Arquitecturas de Consumo y Altas Prestaciones (ANACAP), Móstoles (Madrid), Spain, Nov. 2008.

Teaching

  • Computer Structure (Labs). B.S. Computer Science. Technical University of Catalonia, Spain. Spring 2018.
  • Parallel programming with MPI and OpenMP. Universidad Autónoma de Madrid. Nov. 2016 and Nov. 2017.
  • Best GPU code practices combining OpenACC, CUDA, and OmpSs. GPU Technology Conference Europe (GTC Europe), Munich, Germany, Oct. 2017.
  • Multi-GPU Programming and CUDA Interoperability (MPI, OpenACC). PUMPS Summer School, Barcelona, Spain, June 2017.
  • Best GPU Code Practices Combining OpenACC, CUDA, and OmpSs. GPU Technology Conference (GTC), Silicon Valley, USA, May 2017.
  • Introduction to OpenACC (Course). PRACE Advanced Training Centre, Barcelona Supercomputing Center, Spain. Apr. 2017.
  • GPU Programming Models and their Combinations (Course). Universidad de Córdoba, Spain. Apr. 2017.
  • Computer Structure (Labs). B.S. Computer Science. Technical University of Catalonia, Spain. Spring 2017.
  • Parallel programming with MPI and OpenMP (Course). Universidad Autónoma de Madrid, Spain. Nov. 2016.
  • Parallel programming workshop. PRACE Advanced Training Centre, Barcelona Supercomputing Center, Spain. Oct. 2016.
  • Introduction to CUDA programming (Course). PRACE Advanced Training Centre, Barcelona Supercomputing Center, Spain. July 2016.
  • Advanced parallel programming with MPI-3 (Tutorial). Argonne National Laboratory, USA. June 2015.
  • Introduction to MPI (Tutorial). Argonne National Laboratory, USA. June 2015.
  • Web-based remote telescope control (Seminar). Astronomical Observatory, University of Valencia, Spain. Mar. 2009.
  • Astroadapt: Free software for persons suffering mobile disability (Seminar). Astronomical Observatory, University of Valencia, Spain. Feb. 2009.