Architectural support for efficient message passing on shared memory multi-cores

Journal 2016