The present invention relates to hardware acceleration of task dependency management in parallel computing. In particular, it proposes hardware-based dependency management that supports nested tasks, resolves system deadlocks caused by memory-full conditions in the dedicated hardware memory, and relies on synergistic operation of the software runtime and the hardware accelerator to resolve deadlocks that would otherwise be unsolvable when nested tasks are processed. It also introduces buffered asynchronous communication for larger data exchanges, which requires less support from the multi-core processor elements than standard access through those elements. The invention can be implemented as a hardware acceleration processor on the same silicon die as the multi-core processor, yielding gains in performance, lower fabrication cost, and reduced energy consumption during operation.
