computations (CPU and GPU) with communications (GPU transfers and internode
communications). However, we have previously shown that for some parallel
iterative algorithms, it is sometimes even more efficient to use an asynchronous