increasing the matrix size up to $100^3$ elements, it was necessary to increase the
CPU power by \np[\%]{50} to \np[GFlops]{1.5} to obtain convergence of the algorithm and the same order of efficiency for the asynchronous mode. With this processor power and the inter-cluster network throughput increased to \np[Mbit/s]{50}, a relative gain of $2.5$ is maintained with a
high external precision of \np{E-11} for a matrix size from $110^3$ to $150^3$ side