\paragraph{The execution time}
Let $T_{i}(n)$ be the time needed to compute one new root value at step 3; it depends on the polynomial degree $n$ and increases with $n$. Computing all $n$ new values in one iteration of step 3 therefore takes $n\,T_{i}(n)$.

Let $T_{j}$ be the time needed to check the convergence of one root value at step 4; the global convergence test of each iteration thus takes $n\,T_{j}$.

Thus, the execution time of steps 3 and 4 in one iteration is:
\begin{equation}
T_{iter}=n\left(T_{i}(n)+T_{j}\right)+O(n).
\end{equation}
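As a concrete instance (the following update rule is an assumption on our part, since the formula above does not fix it): for simultaneous root-finding iterations of the Durand--Kerner or Ehrlich--Aberth type, updating one root involves a product over the $n-1$ other current approximations, so $T_{i}(n)=O(n)$ while $T_{j}=O(1)$, and the per-iteration cost becomes
\begin{equation*}
T_{iter}=n\left(O(n)+O(1)\right)+O(n)=O(n^{2}).
\end{equation*}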
Let $K$ be the number of iterations needed to compute all the roots; the total execution time $T$ is then:

\begin{equation}
\label{eq:T-global}
T=\left[n\left(T_{i}(n)+T_{j}\right)+O(n)\right]\cdot K.
\end{equation}
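To make the cost model concrete, here is a minimal sequential sketch in Python (a hypothetical illustration, not the paper's implementation), assuming a Durand--Kerner-type simultaneous iteration: each of the $n$ updates at step 3 touches the $n-1$ other approximations, so one iteration costs $O(n^2)$ and the total cost is $O(K\,n^2)$, with $K$ the number of iterations until the step 4 test succeeds.

```python
import cmath  # complex arithmetic for polynomial roots

def durand_kerner(coeffs, tol=1e-10, max_iter=1000):
    """Sequential Durand-Kerner sketch (hypothetical, for illustration).

    coeffs: [a_0, ..., a_n] of p(x) = a_0 + a_1 x + ... + a_n x^n.
    Returns (roots, K) where K is the number of iterations performed.
    """
    n = len(coeffs) - 1
    # Classic starting points: powers of a complex number that is
    # neither real nor a root of unity.
    z = [(0.4 + 0.9j) ** k for k in range(n)]
    for k in range(1, max_iter + 1):
        converged = True
        for i in range(n):                        # step 3: n updates, T_i(n) each
            p = sum(c * z[i] ** j for j, c in enumerate(coeffs))
            q = coeffs[-1]
            for j in range(n):                    # product over the n-1 other roots
                if j != i:
                    q *= z[i] - z[j]
            delta = p / q
            z[i] -= delta
            if abs(delta) > tol:                  # step 4: per-root test, cost T_j
                converged = False
        if converged:
            return z, k                           # k plays the role of K
    return z, max_iter

roots, K = durand_kerner([-1, 0, 0, 0, 1])        # p(x) = x^4 - 1
```

The two nested loops mirror the two factors in the formula for $T$: the outer loop runs $K$ times and the inner work is $n\,(T_{i}(n)+T_{j})$ per iteration, which is exactly the structure the GPU parallelization targets.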
The execution time grows with the polynomial degree, which justifies parallelizing these steps in order to reduce the overall execution time. In the following, we explain how we parallelized them on a GPU architecture using the CUDA platform.