+\begin{proof}
+ Let $X=(S,E)$, $\varepsilon>0$, and $k_0 = \lfloor -\log_{10}(\varepsilon) \rfloor + 1$.
+Any point $X'=(S',E')$ such that $E'=E$ and $\forall k \leqslant k_0, S'^k=S^k$,
+is in the open ball $\mathcal{B}\left(X,\varepsilon\right)$. Let us define
+$\check{X} = \left(\check{S},\check{E}\right) = G_f^{k_0}(X)$.
+We denote by $s\subset \llbracket 1; \mathsf{N} \rrbracket$ the set of coordinates
+that are different between $\check{E}$ and the state of $Y$. Thus each point $X'$ of
+the form $(S',E')$ where $E'=E$ and $S'$ starts with
+$(S^0, S^1, \hdots, S^{k_0},s,\hdots)$, satisfies the following properties:
+\begin{itemize}
+ \item $X'$ is in $\mathcal{B}\left(X,\varepsilon\right)$,
+ \item the state of $G_f^{k_0+1}(X')$ is the state of $Y$.
+\end{itemize}
+Finally, the point $\left(\left(S^0, S^1, \hdots, S^{k_0},s,s^0, s^1, \hdots\right); E\right)$,
+where $(s^0,s^1, \hdots)$ is the strategy of $Y$, satisfies the properties
+claimed in the lemma.
+\end{proof}
+
+We can now prove Theorem~\ref{t:chaos des general}.
+
+\begin{proof}[Proof of Theorem~\ref{t:chaos des general}]
+Firstly, strong transitivity implies transitivity.
+
+Let $(S,E) \in\mathcal{X}$ and $\varepsilon >0$. To
+prove that $G_f$ is regular, it is sufficient to prove that
+there exists a strategy $\tilde S$ such that the distance between
+$(\tilde S,E)$ and $(S,E)$ is less than $\varepsilon$, and such that
+$(\tilde S,E)$ is a periodic point.
+
+Let $t_1=\lfloor-\log_{10}(\varepsilon)\rfloor$, and let $E'$ be the
+configuration that we obtain from $(S,E)$ after $t_1$ iterations of
+$G_f$. As $G_f$ is strongly transitive, there exists a strategy $S'$
+and $t_2\in\mathds{N}$ such
+that $E$ is reached from $(S',E')$ after $t_2$ iterations of $G_f$.
+
+Consider the strategy $\tilde S$ that alternates the first $t_1$ terms
+of $S$ and the first $t_2$ terms of $S'$:
+\begin{align*}
+\tilde S=(&S_0,\dots,S_{t_1-1},S'_0,\dots,S'_{t_2-1},\\
+&S_0,\dots,S_{t_1-1},S'_0,\dots,S'_{t_2-1},S_0,\dots).
+\end{align*}
+It is clear that $(\tilde S,E)$ is recovered from itself after
+$t_1+t_2$ iterations of $G_f$, so $(\tilde S,E)$ is a periodic
+point. Since $\tilde S_t=S_t$ for $t<t_1$, by the choice of $t_1$ we
+have $d((S,E),(\tilde S,E))<\varepsilon$.
+\end{proof}
+
+
+\begin{color}{red}
+\section{Statistical Improvements Using Chaotic Iterations}
+
+\label{The generation of pseudo-random sequence}
+
+
+Let us now explain why we have reasonable grounds to believe that chaos
+can improve statistical properties.
+We will show in this section that, when mixing defective PRNGs with
+chaotic iterations, the resulting generator presents better statistical
+properties (this section summarizes the work of~\cite{bfg12a:ip}).
+
+\subsection{Details of some Existing Generators}
+
+The list of defective PRNGs we will use
+as inputs for the statistical tests to come is introduced here.
+
+Firstly, simple linear congruential generators (LCGs) will be used.
+They are defined by the following recurrence:
+\begin{equation}
+x^n = (ax^{n-1} + c)~mod~m
+\label{LCG}
+\end{equation}
+where $a$, $c$, and $x^0$ must be, among other things, non-negative and less than
+$m$~\cite{LEcuyerS07}. In what follows, 2LCGs and 3LCGs refer to combinations of
+two (resp. three) such LCGs. For further details, see~\cite{bfg12a:ip,combined_lcg}.
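+As an illustration, a minimal C sketch of the recurrence of Eq.~\ref{LCG} is
+given below; the parameter values are purely illustrative and are not the ones
+used in the experiments reported below.
+
+\begin{small}
+\begin{lstlisting}[language=C]
+/* Minimal LCG sketch: x^n = (a x^{n-1} + c) mod m.
+   The constants below are illustrative only.       */
+static unsigned long long lcg_x = 12345;            /* the seed x^0 */
+
+unsigned long long lcg() {
+  const unsigned long long a = 1103515245, c = 12345,
+                           m = 2147483648ULL;       /* m = 2^31 */
+  lcg_x = (a * lcg_x + c) % m;
+  return lcg_x;
+}
+\end{lstlisting}
+\end{small}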
+
+Secondly, the multiple recursive generators (MRGs) will be used too, which
+are based on a linear recurrence of order
+$k$, modulo $m$~\cite{LEcuyerS07}:
+\begin{equation}
+x^n = (a^1x^{n-1}+~...~+a^kx^{n-k})~mod~m
+\label{MRG}
+\end{equation}
+A combination of two MRGs (referred to as MRG2 in the tables below) is also used in these experiments.
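+A hedged C sketch of an order-$3$ MRG following Eq.~\ref{MRG} is given below;
+the coefficients and the modulus are illustrative values only.
+
+\begin{small}
+\begin{lstlisting}[language=C]
+/* MRG sketch of order k = 3:
+   x^n = (a^1 x^{n-1} + a^2 x^{n-2} + a^3 x^{n-3}) mod m. */
+static long long mrg_x[3] = {12345, 67890, 13579};  /* x^{n-1}, x^{n-2}, x^{n-3} */
+
+long long mrg() {
+  const long long a1 = 1071064, a2 = 0, a3 = 2113664,
+                  m  = 2147483647LL;                 /* m = 2^31 - 1 */
+  long long xn = (a1*mrg_x[0] + a2*mrg_x[1] + a3*mrg_x[2]) % m;
+  mrg_x[2] = mrg_x[1]; mrg_x[1] = mrg_x[0]; mrg_x[0] = xn;
+  return xn;
+}
+\end{lstlisting}
+\end{small}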
+
+Generators based on linear recurrences with carry will be considered too.
+This family of generators includes the add-with-carry (AWC) generator, based on the recurrence:
+\begin{equation}
+\label{AWC}
+\begin{array}{l}
+x^n = (x^{n-r} + x^{n-s} + c^{n-1})~mod~m, \\
+c^n= (x^{n-r} + x^{n-s} + c^{n-1}) / m, \end{array}\end{equation}
+the subtract-with-borrow (SWB) generator, based on the recurrence:
+\begin{equation}
+\label{SWB}
+\begin{array}{l}
+x^n = (x^{n-r} - x^{n-s} - c^{n-1})~mod~m, \\
+c^n=\left\{
+\begin{array}{l}
+1 ~~~~~\text{if}~ (x^{n-r} - x^{n-s} - c^{n-1})<0\\
+0 ~~~~~\text{otherwise},\end{array} \right. \end{array}\end{equation}
+and the SWC generator designed by R. Couture, which is based on the following recurrence:
+\begin{equation}
+\label{SWC}
+\begin{array}{l}
+x^n = (a^1x^{n-1} \oplus ~...~ \oplus a^rx^{n-r} \oplus c^{n-1}) ~ mod ~ 2^w, \\
+c^n = (a^1x^{n-1} \oplus ~...~ \oplus a^rx^{n-r} \oplus c^{n-1}) ~ / ~ 2^w. \end{array}\end{equation}
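+As an example of this family, a hedged C sketch of the AWC recurrence of
+Eq.~\ref{AWC} is given below, with illustrative lags $r=3$, $s=1$ and modulus
+$m=2^{16}$.
+
+\begin{small}
+\begin{lstlisting}[language=C]
+/* AWC sketch with r = 3, s = 1, m = 2^16 (illustrative values). */
+static unsigned int awc_x[3] = {123, 456, 789};  /* x^{n-1}, x^{n-2}, x^{n-3} */
+static unsigned int awc_c = 0;                   /* the carry c^{n-1}         */
+
+unsigned int awc() {
+  const unsigned int m = 1u << 16;
+  unsigned int sum = awc_x[2] + awc_x[0] + awc_c; /* x^{n-r} + x^{n-s} + c^{n-1} */
+  unsigned int xn  = sum % m;
+  awc_c = sum / m;                                /* the new carry c^n           */
+  awc_x[2] = awc_x[1]; awc_x[1] = awc_x[0]; awc_x[0] = xn;
+  return xn;
+}
+\end{lstlisting}
+\end{small}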
+
+Then the generalized feedback shift register (GFSR) generator has been implemented, that is:
+\begin{equation}
+x^n = x^{n-r} \oplus x^{n-k}
+\label{GFSR}
+\end{equation}
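+A minimal C sketch of this recurrence, with illustrative lags $r=7$ and $k=3$,
+could be the following; the state array must be seeded with random, not all
+zero, words before use.
+
+\begin{small}
+\begin{lstlisting}[language=C]
+/* GFSR sketch: x^n = x^{n-r} xor x^{n-k}, with r = 7 and k = 3. */
+#define GFSR_R 7
+#define GFSR_K 3
+static unsigned int gfsr_state[GFSR_R];  /* the last r values, to be seeded   */
+static int gfsr_pos = 0;                 /* index of the oldest value x^{n-r} */
+
+unsigned int gfsr() {
+  unsigned int xn = gfsr_state[gfsr_pos]
+                  ^ gfsr_state[(gfsr_pos + GFSR_R - GFSR_K) % GFSR_R];
+  gfsr_state[gfsr_pos] = xn;             /* x^{n-r} is no longer needed */
+  gfsr_pos = (gfsr_pos + 1) % GFSR_R;
+  return xn;
+}
+\end{lstlisting}
+\end{small}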
+
+
+Finally, the nonlinear inversive generator~\cite{LEcuyerS07} has been considered too. It is defined by:
+
+\begin{equation}
+\label{INV}
+\begin{array}{l}
+x^n=\left\{
+\begin{array}{ll}
+(a^1 + a^2 / x^{n-1})~mod~m & \text{if}~ x^{n-1} \neq 0 \\
+a^1 & \text{if}~ x^{n-1} = 0 .\end{array} \right. \end{array}\end{equation}
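+A hedged C sketch of this generator is given below; it assumes a prime modulus
+$m$, so that the division amounts to a multiplication by a modular inverse, and
+its parameter values are purely illustrative.
+
+\begin{small}
+\begin{lstlisting}[language=C]
+/* Inversive generator sketch; m must be prime, parameters illustrative. */
+static long long inv_x = 1;                        /* the seed x^0 */
+
+/* modular inverse of z (0 < z < m) via the extended Euclidean algorithm */
+static long long modinv(long long z, long long m) {
+  long long t = 0, newt = 1, r = m, newr = z;
+  while (newr != 0) {
+    long long q = r / newr, tmp;
+    tmp = t - q*newt; t = newt; newt = tmp;
+    tmp = r - q*newr; r = newr; newr = tmp;
+  }
+  return t < 0 ? t + m : t;
+}
+
+long long inversive() {
+  const long long a1 = 1, a2 = 13, m = 2147483647LL;  /* 2^31 - 1 is prime */
+  inv_x = (inv_x == 0) ? a1 : (a1 + a2 * modinv(inv_x, m)) % m;
+  return inv_x;
+}
+\end{lstlisting}
+\end{small}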
+
+
+
+
+
+\subsection{Statistical tests}
+\label{Security analysis}
+
+Three batteries of tests are renowned and usually used
+to evaluate the statistical properties of newly designed pseudorandom
+number generators. These batteries are named DieHard~\cite{Marsaglia1996},
+the NIST suite~\cite{ANDREW2008}, and the most stringent one, called
+TestU01~\cite{LEcuyerS07}, which encompasses the two other batteries.
+
+
+
+\label{Results and discussion}
+\begin{table*}
+\renewcommand{\arraystretch}{1.3}
+\caption{NIST and DieHARD tests suite passing rates for PRNGs without CI}
+\label{NIST and DieHARD tests suite passing rate the for PRNGs without CI}
+\centering
+ \begin{tabular}{|l||c|c|c|c|c|c|c|c|c|c|}
+ \hline\hline
+Types of PRNGs & \multicolumn{2}{c|}{Linear PRNGs} & \multicolumn{4}{c|}{Lagged PRNGs} & \multicolumn{1}{c|}{ICG PRNGs} & \multicolumn{3}{c|}{Mixed PRNGs}\\ \hline
+\backslashbox{\textbf{Tests}}{\textbf{PRNG}} & LCG& MRG& AWC & SWB & SWC & GFSR & INV & LCG2& LCG3& MRG2 \\ \hline
+NIST & 11/15 & 14/15 &\textbf{15/15} & \textbf{15/15} & 14/15 & 14/15 & 14/15 & 14/15& 14/15& 14/15 \\ \hline
+DieHARD & 16/18 & 16/18 & 15/18 & 16/18 & \textbf{18/18} & 16/18 & 16/18 & 16/18& 16/18& 16/18\\ \hline
+\end{tabular}
+\end{table*}
+
+Table~\ref{NIST and DieHARD tests suite passing rate the for PRNGs without CI} shows the
+results on the first two batteries recalled above, indicating that none of the PRNGs presented
+in the previous section
+can pass all these tests. In other words, the statistical quality of these PRNGs cannot
+fulfill the up-to-date standards presented previously. We have shown in~\cite{bfg12a:ip} that the use of chaotic
+iterations can solve this issue.
+More precisely, to
+illustrate the effects of chaotic iterations on these defective PRNGs, experiments have been divided into three parts~\cite{bfg12a:ip}:
+\begin{enumerate}
+ \item \textbf{Single CIPRNG}: The PRNGs involved in CI computing are of the same category.
+ \item \textbf{Mixed CIPRNG}: Two different types of PRNGs are mixed during the chaotic iterations process.
+ \item \textbf{Multiple CIPRNG}: The generator is obtained by repeating the composition of the iteration function as follows: $x^0\in \mathds{B}^{\mathsf{N}}$, and $\forall n\in \mathds{N}^{\ast },\forall i\in \llbracket1;\mathsf{N}\rrbracket, x_i^n=$
+\begin{equation}
+\left\{
+\begin{array}{ll}
+x_i^{n-1} & \text{if}~ \forall j\in \llbracket1;m\rrbracket,~ S^{nm+j}\neq i, \\
+f^m(x^{n-1})_i & \text{if}~ \exists j\in \llbracket1;m\rrbracket,~ S^{nm+j}=i.
+\end{array} \right.
+\end{equation}
+$m$ is called the \emph{functional power}.
+\end{enumerate}
+
+
+We have performed statistical analysis of each of the aforementioned CIPRNGs.
+The results are reproduced in Tables~\ref{NIST and DieHARD tests suite passing rate the for PRNGs without CI} and \ref{NIST and DieHARD tests suite passing rate the for single CIPRNGs}.
+The scores written in boldface indicate that all the tests have been passed successfully, whereas an asterisk ``*'' means that the considered passing rate has been improved.
+
+\subsubsection{Tests based on the Single CIPRNG}
+
+\begin{table*}
+\renewcommand{\arraystretch}{1.3}
+\caption{NIST and DieHARD tests suite passing rates for PRNGs with CI}
+\label{NIST and DieHARD tests suite passing rate the for single CIPRNGs}
+\centering
+ \begin{tabular}{|l||c|c|c|c|c|c|c|c|c|c|}
+ \hline
+Types of PRNGs & \multicolumn{2}{c|}{Linear PRNGs} & \multicolumn{4}{c|}{Lagged PRNGs} & \multicolumn{1}{c|}{ICG PRNGs} & \multicolumn{3}{c|}{Mixed PRNGs}\\ \hline
+\backslashbox{\textbf{Tests}}{\textbf{Single CIPRNG}} & LCG & MRG & AWC & SWB & SWC & GFSR & INV& LCG2 & LCG3& MRG2 \\ \hline\hline
+\multicolumn{11}{|l|}{Old CIPRNG}\\ \hline \hline
+NIST & \textbf{15/15} * & \textbf{15/15} * & \textbf{15/15} & \textbf{15/15} & \textbf{15/15} * & \textbf{15/15} * & \textbf{15/15} *& \textbf{15/15} * & \textbf{15/15} * & \textbf{15/15} \\ \hline
+DieHARD & \textbf{18/18} * & \textbf{18/18} * & \textbf{18/18} * & \textbf{18/18} * & \textbf{18/18} & \textbf{18/18} * & \textbf{18/18} *& \textbf{18/18} * & \textbf{18/18} *& \textbf{18/18} * \\ \hline
+\multicolumn{11}{|l|}{New CIPRNG}\\ \hline \hline
+NIST & \textbf{15/15} * & \textbf{15/15} * & \textbf{15/15} & \textbf{15/15} & \textbf{15/15} * & \textbf{15/15} * & \textbf{15/15} *& \textbf{15/15} * & \textbf{15/15} * & \textbf{15/15} \\ \hline
+DieHARD & \textbf{18/18} * & \textbf{18/18} * & \textbf{18/18} * & \textbf{18/18} * & \textbf{18/18} & \textbf{18/18} * & \textbf{18/18} * & \textbf{18/18} * & \textbf{18/18} *& \textbf{18/18} *\\ \hline
+\multicolumn{11}{|l|}{Xor CIPRNG}\\ \hline\hline
+NIST & 14/15*& \textbf{15/15} * & \textbf{15/15} & \textbf{15/15} & 14/15 & \textbf{15/15} * & 14/15& \textbf{15/15} * & \textbf{15/15} *& \textbf{15/15} \\ \hline
+DieHARD & 16/18 & 16/18 & 17/18* & \textbf{18/18} * & \textbf{18/18} & \textbf{18/18} * & 16/18 & 16/18 & 16/18& 16/18\\ \hline
+\end{tabular}
+\end{table*}
+
+The statistical tests results of the PRNGs using the single CIPRNG method are given in Table~\ref{NIST and DieHARD tests suite passing rate the for single CIPRNGs}.
+We can observe that, except for the Xor CIPRNG, all of the CIPRNGs have passed the 15 tests of the NIST battery and the 18 tests of the DieHARD one.
+Moreover, considering these scores, we can deduce that both the single Old CIPRNG and the single New CIPRNG are more stable than the single Xor CIPRNG approach when applied to different PRNGs.
+However, the Xor CIPRNG is obviously the fastest approach to generate a CI random sequence, and it still improves the statistical properties of each generator taken alone, although the test values are not as good as desired.
+
+Therefore, all three approaches are interesting, for different reasons, for the production of pseudorandom numbers and,
+on the whole, the single CIPRNG method can be considered able to adapt to, and improve, all kinds of PRNGs.
+
+To obtain a realization of the Xor CIPRNG that can pass all the tests embedded into the NIST battery, Xor CIPRNGs with multiple functional powers are investigated in Section~\ref{Tests based on Multiple CIPRNG}.
+
+
+\subsubsection{Tests based on the Mixed CIPRNG}
+
+To compare the previous approach with the CIPRNG design that uses a Mixed CIPRNG, we have taken into account the same inputted generators as in the previous section.
+These inputted couples $(PRNG_1,PRNG_2)$ of PRNGs are used in the Mixed approach as follows:
+\begin{equation}
+\left\{
+\begin{array}{l}
+x^0 \in \llbracket 0, 2^\mathsf{N}-1 \rrbracket, S \in \llbracket 0, 2^\mathsf{N}-1 \rrbracket^\mathds{N} \\
+\forall n \in \mathds{N}^*, x^n = x^{n-1} \oplus PRNG_1\oplus PRNG_2,
+\end{array}
+\right.
+\label{equation Oplus mixed}
+\end{equation}
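+A minimal C sketch of this Mixed Xor CIPRNG is given below, assuming
+$\mathsf{N}=32$; the functions \texttt{prng1()} and \texttt{prng2()} stand for
+the two inputted generators and are hypothetical names.
+
+\begin{small}
+\begin{lstlisting}[language=C]
+/* Mixed Xor CIPRNG sketch: x^n = x^{n-1} xor PRNG_1 xor PRNG_2. */
+extern unsigned int prng1(void);           /* the two inputted generators */
+extern unsigned int prng2(void);           /* (assumed implemented elsewhere) */
+static unsigned int mixed_x = 123123123;   /* the seed x^0 */
+
+unsigned int mixed_xor_ciprng() {
+  mixed_x = mixed_x ^ prng1() ^ prng2();
+  return mixed_x;
+}
+\end{lstlisting}
+\end{small}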
+
+With this Mixed CIPRNG approach, both the Old CIPRNG and New CIPRNG continue to pass all the NIST and DieHARD suites.
+In addition, we can see that the PRNGs using a Xor CIPRNG approach can pass more tests than previously.
+The main reason for this success is that the Mixed Xor CIPRNG has a longer period.
+Indeed, let $n_{P}$ be the period of a PRNG $P$; then the period deduced from the single Xor CIPRNG approach is obviously equal to:
+\begin{equation}
+n_{SXORCI}=
+\left\{
+\begin{array}{ll}
+n_{P}&\text{if~}x^0=x^{n_{P}}\\
+2n_{P}&\text{if~}x^0\neq x^{n_{P}}.\\
+\end{array}
+\right.
+\label{equation period single xor}
+\end{equation}
+
+Let us now denote by $n_{P1}$ and $n_{P2}$ the periods of the $PRNG_1$ and $PRNG_2$ generators, respectively; then the period of the Mixed Xor CIPRNG will be:
+\begin{equation}
+n_{XXORCI}=
+\left\{
+\begin{array}{ll}
+LCM(n_{P1},n_{P2})&\text{if~}x^0=x^{LCM(n_{P1},n_{P2})}\\
+2LCM(n_{P1},n_{P2})&\text{if~}x^0\neq x^{LCM(n_{P1},n_{P2})}.\\
+\end{array}
+\right.
+\label{equation period mixed xor}
+\end{equation}
+
+In Table~\ref{DieHARD fail mixex CIPRNG}, we only show the results for the Mixed CIPRNGs that cannot pass the whole DieHARD battery (all the NIST tests being passed). It demonstrates that the Mixed Xor CIPRNGs involving LCG, MRG, LCG2, LCG3, MRG2, or INV cannot pass two tests, namely the ``Matrix Rank 32x32'' and the ``COUNT-THE-1's'' tests of the DieHARD battery. Let us recall their definitions:
+
+\begin{itemize}
+ \item \textbf{Matrix Rank 32x32.} A random 32x32 binary matrix is formed, each row being a 32-bit random vector. Its rank is an integer that ranges from 0 to 32. Ranks less than 29 must be rare, and their occurrences are pooled with those of rank 29. To achieve the test, the ranks of 40,000 such random matrices are obtained, and a chi-square test is performed on the counts for ranks 32, 31, 30, and $\leq29$.
+
+ \item \textbf{COUNT-THE-1's TEST.} Consider the file under test as a stream of bytes (four per 32-bit integer). Each byte can contain from 0 to 8 1's, with probabilities 1, 8, 28, 56, 70, 56, 28, 8, 1 over 256. Now let the stream of bytes provide a string of overlapping 5-letter words, each ``letter'' taking values A, B, C, D, E. The letters are determined by the number of 1's in a byte: 0, 1, or 2 yield A, 3 yields B, 4 yields C, 5 yields D, and 6, 7, or 8 yield E. Thus we have a monkey at a typewriter hitting five keys with various probabilities (37, 56, 70, 56, 37 over 256). There are $5^5$ possible 5-letter words, and from a string of 256,000 (overlapping) 5-letter words, counts are made on the frequencies for each word. The quadratic form in the weak inverse of the covariance matrix of the cell counts provides a chi-square test: Q5--Q4, the difference of the naive Pearson sums of $(OBS-EXP)^2/EXP$ on counts for 5- and 4-letter cell counts.
+\end{itemize}
+
+The reason for these failures is that the outputs of LCG, LCG2, LCG3, MRG, and MRG2 in these experiments are 31-bit numbers. Compared with the Single CIPRNG, using different PRNGs to build a CIPRNG seems more efficient at improving the quality of the produced random numbers (the Mixed Xor CI can pass the whole NIST battery, whereas the single one cannot).
+
+\begin{table*}
+\renewcommand{\arraystretch}{1.3}
+\caption{Scores of mixed Xor CIPRNGs when considering the DieHARD battery}
+\label{DieHARD fail mixex CIPRNG}
+\centering
+ \begin{tabular}{|l||c|c|c|c|c|c|}
+ \hline
+\backslashbox{\textbf{$PRNG_1$}} {\textbf{$PRNG_2$}} & LCG & MRG & INV & LCG2 & LCG3 & MRG2 \\ \hline\hline
+LCG &\backslashbox{} {} &16/18&16/18 &16/18 &16/18 &16/18\\ \hline
+MRG &16/18 &\backslashbox{} {} &16/18&16/18 &16/18 &16/18\\ \hline
+INV &16/18 &16/18&\backslashbox{} {} &16/18 &16/18&16/18 \\ \hline
+LCG2 &16/18 &16/18 &16/18 &\backslashbox{} {} &16/18&16/18\\ \hline
+LCG3 &16/18 &16/18 &16/18&16/18&\backslashbox{} {} &16/18\\ \hline
+MRG2 &16/18 &16/18 &16/18&16/18 &16/18 &\backslashbox{} {} \\ \hline
+\end{tabular}
+\end{table*}
+
+\subsubsection{Tests based on the Multiple CIPRNG}
+\label{Tests based on Multiple CIPRNG}
+
+Until now, the combination of at most two input PRNGs has been investigated.
+We now consider the possibility of using a larger number of generators to improve the statistics of the generated pseudorandom numbers, leading to the multiple functional power approach.
+For the CIPRNGs that have already passed both the NIST and DieHARD suites with two inputted PRNGs (all the Old and New CIPRNGs, and some of the Xor CIPRNGs), it is not meaningful to consider their adaptation to this multiple CIPRNG method; hence only the Multiple Xor CIPRNGs, having the following form, will be investigated.
+\begin{equation}
+\left\{
+\begin{array}{l}
+x^0 \in \llbracket 0, 2^\mathsf{N}-1 \rrbracket, S \in \llbracket 0, 2^\mathsf{N}-1 \rrbracket^\mathds{N} \\
+\forall n \in \mathds{N}^*, x^n = x^{n-1} \oplus S^{nm}\oplus S^{nm+1}\oplus \ldots \oplus S^{nm+m-1} ,
+\end{array}
+\right.
+\label{equation Oplus}
+\end{equation}
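+A minimal C sketch of this Multiple Xor CIPRNG is given below; the function
+\texttt{prng()}, which produces the successive strategy terms
+$S^{nm}, S^{nm+1}, \hdots$, is a hypothetical name for the inputted generator.
+
+\begin{small}
+\begin{lstlisting}[language=C]
+/* Multiple Xor CIPRNG sketch with functional power m:
+   x^n = x^{n-1} xor S^{nm} xor S^{nm+1} xor ... xor S^{nm+m-1}. */
+extern unsigned int prng(void);            /* the inputted generator (assumed) */
+static unsigned int multi_x = 123123123;   /* the seed x^0 */
+
+unsigned int multiple_xor_ciprng(int m) {
+  for (int j = 0; j < m; j++)
+    multi_x ^= prng();                     /* one strategy term per xor */
+  return multi_x;
+}
+\end{lstlisting}
+\end{small}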
+
+The question is now to determine the value of the threshold $m$ (the functional power) that makes the Multiple CIPRNG able to pass the whole NIST battery.
+This question is answered in Table~\ref{threshold}.
+
+
+\begin{table*}
+\renewcommand{\arraystretch}{1.3}
+\caption{Functional power $m$ making it possible to pass the whole NIST battery}
+\label{threshold}
+\centering
+ \begin{tabular}{|l||c|c|c|c|c|c|c|c|}
+ \hline
+Inputted $PRNG$ & LCG & MRG & SWC & GFSR & INV& LCG2 & LCG3 & MRG2 \\ \hline\hline
+Threshold value $m$& 19 & 7 & 2& 1 & 11& 9& 3& 4\\ \hline\hline
+\end{tabular}
+\end{table*}
+
+\subsubsection{Results Summary}
+
+We can summarize the obtained results as follows.
+\begin{enumerate}
+\item The CIPRNG method is able to improve the statistical properties of a large variety of PRNGs.
+\item Using different PRNGs in the CIPRNG approach is better than considering several instances of one unique PRNG.
+\item The statistical quality of the outputs increases with the functional power $m$.
+\end{enumerate}
+
+\end{color}
+
+\section{Efficient PRNG based on Chaotic Iterations}
+\label{sec:efficient PRNG}
+
+Based on the proof presented in the previous section, it is now possible to
+improve the speed of the generator formerly presented in~\cite{bgw09:ip,guyeux10}.
+The first idea is to consider
+that the provided strategy is a pseudorandom Boolean vector obtained by a
+given PRNG.
+An iteration of the system is simply the bitwise exclusive or between
+the last computed state and the current strategy.
+The topological properties of disorder exhibited by chaotic
+iterations can be inherited by the inputted generator; we hope by doing so to
+obtain some statistical improvements while preserving speed.
+
+
+
+
+
+\lstset{language=C,caption={C code of the sequential PRNG based on chaotic iterations},label=algo:seqCIPRNG}
+\begin{small}
+\begin{lstlisting}
+unsigned int CIPRNG() {
+  static unsigned int x = 123123123;   // internal state: the last computed value
+  unsigned long t1 = xorshift();       // three 64-bit xor-like PRNGs
+  unsigned long t2 = xor128();
+  unsigned long t3 = xorwow();
+  x = x^(unsigned int)t1;              // 32 least significant bits of t1
+  x = x^(unsigned int)(t2>>32);        // 32 most significant bits of t2
+  x = x^(unsigned int)(t3>>32);        // 32 most significant bits of t3
+  x = x^(unsigned int)t2;              // 32 least significant bits of t2
+  x = x^(unsigned int)(t1>>32);        // 32 most significant bits of t1
+  x = x^(unsigned int)t3;              // 32 least significant bits of t3
+  return x;
+}
+\end{lstlisting}
+\end{small}
+
+
+
+In Listing~\ref{algo:seqCIPRNG} a sequential version of the proposed PRNG based
+on chaotic iterations is presented. The xor operator is represented by
+\textasciicircum. This function uses three classical 64-bit PRNGs, namely the
+\texttt{xorshift}, the \texttt{xor128}, and the
+\texttt{xorwow}~\cite{Marsaglia2003}. In the following, we call them ``xor-like
+PRNGs''. As each xor-like PRNG uses 64 bits whereas our proposed generator
+works with 32 bits, we use the cast \texttt{(unsigned int)}, which selects the
+32 least significant bits of a given integer, and the code \texttt{(unsigned
+ int)(t$>>$32)} in order to obtain the 32 most significant bits of \texttt{t}.
+
+Thus, producing a pseudorandom number requires 6 xor operations on the 6 32-bit
+numbers provided by 3 64-bit PRNGs. This version successfully passes the
+stringent BigCrush battery of tests~\cite{LEcuyerS07}.
+
+\section{Efficient PRNGs based on Chaotic Iterations on GPU}
+\label{sec:efficient PRNG gpu}
+
+In order to benefit from the computing power of GPU, a program
+needs to have independent blocks of threads that can be computed
+simultaneously. In general, the larger the number of threads is, the less
+local memory is used, and the fewer branching instructions are
+used (if, while, etc.), the better the performance on GPU is.
+Obviously, having these requirements in mind, it is possible to build
+a program similar to the one presented in Listing
+\ref{algo:seqCIPRNG}, which computes pseudorandom numbers on GPU. To
+do so, we must firstly recall that in the CUDA~\cite{Nvid10}
+environment, threads have a local identifier called
+\texttt{ThreadIdx}, which is relative to the block containing
+them. Furthermore, in CUDA, the parts of the code that are executed by the GPU are
+called {\it kernels}.
+
+
+\subsection{Naive Version for GPU}
+
+
+It is possible to deduce from the CPU version a quite similar version adapted to GPU.
+The simple principle consists in making each thread of the GPU compute the CPU version of our PRNG.
+Of course, the three xor-like
+PRNGs used in these computations must have different parameters.
+In a given thread, these parameters are
+randomly picked by another PRNG.
+The initialization stage is performed by the CPU.
+To do so, the ISAAC PRNG~\cite{Jenkins96} is used to set all the
+parameters embedded into each thread.
+
+The implementation of the three
+xor-like PRNGs is straightforward when their parameters have been
+allocated in the GPU memory. Each xor-like PRNG works with an internal
+number $x$ that stores the last generated pseudorandom number. Additionally, the
+implementations of the xor128, the xorshift, and the xorwow respectively require
+4, 5, and 6 unsigned longs as internal variables.
+
+
+\begin{algorithm}
+\begin{small}
+\KwIn{InternalVarXorLikeArray: array with internal variables of the 3 xor-like
+PRNGs in global memory\;
+NumThreads: number of threads\;}
+\KwOut{NewNb: array containing random numbers in global memory}
+\If{threadIdx is concerned by the computation} {
+ retrieve data from InternalVarXorLikeArray[threadIdx] in local variables\;
+ \For{i=1 to n} {
+    compute a new pseudorandom number as in Listing~\ref{algo:seqCIPRNG}\;
+    store this new number in NewNb[NumThreads*threadIdx+i]\;
+ }
+ store internal variables in InternalVarXorLikeArray[threadIdx]\;
+}
+\end{small}
+\caption{Main kernel of the GPU ``naive'' version of the PRNG based on chaotic iterations}
+\label{algo:gpu_kernel}
+\end{algorithm}
+
+
+
+Algorithm~\ref{algo:gpu_kernel} presents a naive implementation of the proposed PRNG on
+GPU. Due to the available memory in the GPU and the number of threads
+used simultaneously, the number of random numbers that a thread can generate
+inside a kernel is limited (\emph{i.e.}, the variable \texttt{n} in
+Algorithm~\ref{algo:gpu_kernel}). For instance, if $100,000$ threads are used and
+if $n=100$\footnote{in fact, we need to add the initial seed (a 32-bit number)},
+then the memory required to store all of the internal variables of both the xor-like
+PRNGs\footnote{we multiply this number by $2$ in order to count 32-bit numbers}
+and the pseudorandom numbers generated by our PRNG, is equal to $100,000\times ((4+5+6)\times
+2+(1+100))=13,100,000$ 32-bit numbers, that is, approximately $52$MB.
+
+This generator is able to pass the whole BigCrush battery of tests, for all
+the versions that have been tested depending on their number of threads
+(called \texttt{NumThreads} in our algorithm, tested up to $5$ million).
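+For readers who prefer actual CUDA code to the pseudocode of
+Algorithm~\ref{algo:gpu_kernel}, a hedged sketch of such a kernel is given
+below. The device functions \texttt{xorshift\_dev()}, \texttt{xor128\_dev()},
+and \texttt{xorwow\_dev()}, the state layout, and the output indexing are
+assumptions made for illustration, not the exact implementation used in the
+experiments.
+
+\begin{small}
+\begin{lstlisting}[language=C,caption={Hedged CUDA sketch of the naive kernel (illustration only)},label=lst:gpu_naive_sketch]
+/* Device-side versions of the three xor-like PRNGs, assumed to be
+   implemented elsewhere (hypothetical names and signatures). */
+__device__ unsigned long long xorshift_dev(unsigned long long *state);
+__device__ unsigned long long xor128_dev(unsigned long long *state);
+__device__ unsigned long long xorwow_dev(unsigned long long *state);
+
+__global__ void ciprng_naive(unsigned long long *states, /* 5+4+6 words/thread */
+                             unsigned int *x_last,       /* last x per thread  */
+                             unsigned int *NewNb, int n, int NumThreads) {
+  int tid = blockIdx.x * blockDim.x + threadIdx.x;
+  if (tid >= NumThreads) return;
+  unsigned long long *s = &states[15ULL * tid];
+  unsigned int x = x_last[tid];
+  for (int i = 0; i < n; i++) {
+    unsigned long long t1 = xorshift_dev(s);
+    unsigned long long t2 = xor128_dev(s + 5);
+    unsigned long long t3 = xorwow_dev(s + 9);
+    x ^= (unsigned int)t1; x ^= (unsigned int)(t2 >> 32);
+    x ^= (unsigned int)(t3 >> 32); x ^= (unsigned int)t2;
+    x ^= (unsigned int)(t1 >> 32); x ^= (unsigned int)t3;
+    NewNb[(size_t)n * tid + i] = x;   /* output layout is an assumption */
+  }
+  x_last[tid] = x;
+}
+\end{lstlisting}
+\end{small}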
+
+\begin{remark}
+The proposed algorithm has the advantage of manipulating independent
+PRNGs, so this version is easily adaptable to a cluster of computers too. The only thing
+to ensure is to use a single ISAAC PRNG. To achieve this requirement, a simple solution consists in
+using a master node for the initialization. This master node computes the initial parameters
+for all the different nodes involved in the computation.
+\end{remark}
+
+\subsection{Improved Version for GPU}
+
+As GPU cards using CUDA have shared memory between threads of the same block, it
+is possible to use this feature in order to simplify the previous algorithm,
+i.e., to use fewer than 3 xor-like PRNGs. The solution consists in computing only
+one xor-like PRNG per thread, saving it into the shared memory, and then using the results
+of some other threads in the same block of threads. In order to define which
+thread uses the results of which other ones, we can use combination arrays that
+contain, for each thread, the indexes of the threads whose results it combines.
+
+In Algorithm~\ref{algo:gpu_kernel2}, two combination arrays are used. The
+variable \texttt{offset} is computed using the value of
+\texttt{combination\_size}. Then we can compute \texttt{o1} and \texttt{o2}
+representing the indexes of the other threads whose results are used by the
+current one. In this algorithm, we consider that a 32-bit xor-like PRNG has
+been chosen. In practice, we use the xor128 proposed in~\cite{Marsaglia2003} in
+which unsigned longs (64 bits) have been replaced by unsigned integers (32
+bits).
+
+This version can also pass the whole {\it BigCrush} battery of tests.
+
+\begin{algorithm}
+\begin{small}
+\KwIn{InternalVarXorLikeArray: array with internal variables of 1 xor-like PRNGs
+in global memory\;
+NumThreads: Number of threads\;
+array\_comb1, array\_comb2: Arrays containing combinations of size combination\_size\;}
+
+\KwOut{NewNb: array containing random numbers in global memory}
+\If{threadId is concerned} {
+ retrieve data from InternalVarXorLikeArray[threadId] in local variables including shared memory and x\;
+ offset = threadIdx\%combination\_size\;
+ o1 = threadIdx-offset+array\_comb1[offset]\;
+ o2 = threadIdx-offset+array\_comb2[offset]\;
+ \For{i=1 to n} {
+ t=xor-like()\;
+ t=t\textasciicircum shmem[o1]\textasciicircum shmem[o2]\;
+ shared\_mem[threadId]=t\;
+ x = x\textasciicircum t\;
+
+ store the new PRNG in NewNb[NumThreads*threadId+i]\;
+ }
+ store internal variables in InternalVarXorLikeArray[threadId]\;
+}
+\end{small}
+\caption{Main kernel for the chaotic iterations based PRNG GPU efficient
+version\label{IR}}
+\label{algo:gpu_kernel2}
+\end{algorithm}
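+A hedged CUDA sketch corresponding to Algorithm~\ref{algo:gpu_kernel2} is
+given below. The 32-bit device function \texttt{xor128\_32()}, the fixed block
+size, the explicit synchronizations, and the output indexing are assumptions
+made for illustration only.
+
+\begin{small}
+\begin{lstlisting}[language=C,caption={Hedged CUDA sketch of the improved kernel (illustration only)},label=lst:gpu_improved_sketch]
+/* 32-bit xor128, assumed to be implemented elsewhere (hypothetical name). */
+__device__ unsigned int xor128_32(unsigned int *state);
+
+/* BLOCK_SIZE: assumed number of threads per block; combination_size is
+   assumed to divide it                                                  */
+#define BLOCK_SIZE 256
+
+__global__ void ciprng_improved(unsigned int *states,  /* 4 words per thread */
+                                unsigned int *x_last, int *array_comb1,
+                                int *array_comb2, int combination_size,
+                                unsigned int *NewNb, int n) {
+  /* the grid is assumed to contain exactly gridDim.x*BLOCK_SIZE active
+     threads, so that no thread returns early before a synchronization  */
+  __shared__ unsigned int shmem[BLOCK_SIZE];
+  int tid = blockIdx.x * blockDim.x + threadIdx.x;
+  int offset = threadIdx.x % combination_size;
+  int o1 = threadIdx.x - offset + array_comb1[offset];
+  int o2 = threadIdx.x - offset + array_comb2[offset];
+  unsigned int x = x_last[tid];
+  shmem[threadIdx.x] = x;            /* initial content of the shared slot */
+  __syncthreads();
+  for (int i = 0; i < n; i++) {
+    unsigned int t = xor128_32(&states[4 * tid]);  /* one xor-like per thread */
+    t ^= shmem[o1] ^ shmem[o2];      /* combine with two other threads        */
+    __syncthreads();                 /* all reads of the old values are done  */
+    shmem[threadIdx.x] = t;
+    __syncthreads();                 /* new value visible to threads o1, o2   */
+    x ^= t;                          /* the chaotic iteration x = x xor t     */
+    NewNb[(size_t)n * tid + i] = x;  /* output layout is an assumption        */
+  }
+  x_last[tid] = x;
+}
+\end{lstlisting}
+\end{small}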
+
+\subsection{Theoretical Evaluation of the Improved Version}
+
+A run of Algorithm~\ref{algo:gpu_kernel2} consists in an operation ($x=x\oplus t$) having
+the form of Equation~\ref{equation Oplus}, which is equivalent to the iterative
+system of Eq.~\ref{eq:generalIC}. That is, an iteration of the general chaotic
+iterations is realized between the last stored value $x$ of the thread and a strategy $t$
+(obtained by a bitwise exclusive or between a value provided by a xor-like() call
+and two values previously obtained by two other threads).
+To be certain that we are in the framework of Theorem~\ref{t:chaos des general},
+we must guarantee that this dynamical system iterates on the space
+$\mathcal{X} = \mathcal{P}\left(\llbracket 1, \mathsf{N} \rrbracket\right)^\mathds{N}\times\mathds{B}^\mathsf{N}$.
+The left term $x$ obviously belongs to $\mathds{B}^\mathsf{N}$.
+To prevent any flaw in the chaotic properties, we must check that the right
+term (the last $t$), corresponding to the strategies, can possibly be equal to any
+integer of $\llbracket 1, \mathsf{N} \rrbracket$.
+
+Such a result is obvious: as for the xor-like(), all the
+integers belonging to its interval of definition can occur at each iteration, the
+last $t$ fulfills the requirement. Furthermore, it is possible to
+prove by an immediate mathematical induction that, as the initial $x$
+is uniformly distributed (it is provided by a cryptographically secure PRNG)
+and the two other stored values shmem[o1] and shmem[o2] are uniformly distributed too
+(this is the induction hypothesis), the next $x$ is also uniformly distributed.
+
+Thus Algorithm~\ref{algo:gpu_kernel2} is a concrete realization of the general
+chaotic iterations presented previously, and for this reason, it satisfies
+Devaney's formulation of chaotic behavior.
+
+\section{Experiments}
+\label{sec:experiments}
+
+Different experiments have been performed in order to measure the generation
+speed. We have used a first computer equipped with a Tesla C1060 NVidia GPU card
+and an
+Intel Xeon E5530 clocked at 2.40 GHz, and
+a second computer equipped with a smaller CPU and a GeForce GTX 280.
+Both
+cards have 240 cores.
+
+In Figure~\ref{fig:time_xorlike_gpu} we compare the quantity of pseudorandom numbers
+generated per second with various xor-like based PRNGs. In this figure, the optimized
+versions use the {\it xor64} described in~\cite{Marsaglia2003}, whereas the naive versions
+embed the three xor-like PRNGs described in Listing~\ref{algo:seqCIPRNG}. In
+order to obtain the optimal performance, the storage of pseudorandom numbers
+into the GPU memory has been removed. This step is time consuming and slows down the
+generation of numbers. Moreover, this storage is completely
+useless for applications that consume the pseudorandom
+numbers directly after generation. We can see that when the number of threads is greater
+than approximately 30,000 and lower than 5 million, the number of pseudorandom numbers generated
+per second is almost constant. With the naive version, this value ranges from 2.5 to
+3 GSamples/s. With the optimized version, it is approximately equal to
+20 GSamples/s. Finally, we can remark that both GPU cards give quite similar results, but in
+practice, the Tesla C1060 has more memory than the GTX 280, and this memory
+should be of better quality.
+As a comparison, Listing~\ref{algo:seqCIPRNG} leads to the generation of about
+138 MSamples/s when using one core of the Xeon E5530.
+
+\begin{figure}[htbp]