modif reponse aux reviewers

[prng_gpu.git] / prng_gpu.tex
diff --git a/prng_gpu.tex b/prng_gpu.tex

index 6ebbcd996fccbd09777da10cbae0f798d2385e5a..27702e86deb5c8a719f4556331f233075bfe5a68 100644 (file)
--- a/prng_gpu.tex
+++ b/prng_gpu.tex
@@ -1,4 +1,6 @@
  \documentclass{article}
+%\documentclass[10pt,journal,letterpaper,compsoc]{IEEEtran}
+%\documentclass[preprint,12pt]{elsarticle}
  \usepackage[utf8]{inputenc}
  \usepackage[T1]{fontenc}
  \usepackage{fullpage}
@@ -7,7 +9,16 @@
  \usepackage{amscd}
  \usepackage{moreverb}
  \usepackage{commath}
+\usepackage[ruled,vlined]{algorithm2e}
+\usepackage{listings}
  \usepackage[standard]{ntheorem}
+\usepackage{algorithmic}
+\usepackage{slashbox}
+\usepackage{ctable}
+\usepackage{tabularx}
+\usepackage{multirow}
+
+\usepackage{color}
  
  % Pour mathds : les ensembles IR, IN, etc.
  \usepackage{dsfont}
@@ -19,7 +30,6 @@
  % Pour faire des sous-figures dans les figures
  \usepackage{subfigure}
  
-\usepackage{color}
  
  \newtheorem{notation}{Notation}
  
@@ -32,441 +42,2099 @@
  
  \newcommand{\alert}[1]{\begin{color}{blue}\textit{#1}\end{color}}
  
-\title{Efficient generation of pseudo random numbers based on chaotic iterations on GPU}
  \begin{document}
-\maketitle
-
-\begin{abstract}
-This is the abstract
-\end{abstract}
-
-\section{Introduction}
-
-Interet des itérations chaotiques pour générer des nombre alea\\
-Interet de générer des nombres alea sur GPU
-...
-
-\section{Chaotic iterations}
  
-Présentation des itérations chaotiques
+\title{Efficient and Cryptographically Secure Generation of Chaotic Pseudorandom Numbers on GPU}
  
  
+%% \author{Jacques M. Bahi}
+%% \ead{jacques.bahi@univ-fcomte.fr}
+%% \author{ Rapha\"{e}l Couturier \corref{cor1}}
+%% \ead{raphael.couturier@univ-fcomte.fr}
+%% \cortext[cor1]{Corresponding author}
+%% \author{  Christophe Guyeux}
+%% \ead{christophe.guyeux@univ-fcomte.fr}
+%% \author{ Pierre-Cyrille Héam }
+%% \ead{pierre-cyrille.heam@univ-fcomte.fr}
  
+\author{Christophe Guyeux \and  Rapha\"{e}l Couturier \and    Pierre-Cyrille Héam \and Jacques M. Bahi\\
+FEMTO-ST Institute, UMR  6174 CNRS,\\ University of Franche Comte, Belfort, France}
  
-\section{The relativity of disorder}
-\label{sec:de la relativité du désordre}
+\maketitle
  
-\subsection{Impact of the topology's finenesse}
  
-Let us firstly introduce the following notations.
+%\begin{frontmatter}
+%\IEEEcompsoctitleabstractindextext{
+\begin{abstract}
+In this paper we present a new pseudorandom number generator (PRNG) on
+graphics processing units  (GPU). This PRNG is based  on the so-called chaotic iterations.  It
+is firstly proven  to be chaotic according to the Devaney's  formulation. We thus propose  an efficient
+implementation  for  GPU that successfully passes the   {\it BigCrush} tests, deemed to be the  hardest
+battery of tests in TestU01.  Experiments show that this PRNG can generate
+about 20 billion of random numbers  per second on Tesla C1060 and NVidia GTX280
+cards.
+It is then established that, under reasonable assumptions, the proposed PRNG can be cryptographically 
+secure.
+A chaotic version of the Blum-Goldwasser asymmetric key encryption scheme is finally proposed.
  
-\begin{notation}
-$\mathcal{X}_\tau$ will denote the topological space $\left(\mathcal{X},\tau\right)$, whereas $\mathcal{V}_\tau (x)$ will be the set of all the neighborhoods of $x$ when considering the topology $\tau$ (or simply $\mathcal{V} (x)$, if there is no ambiguity).
-\end{notation}
  
+\end{abstract}
+%}
+%\begin{keyword}
+%   pseudo random number\sep parallelization\sep GPU\sep cryptography\sep chaos
+%\end{keyword}
+%\end{frontmatter}
  
  
-\begin{theorem}
-\label{Th:chaos et finesse}
-Let $\mathcal{X}$ a set and $\tau, \tau'$ two topologies on $\mathcal{X}$ s.t. $\tau'$ is finer than $\tau$. Let $f:\mathcal{X} \to \mathcal{X}$, continuous both for $\tau$ and $\tau'$.
+%\IEEEdisplaynotcompsoctitleabstractindextext
+%\IEEEpeerreviewmaketitle
  
-If $(\mathcal{X}_{\tau'},f)$ is chaotic according to Devaney, then $(\mathcal{X}_\tau,f)$ is chaotic too.
-\end{theorem}
  
-\begin{proof}
-Let us firstly establish the transitivity of $(\mathcal{X}_\tau,f)$.
-
-Let $\omega_1, \omega_2$ two open sets of $\tau$. Then $\omega_1, \omega_2 \in \tau'$, becaus $\tau'$ is finer than $\tau$. As $f$ is $\tau'-$transitive, we can deduce that $\exists n \in \mathds{N}, \omega_1 \cap f^{(n)}(\omega_2) = \varnothing$. Consequently, $f$ is $\tau-$transitive.
+\section{Introduction}
  
-Let us now consider the regularity of $(\mathcal{X}_\tau,f)$, \emph{i.e.}, for all $x \in \mathcal{X}$, and for all $\tau-$neighborhood $V$ of $x$, there is a periodic point for $f$ into $V$.
+Randomness is of importance in many fields such as scientific simulations or cryptography. 
+``Random numbers'' can mainly be generated either by a deterministic and reproducible algorithm
+called a pseudorandom number generator (PRNG), or by a physical non-deterministic 
+process having all the characteristics of a random noise, called a truly random number
+generator (TRNG). 
+In this paper, we focus on reproducible generators, useful for instance in
+Monte-Carlo based simulators or in several cryptographic schemes.
+These domains need PRNGs that are statistically irreproachable. 
+In some fields such as in numerical simulations, speed is a strong requirement
+that is usually attained by using parallel architectures. In that case,
+a recurrent problem is that a deflation of the statistical qualities is often
+reported, when the parallelization of a good PRNG is realized.
+This is why ad-hoc PRNGs for each possible architecture must be found to
+achieve both speed and randomness.
+On the other side, speed is not the main requirement in cryptography: the great
+need is to define \emph{secure} generators able to withstand malicious
+attacks. Roughly speaking, an attacker should not be able in practice to make 
+the distinction between numbers obtained with the secure generator and a true random
+sequence.  Or, in an equivalent formulation, he or she should not be
+able (in practice) to predict the next bit of the generator, having the knowledge of all the 
+binary digits that have been already released. ``Being able in practice'' refers here
+to the possibility to achieve this attack in polynomial time, and to the exponential growth
+of the difficulty of this challenge when the size of the parameters of the PRNG increases.
+
+
+Finally, a small part of the community working in this domain focuses on a
+third requirement, that is to define chaotic generators.
+The main idea is to take benefits from a chaotic dynamical system to obtain a
+generator that is unpredictable, disordered, sensible to its seed, or in other word chaotic.
+Their desire is to map a given chaotic dynamics into a sequence that seems random 
+and unassailable due to chaos.
+However, the chaotic maps used as a pattern are defined in the real line 
+whereas computers deal with finite precision numbers.
+This distortion leads to a deflation of both chaotic properties and speed.
+Furthermore, authors of such chaotic generators often claim their PRNG
+as secure due to their chaos properties, but there is no obvious relation
+between chaos and security as it is understood in cryptography.
+This is why the use of chaos for PRNG still remains marginal and disputable.
+
+The authors' opinion is that topological properties of disorder, as they are
+properly defined in the mathematical theory of chaos, can reinforce the quality
+of a PRNG. But they are not substitutable for security or statistical perfection.
+Indeed, to the authors' mind, such properties can be useful in the two following situations. On the
+one hand, a post-treatment based on a chaotic dynamical system can be applied
+to a PRNG statistically deflective, in order to improve its statistical 
+properties. Such an improvement can be found, for instance, in~\cite{bgw09:ip,bcgr11:ip}.
+On the other hand, chaos can be added to a fast, statistically perfect PRNG and/or a
+cryptographically secure one, in case where chaos can be of interest,
+\emph{only if these last properties are not lost during
+the proposed post-treatment}. Such an assumption is behind this research work.
+It leads to the attempts to define a 
+family of PRNGs that are chaotic while being fast and statistically perfect,
+or cryptographically secure.
+Let us finish this paragraph by noticing that, in this paper, 
+statistical perfection refers to the ability to pass the whole 
+{\it BigCrush} battery of tests, which is widely considered as the most
+stringent statistical evaluation of a sequence claimed as random.
+This battery can be found in the well-known TestU01 package~\cite{LEcuyerS07}.
+More precisely, each time we performed a test on a PRNG, we ran it
+twice in order to observe if all $p-$values are inside [0.01, 0.99]. In
+fact, we observed that few $p-$values (less than ten) are sometimes
+outside this interval but inside [0.001, 0.999], so that is why a
+second run allows us to confirm that the values outside are not for
+the same test. With this approach all our PRNGs pass the {\it
+  BigCrush} successfully and all $p-$values are at least once inside
+[0.01, 0.99].
+Chaos, for its part, refers to the well-established definition of a
+chaotic dynamical system defined by Devaney~\cite{Devaney}.
+
+In a previous work~\cite{bgw09:ip,guyeux10} we have proposed a post-treatment on PRNGs making them behave
+as a chaotic dynamical system. Such a post-treatment leads to a new category of
+PRNGs. We have shown that proofs of Devaney's chaos can be established for this
+family, and that the sequence obtained after this post-treatment can pass the
+NIST~\cite{Nist10}, DieHARD~\cite{Marsaglia1996}, and TestU01~\cite{LEcuyerS07} batteries of tests, even if the inputted generators
+cannot.
+The proposition of this paper is to improve widely the speed of the formerly
+proposed generator, without any lack of chaos or statistical properties.
+In particular, a version of this PRNG on graphics processing units (GPU)
+is proposed.
+Although GPU was initially designed  to accelerate
+the manipulation of  images, they are nowadays commonly  used in many scientific
+applications. Therefore,  it is important  to be able to  generate pseudorandom
+numbers inside a GPU when a scientific application runs in it. This remark
+motivates our proposal of a chaotic and statistically perfect PRNG for GPU.  
+Such device
+allows us to generate almost 20 billion of pseudorandom numbers per second.
+Furthermore, we show that the proposed post-treatment preserves the
+cryptographical security of the inputted PRNG, when this last has such a 
+property.
+Last, but not least, we propose a rewriting of the Blum-Goldwasser asymmetric
+key encryption protocol by using the proposed method.
+
+
+{\bf Main contributions.} In this paper a new PRNG using chaotic iteration
+is defined. From a theoretical point of view, it is proven that it has fine
+topological chaotic properties and that it is cryptographically secured (when
+the initial PRNG is also cryptographically secured). From a practical point of
+view, experiments point out a very good statistical behavior. An optimized
+original implementation of this PRNG is also proposed and experimented.
+Pseudorandom numbers are generated at a rate of 20GSamples/s, which is faster
+than in~\cite{conf/fpga/ThomasHL09,Marsaglia2003} (and with a better
+statistical behavior). Experiments are also provided using 
+\begin{color}{red} the well-known Blum-Blum-Shub
+(BBS) 
+\end{color}
+as the initial
+random generator. The generation speed is significantly weaker.
+%Note also that an original qualitative comparison between topological chaotic
+%properties and statistical tests is also proposed.
+
+
+
+
+The remainder of this paper  is organized as follows. In Section~\ref{section:related
+  works} we  review some GPU implementations  of PRNGs.  Section~\ref{section:BASIC
+  RECALLS} gives some basic recalls  on the well-known Devaney's formulation of chaos, 
+  and on an iteration process called ``chaotic
+iterations'' on which the post-treatment is based. 
+The proposed PRNG and its proof of chaos are given in  Section~\ref{sec:pseudorandom}.
+%Section~\ref{The generation of pseudorandom sequence} illustrates the statistical
+%improvement related to the chaotic iteration based post-treatment, for
+%our previously released PRNGs and a new efficient 
+%implementation on CPU.
+ Section~\ref{sec:efficient PRNG} %{sec:efficient PRNG
+%  gpu}  
+ describes and evaluates theoretically new effective versions of
+our pseudorandom generators,  in particular with a  GPU   implementation. 
+Such generators are experimented in 
+Section~\ref{sec:experiments}.
+We show in Section~\ref{sec:security analysis} that, if the inputted
+generator is cryptographically secure, then it is the case too for the
+generator provided by the post-treatment.
+A practical
+security evaluation is also outlined in Section~\ref{sec:Practicak evaluation}.
+Such a proof leads to the proposition of a cryptographically secure and
+chaotic generator on GPU based on the famous Blum Blum Shub
+in Section~\ref{sec:CSGPU} and to an improvement of the
+Blum-Goldwasser protocol in Sect.~\ref{Blum-Goldwasser}.
+This research work ends by a conclusion section, in which the contribution is
+summarized and intended future work is presented.
+
+
+
+
+\section{Related work on GPU based PRNGs}
+\label{section:related works}
+
+Numerous research works on defining GPU based PRNGs have already been proposed  in the
+literature, so that exhaustivity is impossible.
+This is why authors of this document only give reference to the most significant attempts 
+in this domain, from their subjective point of view. 
+The  quantity of pseudorandom numbers generated per second is mentioned here 
+only when the information is given in the related work. 
+A million numbers  per second will be simply written as
+1MSample/s whereas a billion numbers per second is 1GSample/s.
+
+In \cite{Pang:2008:cec}  a PRNG based on  cellular automata is defined
+with no  requirement to an high  precision  integer   arithmetic  or to any bitwise
+operations. Authors can   generate  about
+3.2MSamples/s on a GeForce 7800 GTX GPU, which is quite an old card now.
+However, there is neither a mention of statistical tests nor any proof of
+chaos or cryptography in this document.
+
+In \cite{ZRKB10}, the authors propose  different versions of efficient GPU PRNGs
+based on  Lagged Fibonacci or Hybrid  Taus.  They have  used these
+PRNGs   for  Langevin   simulations   of  biomolecules   fully  implemented   on
+GPU. Performances of  the GPU versions are far better than  those obtained with a
+CPU, and these PRNGs succeed to pass the {\it BigCrush} battery of TestU01. 
+However the evaluations of the proposed PRNGs are only statistical ones.
+
+
+Authors of~\cite{conf/fpga/ThomasHL09}  have studied the  implementation of some
+PRNGs on  different computing architectures: CPU,  field-programmable gate array
+(FPGA), massively parallel  processors, and GPU. This study is of interest, because
+the  performance  of the  same  PRNGs on  different architectures are compared. 
+FPGA appears as  the  fastest  and the most
+efficient architecture, providing the fastest number of generated pseudorandom numbers
+per joule. 
+However, we notice that authors can ``only'' generate between 11 and 16GSamples/s
+with a GTX 280  GPU, which should be compared with
+the results presented in this document.
+We can remark too that the PRNGs proposed in~\cite{conf/fpga/ThomasHL09} are only
+able to pass the {\it Crush} battery, which is far easier than the {\it Big Crush} one.
+
+Lastly, Cuda  has developed  a  library for  the  generation of  pseudorandom numbers  called
+Curand~\cite{curand11}.        Several       PRNGs        are       implemented, among
+other things 
+Xorwow~\cite{Marsaglia2003} and  some variants of Sobol. The  tests reported show that
+their  fastest version provides  15GSamples/s on  the new  Fermi C2050  card. 
+But their PRNGs cannot pass the whole TestU01 battery (only one test is failed).
+\newline
+\newline
+We can finally remark that, to the best of our knowledge, no GPU implementation has been proven to be chaotic, and the cryptographically secure property has surprisingly never been considered.
+
+\section{Basic Recalls}
+\label{section:BASIC RECALLS}
+
+This section is devoted to basic definitions and terminologies in the fields of
+topological chaos and chaotic iterations. We assume the reader is familiar
+with basic notions on topology (see for instance~\cite{Devaney}).
+
+
+\subsection{Devaney's Chaotic Dynamical Systems}
+\label{subsec:Devaney}
+In the sequel $S^{n}$ denotes the $n^{th}$ term of a sequence $S$ and $V_{i}$
+denotes the $i^{th}$ component of a vector $V$. $f^{k}=f\circ ...\circ f$
+is for the $k^{th}$ composition of a function $f$. Finally, the following
+notation is used: $\llbracket1;N\rrbracket=\{1,2,\hdots,N\}$.
+
+
+Consider a topological space $(\mathcal{X},\tau)$ and a continuous function $f :
+\mathcal{X} \rightarrow \mathcal{X}$.
  
-Let $x \in \mathcal{X}$ and $V \in \mathcal{V}_\tau (x)$ a $\tau-$neighborhood of $x$. By definition, $\exists \omega \in \tau, x \in \omega \subset V$.
+\begin{definition}
+The function $f$ is said to be \emph{topologically transitive} if, for any pair of open sets
+$U,V \subset \mathcal{X}$, there exists $k>0$ such that $f^k(U) \cap V \neq
+\varnothing$.
+\end{definition}
  
-But $\tau \subset \tau'$, so $\omega \in \tau'$, and then $V \in \mathcal{V}_{\tau'} (x)$. As $(\mathcal{X}_{\tau'},f)$ is regular, there is a periodic point for $f$ into $V$, and the regularity of $(\mathcal{X}_\tau,f)$ is proven.
-\end{proof}
+\begin{definition}
+An element $x$ is a \emph{periodic point} for $f$ of period $n\in \mathds{N}^*$
+if $f^{n}(x)=x$.% The set of periodic points of $f$ is denoted $Per(f).$
+\end{definition}
  
-\subsection{A given system can always be claimed as chaotic}
+\begin{definition}
+$f$ is said to be \emph{regular} on $(\mathcal{X}, \tau)$ if the set of periodic
+points for $f$ is dense in $\mathcal{X}$: for any point $x$ in $\mathcal{X}$,
+any neighborhood of $x$ contains at least one periodic point (without
+necessarily the same period).
+\end{definition}
  
-Let $f$ an iteration function on $\mathcal{X}$ having at least a fixed point. Then this function is chaotic (in a certain way):
  
-\begin{theorem}
-Let $\mathcal{X}$ a nonempty set and $f: \mathcal{X} \to \X$ a function having at least a fixed point.
-Then $f$ is $\tau_0-$chaotic, where $\tau_0$ is the trivial (indiscrete) topology on $\X$.
-\end{theorem}
+\begin{definition}[Devaney's formulation of chaos~\cite{Devaney}]
+The function $f$ is said to be \emph{chaotic} on $(\mathcal{X},\tau)$ if $f$ is regular and
+topologically transitive.
+\end{definition}
  
+The chaos property is strongly linked to the notion of ``sensitivity'', defined
+on a metric space $(\mathcal{X},d)$ by:
  
-\begin{proof}
-$f$ is transitive when $\forall \omega, \omega' \in \tau_0 \setminus \{\varnothing\}, \exists n \in \mathds{N}, f^{(n)}(\omega) \cap \omega' \neq \varnothing$.
-As $\tau_0 = \left\{ \varnothing, \X \right\}$, this is equivalent to look for an integer $n$ s.t. $f^{(n)}\left( \X \right) \cap \X \neq \varnothing$. For instance, $n=0$ is appropriate.
+\begin{definition}
+\label{sensitivity} The function $f$ has \emph{sensitive dependence on initial conditions}
+if there exists $\delta >0$ such that, for any $x\in \mathcal{X}$ and any
+neighborhood $V$ of $x$, there exist $y\in V$ and $n > 0$ such that
+$d\left(f^{n}(x), f^{n}(y)\right) >\delta $.
  
-Let us now consider $x \in \X$ and $V \in \mathcal{V}_{\tau_0} (x)$. Then $V = \mathcal{X}$, so $V$ has at least a fixed point for $f$. Consequently $f$ is regular, and the result is established.
-\end{proof}
+The constant $\delta$ is called the \emph{constant of sensitivity} of $f$.
+\end{definition}
  
+Indeed, Banks \emph{et al.} have proven in~\cite{Banks92} that when $f$ is
+chaotic and $(\mathcal{X}, d)$ is a metric space, then $f$ has the property of
+sensitive dependence on initial conditions (this property was formerly an
+element of the definition of chaos). To sum up, quoting Devaney
+in~\cite{Devaney}, a chaotic dynamical system ``is unpredictable because of the
+sensitive dependence on initial conditions. It cannot be broken down or
+simplified into two subsystems which do not interact because of topological
+transitivity. And in the midst of this random behavior, we nevertheless have an
+element of regularity''. Fundamentally different behaviors are consequently
+possible and occur in an unpredictable way.
  
  
  
+\subsection{Chaotic Iterations}
+\label{sec:chaotic iterations}
  
  
+Let us consider  a \emph{system} with a finite  number $\mathsf{N} \in
+\mathds{N}^*$ of elements  (or \emph{cells}), so that each  cell has a
+Boolean  \emph{state}. Having $\mathsf{N}$ Boolean values for these
+ cells  leads to the definition of a particular \emph{state  of the
+system}. A sequence which  elements belong to $\llbracket 1;\mathsf{N}
+\rrbracket $ is called a \emph{strategy}. The set of all strategies is
+denoted by $\llbracket 1, \mathsf{N} \rrbracket^\mathds{N}.$
  
-\section{Chaos on the order topology}
+\begin{definition}
+\label{Def:chaotic iterations}
+The      set       $\mathds{B}$      denoting      $\{0,1\}$,      let
+$f:\mathds{B}^{\mathsf{N}}\longrightarrow  \mathds{B}^{\mathsf{N}}$ be
+a  function  and  $S\in  \llbracket 1, \mathsf{N} \rrbracket^\mathds{N}$  be  a  ``strategy''.  The  so-called
+\emph{chaotic      iterations}     are     defined      by     $x^0\in
+\mathds{B}^{\mathsf{N}}$ and
+\begin{equation}
+\forall    n\in     \mathds{N}^{\ast     },    \forall     i\in
+\llbracket1;\mathsf{N}\rrbracket ,x_i^n=\left\{
+\begin{array}{ll}
+  x_i^{n-1} &  \text{ if  }S^n\neq i \\
+  \left(f(x^{n-1})\right)_{S^n} & \text{ if }S^n=i.
+\end{array}\right.
+\end{equation}
+\end{definition}
  
-\subsection{The phase space is an interval of the real line}
+In other words, at the $n^{th}$ iteration, only the $S^{n}-$th cell is
+\textquotedblleft  iterated\textquotedblright .  Note  that in  a more
+general  formulation,  $S^n$  can   be  a  subset  of  components  and
+$\left(f(x^{n-1})\right)_{S^{n}}$      can     be      replaced     by
+$\left(f(x^{k})\right)_{S^{n}}$, where  $k<n$, describing for example,
+delays  transmission~\cite{Robert1986,guyeux10}.  Finally,  let us  remark that
+the term  ``chaotic'', in  the name of  these iterations,  has \emph{a
+priori} no link with the mathematical theory of chaos, presented above.
  
-\subsubsection{Toward a topological semiconjugacy}
  
-In what follows, our intention is to establish, by using a topological semiconjugacy, that chaotic iterations over $\mathcal{X}$ can be described as iterations on a real interval. To do so, we must firstly introduce some notations and terminologies. 
+Let us now recall how to define a suitable metric space where chaotic iterations
+are continuous. For further explanations, see, e.g., \cite{guyeux10}.
  
-Let $\mathcal{S}_\mathsf{N}$ be the set of sequences belonging into $\llbracket 1; \mathsf{N}\rrbracket$ and $\mathcal{X}_{\mathsf{N}} = \mathcal{S}_\mathsf{N} \times \B^\mathsf{N}$.
+Let $\delta $ be the \emph{discrete Boolean metric}, $\delta
+(x,y)=0\Leftrightarrow x=y.$ Given a function $f$, define the function
+$F_{f}:  \llbracket1;\mathsf{N}\rrbracket\times \mathds{B}^{\mathsf{N}} 
+\longrightarrow  \mathds{B}^{\mathsf{N}}$
+\begin{equation*}
+\begin{array}{lrll}
+& (k,E) & \longmapsto & \left( E_{j}.\delta (k,j)+ f(E)_{k}.\overline{\delta
+(k,j)}\right) _{j\in \llbracket1;\mathsf{N}\rrbracket}%
+\end{array}%
+\end{equation*}%
+\noindent where + and . are the Boolean addition and product operations.
+Consider the phase space:
+\begin{equation}
+\mathcal{X} = \llbracket 1 ; \mathsf{N} \rrbracket^\mathds{N} \times
+\mathds{B}^\mathsf{N},
+\end{equation}
+\noindent and the map defined on $\mathcal{X}$:
+\begin{equation}
+G_f\left(S,E\right) = \left(\sigma(S), F_f(i(S),E)\right), \label{Gf}
+\end{equation}
+\noindent where $\sigma$ is the \emph{shift} function defined by $\sigma
+(S^{n})_{n\in \mathds{N}}\in \llbracket 1, \mathsf{N} \rrbracket^\mathds{N}\longrightarrow (S^{n+1})_{n\in
+\mathds{N}}\in \llbracket 1, \mathsf{N} \rrbracket^\mathds{N}$ and $i$ is the \emph{initial function} 
+$i:(S^{n})_{n\in \mathds{N}} \in \llbracket 1, \mathsf{N} \rrbracket^\mathds{N}\longrightarrow S^{0}\in \llbracket
+1;\mathsf{N}\rrbracket$. Then the chaotic iterations proposed in
+Definition \ref{Def:chaotic iterations} can be described by the following iterations:
+\begin{equation}
+\left\{
+\begin{array}{l}
+X^0 \in \mathcal{X} \\
+X^{k+1}=G_{f}(X^k).%
+\end{array}%
+\right.
+\end{equation}%
+
+With this formulation, a shift function appears as a component of chaotic
+iterations. The shift function is a famous example of a chaotic
+map~\cite{Devaney} but its presence is not sufficient enough to claim $G_f$ as
+chaotic. 
+To study this claim, a new distance between two points $X = (S,E), Y =
+(\check{S},\check{E})\in
+\mathcal{X}$ has been introduced in \cite{guyeux10} as follows:
+\begin{equation}
+d(X,Y)=d_{e}(E,\check{E})+d_{s}(S,\check{S}),
+\end{equation}
+\noindent where
+\begin{equation}
+\left\{
+\begin{array}{lll}
+\displaystyle{d_{e}(E,\check{E})} & = & \displaystyle{\sum_{k=1}^{\mathsf{N}%
+}\delta (E_{k},\check{E}_{k})}, \\
+\displaystyle{d_{s}(S,\check{S})} & = & \displaystyle{\dfrac{9}{\mathsf{N}}%
+\sum_{k=1}^{\infty }\dfrac{|S^k-\check{S}^k|}{10^{k}}}.%
+\end{array}%
+\right.
+\end{equation}
  
  
-\begin{definition}
-The function $\varphi: \mathcal{S}_{10} \times\mathds{B}^{10} \rightarrow \big[ 0, 2^{10} \big[$ is defined by:
-$$
-\begin{array}{cccl}
-\varphi: & \mathcal{X}_{10} = \mathcal{S}_{10} \times\mathds{B}^{10}& \longrightarrow & \big[ 0, 2^{10} \big[ \\
- & (S,E) = \left((S^0, S^1, \hdots ); (E_0, \hdots, E_9)\right) & \longmapsto & \varphi \left((S,E)\right)
-\end{array}
-$$
-\noindent where $\varphi\left((S,E)\right)$ is the real number:
+This new distance has been introduced to satisfy the following requirements.
  \begin{itemize}
-\item whose integral part $e$ is $\displaystyle{\sum_{k=0}^9 2^{9-k} E_k}$, that is, the binary digits of $e$ are $E_0 ~ E_1 ~ \hdots ~ E_9$.
-\item whose decimal part $s$ is equal to $s = 0,S^0~ S^1~ S^2~ \hdots = \sum_{k=1}^{+\infty} 10^{-k} S^{k-1}.$ 
+\item When the number of different cells between two systems is increasing, then
+their distance should increase too.
+\item In addition, if two systems present the same cells and their respective
+strategies start with the same terms, then the distance between these two points
+must be small because the evolution of the two systems will be the same for a
+while. Indeed, both dynamical systems start with the same initial condition,
+use the same update function, and as strategies are the same for a while, furthermore
+updated components are the same as well.
  \end{itemize}
-\end{definition}
+The distance presented above follows these recommendations. Indeed, if the floor
+value $\lfloor d(X,Y)\rfloor $ is equal to $n$, then the systems $E, \check{E}$
+differ in $n$ cells ($d_e$ is indeed the Hamming distance). In addition, $d(X,Y) - \lfloor d(X,Y) \rfloor $ is a
+measure of the differences between strategies $S$ and $\check{S}$. More
+precisely, this floating part is less than $10^{-k}$ if and only if the first
+$k$ terms of the two strategies are equal. Moreover, if the $k^{th}$ digit is
+nonzero, then the $k^{th}$ terms of the two strategies are different.
+The impact of this choice for a distance will be investigated at the end of the document.
  
+Finally, it has been established in \cite{guyeux10} that,
  
+\begin{proposition}
+Let $f$ be a map from $\mathds{B}^\mathsf{N}$ to itself. Then $G_{f}$ is continuous in
+the metric space $(\mathcal{X},d)$.
+\end{proposition}
  
-$\varphi$ realizes the association between a point of $\mathcal{X}_{10}$ and a real number into $\big[ 0, 2^{10} \big[$. We must now translate the chaotic iterations $\Go$ on this real interval. To do so, two intermediate functions over $\big[ 0, 2^{10} \big[$ must be introduced:
+The chaotic property of $G_f$ has been firstly established for the vectorial
+Boolean negation $f_0(x_1,\hdots, x_\mathsf{N}) =  (\overline{x_1},\hdots, \overline{x_\mathsf{N}})$ \cite{guyeux10}. To obtain a characterization, we have secondly
+introduced the notion of asynchronous iteration graph recalled bellow.
  
+Let $f$ be a map from $\mathds{B}^\mathsf{N}$ to itself. The
+{\emph{asynchronous iteration graph}} associated with $f$ is the
+directed graph $\Gamma(f)$ defined by: the set of vertices is
+$\mathds{B}^\mathsf{N}$; for all $x\in\mathds{B}^\mathsf{N}$ and 
+$i\in \llbracket1;\mathsf{N}\rrbracket$,
+the graph $\Gamma(f)$ contains an arc from $x$ to $F_f(i,x)$. 
+The relation between $\Gamma(f)$ and $G_f$ is clear: there exists a
+path from $x$ to $x'$ in $\Gamma(f)$ if and only if there exists a
+strategy $s$ such that the parallel iteration of $G_f$ from the
+initial point $(s,x)$ reaches the point $x'$.
+We have then proven in \cite{bcgr11:ip} that,
  
-\begin{definition}
-\label{def:e et s}
-Let $x \in \big[ 0, 2^{10} \big[$ and:
-\begin{itemize}
-\item $e_0, \hdots, e_9$ the binary digits of the integral part of $x$: $\displaystyle{\lfloor x \rfloor = \sum_{k=0}^{9} 2^{9-k} e_k}$.
-\item $(s^k)_{k\in \mathds{N}}$ the digits of $x$, where the chosen decimal decomposition of $x$ is the one that does not have an infinite number of 9: 
-$\displaystyle{x = \lfloor x \rfloor + \sum_{k=0}^{+\infty} s^k 10^{-k-1}}$.
-\end{itemize}
-$e$ and $s$ are thus defined as follows:
-$$
-\begin{array}{cccl}
-e: & \big[ 0, 2^{10} \big[ & \longrightarrow & \mathds{B}^{10} \\
- & x & \longmapsto & (e_0, \hdots, e_9)
-\end{array}
-$$
-\noindent and
-$$
-\begin{array}{cccl}
-s: & \big[ 0, 2^{10} \big[ & \longrightarrow & \llbracket 0, 9 \rrbracket^{\mathds{N}} \\
- & x & \longmapsto & (s^k)_{k \in \mathds{N}}
-\end{array}
-$$
-\end{definition}
  
-We are now able to define the function $g$, whose goal is to translate the chaotic iterations $\Go$ on an interval of $\mathds{R}$.
+\begin{theorem}
+\label{Th:Caractérisation   des   IC   chaotiques}  
+Let $f:\mathds{B}^\mathsf{N}\to\mathds{B}^\mathsf{N}$. $G_f$ is chaotic  (according to  Devaney) 
+if and only if $\Gamma(f)$ is strongly connected.
+\end{theorem}
  
-\begin{definition}
-$g:\big[ 0, 2^{10} \big[ \longrightarrow \big[ 0, 2^{10} \big[$ is defined by:
-$$
-\begin{array}{cccl}
-g: & \big[ 0, 2^{10} \big[ & \longrightarrow & \big[ 0, 2^{10} \big[ \\
-& \\
- & x & \longmapsto & g(x)
+Finally, we have established in \cite{bcgr11:ip} that,
+\begin{theorem}
+  Let $f: \mathds{B}^{n} \rightarrow \mathds{B}^{n}$, $\Gamma(f)$ its
+  iteration graph, $\check{M}$ its adjacency
+  matrix and $M$
+  a $n\times n$ matrix defined by 
+  $
+  M_{ij} = \frac{1}{n}\check{M}_{ij}$ %\textrm{ 
+  if $i \neq j$ and  
+  $M_{ii} = 1 - \frac{1}{n} \sum\limits_{j=1, j\neq i}^n \check{M}_{ij}$ otherwise.
+  
+  If $\Gamma(f)$ is strongly connected, then 
+  the output of the PRNG detailed in Algorithm~\ref{CI Algorithm} follows 
+  a law that tends to the uniform distribution 
+  if and only if $M$ is a double stochastic matrix.
+\end{theorem} 
+
+
+These results of chaos and uniform distribution have led us to study the possibility of building a
+pseudorandom number generator (PRNG) based on the chaotic iterations. 
+As $G_f$, defined on the domain   $\llbracket 1 ;  \mathsf{N} \rrbracket^{\mathds{N}} 
+\times \mathds{B}^\mathsf{N}$, is built from Boolean networks $f : \mathds{B}^\mathsf{N}
+\rightarrow \mathds{B}^\mathsf{N}$, we can preserve the theoretical properties on $G_f$
+during implementations (due to the discrete nature of $f$). Indeed, it is as if
+$\mathds{B}^\mathsf{N}$ represents the memory of the computer whereas $\llbracket 1 ;  \mathsf{N}
+\rrbracket^{\mathds{N}}$ is its input stream (the seeds, for instance, in PRNG, or a physical noise in TRNG).
+Let us finally remark that the vectorial negation satisfies the hypotheses of both theorems above.
+
+\section{Application to Pseudorandomness}
+\label{sec:pseudorandom}
+
+\subsection{A First Pseudorandom Number Generator}
+
+We have proposed in~\cite{bgw09:ip} a new family of generators that receives 
+two PRNGs as inputs. These two generators are mixed with chaotic iterations, 
+leading thus to a new PRNG that 
+should improve the statistical properties of each
+generator taken alone. 
+Furthermore, the generator obtained in this way possesses various chaos properties that none of the generators used as  input present.
+
+
+
+\begin{algorithm}[h!]
+\begin{small}
+\KwIn{a function $f$, an iteration number $b$, an initial configuration $x^0$
+($n$ bits)}
+\KwOut{a configuration $x$ ($n$ bits)}
+$x\leftarrow x^0$\;
+$k\leftarrow b + PRNG_1(b)$\;
+\For{$i=0,\dots,k$}
+{
+$s\leftarrow{PRNG_2(n)}$\;
+$x\leftarrow{F_f(s,x)}$\;
+}
+return $x$\;
+\end{small}
+\caption{An arbitrary round of $Old~ CI~ PRNG_f(PRNG_1,PRNG_2)$}
+\label{CI Algorithm}
+\end{algorithm}
+
+
+
+
+This generator is synthesized in Algorithm~\ref{CI Algorithm}.
+It takes as input: a Boolean function $f$ satisfying Theorem~\ref{Th:Caractérisation   des   IC   chaotiques};
+an integer $b$, ensuring that the number of executed iterations
+between two outputs is at least $b$
+and at most $2b+1$; and an initial configuration $x^0$.
+It returns the new generated configuration $x$.  Internally, it embeds two
+inputted generators $PRNG_i(k), i=1,2$,
+ which must return integers
+uniformly distributed
+into $\llbracket 1 ; k \rrbracket$.
+For instance, these PRNGs can be the \textit{XORshift}~\cite{Marsaglia2003},
+being a category of very fast PRNGs designed by George Marsaglia
+that repeatedly uses the transform of exclusive or (XOR, $\oplus$) on a number
+with a bit shifted version of it. Such a PRNG, which has a period of
+$2^{32}-1=4.29\times10^9$, is summed up in Algorithm~\ref{XORshift}. 
+This XORshift, or any other reasonable PRNG, is used
+in our own generator to compute both the number of iterations between two
+outputs (provided by $PRNG_1$) and the strategy elements ($PRNG_2$).
+
+%This former generator has successively passed various batteries of statistical tests, as the NIST~\cite{bcgr11:ip}, DieHARD~\cite{Marsaglia1996}, and TestU01~\cite{LEcuyerS07} ones.
+
+
+\begin{algorithm}[h!]
+\begin{small}
+\KwIn{the internal configuration $z$ (a 32-bit word)}
+\KwOut{$y$ (a 32-bit word)}
+$z\leftarrow{z\oplus{(z\ll13)}}$\;
+$z\leftarrow{z\oplus{(z\gg17)}}$\;
+$z\leftarrow{z\oplus{(z\ll5)}}$\;
+$y\leftarrow{z}$\;
+return $y$\;
+\end{small}
+\caption{An arbitrary round of \textit{XORshift} algorithm}
+\label{XORshift}
+\end{algorithm}
+
+
+\subsection{A ``New CI PRNG''}
+
+In order to make the Old CI PRNG usable in practice, we have proposed 
+an adapted version of the chaotic iteration based generator in~\cite{bg10:ip}.
+In this ``New CI PRNG'', we prevent a given bit from changing twice between two outputs.
+This new generator is designed by the following process. 
+
+First of all, some chaotic iterations have to be done to generate a sequence 
+$\left(x^n\right)_{n\in\mathds{N}} \in \left(\mathds{B}^{32}\right)^\mathds{N}$ 
+of Boolean vectors, which are the successive states of the iterated system. 
+Some of these vectors will be randomly extracted and our pseudorandom bit 
+flow will be constituted by their components. Such chaotic iterations are 
+realized as follows. Initial state $x^0 \in \mathds{B}^{32}$ is a Boolean 
+vector taken as a seed and chaotic strategy $\left(S^n\right)_{n\in\mathds{N}}\in 
+\llbracket 1, 32 \rrbracket^\mathds{N}$ is
+an \emph{irregular decimation} of $PRNG_2$ sequence, as described in 
+Algorithm~\ref{Chaotic iteration1}.
+
+Then, at each iteration, only the $S^n$-th component of state $x^n$ is 
+updated, as follows: $x_i^n = x_i^{n-1}$ if $i \neq S^n$, else $x_i^n = \overline{x_i^{n-1}}$.
+Such a procedure is equivalent to achieving chaotic iterations with
+the Boolean vectorial negation $f_0$ and some well-chosen strategies.
+Finally, some $x^n$ are selected
+by a sequence $m^n$ as the pseudorandom bit sequence of our generator.
+$(m^n)_{n \in \mathds{N}} \in \mathcal{M}^\mathds{N}$ is computed from $PRNG_1$, where $\mathcal{M}\subset \mathds{N}^*$ is a finite nonempty set of integers.
+
+The basic design procedure of the New CI generator is summarized in Algorithm~\ref{Chaotic iteration1}.
+The internal state is $x$, the output state is $r$. $a$ and $b$ are those computed by the two input
+PRNGs. Lastly, the value $g(a)$ is an integer defined as in Eq.~\ref{Formula}.
+This function must be chosen such that the outputs of the resulted PRNG are uniform in $\llbracket 0, 2^\mathsf{N}-1 \rrbracket$. Function of \eqref{Formula} achieves this
+goal (other candidates and more information can be found in ~\cite{bg10:ip}).
+
+\begin{equation}
+\label{Formula}
+m^n = g(y^n)=
+\left\{
+\begin{array}{l}
+0 \text{ if }0 \leqslant{y^n}<{C^0_{32}},\\
+1 \text{ if }{C^0_{32}} \leqslant{y^n}<\sum_{i=0}^1{C^i_{32}},\\
+2 \text{ if }\sum_{i=0}^1{C^i_{32}} \leqslant{y^n}<\sum_{i=0}^2{C^i_{32}},\\
+\vdots~~~~~ ~~\vdots~~~ ~~~~\\
+N \text{ if }\sum_{i=0}^{N-1}{C^i_{32}}\leqslant{y^n}<1.\\
  \end{array}
-$$
-\noindent where g(x) is the real number of $\big[ 0, 2^{10} \big[$ defined bellow:
-\begin{itemize}
-\item its integral part has a binary decomposition equal to $e_0', \hdots, e_9'$, with:
-$$
-e_i' = \left\{
-\begin{array}{ll}
-e(x)_i & \textrm{ if } i \neq s^0\\
-e(x)_i + 1 \textrm{ (mod 2)} & \textrm{ if } i = s^0\\
+\right.
+\end{equation}
+
+\begin{algorithm}
+\textbf{Input:} the internal state $x$ (32 bits)\\
+\textbf{Output:} a state $r$ of 32 bits
+\begin{algorithmic}[1]
+\FOR{$i=0,\dots,N$}
+{
+\STATE$d_i\leftarrow{0}$\;
+}
+\ENDFOR
+\STATE$a\leftarrow{PRNG_1()}$\;
+\STATE$k\leftarrow{g(a)}$\;
+\WHILE{$i=0,\dots,k$}
+
+\STATE$b\leftarrow{PRNG_2()~mod~\mathsf{N}}$\;
+\STATE$S\leftarrow{b}$\;
+    \IF{$d_S=0$}
+    {
+\STATE      $x_S\leftarrow{ \overline{x_S}}$\;
+\STATE      $d_S\leftarrow{1}$\;
+
+    }
+    \ELSIF{$d_S=1$}
+    {
+\STATE      $k\leftarrow{ k+1}$\;
+    }\ENDIF
+\ENDWHILE\\
+\STATE $r\leftarrow{x}$\;
+\STATE return $r$\;
+\medskip
+\caption{An arbitrary round of the new CI generator}
+\label{Chaotic iteration1}
+\end{algorithmic}
+\end{algorithm}
+
+
+We have shown in~\cite{bfg12a:ip} that the use of chaotic iterations 
+implies an improvement of the statistical properties for all the
+inputted defective generators we have investigated.
+For instance, when considering the TestU01 battery with its 588 tests, we obtained 261 
+failures for a PRNG based on the logistic map alone, and 
+this number of failures falls below 138 in the Old CI(Logistic,Logistic) generator.
+In the XORshift case (146 failures when considering it alone), the results are more amazing,
+as the chaotic iterations post-treatment makes it fails only 8 tests. 
+Further investigations have been systematically realized in \cite{bfg12a:ip}
+using a large set of inputted defective PRNGs, the three most used batteries of
+tests (DieHARD, NIST, and TestU01), and for all the versions of generators we have proposed.
+In all situations, an obvious improvement of the statistical behavior has
+been obtained, reinforcing the impression that chaos leads to statistical 
+enhancement~\cite{bfg12a:ip}.
+
+\subsection{Improving the Speed of the Former Generator}
+
+Instead of updating only one cell at each iteration, we now propose to choose a
+subset of components and to update them together, for speed improvement. Such a proposition leads 
+to a kind of merger of the two sequences used in Algorithms 
+\ref{CI Algorithm} and \ref{Chaotic iteration1}. When the updating function is the vectorial negation,
+this algorithm can be rewritten as follows:
+
+\begin{equation}
+\left\{
+\begin{array}{l}
+x^0 \in \llbracket 0, 2^\mathsf{N}-1 \rrbracket, S \in \llbracket 0, 2^\mathsf{N}-1 \rrbracket^\mathds{N} \\
+\forall n \in \mathds{N}^*, x^n = x^{n-1} \oplus S^n,
  \end{array}
  \right.
-$$
-\item whose decimal part is $s(x)^1, s(x)^2, \hdots$
-\end{itemize}
-\end{definition}
+\label{equation Oplus}
+\end{equation}
+where $\oplus$ is for the bitwise exclusive or between two integers. 
+This rewriting can be understood as follows. The $n-$th term $S^n$ of the
+sequence $S$, which is an integer of $\mathsf{N}$ binary digits, presents
+the list of cells to update in the state $x^n$ of the system (represented
+as an integer having $\mathsf{N}$ bits too). More precisely, the $k-$th 
+component of this state (a binary digit) changes if and only if the $k-$th 
+digit in the binary decomposition of $S^n$ is 1.
+\begin{color}{red} 
+Obviously, when $S$ is periodic of period $p$, then $x$ is periodic too of
+period either $p$ or $2p$, depending of the fact that, after $p$ iterations,
+the state of the system may or not be the same than before these iterations.
+\end{color}
+
+The single basic component presented in Eq.~\ref{equation Oplus} is of 
+ordinary use as a good elementary brick in various PRNGs. It corresponds
+to the following discrete dynamical system in chaotic iterations:
+
+\begin{equation}
+\forall    n\in     \mathds{N}^{\ast     },    \forall     i\in
+\llbracket1;\mathsf{N}\rrbracket ,x_i^n=\left\{
+\begin{array}{ll}
+  x_i^{n-1} &  \text{ if  } i \notin \mathcal{S}^n \\
+  \left(f(x^{n-1})\right)_{S^n} & \text{ if }i \in \mathcal{S}^n.
+\end{array}\right.
+\label{eq:generalIC}
+\end{equation}
+where $f$ is the vectorial negation and $\forall n \in \mathds{N}$, 
+$\mathcal{S}^n \subset \llbracket 1, \mathsf{N} \rrbracket$ is such that
+$k \in \mathcal{S}^n$ if and only if the $k-$th digit in the binary
+decomposition of $S^n$ is 1. Such chaotic iterations are more general
+than the ones presented in Definition \ref{Def:chaotic iterations} because, instead of updating only one term at each iteration,
+we select a subset of components to change.
+
+
+Obviously, replacing the previous CI PRNG Algorithms by 
+Equation~\ref{equation Oplus}, which is possible when the iteration function is
+the vectorial negation, leads to a speed improvement 
+(the resulting generator will be referred as ``Xor CI PRNG''
+in what follows).
+However, proofs
+of chaos obtained in~\cite{bg10:ij} have been established
+only for chaotic iterations of the form presented in Definition 
+\ref{Def:chaotic iterations}. The question is now to determine whether the
+use of more general chaotic iterations to generate pseudorandom numbers 
+faster, does not deflate their topological chaos properties.
+
+\subsection{Proofs of Chaos of the General Formulation of the Chaotic Iterations}
+\label{deuxième def}
+Let us consider the discrete dynamical systems in chaotic iterations having 
+the general form: $\forall    n\in     \mathds{N}^{\ast     }$, $  \forall     i\in
+\llbracket1;\mathsf{N}\rrbracket $,
+
+\begin{equation}
+  x_i^n=\left\{
+\begin{array}{ll}
+  x_i^{n-1} &  \text{ if  } i \notin \mathcal{S}^n \\
+  \left(f(x^{n-1})\right)_{S^n} & \text{ if }i \in \mathcal{S}^n.
+\end{array}\right.
+\label{general CIs}
+\end{equation}
+
+In other words, at the $n^{th}$ iteration, only the cells whose id is
+contained into the set $S^{n}$ are iterated.
+
+Let us now rewrite these general chaotic iterations as usual discrete dynamical
+system of the form $X^{n+1}=f(X^n)$ on an ad hoc metric space. Such a formulation
+is required in order to study the topological behavior of the system.
+
+Let us introduce the following function:
+\begin{equation}
+\begin{array}{cccc}
+ \chi: & \llbracket 1; \mathsf{N} \rrbracket \times \mathcal{P}\left(\llbracket 1; \mathsf{N} \rrbracket\right) & \longrightarrow & \mathds{B}\\
+         & (i,X) & \longmapsto  & \left\{ \begin{array}{ll} 0 & \textrm{if }i \notin X, \\ 1 & \textrm{if }i \in X,  \end{array}\right.
+\end{array} 
+\end{equation}
+where $\mathcal{P}\left(X\right)$ is for the powerset of the set $X$, that is, $Y \in \mathcal{P}\left(X\right) \Longleftrightarrow Y \subset X$.
+
+Given a function $f:\mathds{B}^\mathsf{N} \longrightarrow \mathds{B}^\mathsf{N} $, define the function:
+$F_{f}:  \mathcal{P}\left(\llbracket1;\mathsf{N}\rrbracket \right) \times \mathds{B}^{\mathsf{N}} 
+\longrightarrow \mathds{B}^{\mathsf{N}}$
+\begin{equation*}
+\begin{array}{rll}
+ (P,E) & \longmapsto & \left( E_{j}.\chi (j,P)+f(E)_{j}.\overline{\chi(j,P)}\right) _{j\in \llbracket1;\mathsf{N}\rrbracket}%
+\end{array}%
+\end{equation*}%
+where + and . are the Boolean addition and product operations, and $\overline{x}$ 
+is the negation of the Boolean $x$.
+Consider the phase space:
+\begin{equation}
+\mathcal{X} = \mathcal{P}\left(\llbracket 1 ; \mathsf{N} \rrbracket\right)^\mathds{N} \times
+\mathds{B}^\mathsf{N},
+\end{equation}
+\noindent and the map defined on $\mathcal{X}$:
+\begin{equation}
+G_f\left(S,E\right) = \left(\sigma(S), F_f(i(S),E)\right), %\label{Gf} %%RAPH, j'ai viré ce label qui existe déjà avant...
+\end{equation}
+\noindent where $\sigma$ is the \emph{shift} function defined by $\sigma
+(S^{n})_{n\in \mathds{N}}\in \mathcal{P}\left(\llbracket 1 ; \mathsf{N} \rrbracket\right)^\mathds{N}\longrightarrow (S^{n+1})_{n\in
+\mathds{N}}\in \mathcal{P}\left(\llbracket 1 ; \mathsf{N} \rrbracket\right)^\mathds{N}$ and $i$ is the \emph{initial function} 
+$i:(S^{n})_{n\in \mathds{N}} \in \mathcal{P}\left(\llbracket 1 ; \mathsf{N} \rrbracket\right)^\mathds{N}\longrightarrow S^{0}\in \mathcal{P}\left(\llbracket 1 ; \mathsf{N} \rrbracket\right)$. 
+Then the general chaotic iterations defined in Equation \ref{general CIs} can 
+be described by the following discrete dynamical system:
+\begin{equation}
+\left\{
+\begin{array}{l}
+X^0 \in \mathcal{X} \\
+X^{k+1}=G_{f}(X^k).%
+\end{array}%
+\right.
+\end{equation}%
+
+Once more, a shift function appears as a component of these general chaotic 
+iterations. 
+
+To study the Devaney's chaos property, a distance between two points 
+$X = (S,E), Y = (\check{S},\check{E})$ of $\mathcal{X}$ must be defined.
+Let us introduce:
+\begin{equation}
+d(X,Y)=d_{e}(E,\check{E})+d_{s}(S,\check{S}),
+\label{nouveau d}
+\end{equation}
+\noindent where $ \displaystyle{d_{e}(E,\check{E})} = \displaystyle{\sum_{k=1}^{\mathsf{N}%
+ }\delta (E_{k},\check{E}_{k})}$  is once more the Hamming distance, and
+$  \displaystyle{d_{s}(S,\check{S})}  =  \displaystyle{\dfrac{9}{\mathsf{N}}%
+ \sum_{k=1}^{\infty }\dfrac{|S^k\Delta {S}^k|}{10^{k}}}$,
+%%RAPH : ici, j'ai supprimé tous les sauts à la ligne
+%% \begin{equation}
+%% \left\{
+%% \begin{array}{lll}
+%% \displaystyle{d_{e}(E,\check{E})} & = & \displaystyle{\sum_{k=1}^{\mathsf{N}%
+%% }\delta (E_{k},\check{E}_{k})} \textrm{ is once more the Hamming distance}, \\
+%% \displaystyle{d_{s}(S,\check{S})} & = & \displaystyle{\dfrac{9}{\mathsf{N}}%
+%% \sum_{k=1}^{\infty }\dfrac{|S^k\Delta {S}^k|}{10^{k}}}.%
+%% \end{array}%
+%% \right.
+%% \end{equation}
+where $|X|$ is the cardinality of a set $X$ and $A\Delta B$ is for the symmetric difference, defined for sets A, B as
+$A\,\Delta\,B = (A \setminus B) \cup (B \setminus A)$.
  
-\bigskip
  
+\begin{proposition}
+The function $d$ defined in Eq.~\ref{nouveau d} is a metric on $\mathcal{X}$.
+\end{proposition}
  
-In other words, if $x = \displaystyle{\sum_{k=0}^{9} 2^{9-k} e_k +  \sum_{k=0}^{+\infty} s^{k} ~10^{-k-1}}$, then: $$g(x) = \displaystyle{\sum_{k=0}^{9} 2^{9-k} (e_k + \delta(k,s^0) \textrm{ (mod 2)}) +  \sum_{k=0}^{+\infty} s^{k+1} 10^{-k-1}}.$$
+\begin{proof}
+ $d_e$ is the Hamming distance. We will prove that $d_s$ is a distance
+too, thus $d$, as being the sum of two distances, will also be a distance.
+ \begin{itemize}
+\item Obviously, $d_s(S,\check{S})\geqslant 0$, and if $S=\check{S}$, then 
+$d_s(S,\check{S})=0$. Conversely, if $d_s(S,\check{S})=0$, then 
+$\forall k \in \mathds{N}, |S^k\Delta {S}^k|=0$, and so $\forall k, S^k=\check{S}^k$.
+ \item $d_s$ is symmetric 
+($d_s(S,\check{S})=d_s(\check{S},S)$) due to the commutative property
+of the symmetric difference. 
+\item Finally, $|S \Delta S''| = |(S \Delta \varnothing) \Delta S''|= |S \Delta (S'\Delta S') \Delta S''|= |(S \Delta S') \Delta (S' \Delta S'')|\leqslant |S \Delta S'| + |S' \Delta S''|$, 
+and so for all subsets $S,S',$ and $S''$ of $\llbracket 1, \mathsf{N} \rrbracket$, 
+we have $d_s(S,S'') \leqslant d_e(S,S')+d_s(S',S'')$, and the triangle
+inequality is obtained.
+ \end{itemize}
+\end{proof}
  
-\subsubsection{Defining a metric on $\big[ 0, 2^{10} \big[$}
  
-Numerous metrics can be defined on the set $\big[ 0, 2^{10} \big[$, the most usual one being the Euclidian distance recalled bellow:
+Before being able to study the topological behavior of the general 
+chaotic iterations, we must first establish that:
+
+\begin{proposition}
+ For all $f:\mathds{B}^\mathsf{N} \longrightarrow \mathds{B}^\mathsf{N} $, the function $G_f$ is continuous on 
+$\left( \mathcal{X},d\right)$.
+\end{proposition}
  
-\begin{notation}
-\index{distance!euclidienne}
-$\Delta$ is the Euclidian distance on $\big[ 0, 2^{10} \big[$, that is, $\Delta(x,y) = |y-x|^2$.
-\end{notation}
  
+\begin{proof}
+We use the sequential continuity.
+Let $(S^n,E^n)_{n\in \mathds{N}}$ be a sequence of the phase space $%
+\mathcal{X}$, which converges to $(S,E)$. We will prove that $\left(
+G_{f}(S^n,E^n)\right) _{n\in \mathds{N}}$ converges to $\left(
+G_{f}(S,E)\right) $. Let us remark that for all $n$, $S^n$ is a strategy,
+thus, we consider a sequence of strategies (\emph{i.e.}, a sequence of
+sequences).\newline
+As $d((S^n,E^n);(S,E))$ converges to 0, each distance $d_{e}(E^n,E)$ and $d_{s}(S^n,S)$ converges
+to 0. But $d_{e}(E^n,E)$ is an integer, so $\exists n_{0}\in \mathds{N},$ $%
+d_{e}(E^n,E)=0$ for any $n\geqslant n_{0}$.\newline
+In other words, there exists a threshold $n_{0}\in \mathds{N}$ after which no
+cell will change its state:
+$\exists n_{0}\in \mathds{N},n\geqslant n_{0}\Rightarrow E^n = E.$
+
+In addition, $d_{s}(S^n,S)\longrightarrow 0,$ so $\exists n_{1}\in %
+\mathds{N},d_{s}(S^n,S)<10^{-1}$ for all indexes greater than or equal to $%
+n_{1}$. This means that for $n\geqslant n_{1}$, all the $S^n$ have the same
+first term, which is $S^0$: $\forall n\geqslant n_{1},S_0^n=S_0.$
+
+Thus, after the $max(n_{0},n_{1})^{th}$ term, states of $E^n$ and $E$ are
+identical and strategies $S^n$ and $S$ start with the same first term.\newline
+Consequently, states of $G_{f}(S^n,E^n)$ and $G_{f}(S,E)$ are equal,
+so, after the $max(n_0, n_1)^{th}$ term, the distance $d$ between these two points is strictly less than 1.\newline
+\noindent We now prove that the distance between $\left(
+G_{f}(S^n,E^n)\right) $ and $\left( G_{f}(S,E)\right) $ is convergent to
+0. Let $\varepsilon >0$. \medskip
+\begin{itemize}
+\item If $\varepsilon \geqslant 1$, we see that the distance
+between $\left( G_{f}(S^n,E^n)\right) $ and $\left( G_{f}(S,E)\right) $ is
+strictly less than 1 after the $max(n_{0},n_{1})^{th}$ term (same state).
  \medskip
+\item If $\varepsilon <1$, then $\exists k\in \mathds{N},10^{-k}\geqslant
+\varepsilon > 10^{-(k+1)}$. But $d_{s}(S^n,S)$ converges to 0, so
+\begin{equation*}
+\exists n_{2}\in \mathds{N},\forall n\geqslant
+n_{2},d_{s}(S^n,S)<10^{-(k+2)},
+\end{equation*}%
+thus after $n_{2}$, the $k+2$ first terms of $S^n$ and $S$ are equal.
+\end{itemize}
+\noindent As a consequence, the $k+1$ first entries of the strategies of $%
+G_{f}(S^n,E^n)$ and $G_{f}(S,E)$ are the same ($G_{f}$ is a shift of strategies) and due to the definition of $d_{s}$, the floating part of
+the distance between $(S^n,E^n)$ and $(S,E)$ is strictly less than $%
+10^{-(k+1)}\leqslant \varepsilon $.
+
+In conclusion,
+%%RAPH : ici j'ai rajouté une ligne
+%%TOF : ici j'ai rajouté un commentaire
+%%TOF : ici aussi
+$
+\forall \varepsilon >0,$ $\exists N_{0}=max(n_{0},n_{1},n_{2})\in \mathds{N}
+,$ $\forall n\geqslant N_{0},$
+$ d\left( G_{f}(S^n,E^n);G_{f}(S,E)\right)
+\leqslant \varepsilon .
+$
+$G_{f}$ is consequently continuous.
+\end{proof}
  
-This Euclidian distance does not reproduce exactly the notion of proximity induced by our first distance $d$ on $\X$. Indeed $d$ is finer than $\Delta$. This is the reason why we have to introduce the following metric:
  
+It is now possible to study the topological behavior of the general chaotic
+iterations. We will prove that,
  
+\begin{theorem}
+\label{t:chaos des general}
+ The general chaotic iterations defined on Equation~\ref{general CIs} satisfy
+the Devaney's property of chaos.
+\end{theorem}
  
-\begin{definition}
-Let $x,y \in \big[ 0, 2^{10} \big[$.
-$D$ denotes the function from $\big[ 0, 2^{10} \big[^2$ to $\mathds{R}^+$ defined by: $D(x,y) = D_e\left(e(x),e(y)\right) + D_s\left(s(x),s(y)\right)$, where:
-\begin{center}
-$\displaystyle{D_e(E,\check{E}) = \sum_{k=0}^\mathsf{9} \delta (E_k, \check{E}_k)}$, ~~and~ $\displaystyle{D_s(S,\check{S}) = \sum_{k = 1}^\infty \dfrac{|S^k-\check{S}^k|}{10^k}}$.
-\end{center}
-\end{definition}
+Let us firstly prove the following lemma.
  
-\begin{proposition}
-$D$ is a distance on $\big[ 0, 2^{10} \big[$.
-\end{proposition}
+\begin{lemma}[Strong transitivity]
+\label{strongTrans}
+ For all couples $X,Y \in \mathcal{X}$ and any neighborhood $V$ of $X$, we can 
+find $n \in \mathds{N}^*$ and $X' \in V$ such that $G^n(X')=Y$.
+\end{lemma}
  
  \begin{proof}
-The three axioms defining a distance must be checked.
+ Let $X=(S,E)$, $\varepsilon>0$, and $k_0 = \lfloor log_{10}(\varepsilon)+1 \rfloor$. 
+Any point $X'=(S',E')$ such that $E'=E$ and $\forall k \leqslant k_0, S'^k=S^k$, 
+are in the open ball $\mathcal{B}\left(X,\varepsilon\right)$. Let us define 
+$\check{X} = \left(\check{S},\check{E}\right)$, where $\check{X}= G^{k_0}(X)$.
+We denote by $s\subset \llbracket 1; \mathsf{N} \rrbracket$ the set of coordinates
+that are different between $\check{E}$ and the state of $Y$. Thus each point $X'$ of
+the form $(S',E')$ where $E'=E$ and $S'$ starts with 
+$(S^0, S^1, \hdots, S^{k_0},s,\hdots)$, verifies the following properties:
  \begin{itemize}
-\item $D \geqslant 0$, because everything is positive in its definition. If $D(x,y)=0$, then $D_e(x,y)=0$, so the integral parts of $x$ and $y$ are equal (they have the same binary decomposition). Additionally, $D_s(x,y) = 0$, then $\forall k \in \mathds{N}^*, s(x)^k = s(y)^k$. In other words, $x$ and $y$ have the same $k-$th decimal digit, $\forall k \in \mathds{N}^*$. And so $x=y$.
-\item $D(x,y)=D(y,x)$.
-\item Finally, the triangular inequality is obtained due to the fact that both $\delta$ and $\Delta(x,y)=|x-y|$ satisfy it.
+ \item $X'$ is in $\mathcal{B}\left(X,\varepsilon\right)$,
+ \item the state of $G_f^{k_0+1}(X')$ is the state of $Y$.
  \end{itemize}
+Finally the point $\left(\left(S^0, S^1, \hdots, S^{k_0},s,s^0, s^1, \hdots\right); E\right)$, 
+where $(s^0,s^1, \hdots)$ is the strategy of $Y$, satisfies the properties
+claimed in the lemma.
  \end{proof}
  
+We can now prove the Theorem~\ref{t:chaos des general}.
+
+\begin{proof}[Theorem~\ref{t:chaos des general}]
+Firstly, strong transitivity implies transitivity.
+
+Let $(S,E) \in\mathcal{X}$ and $\varepsilon >0$. To
+prove that $G_f$ is regular, it is sufficient to prove that
+there exists a strategy $\tilde S$ such that the distance between
+$(\tilde S,E)$ and $(S,E)$ is less than $\varepsilon$, and such that
+$(\tilde S,E)$ is a periodic point.
+
+Let $t_1=\lfloor-\log_{10}(\varepsilon)\rfloor$, and let $E'$ be the
+configuration that we obtain from $(S,E)$ after $t_1$ iterations of
+$G_f$. As $G_f$ is strongly transitive, there exists a strategy $S'$ 
+and $t_2\in\mathds{N}$ such
+that $E$ is reached from $(S',E')$ after $t_2$ iterations of $G_f$.
+
+Consider the strategy $\tilde S$ that alternates the first $t_1$ terms
+of $S$ and the first $t_2$ terms of $S'$: 
+%%RAPH : j'ai coupé la ligne en 2
+$$\tilde
+S=(S_0,\dots,S_{t_1-1},S'_0,\dots,S'_{t_2-1},S_0,$$$$\dots,S_{t_1-1},S'_0,\dots,S'_{t_2-1},S_0,\dots).$$ It
+is clear that $(\tilde S,E)$ is obtained from $(\tilde S,E)$ after
+$t_1+t_2$ iterations of $G_f$. So $(\tilde S,E)$ is a periodic
+point. Since $\tilde S_t=S_t$ for $t<t_1$, by the choice of $t_1$, we
+have $d((S,E),(\tilde S,E))<\epsilon$.
+\end{proof}
  
-The convergence of sequences according to $D$ is not the same than the usual convergence related to the Euclidian metric. For instance, if $x^n \to x$ according to $D$, then necessarily the integral part of each $x^n$ is equal to the integral part of $x$ (at least after a given threshold), and the decimal part of $x^n$ corresponds to the one of $x$ ``as far as required''.
-To illustrate this fact, a comparison between $D$ and the Euclidian distance is given Figure \ref{fig:comparaison de distances}. These illustrations show that $D$ is richer and more refined than the Euclidian distance, and thus is more precise.
-
-
-\begin{figure}[t]
-\begin{center}
-  \subfigure[Function $x \to dist(x;1,234) $ on the interval $(0;5)$.]{\includegraphics[scale=.35]{DvsEuclidien.pdf}}\quad
-  \subfigure[Function $x \to dist(x;3) $ on the interval $(0;5)$.]{\includegraphics[scale=.35]{DvsEuclidien2.pdf}}
-\end{center}
-\caption{Comparison between $D$ (in blue) and the Euclidian distane (in green).}
-\label{fig:comparaison de distances}
-\end{figure}
  
+%\section{Statistical Improvements Using Chaotic Iterations}
+
+%\label{The generation of pseudorandom sequence}
+
+
+%Let us now explain why we have reasonable ground to believe that chaos 
+%can improve statistical properties.
+%We will show in this section that chaotic properties as defined in the
+%mathematical theory of chaos are related to some statistical tests that can be found
+%in the NIST battery. Furthermore, we will check that, when mixing defective PRNGs with
+%chaotic iterations, the new generator presents better statistical properties
+%(this section summarizes and extends the work of~\cite{bfg12a:ip}).
+
+
+
+%\subsection{Qualitative relations between topological properties and statistical tests}
+
+
+%There are various relations between topological properties that describe an unpredictable behavior for a discrete 
+%dynamical system on the one
+%hand, and statistical tests to check the randomness of a numerical sequence
+%on the other hand. These two mathematical disciplines follow a similar 
+%objective in case of a recurrent sequence (to characterize an intrinsically complicated behavior for a
+%recurrent sequence), with two different but complementary approaches.
+%It is true that the following illustrative links give only qualitative arguments, 
+%and proofs should be provided later to make such arguments irrefutable. However 
+%they give a first understanding of the reason why we think that chaotic properties should tend
+%to improve the statistical quality of PRNGs.
+%%
+%Let us now list some of these relations between topological properties defined in the mathematical
+%theory of chaos and tests embedded into the NIST battery. %Such relations need to be further 
+%%investigated, but they presently give a first illustration of a trend to search similar properties in the 
+%%two following fields: mathematical chaos and statistics.
+
+
+%\begin{itemize}
+%    \item \textbf{Regularity}. As stated in Section~\ref{subsec:Devaney}, a chaotic dynamical system must 
+%have an element of regularity. Depending on the chosen definition of chaos, this element can be the existence of
+%a dense orbit, the density of periodic points, etc. The key idea is that a dynamical system with no periodicity
+%is not as chaotic as a system having periodic orbits: in the first situation, we can predict something and gain a
+%knowledge about the behavior of the system, that is, it never enters into a loop. A similar importance for periodicity is emphasized in
+%the two following NIST tests~\cite{Nist10}:
+%    \begin{itemize}
+%        \item \textbf{Non-overlapping Template Matching Test}. Detect generators that produce too many occurrences of a given non-periodic (aperiodic) pattern.
+%        \item \textbf{Discrete Fourier Transform (Spectral) Test}. Detect periodic features (i.e., repetitive patterns that are close one to another) in the tested sequence that would indicate a deviation from the assumption of randomness.
+%    \end{itemize}
+
+%\item \textbf{Transitivity}. This topological property previously introduced  states that the dynamical system is intrinsically complicated: it cannot be simplified into 
+%two subsystems that do not interact, as we can find in any neighborhood of any point another point whose orbit visits the whole phase space. 
+%This focus on the places visited by the orbits of the dynamical system takes various nonequivalent formulations in the mathematical theory
+%of chaos, namely: transitivity, strong transitivity, total transitivity, topological mixing, and so on~\cite{bg10:ij}. A similar attention 
+%is brought on the states visited during a random walk in the two tests below~\cite{Nist10}:
+%    \begin{itemize}
+%        \item \textbf{Random Excursions Variant Test}. Detect deviations from the expected number of visits to various states in the random walk.
+%        \item \textbf{Random Excursions Test}. Determine if the number of visits to a particular state within a cycle deviates from what one would expect for a random sequence.
+%    \end{itemize}
+
+%\item \textbf{Chaos according to Li and Yorke}. Two points of the phase space $(x,y)$ define a couple of Li-Yorke when $\limsup_{n \rightarrow +\infty} d(f^{(n)}(x), f^{(n)}(y))>0$ et $\liminf_{n \rightarrow +\infty} d(f^{(n)}(x), f^{(n)}(y))=0$, meaning that their orbits always oscillate as the iterations pass. When a system is compact and contains an uncountable set of such points, it is claimed as chaotic according
+%to Li-Yorke~\cite{Li75,Ruette2001}. A similar property is regarded in the following NIST test~\cite{Nist10}.
+%    \begin{itemize}
+%        \item \textbf{Runs Test}. To determine whether the number of runs of ones and zeros of various lengths is as expected for a random sequence. In particular, this test determines whether the oscillation between such zeros and ones is too fast or too slow.
+%    \end{itemize}
+%    \item \textbf{Topological entropy}. The desire to formulate an equivalency of the thermodynamics entropy
+%has emerged both in the topological and statistical fields. Once again, a similar objective has led to two different
+%rewritting of an entropy based disorder: the famous Shannon definition of entropy is approximated in the statistical approach, 
+%whereas topological entropy is defined as follows:
+%$x,y \in \mathcal{X}$ are $\varepsilon-$\emph{separated in time $n$} if there exists $k \leqslant n$ such that $d\left(f^{(k)}(x),f^{(k)}(y)\right)>\varepsilon$. Then $(n,\varepsilon)-$separated sets are sets of points that are all $\varepsilon-$separated in time $n$, which
+%leads to the definition of $s_n(\varepsilon,Y)$, being the maximal cardinality of all $(n,\varepsilon)-$separated sets. Using these notations, 
+%the topological entropy is defined as follows: $$h_{top}(\mathcal{X},f)  = \displaystyle{\lim_{\varepsilon \rightarrow 0} \Big[ \limsup_{n \rightarrow +\infty} \dfrac{1}{n} \log s_n(\varepsilon,\mathcal{X})\Big]}.$$
+%This value measures the average exponential growth of the number of distinguishable orbit segments. 
+%In this sense, it measures the complexity of the topological dynamical system, whereas 
+%the Shannon approach comes to mind when defining the following test~\cite{Nist10}:
+%    \begin{itemize}
+%\item \textbf{Approximate Entropy Test}. Compare the frequency of the overlapping blocks of two consecutive/adjacent lengths ($m$ and $m+1$) against the expected result for a random sequence.
+%    \end{itemize}
+
+%    \item \textbf{Non-linearity, complexity}. Finally, let us remark that non-linearity and complexity are 
+%not only sought in general to obtain chaos, but they are also required for randomness, as illustrated by the two tests below~\cite{Nist10}.
+%    \begin{itemize}
+%\item \textbf{Binary Matrix Rank Test}. Check for linear dependence among fixed length substrings of the original sequence.
+%\item \textbf{Linear Complexity Test}. Determine whether or not the sequence is complex enough to be considered random.
+%      \end{itemize}
+%\end{itemize}
+
+
+%We have proven in our previous works~\cite{guyeux12:bc} that chaotic iterations satisfying Theorem~\ref{Th:Caractérisation   des   IC   chaotiques} are, among other
+%things, strongly transitive, topologically mixing, chaotic as defined by Li and Yorke,
+%and that they have a topological entropy and an exponent of Lyapunov both equal to $ln(\mathsf{N})$,
+%where $\mathsf{N}$ is the size of the iterated vector.
+%These topological properties make that we are ground to believe that a generator based on chaotic
+%iterations will probably be able to pass all the existing statistical batteries for pseudorandomness like
+%the NIST one. The following subsections, in which we prove that defective generators have their
+%statistical properties improved by chaotic iterations, show that such an assumption is true.
+
+%\subsection{Details of some Existing Generators}
+
+%The list of defective PRNGs we will use 
+%as inputs for the statistical tests to come is introduced here.
+
+%Firstly, the simple linear congruency generators (LCGs) will be used. 
+%They are defined by the following recurrence:
+%\begin{equation}
+%x^n = (ax^{n-1} + c)~mod~m,
+%\label{LCG}
+%\end{equation}
+%where $a$, $c$, and $x^0$ must be, among other things, non-negative and inferior to 
+%$m$~\cite{LEcuyerS07}. In what follows, 2LCGs and 3LCGs refer to two (resp. three) 
+%combinations of such LCGs. For further details, see~\cite{bfg12a:ip,combined_lcg}.
+
+%Secondly, the multiple recursive generators (MRGs) which will be used,
+%are based on a linear recurrence of order 
+%$k$, modulo $m$~\cite{LEcuyerS07}:
+%\begin{equation}
+%x^n = (a^1x^{n-1}+~...~+a^kx^{n-k})~mod~m .
+%\label{MRG}
+%\end{equation}
+%The combination of two MRGs (referred as 2MRGs) is also used in these experiments.
+
+%Generators based on linear recurrences with carry will be regarded too.
+%This family of generators includes the add-with-carry (AWC) generator, based on the recurrence:
+%\begin{equation}
+%\label{AWC}
+%\begin{array}{l}
+%x^n = (x^{n-r} + x^{n-s} + c^{n-1})~mod~m, \\
+%c^n= (x^{n-r} + x^{n-s} + c^{n-1}) / m, \end{array}\end{equation}
+%the SWB generator, having the recurrence:
+%\begin{equation}
+%\label{SWB}
+%\begin{array}{l}
+%x^n = (x^{n-r} - x^{n-s} - c^{n-1})~mod~m, \\
+%c^n=\left\{
+%\begin{array}{l}
+%1 ~~~~~\text{if}~ (x^{i-r} - x^{i-s} - c^{i-1})<0\\
+%0 ~~~~~\text{else},\end{array} \right. \end{array}\end{equation}
+%and the SWC generator, which is based on the following recurrence:
+%\begin{equation}
+%\label{SWC}
+%\begin{array}{l}
+%x^n = (a^1x^{n-1} \oplus ~...~ \oplus a^rx^{n-r} \oplus c^{n-1}) ~ mod ~ 2^w, \\
+%c^n = (a^1x^{n-1} \oplus ~...~ \oplus a^rx^{n-r} \oplus c^{n-1}) ~ / ~ 2^w. \end{array}\end{equation}
+
+%Then the generalized feedback shift register (GFSR) generator has been implemented, that is:
+%\begin{equation}
+%x^n = x^{n-r} \oplus x^{n-k} .
+%\label{GFSR}
+%\end{equation}
+
+
+%Finally, the nonlinear inversive (INV) generator~\cite{LEcuyerS07} has been studied, which is:
+
+%\begin{equation}
+%\label{INV}
+%\begin{array}{l}
+%x^n=\left\{
+%\begin{array}{ll}
+%(a^1 + a^2 / z^{n-1})~mod~m & \text{if}~ z^{n-1} \neq 0 \\
+%a^1 & \text{if}~  z^{n-1} = 0 .\end{array} \right. \end{array}\end{equation}
+
+
+
+%\begin{table}
+%%\renewcommand{\arraystretch}{1}
+%\caption{TestU01 Statistical Test Failures}
+%\label{TestU011}
+%\centering
+%  \begin{tabular}{lccccc}
+%    \toprule
+%Test name &Tests& Logistic            & XORshift      & ISAAC\\
+%Rabbit                                &       38      &21             &14     &0       \\
+%Alphabit                      &       17      &16             &9      &0       \\
+%Pseudo DieHARD                        &126    &0              &2      &0      \\
+%FIPS\_140\_2                  &16     &0              &0      &0      \\
+%SmallCrush                    &15     &4              &5      &0       \\
+%Crush                                 &144    &95             &57     &0       \\
+%Big Crush                     &160    &125            &55     &0       \\ \hline
+%Failures              &       &261            &146    &0       \\
+%\bottomrule
+%  \end{tabular}
+%\end{table}
+
+
+
+%\begin{table}
+%%\renewcommand{\arraystretch}{1}
+%\caption{TestU01 Statistical Test Failures for Old CI algorithms ($\mathsf{N}=4$)}
+%\label{TestU01 for Old CI}
+%\centering
+%  \begin{tabular}{lcccc}
+%    \toprule
+%\multirow{3}*{Test name} & \multicolumn{4}{c}{Old CI}\\
+%&Logistic& XORshift& ISAAC&ISAAC  \\ 
+%&+& +& + & + \\ 
+%&Logistic& XORshift& XORshift&ISAAC  \\ \cmidrule(r){2-5}
+%Rabbit                                        &7      &2      &0      &0       \\
+%Alphabit                              & 3     &0      &0      &0       \\
+%DieHARD                       &0      &0      &0      &0      \\
+%FIPS\_140\_2                  &0      &0      &0      &0      \\
+%SmallCrush                            &2      &0      &0      &0       \\
+%Crush                                         &47     &4      &0      &0       \\
+%Big Crush                             &79     &3      &0      &0       \\ \hline
+%Failures                              &138    &9      &0      &0       \\
+%\bottomrule
+%  \end{tabular}
+%\end{table}
+
+
+
+
+
+%\subsection{Statistical tests}
+%\label{Security analysis}
+
+%Three batteries of tests are reputed and regularly used
+%to evaluate the statistical properties of newly designed pseudorandom
+%number generators. These batteries are named DieHard~\cite{Marsaglia1996},
+%the NIST suite~\cite{ANDREW2008}, and the most stringent one called
+%TestU01~\cite{LEcuyerS07}, which encompasses the two other batteries.
+
+
+
+%\label{Results and discussion}
+%\begin{table*}
+%%\renewcommand{\arraystretch}{1}
+%\caption{NIST and DieHARD tests suite passing rates for PRNGs without CI}
+%\label{NIST and DieHARD tests suite passing rate the for PRNGs without CI}
+%\centering
+%  \begin{tabular}{|l||c|c|c|c|c|c|c|c|c|c|}
+%    \hline\hline
+%Types of PRNGs & \multicolumn{2}{c|}{Linear PRNGs} & \multicolumn{4}{c|}{Lagged PRNGs} & \multicolumn{1}{c|}{ICG PRNGs} & \multicolumn{3}{c|}{Mixed PRNGs}\\ \hline
+%\backslashbox{\textbf{$Tests$}} {\textbf{$PRNG$}} & LCG& MRG& AWC & SWB  & SWC & GFSR & INV & LCG2& LCG3& MRG2 \\ \hline
+%NIST & 11/15 & 14/15 &\textbf{15/15} & \textbf{15/15}   & 14/15 & 14/15  & 14/15 & 14/15& 14/15& 14/15 \\ \hline
+%DieHARD & 16/18 & 16/18 & 15/18 & 16/18 & \textbf{18/18} & 16/18 & 16/18 & 16/18& 16/18& 16/18\\ \hline
+%\end{tabular}
+%\end{table*}
+
+%Table~\ref{NIST and DieHARD tests suite passing rate the for PRNGs without CI} shows the 
+%results on the two first batteries recalled above, indicating that all the PRNGs presented
+%in the previous section
+%cannot pass all these tests. In other words, the statistical quality of these PRNGs cannot 
+%fulfill the up-to-date standards presented previously. We have shown in~\cite{bfg12a:ip} that the use of chaotic
+%iterations can solve this issue.
+%%More precisely, to
+%%illustrate the effects of chaotic iterations on these defective PRNGs, experiments have been divided in three parts~\cite{bfg12a:ip}:
+%%\begin{enumerate}
+%%  \item \textbf{Single CIPRNG}: The PRNGs involved in CI computing are of the same category.
+%%  \item \textbf{Mixed CIPRNG}: Two different types of PRNGs are mixed during the chaotic iterations process.
+%%  \item \textbf{Multiple CIPRNG}: The generator is obtained by repeating the composition of the iteration function as follows: $x^0\in \mathds{B}^{\mathsf{N}}$, and $\forall n\in \mathds{N}^{\ast },\forall i\in \llbracket1;\mathsf{N}\rrbracket, x_i^n=$
+%%\begin{equation}
+%%\begin{array}{l}
+%%\left\{
+%%\begin{array}{l}
+%%x_i^{n-1}~~~~~\text{if}~S^n\neq i \\
+%%\forall j\in \llbracket1;\mathsf{m}\rrbracket,f^m(x^{n-1})_{S^{nm+j}}~\text{if}~S^{nm+j}=i.\end{array} \right. \end{array}
+%%\end{equation}
+%%$m$ is called the \emph{functional power}.
+%%\end{enumerate}
+%%
+%The obtained results are reproduced in Table
+%\ref{NIST and DieHARD tests suite passing rate the for single CIPRNGs}.
+%The scores written in boldface indicate that all the tests have been passed successfully, whereas an 
+%asterisk ``*'' means that the considered passing rate has been improved.
+%The improvements are obvious for both the ``Old CI'' and the ``New CI'' generators.
+%Concerning the ``Xor CI PRNG'', the score is less spectacular. Because of a large speed improvement, the statistics
+% are not as good as for the two other versions of these CIPRNGs.
+%However 8 tests have been improved (with no deflation for the other results).
+
+
+%\begin{table*}
+%%\renewcommand{\arraystretch}{1.3}
+%\caption{NIST and DieHARD tests suite passing rates for PRNGs with CI}
+%\label{NIST and DieHARD tests suite passing rate the for single CIPRNGs}
+%\centering
+%  \begin{tabular}{|l||c|c|c|c|c|c|c|c|c|c|c|c|}
+%    \hline
+%Types of PRNGs & \multicolumn{2}{c|}{Linear PRNGs} & \multicolumn{4}{c|}{Lagged PRNGs} & \multicolumn{1}{c|}{ICG PRNGs} & \multicolumn{3}{c|}{Mixed PRNGs}\\ \hline
+%\backslashbox{\textbf{$Tests$}} {\textbf{$Single~CIPRNG$}} & LCG  & MRG & AWC & SWB & SWC & GFSR & INV& LCG2 & LCG3& MRG2 \\ \hline\hline
+%Old CIPRNG\\ \hline \hline
+%NIST & \textbf{15/15} *  & \textbf{15/15} * & \textbf{15/15}   & \textbf{15/15}   & \textbf{15/15} * & \textbf{15/15} * & \textbf{15/15} *& \textbf{15/15} * & \textbf{15/15} * & \textbf{15/15} \\ \hline
+%DieHARD & \textbf{18/18} *  & \textbf{18/18} * & \textbf{18/18} *  & \textbf{18/18} *  & \textbf{18/18}  & \textbf{18/18} * & \textbf{18/18} *& \textbf{18/18} * & \textbf{18/18} *& \textbf{18/18} * \\ \hline
+%New CIPRNG\\ \hline \hline
+%NIST & \textbf{15/15} *  & \textbf{15/15} * & \textbf{15/15}   & \textbf{15/15}  & \textbf{15/15} * & \textbf{15/15} * & \textbf{15/15} *& \textbf{15/15} * & \textbf{15/15} * & \textbf{15/15} \\ \hline
+%DieHARD & \textbf{18/18} *  & \textbf{18/18} * & \textbf{18/18} * & \textbf{18/18} * & \textbf{18/18}  & \textbf{18/18} * & \textbf{18/18} * & \textbf{18/18} * & \textbf{18/18} *& \textbf{18/18} *\\ \hline
+%Xor CIPRNG\\ \hline\hline
+%NIST & 14/15*& \textbf{15/15} *   & \textbf{15/15}   & \textbf{15/15}   & 14/15 & \textbf{15/15} * & 14/15& \textbf{15/15} * & \textbf{15/15} *& \textbf{15/15}  \\ \hline
+%DieHARD & 16/18 & 16/18 & 17/18* & \textbf{18/18} * & \textbf{18/18}  & \textbf{18/18} * & 16/18 & 16/18 & 16/18& 16/18\\ \hline
+%\end{tabular}
+%\end{table*}
+
+
+%We have then investigated in~\cite{bfg12a:ip} if it were possible to improve
+%the statistical behavior of the Xor CI version by combining more than one 
+%$\oplus$ operation. Results are summarized in Table~\ref{threshold}, illustrating
+%the progressive increasing effects of chaotic iterations, when giving time to chaos to get settled in.
+%Thus rapid and perfect PRNGs, regarding the NIST and DieHARD batteries, can be obtained 
+%using chaotic iterations on defective generators.
+
+%\begin{table*}
+%%\renewcommand{\arraystretch}{1.3}
+%\caption{Number of $\oplus$ operations to pass the whole NIST and DieHARD batteries}
+%\label{threshold}
+%\centering
+%  \begin{tabular}{|l||c|c|c|c|c|c|c|c|}
+%    \hline
+%Inputted $PRNG$ & LCG & MRG & SWC & GFSR & INV& LCG2 & LCG3  & MRG2 \\ \hline\hline
+%Threshold  value $m$& 19 & 7  & 2& 1 & 11& 9& 3& 4\\ \hline\hline
+%\end{tabular}
+%\end{table*}
+
+%Finally, the TestU01 battery has been launched on three well-known generators 
+%(a logistic map, a simple XORshift, and the cryptographically secure ISAAC, 
+%see Table~\ref{TestU011}). These results can be compared with 
+%Table~\ref{TestU01 for Old CI}, which gives the scores obtained by the
+%Old CI PRNG that has received these generators.
+%The obvious improvement speaks for itself, and together with the other
+%results recalled in this section, it reinforces the opinion that a strong
+%correlation between topological properties and statistical behavior exists.
+
+
+%The next subsection will now give a concrete original implementation of the Xor CI PRNG, the
+%fastest generator in the chaotic iteration based family. In the remainder,
+%this generator will be simply referred to as CIPRNG, or ``the proposed PRNG'', if this statement does not
+%raise ambiguity.
+
+
+\section{Toward Efficiency and Improvement for CI PRNG}
+\label{sec:efficient PRNG}
+
+\subsection{First Efficient Implementation of a PRNG based on Chaotic Iterations}
+%
+%Based on the proof presented in the previous section, it is now possible to 
+%improve the speed of the generator formerly presented in~\cite{bgw09:ip,guyeux10}. 
+%The first idea is to consider
+%that the provided strategy is a pseudorandom Boolean vector obtained by a
+%given PRNG.
+%An iteration of the system is simply the bitwise exclusive or between
+%the last computed state and the current strategy.
+%Topological properties of disorder exhibited by chaotic 
+%iterations can be inherited by the inputted generator, we hope by doing so to 
+%obtain some statistical improvements while preserving speed.
+%
+%%RAPH : j'ai viré tout ca
+%% Let us give an example using 16-bits numbers, to clearly understand how the bitwise xor operations
+%% are
+%% done.  
+%% Suppose  that $x$ and the  strategy $S^i$ are given as
+%% binary vectors.
+%% Table~\ref{TableExemple} shows the result of $x \oplus S^i$.
+
+%% \begin{table}
+%% \begin{scriptsize}
+%% $$
+%% \begin{array}{|cc|cccccccccccccccc|}
+%% \hline
+%% x      &=&1&0&1&1&1&0&1&0&1&0&0&1&0&0&1&0\\
+%% \hline
+%% S^i      &=&0&1&1&0&0&1&1&0&1&1&1&0&0&1&1&1\\
+%% \hline
+%% x \oplus S^i&=&1&1&0&1&1&1&0&0&0&1&1&1&0&1&0&1\\
+%% \hline
+
+%% \hline
+%%  \end{array}
+%% $$
+%% \end{scriptsize}
+%% \caption{Example of an arbitrary round of the proposed generator}
+%% \label{TableExemple}
+%% \end{table}
+
+
+
+
+\lstset{language=C,caption={C code of the sequential PRNG based on chaotic iterations},label={algo:seqCIPRNG}}
+\begin{small}
+\begin{lstlisting}
+
+unsigned int CIPRNG() {
+  static unsigned int x = 123123123;
+  unsigned long t1 = xorshift();
+  unsigned long t2 = xor128();
+  unsigned long t3 = xorwow();
+  x = x^(unsigned int)t1;
+  x = x^(unsigned int)(t2>>32);
+  x = x^(unsigned int)(t3>>32);
+  x = x^(unsigned int)t2;
+  x = x^(unsigned int)(t1>>32);
+  x = x^(unsigned int)t3;
+  return x;
+}
+\end{lstlisting}
+\end{small}
  
  
  
-\subsubsection{The semiconjugacy}
+In Listing~\ref{algo:seqCIPRNG} a sequential  version of the proposed PRNG based
+on  chaotic  iterations  is  presented.   The xor  operator  is  represented  by
+\textasciicircum.  This function uses  three classical 64-bits PRNGs, namely the
+\texttt{xorshift},         the          \texttt{xor128},         and         the
+\texttt{xorwow}~\cite{Marsaglia2003}.  In the following, we call them ``xor-like
+PRNGs''.   As each  xor-like PRNG  uses 64-bits  whereas our  proposed generator
+works with 32-bits, we use the command \texttt{(unsigned int)}, that selects the
+32 least  significant bits  of a given  integer, and the  code \texttt{(unsigned
+  int)(t$>>$32)} in order to obtain the 32 most significant bits of \texttt{t}.
  
-It is now possible to define a topological semiconjugacy between $\mathcal{X}$ and an interval of $\mathds{R}$:
+Thus producing a pseudorandom number needs 6 xor operations with 6 32-bits numbers
+that  are provided by  3 64-bits  PRNGs.  This  version successfully  passes the
+stringent BigCrush battery of tests~\cite{LEcuyerS07}. 
+At this point, we thus
+have defined an efficient and statistically unbiased generator. Its speed is
+directly related to the use of linear operations, but for the same reason,
+this fast generator cannot be proven as secure.
  
-\begin{theorem}
-Chaotic iterations on the phase space $\mathcal{X}$ are simple iterations on $\mathds{R}$, which is illustrated by the semiconjugacy of the diagram bellow:
-\begin{equation*}
-\begin{CD}
-\left(~\mathcal{S}_{10} \times\mathds{B}^{10}, d~\right) @>G_{f_0}>> \left(~\mathcal{S}_{10} \times\mathds{B}^{10}, d~\right)\\
-    @V{\varphi}VV                    @VV{\varphi}V\\
-\left( ~\big[ 0, 2^{10} \big[, D~\right)  @>>g> \left(~\big[ 0, 2^{10} \big[, D~\right)
-\end{CD}
-\end{equation*}
-\end{theorem}
  
-\begin{proof}
-$\varphi$ has been constructed in order to be continuous and onto.
-\end{proof}
  
-In other words, $\mathcal{X}$ is approximately equal to $\big[ 0, 2^\mathsf{N} \big[$.
+\subsection{Efficient PRNGs based on Chaotic Iterations on GPU}
+\label{sec:efficient PRNG gpu}
  
+In order to  take benefits from the computing power  of GPU, a program
+needs  to have  independent blocks  of  threads that  can be  computed
+simultaneously. In general,  the larger the number of  threads is, the
+more local  memory is  used, and the  less branching  instructions are
+used  (if,  while,  ...),  the  better the  performances  on  GPU  is.
+Obviously, having these requirements in  mind, it is possible to build
+a   program    similar   to    the   one   presented    in  Listing 
+\ref{algo:seqCIPRNG}, which computes  pseudorandom numbers on GPU.  To
+do  so,  we  must   firstly  recall  that  in  the  CUDA~\cite{Nvid10}
+environment,    threads    have     a    local    identifier    called
+\texttt{ThreadIdx},  which   is  relative  to   the  block  containing
+them. Furthermore, in  CUDA, parts of  the code that are executed by the  GPU, are
+called {\it kernels}.
  
  
+\subsection{Naive Version for GPU}
  
+ 
+It is possible to deduce from the CPU version a quite similar version adapted to GPU.
+The simple principle consists in making each thread of the GPU computing the CPU version of our PRNG.  
+Of course,  the  three xor-like
+PRNGs  used in these computations must have different  parameters. 
+In a given thread, these parameters are
+randomly picked from another PRNGs. 
+The  initialization stage is performed by  the CPU.
+To do it, the  ISAAC  PRNG~\cite{Jenkins96} is used to  set  all  the
+parameters embedded into each thread.   
+
+The implementation of  the three
+xor-like  PRNGs  is  straightforward  when  their  parameters  have  been
+allocated in  the GPU memory.  Each xor-like  works with  an internal
+number  $x$  that saves  the  last  generated  pseudorandom number. Additionally,  the
+implementation of the  xor128, the xorshift, and the  xorwow respectively require
+4, 5, and 6 unsigned long as internal variables.
+
+
+\begin{algorithm}
+\begin{small}
+\KwIn{InternalVarXorLikeArray: array with internal variables of the 3 xor-like
+PRNGs in global memory\;
+NumThreads: number of threads\;}
+\KwOut{NewNb: array containing random numbers in global memory}
+\If{threadIdx is concerned by the computation} {
+  retrieve data from InternalVarXorLikeArray[threadIdx] in local variables\;
+  \For{i=1 to n} {
+    compute a new PRNG as in Listing\ref{algo:seqCIPRNG}\;
+    store the new PRNG in NewNb[NumThreads*threadIdx+i]\;
+  }
+  store internal variables in InternalVarXorLikeArray[threadIdx]\;
+}
+\end{small}
+\caption{Main kernel of the GPU ``naive'' version of the PRNG based on chaotic iterations}
+\label{algo:gpu_kernel}
+\end{algorithm}
+
+
+
+Algorithm~\ref{algo:gpu_kernel}  presents a naive  implementation of the proposed  PRNG on
+GPU.  Due to the available  memory in the  GPU and the number  of threads
+used simultaneously,  the number  of random numbers  that a thread  can generate
+inside   a    kernel   is   limited  (\emph{i.e.},    the    variable   \texttt{n}   in
+algorithm~\ref{algo:gpu_kernel}). For instance, if  $100,000$ threads are used and
+if $n=100$\footnote{in fact, we need to add the initial seed (a 32-bits number)},
+then   the  memory   required   to  store all of the  internals   variables  of both the  xor-like
+PRNGs\footnote{we multiply this number by $2$ in order to count 32-bits numbers}
+and  the pseudorandom  numbers generated by  our  PRNG,  is  equal to  $100,000\times  ((4+5+6)\times
+2+(1+100))=1,310,000$ 32-bits numbers, that is, approximately $52$Mb.
+\begin{color}{red}
+Remark that the only requirement regarding the seed regarding the security of our PRNG is
+that it must be randomly picked. Indeed, the asymptotic security of BBS guarantees
+that, as the seed length increases, no polynomial time statistical test can 
+distinguish the pseudorandom sequences from truly random sequences with non-negligible probability,
+see, \emph{e.g.},~\cite{Sidorenko:2005:CSB:2179218.2179250}.
+\end{color}
+
+This generator is able to pass the whole BigCrush battery of tests, for all
+the versions that have been tested depending on their number of threads 
+(called \texttt{NumThreads} in our algorithm, tested up to $5$ million).
  
+\begin{remark}
+The proposed algorithm has  the  advantage of  manipulating  independent
+PRNGs, so this version is easily adaptable on a cluster of computers too. The only thing
+to ensure is to use a single ISAAC PRNG. To achieve this requirement, a simple solution consists in
+using a master node for the initialization. This master node computes the initial parameters
+for all the different nodes involved in the computation.
+\end{remark}
  
-\subsection{Study of the chaotic iterations described as a real function}
+\subsection{Improved Version for GPU}
+
+As GPU cards using CUDA have shared memory between threads of the same block, it
+is possible  to use this  feature in order  to simplify the  previous algorithm,
+i.e., to use less  than 3 xor-like PRNGs. The solution  consists in computing only
+one xor-like PRNG by thread, saving  it into the shared memory, and then to use the results
+of some  other threads in the  same block of  threads. In order to  define which
+thread uses the result of which other  one, we can use a combination array that
+contains  the indexes  of  all threads  and  for which  a combination has  been
+performed. 
+
+In  Algorithm~\ref{algo:gpu_kernel2},  two  combination  arrays are  used.   The
+variable     \texttt{offset}    is     computed    using     the     value    of
+\texttt{combination\_size}.   Then we  can compute  \texttt{o1}  and \texttt{o2}
+representing the  indexes of  the other  threads whose results  are used  by the
+current one.   In this algorithm, we  consider that a 32-bits  xor-like PRNG has
+been chosen. In practice, we  use the xor128 proposed in~\cite{Marsaglia2003} in
+which  unsigned longs  (64 bits)  have been  replaced by  unsigned  integers (32
+bits).
+
+This version  can also pass the whole {\it BigCrush} battery of tests.
+
+\begin{algorithm}
+\begin{small}
+\KwIn{InternalVarXorLikeArray: array with internal variables of 1 xor-like PRNGs
+in global memory\;
+NumThreads: Number of threads\;
+array\_comb1, array\_comb2: Arrays containing combinations of size combination\_size\;}
+
+\KwOut{NewNb: array containing random numbers in global memory}
+\If{threadIdx is concerned} {
+  retrieve data from InternalVarXorLikeArray[threadIdx] in local variables including shared memory and x\;
+  offset = threadIdx\%combination\_size\;
+  o1 = threadIdx-offset+array\_comb1[offset]\;
+  o2 = threadIdx-offset+array\_comb2[offset]\;
+  \For{i=1 to n} {
+    t=xor-like()\;
+    t=t\textasciicircum shmem[o1]\textasciicircum shmem[o2]\;
+    shared\_mem[threadIdx]=t\;
+    x = x\textasciicircum t\;
+
+    store the new PRNG in NewNb[NumThreads*threadIdx+i]\;
+  }
+  store internal variables in InternalVarXorLikeArray[threadIdx]\;
+}
+\end{small}
+\caption{Main kernel for the chaotic iterations based PRNG GPU efficient
+version\label{IR}}
+\label{algo:gpu_kernel2} 
+\end{algorithm}
+
+\subsection{Chaos Evaluation of the Improved Version}
+
+A run of Algorithm~\ref{algo:gpu_kernel2} consists in an operation ($x=x\oplus t$) having 
+the form of Equation~\ref{equation Oplus}, which is equivalent to the iterative
+system of Eq.~\ref{eq:generalIC}. That is, an iteration of the general chaotic
+iterations is realized between the last stored value $x$ of the thread and a strategy $t$
+(obtained by a bitwise exclusive or between a value provided by a xor-like() call
+and two values previously obtained by two other threads).
+To be certain that we are in the framework of Theorem~\ref{t:chaos des general},
+we must guarantee that this dynamical system iterates on the space 
+$\mathcal{X} = \mathcal{P}\left(\llbracket 1, \mathsf{N} \rrbracket\right)^\mathds{N}\times\mathds{B}^\mathsf{N}$.
+The left term $x$ obviously belongs to $\mathds{B}^ \mathsf{N}$.
+To prevent from any flaws of chaotic properties, we must check that the right 
+term (the last $t$), corresponding to the strategies,  can possibly be equal to any
+integer of $\llbracket 1, \mathsf{N} \rrbracket$. 
+
+Such a result is obvious, as for the xor-like(), all the
+integers belonging into its interval of definition can occur at each iteration, and thus the 
+last $t$ respects the requirement. Furthermore, it is possible to
+prove by an immediate mathematical induction that, as the initial $x$
+is uniformly distributed (it is provided by a cryptographically secure PRNG),
+the two other stored values shmem[o1] and shmem[o2] are uniformly distributed too,
+(this is the induction hypothesis), and thus the next $x$ is finally uniformly distributed.
+
+Thus Algorithm~\ref{algo:gpu_kernel2} is a concrete realization of the general
+chaotic iterations presented previously, and for this reason, it satisfies the 
+Devaney's formulation of a chaotic behavior.
  
+\section{Experiments}
+\label{sec:experiments}
+
+Different experiments  have been  performed in order  to measure  the generation
+speed. We have used a first computer equipped with a Tesla C1060 NVidia  GPU card
+and an
+Intel  Xeon E5530 cadenced  at 2.40  GHz,  and 
+a second computer  equipped with a smaller  CPU and  a GeForce GTX  280. 
+All the
+cards have 240 cores.
+
+In  Figure~\ref{fig:time_xorlike_gpu} we  compare the  quantity of  pseudorandom numbers
+generated per second with various xor-like based PRNGs. In this figure, the optimized
+versions use the {\it xor64} described in~\cite{Marsaglia2003}, whereas the naive versions
+embed  the three  xor-like  PRNGs described  in Listing~\ref{algo:seqCIPRNG}.   In
+order to obtain the optimal performances, the storage of pseudorandom numbers
+into the GPU memory has been removed. This step is time consuming and slows down the numbers
+generation.  Moreover this   storage  is  completely
+useless, in case of applications that consume the pseudorandom
+numbers  directly   after generation. We can see  that when the number of  threads is greater
+than approximately 30,000 and lower than 5 million, the number of pseudorandom numbers generated
+per second  is almost constant.  With the  naive version, this value ranges from 2.5 to
+3GSamples/s.   With  the  optimized   version,  it  is  approximately  equal to
+20GSamples/s. Finally  we can remark  that both GPU  cards are quite  similar, but in
+practice,  the Tesla C1060  has more  memory than  the GTX  280, and  this memory
+should be of better quality.
+As a  comparison,   Listing~\ref{algo:seqCIPRNG}  leads   to the  generation of  about
+138MSample/s when using one core of the Xeon E5530.
  
-\begin{figure}[t]
+\begin{figure}[htbp]
  \begin{center}
-  \subfigure[ICs on the interval $(0,9;1)$.]{\includegraphics[scale=.35]{ICs09a1.pdf}}\quad
-  \subfigure[ICs on the interval $(0,7;1)$.]{\includegraphics[scale=.35]{ICs07a95.pdf}}\\
-  \subfigure[ICs on the interval $(0,5;1)$.]{\includegraphics[scale=.35]{ICs05a1.pdf}}\quad
-  \subfigure[ICs on the interval $(0;1)$]{\includegraphics[scale=.35]{ICs0a1.pdf}}
+  \includegraphics[scale=0.7]{curve_time_xorlike_gpu.pdf}
  \end{center}
-\caption{Representation of the chaotic iterations.}
-\label{fig:ICs}
+\caption{Quantity of pseudorandom numbers generated per second with the xorlike-based PRNG}
+\label{fig:time_xorlike_gpu}
  \end{figure}
  
  
  
  
-\begin{figure}[t]
-\begin{center}
-  \subfigure[ICs on the interval $(510;514)$.]{\includegraphics[scale=.35]{ICs510a514.pdf}}\quad
-  \subfigure[ICs on the interval $(1000;1008)$]{\includegraphics[scale=.35]{ICs1000a1008.pdf}}
-\end{center}
-\caption{ICs on small intervals.}
-\label{fig:ICs2}
-\end{figure}
  
-\begin{figure}[t]
+In Figure~\ref{fig:time_bbs_gpu} we highlight  the performances of the optimized
+BBS-based PRNG on GPU.  On  the Tesla C1060 we obtain approximately 700MSample/s
+and  on the  GTX 280  about  670MSample/s, which  is obviously  slower than  the
+xorlike-based PRNG on GPU. However, we  will show in the next sections that this
+new PRNG  has a strong  level of  security, which is  necessarily paid by  a speed
+reduction.
+
+\begin{figure}[htbp]
  \begin{center}
-  \subfigure[ICs on the interval $(0;16)$.]{\includegraphics[scale=.3]{ICs0a16.pdf}}\quad
-  \subfigure[ICs on the interval  $(40;70)$.]{\includegraphics[scale=.45]{ICs40a70.pdf}}\quad
+  \includegraphics[scale=0.7]{curve_time_bbs_gpu.pdf}
  \end{center}
-\caption{General aspect of the chaotic iterations.}
-\label{fig:ICs3}
+\caption{Quantity of pseudorandom numbers generated per second using the BBS-based PRNG}
+\label{fig:time_bbs_gpu}
  \end{figure}
  
+All  these  experiments allow  us  to conclude  that  it  is possible  to
+generate a very large quantity of pseudorandom  numbers statistically perfect with the  xor-like version.
+To a certain extend, it is also the case with the secure BBS-based version, the speed deflation being
+explained by the fact that the former  version has ``only''
+chaotic properties and statistical perfection, whereas the latter is also cryptographically secure,
+as it is shown in the next sections.
  
-We have written a Python program to represent the chaotic iterations with the vectorial negation on the real line $\mathds{R}$. Various representations of these CIs are given in Figures \ref{fig:ICs}, \ref{fig:ICs2} and \ref{fig:ICs3}. It can be remarked that the function $g$ is a piecewise linear function: it is linear on each interval having the form $\left[ \dfrac{n}{10}, \dfrac{n+1}{10}\right[$, $n \in \llbracket 0;2^{10}\times 10 \rrbracket$ and its slope is equal to 10. Let us justify these claims:
  
-\begin{proposition}
-\label{Prop:derivabilite des ICs}
-Chaotic iterations $g$ defined on $\mathds{R}$ have derivatives of all orders on $\big[ 0, 2^{10} \big[$, except on the 10241 points in $I$ defined by $\left\{ \dfrac{n}{10} ~\big/~ n \in \llbracket 0;2^{10}\times 10\rrbracket \right\}$.
  
-Furthermore, on each interval of the form $\left[ \dfrac{n}{10}, \dfrac{n+1}{10}\right[$, with $n \in \llbracket 0;2^{10}\times 10 \rrbracket$, $g$ is a linear function, having a slope equal to 10: $\forall x \notin I, g'(x)=10$.
-\end{proposition}
  
  
-\begin{proof}
-Let $I_n = \left[ \dfrac{n}{10}, \dfrac{n+1}{10}\right[$, with $n \in \llbracket 0;2^{10}\times 10 \rrbracket$. All the points of $I_n$ have the same integral prat $e$ and the same decimal part $s^0$: on the set $I_n$,  functions $e(x)$ and $x \mapsto s(x)^0$ of Definition \ref{def:e et s} only depend on $n$. So all the images $g(x)$ of these points $x$:
-\begin{itemize}
-\item Have the same integral part, which is $e$, except probably the bit number $s^0$. In other words, this integer has approximately the same binary decomposition than $e$, the sole exception being the digit $s^0$ (this number is then either $e+2^{10-s^0}$ or $e-2^{10-s^0}$, depending on the parity of $s^0$, \emph{i.e.}, it is equal to $e+(-1)^{s^0}\times 2^{10-s^0}$).
-\item A shift to the left has been applied to the decimal part $y$, losing by doing so the common first digit $s^0$. In other words, $y$ has been mapped into $10\times y - s^0$.
-\end{itemize}
-To sum up, the action of $g$ on the points of $I$ is as follows: first, make a multiplication by 10, and second, add the same constant to each term, which is $\dfrac{1}{10}\left(e+(-1)^{s^0}\times 2^{10-s^0}\right)-s^0$.
-\end{proof}
  
-\begin{remark}
-Finally, chaotic iterations are elements of the large family of functions that are both chaotic and piecewise linear (like the tent map).
-\end{remark}
+
+\section{Security Analysis}
  
  
+This section is dedicated to the security analysis of the
+  proposed PRNGs, both from a theoretical and from a practical point of view.
  
-\subsection{Comparison of the two metrics on $\big[ 0, 2^\mathsf{N} \big[$}
+\subsection{Theoretical Proof of Security}
+\label{sec:security analysis}
  
-The two propositions bellow allow to compare our two distances on $\big[ 0, 2^\mathsf{N} \big[$:
+The standard definition
+  of {\it indistinguishability} used is the classical one as defined for
+  instance in~\cite[chapter~3]{Goldreich}. 
+  This property shows that predicting the future results of the PRNG
+  cannot be done in a reasonable time compared to the generation time. It is important to emphasize that this
+  is a relative notion between breaking time and the sizes of the
+  keys/seeds. Of course, if small keys or seeds are chosen, the system can
+  be broken in practice. But it also means that if the keys/seeds are large
+  enough, the system is secured.
+As a complement, an example of a concrete practical evaluation of security
+is outlined in the next subsection.
+
+In this section the concatenation of two strings $u$ and $v$ is classically
+denoted by $uv$.
+In a cryptographic context, a pseudorandom generator is a deterministic
+algorithm $G$ transforming strings  into strings and such that, for any
+seed $s$ of length $m$, $G(s)$ (the output of $G$ on the input $s$) has size
+$\ell_G(m)$ with $\ell_G(m)>m$.
+The notion of {\it secure} PRNGs can now be defined as follows. 
+
+\begin{definition}
+A cryptographic PRNG $G$ is secure if for any probabilistic polynomial time
+algorithm $D$, for any positive polynomial $p$, and for all sufficiently
+large $m$'s,
+$$| \mathrm{Pr}[D(G(U_m))=1]-Pr[D(U_{\ell_G(m)})=1]|< \frac{1}{p(m)},$$
+where $U_r$ is the uniform distribution over $\{0,1\}^r$ and the
+probabilities are taken over $U_m$, $U_{\ell_G(m)}$ as well as over the
+internal coin tosses of $D$. 
+\end{definition}
+
+Intuitively,  it means  that  there is  no  polynomial time  algorithm that  can
+distinguish a  perfect uniform random generator  from $G$ with  a non negligible
+probability.   An equivalent  formulation of  this well-known  security property
+means that  it is  possible \emph{in practice}  to predict  the next bit  of the
+generator, knowing all  the previously produced ones.  The  interested reader is
+referred to~\cite[chapter~3]{Goldreich}  for more  information. Note that  it is
+quite easily possible to change the function $\ell$ into any polynomial function
+$\ell^\prime$ satisfying $\ell^\prime(m)>m)$~\cite[Chapter 3.3]{Goldreich}.
+
+The generation schema developed in (\ref{equation Oplus}) is based on a
+pseudorandom generator. Let $H$ be a cryptographic PRNG. We may assume,
+without loss of generality, that for any string $S_0$ of size $N$, the size
+of $H(S_0)$ is $kN$, with $k>2$. It means that $\ell_H(N)=kN$. 
+Let $S_1,\ldots,S_k$ be the 
+strings of length $N$ such that $H(S_0)=S_1 \ldots S_k$ ($H(S_0)$ is the concatenation of
+the $S_i$'s). The cryptographic PRNG $X$ defined in (\ref{equation Oplus})
+is the algorithm mapping any string of length $2N$ $x_0S_0$ into the string
+$(x_0\oplus S_0 \oplus S_1)(x_0\oplus S_0 \oplus S_1\oplus S_2)\ldots
+(x_o\bigoplus_{i=0}^{i=k}S_i)$. One in particular has $\ell_{X}(2N)=kN=\ell_H(N)$. 
+We claim now that if this PRNG is secure,
+then the new one is secure too.
  
  \begin{proposition}
-Id: $\left(~\big[ 0, 2^\mathsf{N} \big[,\Delta~\right) \to \left(~\big[ 0, 2^\mathsf{N} \big[, D~\right)$ is not continuous. 
+\label{cryptopreuve}
+If $H$ is a secure cryptographic PRNG, then $X$ is a secure cryptographic
+PRNG too.
  \end{proposition}
  
  \begin{proof}
-The sequence $x^n = 1,999\hdots 999$ constituted by $n$ 9 as decimal part, is such that:
-\begin{itemize}
-\item $\Delta (x^n,2) \to 0.$
-\item But $D(x^n,2) \geqslant 1$, then $D(x^n,2)$ does not converge to 0.
-\end{itemize}
-
-The sequential characterization of the continuity concludes the demonstration.
+The proposition is proven by contraposition. Assume that $X$ is not
+secure. By Definition, there exists a polynomial time probabilistic
+algorithm $D$, a positive polynomial $p$, such that for all $k_0$ there exists
+$N\geq \frac{k_0}{2}$ satisfying 
+$$| \mathrm{Pr}[D(X(U_{2N}))=1]-\mathrm{Pr}[D(U_{kN}=1]|\geq \frac{1}{p(2N)}.$$
+We describe a new probabilistic algorithm $D^\prime$ on an input $w$ of size
+$kN$:
+\begin{enumerate}
+\item Decompose $w$ into $w=w_1\ldots w_{k}$, where each $w_i$ has size $N$.
+\item Pick a string $y$ of size $N$ uniformly at random.
+\item Compute $z=(y\oplus w_1)(y\oplus w_1\oplus w_2)\ldots (y
+  \bigoplus_{i=1}^{i=k} w_i).$
+\item Return $D(z)$.
+\end{enumerate}
+
+
+Consider  for each $y\in \mathbb{B}^{kN}$ the function $\varphi_{y}$
+from $\mathbb{B}^{kN}$ into $\mathbb{B}^{kN}$ mapping $w=w_1\ldots w_k$
+(each $w_i$ has length $N$) to 
+$(y\oplus w_1)(y\oplus w_1\oplus w_2)\ldots (y
+  \bigoplus_{i=1}^{i=k_1} w_i).$ By construction, one has for every $w$,
+\begin{equation}\label{PCH-1}
+D^\prime(w)=D(\varphi_y(w)),
+\end{equation}
+where $y$ is randomly generated. 
+Moreover, for each $y$, $\varphi_{y}$ is injective: if 
+$(y\oplus w_1)(y\oplus w_1\oplus w_2)\ldots (y\bigoplus_{i=1}^{i=k_1}
+w_i)=(y\oplus w_1^\prime)(y\oplus w_1^\prime\oplus w_2^\prime)\ldots
+(y\bigoplus_{i=1}^{i=k} w_i^\prime)$, then for every $1\leq j\leq k$,
+$y\bigoplus_{i=1}^{i=j} w_i^\prime=y\bigoplus_{i=1}^{i=j} w_i$. It follows,
+by a direct induction, that $w_i=w_i^\prime$. Furthermore, since $\mathbb{B}^{kN}$
+is finite, each $\varphi_y$ is bijective. Therefore, and using (\ref{PCH-1}),
+one has
+$\mathrm{Pr}[D^\prime(U_{kN})=1]=\mathrm{Pr}[D(\varphi_y(U_{kN}))=1]$ and,
+therefore, 
+\begin{equation}\label{PCH-2}
+\mathrm{Pr}[D^\prime(U_{kN})=1]=\mathrm{Pr}[D(U_{kN})=1].
+\end{equation}
+
+Now, using (\ref{PCH-1}) again, one has  for every $x$,
+\begin{equation}\label{PCH-3}
+D^\prime(H(x))=D(\varphi_y(H(x))),
+\end{equation}
+where $y$ is randomly generated. By construction, $\varphi_y(H(x))=X(yx)$,
+thus
+\begin{equation}%\label{PCH-3}      %%RAPH : j'ai viré ce label qui existe déjà, il est 3 ligne avant
+D^\prime(H(x))=D(yx),
+\end{equation}
+where $y$ is randomly generated. 
+It follows that 
+
+\begin{equation}\label{PCH-4}
+\mathrm{Pr}[D^\prime(H(U_{N}))=1]=\mathrm{Pr}[D(U_{2N})=1].
+\end{equation}
+ From (\ref{PCH-2}) and (\ref{PCH-4}), one can deduce that
+there exists a polynomial time probabilistic
+algorithm $D^\prime$, a positive polynomial $p$, such that for all $k_0$ there exists
+$N\geq \frac{k_0}{2}$ satisfying 
+$$| \mathrm{Pr}[D(H(U_{N}))=1]-\mathrm{Pr}[D(U_{kN}=1]|\geq \frac{1}{p(2N)},$$
+proving that $H$ is not secure, which is a contradiction. 
  \end{proof}
  
  
  
-A contrario:
+\subsection{Practical Security Evaluation}
+\label{sec:Practicak evaluation}
  
-\begin{proposition}
-Id: $\left(~\big[ 0, 2^\mathsf{N} \big[,D~\right) \to \left(~\big[ 0, 2^\mathsf{N} \big[, \Delta ~\right)$ is a continuous fonction. 
-\end{proposition}
+Pseudorandom generators based on Eq.~\eqref{equation Oplus} are thus cryptographically secure when
+they are XORed with an already cryptographically
+secure PRNG. But, as stated previously,
+such a property does not mean that, whatever the
+key size, no attacker can predict the next bit
+knowing all the previously released ones.
+However, given a key size, it is possible to 
+measure in practice the minimum duration needed
+for an attacker to break a cryptographically
+secure PRNG, if we know the power of his/her
+machines. Such a concrete security evaluation 
+is related to the $(T,\varepsilon)-$security
+notion, which is recalled and evaluated in what 
+follows, for the sake of completeness.
  
-\begin{proof}
-If $D(x^n,x) \to 0$, then $D_e(x^n,x) = 0$ at least for $n$ larger than a given threshold, because $D_e$ only returns integers. So, after this threshold, the integral parts of all the $x^n$ are equal to the integral part of $x$. 
+Let us firstly recall that,
+\begin{definition}
+Let $\mathcal{D} : \mathds{B}^M \longrightarrow \mathds{B}$ be a probabilistic algorithm that runs
+in time $T$. 
+Let $\varepsilon > 0$. 
+$\mathcal{D}$ is called a $(T,\varepsilon)-$distinguishing attack on pseudorandom
+generator $G$ if
+
+$$\left| Pr[\mathcal{D}(G(k)) = 1 \mid k \in_R \{0,1\}^\ell ]\right. - \left. Pr[\mathcal{D}(s) = 1 \mid s \in_R \mathds{B}^M ]\right| \geqslant \varepsilon,$$
+\noindent where the probability is taken over the internal coin flips of $\mathcal{D}$, and the notation
+``$\in_R$'' indicates the process of selecting an element at random and uniformly over the
+corresponding set.
+\end{definition}
  
-Additionally, $D_s(x^n, x) \to 0$, then $\forall k \in \mathds{N}^*, \exists N_k \in \mathds{N}, n \geqslant N_k \Rightarrow D_s(x^n,x) \leqslant 10^{-k}$. This means that for all $k$, an index $N_k$ can be found such that, $\forall n \geqslant N_k$, all the $x^n$ have the same $k$ firsts digits, which are the digits of $x$. We can deduce the convergence $\Delta(x^n,x) \to 0$, and thus the result.
-\end{proof}
+Let us recall that the running time of a probabilistic algorithm is defined to be the
+maximum of the expected number of steps needed to produce an output, maximized
+over all inputs; the expected number is averaged over all coin flips made by the algorithm~\cite{Knuth97}.
+We are now able to define the notion of cryptographically secure PRNGs:
  
-The conclusion of these propositions is that the proposed metric is more precise than the Euclidian distance, that is:
+\begin{definition}
+A pseudorandom generator is $(T,\varepsilon)-$secure if there exists no $(T,\varepsilon)-$distinguishing attack on this pseudorandom generator.
+\end{definition}
  
-\begin{corollary}
-$D$ is finer than the Euclidian distance $\Delta$.
-\end{corollary}
  
-This corollary can be reformulated as follows:
  
-\begin{itemize}
-\item The topology produced by $\Delta$ is a subset of the topology produced by $D$.
-\item $D$ has more open sets than $\Delta$.
-\item It is harder to converge for the topology $\tau_D$ inherited by $D$, than to converge with the one inherited by $\Delta$, which is denoted here by $\tau_\Delta$.
-\end{itemize}
  
  
-\subsection{Chaos of the chaotic iterations on $\mathds{R}$}
-\label{chpt:Chaos des itérations chaotiques sur R}
  
  
+Suppose now that the PRNG of Eq.~\eqref{equation Oplus} will work during 
+$M=100$ time units, and that during this period,
+an attacker can realize $10^{12}$ clock cycles.
+We thus wonder whether, during the PRNG's 
+lifetime, the attacker can distinguish this 
+sequence from a truly random one, with a probability
+greater than $\varepsilon = 0.2$.
+We consider that $N$ has 900 bits.
  
-\subsubsection{Chaos according to Devaney}
+Predicting the next generated bit knowing all the
+previously released ones by Eq.~\eqref{equation Oplus} is obviously equivalent to predicting the
+next bit in the BBS generator, which
+is cryptographically secure. More precisely, it
+is $(T,\varepsilon)-$secure: no 
+$(T,\varepsilon)-$distinguishing attack can be
+successfully realized on this PRNG, if~\cite{Fischlin}
+\begin{equation}
+T \leqslant \dfrac{L(N)}{6 N (log_2(N))\varepsilon^{-2}M^2}-2^7 N \varepsilon^{-2} M^2 log_2 (8 N \varepsilon^{-1}M)
+\label{mesureConcrete}
+\end{equation}
+where $M$ is the length of the output ($M=100$ in
+our example), and $L(N)$ is equal to
+$$
+2.8\times 10^{-3} exp \left(1.9229 \times (N ~ln~ 2)^\frac{1}{3} \times (ln(N~ln~  2))^\frac{2}{3}\right)
+$$
+is the number of clock cycles to factor a $N-$bit
+integer.
  
-We have recalled previously that the chaotic iterations $\left(\Go, \mathcal{X}_d\right)$ are chaotic according to the formulation of Devaney. We can deduce that they are chaotic on $\mathds{R}$ too, when considering the order topology, because:
-\begin{itemize}
-\item $\left(\Go, \mathcal{X}_d\right)$ and $\left(g, \big[ 0, 2^{10} \big[_D\right)$ are semiconjugate by $\varphi$,
-\item Then $\left(g, \big[ 0, 2^{10} \big[_D\right)$ is a system chaotic according to Devaney, because the semiconjugacy preserve this character.
-\item But the topology generated by $D$ is finer than the topology generated by the Euclidian distance $\Delta$ -- which is the order topology.
-\item According to Theorem \ref{Th:chaos et finesse}, we can deduce that the chaotic iterations $g$ are indeed chaotic, as defined by Devaney, for the order topology on $\mathds{R}$.
-\end{itemize}
  
-This result can be formulated as follows.
  
-\begin{theorem}
-\label{th:IC et topologie de l'ordre}
-The chaotic iterations $g$ on $\mathds{R}$ are chaotic according to the Devaney's formulation, when $\mathds{R}$ has his usual topology, which is the order topology.
-\end{theorem}
  
-Indeed this result is weaker than the theorem establishing the chaos for the finer topology $d$. However the Theorem \ref{th:IC et topologie de l'ordre} still remains important. Indeed, we have studied in our previous works a set different from the usual set of study ($\mathcal{X}$ instead of $\mathds{R}$), in order to be as close as possible from the computer: the properties of disorder proved theoretically will then be preserved when computing. However, we could wonder whether this change does not lead to a disorder of a lower quality. In other words, have we replaced a situation of a good disorder lost when computing, to another situation of a disorder preserved but of bad quality. Theorem \ref{th:IC et topologie de l'ordre} prove exactly the contrary.
- 
+A direct numerical application shows that this attacker 
+cannot achieve its $(10^{12},0.2)$ distinguishing
+attack in that context.
  
  
  
-\section{Efficient prng based on chaotic iterations}
+\section{Cryptographical Applications}
  
-On parle du séquentiel avec des nombres 64 bits\\
+\subsection{A Cryptographically Secure PRNG for GPU}
+\label{sec:CSGPU}
  
-Faire le lien avec le paragraphe précédent (je considère que la stratégie s'appelle $S^i$\\
+It is  possible to build a  cryptographically secure PRNG based  on the previous
+algorithm (Algorithm~\ref{algo:gpu_kernel2}).   Due to Proposition~\ref{cryptopreuve},
+it simply consists  in replacing
+the  {\it  xor-like} PRNG  by  a  cryptographically  secure one.  
+We have chosen the Blum Blum Shub generator~\cite{BBS} (usually denoted by BBS) having the form:
+$$x_{n+1}=x_n^2~ mod~ M$$  where $M$ is the product of  two prime numbers (these
+prime numbers  need to be congruent  to 3 modulus  4). BBS is known to be
+very slow and only usable for cryptographic applications. 
  
-In  order to  implement efficiently  a PRNG  based on  chaotic iterations  it is
-possible to improve  previous works [ref]. One solution  consists in considering
-that the  strategy used $S^i$  contains all the  bits for which the  negation is
-achieved out. Then instead of applying  the negation on these bits we can simply
-apply the  xor operator between  the current number  and the strategy  $S^i$. In
-order to obtain the strategy we also use a classical PRNG.
+  
+The modulus operation is the most time consuming operation for current
+GPU cards.  So in order to obtain quite reasonable performances, it is
+required to use only modulus  on 32-bits integer numbers. Consequently
+$x_n^2$ need  to be lesser than $2^{32}$,  and thus the number $M$ must be
+lesser than $2^{16}$.  So in practice we can choose prime numbers around
+256 that are congruent to 3 modulus 4.  With 32-bits numbers, only the
+4 least significant bits of $x_n$ can be chosen (the maximum number of
+indistinguishable    bits    is    lesser    than   or    equals    to
+$log_2(log_2(M))$). In other words, to generate a  32-bits number, we need to use
+8 times  the BBS  algorithm with possibly different  combinations of  $M$. This
+approach is  not sufficient to be able to pass  all the tests of TestU01,
+as small values of  $M$ for the BBS  lead to
+  small periods. So, in  order to add randomness  we have proceeded with
+the followings  modifications. 
+\begin{itemize}
+\item
+Firstly, we  define 16 arrangement arrays  instead of 2  (as described in
+Algorithm \ref{algo:gpu_kernel2}), but only 2 of them are used at each call of
+the  PRNG kernels. In  practice, the  selection of   combination
+arrays to be used is different for all the threads. It is determined
+by using  the three last bits  of two internal variables  used by BBS.
+%This approach  adds more randomness.   
+In Algorithm~\ref{algo:bbs_gpu},
+character  \& is for the  bitwise AND. Thus using  \&7 with  a number
+gives the last 3 bits, thus providing a number between 0 and 7.
+\item
+Secondly, after the  generation of the 8 BBS numbers  for each thread, we
+have a 32-bits number whose period is possibly quite small. So
+to add randomness,  we generate 4 more BBS numbers   to
+shift  the 32-bits  numbers, and  add up to  6 new  bits.  This  improvement is
+described  in Algorithm~\ref{algo:bbs_gpu}.  In  practice, the last 2 bits
+of the first new BBS number are  used to make a left shift of at most
+3 bits. The  last 3 bits of the  second new BBS number are  added to the
+strategy whatever the value of the first left shift. The third and the
+fourth new BBS  numbers are used similarly to apply  a new left shift
+and add 3 new bits.
+\item
+Finally, as  we use 8 BBS numbers  for each thread, the  storage of these
+numbers at the end of the  kernel is performed using a rotation. So,
+internal  variable for  BBS number  1 is  stored in  place  2, internal
+variable  for BBS  number 2  is  stored in  place 3,  ..., and finally, internal
+variable for BBS number 8 is stored in place 1.
+\end{itemize}
  
-\begin{figure}[htbp]
-\begin{center}
-\fbox{
-\begin{minipage}{14cm}
-unsigned int CIprng() \{\\
-  static unsigned int x = 123123123;\\
-  unsigned long t1 = xorshift();\\
-  unsigned long t2 = xor128();\\
-  unsigned long t3 = xorwow();\\
-  x = x\textasciicircum (unsigned int)t1;\\
-  x = x\textasciicircum (unsigned int)(t2$>>$32);\\
-  x = x\textasciicircum (unsigned int)(t3$>>$32);\\
-  x = x\textasciicircum (unsigned int)t2;\\
-  x = x\textasciicircum (unsigned int)(t1$>>$32);\\
-  x = x\textasciicircum (unsigned int)t3;\\
-  return x;\\
-\}
-\end{minipage}
+\begin{algorithm}
+\begin{small}
+\KwIn{InternalVarBBSArray: array with internal variables of the 8 BBS
+in global memory\;
+NumThreads: Number of threads\;
+array\_comb: 2D Arrays containing 16 combinations (in first dimension)  of size combination\_size (in second dimension)\;
+array\_shift[4]=\{0,1,3,7\}\;
  }
-\end{center}
-\caption{sequential Chaotic Iteration PRNG}
-\label{algo:seqCIprng}
-\end{figure}
  
-In Figure~\ref{algo:seqCIprng}  a sequential  version of our  chaotic iterations
-based PRNG  is presented.  This version  uses three classical 64  bits PRNG: the
-\texttt{xorshift},  the \texttt{xor128}  and the  \texttt{xorwow}.   These three
-PRNGs  are presented  in~\cite{Marsaglia2003}.   As each  PRNG  used works  with
-64-bits and as  our PRNG works with 32 bits, the  use of \texttt{(unsigned int)}
-selects the 32 least  significant bits whereas \texttt{(unsigned int)(t3$>>$32)}
-selects the 32  most significants bits of the  variable \texttt{t}. This version
-sucesses   the   BigCrush   of    the   TestU01   battery   [P.   L’ecuyer   and
-  R. Simard. Testu01].
+\KwOut{NewNb: array containing random numbers in global memory}
+\If{threadIdx is concerned} {
+  retrieve data from InternalVarBBSArray[threadIdx] in local variables including shared memory and x\;
+  we consider that bbs1 ... bbs8 represent the internal states of the 8 BBS numbers\;
+  offset = threadIdx\%combination\_size\;
+  o1 = threadIdx-offset+array\_comb[bbs1\&7][offset]\;
+  o2 = threadIdx-offset+array\_comb[8+bbs2\&7][offset]\;
+  \For{i=1 to n} {
+    t$<<$=4\;
+    t|=BBS1(bbs1)\&15\;
+    ...\;
+    t$<<$=4\;
+    t|=BBS8(bbs8)\&15\;
+    \tcp{two new shifts}
+    shift=BBS3(bbs3)\&3\;
+    t$<<$=shift\;
+    t|=BBS1(bbs1)\&array\_shift[shift]\;
+    shift=BBS7(bbs7)\&3\;
+    t$<<$=shift\;
+    t|=BBS2(bbs2)\&array\_shift[shift]\;
+    t=t\textasciicircum  shmem[o1]\textasciicircum     shmem[o2]\;
+    shared\_mem[threadIdx]=t\;
+    x = x\textasciicircum   t\;
+
+    store the new PRNG in NewNb[NumThreads*threadIdx+i]\;
+  }
+  store internal variables in InternalVarXorLikeArray[threadIdx] using a rotation\;
+}
+\end{small}
+\caption{main kernel for the BBS based PRNG GPU}
+\label{algo:bbs_gpu}
+\end{algorithm}
+
+In Algorithm~\ref{algo:bbs_gpu}, $n$ is for  the quantity of random numbers that
+a thread has to  generate.  The operation t<<=4 performs a left  shift of 4 bits
+on the variable  $t$ and stores the result in  $t$, and $BBS1(bbs1)\&15$ selects
+the last  four bits  of the  result of $BBS1$.   Thus an  operation of  the form
+$t<<=4; t|=BBS1(bbs1)\&15\;$  realizes in $t$ a  left shift of 4  bits, and then
+puts the 4 last bits of $BBS1(bbs1)$  in the four last positions of $t$.  Let us
+remark that the initialization $t$ is not a  necessity as we fill it 4 bits by 4
+bits, until  having obtained 32-bits.  The  two last new shifts  are realized in
+order to enlarge the small periods of  the BBS used here, to introduce a kind of
+variability.  In these operations, we make twice a left shift of $t$ of \emph{at
+  most}  3 bits,  represented by  \texttt{shift} in  the algorithm,  and  we put
+\emph{exactly} the \texttt{shift}  last bits from a BBS  into the \texttt{shift}
+last bits of $t$. For this, an array named \texttt{array\_shift}, containing the
+correspondence between the  shift and the number obtained  with \texttt{shift} 1
+to make the \texttt{and} operation is used. For example, with a left shift of 0,
+we  make an  and operation  with 0,  with  a left  shift of  3, we  make an  and
+operation with 7 (represented by 111 in binary mode).
+
+It should  be noticed that this generator has once more the form $x^{n+1} = x^n \oplus S^n$,
+where $S^n$ is referred in this algorithm as $t$: each iteration of this
+PRNG ends with $x = x \wedge t$. This $S^n$ is only constituted
+by secure bits produced by the BBS generator, and thus, due to
+Proposition~\ref{cryptopreuve}, the resulted PRNG is 
+cryptographically secure.
+
+As stated before, even if the proposed PRNG is cryptocaphically
+secure, it does not mean that such a generator
+can be used as described here when attacks are
+awaited. The problem is to determine the minimum 
+time required for an attacker, with a given 
+computational power, to predict under a probability
+lower than 0.5 the $n+1$th bit, knowing the $n$
+previous ones. The proposed GPU generator will be
+useful in a security context, at least in some 
+situations where a secret protected by a pseudorandom
+keystream is rapidly obsolete, if this time to 
+predict the next bit is large enough when compared
+to both the generation and transmission times.
+It is true that the prime numbers used in the last
+section are very small compared to up-to-date 
+security recommendations. However the attacker has not
+access to each BBS, but to the output produced 
+by Algorithm~\ref{algo:bbs_gpu}, which is far
+more complicated than a simple BBS. Indeed, to
+determine if this cryptographically secure PRNG
+on GPU can be useful in security context with the 
+proposed parameters, or if it is only a very fast
+and statistically perfect generator on GPU, its
+$(T,\varepsilon)-$security must be determined, and
+a formulation similar to Eq.\eqref{mesureConcrete}
+must be established. Authors
+hope to achieve this difficult task in a future
+work.
+
+
+\subsection{Toward a Cryptographically Secure and Chaotic Asymmetric Cryptosystem}
+\label{Blum-Goldwasser}
+We finish this research work by giving some thoughts about the use of
+the proposed PRNG in an asymmetric cryptosystem.
+This first approach will be further investigated in a future work.
+
+\subsubsection{Recalls of the Blum-Goldwasser Probabilistic Cryptosystem}
+
+The Blum-Goldwasser cryptosystem is a cryptographically secure asymmetric key encryption algorithm 
+proposed in 1984~\cite{Blum:1985:EPP:19478.19501}.  The encryption algorithm 
+implements a XOR-based stream cipher using the BBS PRNG, in order to generate 
+the keystream. Decryption is done by obtaining the initial seed thanks to
+the final state of the BBS generator and the secret key, thus leading to the
+ reconstruction of the keystream.
+
+The key generation consists in generating two prime numbers $(p,q)$, 
+randomly and independently of each other, that are
+ congruent to 3 mod 4, and to compute the modulus $N=pq$.
+The public key is $N$, whereas the secret key is the factorization $(p,q)$.
+
+
+Suppose Bob wishes to send a string $m=(m_0, \dots, m_{L-1})$ of $L$ bits to Alice:
+\begin{enumerate}
+\item Bob picks an integer $r$ randomly in the interval $\llbracket 1,N\rrbracket$ and computes $x_0 = r^2~mod~N$.
+\item He uses the BBS to generate the keystream of $L$ pseudorandom bits $(b_0, \dots, b_{L-1})$, as follows. For $i=0$ to $L-1$,
+\begin{itemize}
+\item $i=0$.
+\item While $i \leqslant L-1$:
+\begin{itemize}
+\item Set $b_i$ equal to the least-significant\footnote{As signaled previously, BBS can securely output up to $\mathsf{N} = \lfloor log(log(N)) \rfloor$ of the least-significant bits of $x_i$ during each round.} bit of $x_i$,
+\item $i=i+1$,
+\item $x_i = (x_{i-1})^2~mod~N.$
+\end{itemize}
+\end{itemize}
+\item The ciphertext is computed by XORing the plaintext bits $m$ with the keystream: $ c = (c_0, \dots, c_{L-1}) = m \oplus  b$. This ciphertext is $[c, y]$, where $y=x_{0}^{2^{L}}~mod~N.$
+\end{enumerate}
  
-\section{Efficient prng based on chaotic iterations on GPU}
  
-On parle du passage du sequentiel au GPU
+When Alice receives $\left[(c_0, \dots, c_{L-1}), y\right]$, she can recover $m$ as follows:
+\begin{enumerate}
+\item Using the secret key $(p,q)$, she computes $r_p = y^{((p+1)/4)^{L}}~mod~p$ and $r_q = y^{((q+1)/4)^{L}}~mod~q$.
+\item The initial seed can be obtained using the following procedure: $x_0=q(q^{-1}~{mod}~p)r_p + p(p^{-1}~{mod}~q)r_q~{mod}~N$.
+\item She recomputes the bit-vector $b$ by using BBS and $x_0$.
+\item Alice finally computes the plaintext by XORing the keystream with the ciphertext: $ m = c \oplus  b$.
+\end{enumerate}
  
-\section{Experiments}
  
-On passe le BigCrush\\
-On donne des temps de générations sur GPU/CPU\\
-On donne des temps de générations de nombre sur GPU puis on rappatrie sur CPU / CPU ? bof bof, on verra
+\subsubsection{Proposal of a new Asymmetric Cryptosystem Adapted from Blum-Goldwasser}
+
+We propose to adapt the Blum-Goldwasser protocol as follows. 
+Let $\mathsf{N} = \lfloor log(log(N)) \rfloor$ be the number of bits that can
+be obtained securely with the BBS generator using the public key $N$ of Alice.
+Alice will pick randomly $S^0$ in $\llbracket 0, 2^{\mathsf{N}-1}\rrbracket$ too, and
+her new public key will be $(S^0, N)$.
  
+To encrypt his message, Bob will compute
+%%RAPH : ici, j'ai mis un simple $
+\begin{equation*}
+c = \left(m_0 \oplus (b_0 \oplus S^0), m_1 \oplus (b_0 \oplus b_1 \oplus S^0), \hdots, \right.
+ \left. m_{L-1} \oplus (b_0 \oplus b_1 \hdots \oplus b_{L-1} \oplus S^0) \right)
+\end{equation*}
+instead of $$\left(m_0 \oplus b_0, m_1 \oplus b_1, \hdots, m_{L-1} \oplus b_{L-1} \right).$$ 
+
+The same decryption stage as in Blum-Goldwasser leads to the sequence 
+$$\left(m_0 \oplus S^0, m_1 \oplus S^0, \hdots, m_{L-1} \oplus S^0 \right).$$
+Thus, with a simple use of $S^0$, Alice can obtain the plaintext.
+By doing so, the proposed generator is used in place of BBS, leading to
+the inheritance of all the properties presented in this paper.
  
  \section{Conclusion}
-\bibliographystyle{plain}
+
+
+In  this  paper, a formerly proposed PRNG based on chaotic iterations
+has been generalized to improve its speed. It has been proven to be
+chaotic according to Devaney.
+Efficient implementations on  GPU using xor-like  PRNGs as input generators
+have shown that a very large quantity of pseudorandom numbers can be generated per second (about
+20Gsamples/s), and that these proposed PRNGs succeed to pass the hardest battery in TestU01,
+namely the BigCrush.
+Furthermore, we have shown that when the inputted generator is cryptographically
+secure, then it is the case too for the PRNG we propose, thus leading to
+the possibility to develop fast and secure PRNGs using the GPU architecture.
+An improvement of the Blum-Goldwasser cryptosystem, making it 
+behave chaotically, has finally been proposed. 
+
+In future  work we plan to extend this research, building a parallel PRNG for  clusters or
+grid computing. Topological properties of the various proposed generators will be investigated,
+and the use of other categories of PRNGs as input will be studied too. The improvement
+of Blum-Goldwasser will be deepened.
+\begin{color}{red}
+Another aspect to consider might be different accelerator-based systems like 
+Intel Xeon Phi cards and speed measurements using such cards: as heterogeneity of
+supercomputers tends to increase using other accelerators than GPGPUs,
+a Xeon Phi solution might be interesting to investigate.
+\end{color}
+ Finally, we
+will try to enlarge the quantity of pseudorandom numbers generated per second either
+in a simulation context or in a cryptographic one.
+
+
+
+\bibliographystyle{plain} 
  \bibliography{mabase}
  \end{document}