\newcommand{\alert}[1]{\begin{color}{blue}\textit{#1}\end{color}}
-\title{Efficient Generation of Pseudo-Random Bumbers based on Chaotic Iterations
+\title{Efficient Generation of Pseudo-Random Numbers based on Chaotic Iterations
on GPU}
\begin{document}
\section{Introduction}
Random numbers are used in many scientific applications and simulations. On
-finite state machines, like computers, it is not possible to generate random
+finite state machines, as computers, it is not possible to generate random
numbers but only pseudo-random numbers. In practice, a good pseudo-random number
generator (PRNG) needs to verify some features to be used by scientists. It is
important to be able to generate pseudo-random numbers efficiently, the
generation needs to be reproducible and a PRNG needs to satisfy many usual
statistical properties. Finally, from our point a view, it is essential to prove
-that a PRNG is chaotic. Devaney~\cite{Devaney} proposed a common mathematical
-formulation of chaotic dynamical systems.
+that a PRNG is chaotic. Devaney~\cite{Devaney} proposed a common mathematical
+formulation of chaotic dynamical systems. Concerning the statistical tests,
+TestU01the is the best-known public-domain statistical testing packages. So we
+use it for all our PRNGs, especially the {\it BigCrush} which is based on the largest
+serie of tests.
In a previous work~\cite{bgw09:ip} we have proposed a new familly of chaotic
PRNG based on chaotic iterations (IC). In this paper we propose a faster
version which is also proven to be chaotic with the Devaney formulation.
Although graphics processing units (GPU) was initially designed to accelerate
-the manipulation of image, they are nowadays commonly used in many scientific
+the manipulation of images, they are nowadays commonly used in many scientific
applications. Therefore, it is important to be able to generate pseudo-random
-numbers in a GPU when a scientific application runs in a GPU. That is why we
-also provie an efficient PRNG for GPU respecting based on IC.
+numbers inside a GPU when a scientific application runs in a GPU. That is why we
+also provide an efficient PRNG for GPU respecting based on IC.
Interet de générer des nombres alea sur GPU
-\section{Related works}
+\section{Related works on GPU based PRNGs}
-In this section we review some GPU based PRNGs.
-\alert{RC, un petit state-of-the-art sur les PRNGs sur GPU ?}
+In the litterature many authors have work on defining GPU based PRNGs. We do not
+want to be exhaustive and we just give the most significant works from our point
+of view.
+
+In \cite{Pang:2008:cec}, the authors define a PRNG based on cellular automata
+which does not require high precision integer arithmetics nor bitwise
+operations. There is no mention of statistical tests nor proof that this PRNG is
+chaotic. Concerning the speed of generation, they can generate about 3200000
+random numbers per seconds on a GeForce 7800 GTX GPU (which is quite old now).
+
+In \cite{ZRKB10}, the authors propose different versions of efficient GPU PRNGs
+based on Lagged Fibonacci, Hybrid Taus or Hybrid Taus. They have used these
+PRNGs for Langevin simulations of biomolecules fully implemented on GPU.
\section{Basic Recalls}
\label{section:BASIC RECALLS}
In listing~\ref{algo:seqCIprng} a sequential version of our chaotic iterations
-based PRNG is presented. The xor operator is represented by
-\textasciicircum. This function uses three classical 64-bits PRNG: the
-\texttt{xorshift}, the \texttt{xor128} and the \texttt{xorwow}. In the
-following, we call them xor-like PRNGSs. These three PRNGs are presented
-in~\cite{Marsaglia2003}. As each xor-like PRNG used works with 64-bits and as
-our PRNG works with 32-bits, the use of \texttt{(unsigned int)} selects the 32
-least significant bits whereas \texttt{(unsigned int)(t3$>>$32)} selects the 32
-most significants bits of the variable \texttt{t}. So to produce a random
-number realizes 6 xor operations with 6 32-bits numbers produced by 3 64-bits
-PRNG. This version successes the BigCrush of the TestU01 battery [P. L’ecuyer
- and R. Simard. Testu01].
+based PRNG is presented. The xor operator is represented by \textasciicircum.
+This function uses three classical 64-bits PRNG: the \texttt{xorshift}, the
+\texttt{xor128} and the \texttt{xorwow}. In the following, we call them
+xor-like PRNGSs. These three PRNGs are presented in~\cite{Marsaglia2003}. As
+each xor-like PRNG used works with 64-bits and as our PRNG works with 32-bits,
+the use of \texttt{(unsigned int)} selects the 32 least significant bits whereas
+\texttt{(unsigned int)(t3$>>$32)} selects the 32 most significants bits of the
+variable \texttt{t}. So to produce a random number realizes 6 xor operations
+with 6 32-bits numbers produced by 3 64-bits PRNG. This version successes the
+BigCrush of the TestU01 battery~\cite{LEcuyerS07}.
\section{Efficient prng based on chaotic iterations on GPU}