\newcommand{\alert}[1]{\begin{color}{blue}\textit{#1}\end{color}}
-\title{Efficient Generation of Pseudo-Random Numbers based on Chaotic Iterations
-on GPU}
+\title{Efficient and Cryptographically Secure Generation of Chaotic Pseudorandom Numbers on GPU}
\begin{document}
-\author{Jacques M. Bahi, Rapha\"{e}l Couturier, and Christophe
-Guyeux, Pierre-Cyrille Heam\thanks{Authors in alphabetic order}}
-
+\author{Jacques M. Bahi, Rapha\"{e}l Couturier, Christophe
+Guyeux, and Pierre-Cyrille Heam\thanks{Authors in alphabetic order}}
+
\maketitle
\begin{abstract}
-In this paper we present a new pseudo-random numbers generator (PRNG) on
-graphics processing units (GPU). This PRNG is based on chaotic iterations. it
-is proven to be chaotic in the Devanay's formulation. We propose an efficient
-implementation for GPU which succeeds to the {\it BigCrush}, the hardest
-batteries of test of TestU01. Experimentations show that this PRNG can generate
+In this paper we present a new pseudorandom number generator (PRNG) on
+graphics processing units (GPU). This PRNG is based on the so-called chaotic iterations. It
+is firstly proven to be chaotic according to the Devaney's formulation. We thus propose an efficient
+implementation for GPU that successfully passes the {\it BigCrush} tests, deemed to be the hardest
+battery of tests in TestU01. Experiments show that this PRNG can generate
about 20 billions of random numbers per second on Tesla C1060 and NVidia GTX280
cards.
+It is finally established that, under reasonable assumptions, the proposed PRNG can be cryptographically
+secure.
\end{abstract}
\section{Introduction}
-Random numbers are used in many scientific applications and simulations. On
-finite state machines, as computers, it is not possible to generate random
-numbers but only pseudo-random numbers. In practice, a good pseudo-random numbers
-generator (PRNG) needs to verify some features to be used by scientists. It is
-important to be able to generate pseudo-random numbers efficiently, the
-generation needs to be reproducible and a PRNG needs to satisfy many usual
-statistical properties. Finally, from our point a view, it is essential to prove
-that a PRNG is chaotic. Concerning the statistical tests, TestU01 is the
-best-known public-domain statistical testing package. So we use it for all our
-PRNGs, especially the {\it BigCrush} which provides the largest serie of tests.
-Concerning the chaotic properties, Devaney~\cite{Devaney} proposed a common
-mathematical formulation of chaotic dynamical systems.
-
-In a previous work~\cite{bgw09:ip} we have proposed a new familly of chaotic
-PRNG based on chaotic iterations. We have proven that these PRNGs are
-chaotic in the Devaney's sense. In this paper we propose a faster version which
-is also proven to be chaotic.
-
-Although graphics processing units (GPU) was initially designed to accelerate
+Randomness is of importance in many fields as scientific simulations or cryptography.
+``Random numbers'' can mainly be generated either by a deterministic and reproducible algorithm
+called a pseudorandom number generator (PRNG), or by a physical non-deterministic
+process having all the characteristics of a random noise, called a truly random number
+generator (TRNG).
+In this paper, we focus on reproducible generators, useful for instance in
+Monte-Carlo based simulators or in several cryptographic schemes.
+These domains need PRNGs that are statistically irreproachable.
+On some fields as in numerical simulations, speed is a strong requirement
+that is usually attained by using parallel architectures. In that case,
+a recurrent problem is that a deflate of the statistical qualities is often
+reported, when the parallelization of a good PRNG is realized.
+This is why ad-hoc PRNGs for each possible architecture must be found to
+achieve both speed and randomness.
+On the other side, speed is not the main requirement in cryptography: the great
+need is to define \emph{secure} generators being able to withstand malicious
+attacks. Roughly speaking, an attacker should not be able in practice to make
+the distinction between numbers obtained with the secure generator and a true random
+sequence.
+Finally, a small part of the community working in this domain focus on a
+third requirement, that is to define chaotic generators.
+The main idea is to take benefits from a chaotic dynamical system to obtain a
+generator that is unpredictable, disordered, sensible to its seed, or in other words chaotic.
+Their desire is to map a given chaotic dynamics into a sequence that seems random
+and unassailable due to chaos.
+However, the chaotic maps used as a pattern are defined in the real line
+whereas computers deal with finite precision numbers.
+This distortion leads to a deflation of both chaotic properties and speed.
+Furthermore, authors of such chaotic generators often claim their PRNG
+as secure due to their chaos properties, but there is no obvious relation
+between chaos and security as it is understood in cryptography.
+This is why the use of chaos for PRNG still remains marginal and disputable.
+
+The authors' opinion is that topological properties of disorder, as they are
+properly defined in the mathematical theory of chaos, can reinforce the quality
+of a PRNG. But they are not substitutable for security or statistical perfection.
+Indeed, to the authors' point of view, such properties can be useful in the two following situations. On the
+one hand, a post-treatment based on a chaotic dynamical system can be applied
+to a PRNG statistically deflective, in order to improve its statistical
+properties. Such an improvement can be found, for instance, in~\cite{bgw09:ip,bcgr11:ip}.
+On the other hand, chaos can be added to a fast, statistically perfect PRNG and/or a
+cryptographically secure one, in case where chaos can be of interest,
+\emph{only if these last properties are not lost during
+the proposed post-treatment}. Such an assumption is behind this research work.
+It leads to the attempts to define a
+family of PRNGs that are chaotic while being fast and statistically perfect,
+or cryptographically secure.
+Let us finish this paragraph by noticing that, in this paper,
+statistical perfection refers to the ability to pass the whole
+{\it BigCrush} battery of tests, which is widely considered as the most
+stringent statistical evaluation of a sequence claimed as random.
+This battery can be found into the well-known TestU01 package.
+Chaos, for its part, refers to the well-established definition of a
+chaotic dynamical system proposed by Devaney~\cite{Devaney}.
+
+
+In a previous work~\cite{bgw09:ip,guyeux10} we have proposed a post-treatment on PRNGs making them behave
+as a chaotic dynamical system. Such a post-treatment leads to a new category of
+PRNGs. We have shown that proofs of Devaney's chaos can be established for this
+family, and that the sequence obtained after this post-treatment can pass the
+NIST, DieHARD, and TestU01 batteries of tests, even if the inputted generators
+cannot.
+The proposition of this paper is to improve widely the speed of the formerly
+proposed generator, without any lack of chaos or statistical properties.
+In particular, a version of this PRNG on graphics processing units (GPU)
+is proposed.
+Although GPU was initially designed to accelerate
the manipulation of images, they are nowadays commonly used in many scientific
-applications. Therefore, it is important to be able to generate pseudo-random
-numbers inside a GPU when a scientific application runs in a GPU. That is why we
-also provide an efficient PRNG for GPU respecting based on IC. Such devices
-allows us to generated almost 20 billions of random numbers per second.
-
-In order to establish that our PRNGs are chaotic according to the Devaney's
-formulation, we extend what we have proposed in~\cite{guyeux10}.
-
-The rest of this paper is organised as follows. In Section~\ref{section:related
- works} we review some GPU implementions of PRNG. Section~\ref{section:BASIC
- RECALLS} gives some basic recalls on Devanay's formation of chaos and chaotic
-iterations. In Section~\ref{sec:pseudo-random} the proof of chaos of our PRNGs
-is studied. Section~\ref{sec:efficient prng} presents an efficient
-implementation of our chaotic PRNG on a CPU. Section~\ref{sec:efficient prng
- gpu} describes the GPU implementation of our chaotic PRNG. In
-Section~\ref{sec:experiments} some experimentations are presented.
- Finally, we give a conclusion and some perspectives.
+applications. Therefore, it is important to be able to generate pseudorandom
+numbers inside a GPU when a scientific application runs in it. This remark
+motivates our proposal of a chaotic and statistically perfect PRNG for GPU.
+Such device
+allows us to generated almost 20 billions of pseudorandom numbers per second.
+Last, but not least, we show that the proposed post-treatment preserves the
+cryptographical security of the inputted PRNG, when this last has such a
+property.
+
+The remainder of this paper is organized as follows. In Section~\ref{section:related
+ works} we review some GPU implementations of PRNGs. Section~\ref{section:BASIC
+ RECALLS} gives some basic recalls on the well-known Devaney's formulation of chaos,
+ and on an iteration process called ``chaotic
+iterations'' on which the post-treatment is based.
+Proofs of chaos are given in Section~\ref{sec:pseudorandom}.
+Section~\ref{sec:efficient prng} presents an efficient
+implementation of this chaotic PRNG on a CPU, whereas Section~\ref{sec:efficient prng
+ gpu} describes the GPU implementation.
+Such generators are experimented in
+Section~\ref{sec:experiments}.
+We show in Section~\ref{sec:security analysis} that, if the inputted
+generator is cryptographically secure, then it is the case too for the
+generator provided by the post-treatment.
+Such a proof leads to the proposition of a cryptographically secure and
+chaotic generator on GPU based on the famous Blum Blum Shum
+in Section~\ref{sec:CSGPU}.
+This research work ends by a conclusion section, in which the contribution is
+summarized and intended future work is presented.
\end{theorem}
This result of chaos has lead us to study the possibility to build a
-pseudo-random number generator (PRNG) based on the chaotic iterations.
+pseudorandom number generator (PRNG) based on the chaotic iterations.
As $G_f$, defined on the domain $\llbracket 1 ; \mathsf{N} \rrbracket^{\mathds{N}}
\times \mathds{B}^\mathsf{N}$, is build from Boolean networks $f : \mathds{B}^\mathsf{N}
\rightarrow \mathds{B}^\mathsf{N}$, we can preserve the theoretical properties on $G_f$
$\mathds{B}^\mathsf{N}$ represents the memory of the computer whereas $\llbracket 1 ; \mathsf{N}
\rrbracket^{\mathds{N}}$ is its input stream (the seeds, for instance).
-\section{Application to Pseudo-Randomness}
-\label{sec:pseudo-random}
-\subsection{A First Pseudo-Random Number Generator}
+\section{Application to pseudorandomness}
+\label{sec:pseudorandom}
+\subsection{A First pseudorandom Number Generator}
We have proposed in~\cite{bgw09:ip} a new family of generators that receives
two PRNGs as inputs. These two generators are mixed with chaotic iterations,
of chaos obtained in~\cite{bg10:ij} have been established
only for chaotic iterations of the form presented in Definition
\ref{Def:chaotic iterations}. The question is now to determine whether the
-use of more general chaotic iterations to generate pseudo-random numbers
+use of more general chaotic iterations to generate pseudorandom numbers
faster, does not deflate their topological chaos properties.
\subsection{Proofs of Chaos of the General Formulation of the Chaotic Iterations}
chaotic iterations presented previously, and for this reason, it satisfies the
Devaney's formulation of a chaotic behavior.
-\section{A cryptographically secure prng for GPU}
-
-It is possible to build a cryptographically secure prng based on the previous
-algorithm (algorithm~\ref{algo:gpu_kernel2}). It simply consists in replacing
-the {\it xor-like} algorithm by another cryptographically secure prng. In
-practice, we suggest to use the BBS algorithm~\cite{BBS} which takes the form:
-$$x_{n+1}=x_n^2~ mod~ M$$ where $M$ is the product of two prime numbers. Those
-prime numbers need to be congruent to 3 modulus 4. In practice, this PRNG is
-known to be slow and not efficient for the generation of random numbers. For
-current GPU cards, the modulus operation is the most time consuming
-operation. So in order to obtain quite reasonable performances, it is required
-to use only modulus on 32 bits integer numbers. Consequently $x_n^2$ need to be
-less than $2^{32}$ and the number $M$ need to be less than $2^{16}$. So in
-pratice we can choose prime numbers around 256 that are congruent to 3 modulus
-4. With 32 bits numbers, only the 4 least significant bits of $x_n$ can be
-chosen (the maximum number of undistinguishing is less or equals to
-$log_2(log_2(x_n))$). So to generate a 32 bits number, we need to use 8 times
-the BBS algorithm, with different combinations of $M$ is required.
-
-Currently this PRNG does not succeed to pass all the tests of TestU01.
-
\section{Experiments}
\label{sec:experiments}
\end{figure}
Both these experimentations allows us to conclude that it is possible to
-generate a huge number of pseudo-random numbers with the xor-like version and
+generate a huge number of pseudorandom numbers with the xor-like version and
about tens times less with the BBS based version. The former version has only
chaotic properties whereas the latter also has cryptographically properties.
\section{Security Analysis}
-
+\label{sec:security analysis}
In this section the concatenation of two strings $u$ and $v$ is classically
denoted by $uv$.
-In a cryptographic context, a pseudo-random generator is a deterministic
+In a cryptographic context, a pseudorandom generator is a deterministic
algorithm $G$ transforming strings into strings and such that, for any
seed $w$ of length $N$, $G(w)$ (the output of $G$ on the input $w$) has size
$\ell_G(N)$ with $\ell_G(N)>N$.
function $\ell^\prime$ satisfying $\ell^\prime(N)>N)$~\cite[Chapter 3.3]{Goldreich}.
The generation schema developed in (\ref{equation Oplus}) is based on a
-pseudo-random generator. Let $H$ be a cryptographic PRNG. We may assume,
+pseudorandom generator. Let $H$ be a cryptographic PRNG. We may assume,
without loss of generality, that for any string $S_0$ of size $N$, the size
of $H(S_0)$ is $kN$, with $k>2$. It means that $\ell_H(N)=kN$.
Let $S_1,\ldots,S_k$ be the
+\section{A cryptographically secure prng for GPU}
+\label{sec:CSGPU}
+It is possible to build a cryptographically secure prng based on the previous
+algorithm (algorithm~\ref{algo:gpu_kernel2}). It simply consists in replacing
+the {\it xor-like} algorithm by another cryptographically secure prng. In
+practice, we suggest to use the BBS algorithm~\cite{BBS} which takes the form:
+$$x_{n+1}=x_n^2~ mod~ M$$ where $M$ is the product of two prime numbers. Those
+prime numbers need to be congruent to 3 modulus 4. In practice, this PRNG is
+known to be slow and not efficient for the generation of random numbers. For
+current GPU cards, the modulus operation is the most time consuming
+operation. So in order to obtain quite reasonable performances, it is required
+to use only modulus on 32 bits integer numbers. Consequently $x_n^2$ need to be
+less than $2^{32}$ and the number $M$ need to be less than $2^{16}$. So in
+pratice we can choose prime numbers around 256 that are congruent to 3 modulus
+4. With 32 bits numbers, only the 4 least significant bits of $x_n$ can be
+chosen (the maximum number of undistinguishing is less or equals to
+$log_2(log_2(x_n))$). So to generate a 32 bits number, we need to use 8 times
+the BBS algorithm, with different combinations of $M$ is required.
+
+Currently this PRNG does not succeed to pass all the tests of TestU01.
+
\section{Conclusion}
We also propose a PRNG cryptographically secure and its implementation on GPU.
An efficient implementation on GPU based on a xor-like PRNG allows us to
-generate a huge number of pseudo-random numbers per second (about
+generate a huge number of pseudorandom numbers per second (about
20Gsample/s). This PRNG succeeds to pass the hardest batteries of TestU01.
In future work we plan to extend this work for parallel PRNG for clusters or