X-Git-Url: https://bilbo.iut-bm.univ-fcomte.fr/and/gitweb/prng_gpu.git/blobdiff_plain/b5e3d39489874c8ac95b5825ed5baa520529de17..3e3534bd31a342c01e2f3770836bb62ecd460cfe:/prng_gpu.tex diff --git a/prng_gpu.tex b/prng_gpu.tex index 537feef..081ec9c 100644 --- a/prng_gpu.tex +++ b/prng_gpu.tex @@ -8,6 +8,7 @@ \usepackage{moreverb} \usepackage{commath} \usepackage{algorithm2e} +\usepackage{listings} \usepackage[standard]{ntheorem} % Pour mathds : les ensembles IR, IN, etc. @@ -48,9 +49,10 @@ This is the abstract Interet des itérations chaotiques pour générer des nombre alea\\ Interet de générer des nombres alea sur GPU +\alert{RC, un petit state-of-the-art sur les PRNGs sur GPU ?} ... -% >>>>>>>>>>>>>>>>>>>>>> Basic recalls <<<<<<<<<<<<<<<<<<<<<<<<<<<<<< + \section{Basic Recalls} \label{section:BASIC RECALLS} This section is devoted to basic definitions and terminologies in the fields of topological chaos and chaotic iterations. @@ -191,11 +193,10 @@ The distance presented above follows these recommendations. Indeed, if the floor Finally, it has been established in \cite{guyeux10} that, \begin{proposition} -$G_{f}$ must be continuous in the metric -space $(\mathcal{X},d)$. +Let $f$ be a map from $\mathds{B}^n$ to itself. Then $G_{f}$ is continuous in the metric space $(\mathcal{X},d)$. \end{proposition} -The chaotic property of $G_f$ has been firstly established for the vectorial Boolean negation \cite{guyeux10}. To obtain a characterization, we have introduced the notion of asynchronous iteration graph recalled bellow. +The chaotic property of $G_f$ has been firstly established for the vectorial Boolean negation \cite{guyeux10}. To obtain a characterization, we have secondly introduced the notion of asynchronous iteration graph recalled bellow. Let $f$ be a map from $\mathds{B}^n$ to itself. The {\emph{asynchronous iteration graph}} associated with $f$ is the @@ -216,8 +217,8 @@ Let $f:\mathds{B}^n\to\mathds{B}^n$. $G_f$ is chaotic (according to Devaney) if and only if $\Gamma(f)$ is strongly connected. \end{theorem} - - +This result of chaos has lead us to study the possibility to build a pseudo-random number generator (PRNG) based on the chaotic iterations. +As $G_f$, defined on the domain $\llbracket 1 ; n \rrbracket^{\mathds{N}} \times \mathds{B}^n$, is build from Boolean networks $f : \mathds{B}^n \rightarrow \mathds{B}^n$, we can preserve the theoretical properties on $G_f$ during implementations (due to the discrete nature of $f$). It is as if $\mathds{B}^n$ represents the memory of the computer whereas $\llbracket 1 ; n \rrbracket^{\mathds{N}}$ is its input stream (the seeds, for instance). \section{Application to Pseudo-Randomness} @@ -246,7 +247,7 @@ return $x$\; \end{algorithm} \begin{algorithm}[h!] -\SetAlgoLined +%\SetAlgoLined %%RAPH: cette ligne provoque une erreur chez moi \KwIn{the internal configuration $z$ (a 32-bit word)} \KwOut{$y$ (a 32-bit word)} $z\leftarrow{z\oplus{(z\ll13)}}$\; @@ -285,6 +286,11 @@ We have proven in \cite{FCT11} that, \end{theorem} + +\alert{Mettre encore un peu de blabla sur le PRNG, puis enchaîner en disant que, ok, on peut préserver le chaos quand on passe sur machine, mais que le chaos dont il s'agit a été prouvé pour une distance bizarroïde sur un espace non moins hémoroïde, d'où ce qui suit} + + + \section{The relativity of disorder} \label{sec:de la relativité du désordre} @@ -663,52 +669,111 @@ Indeed this result is weaker than the theorem establishing the chaos for the fin On parle du séquentiel avec des nombres 64 bits\\ -Faire le lien avec le paragraphe précédent (je considère que la stratégie s'appelle $S^i$\\ In order to implement efficiently a PRNG based on chaotic iterations it is possible to improve previous works [ref]. One solution consists in considering -that the strategy used $S^i$ contains all the bits for which the negation is +that the strategy used contains all the bits for which the negation is achieved out. Then instead of applying the negation on these bits we can simply -apply the xor operator between the current number and the strategy $S^i$. In +apply the xor operator between the current number and the strategy. In order to obtain the strategy we also use a classical PRNG. -\begin{figure}[htbp] -\begin{center} -\fbox{ -\begin{minipage}{14cm} -unsigned int CIprng() \{\\ - static unsigned int x = 123123123;\\ - unsigned long t1 = xorshift();\\ - unsigned long t2 = xor128();\\ - unsigned long t3 = xorwow();\\ - x = x\textasciicircum (unsigned int)t1;\\ - x = x\textasciicircum (unsigned int)(t2$>>$32);\\ - x = x\textasciicircum (unsigned int)(t3$>>$32);\\ - x = x\textasciicircum (unsigned int)t2;\\ - x = x\textasciicircum (unsigned int)(t1$>>$32);\\ - x = x\textasciicircum (unsigned int)t3;\\ - return x;\\ -\} -\end{minipage} +%% \begin{figure}[htbp] +%% \begin{center} +%% \fbox{ +%% \begin{minipage}{14cm} +%% unsigned int CIprng() \{\\ +%% static unsigned int x = 123123123;\\ +%% unsigned long t1 = xorshift();\\ +%% unsigned long t2 = xor128();\\ +%% unsigned long t3 = xorwow();\\ +%% x = x\textasciicircum (unsigned int)t1;\\ +%% x = x\textasciicircum (unsigned int)(t2$>>$32);\\ +%% x = x\textasciicircum (unsigned int)(t3$>>$32);\\ +%% x = x\textasciicircum (unsigned int)t2;\\ +%% x = x\textasciicircum (unsigned int)(t1$>>$32);\\ +%% x = x\textasciicircum (unsigned int)t3;\\ +%% return x;\\ +%% \} +%% \end{minipage} +%% } +%% \end{center} +%% \caption{sequential Chaotic Iteration PRNG} +%% \label{algo:seqCIprng} +%% \end{figure} + + + +\lstset{language=C,caption={C code of the sequential chaotic iterations based PRNG},label=algo:seqCIprng} +\begin{lstlisting} +unsigned int CIprng() { + static unsigned int x = 123123123; + unsigned long t1 = xorshift(); + unsigned long t2 = xor128(); + unsigned long t3 = xorwow(); + x = x^(unsigned int)t1; + x = x^(unsigned int)(t2>>32); + x = x^(unsigned int)(t3>>32); + x = x^(unsigned int)t2; + x = x^(unsigned int)(t1>>32); + x = x^(unsigned int)t3; + return x; } -\end{center} -\caption{sequential Chaotic Iteration PRNG} -\label{algo:seqCIprng} -\end{figure} +\end{lstlisting} + + + + -In Figure~\ref{algo:seqCIprng} a sequential version of our chaotic iterations -based PRNG is presented. This version uses three classical 64 bits PRNG: the -\texttt{xorshift}, the \texttt{xor128} and the \texttt{xorwow}. These three -PRNGs are presented in~\cite{Marsaglia2003}. As each PRNG used works with -64-bits and as our PRNG works with 32 bits, the use of \texttt{(unsigned int)} -selects the 32 least significant bits whereas \texttt{(unsigned int)(t3$>>$32)} -selects the 32 most significants bits of the variable \texttt{t}. This version -sucesses the BigCrush of the TestU01 battery [P. L’ecuyer and +In listing~\ref{algo:seqCIprng} a sequential version of our chaotic iterations +based PRNG is presented. This version uses three classical 64-bits PRNG: the +\texttt{xorshift}, the \texttt{xor128} and the \texttt{xorwow}. In the +following, we call them xor-like PRNGSs. These three PRNGs are presented +in~\cite{Marsaglia2003}. As each xor-like PRNG used works with 64-bits and as our PRNG +works with 32-bits, the use of \texttt{(unsigned int)} selects the 32 least +significant bits whereas \texttt{(unsigned int)(t3$>>$32)} selects the 32 most +significants bits of the variable \texttt{t}. So to produce a random number +realizes 6 xor operations with 6 32-bits numbers produced by 3 64-bits PRNG. +This version successes the BigCrush of the TestU01 battery [P. L’ecuyer and R. Simard. Testu01]. \section{Efficient prng based on chaotic iterations on GPU} -On parle du passage du sequentiel au GPU +In order to benefit from computing power of GPU, a program needs to define +independent blocks of threads which can be computed simultaneously. In general, +the larger the number of threads is, the more local memory is used and the less +branching instructions are used (if, while, ...), the better performance is +obtained on GPU. So with algorithm \ref{algo:seqCIprng} presented in the +previous section, it is possible to build a similar program which computes PRNG +on GPU. The principe consists in assigning the computation of a PRNG as in +sequential to each thread of the GPU. Of course, it is essential that the three +xor-like PRNGs used for our computation have different parameters. So we chose +them randomly with another PRNG. As the initialisation is performed by the CPU, +we have chosen to use the ISAAC PRNG to initalize all the parameters for the GPU +version of our PRNG. The implementation of the three xor-like PRNGs is +straightforward as soon as their parameters have been allocated in the GPU +memory. Each xor-like PRNGs used works with an internal number $x$ which keeps +the last generated random numbers. Other internal variables are used by the +xor-like PRNGs. More precisely, the implementation of the xor128, the xorshift +and the xorwow respectively require 4, 5 and 6 unsigned long as internal +variables. + +\begin{algorithm} + +\KwIn{InternalVarXorLikeArray: array with internal variables of the 3 xor-like PRNGs in global memory\; +NumThreads: Number of threads\;} +\KwOut{NewNb: array containing random numbers in global memory} +\If{threadId is concerned} { + retrieve data from InternalVarXorLikeArray[threadId] in local variables\; + \For{i=1 to n} { + compute a new PRNG as in Listing\ref{algo:seqCIprng}\; + store the new PRNG in NewNb[NumThreads*threadId+i]\; + } + store internal variables in InternalVarXorLikeArray[threadId]\; +} + +\caption{PRNG with chaotic functions} +\label{main kernel for the chaotic iterations based PRNG GPU version} +\end{algorithm} \section{Experiments}