--- /dev/null
+# Analysis description
+set encoding iso_8859_1
+set terminal x11
+set size 1,0.5
+set term postscript enhanced portrait "Helvetica" 12
+#set title "Performance on homogeneous cluster"
+set ylabel "Random numbers generated / second"
+set xlabel "Number of threads used by the GPU"
+#set nologscale;
+set logscale x;
+set logscale y;
+#set label "Taille" at -0.002,2.1 right
+#set label "file" at -0.003,2 right
+#set key 1500,1600
+#set xrange [20:200]
+#set yrange [0:300]
+#set offsets 0,0,2,2
+set key left top
+plot 'time_gpu.txt' using 1:2 t "naive prng gpu" with linespoints lt 2 lw 2 ps 0 pt 5,\
+'time_gpu.txt' using 1:3 t "optimized prng gpu" with linespoints lt 1 lw 2 ps 0 pt 5
apply the xor operator between the current number and the strategy. In
order to obtain the strategy we also use a classical PRNG.
-Here is an example with 16-bits numbers showing how the bit operations are
+Here is an example with 16-bit numbers showing how the bitwise operations are
applied. Suppose that $x$ and the strategy $S^i$ are defined in binary mode.
Then the following table shows the result of $x$ xor $S^i$.
$$
-In listing~\ref{algo:seqCIprng} a sequential version of our chaotic iterations
-based PRNG is presented. The xor operator is represented by \textasciicircum. This function uses three classical 64-bits PRNG: the
-\texttt{xorshift}, the \texttt{xor128} and the \texttt{xorwow}. In the
-following, we call them xor-like PRNGSs. These three PRNGs are presented
-in~\cite{Marsaglia2003}. As each xor-like PRNG used works with 64-bits and as our PRNG
-works with 32-bits, the use of \texttt{(unsigned int)} selects the 32 least
-significant bits whereas \texttt{(unsigned int)(t3$>>$32)} selects the 32 most
-significants bits of the variable \texttt{t}. So to produce a random number
-realizes 6 xor operations with 6 32-bits numbers produced by 3 64-bits PRNG.
-This version successes the BigCrush of the TestU01 battery [P. L’ecuyer and
- R. Simard. Testu01].
+In listing~\ref{algo:seqCIprng} a sequential version of our chaotic iterations
+based PRNG is presented. The xor operator is represented by
+\textasciicircum. This function uses three classical 64-bit PRNGs: the
+\texttt{xorshift}, the \texttt{xor128} and the \texttt{xorwow}. In the
+following, we call them xor-like PRNGs. These three PRNGs are presented
+in~\cite{Marsaglia2003}. As each xor-like PRNG works with 64 bits and as
+our PRNG works with 32 bits, the use of \texttt{(unsigned int)} selects the 32
+least significant bits whereas \texttt{(unsigned int)(t3$>>$32)} selects the 32
+most significant bits of the variable \texttt{t}. Thus, producing one random
+number requires 6 xor operations on six 32-bit numbers produced by the three
+64-bit PRNGs. This version passes the BigCrush battery of TestU01
+[P. L'Ecuyer and R. Simard. TestU01].
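+
+As an illustration of this splitting, let $t$ be a 64-bit output of a xor-like
+PRNG. Assuming that \texttt{unsigned int} is 32 bits wide (which depends on the
+platform), the two casts compute
+$$
+t_{\mathrm{low}} = t \bmod 2^{32}, \qquad
+t_{\mathrm{high}} = \left\lfloor t / 2^{32} \right\rfloor .
+$$
+For instance, with the arbitrary value \texttt{t = 0x0123456789ABCDEF}, chosen
+only for this example, the low part is \texttt{0x89ABCDEF} and the high part
+is \texttt{0x01234567}.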
\section{Efficient PRNG based on chaotic iterations on GPU}
by the current thread. In the algorithm, we consider that a 64-bits xor-like
PRNG is used, that is why both 32-bits parts are used.
+This version also passes the BigCrush battery of tests.
+
\begin{algorithm}
\KwIn{InternalVarXorLikeArray: array with internal variables of 1 xor-like PRNGs in global memory\;
\caption{main kernel for the chaotic iterations based PRNG GPU efficient version}
\label{algo:gpu_kernel2}
\end{algorithm}
+
+
+
\section{Experiments}
Different experiments have been performed in order to measure the generation speed.
+\begin{figure}[t]
+\begin{center}
+ \includegraphics[scale=.5]{curve_time_gpu.pdf}
+\end{center}
+\caption{Number of random numbers generated per second by the naive and optimized GPU versions, as a function of the number of threads}
+\label{fig:time_naive_gpu}
+\end{figure}
First of all we have compared the time to generate X random numbers with both the CPU version and the GPU version.
--- /dev/null
+#threads naive(rand/s) optimized(rand/s)
+10240 1958396000.03 13162317203.28
+20480 2607152000.80 17544514829.35
+30720 2932438000.82 19734780759.37
+51200 2787838000.25 18772978895.58
+76800 2926940000.81 19718338110.60
+102400 2778762000.41 18800512068.39
+153600 2927902000.36 19692840251.08
+512000 2905399000.83 19605898582.49
+768000 2826752000.70 19717903047.22
+1048576 2717620000.40 19625932346.26
+2097152 2720592856.85 19571418202.69
+5242880 2542399000.19 19497621662.45