+Several experiments were performed to measure the generation speed. The first
+machine is equipped with an NVIDIA Tesla C1060 GPU card and an Intel Xeon
+E5530 clocked at 2.40 GHz; the second has a less powerful CPU and a GeForce
+GTX 280. Both cards have 240 cores.
+
+In Figure~\ref{fig:time_gpu} we compare the number of random numbers generated
+per second. The xor-like PRNG is the xor64 generator described
+in~\cite{Marsaglia2003}. To obtain optimal performance, we do not store the
+random numbers in GPU memory: this step is time consuming and slows down
+random number generation. Moreover, for applications that consume random
+numbers directly as they are generated, storing them is
+unnecessary. In this figure we can see that when the number of threads is greater
+than approximately 30,000 and up to 5 million, the number of random numbers generated
+per second is almost constant. With the naive version, it is between 2.5 and
+3~GSample/s. With the optimized version, it is approximately
+20~GSample/s. Finally, the two GPU cards give quite similar results. In
+practice, the Tesla C1060 has more memory than the GTX 280, and this memory
+should be of better quality.
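
To illustrate the idea of consuming random numbers as they are generated, without storing them, here is a minimal sequential sketch in C. It is not the GPU kernel benchmarked above. It assumes the shift triple (13, 7, 17), one of the 64-bit xorshift triples given in~\cite{Marsaglia2003}. The names `xor64_next` and `xor64_consume` are hypothetical, and the XOR checksum merely stands in for an application that uses each value immediately.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One step of a 64-bit xorshift generator.
   Assumption: shift triple (13, 7, 17); the benchmarked code may differ.
   The state must be nonzero, because zero is a fixed point of the update. */
static uint64_t xor64_next(uint64_t *state)
{
    uint64_t x = *state;
    assert(x != 0);
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    *state = x;
    return x;
}

/* Hypothetical consumer: draws n numbers and folds each one into a
   checksum as soon as it is produced, so no output buffer is written. */
static uint64_t xor64_consume(uint64_t seed, size_t n)
{
    uint64_t state = seed;
    uint64_t acc = 0;
    for (size_t i = 0; i < n; ++i)
        acc ^= xor64_next(&state);
    return acc;
}
```

The generator state stays in a local variable for the whole loop, so the only value that leaves the function is the final checksum. On a GPU, the analogous choice would be to keep each thread's state in registers and avoid the write to global memory.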