o2 = threadIdx-offset+tab2[offset]\;
\For{i=1 to n} {
t=xor-like()\;
- t=t$\oplus$shmem[o1]$\oplus$shmem[o2]\;
+ t=t $\hat{ }$ shmem[o1] $\hat{ }$ shmem[o2]\;
shared\_mem[threadId]=t\;
- x = x $\oplus$ t\;
+ x = x $\hat{ }$ t\;
store the new PRNG in NewNb[NumThreads*threadId+i]\;
}
In Figure~\ref{fig:time_bbs_gpu} we highlight the performance of the optimized
BBS-based PRNG on GPU. On the Tesla C1060 we
-obtain approximately 1.8GSample/s and on the GTX 280 about 1.6GSample/s, which is
+obtain approximately 700MSample/s and on the GTX 280 about 670MSample/s, which is
obviously slower than the xorlike-based PRNG on GPU. However, we will show in the
next sections that
this new PRNG has a strong level of security, which is necessarily paid for by a speed
t|=BBS1(bbs1)\&7\;
t<<=BBS7(bbs7)\&3\;
t|=BBS2(bbs2)\&7\;
- t=t$\oplus$shmem[o1]$\oplus$shmem[o2]\;
+ t=t $\hat{ }$ shmem[o1] $\hat{ }$ shmem[o2]\;
shared\_mem[threadId]=t\;
- x = x $\oplus$ t\;
+ x = x $\hat{ }$ t\;
store the new PRNG in NewNb[NumThreads*threadId+i]\;
}