Ajout diapos présentation PRNG

[16dcc.git] / chaos.tex
diff --git a/chaos.tex b/chaos.tex

index acf42e33893ed388a70ccc519d5e14cbcf944629..a82d57cf03021e14b0c5edef7ead4db15fc6f8f0 100644 (file)
--- a/chaos.tex
+++ b/chaos.tex
@@ -1,3 +1,5 @@
+
+\subsection{Motivations}
  Let us us first recall the chaos theoretical context presented 
  in~\cite{bcgr11:ip}. In this article, the space of interest 
  is $\Bool^{{\mathsf{N}}} \times \llbracket1;{\mathsf{N}}\rrbracket^{\Nats}$ 
@@ -22,13 +24,13 @@ We have proven~\cite[Theorem 1]{bcgr11:ip} that
  $\mathcal{H}_f$ is chaotic in 
  $\Bool^{{\mathsf{N}}} \times \llbracket1;{\mathsf{N}}\rrbracket^{\Nats}$
  if and only if $\Gamma(f)$ is strongly connected.
-However, the corrolary which would say that $\chi_{\textit{14Secrypt}}$ is chaotic 
+However, the corollary which would say that $\chi_{\textit{14Secrypt}}$ is chaotic 
  cannot be directly deduced since we do not output all the successive
-values of iterating $F_f$. Only a a few of them is concerned and 
+values of iterating $F_f$. Only a few of them is concerned and 
  any subsequence of a chaotic sequence  is   not  necessarily  
  a   chaotic  sequence  too.
  This necessitates a rigorous proof, which is the aim of this section.
-
+Let us firstly recall the theoretical framework in which this research takes place.
  
  
  
@@ -38,7 +40,7 @@ This necessitates a rigorous proof, which is the aim of this section.
  
  
  Consider a topological space $(\mathcal{X},\tau)$ and a continuous function $f :
-\mathcal{X} \rightarrow \mathcal{X}$.
+\mathcal{X} \rightarrow \mathcal{X}$~\cite{Devaney}.
  
  \begin{definition}
  The function $f$ is said to be \emph{topologically transitive} if, for any pair of open sets
@@ -153,13 +155,12 @@ Let us first introduce $\mathcal{P} \subset \mathds{N}$ a finite nonempty
  set having the cardinality $\mathsf{p} \in \mathds{N}^\ast$.
  Intuitively, this  is the set of authorized numbers of iterations.
  Denote by $p_1, p_2, \hdots, p_\mathsf{p}$ the ordered elements of $\mathcal{P}$: $\mathcal{P} = \{ p_1, p_2, \hdots, p_\mathsf{p}\}$
-and $p_1< p_2< \hdots < p_\mathsf{p}$. In our algorithm, 
-$\mathsf{p}$ is 1 and $p_1$ is $b$. 
-
+and $p_1< p_2< \hdots < p_\mathsf{p}$. 
  
-The Algorithm~\ref{CI Algorithm} 
-may be seen as $b$ functional composition of $F_f$.
-However, it can be generalized with $p_i$, $p_i \in \mathcal{P}$,
+In our Algorithm~\ref{CI Algorithm}, 
+$\mathsf{p}$ is 1 and $p_1$ is $b$. 
+But this algorithm can be seen as $b$ functional compositions of $F_f$.
+Obviously, it can be generalized with $p_i$, $p_i \in \mathcal{P}$,
  functional compositions of $F_f$.
  Thus, for any $p_i \in \mathcal{P}$ we introduce the function 
  $F_{f,p_i} :  \mathds{B}^\mathsf{N} \times \llbracket 1, \mathsf{N} \rrbracket^{p_i}  \rightarrow \mathds{B}^\mathsf{N}$ defined by 
@@ -179,10 +180,10 @@ $\mathds{S}_{\mathsf{N},\mathcal{P}}=
  Each element in this space is a pair where the first element is 
  $\mathsf{N}$-uple in $\Bool^{\mathsf{N}}$, as in the previous space.  
  The second element is a pair $((u^k)_{k \in \Nats},(v^k)_{k \in \Nats})$ of infinite sequences.
-The sequence $(v^k)_{k \in \Nats}$ defines how many iterations are executed at time $k$ between two outputs. 
-The sequence $(u^k)_{k \in \Nats}$ defines which elements is modified. 
+The sequence $(v^k)_{k \in \Nats}$ defines how many iterations are executed at time $k$ before the next output, 
+while $(u^k)_{k \in \Nats}$ details which elements are modified. 
  
-Let us define the shift function $\Sigma$ for any element of $\mathds{S}_{\mathsf{N},\mathcal{P}}$.
+Let us introduce the shift function $\Sigma$ for any element of $\mathds{S}_{\mathsf{N},\mathcal{P}}$.
  
  \[
  \begin{array}{cccc}
@@ -199,17 +200,18 @@ Let us define the shift function $\Sigma$ for any element of $\mathds{S}_{\maths
  In other words, $\Sigma$ receives two sequences $u$ and $v$, and
  it operates $v^0$ shifts on the first sequence and a single shift
  on the second one. 
-Let
+Let us consider
  \begin{equation}
  \begin{array}{cccc}
  G_f :&  \mathcal{X}_{\mathsf{N},\mathcal{P}} & \rightarrow & \mathcal{X}_{\mathsf{N},\mathcal{P}}\\
     & (e,(u,v)) & \mapsto & \left( F_{f,v^0}\left( e, (u^0, \hdots, u^{v^0-1}\right), \Sigma (u,v) \right) .
  \end{array}
  \end{equation}
-Then the outputs $(y^0, y^1, \hdots )$ produced by the $\textit{CIPRNG}_f^2(u,v)$ generator 
-are the first components of the iterations $X^0 = (x^0, (u,v))$ and $\forall n \in \mathds{N}, 
+Then the outputs $(y^0, y^1, \hdots )$ produced by the $\textit{CIPRNG}_f^2(u,v)$ generator~\cite{wbg10:ip} 
+are by definition the first components of the iterations $X^0 = (x^0, (u,v))$ and $\forall n \in \mathds{N}, 
  X^{n+1} = G_f(X^n)$ on $\mathcal{X}_{\mathsf{N},\mathcal{P}}$.
-
+The new obtained generator can be shown as either a post-treatment over generators $u$ and $v$, or a
+discrete dynamical system on a set constituted by binary vectors and couple of integer sequences.
  
  
  
@@ -233,7 +235,7 @@ between finite sequences $u^0, u^1, \hdots, u^{v^0-1}$ and  $\check{u}^0, \check
  between $u^{v^0}, u^{v^0+1}, \hdots, u^{v^1-1}$ and  $\check{u}^{\check{v}^0}, \check{u}^{\check{v}^0+1}, \hdots, \check{u}^{\check{v}^1-1}$, etc.
  More precisely, let $p = \lfloor \log_{10}{(\max{\mathcal{P}})}\rfloor +1$ and $n = \lfloor \log_{10}{(\mathsf{N})}\rfloor +1$.
  \begin{itemize}
-\item The $p$ first digits of $d(x,\check{x})$ is $|v^0-\check{v}^0|$ written in decimal numeration (and with $p$ digits).
+\item The $p$ first digits of $d(x,\check{x})$ is $|v^0-\check{v}^0|$ written in decimal numeration (and with $p$ digits: zeros are added on the left if needed).
  \item The next $n\times \max{(\mathcal{P})}$ digits aim at measuring how much $u^0, u^1, \hdots, u^{v^0-1}$ differs from $\check{u}^0, \check{u}^1, \hdots, \check{u}^{\check{v}^0-1}$. The $n$ first
  digits are $|u^0-\check{u}^0|$. They are followed by 
  $|u^1-\check{u}^1|$ written with $n$ digits, etc.
@@ -251,12 +253,20 @@ $\check{u}^{v^0}$ (on $n$ digits), ..., $\check{u}^{\check{v}^0-1}$ (on $n$ digi
  \end{itemize}
  \end{itemize}
  
-
-
-\newcommand{\ns}{$\hspace{.1em}$}
-
+This distance has been defined to capture all aspects of divergences between two sequences generated 
+by the $\textit{CIPRNG}_f^2$ method, when setting respectively $(u,v)$ and $(\check{u},\check{v})$ as inputted 
+couples of generators. The integral part measures the bitwise Hamming distance between 
+the two $\mathsf{N}$-length binary vectors chosen as seeds. The fractional part must decrease 
+when the number of identical iterations applied by the $\textit{CIPRNG}_f^2$ discrete dynamical system on these seeds, in both cases (that is, when inputting either $(u,v)$ or $(\check{u},\check{v})$), increases.
+More precisely, the fractional part will alternately measure the following elements:
+\begin{itemize}
+  \item Do we iterate the same number of times between the next two outputs, when considering either $(u,v)$ or $(\check{u},\check{v})$?
+  \item Then, do we iterate the same components between the next two outputs of $\textit{CIPRNG}_f^2$ ?
+  \item etc.
+\end{itemize}
+Finally, zeros are put to be able to recover what occurred at a given iteration. Such aims are illustrated in the two following examples.
  \begin{xpl}
-Consider for instance that $\mathsf{N}=13$, $\mathcal{P}=\{1,2,11\}$ (so $\mathsf{p}=2$), and that
+Consider for instance that $\mathsf{N}=13$, $\mathcal{P}=\{1,2,11\}$ (so $\mathsf{p}=3$, $p=\lfloor \log_{10}{(\max{\mathcal{P}})}\rfloor +1 = 2$, while $n=2$), and that
  $s=\left\{
  \begin{array}{l}
  u=\underline{6,} ~ \underline{11,5}, ...\\
@@ -271,28 +281,29 @@ $\check{s}=\left\{
  \end{array}
  \right.$.
  
-So $d_{\mathds{S}_{\mathsf{N},\mathcal{P}}}(s,\check{s}) = 0.01\ns00\ns04\ns00\ns00\ns00\ns00\ns00\ns00\ns00\ns00\ns00\ns01\ns10\ns05 ...$
+So 
+$$d_{\mathds{S}_{\mathsf{N},\mathcal{P}}}(s,\check{s}) = 0.01~0004000000000000000000~01~1005...$$
  Indeed, the $p=2$ first digits are 01, as $|v^0-\check{v}^0|=1$, 
  and we use $p$ digits to code this difference ($\mathcal{P}$ being $\{1,2,11\}$, this difference can be equal to 10). We then take the $v^0=1$ first terms of $u$, each term being coded in $n=2$ digits, that is, 06. As we can iterate
  at most $\max{(\mathcal{P})}$ times, we must complete this
  value by some 0's in such a way that the obtained result
  has $n\times \max{(\mathcal{P})}=22$ digits, that is: 
  0600000000000000000000. Similarly, the $\check{v}^0=2$ first
-terms in $\check{u}$ are represented by 0604000000000000000000, and the absolute value of their
-difference is equal to 0004000000000000000000. These digits are concatenated to 01, and
+terms in $\check{u}$ are represented by 0604000000000000000000, and the value of their
+digit per digit absolute difference is equal to 0004000000000000000000. These digits are concatenated to 01, and
  we start again with the remainder of the sequences.
  \end{xpl}
  
  
  \begin{xpl}
-Consider now that $\mathsf{N}=9$, and $\mathcal{P}=\{2,7\}$, and that
+Consider now that $\mathsf{N}=9$ ($n=1$), $\mathcal{P}=\{2,7\}$ ($\mathsf{p}=2, p=1$), and that
  
  $s=\left\{
  \begin{array}{l}
  u=\underline{6,7,} ~ \underline{4,2,} ...\\
  v=2,2,...
  \end{array}
-\right.$
+\right.$\\
  while
  $\check{s}=\left\{
  \begin{array}{l}
@@ -301,8 +312,10 @@ $\check{s}=\left\{
  \end{array}
  \right.$
  
-So $d_{\mathds{S}_{\mathsf{N},\mathcal{P}}}(s,\check{s}) = 0.5173633305600000...$, as $|v^0-\check{v}^0|=5$, $|4963667-6700000| = 1736333$, $|v^1-\check{v}^1|=0$,
-and $|9800000-4200000| = 5600000$.
+So: 
+$d_{\mathds{S}_{\mathsf{N},\mathcal{P}}}(s,\check{s}) = 0.5~2263667~1~5600000...$. 
+%as $|v^0-\check{v}^0|=5$, $|4963667-6700000| = 1736333$, $|v^1-\check{v}^1|=0$,
+%and $|9800000-4200000| = 5600000$.
  \end{xpl}
  
  
@@ -342,7 +355,7 @@ too, thus $d$ will also be a distance, being the sum of two distances.
  \item Obviously, $d_{\mathds{S}_{\mathsf{N},\mathcal{P}}}(s,\check{s})\geqslant 0$, and if $s=\check{s}$, then 
  $d_{\mathds{S}_{\mathsf{N},\mathcal{P}}}(s,\check{s})=0$. Conversely, if $d_{\mathds{S}_{\mathsf{N},\mathcal{P}}}(s,\check{s})=0$, then 
  $\forall k \in \mathds{N}, v^k=\check{v}^k$ due to the 
-definition of $d$. Then, as digits between positions $p+1$ and $p+n$ are null and correspond to $|u^0-\check{u}^0|$, we can conclude that $u^0=\check{u}^0$. An extension of this result to the whole first $n \times \max{(\mathcal{P})}$ bloc leads to $u^i=\check{u}^i$, $\forall i \leqslant v^0=\check{v}^0$, and by checking all the $n \times \max{(\mathcal{P})}$ blocs, $u=\check{u}$.
+definition of $d$. Then, as digits between positions $p+1$ and $p+n$ are null and correspond to $|u^0-\check{u}^0|$, we can conclude that $u^0=\check{u}^0$. An extension of this result to the whole first $n \times \max{(\mathcal{P})}$ blocs leads to $u^i=\check{u}^i$, $\forall i \leqslant v^0=\check{v}^0$, and by checking all the $n \times \max{(\mathcal{P})}$ blocs, $u=\check{u}$.
   \item $d_{\mathds{S}_{\mathsf{N},\mathcal{P}}}$ is clearly symmetric 
  ($d_{\mathds{S}_{\mathsf{N},\mathcal{P}}}(s,\check{s})=d_{\mathds{S}_{\mathsf{N},\mathcal{P}}}(\check{s},s)$). 
  \item The triangle inequality is obtained because the absolute value satisfies it too.
@@ -403,7 +416,8 @@ $y=F_{f,p_i} (x, (u_0, \hdots, u_{p_i-1})) $.
  \end{itemize}
  
  It is not hard to see that the graph $\Gamma_{\{1\}}(f)$ is 
-$\Gamma(f)$.
+$\Gamma(f)$ formerly introduced in~\cite{bcgr11:ip} for the $\textit{CIPRNG}_f^1(u)$ generator,
+which is indeed $\textit{CIPRNG}_f^2(u,(1)_{n \in \mathds{N}})$.
  
  \begin{figure}[ht]
    \centering
@@ -433,10 +447,10 @@ Consider for instance $\mathsf{N}=2$,
  Let $f_0:\mathds{B}^2 \longrightarrow \mathds{B}^2$ be the negation function,
  \textit{i.e.}, $f_0(x_1,x_2) = (\overline{x_1}, \overline{x_2})$, and consider
  $\mathcal{P}=\{2,3\}$. The graphs of iterations are given in 
-\textsc{Figure~\ref{fig:itg}}.
-The \textsc{Figure~\ref{graphe1}} shows what happens when 
+Figure~\ref{fig:itg}.
+The Figure~\ref{graphe1} shows what happens when 
  displaying each iteration result.
-On the contrary, the \textsc{Figure~\ref{graphe2}} explicits the behaviors
+On the contrary, Figure~\ref{graphe2} illustrates the behaviors
  when always applying either 2 or 3 modifications before generating results. 
  Notice that here, orientations of arcs are not necessary 
  since the function $f_0$ is equal to its inverse $f_0^{-1}$. 
@@ -539,9 +553,55 @@ and only if its iteration graph $\Gamma_{\mathcal{P}}(f)$ is strongly connected.
    and thus $\Gamma_{\{b\}}(f_0)$ is not strongly connected.
  \end{proof}
  
-The next section recalls a general scheme to produce
-functions and a iteration number $b$
-such that $\Gamma_{\{b\}}$ is strongly connected.
+
+\subsection{Comparison with other well-known generators}
+
+\begin{table}
+\centering
+  \begin{tabular}{c|ccccccc}
+  PRNG & LCG & MRG & AWC & SWB & SWC & GFSR & INV\\
+  \hline
+  NIST & 11 & 14 & 15 & 15 & 14 & 14 & 14\\
+  DieHARD & 16 & 16 & 15 & 16 &18 & 16 & 16
+  \end{tabular}
+  \caption{Statistical evaluation of known PRNGs: number of succeeded tests}
+  \label{table:comparisonWithout}
+\end{table}
+
+\begin{table}
+\centering
+  \begin{tabular}{c|ccccccc}
+  PRNG & LCG & MRG & AWC & SWB & SWC & GFSR & INV\\
+  \hline
+  NIST & 15 & 15 & 15 & 15 & 15 & 15 & 15\\
+  DieHARD & 18 & 18 & 18 & 18 & 18 & 18 & 18
+  \end{tabular}
+  \caption{Statistical effects of CIPRNG on the succeeded tests}
+  \label{table:comparisonWith}
+\end{table}
+The objective of this section is to evaluate the statistical performance of the 
+proposed CIPRNG method, by comparing the effects of its application on well-known
+but defective generators. We considered during experiments the following PRNGs:
+linear congruential generator (LCG), multiple recursive generators (MRG)
+add-with-carry (AWC), subtract-with-borrow (SWB), shift-with-carry (SWC)
+Generalized Feedback Shift Register (GFSR), and nonlinear inversive generator.
+A general overview and a recall of design of these famous generators 
+can be found, for instance, in the documentation of the TestU01 statistical
+battery of tests~\cite{LEcuyerS07}. For each studied generator, we have compared
+their scores according to both NIST~\cite{Nist10} and DieHARD~\cite{Marsaglia1996}
+statistical batteries of tests, by launching them alone or inside the $\textit{CIPRNG}_f^2(v,v)$
+dynamical system, where $v$ is the considered PRNG set with most usual parameters,
+and $f$ is the vectorial negation. 
+
+Obtained results are reproduced in Tables~\ref{table:comparisonWithout} and \ref{table:comparisonWith}.
+As can be seen, all these generators considered alone failed to pass either the 15 NIST tests or the
+18 DieHARD ones, while both batteries of tests are always passed when applying the $\textit{CIPRNG}_f^2$
+post-treatment. Other results in the same direction, which can be found in~\cite{bfgw11:ip}, illustrate
+the fact that operating a provable chaotic post-treatment on defective generators tends to improve
+their statistical profile. 
+
+Such post-treatment depending on the properties of the inputted function $f$, we need to recall a general scheme to produce
+functions and an iteration number $b$ such that $\Gamma_{\{b\}}$ is strongly connected.
  
  
  %%% Local Variables: