C'est de la merde ce git

[chaos1.git] / main.tex
diff --git a/main.tex b/main.tex

index 82e99bf347b5d83e7434f69dafe7be54805e0975..49564db89def8615a0ad19f0d6b20ab7b6a7bc4f 100644 (file)
--- a/main.tex
+++ b/main.tex
@@ -88,7 +88,6 @@ far more difficult to learn.
  \maketitle
  
  \begin{quotation}
-
  Chaotic neural  networks have received a  lot of attention  due to the
  appealing   properties  of   deterministic   chaos  (unpredictability,
  sensitivity,  and so  on). However,  such networks  are  often claimed
@@ -97,7 +96,7 @@ work  a theoretical  framework based  on the  Devaney's  definition of
  chaos is  introduced.  Starting  with a relationship  between discrete
  iterations  and  Devaney's chaos,  we  firstly  show  how to  build  a
  recurrent  neural network  that is  equivalent  to a  chaotic map  and
-secondly  a way  to check  whether  an already  available network  is
+secondly  a way  to  check  whether an  already  available network  is
  chaotic  or not.  We  also study  different topological  properties of
  these  truly  chaotic neural  networks.   Finally,  we  show that  the
  learning,  with   neural  networks  having   a  classical  feedforward
@@ -110,7 +109,7 @@ chaotic maps, is far more difficult than non chaotic behaviors.
  \label{S1}
  
  Several research  works have proposed or used  chaotic neural networks
-these last  years.  The  complex dynamics of  such networks  lead to
+these  last years.   The complex  dynamics  of such  networks lead  to
  various       potential      application       areas:      associative
  memories~\cite{Crook2007267}  and  digital  security tools  like  hash
  functions~\cite{Xiao10},                                       digital
@@ -136,8 +135,8 @@ are  suitable to model  nonlinear relationships  between data,  due to
  their            universal            approximator            capacity
  \cite{Cybenko89,DBLP:journals/nn/HornikSW89}.   Thus,   this  kind  of
  networks can  be trained  to model a  physical phenomenon known  to be
-chaotic such as Chua's circuit \cite{dalkiran10}.  Sometime a neural
-network,  which is build  by combining  transfer functions  and initial
+chaotic such  as Chua's circuit \cite{dalkiran10}.   Sometime a neural
+network, which  is build by  combining transfer functions  and initial
  conditions  that are  both chaotic,  is itself  claimed to  be chaotic
  \cite{springerlink:10.1007/s00521-010-0432-2}.
  
@@ -151,10 +150,10 @@ a  dynamical  system:  ergodicity,   expansivity,  and  so  on.   More
  precisely, in  this paper,  which is an  extension of a  previous work
  \cite{bgs11:ip},   we  establish   the  equivalence   between  chaotic
  iterations  and  a  class  of  globally  recurrent  MLP.   The  second
-contribution is a  study of the converse problem,  indeed we investigate the
-ability  of classical  multiLayer  perceptrons to  learn a  particular
+contribution is a study of the converse problem, indeed we investigate
+the ability of classical  multilayer perceptrons to learn a particular
  family of discrete chaotic  dynamical systems.  This family is defined
-by a Boolean vector, an update function, and a sequence defining which
+by a Boolean vector, an update function, and a sequence defining the
  component  to  update  at  each  iteration.  It  has  been  previously
  established that  such dynamical systems are  chaotically iterated (as
  it  is defined by  Devaney) when  the chosen  function has  a strongly
@@ -234,14 +233,15 @@ point dependence on initial conditions.
  neighborhood of its future evolution eventually overlap with any other
  given region.  This property implies that a dynamical system cannot be
  broken into simpler subsystems.   Intuitively, its complexity does not
-allow any  simplification.  On the  contrary, a dense set  of periodic
-points is an element of regularity that a chaotic dynamical system has
-to exhibit.
+allow any  simplification.  
  
  However, chaos needs some regularity to ``counteracts'' the effects of
-transitivity.
+transitivity. % Thus, two points close to each other can behave in a completely different manner, the first one visiting the whole space whereas the second one has a regular orbit.
+In the Devaney's formulation, a dense set  of periodic
+points is the element of regularity that a chaotic dynamical system has
+to exhibit.
  %\begin{definition} \label{def3}
-We recall that a  point $x$ is  {\bf periodic point} for $f$ of 
+We recall that a  point $x$ is a {\bf periodic point} for $f$ of 
  period~$n \in \mathds{N}^{\ast}$ if $f^{n}(x)=x$.
  %\end{definition}
  Then, the map 
@@ -261,12 +261,12 @@ Let  us recall  that  $f$  has {\bf  sensitive  dependence on  initial
  $n  > 0$ such  that $d\left(f^{n}(x),  f^{n}(y)\right) >\delta  $. The
  value $\delta$ is called the {\bf constant of sensitivity} of $f$.
  
-Finally,  The  dynamical system  that  iterates  $f$  is {\bf  chaotic
+Finally,  the  dynamical system  that  iterates  $f$  is {\bf  chaotic
    according  to Devaney}  on $(\mathcal{X},\tau)$  if $f$  is regular,
  topologically transitive, and has  sensitive dependence to its initial
  conditions.   In  what follows,  iterations  are  said  to be  chaotic
-according  Devaney  when  corresponding  dynamical system  is  chaotic
-according Devaney.
+(according  to Devaney)  when  the corresponding  dynamical system  is
+chaotic, as it is defined in the Devaney's formulation.
  
  %Let us notice that for a  metric space the last condition follows from
  %the two first ones~\cite{Banks92}.
@@ -311,15 +311,15 @@ $i$ from $x$ to $N(i,x)$ if and only if $f_i(x)$ is $N(i,x)$.
  In the  sequel, the  {\it strategy} $S=(S^{t})^{t  \in \Nats}$  is the
  sequence defining  which component to  update at time $t$  and $S^{t}$
  denotes its $t-$th term.  This iteration scheme that only modifies one
-element at each iteration is classically referred as {\it asynchronous
+element at each iteration is usually referred as {\it asynchronous
    iterations}.  More precisely, we have for any $i$, $1\le i \le n$,
  \begin{equation}
  \left\{ \begin{array}{l}
  x^{0}  \in \Bool^n \\ 
  x^{t+1}_i = \left\{ 
  \begin{array}{l}
-  f_i(x^t) \textrm{ if $S^t = i$} \\
-  x_i^t \textrm{ otherwise} 
+  f_i(x^t) \textrm{ if $S^t = i$}\enspace , \\
+  x_i^t \textrm{ otherwise}\enspace . 
   \end{array}
  \right.
  \end{array} \right.
@@ -373,15 +373,17 @@ X^{k+1}& = & G_{f}(X^{k})\\
  \right.
  \label{eq:Gf}
  \end{equation}
-where  $\sigma$ is the  function that  removes the  first term  of the
-strategy  ({\it  i.e.},~$S^0$).    This  definition  allows  to  links
-asynchronous  iterations  with  classical  iterations of  a  dynamical
+where $\sigma$ is the so-called  shift function that removes the first
+term of  the strategy ({\it i.e.},~$S^0$).  This  definition allows to
+link asynchronous iterations with  classical iterations of a dynamical
  system. Note that it can be extended by considering subsets for $S^t$.
  
  To study topological properties of  these iterations, we are then left
  to  introduce a  {\bf distance}  $d$  between two  points $(S,x)$  and
-$(\check{S},\check{x})\in \mathcal{X} = \llbracket1;n\rrbracket^\Nats
-\times \Bool^{n}$. It is defined by
+$(\check{S},\check{x})$           in           $\mathcal{X}          =
+\llbracket1;n\rrbracket^\Nats \times \Bool^{n}$. Let $\Delta(x,y) = 0$
+if   $x=y$,  and   $\Delta(x,y)  =   1$   else,  be   a  distance   on
+$\mathds{B}$. The distance $d$ is defined by
  \begin{equation}
  d((S,x);(\check{S},\check{x}))=d_{e}(x,\check{x})+d_{s}(S,\check{S})
  \enspace ,
@@ -406,10 +408,10 @@ these  requirements: on  the one  hand  its floor  value reflects  the
  difference between  the cells, on  the other hand its  fractional part
  measures the difference between the strategies.
  
-The relation  between $\Gamma(f)$ and  $G_f$ is clear: there  exists a
+The relation  between $\Gamma(f)$ and  $G_f$ is obvious: there  exists a
  path from  $x$ to $x'$  in $\Gamma(f)$ if  and only if there  exists a
  strategy  $s$ such  that iterations  of $G_f$  from the  initial point
-$(s,x)$   reaches   the   configuration   $x'$.   Using   this   link,
+$(s,x)$   reach   the   configuration   $x'$.   Using   this   link,
  Guyeux~\cite{GuyeuxThese10} has proven that,
  \begin{theorem}%[Characterization of $\mathcal{C}$]
  \label{Th:Caracterisation   des   IC   chaotiques}  
@@ -424,7 +426,7 @@ case, iterations of the function $G_f$ as defined in Eq.~(\ref{eq:Gf})
  are chaotic according to Devaney.
  
  
-Let   us  then   define  two   function  $f_0$   and  $f_1$   both  in
+Let   us  then   define  two   functions  $f_0$   and  $f_1$   both  in
  $\Bool^n\to\Bool^n$ that are used all along this paper.  The former is
  the   vectorial  negation,   \textit{i.e.},  $f_{0}(x_{1},\dots,x_{n})
  =(\overline{x_{1}},\dots,\overline{x_{n}})$.      The     latter    is
@@ -460,9 +462,9 @@ we obtain  a global recurrent  neural network that behaves  as follows
  \item When  the network  is activated at  the $t^{th}$  iteration, the
    state of the system $x^t  \in \mathds{B}^n$ received from the output
    layer and  the initial  term of the  sequence $(S^t)^{t  \in \Nats}$
-  ($S^0  \in \llbracket 1;n\rrbracket$)  are used  to compute  the new
-  output vector.  This  new vector, which represents the  new state of
-  the dynamical system, satisfies:
+  (\textit{i.e.},  $S^0  \in \llbracket  1;n\rrbracket$)  are used  to
+  compute the  new output vector.   This new vector,  which represents
+  the new state of the dynamical system, satisfies:
    \begin{equation}
      x^{t+1}=F_{f_0}(S^0, x^t) \in \mathds{B}^n \enspace .
    \end{equation}
@@ -477,20 +479,17 @@ we obtain  a global recurrent  neural network that behaves  as follows
  
  The behavior of the neural network is such that when the initial state
  is  $x^0~\in~\mathds{B}^n$ and  a  sequence $(S^t)^{t  \in \Nats}$  is
-given  as outside  input,
-\JFC{en dire davantage sur l'outside world} %% TO BE UPDATED
-then  the sequence  of  successive published
+given  as outside  input, then  the sequence  of  successive published
  output vectors $\left(x^t\right)^{t \in \mathds{N}^{\ast}}$ is exactly
  the  one produced  by  the chaotic  iterations  formally described  in
-Eq.~(\ref{eq:Gf}).  It  means that  mathematically if we  use similar
+Eq.~(\ref{eq:Gf}).   It means  that mathematically  if we  use similar
  input  vectors   they  both  generate  the   same  successive  outputs
  $\left(x^t\right)^{t \in \mathds{N}^{\ast}}$,  and therefore that they
  are  equivalent  reformulations  of  the iterations  of  $G_{f_0}$  in
  $\mathcal{X}$. Finally, since the  proposed neural network is built to
-model  the  behavior  of  $G_{f_0}$,  whose iterations are
-  chaotic  according  to
-Devaney's definition  of chaos,  we can conclude  that the  network is
-also chaotic in this sense.
+model  the  behavior  of   $G_{f_0}$,  whose  iterations  are  chaotic
+according to the  Devaney's definition of chaos, we  can conclude that
+the network is also chaotic in this sense.
  
  The previous construction scheme  is not restricted to function $f_0$.
  It can  be extended to any function  $f$ such that $G_f$  is a chaotic
@@ -527,7 +526,6 @@ to an input one.
    compute the new output one $\left(x^{t+1}_1,\dots,x^{t+1}_n\right)$.
    While  the remaining  input receives  a new  integer value  $S^t \in
    \llbracket1;n\rrbracket$, which is provided by the outside world.
-\JFC{en dire davantage sur l'outside world}
  \end{itemize}
  
  The topological  behavior of these  particular neural networks  can be
@@ -558,7 +556,7 @@ condition  $\left(S,(x_1^0,\dots,  x_n^0)\right)  \in  \llbracket  1;n
  \rrbracket^{\mathds{N}}  \times \mathds{B}^n$.
  Theoretically  speaking, such iterations  of $F_f$  are thus  a formal
  model of these kind of recurrent  neural networks. In the rest of this
-paper,  we will  call such  multilayer perceptrons  CI-MLP($f$), which
+paper, we will call such multilayer perceptrons ``CI-MLP($f$)'', which
  stands for ``Chaotic Iterations based MultiLayer Perceptron''.
  
  Checking  if CI-MLP($f$)  behaves chaotically  according  to Devaney's
@@ -640,7 +638,7 @@ of the output space can be discarded when studying CI-MLPs: this space
  is  intrinsically   complicated  and   it  cannot  be   decomposed  or
  simplified.
  
-Furthermore, those  recurrent neural networks  exhibit the instability
+Furthermore, these  recurrent neural networks  exhibit the instability
  property:
  \begin{definition}
  A dynamical  system $\left( \mathcal{X}, f\right)$ is {\bf unstable}
@@ -701,7 +699,7 @@ Figure~\ref{Fig:scheme} is a summary  of addressed neural networks and
  chaos  problems.   In  Section~\ref{S2}   we  have  explained  how  to
  construct    a    truly    chaotic    neural   networks,    $A$    for
  instance. Section~\ref{S3} has shown how  to check whether a given MLP
-$A$ or $C$ is chaotic or not  in the sens of Devaney, and how to study
+$A$ or $C$ is chaotic or not in the sense of Devaney, and how to study
  its  topological  behavior.   The  last  thing  to  investigate,  when
  comparing neural networks and Devaney's chaos, is to determine whether
  an  artificial neural network  $C$ is  able to  learn or  predict some
@@ -720,7 +718,7 @@ extraction takes place once  at destination.  The reverse ({\it i.e.},
  automatically detecting the presence  of hidden messages inside media)
  is  called   steganalysis.   Among  the   deployed  strategies  inside
  detectors,          there         are          support         vectors
-machines~\cite{Qiao:2009:SM:1704555.1704664},                    neural
+machines~\cite{Qiao:2009:SM:1704555.1704664},                   neural
  networks~\cite{10.1109/ICME.2003.1221665,10.1109/CIMSiM.2010.36},  and
  Markov   chains~\cite{Sullivan06steganalysisfor}.    Most   of   these
  detectors  give quite  good results  and are  rather  competitive when
@@ -827,7 +825,7 @@ $3$~terms we obtain 2304~input-output pairs.
  \label{section:experiments}
  
  To study  if chaotic iterations can  be predicted, we  choose to train
-the multiLayer perceptron.  As stated  before, this kind of network is
+the multilayer perceptron.  As stated  before, this kind of network is
  in  particular  well-known for  its  universal approximation  property
  \cite{Cybenko89,DBLP:journals/nn/HornikSW89}.  Furthermore,  MLPs have
  been  already  considered for  chaotic  time  series prediction.   For
@@ -861,10 +859,10 @@ trainings of two data sets, one of them describing chaotic iterations,
  are compared.
  
  Thereafter we give,  for the different learning setups  and data sets,
-the mean prediction success rate obtained for each output. A such rate
-represent the  percentage of input-output pairs belonging  to the test
+the mean prediction success rate obtained for each output. Such a rate
+represents the percentage of  input-output pairs belonging to the test
  subset  for  which  the   corresponding  output  value  was  correctly
-predicted.  These values  are computed  considering  10~trainings with
+predicted.   These values are  computed considering  10~trainings with
  random  subsets  construction,   weights  and  biases  initialization.
  Firstly, neural networks having  10 and 25~hidden neurons are trained,
  with   a  maximum   number  of   epochs  that   takes  its   value  in
@@ -880,7 +878,7 @@ hidden layer up to 40~neurons and we consider larger number of epochs.
  \centering {\small
  \begin{tabular}{|c|c||c|c|c|}
  \hline 
-\multicolumn{5}{|c|}{Networks topology: 6~inputs, 5~outputs and one hidden layer} \\
+\multicolumn{5}{|c|}{Networks topology: 6~inputs, 5~outputs, and one hidden layer} \\
  \hline
  \hline
  \multicolumn{2}{|c||}{Hidden neurons} & \multicolumn{3}{c|}{10 neurons} \\
@@ -932,32 +930,39 @@ is observed (from 36.10\% for 10~neurons and 125~epochs to 70.97\% for
  25~neurons  and  500~epochs). We  also  notice  that  the learning  of
  outputs~(2)   and~(3)  is   more  difficult.    Conversely,   for  the
  non-chaotic  case the  simplest training  setup is  enough  to predict
-configurations.  For all those  feedforward network topologies and all
+configurations.  For all these  feedforward network topologies and all
  outputs the  obtained results for the non-chaotic  case outperform the
  chaotic  ones. Finally,  the rates  for the  strategies show  that the
-different networks are unable to learn them.
-
-\newpage
+different feedforward networks are unable to learn them.
  
  For  the  second  coding   scheme  (\textit{i.e.},  with  Gray  Codes)
  Table~\ref{tab2} shows  that any network learns about  five times more
-non-chaotic  configurations  than chaotic  ones.  As  in the  previous
-scheme, the strategies cannot be predicted.
+non-chaotic  configurations than  chaotic  ones.  As  in the  previous
+scheme,       the      strategies      cannot       be      predicted.
+Figures~\ref{Fig:chaotic_predictions}                              and
+\ref{Fig:non-chaotic_predictions} present the predictions given by two
+feedforward multilayer  perceptrons that were  respectively trained to
+learn chaotic  and non-chaotic data,  using the second  coding scheme.
+Each figure  shows for  each sample of  the test  subset (577~samples,
+representing 25\%  of the 2304~samples) the  configuration that should
+have been predicted and the one given by the multilayer perceptron. It
+can be  seen that for  the chaotic data  the predictions are  far away
+from the  expected configurations.  Obviously,  the better predictions
+for the non-chaotic data reflect their regularity.
  
  Let us now compare the  two coding schemes. Firstly, the second scheme
  disturbs  the   learning  process.   In   fact  in  this   scheme  the
  configuration is always expressed as  a natural number, whereas in the
-first one  the number  of inputs follows  the increase of  the boolean
+first one  the number  of inputs follows  the increase of  the Boolean
  vectors coding configurations. In this latter case, the coding gives a
  finer information on configuration evolution.
-\JFC{Je n'ai pas compris le paragraphe precedent. Devrait être repris}
  \begin{table}[b]
  \caption{Prediction success rates for configurations expressed with Gray code}
  \label{tab2}
  \centering
  \begin{tabular}{|c|c||c|c|c|}
  \hline 
-\multicolumn{5}{|c|}{Networks topology: 3~inputs, 2~outputs and one hidden layer} \\
+\multicolumn{5}{|c|}{Networks topology: 3~inputs, 2~outputs, and one hidden layer} \\
  \hline
  \hline
  & Hidden neurons & \multicolumn{3}{c|}{10 neurons} \\
@@ -984,12 +989,26 @@ finer information on configuration evolution.
  \end{tabular}
  \end{table}
  
+\begin{figure}
+  \centering
+  \includegraphics[scale=0.5]{chaotic_trace2}
+  \caption {Second coding scheme - Predictions obtained for a chaotic test subset.}
+  \label{Fig:chaotic_predictions}
+\end{figure}
+
+\begin{figure}
+  \centering
+  \includegraphics[scale=0.5]{non-chaotic_trace2} 
+  \caption{Second coding scheme - Predictions obtained for a non-chaotic test subset.}
+  \label{Fig:non-chaotic_predictions}
+\end{figure}
+
  Unfortunately, in  practical applications the number  of components is
  usually  unknown.   Hence, the  first  coding  scheme  cannot be  used
  systematically.   Therefore, we  provide  a refinement  of the  second
  scheme: each  output is learned  by a different  ANN. Table~\ref{tab3}
  presents the  results for  this approach.  In  any case,  whatever the
-considered  feedforward network topologies,  the maximum  epoch number
+considered feedforward  network topologies, the  maximum epoch number,
  and the kind of iterations, the configuration success rate is slightly
  improved.   Moreover, the  strategies predictions  rates  reach almost
  12\%, whereas in Table~\ref{tab2} they never exceed 1.5\%.  Despite of
@@ -1002,7 +1021,7 @@ appear to be an open issue.
  \centering
  \begin{tabular}{|c||c|c|c|}
  \hline 
-\multicolumn{4}{|c|}{Networks topology: 3~inputs, 1~output and one hidden layer} \\
+\multicolumn{4}{|c|}{Networks topology: 3~inputs, 1~output, and one hidden layer} \\
  \hline
  \hline
  Epochs & 125 & 250 & 500 \\ 
@@ -1064,10 +1083,6 @@ Chaotic/non chaotic & \multicolumn{3}{c|}{Output = Strategy} \\
  \end{tabular}
  \end{table}
  
-%
-% TO BE COMPLETED
-%
-
  \section{Conclusion}
  
  In  this paper,  we have  established an  equivalence  between chaotic
@@ -1101,12 +1116,10 @@ be investigated  too, to  discover which tools  are the  most relevant
  when facing a truly chaotic phenomenon.  A comparison between learning
  rate  success  and  prediction  quality will  be  realized.   Concrete
  consequences in biology, physics, and computer science security fields
-will be  stated.
+will then be stated.
  
  % \appendix{}
  
-
-
  % \begin{definition} \label{def2}
  % A continuous function $f$  on a topological space $(\mathcal{X},\tau)$
  % is defined  to be  {\emph{topologically transitive}}  if for any  pair of
@@ -1117,7 +1130,6 @@ will be  stated.
  % $f^k(U) \cap V \neq \emptyset$.
  % \end{definition}
  
-
  \bibliography{chaos-paper}% Produces the bibliography via BibTeX.
  
  \end{document}