chaosANN.tex

   1 % \section{Introduction}
   2 % \label{S1}
   3
   4 Les réseaux de neurones chaotiques ont été étudiés à de maintes reprises
   5 par le passé en raison notamment de leurs applications potentielles:
   6 %les mémoires associatives~\cite{Crook2007267}
   7 les  composants utils à la  sécurité comme les fonctions de
   8 hachage~\cite{Xiao10},
   9 le tatouage numérique~\cite{1309431,Zhang2005759}
  10 ou les schémas de chiffrement~\cite{Lian20091296}.
  11 Dans tous ces cas, l'emploi de fonctions chaotiques est motivé par
  12 leur comportement imprévisibile et proche de l'aléa.
  13
  14
  15 Les réseaux de neurones chaotiques peuvent être conçus selon plusieurs
  16 principes. Des neurones modifiant leur état en suivant une fonction non
  17 linéaire son par exemple appelés neurones chaotiques~\cite{Crook2007267}.
  18 L'architecture de réseaux de neurones de type Perceptron multi-couches
  19 (MLP) n'iterent quant à eux, pas nécesssairement de fonctions chaotiques.
  20 Il a cependant été démontré  que ce sont des approximateurs
  21 universels~\cite{Cybenko89,DBLP:journals/nn/HornikSW89}.
  22 Ils permettent, dans certains cas, de simuler des comportements
  23 physiques chaotiques comme le  circuit de Chua~\cite{dalkiran10}.
  24 Parfois~\cite{springerlink:10.1007/s00521-010-0432-2},
  25 la fonction de transfert de cette famille de réseau celle
  26 d'initialisation sont toutes les deux définies à l'aide de
  27 fonctions chaotiques.
  28
  29
  30
  31
  32
  33 Ces réseaux de neurones partagent le fait qu'ils sont qualifiés de
  34 ``chaotiques'' sous prétexte qu'ils embarquent une fonction de ce type
  35 et ce sans aucune preuve rigoureuse. Ce chapitre caractérise la
  36 classe des réseaux de neurones MLP chaotiques. Il
  37 s'intéresse ensuite à l'étude de prévisibilité de systèmes dynamiques
  38 discrets chaotiques par cette famille de MLP.
  39
  40 \JFC{revoir plan}
  41
  42 The remainder of this research  work is organized as follows. The next
  43 section is devoted to the basics of Devaney's chaos.  Section~\ref{S2}
  44 formally  describes  how  to  build  a neural  network  that  operates
  45 chaotically.  Section~\ref{S3} is devoted to the dual case of checking
  46 whether  an existing neural  network is  chaotic or  not.  Topological
  47 properties of chaotic neural networks are discussed in Sect.~\ref{S4}.
  48 The  Section~\ref{section:translation}  shows  how to  translate  such
  49 iterations  into  an Artificial  Neural  Network  (ANN),  in order  to
  50 evaluate the  capability for this  latter to learn  chaotic behaviors.
  51 This  ability  is  studied in  Sect.~\ref{section:experiments},  where
  52 various ANNs try to learn two  sets of data: the first one is obtained
  53 by chaotic iterations while the  second one results from a non-chaotic
  54 system.  Prediction success rates are  given and discussed for the two
  55 sets.  The paper ends with a conclusion section where our contribution
  56 is summed up and intended future work is exposed.
  57
  58
  59 \section{Un réseau de neurones chaotique au sens de Devaney}
  60 \label{S2}
  61
  62 On considère une fonction
  63 $f:\Bool^n\to\Bool^n$ telle que  $\textsc{giu}(f)$ est fortement connexe.
  64 Ainsi $G_{f_u}$ est chaotique d'après le théorème~\ref{Th:CaracIC}.
  65
  66 On considère ici le schéma unaire défini par l'équation (\ref{eq:asyn}).
  67 On construit un perceptron multi-couche associé à la fonction
  68 $F_{f_u}$.
  69 Plus précisément, pour chaque entrée
  70  $(x,s)   \in  \mathds{B}^n \times [n]$,
  71 la couche de sortie doit générer $F_{f_u}(x,s)$.
  72 On peut ainsi lier la couche de sortie avec celle d'entrée pour représenter
  73 les dépendance entre deux itérations successives.
  74 On obtient une réseau de neurones dont le comportement est le suivant
  75 (voir Figure.~\ref{Fig:perceptron}):
  76
  77 \begin{itemize}
  78 \item   Le réseau est initialisé avec le vecteur d'entrée
  79   $\left(x^0,S^0\right)  \mathds{B}^n \times [n]$
  80   et calcule le  vecteur de sortie
  81   $x^1=F_{f_u}\left(x^0,S^0\right)$.
  82   Ce vecteur est publié comme une sortie et est aussi retournée sur la couche d'entrée
  83   à travers les liens de retours.
  84 \item Lorsque le réseau est  activé à la $t^{th}$  itération, l'état du
  85   système $x^t  \in \mathds{B}^n$ reçu depuis la couche de sortie ainsi que le
  86   premier terme de la  sequence $(S^t)^{t  \in \Nats}$
  87   (\textit{i.e.},  $S^0  \in [n]$)  servent à construire le nouveau vecteur de sortie.
  88   Ce nouveau vecteur, qui représente le nouvel état du système dynamique, satisfait:
  89   \begin{equation}
  90     x^{t+1}=F_{f_u}(x^t,S^0) \in \mathds{B}^n \enspace .
  91   \end{equation}
  92 \end{itemize}
  93
  94 \begin{figure}
  95   \centering
  96   \includegraphics[scale=0.625]{images/perceptron}
  97   \caption{Un perceptron équivalent  aux itérations unitaires}
  98   \label{Fig:perceptron}
  99 \end{figure}
 100
 101 Le comportement de ce réseau de neurones est tel que lorsque l'état
 102 initial est composé de $x^0~\in~\mathds{B}^n$ et d'une séquence
 103  $(S^t)^{t  \in \Nats}$, alors la séquence  contenant les vecteurs successifs
 104 publiés $\left(x^t\right)^{t \in \mathds{N}^{\ast}}$ est exactement celle produite
 105 par les itérations unaires décrites à la section~\ref{sec:TIPE12}.
 106 Mathématiquement, cela signifie que si on utilise les mêmes vecteurs d'entrées
 107 les deux approches génèrent successivement les mêmes sorties.
 108 En d'autres termes ce réseau de neurones modélise le comportement de
 109 $G_{f_u}$,  dont les itérations sont chaotiques sur $\mathcal{X}_u$.
 110 On peut donc le  qualifier de chaotique au sens de Devaney.
 111
 112 \section{Vérifier si un réseau de neurones est chaotique}
 113 \label{S3}
 114 On s'intéresse maintenant au cas où l'on dispose d'un
 115 réseau de neurones de type perceptron multi-couches
 116 dont on cherche à savoir s'il est chaotique (parce qu'il a par exemple été
 117 déclaré comme tel) au sens de Devaney.
 118 On considère de plus que sa topologie est la suivante:
 119 l'entrée est constituée de  $n$ bits et un entier, la sortie est constituée de $n$ bits
 120 et chaque sortie est liée à une entrée par une boucle.
 121
 122 \begin{itemize}
 123 \item Le réseau est initialisé avec  $n$~bits
 124    $\left(x^0_1,\dots,x^0_n\right)$ et une valeur entière $S^0 \in [n]$.
 125 \item  A l'itération~$t$,     le vecteur
 126   $\left(x^t_1,\dots,x^t_n\right)$  permet de construire les $n$~bits
 127   servant de sortie $\left(x^{t+1}_1,\dots,x^{t+1}_n\right)$.
 128 \end{itemize}
 129
 130 Le comportement de ce type de réseau de neurones peut être prouvé comme
 131 étant chaotique en suivant la démarche énoncée maintenant.
 132 On nomme tout d'abord $F:    \mathds{B}^n  \times [n] \rightarrow
 133 \mathds{B}^n$     la      fonction qui associe
 134 au vecteur
 135 $\left(\left(x_1,\dots,x_n\right),s\right)    \in   \mathds{B}^n \times[n]$
 136 le vecteur
 137 $\left(y_1,\dots,y_n\right)       \in       \mathds{B}^n$, où
 138 $\left(y_1,\dots,y_n\right)$  sont les sorties du réseau neuronal
 139 àaprès l'initialisation de la couche d'entrée avec
 140 $\left(s,\left(x_1,\dots, x_n\right)\right)$.  Ensuite, on définie $f:
 141 \mathds{B}^n       \rightarrow      \mathds{B}^n$  telle que
 142 $f\left(x_1,x_2,\dots,x_n\right)$ est égal à
 143 \begin{equation}
 144 \left(F\left(\left(x_1,x_2,\dots,x_n\right),1\right),\dots,
 145   F\left(\left(x_1,x_2,\dots,x_n\right)\right),n\right) \enspace .
 146 \end{equation}
 147 Ainsi pour chaque $j$, $1 \le j \le n$, on a
 148 $f_j\left(x_1,x_2,\dots,x_n\right) =
 149 F\left(\left(x_1,x_2,\dots,x_n\right),j\right)$.
 150 Si ce réseau de neurones est initialisé avec
 151 $\left(x_1^0,\dots,x_n^0\right)$   et $S   \in    [n]^{\mathds{N}}$,
 152 il produit exactement les même sorties que les itérations de  $F_{f_u}$ avec une
 153 condition initiale $\left((x_1^0,\dots,  x_n^0),S\right)  \in  \mathds{B}^n \times [n]^{\mathds{N}}$.
 154 Les itérations de $F_{f_u}$
 155 sont donc un modèle formel de ce genre de réseau de neurones.
 156 Pour vérifier s'il est chaotique, il suffit ainsi
 157 de vérifier si le graphe d'itérations
 158 $\textsc{giu}(f)$ est fortement connexe.
 159
 160
 161 \section{Suitability of Feedforward Neural Networks
 162 for Predicting Chaotic and Non-chaotic Behaviors}
 163
 164 In  the context  of computer  science  different topic  areas have  an
 165 interest       in       chaos,       as       for       steganographic
 166 techniques~\cite{1309431,Zhang2005759}.    Steganography  consists  in
 167 embedding a  secret message within  an ordinary one, while  the secret
 168 extraction takes place once  at destination.  The reverse ({\it i.e.},
 169 automatically detecting the presence  of hidden messages inside media)
 170 is  called   steganalysis.   Among  the   deployed  strategies  inside
 171 detectors,          there         are          support         vectors
 172 machines~\cite{Qiao:2009:SM:1704555.1704664},                   neural
 173 networks~\cite{10.1109/ICME.2003.1221665,10.1109/CIMSiM.2010.36},  and
 174 Markov   chains~\cite{Sullivan06steganalysisfor}.    Most   of   these
 175 detectors  give quite  good results  and are  rather  competitive when
 176 facing steganographic  tools.  However, to  the best of  our knowledge
 177 none of the considered information hiding schemes fulfills the Devaney
 178 definition  of chaos~\cite{Devaney}.  Indeed,  one can  wonder whether
 179 detectors  continue to  give good  results when  facing  truly chaotic
 180 schemes.  More  generally, there remains the open  problem of deciding
 181 whether artificial intelligence is suitable for predicting topological
 182 chaotic behaviors.
 183
 184 \subsection{Representing Chaotic Iterations for Neural Networks}
 185 \label{section:translation}
 186
 187 The  problem  of  deciding  whether  classical  feedforward  ANNs  are
 188 suitable  to approximate  topological chaotic  iterations may  then be
 189 reduced to  evaluate such neural  networks on iterations  of functions
 190 with  Strongly  Connected  Component  (SCC)~graph of  iterations.   To
 191 compare with  non-chaotic iterations, the experiments  detailed in the
 192 following  sections  are carried  out  using  both  kinds of  function
 193 (chaotic and non-chaotic). Let  us emphasize on the difference between
 194 this  kind  of  neural  networks  and  the  Chaotic  Iterations  based
 195 multilayer peceptron.
 196
 197 We are  then left to compute  two disjoint function  sets that contain
 198 either functions  with topological chaos properties  or not, depending
 199 on  the strong  connectivity of  their iterations graph.  This  can be
 200 achieved for  instance by removing a  set of edges  from the iteration
 201 graph $\Gamma(f_0)$ of the vectorial negation function~$f_0$.  One can
 202 deduce whether  a function verifies the topological  chaos property or
 203 not  by checking  the strong  connectivity of  the resulting  graph of
 204 iterations.
 205
 206 For instance let us consider  the functions $f$ and $g$ from $\Bool^4$
 207 to $\Bool^4$ respectively defined by the following lists:
 208 $$[0,  0,  2,   3,  13,  13,  6,   3,  8,  9,  10,  11,   8,  13,  14,
 209   15]$$ $$\mbox{and } [11, 14, 13, 14, 11, 10, 1, 8, 7, 6, 5, 4, 3, 2,
 210   1, 0]  \enspace.$$ In  other words,  the image of  $0011$ by  $g$ is
 211 $1110$: it  is obtained as the  binary value of the  fourth element in
 212 the  second  list  (namely~14).   It   is  not  hard  to  verify  that
 213 $\Gamma(f)$ is  not SCC  (\textit{e.g.}, $f(1111)$ is  $1111$) whereas
 214 $\Gamma(g)$ is. The  remaining of this section shows  how to translate
 215 iterations of such functions into a model amenable to be learned by an
 216 ANN.   Formally, input  and  output vectors  are pairs~$((S^t)^{t  \in
 217   \Nats},x)$          and          $\left(\sigma((S^t)^{t          \in
 218   \Nats}),F_{f}(S^0,x)\right)$ as defined in~Eq.~(\ref{eq:Gf}).
 219
 220 Firstly, let us focus on how to memorize configurations.  Two distinct
 221 translations are  proposed.  In the first  case, we take  one input in
 222 $\Bool$  per  component;  in   the  second  case,  configurations  are
 223 memorized  as   natural  numbers.    A  coarse  attempt   to  memorize
 224 configuration  as  natural  number  could  consist  in  labeling  each
 225 configuration  with  its  translation  into  decimal  numeral  system.
 226 However,  such a  representation induces  too many  changes  between a
 227 configuration  labeled by  a  power  of two  and  its direct  previous
 228 configuration: for instance, 16~(10000)  and 15~(01111) are close in a
 229 decimal ordering, but  their Hamming distance is 5.   This is why Gray
 230 codes~\cite{Gray47} have been preferred.
 231
 232 Secondly, let us detail how to deal with strategies.  Obviously, it is
 233 not possible to  translate in a finite way  an infinite strategy, even
 234 if both $(S^t)^{t \in \Nats}$ and $\sigma((S^t)^{t \in \Nats})$ belong
 235 to  $\{1,\ldots,n\}^{\Nats}$.  Input  strategies are  then  reduced to
 236 have a length of size $l \in \llbracket 2,k\rrbracket$, where $k$ is a
 237 parameter of the evaluation. Notice  that $l$ is greater than or equal
 238 to $2$ since  we do not want the shift  $\sigma$~function to return an
 239 empty strategy.  Strategies are memorized as natural numbers expressed
 240 in base  $n+1$.  At  each iteration, either  none or one  component is
 241 modified  (among the  $n$ components)  leading to  a radix  with $n+1$
 242 entries.  Finally,  we give an  other input, namely $m  \in \llbracket
 243 1,l-1\rrbracket$, which  is the  number of successive  iterations that
 244 are applied starting  from $x$.  Outputs are translated  with the same
 245 rules.
 246
 247 To address  the complexity  issue of the  problem, let us  compute the
 248 size of the data set an ANN has to deal with.  Each input vector of an
 249 input-output pair  is composed of a configuration~$x$,  an excerpt $S$
 250 of the strategy to iterate  of size $l \in \llbracket 2, k\rrbracket$,
 251 and a  number $m \in  \llbracket 1, l-1\rrbracket$ of  iterations that
 252 are executed.
 253
 254 Firstly, there are $2^n$  configurations $x$, with $n^l$ strategies of
 255 size $l$ for  each of them. Secondly, for  a given configuration there
 256 are $\omega = 1 \times n^2 +  2 \times n^3 + \ldots+ (k-1) \times n^k$
 257 ways  of writing  the pair  $(m,S)$. Furthermore,  it is  not  hard to
 258 establish that
 259 \begin{equation}
 260 \displaystyle{(n-1) \times \omega = (k-1)\times n^{k+1} - \sum_{i=2}^k n^i} \nonumber
 261 \end{equation}
 262 then
 263 \begin{equation}
 264 \omega =
 265 \dfrac{(k-1)\times n^{k+1}}{n-1} - \dfrac{n^{k+1}-n^2}{(n-1)^2} \enspace . \nonumber
 266 \end{equation}
 267 \noindent And then, finally, the number of  input-output pairs for our
 268 ANNs is
 269 $$
 270 2^n \times \left(\dfrac{(k-1)\times n^{k+1}}{n-1} - \dfrac{n^{k+1}-n^2}{(n-1)^2}\right) \enspace .
 271 $$
 272 For  instance, for $4$  binary components  and a  strategy of  at most
 273 $3$~terms we obtain 2304~input-output pairs.
 274
 275 \subsection{Experiments}
 276 \label{section:experiments}
 277
 278 To study  if chaotic iterations can  be predicted, we  choose to train
 279 the multilayer perceptron.  As stated  before, this kind of network is
 280 in  particular  well-known for  its  universal approximation  property
 281 \cite{Cybenko89,DBLP:journals/nn/HornikSW89}.  Furthermore,  MLPs have
 282 been  already  considered for  chaotic  time  series prediction.   For
 283 example,   in~\cite{dalkiran10}  the   authors  have   shown   that  a
 284 feedforward  MLP with  two hidden  layers, and  trained  with Bayesian
 285 Regulation  back-propagation, can learn  successfully the  dynamics of
 286 Chua's circuit.
 287
 288 In  these experiments  we consider  MLPs  having one  hidden layer  of
 289 sigmoidal  neurons  and  output   neurons  with  a  linear  activation
 290 function.     They    are    trained    using    the    Limited-memory
 291 Broyden-Fletcher-Goldfarb-Shanno quasi-newton algorithm in combination
 292 with the Wolfe linear search.  The training process is performed until
 293 a maximum number of epochs  is reached.  To prevent overfitting and to
 294 estimate the  generalization performance we use  holdout validation by
 295 splitting the  data set into  learning, validation, and  test subsets.
 296 These subsets  are obtained through  random selection such  that their
 297 respective size represents 65\%, 10\%, and 25\% of the whole data set.
 298
 299 Several  neural  networks  are  trained  for  both  iterations  coding
 300 schemes.   In  both  cases   iterations  have  the  following  layout:
 301 configurations of  four components and  strategies with at  most three
 302 terms. Thus, for  the first coding scheme a data  set pair is composed
 303 of 6~inputs and 5~outputs, while for the second one it is respectively
 304 3~inputs and 2~outputs. As noticed at the end of the previous section,
 305 this  leads to  data sets  that  consist of  2304~pairs. The  networks
 306 differ  in the  size of  the hidden  layer and  the maximum  number of
 307 training epochs.  We remember that  to evaluate the ability  of neural
 308 networks to  predict a  chaotic behavior for  each coding  scheme, the
 309 trainings of two data sets, one of them describing chaotic iterations,
 310 are compared.
 311
 312 Thereafter we give,  for the different learning setups  and data sets,
 313 the mean prediction success rate obtained for each output. Such a rate
 314 represents the percentage of  input-output pairs belonging to the test
 315 subset  for  which  the   corresponding  output  value  was  correctly
 316 predicted.   These values are  computed considering  10~trainings with
 317 random  subsets  construction,   weights  and  biases  initialization.
 318 Firstly, neural networks having  10 and 25~hidden neurons are trained,
 319 with   a  maximum   number  of   epochs  that   takes  its   value  in
 320 $\{125,250,500\}$  (see Tables~\ref{tab1} and  \ref{tab2}).  Secondly,
 321 we refine the second coding scheme by splitting the output vector such
 322 that   each  output   is  learned   by  a   specific   neural  network
 323 (Table~\ref{tab3}). In  this last  case, we increase  the size  of the
 324 hidden layer up to 40~neurons and we consider larger number of epochs.
 325
 326 \begin{table}[htbp!]
 327 \caption{Prediction success rates for configurations expressed as boolean vectors.}
 328 \label{tab1}
 329 \centering {\small
 330 \begin{tabular}{|c|c||c|c|c|}
 331 \hline
 332 \multicolumn{5}{|c|}{Networks topology: 6~inputs, 5~outputs, and one hidden layer} \\
 333 \hline
 334 \hline
 335 \multicolumn{2}{|c||}{Hidden neurons} & \multicolumn{3}{c|}{10 neurons} \\
 336 \cline{3-5}
 337 \multicolumn{2}{|c||}{Epochs} & 125 & 250 & 500 \\
 338 \hline
 339 \multirow{6}{*}{\rotatebox{90}{Chaotic}}&Output~(1) & 90.92\% & 91.75\% & 91.82\% \\
 340 & Output~(2) & 69.32\% & 78.46\% & 82.15\% \\
 341 & Output~(3) & 68.47\% & 78.49\% & 82.22\% \\
 342 & Output~(4) & 91.53\% & 92.37\% & 93.4\% \\
 343 & Config. & 36.10\% & 51.35\% & 56.85\% \\
 344 & Strategy~(5) & 1.91\% & 3.38\% & 2.43\% \\
 345 \hline
 346 \multirow{6}{*}{\rotatebox{90}{Non-chaotic}}&Output~(1) & 97.64\% & 98.10\% & 98.20\% \\
 347 & Output~(2) & 95.15\% & 95.39\% & 95.46\% \\
 348 & Output~(3) & 100\% & 100\% & 100\% \\
 349 & Output~(4) & 97.47\% & 97.90\% & 97.99\% \\
 350 & Config. & 90.52\% & 91.59\% & 91.73\% \\
 351 & Strategy~(5) & 3.41\% & 3.40\% & 3.47\% \\
 352 \hline
 353 \hline
 354 \multicolumn{2}{|c||}{Hidden neurons} & \multicolumn{3}{c|}{25 neurons} \\ %& \multicolumn{3}{|c|}{40 neurons} \\
 355 \cline{3-5}
 356 \multicolumn{2}{|c||}{Epochs} & 125 & 250 & 500 \\ %& 125 & 250 & 500 \\
 357 \hline
 358 \multirow{6}{*}{\rotatebox{90}{Chaotic}}&Output~(1) & 91.65\% & 92.69\% & 93.93\% \\ %& 91.94\% & 92.89\% & 94.00\% \\
 359 & Output~(2) & 72.06\% & 88.46\% & 90.5\% \\ %& 74.97\% & 89.83\% & 91.14\% \\
 360 & Output~(3) & 79.19\% & 89.83\% & 91.59\% \\ %& 76.69\% & 89.58\% & 91.84\% \\
 361 & Output~(4) & 91.61\% & 92.34\% & 93.47\% \\% & 82.77\% & 92.93\% & 93.48\% \\
 362 & Config. & 48.82\% & 67.80\% & 70.97\% \\%& 49.46\% & 68.94\% & 71.11\% \\
 363 & Strategy~(5) & 2.62\% & 3.43\% & 3.78\% \\% & 3.10\% & 3.10\% & 3.03\% \\
 364 \hline
 365 \multirow{6}{*}{\rotatebox{90}{Non-chaotic}}&Output~(1) & 97.87\% & 97.99\% & 98.03\% \\ %& 98.16\% \\
 366 & Output~(2) & 95.46\% & 95.84\% & 96.75\% \\ % & 97.4\% \\
 367 & Output~(3) & 100\% & 100\% & 100\% \\%& 100\% \\
 368 & Output~(4) & 97.77\% & 97.82\% & 98.06\% \\%& 98.31\% \\
 369 & Config. & 91.36\% & 91.99\% & 93.03\% \\%& 93.98\% \\
 370 & Strategy~(5) & 3.37\% & 3.44\% & 3.29\% \\%& 3.23\% \\
 371 \hline
 372 \end{tabular}
 373 }
 374 \end{table}
 375
 376 Table~\ref{tab1}  presents the  rates  obtained for  the first  coding
 377 scheme.   For  the chaotic  data,  it can  be  seen  that as  expected
 378 configuration  prediction becomes  better  when the  number of  hidden
 379 neurons and maximum  epochs increases: an improvement by  a factor two
 380 is observed (from 36.10\% for 10~neurons and 125~epochs to 70.97\% for
 381 25~neurons  and  500~epochs). We  also  notice  that  the learning  of
 382 outputs~(2)   and~(3)  is   more  difficult.    Conversely,   for  the
 383 non-chaotic  case the  simplest training  setup is  enough  to predict
 384 configurations.  For all these  feedforward network topologies and all
 385 outputs the  obtained results for the non-chaotic  case outperform the
 386 chaotic  ones. Finally,  the rates  for the  strategies show  that the
 387 different feedforward networks are unable to learn them.
 388
 389 For  the  second  coding   scheme  (\textit{i.e.},  with  Gray  Codes)
 390 Table~\ref{tab2} shows  that any network learns about  five times more
 391 non-chaotic  configurations than  chaotic  ones.  As  in the  previous
 392 scheme,       the      strategies      cannot       be      predicted.
 393 Figures~\ref{Fig:chaotic_predictions}                              and
 394 \ref{Fig:non-chaotic_predictions} present the predictions given by two
 395 feedforward multilayer  perceptrons that were  respectively trained to
 396 learn chaotic  and non-chaotic data,  using the second  coding scheme.
 397 Each figure  shows for  each sample of  the test  subset (577~samples,
 398 representing 25\%  of the 2304~samples) the  configuration that should
 399 have been predicted and the one given by the multilayer perceptron. It
 400 can be  seen that for  the chaotic data  the predictions are  far away
 401 from the  expected configurations.  Obviously,  the better predictions
 402 for the non-chaotic data reflect their regularity.
 403
 404 Let us now compare the  two coding schemes. Firstly, the second scheme
 405 disturbs  the   learning  process.   In   fact  in  this   scheme  the
 406 configuration is always expressed as  a natural number, whereas in the
 407 first one  the number  of inputs follows  the increase of  the Boolean
 408 vectors coding configurations. In this latter case, the coding gives a
 409 finer information on configuration evolution.
 410 \begin{table}[b]
 411 \caption{Prediction success rates for configurations expressed with Gray code}
 412 \label{tab2}
 413 \centering
 414 \begin{tabular}{|c|c||c|c|c|}
 415 \hline
 416 \multicolumn{5}{|c|}{Networks topology: 3~inputs, 2~outputs, and one hidden layer} \\
 417 \hline
 418 \hline
 419 & Hidden neurons & \multicolumn{3}{c|}{10 neurons} \\
 420 \cline{2-5}
 421 & Epochs & 125 & 250 & 500 \\ %& 1000
 422 \hline
 423 \multirow{2}{*}{Chaotic}& Config.~(1) & 13.29\% & 13.55\% & 13.08\% \\ %& 12.5\%
 424 & Strategy~(2) & 0.50\% & 0.52\% & 1.32\% \\ %& 1.42\%
 425 \hline
 426 \multirow{2}{*}{Non-Chaotic}&Config.~(1) & 77.12\% & 74.00\% & 72.60\% \\ %& 75.81\%
 427 & Strategy~(2) & 0.42\% & 0.80\% & 1.16\% \\ %& 1.42\%
 428 \hline
 429 \hline
 430 & Hidden neurons & \multicolumn{3}{c|}{25 neurons} \\
 431 \cline{2-5}
 432 & Epochs & 125 & 250 & 500 \\ %& 1000
 433 \hline
 434 \multirow{2}{*}{Chaotic}& Config.~(1) & 12.27\% & 13.15\% & 13.05\% \\ %& 15.44\%
 435 & Strategy~(2) & 0.71\% & 0.66\% & 0.88\% \\ %& 1.73\%
 436 \hline
 437 \multirow{2}{*}{Non-Chaotic}&Config.~(1) & 73.60\% & 74.70\% & 75.89\% \\ %& 68.32\%
 438 & Strategy~(2) & 0.64\% & 0.97\% & 1.23\% \\ %& 1.80\%
 439 \hline
 440 \end{tabular}
 441 \end{table}
 442
 443 \begin{figure}
 444   \centering
 445   \includegraphics[scale=0.5]{images/chaotic_trace2}
 446   \caption {Second coding scheme - Predictions obtained for a chaotic test subset.}
 447   \label{Fig:chaotic_predictions}
 448 \end{figure}
 449
 450 \begin{figure}
 451   \centering
 452   \includegraphics[scale=0.5]{images/non-chaotic_trace2}
 453   \caption{Second coding scheme - Predictions obtained for a non-chaotic test subset.}
 454   \label{Fig:non-chaotic_predictions}
 455 \end{figure}
 456
 457 Unfortunately, in  practical applications the number  of components is
 458 usually  unknown.   Hence, the  first  coding  scheme  cannot be  used
 459 systematically.   Therefore, we  provide  a refinement  of the  second
 460 scheme: each  output is learned  by a different  ANN. Table~\ref{tab3}
 461 presents the  results for  this approach.  In  any case,  whatever the
 462 considered feedforward  network topologies, the  maximum epoch number,
 463 and the kind of iterations, the configuration success rate is slightly
 464 improved.   Moreover, the  strategies predictions  rates  reach almost
 465 12\%, whereas in Table~\ref{tab2} they never exceed 1.5\%.  Despite of
 466 this improvement,  a long term prediction of  chaotic iterations still
 467 appear to be an open issue.
 468
 469 \begin{table}
 470 \caption{Prediction success rates for split outputs.}
 471 \label{tab3}
 472 \centering
 473 \begin{tabular}{|c||c|c|c|}
 474 \hline
 475 \multicolumn{4}{|c|}{Networks topology: 3~inputs, 1~output, and one hidden layer} \\
 476 \hline
 477 \hline
 478 Epochs & 125 & 250 & 500 \\
 479 \hline
 480 \hline
 481 Chaotic & \multicolumn{3}{c|}{Output = Configuration} \\
 482 \hline
 483 10~neurons & 12.39\% & 14.06\% & 14.32\% \\
 484 25~neurons & 13.00\% & 14.28\% & 14.58\% \\
 485 40~neurons & 11.58\% & 13.47\% & 14.23\% \\
 486 \hline
 487 \hline
 488 Non chaotic & \multicolumn{3}{c|}{Output = Configuration} \\
 489 \cline{2-4}
 490 %Epochs & 125 & 250 & 500 \\
 491 \hline
 492 10~neurons & 76.01\% & 74.04\% & 78.16\% \\
 493 25~neurons & 76.60\% & 72.13\% & 75.96\% \\
 494 40~neurons & 76.34\% & 75.63\% & 77.50\% \\
 495 \hline
 496 \hline
 497 Chaotic/non chaotic & \multicolumn{3}{c|}{Output = Strategy} \\
 498 \cline{2-4}
 499 %Epochs & 125 & 250 & 500 \\
 500 \hline
 501 10~neurons & 0.76\% & 0.97\% & 1.21\% \\
 502 25~neurons & 1.09\% & 0.73\% & 1.79\% \\
 503 40~neurons & 0.90\% & 1.02\% & 2.15\% \\
 504 \hline
 505 \multicolumn{4}{c}{} \\
 506 \hline
 507 Epochs & 1000 & 2500 & 5000 \\
 508 \hline
 509 \hline
 510 Chaotic & \multicolumn{3}{c|}{Output = Configuration} \\
 511 \hline
 512 10~neurons & 14.51\% & 15.22\% & 15.22\% \\
 513 25~neurons & 16.95\% & 17.57\% & 18.46\% \\
 514 40~neurons & 17.73\% & 20.75\% & 22.62\% \\
 515 \hline
 516 \hline
 517 Non chaotic & \multicolumn{3}{c|}{Output = Configuration} \\
 518 \cline{2-4}
 519 %Epochs & 1000 & 2500 & 5000 \\
 520 \hline
 521 10~neurons & 78.98\% & 80.02\% & 79.97\% \\
 522 25~neurons & 79.19\% & 81.59\% & 81.53\% \\
 523 40~neurons & 79.64\% & 81.37\% & 81.37\% \\
 524 \hline
 525 \hline
 526 Chaotic/non chaotic & \multicolumn{3}{c|}{Output = Strategy} \\
 527 \cline{2-4}
 528 %Epochs & 1000 & 2500 & 5000 \\
 529 \hline
 530 10~neurons & 3.47\% & 9.98\% & 11.66\% \\
 531 25~neurons & 3.92\% & 8.63\% & 10.09\% \\
 532 40~neurons & 3.29\% & 7.19\% & 7.18\% \\
 533 \hline
 534 \end{tabular}
 535 \end{table}
 536
 537 \section{Conclusion}
 538
 539 In  this paper,  we have  established an  equivalence  between chaotic
 540 iterations,  according to  the Devaney's  definition of  chaos,  and a
 541 class  of multilayer  perceptron  neural networks.   Firstly, we  have
 542 described how to build a neural network that can be trained to learn a
 543 given chaotic map function. Secondly,  we found a condition that allow
 544 to check whether  the iterations induced by a  function are chaotic or
 545 not, and thus  if a chaotic map is obtained.  Thanks to this condition
 546 our  approach is not  limited to  a particular  function. In  the dual
 547 case, we show that checking if a neural network is chaotic consists in
 548 verifying  a property  on an  associated  graph, called  the graph  of
 549 iterations.   These results  are valid  for recurrent  neural networks
 550 with a  particular architecture.  However,  we believe that  a similar
 551 work can be done for  other neural network architectures.  Finally, we
 552 have  discovered at  least one  family of  problems with  a reasonable
 553 size, such  that artificial neural  networks should not be  applied in
 554 the  presence  of chaos,  due  to  their  inability to  learn  chaotic
 555 behaviors in this  context.  Such a consideration is  not reduced to a
 556 theoretical detail:  this family of discrete  iterations is concretely
 557 implemented  in a  new steganographic  method  \cite{guyeux10ter}.  As
 558 steganographic   detectors  embed  tools   like  neural   networks  to
 559 distinguish between  original and stego contents, our  studies tend to
 560 prove that such  detectors might be unable to  tackle with chaos-based
 561 information  hiding  schemes.
 562
 563 In  future  work we  intend  to  enlarge  the comparison  between  the
 564 learning   of  truly   chaotic  and   non-chaotic   behaviors.   Other
 565 computational intelligence tools such  as support vector machines will
 566 be investigated  too, to  discover which tools  are the  most relevant
 567 when facing a truly chaotic phenomenon.  A comparison between learning
 568 rate  success  and  prediction  quality will  be  realized.   Concrete
 569 consequences in biology, physics, and computer science security fields
 570 will then be stated.
 571
 572 % \appendix{}
 573
 574 % \begin{Def} \label{def2}
 575 % A continuous function $f$  on a topological space $(\mathcal{X},\tau)$
 576 % is defined  to be  {\emph{topologically transitive}}  if for any  pair of
 577 % open  sets $U$,  $V  \in  \mathcal{X}$ there  exists
 578 % $k \in
 579 % \mathds{N}^{\ast}$
 580 %  such  that
 581 % $f^k(U) \cap V \neq \emptyset$.
 582 % \end{Def}
 583
 584 %\bibliography{chaos-paper}% Produces the bibliography via BibTeX.
 585
 586 %\end{document}
 587 %
 588 % ****** End of file chaos-paper.tex ******