1 % ****** Start of file chaos-paper.tex ******
6 numerical bibliography, %(default)
7 %jcp,% short, numerical bibliography,
16 \usepackage{graphicx}% Include figure files
17 \usepackage{dcolumn}% Align table columns on decimal point
18 \usepackage{bm}% bold math
21 \usepackage[mathlines]{lineno}% Enable numbering of text and display math
22 \linenumbers\relax % Commence numbering lines
23 % The amssymb package provides various useful mathematical symbols
25 % The amsthm package provides various useful mathematical fonts
27 % Various mathematical packages
28 \PassOptionsToPackage{cmex10draft}{graphics}
29 \usepackage[standard]{ntheorem}
31 \usepackage{stmaryrd} % llbracket
32 \usepackage{dsfont} % mathds
33 \usepackage{multirow} % tables
35 \newcommand{\ie}{\textit{i.e.}}
36 \newcommand{\Nats}[0]{\ensuremath{\mathds{N}}}
37 \newcommand{\R}[0]{\ensuremath{\mathds{R}}}
38 \newcommand{\Bool}[0]{\ensuremath{\mathds{B}}}
42 \title[Recurrent Neural Networks and Chaos]{Recurrent Neural Networks
43 and Chaos: Construction, Evaluation, \\
44 and Prediction Ability}
46 \author{Jacques M. Bahi}
47 \author{Jean-Fran\c{c}ois Couchot}
48 \author{Christophe Guyeux}
\email{christophe.guyeux@univ-fcomte.fr}
50 \author{Michel Salomon}
51 \altaffiliation[Authors in ]{alphabetic order}
53 Computer Science Laboratory (LIFC), University of Franche-Comt\'e, \\
54 IUT de Belfort-Montb\'eliard, BP 527, \\
55 90016 Belfort Cedex, France
60 \newcommand{\CG}[1]{\begin{color}{red}\textit{#1}\end{color}}
61 \newcommand{\JFC}[1]{\begin{color}{blue}\textit{#1}\end{color}}
69 Many research works deal with chaotic neural networks for various
70 fields of application. Unfortunately, up to now these networks are
71 usually claimed to be chaotic without any mathematical proof. The
72 purpose of this paper is to establish, based on a rigorous theoretical
73 framework, an equivalence between chaotic iterations according to
74 Devaney and a particular class of neural
75 networks. On the one hand we show how to build such a network, on the
76 other hand we provide a method to check if a neural network is a
77 chaotic one. Finally, the ability of classical feedforward multilayer
78 perceptrons to learn sets of data obtained from a dynamical
system is investigated. Various Boolean functions are iterated on finite
states. Iterations of some of them are proven to be chaotic in the
sense of Devaney. In that context, important differences appear in the
training process, establishing with various neural networks that
chaotic behaviors are far more difficult to learn.
87 %%\pacs{Valid PACS appear here}% PACS, the Physics and Astronomy
88 % Classification Scheme.
89 %%\keywords{Suggested keywords}%Use showkeys class option if keyword
95 Chaotic neural networks have received a lot of attention due to the
96 appealing properties of deterministic chaos (unpredictability,
97 sensitivity, and so on). However, such networks are often claimed
98 chaotic without any rigorous mathematical proof. Therefore, in this
work a theoretical framework based on Devaney's definition of
chaos is introduced. Starting with a relationship between discrete
iterations and Devaney's chaos, we firstly show how to build a
recurrent neural network that is equivalent to a chaotic map and
secondly a way to check whether an already available network is
chaotic or not. We also study different topological properties of
these truly chaotic neural networks. Finally, we show that the
learning, with neural networks having a feedforward structure, of
chaotic behaviors represented by data sets obtained from chaotic maps
is far more difficult than the learning of non-chaotic ones.
112 \section{Introduction}
Several research works have proposed or studied chaotic neural networks
in recent years. The complex dynamics of such networks leads to
117 various potential application areas: associative
118 memories~\cite{Crook2007267} and digital security tools like hash
119 functions~\cite{Xiao10}, digital
120 watermarking~\cite{1309431,Zhang2005759}, or cipher
121 schemes~\cite{Lian20091296}. In the former case, the background idea
122 is to control chaotic dynamics in order to store patterns, with the
123 key advantage of offering a large storage capacity. For the latter
124 case, the use of chaotic dynamics is motivated by their
125 unpredictability and random-like behaviors. Thus, investigating new
126 concepts is crucial in this field, because new threats are constantly
127 emerging. As an illustrative example, the former standard in hash
128 functions, namely the SHA-1 algorithm, has been recently weakened
129 after flaws were discovered.
131 Chaotic neural networks have been built with different approaches. In
132 the context of associative memory, chaotic neurons like the nonlinear
133 dynamic state neuron \cite{Crook2007267} frequently constitute the
134 nodes of the network. These neurons have an inherent chaotic behavior,
135 which is usually assessed through the computation of the Lyapunov
136 exponent. An alternative approach is to consider a well-known neural
137 network architecture: the MultiLayer Perceptron (MLP). These networks
138 are suitable to model nonlinear relationships between data, due to
139 their universal approximator capacity.
Thus, this kind of network can
142 be trained to model a physical phenomenon known to be chaotic such as
Chua's circuit \cite{dalkiran10}. Sometimes, a neural network that
is built by combining transfer functions and initial conditions that
are both chaotic is itself claimed to be chaotic
\cite{springerlink:10.1007/s00521-010-0432-2}.
148 What all of these chaotic neural networks have in common is that they
149 are claimed to be chaotic despite a lack of any rigorous mathematical
150 proof. The first contribution of this paper is to fill this gap,
using a theoretical framework based on Devaney's definition of chaos
152 \cite{Devaney}. This mathematical theory of chaos provides both
153 qualitative and quantitative tools to evaluate the complex behavior of
154 a dynamical system: ergodicity, expansivity, and so on. More
155 precisely, in this paper, which is an extension of a previous work
156 \cite{bgs11:ip}, we establish the equivalence between chaotic
157 iterations and a class of globally recurrent MLP.
The investigation of the converse problem is the second contribution:
159 we indeed study the ability for
160 classical MultiLayer Perceptrons to learn a particular family of
161 discrete chaotic dynamical systems. This family
162 is defined by a Boolean vector, an update function, and a
163 sequence giving which component to update at each iteration. It has
been previously established that such a dynamical system produces
chaotic iterations (as defined by Devaney) when the chosen function has
a strongly connected graph of iterations. In this document, we
experiment with several MLPs and try to make them learn some iterations
of this kind.
168 We show that non-chaotic iterations can be learned, whereas it is
169 far more difficult for chaotic ones. That is to say, we have
170 discovered at least one family of problems with a reasonable size,
171 such that artificial neural networks should not be applied
172 due to their inability to learn chaotic behaviors in this context.
174 The remainder of this research work is organized as follows. The next
175 section is devoted to the basics of Devaney's
176 chaos. Section~\ref{S2} formally describes how to build a neural
177 network that operates chaotically. Section~\ref{S3} is
devoted to the dual case of checking whether an existing neural network
is chaotic or not.
Topological properties of chaotic neural networks
are discussed in Sect.~\ref{S4}.
Section~\ref{section:translation} shows how to translate such
183 iterations into an Artificial Neural Network (ANN), in order to
184 evaluate the capability for this latter to learn chaotic behaviors.
185 This ability is studied in Sect.~\ref{section:experiments}, where
186 various ANNs try to learn two sets of data: the first one is obtained
187 by chaotic iterations while the second one results from a non-chaotic
188 system. Prediction success rates are given and discussed for the two
189 sets. The paper ends with a conclusion section where our contribution
190 is summed up and intended future work is exposed.
192 \section{Chaotic Iterations according to Devaney}
194 In this section, the well-established notion of Devaney's mathematical
195 chaos is firstly recalled. Preservation of the unpredictability of
such a dynamical system when implemented on a computer is obtained by
197 using some discrete iterations called ``asynchronous iterations'', which
198 are thus introduced. The result establishing the link between such
iterations and Devaney's chaos is finally presented at the end of this
section.

202 In what follows and for any function $f$, $f^n$ means the composition
203 $f \circ f \circ \hdots \circ f$ ($n$ times) and an \emph{iteration}
204 of a \emph{dynamical system} is the step that consists in
205 updating the global state $x^t$ at time $t$ with respect to a function $f$
206 s.t. $x^{t+1} = f(x^t)$.
208 \subsection{Devaney's chaotic dynamical systems}
Various domains such as physics, biology, or economics contain systems
that exhibit a chaotic behavior; a well-known example is the weather.
212 These systems are in particular highly sensitive to initial
213 conditions, a concept usually presented as the butterfly effect: small
214 variations in the initial conditions possibly lead to widely different
215 behaviors. Theoretically speaking, a system is sensitive if for each
216 point $x$ in the iteration space, one can find a point in each
217 neighborhood of $x$ having a significantly different future evolution.
218 Conversely, a system seeded with the same initial conditions always
219 has the same evolution. In other words, chaotic systems have a
220 deterministic behavior defined through a physical or mathematical
model and a high sensitivity to the initial conditions. Besides,
mathematically, this kind of unpredictability is also referred to as
223 deterministic chaos. For example, many weather forecast models exist,
224 but they give only suitable predictions for about a week, because they
225 are initialized with conditions that reflect only a partial knowledge
of the current weather. Even if the differences are initially small,
they are amplified in the course of time, and thus make a long-term
prediction difficult. In fact, in a chaotic system, an approximation
of the current state is a quite useless indicator for predicting
future states.

From a mathematical point of view, deterministic chaos has been
thoroughly studied these last decades, with different research works
that have provided various definitions of chaos. Among these
235 definitions, the one given by Devaney~\cite{Devaney} is
236 well-established. This definition consists of three conditions:
topological transitivity, density of periodic points, and sensitive
dependence on initial conditions.
240 Topological transitivity is checked when, for any point, any
neighborhood of its future evolution eventually overlaps with any other
242 given region. This property implies that a dynamical system
243 cannot be broken into simpler subsystems.
244 Intuitively, its complexity does not allow any simplification.
245 On the contrary, a dense set of periodic points is an
246 element of regularity that a chaotic dynamical system has to exhibit.
However, chaos needs some regularity to ``counteract''
249 the effects of transitivity.
250 %\begin{definition} \label{def3}
We recall that a point $x$ is a {\emph{periodic point}} for $f$ of
252 period~$n \in \mathds{N}^{\ast}$ if $f^{n}(x)=x$.
255 %\begin{definition} \label{def4}
$f$ is {\emph{regular}} on $(\mathcal{X},\tau)$ if the set of
periodic points for $f$ is dense in $\mathcal{X}$ (for any $x \in
\mathcal{X}$, we can find at least one periodic point in any of its
neighborhoods).

Due to these two properties, two points close to each other can behave
in a completely different manner, leading to unpredictability for the
whole system.

Let us recall that $f$
267 has {\emph{ sensitive dependence on initial conditions}} if there
268 exists $\delta >0$ such that, for any $x\in \mathcal{X}$ and any
269 neighborhood $V$ of $x$, there exist $y\in V$ and $n > 0$ such that
270 $d\left(f^{n}(x), f^{n}(y)\right) >\delta $. The value $\delta$ is called the
271 {\emph{constant of sensitivity}} of $f$.
Finally, the dynamical system that iterates $f$ is {\emph{chaotic
according to Devaney}} on $(\mathcal{X},\tau)$ if $f$ is regular,
topologically transitive, and has sensitive dependence on its initial
conditions.
In what follows, iterations are said to be \emph{chaotic according to
Devaney} when the corresponding dynamical system is chaotic according
to Devaney.
279 %Let us notice that for a metric space the last condition follows from
280 %the two first ones~\cite{Banks92}.
282 \subsection{Asynchronous Iterations}
284 %This section presents some basics on topological chaotic iterations.
Let us firstly discuss the domain of iteration. As far as we
know, no result ensures that the chaotic behavior of a dynamical system
that has been theoretically proven on $\R$ remains valid on the
floating-point numbers, which is the implementation domain. Thus, to
avoid any loss of chaos, this work presents an alternative, which is to iterate Boolean
291 maps: results that are theoretically obtained in that domain are
292 preserved in implementations.
294 Let us denote by $\llbracket a ; b \rrbracket$ the following interval
295 of integers: $\{a, a+1, \hdots, b\}$, where $a~<~b$.
In this section, a system
297 under consideration iteratively modifies a collection of
298 $n$~components. Each component $i \in \llbracket 1; n \rrbracket$
takes its value $x_i$ in the domain $\Bool=\{0,1\}$.
300 A \emph{configuration} of the system at discrete time $t$ is the vector
302 $x^{t}=(x_1^{t},\ldots,x_{n}^{t}) \in \Bool^n$.
304 The dynamics of the system is described according to a function $f :
305 \Bool^n \rightarrow \Bool^n$ such that
307 $f(x)=(f_1(x),\ldots,f_n(x))$.
309 % Notice that $f^k$ denotes the
310 % $k-$th composition $f\circ \ldots \circ f$ of the function $f$.
312 Let be given a configuration $x$. In what follows
313 $N(i,x)=(x_1,\ldots,\overline{x_i},\ldots,x_n)$ is the configuration
314 obtained by switching the $i-$th component of $x$ ($\overline{x_i}$ is
315 indeed the negation of $x_i$). Intuitively, $x$ and $N(i,x)$ are
316 neighbors. The discrete iterations of $f$ are represented by the
317 oriented \emph{graph of iterations} $\Gamma(f)$. In such a graph,
318 vertices are configurations of $\Bool^n$ and there is an arc labeled
$i$ from $x$ to $N(i,x)$ if and only if $f_i(x)$ is $\overline{x_i}$.
321 In the sequel, the \emph{strategy} $S=(S^{t})^{t \in \Nats}$ is the
322 sequence defining which component to update at time $t$ and $S^{t}$
323 denotes its $t-$th term.
324 This iteration scheme that only modifies one element at each iteration
is classically referred to as \emph{asynchronous iterations}.
326 More precisely, we have here for any $i$, $1\le i \le n$,
\begin{equation}
x^{t+1}_i = \left\{ \begin{array}{ll}
f_i(x^t) & \textrm{if $S^t = i$,} \\
x_i^t & \textrm{otherwise.}
\end{array} \right.
\end{equation}
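For the sake of illustration, the following Python sketch (ours, and only an
informal companion to the formal definition; the function names are not part
of the paper) implements this asynchronous update rule.
\begin{verbatim}
# Illustrative sketch: asynchronous iterations of a Boolean map f
# under a strategy S.  A configuration is a tuple of 0/1 values.
def async_iterate(f, x0, strategy):
    """x^{t+1}_i = f_i(x^t) if S^t = i, and x^{t+1}_i = x^t_i otherwise."""
    x = tuple(x0)
    trajectory = [x]
    for s in strategy:
        y = f(x)
        # only the component designated by the strategy is updated
        x = tuple(y[i] if i == s - 1 else x[i] for i in range(len(x)))
        trajectory.append(x)
    return trajectory

def f0(x):
    """Vectorial negation."""
    return tuple(1 - xi for xi in x)

print(async_iterate(f0, (0, 1, 0, 1), [1, 3, 3, 2]))
\end{verbatim}
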
The next section shows the link between asynchronous iterations and
Devaney's chaos.

\subsection{On the link between asynchronous iterations and
Devaney's chaos}
345 In this subsection we recall the link we have established between
346 asynchronous iterations and Devaney's chaos. The theoretical framework is
347 fully described in \cite{guyeux09}.
349 We introduce the function $F_{f}$ that is
350 defined for any given application $f:\Bool^{n} \to \Bool^{n}$ by
351 $F_{f}: \llbracket1;n\rrbracket\times \mathds{B}^{n} \rightarrow
352 \mathds{B}^{n}$, s.t.
\begin{equation*}
F_{f}(s,x)_j = \left\{ \begin{array}{ll}
f_j(x) & \textrm{ if } j= s \enspace , \\
x_{j} & \textrm{ otherwise} \enspace .
\end{array} \right.
\end{equation*}
364 \noindent With such a notation, configurations
asynchronously obtained are defined for times $t=0,1,2,\ldots$ by:
\begin{equation}\label{eq:sync}
\left\{\begin{array}{l}
x^{0}\in \mathds{B}^{n} \textrm{ and}\\
x^{t+1}=F_{f}(S^t,x^{t}) \enspace .
\end{array}\right.
\end{equation}
374 \noindent Finally, iterations defined in Eq.~(\ref{eq:sync}) can be
375 described by the following system:
\begin{equation}
\label{eq:Gf}
\left\{
\begin{array}{lcl}
X^{0} & = & ((S^t)^{t \in \Nats},x^0) \in
\llbracket1;n\rrbracket^\Nats \times \Bool^{n}\\
X^{k+1}& = & G_{f}(X^{k})\\
\multicolumn{3}{c}{\textrm{where } G_{f}\left(((S^t)^{t \in \Nats},x)\right)
= \left(\sigma((S^t)^{t \in \Nats}),F_{f}(S^0,x)\right) \enspace ,}
\end{array}
\right.
\end{equation}
388 where $\sigma$ is the function that removes the first term of the
389 strategy ({\it i.e.},~$S^0$).
This definition allows us to link asynchronous iterations with
391 classical iterations of a dynamical system.
395 %component of the system is updated at an iteration: the $S^t$-th
396 %element. But it can be extended by considering subsets for $S^t$.
399 To study topological properties of these iterations, we are then left to
400 introduce a {\emph{ distance}} $d$ between two points $(S,x)$ and
$(\check{S},\check{x})\in \mathcal{X} = \llbracket1;n\rrbracket^\Nats
402 \times \Bool^{n}$. It is defined by
\begin{equation*}
d((S,x);(\check{S},\check{x}))=d_{e}(x,\check{x})+d_{s}(S,\check{S})
\enspace ,
\end{equation*}
\noindent where
\begin{equation*}
d_{e}(x,\check{x})=\sum_{j=1}^{n}\Delta
(x_{j},\check{x}_{j}) \in \llbracket 0 ; n \rrbracket
\end{equation*}
\noindent is the Hamming distance between the configurations
($\Delta(a,b)$ is $0$ if $a=b$ and $1$ otherwise), and
\begin{equation*}
d_{s}(S,\check{S})=\frac{9}{2n}\sum_{t=0}^{\infty
}\frac{|S^{t}-\check{S}^{t}|}{10^{t+1}} \in [0 ; 1] \enspace .
\end{equation*}
Notice that, firstly, the more components two systems have that differ,
the larger the distance between them is. Secondly, two systems with
similar components and strategies having the same starting terms
must induce only a small distance. The proposed distance fulfills
422 these requirements: on the one hand its floor value reflects the
423 difference between the cells, on the other hand its fractional part
424 measures the difference between the strategies.
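As an informal illustration (ours; the infinite sum over the strategies is
truncated to the available terms), the following Python sketch evaluates this
distance for two points of $\mathcal{X}$.
\begin{verbatim}
# Illustrative sketch: the distance d = d_e + d_s on
# X = [1;n]^N x B^n, with the strategy part truncated.
def d_e(x, y):
    """Hamming distance between two Boolean configurations."""
    return sum(1 for a, b in zip(x, y) if a != b)

def d_s(S, T, n):
    """9/(2n) * sum_t |S^t - T^t| / 10^(t+1), truncated to len(S) terms."""
    return 9.0 / (2 * n) * sum(abs(s - t) / 10 ** (k + 1)
                               for k, (s, t) in enumerate(zip(S, T)))

def distance(a, b, n):
    (S, x), (T, y) = a, b
    return d_e(x, y) + d_s(S, T, n)

# integral part counts differing cells, fractional part compares strategies
print(distance(([1, 3, 2, 4], (0, 1, 0, 1)),
               ([1, 2, 2, 4], (0, 1, 1, 1)), 4))   # 1.01125
\end{verbatim}
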
426 The relation between $\Gamma(f)$ and $G_f$ is clear: there exists a
427 path from $x$ to $x'$ in $\Gamma(f)$ if and only if there exists a
428 strategy $s$ such that iterations of $G_f$ from the
initial point $(s,x)$ reach the configuration $x'$. Using this
430 link, Guyeux~\cite{GuyeuxThese10} has proven that,
431 \begin{theorem}%[Characterization of $\mathcal{C}$]
432 \label{Th:Caracterisation des IC chaotiques}
433 Let $f:\Bool^n\to\Bool^n$. Iterations of $G_f$ are chaotic according
434 to Devaney if and only if $\Gamma(f)$ is strongly connected.
437 Checking if a graph is strongly connected is not difficult
(using Tarjan's algorithm for instance).
Let us be given a strategy $S$ and a function $f$ such that
440 $\Gamma(f)$ is strongly connected.
441 In that case, iterations of the function $G_f$ as defined in
442 Eq.~(\ref{eq:Gf}) are chaotic according to Devaney.
Let us then define two functions $f_0$ and $f_1$, both in
446 $\Bool^n\to\Bool^n$ that are used all along this article.
447 The former is the vectorial negation, \textit{i.e.},
448 $f_{0}(x_{1},\dots,x_{n}) =(\overline{x_{1}},\dots,\overline{x_{n}})$.
449 The latter is $f_1\left(x_1,\dots,x_n\right)=\left(
450 \overline{x_1},x_1,x_2,\dots,x_{n-1}\right)$.
451 It is not hard to see that $\Gamma(f_0)$ and $\Gamma(f_1)$ are
both strongly connected; consequently, iterations of $G_{f_0}$ and of
453 $G_{f_1}$ are chaotic according to Devaney.
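These checks can be automated; the following Python sketch (ours; it relies
on the networkx library, which is an implementation choice of this
illustration only) builds the graph of iterations $\Gamma(f)$ of a Boolean
map and tests its strong connectivity, which, by
Theorem~\ref{Th:Caracterisation des IC chaotiques}, decides whether the
associated iterations are chaotic.
\begin{verbatim}
# Illustrative sketch: build Gamma(f) for a Boolean map f on B^n and
# check its strong connectivity.
from itertools import product
import networkx as nx   # assumption: networkx is available

def iteration_graph(f, n):
    g = nx.DiGraph()
    for x in product((0, 1), repeat=n):
        g.add_node(x)
        y = f(x)
        for i in range(n):
            # arc labeled i+1 from x to N(i,x) iff f_i(x) differs from x_i
            if y[i] != x[i]:
                g.add_edge(x, x[:i] + (1 - x[i],) + x[i + 1:], label=i + 1)
    return g

def f0(x):   # vectorial negation
    return tuple(1 - v for v in x)

def f1(x):   # (not x1, x1, x2, ..., x_{n-1})
    return (1 - x[0],) + x[:-1]

for name, f in (("f0", f0), ("f1", f1)):
    print(name, nx.is_strongly_connected(iteration_graph(f, 4)))
\end{verbatim}
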
455 With this material, we are now able to build a first chaotic neural
network, as defined in Devaney's formulation.
458 \section{A chaotic neural network in the sense of Devaney}
461 Firstly, let us build a
462 multilayer perceptron neural network modeling
463 $F_{f_0}:\llbracket 1; n \rrbracket \times \mathds{B}^n \to
464 \mathds{B}^n$ associated to the vectorial negation.
465 More precisely, for all inputs
466 $(s,x) \in \llbracket 1;n\rrbracket \times \mathds{B}^n$,
467 the output layer produces $F_{f_0}(s,x)$. It is then possible to
468 link the output layer and the input one, in order to model the
469 dependence between two successive iterations. As a result we obtain a
globally recurrent neural network that behaves as follows (see
471 Fig.~\ref{Fig:perceptron}).
474 \item The network is initialized with the input vector
475 $\left(S^0,x^0\right) \in \llbracket 1;n\rrbracket \times
476 \mathds{B}^n$ and computes the output vector
477 $x^1=F_{f_0}\left(S^0,x^0\right)$. This last vector is published as
an output of the chaotic neural network and is sent back to the
479 input layer through the feedback links.
480 \item When the network is activated at the $t^{th}$ iteration, the
481 state of the system $x^t \in \mathds{B}^n$ received from the output
layer and the current term $S^t$ of the strategy $(S^t)^{t \in \Nats}$
($S^t \in \llbracket 1;n\rrbracket$) are used to compute the new
output. This new output, which represents the new state of the
dynamical system, satisfies:
\begin{equation*}
x^{t+1}=F_{f_0}(S^t, x^t) \in \mathds{B}^n \enspace .
\end{equation*}
493 \includegraphics[scale=0.625]{perceptron}
494 \caption{A perceptron equivalent to chaotic iterations}
495 \label{Fig:perceptron}
498 The behavior of the neural network is such that when the initial state
499 is $x^0~\in~\mathds{B}^n$ and a sequence $(S^t)^{t \in \Nats}$ is
500 given as outside input,
502 then the sequence of successive published
503 output vectors $\left(x^t\right)^{t \in \mathds{N}^{\ast}}$ is exactly
504 the one produced by the chaotic iterations formally described in
Eq.~(\ref{eq:Gf}). It means that, mathematically, if we use the same
input vectors, they both generate the same successive outputs
507 $\left(x^t\right)^{t \in \mathds{N}^{\ast}}$, and therefore that they
508 are equivalent reformulations of the iterations of $G_{f_0}$ in
509 $\mathcal{X}$. Finally, since the proposed neural network is built to
model the behavior of $G_{f_0}$, whose iterations are chaotic according
to Devaney's definition of chaos, we can conclude that the network is
513 also chaotic in this sense.
515 The previous construction scheme is not restricted to function $f_0$.
516 It can be extended to any function $f$ such that $G_f$ is a chaotic
517 map by training the network to model $F_{f}:\llbracket 1; n \rrbracket
518 \times \mathds{B}^n \to \mathds{B}^n$. Due to
519 Theorem~\ref{Th:Caracterisation des IC chaotiques}, we can find
520 alternative functions $f$ for $f_0$ through a simple check of their
521 graph of iterations $\Gamma(f)$. For example, we can build another
522 chaotic neural network by using $f_1$ instead of $f_0$.
524 \section{Checking whether a neural network is chaotic or not}
527 We focus now on the case where a neural network is already available,
528 and for which we want to know if it is chaotic. Typically, in
many research papers neural networks are usually claimed to be chaotic
530 without any convincing mathematical proof. We propose an approach to
531 overcome this drawback for a particular category of multilayer
perceptrons defined below, and for Devaney's formulation of chaos.
533 In spite of this restriction, we think that this approach can be
534 extended to a large variety of neural networks.
537 We consider a multilayer perceptron of the following form: inputs
538 are $n$ binary digits and one integer value, while outputs are $n$
539 bits. Moreover, each binary output is connected with a feedback
540 connection to an input one.
543 \item During initialization, the network is seeded with $n$~bits denoted
544 $\left(x^0_1,\dots,x^0_n\right)$ and an integer value $S^0$ that
545 belongs to $\llbracket1;n\rrbracket$.
546 \item At iteration~$t$, the last output vector
547 $\left(x^t_1,\dots,x^t_n\right)$ defines the $n$~bits used to
compute the new output one $\left(x^{t+1}_1,\dots,x^{t+1}_n\right)$,
while the remaining input receives a new integer value $S^t \in
550 \llbracket1;n\rrbracket$, which is provided by the outside world.
554 The topological behavior of these particular neural networks can be
555 proven to be chaotic through the following process. Firstly, we denote
556 by $F: \llbracket 1;n \rrbracket \times \mathds{B}^n \rightarrow
557 \mathds{B}^n$ the function that maps the value
558 $\left(s,\left(x_1,\dots,x_n\right)\right) \in \llbracket 1;n
\rrbracket \times \mathds{B}^n$
into the value
$\left(y_1,\dots,y_n\right) \in \mathds{B}^n$, where
563 $\left(y_1,\dots,y_n\right)$ is the response of the neural network
564 after the initialization of its input layer with
565 $\left(s,\left(x_1,\dots, x_n\right)\right)$.
567 Secondly, we define $f:
568 \mathds{B}^n \rightarrow \mathds{B}^n$ such that
569 $f\left(x_1,x_2,\dots,x_n\right)$ is equal to
571 \left(F\left(1,\left(x_1,x_2,\dots,x_n\right)\right),\dots,
572 F\left(n,\left(x_1,x_2,\dots,x_n\right)\right)\right) \enspace .
574 Thus, for any $j$, $1 \le j \le n$, we have
575 $f_j\left(x_1,x_2,\dots,x_n\right) =
576 F\left(j,\left(x_1,x_2,\dots,x_n\right)\right)$.
577 If this recurrent neural network is seeded with
578 $\left(x_1^0,\dots,x_n^0\right)$ and $S \in \llbracket 1;n
579 \rrbracket^{\mathds{N}}$, it produces exactly the
same output vectors as the
581 chaotic iterations of $F_f$ with initial
582 condition $\left(S,(x_1^0,\dots, x_n^0)\right) \in \llbracket 1;n
583 \rrbracket^{\mathds{N}} \times \mathds{B}^n$.
584 Theoretically speaking, such iterations of $F_f$ are thus a formal model of
this kind of recurrent neural network. In the rest of this
586 paper, we will call such multilayer perceptrons CI-MLP($f$), which
587 stands for ``Chaotic Iterations based MultiLayer Perceptron''.
589 Checking if CI-MLP($f$) behaves chaotically according to Devaney's
definition of chaos is simple: we just need to verify whether the
591 associated graph of iterations $\Gamma(f)$ is strongly connected or
592 not. As an incidental consequence, we finally obtain an equivalence
593 between chaotic iterations and CI-MLP($f$). Therefore, we can
594 obviously study such multilayer perceptrons with mathematical tools
595 like topology to establish, for example, their convergence or,
596 contrarily, their unpredictable behavior. An example of such a study
597 is given in the next section.
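As an informal illustration of this verification procedure, the sketch below
(ours) rebuilds $f$ by querying a trained network, represented here by a
hypothetical callable \texttt{network\_F(s, x)}, and then checks the strong
connectivity of $\Gamma(f)$ as in the previous sections.
\begin{verbatim}
# Illustrative sketch: decide whether a CI-MLP(f) is chaotic by rebuilding
# f from the trained network and checking Gamma(f).  `network_F` is a
# hypothetical stand-in for the trained multilayer perceptron.
from itertools import product
import networkx as nx   # assumption: networkx is available

def network_F(s, x):
    """Placeholder network: here it simply mimics F_{f_0}."""
    return tuple(1 - x[j] if j == s - 1 else x[j] for j in range(len(x)))

def rebuild_f(n):
    # f_j(x) = F(j, x)_j for every j in [1;n]
    return lambda x: tuple(network_F(j + 1, x)[j] for j in range(n))

def is_devaney_chaotic(n):
    f = rebuild_f(n)
    g = nx.DiGraph()
    for x in product((0, 1), repeat=n):
        g.add_node(x)
        y = f(x)
        for i in range(n):
            if y[i] != x[i]:
                g.add_edge(x, x[:i] + (1 - x[i],) + x[i + 1:])
    return nx.is_strongly_connected(g)

print(is_devaney_chaotic(4))
\end{verbatim}
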
603 \section{Topological properties of chaotic neural networks}
Let us first recall two fundamental definitions from the mathematical
theory of chaos.
609 \begin{definition} \label{def8}
610 A function $f$ is said to be {\emph{ expansive}} if $\exists
611 \varepsilon>0$, $\forall x \neq y$, $\exists n \in \mathds{N}$ such
612 that $d\left(f^n(x),f^n(y)\right) \geq \varepsilon$.
615 \begin{definition} \label{def9}
616 A discrete dynamical system is said to be {\emph{ topologically mixing}}
617 if and only if, for any pair of disjoint open sets $U$,$V \neq
618 \emptyset$, we can find some $n_0 \in \mathds{N}$ such that for any $n$,
619 $n\geq n_0$, we have $f^n(U) \cap V \neq \emptyset$.
Intuitively, a function is expansive if any two distinct points are
eventually separated by at least $\varepsilon$ under iteration, whereas
topological mixing means that the iterates of any open region eventually
keep intersecting any other given region.
624 It has been proven in Ref.~\cite{gfb10:ip}, that chaotic iterations
625 are expansive and topologically mixing when $f$ is the
626 vectorial negation $f_0$.
627 Consequently, these properties are inherited by the CI-MLP($f_0$)
recurrent neural network previously presented, which induces a greater
unpredictability. In particular, any difference in the initial value of the
input layer is magnified until it becomes equal to the expansivity constant.

633 Let us then focus on the consequences for a neural network to be chaotic
634 according to Devaney's definition. Intuitively, the topological
transitivity property implies indecomposability, which is formally defined
as follows:
639 \begin{definition} \label{def10}
640 A dynamical system $\left( \mathcal{X}, f\right)$ is
641 {\emph{not decomposable}} if it is not the union of two closed sets $A, B
642 \subset \mathcal{X}$ such that $f(A) \subset A, f(B) \subset B$.
645 \noindent Hence, reducing the set of outputs generated by CI-MLP($f$),
646 in order to simplify its complexity, is impossible if $\Gamma(f)$ is
strongly connected. Moreover, under this hypothesis CI-MLPs($f$) are
strongly transitive:
650 \begin{definition} \label{def11}
651 A dynamical system $\left( \mathcal{X}, f\right)$ is {\emph{ strongly
652 transitive}} if $\forall x,y \in \mathcal{X}$, $\forall r>0$, $\exists
653 z \in \mathcal{X}$, $d(z,x)~\leq~r \Rightarrow \exists n \in
654 \mathds{N}^{\ast}$, $f^n(z)=y$.
656 According to this definition, for all pairs of points $(x, y)$ in the
657 phase space, a point $z$ can be found in the neighborhood of $x$ such
658 that one of its iterates $f^n(z)$ is $y$. Indeed, this result has been
659 established during the proof of the transitivity presented in
660 Ref.~\cite{guyeux09}. Among other things, the strong transitivity
661 leads to the fact that without the knowledge of the initial input
662 layer, all outputs are possible. Additionally, no point of the output
663 space can be discarded when studying CI-MLPs: this space is
664 intrinsically complicated and it cannot be decomposed or simplified.
Furthermore, those recurrent neural networks exhibit the instability
property: a dynamical system $\left( \mathcal{X}, f\right)$ is
\emph{unstable} if, for
all $x \in \mathcal{X}$, the orbit $\gamma_x:n \in \mathds{N}
672 \longmapsto f^n(x)$ is unstable, that means: $\exists \varepsilon >
673 0$, $\forall \delta>0$, $\exists y \in \mathcal{X}$, $\exists n \in
674 \mathds{N}$, such that $d(x,y)<\delta$ and
675 $d\left(\gamma_x(n),\gamma_y(n)\right) \geq \varepsilon$.
677 This property, which is implied by the sensitive point dependence on
678 initial conditions, leads to the fact that in all neighborhoods of any
679 point $x$, there are points that can be apart by $\varepsilon$ in the
680 future through iterations of the CI-MLP($f$). Thus, we can claim that
the behavior of these MLPs is unstable when $\Gamma (f)$ is strongly
connected.

684 Let us now consider a compact metric space $(M, d)$ and $f: M
685 \rightarrow M$ a continuous map. For each natural number $n$, a new
686 metric $d_n$ is defined on $M$ by
687 $$d_n(x,y)=\max\{d(f^i(x),f^i(y)): 0\leq i<n\} \enspace .$$
689 Given any $\varepsilon > 0$ and $n \geqslant 1$, two points of $M$ are
690 $\varepsilon$-close with respect to this metric if their first $n$
691 iterates are $\varepsilon$-close.
693 This metric allows one to distinguish in a neighborhood of an orbit
694 the points that move away from each other during the iteration from
695 the points that travel together. A subset $E$ of $M$ is said to be
696 $(n, \varepsilon)$-separated if each pair of distinct points of $E$ is
697 at least $\varepsilon$ apart in the metric $d_n$. Denote by $H(n,
698 \varepsilon)$ the maximum cardinality of an $(n,
\varepsilon)$-separated set.
701 The \emph{topological entropy} of the map $f$ is defined by (see e.g.,
702 Ref.~\cite{Adler65} or Ref.~\cite{Bowen})
703 $$h(f)=\lim_{\varepsilon\to 0} \left(\limsup_{n\to \infty}
704 \frac{1}{n}\log H(n,\varepsilon)\right) \enspace .$$
Then we have the following result \cite{GuyeuxThese10}:
\begin{theorem}
$\left( \mathcal{X},d\right)$ is compact and the topological entropy
of $(\mathcal{X},G_{f_0})$ is infinite.
\end{theorem}
715 \includegraphics[scale=0.625]{scheme}
\caption{Summary of addressed neural networks and chaos problems}
\label{Fig:scheme}
Figure~\ref{Fig:scheme} summarizes the addressed neural networks and chaos
problems. Section~\ref{S2} has explained how to construct a truly chaotic
neural network ($A$ for instance).
Section~\ref{S3} has shown how to check whether a given MLP
($A$ or $C$) is chaotic or not in the sense of Devaney.
725 %, and how to study its topological behavior.
726 The last thing to investigate, when comparing
727 neural networks and Devaney's chaos, is to determine whether
728 an artificial neural network $A$ is able to learn or predict some chaotic
behaviors of $B$, as defined in Devaney's formulation (when they
are not specifically constructed for this purpose). This problem is
studied in the next section.
739 \section{Suitability of Artificial Neural Networks
740 for Predicting Chaotic Behaviors}
In the context of computer science, different topic areas have an
interest in chaos, such as steganographic
techniques~\cite{1309431,Zhang2005759}. Steganography consists in
embedding a secret message within an ordinary one, while the secret
is extracted once it has reached its destination. The reverse ({\it i.e.},
747 automatically detecting the presence of hidden messages inside media)
748 is called steganalysis. Among the deployed strategies inside
detectors, there are support vector
750 machines~\cite{Qiao:2009:SM:1704555.1704664}, neural
751 networks~\cite{10.1109/ICME.2003.1221665,10.1109/CIMSiM.2010.36}, and
752 Markov chains~\cite{Sullivan06steganalysisfor}. Most of these
753 detectors give quite good results and are rather competitive when
754 facing steganographic tools. However, to the best of our knowledge
755 none of the considered information hiding schemes fulfills the Devaney
756 definition of chaos~\cite{Devaney}. Indeed, one can wonder whether
757 detectors continue to give good results when facing truly chaotic
758 schemes. More generally, there remains the open problem of deciding
whether artificial intelligence is suitable for predicting
topologically chaotic behaviors.

762 \subsection{Representing Chaotic Iterations for Neural Networks}
763 \label{section:translation}
765 The problem of deciding whether classical feedforward ANNs are
suitable to approximate topologically chaotic iterations may then be
reduced to evaluating ANNs on iterations of functions with a strongly
connected (SCC)~graph of iterations. To compare with
non-chaotic iterations, the experiments detailed in the following
sections are carried out using both kinds of functions (chaotic and
non-chaotic). Let us emphasize the difference between this kind of
neural network and the Chaotic Iterations based MultiLayer
Perceptron presented in the previous sections.

775 We are then left to compute two disjoint function sets that contain
776 either functions with topological chaos properties or not, depending
777 on the strong connectivity of their iterations graph. This can be
778 achieved for instance by removing a set of edges from the iteration
779 graph $\Gamma(f_0)$ of the vectorial negation function~$f_0$. One can
780 deduce whether a function verifies the topological chaos property or
not by checking the strong connectivity of the resulting graph of
iterations.

784 For instance let us consider the functions $f$ and $g$ from $\Bool^4$
785 to $\Bool^4$ respectively defined by the following lists:
786 $$[0, 0, 2, 3, 13, 13, 6, 3, 8, 9, 10, 11, 8, 13, 14,
787 15]$$ $$\mbox{and } [11, 14, 13, 14, 11, 10, 1, 8, 7, 6, 5, 4, 3, 2,
788 1, 0] \enspace.$$ In other words, the image of $0011$ by $g$ is
789 $1110$: it is obtained as the binary value of the fourth element in
790 the second list (namely~14). It is not hard to verify that
791 $\Gamma(f)$ is not SCC (\textit{e.g.}, $f(1111)$ is $1111$) whereas
792 $\Gamma(g)$ is. Next section shows how to translate iterations of
793 such functions into a model amenable to be learned by an ANN.
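To illustrate this list representation, the short Python sketch below (ours)
decodes the two lists into the corresponding Boolean maps and checks the two
facts mentioned above.
\begin{verbatim}
# Illustrative sketch: decode a Boolean map on B^4 given as a list of
# images, a configuration x1...xn being read as a binary number.
F_LIST = [0, 0, 2, 3, 13, 13, 6, 3, 8, 9, 10, 11, 8, 13, 14, 15]
G_LIST = [11, 14, 13, 14, 11, 10, 1, 8, 7, 6, 5, 4, 3, 2, 1, 0]

def from_list(images, n=4):
    def func(x):
        index = int("".join(map(str, x)), 2)          # (0,0,1,1) -> 3
        return tuple(int(b) for b in format(images[index], "0{}b".format(n)))
    return func

f, g = from_list(F_LIST), from_list(G_LIST)
print(g((0, 0, 1, 1)))   # (1, 1, 1, 0), i.e. 14
print(f((1, 1, 1, 1)))   # (1, 1, 1, 1): a fixed point, hence Gamma(f)
                         # cannot be strongly connected
\end{verbatim}
Combined with the strong connectivity check sketched in the previous
sections, this is how the chaotic and non-chaotic functions used below can
be told apart.
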
This section presents how (non-)chaotic iterations of $G_f$ are
translated into another model more suited to artificial neural
networks, namely fixed-size numerical input and output vectors.
799 Formally, input and output vectors are pairs~$((S^t)^{t \in
800 \Nats},x)$ and $\left(\sigma((S^t)^{t \in \Nats}),F_{f}(S^0,x)\right)$
801 as defined in~Eq.~(\ref{eq:Gf}).
803 Firstly, let us focus on how to memorize configurations. Two distinct
804 translations are proposed. In the first case, we take one input in
805 $\Bool$ per component; in the second case, configurations are
memorized as natural numbers. A coarse attempt to memorize a
configuration as a natural number could consist in labeling each
configuration with its translation into the decimal numeral system.
However, such a representation induces too many changes between a
configuration labeled by a power of two and the configuration that
directly precedes it: for instance, 16~(10000) and 15~(01111) are close
in the decimal ordering, but their Hamming distance is 5. This is why Gray
813 codes~\cite{Gray47} have been preferred.
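A minimal Python sketch (ours) of the binary reflected Gray code used here
shows how it removes the discontinuity between 15 and 16 mentioned above.
\begin{verbatim}
# Illustrative sketch: binary reflected Gray code.
def gray(k):
    return k ^ (k >> 1)

def hamming(a, b):
    return bin(a ^ b).count("1")

print(hamming(15, 16))              # 5: all bits differ between 01111 and 10000
print(hamming(gray(15), gray(16)))  # 1: only one bit differs in Gray coding
\end{verbatim}
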
815 Let us secondly detail how to deal with strategies. Obviously, it is not
816 possible to translate in a finite way an infinite strategy, even if
817 both $(S^t)^{t \in \Nats}$ and $\sigma((S^t)^{t \in \Nats})$ belong to
$\{1,\ldots,n\}^{\Nats}$. Input strategies are then restricted to a finite
length $l \in \llbracket 2,k\rrbracket$, where $k$ is a
820 parameter of the evaluation. Notice that $l$ is greater than or equal
821 to $2$ since we do not want the shift $\sigma$~function to return an
822 empty strategy. Strategies are memorized as natural numbers expressed
823 in base $n+1$. At each iteration, either none or one component is
824 modified (among the $n$ components) leading to a radix with $n+1$
entries. Finally, we give another input, namely $m \in \llbracket
1,l-1\rrbracket$, which is the number of successive iterations that
are applied starting from $x$. Outputs are translated with the same
rules.
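For illustration, the sketch below (ours; the paper only states that
strategies are encoded in base $n+1$, so the digit ordering chosen here is an
assumption) encodes a finite strategy as a natural number.
\begin{verbatim}
# Illustrative sketch: encode a finite strategy (S^0, ..., S^{l-1}),
# with S^t in {1,...,n}, as a natural number in base n+1.
# Taking S^0 as the least significant digit is our assumption.
def encode_strategy(strategy, n):
    return sum(s * (n + 1) ** t for t, s in enumerate(strategy))

print(encode_strategy([1, 3], 4))     # 1 + 3*5 = 16
print(encode_strategy([2, 4, 1], 4))  # 2 + 4*5 + 1*25 = 47
\end{verbatim}
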
830 To address the complexity issue of the problem, let us compute the
831 size of the data set an ANN has to deal with. Each input vector of an
832 input-output pair is composed of a configuration~$x$, an excerpt $S$
833 of the strategy to iterate of size $l \in \llbracket 2, k\rrbracket$,
and a number $m \in \llbracket 1, l-1\rrbracket$ of iterations that
are applied.

837 Firstly, there are $2^n$ configurations $x$, with $n^l$ strategies of
838 size $l$ for each of them. Secondly, for a given configuration there
839 are $\omega = 1 \times n^2 + 2 \times n^3 + \ldots+ (k-1) \times n^k$
ways of writing the pair $(m,S)$. Furthermore, it is not hard to
show that
\begin{equation}
\displaystyle{(n-1) \times \omega = (k-1)\times n^{k+1} - \sum_{i=2}^k n^i} \enspace , \nonumber
\end{equation}
\noindent and thus
\begin{equation}
\omega = \dfrac{(k-1)\times n^{k+1}}{n-1} - \dfrac{n^{k+1}-n^2}{(n-1)^2} \enspace . \nonumber
\end{equation}
\noindent Finally, the number of input-output pairs for our
ANNs is
\begin{equation}
2^n \times \left(\dfrac{(k-1)\times n^{k+1}}{n-1} - \dfrac{n^{k+1}-n^2}{(n-1)^2}\right) \enspace . \nonumber
\end{equation}
855 For instance, for $4$ binary components and a strategy of at most
856 $3$~terms we obtain 2304~input-output pairs.
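This count can be double-checked by brute-force enumeration, as in the
following sketch (ours).
\begin{verbatim}
# Illustrative sketch: count the input-output pairs (x, S, m) for n
# components and strategies of length 2 <= l <= k.
def count_pairs(n, k):
    # n^l strategies of length l, (l-1) choices of m, 2^n configurations
    return sum((l - 1) * n ** l * 2 ** n for l in range(2, k + 1))

def closed_form(n, k):
    return 2 ** n * ((k - 1) * n ** (k + 1) / (n - 1)
                     - (n ** (k + 1) - n ** 2) / (n - 1) ** 2)

print(count_pairs(4, 3), closed_form(4, 3))   # 2304 2304.0
\end{verbatim}
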
858 \subsection{Experiments}
859 \label{section:experiments}
861 To study if chaotic iterations can be predicted, we choose to train
862 the MultiLayer Perceptron. As stated before, this kind of network is
863 in particular well-known for its universal approximation
864 property. Furthermore, MLPs have been already considered for chaotic
865 time series prediction. For example, in~\cite{dalkiran10} the authors
866 have shown that a feedforward MLP with two hidden layers, and trained
with Bayesian regularization back-propagation, can successfully learn the
868 dynamics of Chua's circuit.
870 In these experiments we consider MLPs having one hidden layer of
871 sigmoidal neurons and output neurons with a linear activation
872 function. They are trained using the Limited-memory
Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) quasi-Newton algorithm in
combination with the Wolfe line search. The training process is performed until
875 a maximum number of epochs is reached. To prevent overfitting and to
876 estimate the generalization performance we use holdout validation by
877 splitting the data set into learning, validation, and test subsets.
878 These subsets are obtained through random selection such that their
879 respective size represents 65\%, 10\%, and 25\% of the whole data set.
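As an indication of how such a setup can be reproduced, the sketch below
(ours) relies on the scikit-learn library, which is only one possible
implementation choice and not necessarily the tool behind the reported
results; the arrays \texttt{X} and \texttt{Y} stand for the encoded
input-output pairs of Sect.~\ref{section:translation} and are replaced here
by random placeholders.
\begin{verbatim}
# Illustrative sketch: one hidden layer of sigmoidal neurons, linear
# outputs, L-BFGS training, and a 65% / 10% / 25% holdout split.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(2304, 6)).astype(float)   # placeholder inputs
Y = rng.integers(0, 2, size=(2304, 5)).astype(float)   # placeholder outputs

X_tr, X_tmp, Y_tr, Y_tmp = train_test_split(X, Y, test_size=0.35,
                                            random_state=0)
X_val, X_te, Y_val, Y_te = train_test_split(X_tmp, Y_tmp, test_size=25 / 35,
                                            random_state=0)
# X_val / Y_val would serve for model selection (not shown here)

mlp = MLPRegressor(hidden_layer_sizes=(10,), activation="logistic",
                   solver="lbfgs", max_iter=500, random_state=0)
mlp.fit(X_tr, Y_tr)

# per-output success rate, predictions rounded to the nearest integer value
success = (np.rint(mlp.predict(X_te)) == Y_te).mean(axis=0)
print(success)
\end{verbatim}
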
881 Several neural networks are trained for both iterations coding
882 schemes. In both cases iterations have the following layout:
883 configurations of four components and strategies with at most three
884 terms. Thus, for the first coding scheme a data set pair is composed
885 of 6~inputs and 5~outputs, while for the second one it is respectively
886 3~inputs and 2~outputs. As noticed at the end of the previous section,
887 this leads to data sets that consist of 2304~pairs. The networks
888 differ in the size of the hidden layer and the maximum number of
training epochs. Recall that, to evaluate the ability of neural
networks to predict a chaotic behavior for each coding scheme, the
trainings of two data sets, one of them describing chaotic iterations,
are carried out.

894 Thereafter we give, for the different learning setups and data sets,
895 the mean prediction success rate obtained for each output. These
values are computed considering 10~trainings with random subset
construction and random weight and bias initialization. Firstly, neural
898 networks having 10 and 25~hidden neurons are trained, with a maximum
899 number of epochs that takes its value in $\{125,250,500\}$ (see
900 Tables~\ref{tab1} and \ref{tab2}). Secondly, we refine the second
901 coding scheme by splitting the output vector such that each output is
902 learned by a specific neural network (Table~\ref{tab3}). In this last
903 case, we increase the size of the hidden layer up to 40~neurons, and
we consider larger numbers of epochs.
\caption{Prediction success rates for configurations expressed as Boolean vectors.}
910 \begin{tabular}{|c|c||c|c|c|}
\multicolumn{5}{|c|}{Network topology: 6~inputs, 5~outputs and one hidden layer} \\
915 \multicolumn{2}{|c||}{Hidden neurons} & \multicolumn{3}{c|}{10 neurons} \\
917 \multicolumn{2}{|c||}{Epochs} & 125 & 250 & 500 \\
919 \multirow{6}{*}{\rotatebox{90}{Chaotic}}&Output~(1) & 90.92\% & 91.75\% & 91.82\% \\
920 & Output~(2) & 69.32\% & 78.46\% & 82.15\% \\
921 & Output~(3) & 68.47\% & 78.49\% & 82.22\% \\
922 & Output~(4) & 91.53\% & 92.37\% & 93.4\% \\
923 & Config. & 36.10\% & 51.35\% & 56.85\% \\
924 & Strategy~(5) & 1.91\% & 3.38\% & 2.43\% \\
926 \multirow{6}{*}{\rotatebox{90}{Non-chaotic}}&Output~(1) & 97.64\% & 98.10\% & 98.20\% \\
927 & Output~(2) & 95.15\% & 95.39\% & 95.46\% \\
928 & Output~(3) & 100\% & 100\% & 100\% \\
929 & Output~(4) & 97.47\% & 97.90\% & 97.99\% \\
930 & Config. & 90.52\% & 91.59\% & 91.73\% \\
931 & Strategy~(5) & 3.41\% & 3.40\% & 3.47\% \\
934 \multicolumn{2}{|c||}{Hidden neurons} & \multicolumn{3}{c|}{25 neurons} \\ %& \multicolumn{3}{|c|}{40 neurons} \\
936 \multicolumn{2}{|c||}{Epochs} & 125 & 250 & 500 \\ %& 125 & 250 & 500 \\
938 \multirow{6}{*}{\rotatebox{90}{Chaotic}}&Output~(1) & 91.65\% & 92.69\% & 93.93\% \\ %& 91.94\% & 92.89\% & 94.00\% \\
939 & Output~(2) & 72.06\% & 88.46\% & 90.5\% \\ %& 74.97\% & 89.83\% & 91.14\% \\
940 & Output~(3) & 79.19\% & 89.83\% & 91.59\% \\ %& 76.69\% & 89.58\% & 91.84\% \\
941 & Output~(4) & 91.61\% & 92.34\% & 93.47\% \\% & 82.77\% & 92.93\% & 93.48\% \\
942 & Config. & 48.82\% & 67.80\% & 70.97\% \\%& 49.46\% & 68.94\% & 71.11\% \\
943 & Strategy~(5) & 2.62\% & 3.43\% & 3.78\% \\% & 3.10\% & 3.10\% & 3.03\% \\
945 \multirow{6}{*}{\rotatebox{90}{Non-chaotic}}&Output~(1) & 97.87\% & 97.99\% & 98.03\% \\ %& 98.16\% \\
946 & Output~(2) & 95.46\% & 95.84\% & 96.75\% \\ % & 97.4\% \\
947 & Output~(3) & 100\% & 100\% & 100\% \\%& 100\% \\
948 & Output~(4) & 97.77\% & 97.82\% & 98.06\% \\%& 98.31\% \\
949 & Config. & 91.36\% & 91.99\% & 93.03\% \\%& 93.98\% \\
950 & Strategy~(5) & 3.37\% & 3.44\% & 3.29\% \\%& 3.23\% \\
956 Table~\ref{tab1} presents the rates obtained for the first coding
957 scheme. For the chaotic data, it can be seen that as expected
958 configuration prediction becomes better when the number of hidden
959 neurons and maximum epochs increases: an improvement by a factor two
960 is observed (from 36.10\% for 10~neurons and 125~epochs to 70.97\% for
961 25~neurons and 500~epochs). We also notice that the learning of
962 outputs~(2) and~(3) is more difficult. Conversely, for the
963 non-chaotic case the simplest training setup is enough to predict
964 configurations. For all network topologies and all outputs the
965 obtained results for the non-chaotic case outperform the chaotic
966 ones. Finally, the rates for the strategies show that the different
967 networks are unable to learn them.
969 For the second coding scheme (\textit{i.e.}, with Gray Codes)
970 Table~\ref{tab2} shows that any network
971 learns about five times more non-chaotic configurations than chaotic
972 ones. As in the previous scheme, the strategies cannot be predicted.
Let us now compare the two coding schemes. Clearly, the second scheme
disturbs the learning process. Indeed, in this scheme the whole
configuration is compressed into a single natural number, whereas in
the first one each component has its own input, so that the number of
inputs grows with the size of the Boolean vectors coding the
configurations. The latter case thus provides finer information on the
evolution of each component of a configuration.
982 \caption{Prediction success rates for configurations expressed with Gray code}
985 \begin{tabular}{|c|c||c|c|c|}
\multicolumn{5}{|c|}{Network topology: 3~inputs, 2~outputs and one hidden layer} \\
990 & Hidden neurons & \multicolumn{3}{c|}{10 neurons} \\
992 & Epochs & 125 & 250 & 500 \\ %& 1000
994 \multirow{2}{*}{Chaotic}& Config.~(1) & 13.29\% & 13.55\% & 13.08\% \\ %& 12.5\%
995 & Strategy~(2) & 0.50\% & 0.52\% & 1.32\% \\ %& 1.42\%
997 \multirow{2}{*}{Non-Chaotic}&Config.~(1) & 77.12\% & 74.00\% & 72.60\% \\ %& 75.81\%
998 & Strategy~(2) & 0.42\% & 0.80\% & 1.16\% \\ %& 1.42\%
1001 & Hidden neurons & \multicolumn{3}{c|}{25 neurons} \\
1003 & Epochs & 125 & 250 & 500 \\ %& 1000
1005 \multirow{2}{*}{Chaotic}& Config.~(1) & 12.27\% & 13.15\% & 13.05\% \\ %& 15.44\%
1006 & Strategy~(2) & 0.71\% & 0.66\% & 0.88\% \\ %& 1.73\%
1008 \multirow{2}{*}{Non-Chaotic}&Config.~(1) & 73.60\% & 74.70\% & 75.89\% \\ %& 68.32\%
1009 & Strategy~(2) & 0.64\% & 0.97\% & 1.23\% \\ %& 1.80\%
1014 Unfortunately, in practical applications the number of components is
1015 usually unknown. Hence, the first coding scheme cannot be used
1016 systematically. Therefore, we provide a refinement of the second
1017 scheme: each output is learned by a different ANN. Table~\ref{tab3}
1018 presents the results for this approach. In any case, whatever the
1019 network topologies, the maximum epoch number and the kind of
1020 iterations, the configuration success rate is slightly improved.
Moreover, the strategy prediction rates reach almost 12\%, whereas
in Table~\ref{tab2} they never exceed 1.5\%. Despite this
improvement, a long-term prediction of chaotic iterations still
appears to be out of reach.
1028 \caption{Prediction success rates for split outputs.}
1031 \begin{tabular}{|c||c|c|c|}
\multicolumn{4}{|c|}{Network topology: 3~inputs, 1~output and one hidden layer} \\
1036 Epochs & 125 & 250 & 500 \\
1039 Chaotic & \multicolumn{3}{c|}{Output = Configuration} \\
1041 10~neurons & 12.39\% & 14.06\% & 14.32\% \\
1042 25~neurons & 13.00\% & 14.28\% & 14.58\% \\
1043 40~neurons & 11.58\% & 13.47\% & 14.23\% \\
1046 Non chaotic & \multicolumn{3}{c|}{Output = Configuration} \\
1048 %Epochs & 125 & 250 & 500 \\
1050 10~neurons & 76.01\% & 74.04\% & 78.16\% \\
1051 25~neurons & 76.60\% & 72.13\% & 75.96\% \\
1052 40~neurons & 76.34\% & 75.63\% & 77.50\% \\
1055 Chaotic/non chaotic & \multicolumn{3}{c|}{Output = Strategy} \\
1057 %Epochs & 125 & 250 & 500 \\
1059 10~neurons & 0.76\% & 0.97\% & 1.21\% \\
1060 25~neurons & 1.09\% & 0.73\% & 1.79\% \\
1061 40~neurons & 0.90\% & 1.02\% & 2.15\% \\
1063 \multicolumn{4}{c}{} \\
1065 Epochs & 1000 & 2500 & 5000 \\
1068 Chaotic & \multicolumn{3}{c|}{Output = Configuration} \\
1070 10~neurons & 14.51\% & 15.22\% & 15.22\% \\
1071 25~neurons & 16.95\% & 17.57\% & 18.46\% \\
1072 40~neurons & 17.73\% & 20.75\% & 22.62\% \\
1075 Non chaotic & \multicolumn{3}{c|}{Output = Configuration} \\
1077 %Epochs & 1000 & 2500 & 5000 \\
1079 10~neurons & 78.98\% & 80.02\% & 79.97\% \\
1080 25~neurons & 79.19\% & 81.59\% & 81.53\% \\
1081 40~neurons & 79.64\% & 81.37\% & 81.37\% \\
1084 Chaotic/non chaotic & \multicolumn{3}{c|}{Output = Strategy} \\
1086 %Epochs & 1000 & 2500 & 5000 \\
1088 10~neurons & 3.47\% & 9.98\% & 11.66\% \\
1089 25~neurons & 3.92\% & 8.63\% & 10.09\% \\
1090 40~neurons & 3.29\% & 7.19\% & 7.18\% \\
1095 \section{Conclusion}
1097 In this paper, we have established an equivalence between chaotic
iterations, according to Devaney's definition of chaos, and a
1099 class of multilayer perceptron neural networks. Firstly, we have
1100 described how to build a neural network that can be trained to learn a
given chaotic map. Then, we have found a condition that allows us to
1102 check whether the iterations induced by a function are chaotic or not,
1103 and thus if a chaotic map is obtained. Thanks to this condition our
1104 approach is not limited to a particular function. In the dual case, we
have shown that checking whether a neural network is chaotic consists in
1106 verifying a property on an associated graph, called the graph of
1107 iterations. These results are valid for recurrent neural networks
1108 with a particular architecture. However, we believe that a similar
1109 work can be done for other neural network architectures. Finally, we
1110 have discovered at least one family of problems with a reasonable
1111 size, such that artificial neural networks should not be applied in
1112 the presence of chaos, due to their inability to learn chaotic
1113 behaviors in this context. Such a consideration is not reduced to a
1114 theoretical detail: this family of discrete iterations is concretely
1115 implemented in a new steganographic method \cite{guyeux10ter}. As
1116 steganographic detectors embed tools like neural networks to
1117 distinguish between original and stego contents, our studies tend to
prove that such detectors might be unable to tackle chaos-based
1119 information hiding schemes. Furthermore, iterations such that not all
1120 of the components are updated at each step are very common in
biological and physical mechanisms. Therefore, one can reasonably
1122 wonder whether neural networks should be applied in these contexts.
1124 In future work we intend to enlarge the comparison between the
1125 learning of truly chaotic and non-chaotic behaviors. Other
1126 computational intelligence tools such as support vector machines will
1127 be investigated too, to discover which tools are the most relevant
when facing a truly chaotic phenomenon. A comparison between learning
success rate and prediction quality will be carried out. Concrete
1130 consequences in biology, physics, and computer science security fields
1131 will be stated. Lastly, thresholds separating systems depending on
1132 the ability to learn their dynamics will be established.
1149 \bibliography{chaos-paper}% Produces the bibliography via BibTeX.
1153 % ****** End of file chaos-paper.tex ******