1 % ****** Start of file chaos-paper.tex ******
6 numerical bibliography, %(default)
7 %jcp,% short, numerical bibliography,
16 \usepackage{graphicx}% Include figure files
17 \usepackage{dcolumn}% Align table columns on decimal point
18 \usepackage{bm}% bold math
21 \usepackage[mathlines]{lineno}% Enable numbering of text and display math
22 \linenumbers\relax % Commence numbering lines
23 % The amssymb package provides various useful mathematical symbols
25 % The amsthm package provides various useful mathematical fonts
27 % Various mathematical packages
28 \PassOptionsToPackage{cmex10draft}{graphics}
29 \usepackage[standard]{ntheorem}
31 \usepackage{stmaryrd} % llbracket
32 \usepackage{dsfont} % mathds
33 \usepackage{multirow} % tables
35 \newcommand{\ie}{\textit{i.e.}}
36 \newcommand{\Nats}[0]{\ensuremath{\mathds{N}}}
37 \newcommand{\R}[0]{\ensuremath{\mathds{R}}}
38 \newcommand{\Bool}[0]{\ensuremath{\mathds{B}}}
42 \title[Neural Networks and Chaos]{Neural Networks and Chaos:
43 Construction, Evaluation of Chaotic Networks, \\
and Prediction of Chaos with Multilayer Feedforward Networks}
47 \author{Jacques M. Bahi}
48 \author{Jean-Fran\c{c}ois Couchot}
49 \author{Christophe Guyeux}
\email{christophe.guyeux@univ-fcomte.fr}
51 \author{Michel Salomon}
52 \altaffiliation[Authors in ]{alphabetic order}
54 Computer Science Laboratory (LIFC), University of Franche-Comt\'e, \\
55 IUT de Belfort-Montb\'eliard, BP 527, \\
56 90016 Belfort Cedex, France
61 \newcommand{\CG}[1]{\begin{color}{red}\textit{#1}\end{color}}
62 \newcommand{\JFC}[1]{\begin{color}{blue}\textit{#1}\end{color}}
63 \newcommand{\MS}[1]{\begin{color}{green}\textit{#1}\end{color}}
67 Many research works deal with chaotic neural networks for various
68 fields of application. Unfortunately, up to now these networks are
69 usually claimed to be chaotic without any mathematical proof. The
70 purpose of this paper is to establish, based on a rigorous theoretical
71 framework, an equivalence between chaotic iterations according to
72 Devaney and a particular class of neural networks. On the one hand we
73 show how to build such a network, on the other hand we provide a
74 method to check if a neural network is a chaotic one. Finally, the
ability of classical feedforward multilayer perceptrons to learn sets
of data obtained from a dynamical system is investigated. Various Boolean
functions are iterated on finite sets of states. Iterations of some of them
are proven to be chaotic as defined by Devaney. In that
context, important differences appear in the training process,
showing with various neural networks that chaotic behaviors are
far more difficult to learn.
84 %%\pacs{Valid PACS appear here}% PACS, the Physics and Astronomy
85 % Classification Scheme.
86 %%\keywords{Suggested keywords}%Use showkeys class option if keyword
91 Chaotic neural networks have received a lot of attention due to the
92 appealing properties of deterministic chaos (unpredictability,
sensitivity, and so on). However, such networks are often claimed to be
chaotic without any rigorous mathematical proof. Therefore, in this
work a theoretical framework based on Devaney's definition of
96 chaos is introduced. Starting with a relationship between discrete
97 iterations and Devaney's chaos, we firstly show how to build a
98 recurrent neural network that is equivalent to a chaotic map and
99 secondly a way to check whether an already available network is
100 chaotic or not. We also study different topological properties of
these truly chaotic neural networks. Finally, we show that, for
neural networks having a classical feedforward structure, learning
chaotic behaviors represented by data sets obtained from chaotic maps
is far more difficult than learning non-chaotic behaviors.
108 \section{Introduction}
111 Several research works have proposed or used chaotic neural networks
in recent years. The complex dynamics of such networks lead to
113 various potential application areas: associative
114 memories~\cite{Crook2007267} and digital security tools like hash
115 functions~\cite{Xiao10}, digital
116 watermarking~\cite{1309431,Zhang2005759}, or cipher
117 schemes~\cite{Lian20091296}. In the former case, the background idea
118 is to control chaotic dynamics in order to store patterns, with the
119 key advantage of offering a large storage capacity. For the latter
120 case, the use of chaotic dynamics is motivated by their
121 unpredictability and random-like behaviors. Indeed, investigating new
122 concepts is crucial for the computer security field, because new
123 threats are constantly emerging. As an illustrative example, the
124 former standard in hash functions, namely the SHA-1 algorithm, has
125 been recently weakened after flaws were discovered.
127 Chaotic neural networks have been built with different approaches. In
128 the context of associative memory, chaotic neurons like the nonlinear
129 dynamic state neuron \cite{Crook2007267} frequently constitute the
130 nodes of the network. These neurons have an inherent chaotic behavior,
131 which is usually assessed through the computation of the Lyapunov
132 exponent. An alternative approach is to consider a well-known neural
133 network architecture: the MultiLayer Perceptron (MLP). These networks
134 are suitable to model nonlinear relationships between data, due to
135 their universal approximator capacity
136 \cite{Cybenko89,DBLP:journals/nn/HornikSW89}. Thus, this kind of
137 networks can be trained to model a physical phenomenon known to be
chaotic such as Chua's circuit \cite{dalkiran10}. Sometimes a neural
network, which is built by combining transfer functions and initial
140 conditions that are both chaotic, is itself claimed to be chaotic
141 \cite{springerlink:10.1007/s00521-010-0432-2}.
143 What all of these chaotic neural networks have in common is that they
144 are claimed to be chaotic despite a lack of any rigorous mathematical
145 proof. The first contribution of this paper is to fill this gap,
using a theoretical framework based on Devaney's definition of
147 chaos \cite{Devaney}. This mathematical theory of chaos provides both
148 qualitative and quantitative tools to evaluate the complex behavior of
149 a dynamical system: ergodicity, expansivity, and so on. More
150 precisely, in this paper, which is an extension of a previous work
151 \cite{bgs11:ip}, we establish the equivalence between chaotic
152 iterations and a class of globally recurrent MLP. The second
contribution is a study of the converse problem: we investigate
154 the ability of classical multilayer perceptrons to learn a particular
155 family of discrete chaotic dynamical systems. This family is defined
156 by a Boolean vector, an update function, and a sequence defining the
157 component to update at each iteration. It has been previously
established that such dynamical systems iterate chaotically (in the
sense of Devaney) when the chosen function has a strongly
connected iteration graph. In this document, we experiment with several
161 MLPs and try to learn some iterations of this kind. We show that
162 non-chaotic iterations can be learned, whereas it is far more
163 difficult for chaotic ones. That is to say, we have discovered at
164 least one family of problems with a reasonable size, such that
165 artificial neural networks should not be applied due to their
166 inability to learn chaotic behaviors in this context.
168 The remainder of this research work is organized as follows. The next
169 section is devoted to the basics of Devaney's chaos. Section~\ref{S2}
170 formally describes how to build a neural network that operates
171 chaotically. Section~\ref{S3} is devoted to the dual case of checking
172 whether an existing neural network is chaotic or not. Topological
173 properties of chaotic neural networks are discussed in Sect.~\ref{S4}.
Section~\ref{section:translation} shows how to translate such
175 iterations into an Artificial Neural Network (ANN), in order to
evaluate the capability of the latter to learn chaotic behaviors.
177 This ability is studied in Sect.~\ref{section:experiments}, where
178 various ANNs try to learn two sets of data: the first one is obtained
179 by chaotic iterations while the second one results from a non-chaotic
180 system. Prediction success rates are given and discussed for the two
181 sets. The paper ends with a conclusion section where our contribution
is summed up and intended future work is outlined.
184 \section{Chaotic Iterations according to Devaney}
186 In this section, the well-established notion of Devaney's mathematical
187 chaos is firstly recalled. Preservation of the unpredictability of
such dynamical systems when implemented on a computer is obtained by
189 using some discrete iterations called ``asynchronous iterations'',
190 which are thus introduced. The result establishing the link between
such iterations and Devaney's chaos is finally presented at the end of
this section.
194 In what follows and for any function $f$, $f^n$ means the composition
195 $f \circ f \circ \hdots \circ f$ ($n$ times) and an {\bf iteration} of
196 a {\bf dynamical system} is the step that consists in updating the
197 global state $x^t$ at time $t$ with respect to a function $f$
198 s.t. $x^{t+1} = f(x^t)$.
200 \subsection{Devaney's chaotic dynamical systems}
Various domains such as physics, biology, or economics contain systems
that exhibit a chaotic behavior; a well-known example is the weather.
204 These systems are in particular highly sensitive to initial
205 conditions, a concept usually presented as the butterfly effect: small
206 variations in the initial conditions possibly lead to widely different
207 behaviors. Theoretically speaking, a system is sensitive if for each
208 point $x$ in the iteration space, one can find a point in each
209 neighborhood of $x$ having a significantly different future evolution.
210 Conversely, a system seeded with the same initial conditions always
211 has the same evolution. In other words, chaotic systems have a
212 deterministic behavior defined through a physical or mathematical
model and a high sensitivity to the initial conditions. Besides,
in mathematics this kind of unpredictability is also referred to as
215 deterministic chaos. For example, many weather forecast models exist,
216 but they give only suitable predictions for about a week, because they
217 are initialized with conditions that reflect only a partial knowledge
218 of the current weather. Even if the differences are initially small,
they are amplified in the course of time, thus making long-term
prediction difficult. In fact, in a chaotic system, an approximation
of the current state is a quite useless indicator for predicting its
future evolution.
From a mathematical point of view, deterministic chaos has been
thoroughly studied over the last decades, with different research works
that have provided various definitions of chaos. Among these
227 definitions, the one given by Devaney~\cite{Devaney} is
228 well-established. This definition consists of three conditions:
topological transitivity, density of periodic points, and sensitive
dependence on initial conditions.
{\bf Topological transitivity} is checked when, for any point, the
future evolution of any of its neighborhoods eventually overlaps any
other given region. This property implies that a dynamical system cannot be
235 broken into simpler subsystems. Intuitively, its complexity does not
236 allow any simplification.
However, chaos needs some regularity to ``counteract'' the effects of
239 transitivity. % Thus, two points close to each other can behave in a completely different manner, the first one visiting the whole space whereas the second one has a regular orbit.
In Devaney's formulation, a dense set of periodic
points is the element of regularity that a chaotic dynamical system has
to exhibit.
243 %\begin{definition} \label{def3}
244 We recall that a point $x$ is a {\bf periodic point} for $f$ of
245 period~$n \in \mathds{N}^{\ast}$ if $f^{n}(x)=x$.
248 %\begin{definition} \label{def4}
249 $f$ is {\bf regular} on the topological space $(\mathcal{X},\tau)$ if
250 the set of periodic points for $f$ is dense in $\mathcal{X}$ (for any
$x \in \mathcal{X}$, we can find at least one periodic point in any of
its neighborhoods).
254 Thus, due to these two properties, two points close to each other can
255 behave in a completely different manner, leading to unpredictability
256 for the whole system.
258 Let us recall that $f$ has {\bf sensitive dependence on initial
259 conditions} if there exists $\delta >0$ such that, for any $x\in
260 \mathcal{X}$ and any neighborhood $V$ of $x$, there exist $y\in V$ and
261 $n > 0$ such that $d\left(f^{n}(x), f^{n}(y)\right) >\delta $. The
262 value $\delta$ is called the {\bf constant of sensitivity} of $f$.
264 Finally, the dynamical system that iterates $f$ is {\bf chaotic
265 according to Devaney} on $(\mathcal{X},\tau)$ if $f$ is regular,
topologically transitive, and has sensitive dependence on its initial
267 conditions. In what follows, iterations are said to be chaotic
268 (according to Devaney) when the corresponding dynamical system is
chaotic, as defined in Devaney's formulation.
271 %Let us notice that for a metric space the last condition follows from
272 %the two first ones~\cite{Banks92}.
274 \subsection{Asynchronous Iterations}
276 %This section presents some basics on topological chaotic iterations.
Let us firstly discuss the iteration domain. As far as we
know, no result guarantees that the chaotic behavior of a dynamical system
that has been theoretically proven on $\R$ remains valid on
floating-point numbers, which constitute the implementation domain. Thus, to
avoid any loss of chaos, this work presents an alternative, namely to
282 iterate Boolean maps: results that are theoretically obtained in that
283 domain are preserved in implementations.
285 Let us denote by $\llbracket a ; b \rrbracket$ the following interval
286 of integers: $\{a, a+1, \hdots, b\}$, where $a~<~b$. In this section,
287 a {\it system} under consideration iteratively modifies a collection
288 of $n$~components. Each component $i \in \llbracket 1; n \rrbracket$
289 takes its value $x_i$ among the domain $\Bool=\{0,1\}$. A {\it
290 configuration} of the system at discrete time $t$ is the vector
292 $x^{t}=(x_1^{t},\ldots,x_{n}^{t}) \in \Bool^n$.
294 The dynamics of the system is described according to a function $f :
295 \Bool^n \rightarrow \Bool^n$ such that
297 $f(x)=(f_1(x),\ldots,f_n(x))$.
299 % Notice that $f^k$ denotes the
300 % $k-$th composition $f\circ \ldots \circ f$ of the function $f$.
Let a configuration $x$ be given. In what follows
$N(i,x)=(x_1,\ldots,\overline{x_i},\ldots,x_n)$ is the configuration
obtained by switching the $i$-th component of $x$ ($\overline{x_i}$ is
305 indeed the negation of $x_i$). Intuitively, $x$ and $N(i,x)$ are
306 neighbors. The discrete iterations of $f$ are represented by the
307 oriented {\it graph of iterations} $\Gamma(f)$. In such a graph,
308 vertices are configurations of $\Bool^n$ and there is an arc labeled
$i$ from $x$ to $N(i,x)$ if and only if $f_i(x)$ is $\overline{x_i}$.
311 In the sequel, the {\it strategy} $S=(S^{t})^{t \in \Nats}$ is the
312 sequence defining which component to update at time $t$ and $S^{t}$
denotes its $t$-th term. This iteration scheme, which only modifies one
element at each iteration, is usually referred to as {\it asynchronous
315 iterations}. More precisely, we have for any $i$, $1\le i \le n$,
\begin{equation}
x_i^{t+1} = \left\{
\begin{array}{ll}
f_i(x^t) & \textrm{if $S^t = i$}\enspace , \\
x_i^t & \textrm{otherwise}\enspace .
\end{array}
\right.
\end{equation}
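
For illustration purposes only, these asynchronous iterations can be
transcribed into the following short Python sketch (a direct
transcription of the definition, with the paper's $1$-based component
indices; it is not the implementation used in our experiments).
\begin{verbatim}
# Illustrative sketch of asynchronous iterations: at time t only the
# component S^t is updated according to f, the others are unchanged.
def asynchronous_iterations(f, x0, strategy):
    """Yield x^0, x^1, ... for a Boolean map f, an initial
    configuration x0 (tuple of 0/1) and a strategy (1-based indices)."""
    x = tuple(x0)
    yield x
    for s in strategy:
        fx = f(x)
        x = tuple(fx[i] if i == s - 1 else x[i] for i in range(len(x)))
        yield x

# Example with the map that negates every component, on n = 4 components.
negation = lambda x: tuple(1 - xi for xi in x)
print(list(asynchronous_iterations(negation, (0, 1, 0, 1), [1, 3, 3, 2])))
\end{verbatim}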
The next section shows the link between asynchronous iterations and
Devaney's chaos.
\subsection{On the link between asynchronous iterations and Devaney's
  chaos}
334 In this subsection we recall the link we have established between
335 asynchronous iterations and Devaney's chaos. The theoretical
336 framework is fully described in \cite{guyeux09}.
338 We introduce the function $F_{f}$ that is defined for any given
339 application $f:\Bool^{n} \to \Bool^{n}$ by $F_{f}:
340 \llbracket1;n\rrbracket\times \mathds{B}^{n} \rightarrow
341 \mathds{B}^{n}$, s.t.
\begin{equation}
F_{f}(s,x)_{j} = \left\{
\begin{array}{ll}
f_j(x) & \textrm{if } j = s \enspace , \\
x_{j} & \textrm{otherwise} \enspace .
\end{array}
\right.
\end{equation}
353 \noindent With such a notation, asynchronously obtained configurations
354 are defined for times \linebreak $t=0,1,2,\ldots$ by:
\begin{equation}\label{eq:sync}
\left\{\begin{array}{l}
x^{0}\in \mathds{B}^{n} \textrm{ and}\\
x^{t+1}=F_{f}(S^t,x^{t}) \enspace .
\end{array}\right.
\end{equation}
362 \noindent Finally, iterations defined in Eq.~(\ref{eq:sync}) can be
363 described by the following system:
\begin{equation}
\label{eq:Gf}
\left\{\begin{array}{lcl}
X^{0} & = & ((S^t)^{t \in \Nats},x^0) \in
\llbracket1;n\rrbracket^\Nats \times \Bool^{n}\\
X^{k+1}& = & G_{f}(X^{k})\\
\multicolumn{3}{c}{\textrm{where } G_{f}\left(((S^t)^{t \in \Nats},x)\right)
= \left(\sigma((S^t)^{t \in \Nats}),F_{f}(S^0,x)\right) \enspace ,}
\end{array}\right.
\end{equation}
376 where $\sigma$ is the so-called shift function that removes the first
term of the strategy ({\it i.e.},~$S^0$). This definition allows one to
378 link asynchronous iterations with classical iterations of a dynamical
379 system. Note that it can be extended by considering subsets for $S^t$.
381 To study topological properties of these iterations, we are then left
382 to introduce a {\bf distance} $d$ between two points $(S,x)$ and
383 $(\check{S},\check{x})$ in $\mathcal{X} =
384 \llbracket1;n\rrbracket^\Nats \times \Bool^{n}$. Let $\Delta(x,y) = 0$
if $x=y$, and $\Delta(x,y) = 1$ otherwise, be a distance on
386 $\mathds{B}$. The distance $d$ is defined by
\begin{equation}
d((S,x);(\check{S},\check{x}))=d_{e}(x,\check{x})+d_{s}(S,\check{S})
\enspace , \nonumber
\end{equation}
\noindent where
\begin{equation}
d_{e}(x,\check{x})=\sum_{j=1}^{n}\Delta
(x_{j},\check{x}_{j}) \in \llbracket 0 ; n \rrbracket
\nonumber
\end{equation}
\noindent and
\begin{equation}
d_{s}(S,\check{S})=\frac{9}{2n}\sum_{t=0}^{\infty
}\frac{|S^{t}-\check{S}^{t}|}{10^{t+1}} \in [0 ; 1] \enspace .
\nonumber
\end{equation}
402 This distance is defined to reflect the following
information. Firstly, the more components two systems differ in,
404 the larger the distance between them. Secondly, two systems with
405 similar components and strategies, which have the same starting terms,
406 must induce only a small distance. The proposed distance fulfills
407 these requirements: on the one hand its floor value reflects the
408 difference between the cells, on the other hand its fractional part
409 measures the difference between the strategies.
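
As an illustration (and only as such), the distance $d$ can be
evaluated numerically as follows; the infinite sum defining $d_s$ is
truncated after a fixed number of terms, which changes the result by
less than $10^{-\mathrm{depth}}$.
\begin{verbatim}
# Illustrative computation of d((S,x);(S',x')): integral part d_e,
# fractional part d_s (sum truncated after `depth` terms).
def distance(x, s, x_check, s_check, n, depth=50):
    d_e = sum(1 for a, b in zip(x, x_check) if a != b)
    d_s = (9.0 / (2 * n)) * sum(
        abs(s[t] - s_check[t]) / 10 ** (t + 1) for t in range(depth))
    return d_e + d_s

# Two points differing in one cell and in the second strategy term.
x, xc = (0, 1, 0, 1), (0, 1, 1, 1)
s, sc = [1, 2, 3, 4] * 25, [1, 3, 3, 4] * 25
print(distance(x, s, xc, sc, n=4))   # 1 + (9/8)*(1/100) = 1.01125
\end{verbatim}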
411 The relation between $\Gamma(f)$ and $G_f$ is obvious: there exists a
412 path from $x$ to $x'$ in $\Gamma(f)$ if and only if there exists a
413 strategy $s$ such that iterations of $G_f$ from the initial point
414 $(s,x)$ reach the configuration $x'$. Using this link,
415 Guyeux~\cite{GuyeuxThese10} has proven that,
416 \begin{theorem}%[Characterization of $\mathcal{C}$]
417 \label{Th:Caracterisation des IC chaotiques}
418 Let $f:\Bool^n\to\Bool^n$. Iterations of $G_f$ are chaotic according
to Devaney if and only if $\Gamma(f)$ is strongly connected.
\end{theorem}
Checking whether a graph is strongly connected is not difficult (for
instance with Tarjan's algorithm). Let us now consider a strategy $S$ and a
424 function $f$ such that $\Gamma(f)$ is strongly connected. In that
425 case, iterations of the function $G_f$ as defined in Eq.~(\ref{eq:Gf})
426 are chaotic according to Devaney.
429 Let us then define two functions $f_0$ and $f_1$ both in
$\Bool^n\to\Bool^n$ that are used throughout this paper. The former is
431 the vectorial negation, \textit{i.e.}, $f_{0}(x_{1},\dots,x_{n})
432 =(\overline{x_{1}},\dots,\overline{x_{n}})$. The latter is
433 $f_1\left(x_1,\dots,x_n\right)=\left(
434 \overline{x_1},x_1,x_2,\dots,x_{n-1}\right)$. It is not hard to check
that $\Gamma(f_0)$ and $\Gamma(f_1)$ are both strongly connected, so that
iterations of $G_{f_0}$ and of $G_{f_1}$ are chaotic according to
Devaney.
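
To illustrate how this characterization can be exploited in practice,
the following Python sketch (provided for illustration only) builds
the graph of iterations $\Gamma(f)$ of a map $f:\Bool^n\to\Bool^n$ and
tests its strong connectivity; for brevity, forward and backward
reachability is used instead of Tarjan's algorithm.
\begin{verbatim}
from collections import deque
from itertools import product

def iteration_graph(f, n):
    """Adjacency lists of Gamma(f): arc x -> N(i,x) iff f_i(x) != x_i."""
    graph = {}
    for x in product((0, 1), repeat=n):
        fx = f(x)
        graph[x] = [x[:i] + (1 - x[i],) + x[i + 1:]
                    for i in range(n) if fx[i] != x[i]]
    return graph

def strongly_connected(graph):
    """True iff every vertex is reachable from every other one."""
    def reachable(adj, start):
        seen, todo = {start}, deque([start])
        while todo:
            for v in adj[todo.popleft()]:
                if v not in seen:
                    seen.add(v)
                    todo.append(v)
        return seen
    start = next(iter(graph))
    reverse = {v: [] for v in graph}
    for u, succs in graph.items():
        for v in succs:
            reverse[v].append(u)
    return len(reachable(graph, start)) == len(graph) \
        == len(reachable(reverse, start))

f0 = lambda x: tuple(1 - xi for xi in x)             # vectorial negation
print(strongly_connected(iteration_graph(f0, 4)))    # True
\end{verbatim}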
439 With this material, we are now able to build a first chaotic neural
network, as defined in Devaney's formulation.
\section{A chaotic neural network in the sense of Devaney}
\label{S2}
445 Let us build a multilayer perceptron neural network modeling
446 $F_{f_0}:\llbracket 1; n \rrbracket \times \mathds{B}^n \to
447 \mathds{B}^n$ associated to the vectorial negation. More precisely,
448 for all inputs $(s,x) \in \llbracket 1;n\rrbracket \times
449 \mathds{B}^n$, the output layer will produce $F_{f_0}(s,x)$. It is
450 then possible to link the output layer and the input one, in order to
451 model the dependence between two successive iterations. As a result
we obtain a globally recurrent neural network that behaves as follows
453 (see Fig.~\ref{Fig:perceptron}).
456 \item The network is initialized with the input vector
457 $\left(S^0,x^0\right) \in \llbracket 1;n\rrbracket \times
458 \mathds{B}^n$ and computes the output vector
459 $x^1=F_{f_0}\left(S^0,x^0\right)$. This last vector is published as
460 an output one of the chaotic neural network and is sent back to the
461 input layer through the feedback links.
462 \item When the network is activated at the $t^{th}$ iteration, the
463 state of the system $x^t \in \mathds{B}^n$ received from the output
464 layer and the initial term of the sequence $(S^t)^{t \in \Nats}$
465 (\textit{i.e.}, $S^0 \in \llbracket 1;n\rrbracket$) are used to
466 compute the new output vector. This new vector, which represents
467 the new state of the dynamical system, satisfies:
\begin{equation}
x^{t+1}=F_{f_0}(S^0, x^t) \in \mathds{B}^n \enspace .
\end{equation}
\begin{figure}
\includegraphics[scale=0.625]{perceptron}
\caption{A perceptron equivalent to chaotic iterations}
\label{Fig:perceptron}
\end{figure}
480 The behavior of the neural network is such that when the initial state
481 is $x^0~\in~\mathds{B}^n$ and a sequence $(S^t)^{t \in \Nats}$ is
482 given as outside input, then the sequence of successive published
483 output vectors $\left(x^t\right)^{t \in \mathds{N}^{\ast}}$ is exactly
484 the one produced by the chaotic iterations formally described in
Eq.~(\ref{eq:Gf}). It means that, mathematically, if we use the same
486 input vectors they both generate the same successive outputs
487 $\left(x^t\right)^{t \in \mathds{N}^{\ast}}$, and therefore that they
488 are equivalent reformulations of the iterations of $G_{f_0}$ in
489 $\mathcal{X}$. Finally, since the proposed neural network is built to
490 model the behavior of $G_{f_0}$, whose iterations are chaotic
according to Devaney's definition of chaos, we can conclude that
492 the network is also chaotic in this sense.
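
To make this construction concrete, the following sketch (given as an
illustration, not the network actually trained below) hand-wires a
perceptron with one hidden layer of threshold units that computes
$F_{f_0}$ exactly: with $s$ one-hot encoded, output $j$ equals
$x_j \oplus s_j$, so that only the selected component is negated;
feeding the outputs back as inputs then reproduces the iterations of
$G_{f_0}$.
\begin{verbatim}
import numpy as np

def heaviside(z):
    return (z > 0).astype(int)

def network_step(x, s, n):
    """One pass through the hand-built perceptron, x in {0,1}^n, s in 1..n."""
    a, b = np.asarray(x), np.eye(n, dtype=int)[s - 1]   # b: one-hot of s
    h_or  = heaviside(a + b - 0.5)          # hidden layer: OR of (x_j, s_j)
    h_and = heaviside(a + b - 1.5)          # hidden layer: AND of (x_j, s_j)
    return heaviside(h_or - h_and - 0.5)    # output: XOR = OR and not AND

# Recurrent use: the output is fed back while the strategy is external.
x, n = np.array([0, 1, 0, 1]), 4
for s in [2, 4, 2, 1]:
    x = network_step(x, s, n)
    print(s, x)
\end{verbatim}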
494 The previous construction scheme is not restricted to function $f_0$.
495 It can be extended to any function $f$ such that $G_f$ is a chaotic
496 map by training the network to model $F_{f}:\llbracket 1; n \rrbracket
497 \times \mathds{B}^n \to \mathds{B}^n$. Due to
498 Theorem~\ref{Th:Caracterisation des IC chaotiques}, we can find
499 alternative functions $f$ for $f_0$ through a simple check of their
500 graph of iterations $\Gamma(f)$. For example, we can build another
501 chaotic neural network by using $f_1$ instead of $f_0$.
\section{Checking whether a neural network is chaotic or not}
\label{S3}
506 We focus now on the case where a neural network is already available,
507 and for which we want to know if it is chaotic. Typically, in many
research papers neural networks are usually claimed to be chaotic
509 without any convincing mathematical proof. We propose an approach to
510 overcome this drawback for a particular category of multilayer
perceptrons defined below, and for Devaney's formulation of chaos.
512 In spite of this restriction, we think that this approach can be
513 extended to a large variety of neural networks.
515 We consider a multilayer perceptron of the following form: inputs are
516 $n$ binary digits and one integer value, while outputs are $n$ bits.
Moreover, each binary output is connected with a feedback connection
to an input one.
521 \item During initialization, the network is seeded with $n$~bits
522 denoted $\left(x^0_1,\dots,x^0_n\right)$ and an integer value $S^0$
523 that belongs to $\llbracket1;n\rrbracket$.
524 \item At iteration~$t$, the last output vector
525 $\left(x^t_1,\dots,x^t_n\right)$ defines the $n$~bits used to
compute the new output one $\left(x^{t+1}_1,\dots,x^{t+1}_n\right)$,
while the remaining input receives a new integer value $S^t \in
528 \llbracket1;n\rrbracket$, which is provided by the outside world.
531 The topological behavior of these particular neural networks can be
532 proven to be chaotic through the following process. Firstly, we denote
533 by $F: \llbracket 1;n \rrbracket \times \mathds{B}^n \rightarrow
534 \mathds{B}^n$ the function that maps the value
535 $\left(s,\left(x_1,\dots,x_n\right)\right) \in \llbracket 1;n
536 \rrbracket \times \mathds{B}^n$ into the value
537 $\left(y_1,\dots,y_n\right) \in \mathds{B}^n$, where
538 $\left(y_1,\dots,y_n\right)$ is the response of the neural network
539 after the initialization of its input layer with
540 $\left(s,\left(x_1,\dots, x_n\right)\right)$. Secondly, we define $f:
541 \mathds{B}^n \rightarrow \mathds{B}^n$ such that
542 $f\left(x_1,x_2,\dots,x_n\right)$ is equal to
$$\left(F\left(1,\left(x_1,x_2,\dots,x_n\right)\right)_1,\dots,
F\left(n,\left(x_1,x_2,\dots,x_n\right)\right)_n\right) \enspace .$$
Thus, for any $j$, $1 \le j \le n$, we have
$f_j\left(x_1,x_2,\dots,x_n\right) =
F\left(j,\left(x_1,x_2,\dots,x_n\right)\right)_j$.
550 If this recurrent neural network is seeded with
551 $\left(x_1^0,\dots,x_n^0\right)$ and $S \in \llbracket 1;n
552 \rrbracket^{\mathds{N}}$, it produces exactly the
same output vectors as the
554 chaotic iterations of $F_f$ with initial
555 condition $\left(S,(x_1^0,\dots, x_n^0)\right) \in \llbracket 1;n
556 \rrbracket^{\mathds{N}} \times \mathds{B}^n$.
557 Theoretically speaking, such iterations of $F_f$ are thus a formal
model of this kind of recurrent neural network. In the rest of this
560 paper, we will call such multilayer perceptrons ``CI-MLP($f$)'', which
564 stands for ``Chaotic Iterations based MultiLayer Perceptron''.
566 Checking if CI-MLP($f$) behaves chaotically according to Devaney's
definition of chaos is simple: we just need to check whether the
568 associated graph of iterations $\Gamma(f)$ is strongly connected or
569 not. As an incidental consequence, we finally obtain an equivalence
570 between chaotic iterations and CI-MLP($f$). Therefore, we can
571 obviously study such multilayer perceptrons with mathematical tools
572 like topology to establish, for example, their convergence or,
on the contrary, their unpredictable behavior. An example of such a study
574 is given in the next section.
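
In practice, this check can be automated as sketched below. The sketch
is only illustrative: it assumes that the network is available as a
black box \texttt{net(s, x)} returning its $n$ output bits, and it
reuses the \texttt{iteration\_graph} and \texttt{strongly\_connected}
helpers of the earlier sketch. The map $f$ is tabulated from the
network responses and $\Gamma(f)$ is then tested for strong
connectivity.
\begin{verbatim}
def is_chaotic(net, n):
    """Tabulate f_j(x) = F(j, x)_j from the network and test Gamma(f)."""
    def f(x):
        return tuple(net(j + 1, x)[j] for j in range(n))
    return strongly_connected(iteration_graph(f, n))

# Example: a "network" that realizes F_{f_1} exactly, with
# f_1(x) = (not x_1, x_1, ..., x_{n-1}).
def f1_net(s, x):
    f1x = (1 - x[0],) + tuple(x[:-1])
    return tuple(f1x[j] if j == s - 1 else x[j] for j in range(len(x)))

print(is_chaotic(f1_net, 4))   # True: CI-MLP(f_1) is chaotic
\end{verbatim}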
\section{Topological properties of chaotic neural networks}
\label{S4}
Let us first recall two fundamental definitions from the mathematical
theory of chaos.
582 \begin{definition} \label{def8}
583 A function $f$ is said to be {\bf expansive} if $\exists
584 \varepsilon>0$, $\forall x \neq y$, $\exists n \in \mathds{N}$ such
that $d\left(f^n(x),f^n(y)\right) \geq \varepsilon$.
\end{definition}
\noindent In other words, a small error on any initial condition is
always amplified up to $\varepsilon$, which denotes the constant of
expansivity.
592 \begin{definition} \label{def9}
593 A discrete dynamical system is said to be {\bf topologically mixing}
594 if and only if, for any pair of disjoint open sets $U$,$V \neq
595 \emptyset$, we can find some $n_0 \in \mathds{N}$ such that for any
$n$, $n\geq n_0$, we have $f^n(U) \cap V \neq \emptyset$.
\end{definition}
\noindent Topological mixing means that the dynamical system evolves
in time in such a way that any given region of its topological space
eventually overlaps with any other region.
603 It has been proven in Ref.~\cite{gfb10:ip} that chaotic iterations are
604 expansive and topologically mixing when $f$ is the vectorial negation
605 $f_0$. Consequently, these properties are inherited by the
606 CI-MLP($f_0$) recurrent neural network previously presented, which
induces a greater unpredictability. In particular, any difference in the
initial value of the input layer is magnified until it reaches
609 the expansivity constant.
611 Let us then focus on the consequences for a neural network to be
612 chaotic according to Devaney's definition. Intuitively, the
613 topological transitivity property implies indecomposability, which is
614 formally defined as follows:
616 \begin{definition} \label{def10}
617 A dynamical system $\left( \mathcal{X}, f\right)$ is {\bf not
618 decomposable} if it is not the union of two closed sets $A, B
\subset \mathcal{X}$ such that $f(A) \subset A, f(B) \subset B$.
\end{definition}
622 \noindent Hence, reducing the set of outputs generated by CI-MLP($f$),
in order to reduce its complexity, is impossible if $\Gamma(f)$ is
strongly connected. Moreover, under this hypothesis CI-MLPs($f$) are
strongly transitive:
627 \begin{definition} \label{def11}
628 A dynamical system $\left( \mathcal{X}, f\right)$ is {\bf strongly
629 transitive} if $\forall x,y \in \mathcal{X}$, $\forall r>0$,
630 $\exists z \in \mathcal{X}$, $d(z,x)~\leq~r \Rightarrow \exists n \in
\mathds{N}^{\ast}$, $f^n(z)=y$.
\end{definition}
634 \noindent According to this definition, for all pairs of points $(x,
635 y)$ in the phase space, a point $z$ can be found in the neighborhood
636 of $x$ such that one of its iterates $f^n(z)$ is $y$. Indeed, this
637 result has been established during the proof of the transitivity
638 presented in Ref.~\cite{guyeux09}. Among other things, the strong
639 transitivity leads to the fact that without the knowledge of the
640 initial input layer, all outputs are possible. Additionally, no point
641 of the output space can be discarded when studying CI-MLPs: this space
is intrinsically complicated and it cannot be decomposed or
simplified.
Furthermore, these recurrent neural networks exhibit the instability
property:
\begin{definition}
A dynamical system $\left( \mathcal{X}, f\right)$ is {\bf unstable} if for
all $x \in \mathcal{X}$, the orbit $\gamma_x:n \in \mathds{N}
\longmapsto f^n(x)$ is unstable, that means: $\exists \varepsilon >
0$, $\forall \delta>0$, $\exists y \in \mathcal{X}$, $\exists n \in
\mathds{N}$, such that $d(x,y)<\delta$ and
$d\left(\gamma_x(n),\gamma_y(n)\right) \geq \varepsilon$.
\end{definition}
\noindent This property, which is implied by the sensitive
dependence on initial conditions, leads to the fact that in all
659 neighborhoods of any point $x$, there are points that can be apart by
660 $\varepsilon$ in the future through iterations of the
661 CI-MLP($f$). Thus, we can claim that the behavior of these MLPs is
662 unstable when $\Gamma (f)$ is strongly connected.
664 Let us now consider a compact metric space $(M, d)$ and $f: M
665 \rightarrow M$ a continuous map. For each natural number $n$, a new
666 metric $d_n$ is defined on $M$ by
$$d_n(x,y)=\max\{d(f^i(x),f^i(y)): 0\leq i<n\} \enspace .$$
671 Given any $\varepsilon > 0$ and $n \geqslant 1$, two points of $M$ are
672 $\varepsilon$-close with respect to this metric if their first $n$
673 iterates are $\varepsilon$-close.
675 This metric allows one to distinguish in a neighborhood of an orbit
676 the points that move away from each other during the iteration from
677 the points that travel together. A subset $E$ of $M$ is said to be
678 $(n, \varepsilon)$-separated if each pair of distinct points of $E$ is
679 at least $\varepsilon$ apart in the metric $d_n$. Denote by $H(n,
680 \varepsilon)$ the maximum cardinality of an $(n,
\varepsilon)$-separated set.
683 The {\bf topological entropy} of the map $f$ is defined by (see e.g.,
684 Ref.~\cite{Adler65} or Ref.~\cite{Bowen})
685 $$h(f)=\lim_{\varepsilon\to 0} \left(\limsup_{n\to \infty}
686 \frac{1}{n}\log H(n,\varepsilon)\right) \enspace .$$
Then we have the following result~\cite{GuyeuxThese10}:
\begin{theorem}
$\left( \mathcal{X},d\right)$ is compact and the topological entropy
of $(\mathcal{X},G_{f_0})$ is infinite.
\end{theorem}
\begin{figure}
\includegraphics[scale=0.5]{scheme}
\caption{Summary of addressed neural networks and chaos problems}
\label{Fig:scheme}
\end{figure}
702 Figure~\ref{Fig:scheme} is a summary of addressed neural networks and
703 chaos problems. In Section~\ref{S2} we have explained how to
construct a truly chaotic neural network, $A$ for
705 instance. Section~\ref{S3} has shown how to check whether a given MLP
706 $A$ or $C$ is chaotic or not in the sense of Devaney, and how to study
707 its topological behavior. The last thing to investigate, when
708 comparing neural networks and Devaney's chaos, is to determine whether
709 an artificial neural network $C$ is able to learn or predict some
chaotic behaviors of $B$, as defined in Devaney's
formulation (when it is not specifically constructed for this
712 purpose). This statement is studied in the next section.
714 \section{Suitability of Feedforward Neural Networks
715 for Predicting Chaotic and Non-chaotic Behaviors}
In the context of computer science, different topic areas have an
interest in chaos, for instance steganographic
719 techniques~\cite{1309431,Zhang2005759}. Steganography consists in
720 embedding a secret message within an ordinary one, while the secret
is extracted at the destination. The reverse ({\it i.e.},
722 automatically detecting the presence of hidden messages inside media)
723 is called steganalysis. Among the deployed strategies inside
detectors, there are support vector
725 machines~\cite{Qiao:2009:SM:1704555.1704664}, neural
726 networks~\cite{10.1109/ICME.2003.1221665,10.1109/CIMSiM.2010.36}, and
727 Markov chains~\cite{Sullivan06steganalysisfor}. Most of these
728 detectors give quite good results and are rather competitive when
729 facing steganographic tools. However, to the best of our knowledge
730 none of the considered information hiding schemes fulfills the Devaney
731 definition of chaos~\cite{Devaney}. Indeed, one can wonder whether
732 detectors continue to give good results when facing truly chaotic
733 schemes. More generally, there remains the open problem of deciding
whether artificial intelligence is suitable for predicting topological
chaotic behaviors.
737 \subsection{Representing Chaotic Iterations for Neural Networks}
738 \label{section:translation}
740 The problem of deciding whether classical feedforward ANNs are
741 suitable to approximate topological chaotic iterations may then be
reduced to evaluating such neural networks on iterations of functions
743 with Strongly Connected Component (SCC)~graph of iterations. To
744 compare with non-chaotic iterations, the experiments detailed in the
745 following sections are carried out using both kinds of function
(chaotic and non-chaotic). Let us emphasize the difference between
this kind of neural network and the Chaotic Iterations based
multilayer perceptron.
We are then left to compute two disjoint function sets, which contain
functions with or without the topological chaos property, depending
on the strong connectivity of their iteration graph. This can be
753 achieved for instance by removing a set of edges from the iteration
754 graph $\Gamma(f_0)$ of the vectorial negation function~$f_0$. One can
deduce whether a function verifies the topological chaos property or
not by checking the strong connectivity of the resulting graph of
iterations.
759 For instance let us consider the functions $f$ and $g$ from $\Bool^4$
760 to $\Bool^4$ respectively defined by the following lists:
761 $$[0, 0, 2, 3, 13, 13, 6, 3, 8, 9, 10, 11, 8, 13, 14,
762 15]$$ $$\mbox{and } [11, 14, 13, 14, 11, 10, 1, 8, 7, 6, 5, 4, 3, 2,
763 1, 0] \enspace.$$ In other words, the image of $0011$ by $g$ is
764 $1110$: it is obtained as the binary value of the fourth element in
765 the second list (namely~14). It is not hard to verify that
766 $\Gamma(f)$ is not SCC (\textit{e.g.}, $f(1111)$ is $1111$) whereas
$\Gamma(g)$ is. The remainder of this section shows how to translate
768 iterations of such functions into a model amenable to be learned by an
769 ANN. Formally, input and output vectors are pairs~$((S^t)^{t \in
770 \Nats},x)$ and $\left(\sigma((S^t)^{t \in
771 \Nats}),F_{f}(S^0,x)\right)$ as defined in~Eq.~(\ref{eq:Gf}).
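
For illustration, the strong connectivity of the two example maps
above can be checked with the helpers of the earlier sketch; the lists
are read as in the example given in the text ($x$ interpreted as an
integer with $x_1$ as most significant bit, each entry being the
integer value of the image).
\begin{verbatim}
def map_from_list(table, n):
    """Boolean map from its truth table given as a list of integers."""
    def f(x):
        value = table[int("".join(map(str, x)), 2)]
        return tuple((value >> (n - 1 - i)) & 1 for i in range(n))
    return f

f_list = [0, 0, 2, 3, 13, 13, 6, 3, 8, 9, 10, 11, 8, 13, 14, 15]
g_list = [11, 14, 13, 14, 11, 10, 1, 8, 7, 6, 5, 4, 3, 2, 1, 0]

print(strongly_connected(iteration_graph(map_from_list(f_list, 4), 4)))
# False: Gamma(f) is not strongly connected
print(strongly_connected(iteration_graph(map_from_list(g_list, 4), 4)))
# True:  Gamma(g) is strongly connected
\end{verbatim}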
773 Firstly, let us focus on how to memorize configurations. Two distinct
774 translations are proposed. In the first case, we take one input in
775 $\Bool$ per component; in the second case, configurations are
memorized as natural numbers. A coarse attempt to memorize a
configuration as a natural number could consist in labeling each
configuration with its translation into the decimal numeral system.
779 However, such a representation induces too many changes between a
780 configuration labeled by a power of two and its direct previous
781 configuration: for instance, 16~(10000) and 15~(01111) are close in a
782 decimal ordering, but their Hamming distance is 5. This is why Gray
783 codes~\cite{Gray47} have been preferred.
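
The following lines only illustrate the property that motivates this
choice, using the standard reflected Gray code (the exact labeling
convention is an implementation detail of the encoding): configurations
receiving consecutive labels differ in a single bit.
\begin{verbatim}
def gray(i):
    """Reflected binary Gray code of the integer i."""
    return i ^ (i >> 1)

# With plain binary labels, the configurations labeled 15 and 16 are
# 01111 and 10000: Hamming distance 5.  With Gray labels, configuration
# gray(i) receives label i, and consecutive labels always differ in
# exactly one bit:
print(format(gray(15), "05b"), format(gray(16), "05b"))   # 01000 11000
\end{verbatim}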
785 Secondly, let us detail how to deal with strategies. Obviously, it is
786 not possible to translate in a finite way an infinite strategy, even
787 if both $(S^t)^{t \in \Nats}$ and $\sigma((S^t)^{t \in \Nats})$ belong
788 to $\{1,\ldots,n\}^{\Nats}$. Input strategies are then reduced to
a length $l \in \llbracket 2,k\rrbracket$, where $k$ is a
790 parameter of the evaluation. Notice that $l$ is greater than or equal
791 to $2$ since we do not want the shift $\sigma$~function to return an
792 empty strategy. Strategies are memorized as natural numbers expressed
793 in base $n+1$. At each iteration, either none or one component is
794 modified (among the $n$ components) leading to a radix with $n+1$
entries. Finally, we give another input, namely $m \in \llbracket
1,l-1\rrbracket$, which is the number of successive iterations that
are applied starting from $x$. Outputs are translated with the same
rules.
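
One plausible reading of this coding scheme is sketched below for
illustration (the exact labeling and packing conventions are
implementation details and may differ from ours): the configuration is
given by its Gray-code label, the strategy excerpt is packed as a
natural number in base $n+1$, and the outputs are obtained by applying
$m$ asynchronous iterations.
\begin{verbatim}
def gray_label(x):                  # label i such that gray(i) = x
    value = int("".join(map(str, x)), 2)
    label, mask = value, value >> 1
    while mask:
        label ^= mask
        mask >>= 1
    return label

def pack_strategy(strategy, n):     # natural number in base n+1
    number = 0
    for s in strategy:
        number = number * (n + 1) + s
    return number

def make_pair(f, x, strategy, m, n):
    inputs = (gray_label(x), pack_strategy(strategy, n), m)
    for s in strategy[:m]:          # m asynchronous iterations
        fx = f(x)
        x = tuple(fx[i] if i == s - 1 else x[i] for i in range(n))
    outputs = (gray_label(x), pack_strategy(strategy[m:], n))
    return inputs, outputs

negation = lambda x: tuple(1 - xi for xi in x)
print(make_pair(negation, (0, 1, 0, 1), [2, 4, 1], m=2, n=4))
# ((6, 71, 2), (0, 1))
\end{verbatim}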
800 To address the complexity issue of the problem, let us compute the
801 size of the data set an ANN has to deal with. Each input vector of an
802 input-output pair is composed of a configuration~$x$, an excerpt $S$
803 of the strategy to iterate of size $l \in \llbracket 2, k\rrbracket$,
and a number $m \in \llbracket 1, l-1\rrbracket$ of iterations that
are applied.
807 Firstly, there are $2^n$ configurations $x$, with $n^l$ strategies of
808 size $l$ for each of them. Secondly, for a given configuration there
809 are $\omega = 1 \times n^2 + 2 \times n^3 + \ldots+ (k-1) \times n^k$
ways of writing the pair $(m,S)$. Furthermore, it is not hard to
establish that
\begin{equation}
\displaystyle{(n-1) \times \omega = (k-1)\times n^{k+1} - \sum_{i=2}^k n^i} \enspace , \nonumber
\end{equation}
\noindent so that
\begin{equation}
\omega = \dfrac{(k-1)\times n^{k+1}}{n-1} - \dfrac{n^{k+1}-n^2}{(n-1)^2} \enspace . \nonumber
\end{equation}
\noindent And then, finally, the number of input-output pairs for our
learning task is
\begin{equation}
2^n \times \left(\dfrac{(k-1)\times n^{k+1}}{n-1} - \dfrac{n^{k+1}-n^2}{(n-1)^2}\right) \enspace .
\end{equation}
825 For instance, for $4$ binary components and a strategy of at most
826 $3$~terms we obtain 2304~input-output pairs.
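
As a quick numerical sanity check of this count:
\begin{verbatim}
n, k = 4, 3
omega = sum((l - 1) * n ** l for l in range(2, k + 1))   # 1*n^2 + 2*n^3
print(2 ** n * omega)   # 2304 input-output pairs
\end{verbatim}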
828 \subsection{Experiments}
829 \label{section:experiments}
To study whether chaotic iterations can be predicted, we choose to train
multilayer perceptrons. As stated before, this kind of network is
833 in particular well-known for its universal approximation property
834 \cite{Cybenko89,DBLP:journals/nn/HornikSW89}. Furthermore, MLPs have
835 been already considered for chaotic time series prediction. For
836 example, in~\cite{dalkiran10} the authors have shown that a
feedforward MLP with two hidden layers, trained with Bayesian
regularization back-propagation, can successfully learn the dynamics of
Chua's circuit.
841 In these experiments we consider MLPs having one hidden layer of
842 sigmoidal neurons and output neurons with a linear activation
843 function. They are trained using the Limited-memory
Broyden-Fletcher-Goldfarb-Shanno quasi-Newton algorithm in combination
with the Wolfe line search. The training process is performed until
846 a maximum number of epochs is reached. To prevent overfitting and to
847 estimate the generalization performance we use holdout validation by
848 splitting the data set into learning, validation, and test subsets.
849 These subsets are obtained through random selection such that their
850 respective size represents 65\%, 10\%, and 25\% of the whole data set.
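
A rough approximation of this setup can be written with the
scikit-learn toolkit as sketched below. This is only an indicative
sketch: the toolkit choice is our assumption, its \texttt{lbfgs}
solver only approximates the exact L-BFGS plus Wolfe line search
procedure described above, and the validation subset is simply held
out since this solver does not use it.
\begin{verbatim}
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

def train_and_score(X, y, hidden=10, epochs=500, seed=0):
    """X, y: numpy arrays of encoded inputs/outputs (see above)."""
    # 65% / 10% / 25% split into learning, validation and test subsets
    X_learn, X_rest, y_learn, y_rest = train_test_split(
        X, y, train_size=0.65, random_state=seed)
    X_valid, X_test, y_valid, y_test = train_test_split(
        X_rest, y_rest, test_size=25 / 35, random_state=seed)
    net = MLPRegressor(hidden_layer_sizes=(hidden,), activation="logistic",
                       solver="lbfgs", max_iter=epochs, random_state=seed)
    net.fit(X_learn, y_learn)
    # mean success rate over the test subset: outputs predicted exactly
    # once rounded to the nearest integer
    return (np.rint(net.predict(X_test)) == y_test).mean()
\end{verbatim}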
852 Several neural networks are trained for both iterations coding
853 schemes. In both cases iterations have the following layout:
854 configurations of four components and strategies with at most three
855 terms. Thus, for the first coding scheme a data set pair is composed
856 of 6~inputs and 5~outputs, while for the second one it is respectively
857 3~inputs and 2~outputs. As noticed at the end of the previous section,
858 this leads to data sets that consist of 2304~pairs. The networks
859 differ in the size of the hidden layer and the maximum number of
training epochs. Recall that, to evaluate the ability of neural
networks to predict a chaotic behavior, for each coding scheme two data
sets are used in the training stage: one of them describes chaotic
iterations, while the other one results from a non-chaotic system.
865 Thereafter we give, for the different learning setups and data sets,
866 the mean prediction success rate obtained for each output. Such a rate
868 represents the percentage of input-output pairs belonging to the test
872 subset for which the corresponding output value was correctly
873 predicted. These values are computed considering 10~trainings with
874 random subsets construction, weights and biases initialization.
875 Firstly, neural networks having 10 and 25~hidden neurons are trained,
876 with a maximum number of epochs that takes its value in
877 $\{125,250,500\}$ (see Tables~\ref{tab1} and \ref{tab2}). Secondly,
878 we refine the second coding scheme by splitting the output vector such
879 that each output is learned by a specific neural network
880 (Table~\ref{tab3}). In this last case, we increase the size of the
881 hidden layer up to 40~neurons and we consider larger number of epochs.
\caption{Prediction success rates for configurations expressed as Boolean vectors.}
\label{tab1}
887 \begin{tabular}{|c|c||c|c|c|}
889 \multicolumn{5}{|c|}{Networks topology: 6~inputs, 5~outputs, and one hidden layer} \\
892 \multicolumn{2}{|c||}{Hidden neurons} & \multicolumn{3}{c|}{10 neurons} \\
894 \multicolumn{2}{|c||}{Epochs} & 125 & 250 & 500 \\
896 \multirow{6}{*}{\rotatebox{90}{Chaotic}}&Output~(1) & 90.92\% & 91.75\% & 91.82\% \\
897 & Output~(2) & 69.32\% & 78.46\% & 82.15\% \\
898 & Output~(3) & 68.47\% & 78.49\% & 82.22\% \\
899 & Output~(4) & 91.53\% & 92.37\% & 93.4\% \\
900 & Config. & 36.10\% & 51.35\% & 56.85\% \\
901 & Strategy~(5) & 1.91\% & 3.38\% & 2.43\% \\
903 \multirow{6}{*}{\rotatebox{90}{Non-chaotic}}&Output~(1) & 97.64\% & 98.10\% & 98.20\% \\
904 & Output~(2) & 95.15\% & 95.39\% & 95.46\% \\
905 & Output~(3) & 100\% & 100\% & 100\% \\
906 & Output~(4) & 97.47\% & 97.90\% & 97.99\% \\
907 & Config. & 90.52\% & 91.59\% & 91.73\% \\
908 & Strategy~(5) & 3.41\% & 3.40\% & 3.47\% \\
911 \multicolumn{2}{|c||}{Hidden neurons} & \multicolumn{3}{c|}{25 neurons} \\ %& \multicolumn{3}{|c|}{40 neurons} \\
913 \multicolumn{2}{|c||}{Epochs} & 125 & 250 & 500 \\ %& 125 & 250 & 500 \\
915 \multirow{6}{*}{\rotatebox{90}{Chaotic}}&Output~(1) & 91.65\% & 92.69\% & 93.93\% \\ %& 91.94\% & 92.89\% & 94.00\% \\
916 & Output~(2) & 72.06\% & 88.46\% & 90.5\% \\ %& 74.97\% & 89.83\% & 91.14\% \\
917 & Output~(3) & 79.19\% & 89.83\% & 91.59\% \\ %& 76.69\% & 89.58\% & 91.84\% \\
918 & Output~(4) & 91.61\% & 92.34\% & 93.47\% \\% & 82.77\% & 92.93\% & 93.48\% \\
919 & Config. & 48.82\% & 67.80\% & 70.97\% \\%& 49.46\% & 68.94\% & 71.11\% \\
920 & Strategy~(5) & 2.62\% & 3.43\% & 3.78\% \\% & 3.10\% & 3.10\% & 3.03\% \\
922 \multirow{6}{*}{\rotatebox{90}{Non-chaotic}}&Output~(1) & 97.87\% & 97.99\% & 98.03\% \\ %& 98.16\% \\
923 & Output~(2) & 95.46\% & 95.84\% & 96.75\% \\ % & 97.4\% \\
924 & Output~(3) & 100\% & 100\% & 100\% \\%& 100\% \\
925 & Output~(4) & 97.77\% & 97.82\% & 98.06\% \\%& 98.31\% \\
926 & Config. & 91.36\% & 91.99\% & 93.03\% \\%& 93.98\% \\
927 & Strategy~(5) & 3.37\% & 3.44\% & 3.29\% \\%& 3.23\% \\
933 Table~\ref{tab1} presents the rates obtained for the first coding
934 scheme. For the chaotic data, it can be seen that as expected
935 configuration prediction becomes better when the number of hidden
neurons and the maximum number of epochs increase: an improvement by a factor of two
937 is observed (from 36.10\% for 10~neurons and 125~epochs to 70.97\% for
938 25~neurons and 500~epochs). We also notice that the learning of
939 outputs~(2) and~(3) is more difficult. Conversely, for the
940 non-chaotic case the simplest training setup is enough to predict
941 configurations. For all these feedforward network topologies and all
942 outputs the obtained results for the non-chaotic case outperform the
943 chaotic ones. Finally, the rates for the strategies show that the
944 different feedforward networks are unable to learn them.
946 For the second coding scheme (\textit{i.e.}, with Gray Codes)
947 Table~\ref{tab2} shows that any network learns about five times more
948 non-chaotic configurations than chaotic ones. As in the previous
949 scheme, the strategies cannot be predicted.
950 Figures~\ref{Fig:chaotic_predictions} and
951 \ref{Fig:non-chaotic_predictions} present the predictions given by two
952 feedforward multilayer perceptrons that were respectively trained to
953 learn chaotic and non-chaotic data, using the second coding scheme.
954 Each figure shows for each sample of the test subset (577~samples,
955 representing 25\% of the 2304~samples) the configuration that should
956 have been predicted and the one given by the multilayer perceptron. It
957 can be seen that for the chaotic data the predictions are far away
958 from the expected configurations. Obviously, the better predictions
959 for the non-chaotic data reflect their regularity.
961 Let us now compare the two coding schemes. Firstly, the second scheme
disturbs the learning process. In fact, in this scheme the
configuration is always expressed as a single natural number, whereas in the
first one the number of inputs grows with the length of the Boolean
vectors coding the configurations. In this latter case, the coding gives
finer information on the configuration evolution.
\caption{Prediction success rates for configurations expressed with Gray code.}
\label{tab2}
975 \begin{tabular}{|c|c||c|c|c|}
977 \multicolumn{5}{|c|}{Networks topology: 3~inputs, 2~outputs, and one hidden layer} \\
980 & Hidden neurons & \multicolumn{3}{c|}{10 neurons} \\
982 & Epochs & 125 & 250 & 500 \\ %& 1000
984 \multirow{2}{*}{Chaotic}& Config.~(1) & 13.29\% & 13.55\% & 13.08\% \\ %& 12.5\%
985 & Strategy~(2) & 0.50\% & 0.52\% & 1.32\% \\ %& 1.42\%
987 \multirow{2}{*}{Non-Chaotic}&Config.~(1) & 77.12\% & 74.00\% & 72.60\% \\ %& 75.81\%
988 & Strategy~(2) & 0.42\% & 0.80\% & 1.16\% \\ %& 1.42\%
991 & Hidden neurons & \multicolumn{3}{c|}{25 neurons} \\
993 & Epochs & 125 & 250 & 500 \\ %& 1000
995 \multirow{2}{*}{Chaotic}& Config.~(1) & 12.27\% & 13.15\% & 13.05\% \\ %& 15.44\%
996 & Strategy~(2) & 0.71\% & 0.66\% & 0.88\% \\ %& 1.73\%
998 \multirow{2}{*}{Non-Chaotic}&Config.~(1) & 73.60\% & 74.70\% & 75.89\% \\ %& 68.32\%
999 & Strategy~(2) & 0.64\% & 0.97\% & 1.23\% \\ %& 1.80\%
\begin{figure}
\includegraphics[scale=0.5]{chaotic_trace2}
\caption{Second coding scheme - Predictions obtained for a chaotic test subset.}
\label{Fig:chaotic_predictions}
\end{figure}
\begin{figure}
\includegraphics[scale=0.5]{non-chaotic_trace2}
\caption{Second coding scheme - Predictions obtained for a non-chaotic test subset.}
\label{Fig:non-chaotic_predictions}
\end{figure}
1018 Unfortunately, in practical applications the number of components is
1019 usually unknown. Hence, the first coding scheme cannot be used
1020 systematically. Therefore, we provide a refinement of the second
1021 scheme: each output is learned by a different ANN. Table~\ref{tab3}
1022 presents the results for this approach. In any case, whatever the
1024 considered feedforward network topologies, the maximum epoch number,
1028 and the kind of iterations, the configuration success rate is slightly
improved. Moreover, the strategy prediction rates reach almost
12\%, whereas in Table~\ref{tab2} they never exceed 1.5\%. Despite
this improvement, a long-term prediction of chaotic iterations still
appears to be an open issue.
\caption{Prediction success rates for split outputs.}
\label{tab3}
1038 \begin{tabular}{|c||c|c|c|}
1040 \multicolumn{4}{|c|}{Networks topology: 3~inputs, 1~output, and one hidden layer} \\
1043 Epochs & 125 & 250 & 500 \\
1046 Chaotic & \multicolumn{3}{c|}{Output = Configuration} \\
1048 10~neurons & 12.39\% & 14.06\% & 14.32\% \\
1049 25~neurons & 13.00\% & 14.28\% & 14.58\% \\
1050 40~neurons & 11.58\% & 13.47\% & 14.23\% \\
1053 Non chaotic & \multicolumn{3}{c|}{Output = Configuration} \\
1055 %Epochs & 125 & 250 & 500 \\
1057 10~neurons & 76.01\% & 74.04\% & 78.16\% \\
1058 25~neurons & 76.60\% & 72.13\% & 75.96\% \\
1059 40~neurons & 76.34\% & 75.63\% & 77.50\% \\
1062 Chaotic/non chaotic & \multicolumn{3}{c|}{Output = Strategy} \\
1064 %Epochs & 125 & 250 & 500 \\
1066 10~neurons & 0.76\% & 0.97\% & 1.21\% \\
1067 25~neurons & 1.09\% & 0.73\% & 1.79\% \\
1068 40~neurons & 0.90\% & 1.02\% & 2.15\% \\
1070 \multicolumn{4}{c}{} \\
1072 Epochs & 1000 & 2500 & 5000 \\
1075 Chaotic & \multicolumn{3}{c|}{Output = Configuration} \\
1077 10~neurons & 14.51\% & 15.22\% & 15.22\% \\
1078 25~neurons & 16.95\% & 17.57\% & 18.46\% \\
1079 40~neurons & 17.73\% & 20.75\% & 22.62\% \\
1082 Non chaotic & \multicolumn{3}{c|}{Output = Configuration} \\
1084 %Epochs & 1000 & 2500 & 5000 \\
1086 10~neurons & 78.98\% & 80.02\% & 79.97\% \\
1087 25~neurons & 79.19\% & 81.59\% & 81.53\% \\
1088 40~neurons & 79.64\% & 81.37\% & 81.37\% \\
1091 Chaotic/non chaotic & \multicolumn{3}{c|}{Output = Strategy} \\
1093 %Epochs & 1000 & 2500 & 5000 \\
1095 10~neurons & 3.47\% & 9.98\% & 11.66\% \\
1096 25~neurons & 3.92\% & 8.63\% & 10.09\% \\
1097 40~neurons & 3.29\% & 7.19\% & 7.18\% \\
1102 \section{Conclusion}
1104 In this paper, we have established an equivalence between chaotic
iterations, according to Devaney's definition of chaos, and a
1106 class of multilayer perceptron neural networks. Firstly, we have
1107 described how to build a neural network that can be trained to learn a
given chaotic map. Secondly, we found a condition that allows one
1109 to check whether the iterations induced by a function are chaotic or
1110 not, and thus if a chaotic map is obtained. Thanks to this condition
1111 our approach is not limited to a particular function. In the dual
1112 case, we show that checking if a neural network is chaotic consists in
1113 verifying a property on an associated graph, called the graph of
1114 iterations. These results are valid for recurrent neural networks
1115 with a particular architecture. However, we believe that a similar
1116 work can be done for other neural network architectures. Finally, we
1117 have discovered at least one family of problems with a reasonable
1118 size, such that artificial neural networks should not be applied in
1119 the presence of chaos, due to their inability to learn chaotic
1120 behaviors in this context. Such a consideration is not reduced to a
1121 theoretical detail: this family of discrete iterations is concretely
1122 implemented in a new steganographic method \cite{guyeux10ter}. As
1123 steganographic detectors embed tools like neural networks to
1124 distinguish between original and stego contents, our studies tend to
prove that such detectors might be unable to deal with chaos-based
1126 information hiding schemes.
1128 In future work we intend to enlarge the comparison between the
1129 learning of truly chaotic and non-chaotic behaviors. Other
1130 computational intelligence tools such as support vector machines will
1131 be investigated too, to discover which tools are the most relevant
1132 when facing a truly chaotic phenomenon. A comparison between learning
success rate and prediction quality will be carried out. Concrete
1134 consequences in biology, physics, and computer science security fields
1136 will then be stated.
1143 % \begin{definition} \label{def2}
1144 % A continuous function $f$ on a topological space $(\mathcal{X},\tau)$
1145 % is defined to be {\emph{topologically transitive}} if for any pair of
1146 % open sets $U$, $V \in \mathcal{X}$ there exists
1148 % \mathds{N}^{\ast}$
1150 % $f^k(U) \cap V \neq \emptyset$.
1153 \bibliography{chaos-paper}% Produces the bibliography via BibTeX.
1157 % ****** End of file chaos-paper.tex ******