\begin{document}
\title[Neural Networks and Chaos]{Neural Networks and Chaos:
-Construction, Evaluation of Chaotic Networks \\
+Construction, Evaluation of Chaotic Networks, \\
and Prediction of Chaos with Multilayer Feedforward Networks
}
\maketitle
\begin{quotation}
-
Chaotic neural networks have received a lot of attention due to the
appealing properties of deterministic chaos (unpredictability,
sensitivity, and so on). However, such networks are often claimed
chaos is introduced. Starting with a relationship between discrete
iterations and Devaney's chaos, we firstly show how to build a
recurrent neural network that is equivalent to a chaotic map and
-secondly a way to check whether an already available network, is
+secondly a way to check whether an already available network is
chaotic or not. We also study different topological properties of
these truly chaotic neural networks. Finally, we show that the
learning, with neural networks having a classical feedforward
\label{S1}
Several research works have proposed or used chaotic neural networks
-these last years. The complex dynamics of such a network leads to
+these last years. The complex dynamics of such networks lead to
various potential application areas: associative
memories~\cite{Crook2007267} and digital security tools like hash
functions~\cite{Xiao10}, digital
their universal approximator capacity
\cite{Cybenko89,DBLP:journals/nn/HornikSW89}. Thus, this kind of
networks can be trained to model a physical phenomenon known to be
-chaotic such as Chua's circuit \cite{dalkiran10}. Sometimes, a neural
-network which is build by combining transfer functions and initial
+chaotic such as Chua's circuit \cite{dalkiran10}. Sometime a neural
+network, which is build by combining transfer functions and initial
conditions that are both chaotic, is itself claimed to be chaotic
\cite{springerlink:10.1007/s00521-010-0432-2}.
precisely, in this paper, which is an extension of a previous work
\cite{bgs11:ip}, we establish the equivalence between chaotic
iterations and a class of globally recurrent MLP. The second
-contribution is a study of the converse problem, indeed we study the
-ability of classical multiLayer perceptrons to learn a particular
+contribution is a study of the converse problem, indeed we investigate
+the ability of classical multilayer perceptrons to learn a particular
family of discrete chaotic dynamical systems. This family is defined
-by a Boolean vector, an update function, and a sequence defining which
+by a Boolean vector, an update function, and a sequence defining the
component to update at each iteration. It has been previously
established that such dynamical systems are chaotically iterated (as
it is defined by Devaney) when the chosen function has a strongly
neighborhood of its future evolution eventually overlap with any other
given region. This property implies that a dynamical system cannot be
broken into simpler subsystems. Intuitively, its complexity does not
-allow any simplification. On the contrary, a dense set of periodic
-points is an element of regularity that a chaotic dynamical system has
-to exhibit.
+allow any simplification.
However, chaos needs some regularity to ``counteracts'' the effects of
-transitivity.
+transitivity. % Thus, two points close to each other can behave in a completely different manner, the first one visiting the whole space whereas the second one has a regular orbit.
+In the Devaney's formulation, a dense set of periodic
+points is the element of regularity that a chaotic dynamical system has
+to exhibit.
%\begin{definition} \label{def3}
-We recall that a point $x$ is {\bf periodic point} for $f$ of
+We recall that a point $x$ is a {\bf periodic point} for $f$ of
period~$n \in \mathds{N}^{\ast}$ if $f^{n}(x)=x$.
%\end{definition}
Then, the map
$n > 0$ such that $d\left(f^{n}(x), f^{n}(y)\right) >\delta $. The
value $\delta$ is called the {\bf constant of sensitivity} of $f$.
-Finally, The dynamical system that iterates $f$ is {\bf chaotic
+Finally, the dynamical system that iterates $f$ is {\bf chaotic
according to Devaney} on $(\mathcal{X},\tau)$ if $f$ is regular,
topologically transitive, and has sensitive dependence to its initial
conditions. In what follows, iterations are said to be chaotic
-according Devaney when corresponding dynamical system is chaotic
-according Devaney.
+(according to Devaney) when the corresponding dynamical system is
+chaotic, as it is defined in the Devaney's formulation.
%Let us notice that for a metric space the last condition follows from
%the two first ones~\cite{Banks92}.
In the sequel, the {\it strategy} $S=(S^{t})^{t \in \Nats}$ is the
sequence defining which component to update at time $t$ and $S^{t}$
denotes its $t-$th term. This iteration scheme that only modifies one
-element at each iteration is classically referred as {\it asynchronous
+element at each iteration is usually referred as {\it asynchronous
iterations}. More precisely, we have for any $i$, $1\le i \le n$,
\begin{equation}
\left\{ \begin{array}{l}
x^{0} \in \Bool^n \\
x^{t+1}_i = \left\{
\begin{array}{l}
- f_i(x^t) \textrm{ if $S^t = i$} \\
- x_i^t \textrm{ otherwise}
+ f_i(x^t) \textrm{ if $S^t = i$}\enspace , \\
+ x_i^t \textrm{ otherwise}\enspace .
\end{array}
\right.
\end{array} \right.
\right.
\label{eq:Gf}
\end{equation}
-where $\sigma$ is the function that removes the first term of the
-strategy ({\it i.e.},~$S^0$). This definition allows to links
-asynchronous iterations with classical iterations of a dynamical
+where $\sigma$ is the so-called shift function that removes the first
+term of the strategy ({\it i.e.},~$S^0$). This definition allows to
+link asynchronous iterations with classical iterations of a dynamical
system. Note that it can be extended by considering subsets for $S^t$.
To study topological properties of these iterations, we are then left
to introduce a {\bf distance} $d$ between two points $(S,x)$ and
-$(\check{S},\check{x})\in \mathcal{X} = \llbracket1;n\rrbracket^\Nats
-\times \Bool^{n}$. It is defined by
+$(\check{S},\check{x})$ in $\mathcal{X} =
+\llbracket1;n\rrbracket^\Nats \times \Bool^{n}$. Let $\Delta(x,y) = 0$
+if $x=y$, and $\Delta(x,y) = 1$ else, be a distance on
+$\mathds{B}$. The distance $d$ is defined by
\begin{equation}
d((S,x);(\check{S},\check{x}))=d_{e}(x,\check{x})+d_{s}(S,\check{S})
\enspace ,
difference between the cells, on the other hand its fractional part
measures the difference between the strategies.
-The relation between $\Gamma(f)$ and $G_f$ is clear: there exists a
+The relation between $\Gamma(f)$ and $G_f$ is obvious: there exists a
path from $x$ to $x'$ in $\Gamma(f)$ if and only if there exists a
strategy $s$ such that iterations of $G_f$ from the initial point
-$(s,x)$ reaches the configuration $x'$. Using this link,
+$(s,x)$ reach the configuration $x'$. Using this link,
Guyeux~\cite{GuyeuxThese10} has proven that,
\begin{theorem}%[Characterization of $\mathcal{C}$]
\label{Th:Caracterisation des IC chaotiques}
are chaotic according to Devaney.
-Let us then define two function $f_0$ and $f_1$ both in
+Let us then define two functions $f_0$ and $f_1$ both in
$\Bool^n\to\Bool^n$ that are used all along this paper. The former is
the vectorial negation, \textit{i.e.}, $f_{0}(x_{1},\dots,x_{n})
=(\overline{x_{1}},\dots,\overline{x_{n}})$. The latter is
\item When the network is activated at the $t^{th}$ iteration, the
state of the system $x^t \in \mathds{B}^n$ received from the output
layer and the initial term of the sequence $(S^t)^{t \in \Nats}$
- ($S^0 \in \llbracket 1;n\rrbracket$) are used to compute the new
- output vector. This new vector, which represents the new state of
- the dynamical system, satisfies:
+ (\textit{i.e.}, $S^0 \in \llbracket 1;n\rrbracket$) are used to
+ compute the new output vector. This new vector, which represents
+ the new state of the dynamical system, satisfies:
\begin{equation}
x^{t+1}=F_{f_0}(S^0, x^t) \in \mathds{B}^n \enspace .
\end{equation}
The behavior of the neural network is such that when the initial state
is $x^0~\in~\mathds{B}^n$ and a sequence $(S^t)^{t \in \Nats}$ is
-given as outside input,
-\JFC{en dire davantage sur l'outside world} %% TO BE UPDATED
-then the sequence of successive published
+given as outside input, then the sequence of successive published
output vectors $\left(x^t\right)^{t \in \mathds{N}^{\ast}}$ is exactly
the one produced by the chaotic iterations formally described in
-Eq.~(\ref{eq:Gf}). It means that mathematically if we use similar
+Eq.~(\ref{eq:Gf}). It means that mathematically if we use similar
input vectors they both generate the same successive outputs
$\left(x^t\right)^{t \in \mathds{N}^{\ast}}$, and therefore that they
are equivalent reformulations of the iterations of $G_{f_0}$ in
$\mathcal{X}$. Finally, since the proposed neural network is built to
-model the behavior of $G_{f_0}$, whose iterations are
- chaotic according to
-Devaney's definition of chaos, we can conclude that the network is
-also chaotic in this sense.
+model the behavior of $G_{f_0}$, whose iterations are chaotic
+according to the Devaney's definition of chaos, we can conclude that
+the network is also chaotic in this sense.
The previous construction scheme is not restricted to function $f_0$.
It can be extended to any function $f$ such that $G_f$ is a chaotic
compute the new output one $\left(x^{t+1}_1,\dots,x^{t+1}_n\right)$.
While the remaining input receives a new integer value $S^t \in
\llbracket1;n\rrbracket$, which is provided by the outside world.
-\JFC{en dire davantage sur l'outside world}
\end{itemize}
The topological behavior of these particular neural networks can be
\rrbracket^{\mathds{N}} \times \mathds{B}^n$.
Theoretically speaking, such iterations of $F_f$ are thus a formal
model of these kind of recurrent neural networks. In the rest of this
-paper, we will call such multilayer perceptrons CI-MLP($f$), which
+paper, we will call such multilayer perceptrons ``CI-MLP($f$)'', which
stands for ``Chaotic Iterations based MultiLayer Perceptron''.
Checking if CI-MLP($f$) behaves chaotically according to Devaney's
is intrinsically complicated and it cannot be decomposed or
simplified.
-Furthermore, those recurrent neural networks exhibit the instability
+Furthermore, these recurrent neural networks exhibit the instability
property:
\begin{definition}
A dynamical system $\left( \mathcal{X}, f\right)$ is {\bf unstable}
chaos problems. In Section~\ref{S2} we have explained how to
construct a truly chaotic neural networks, $A$ for
instance. Section~\ref{S3} has shown how to check whether a given MLP
-$A$ or $C$ is chaotic or not in the sens of Devaney, and how to study
+$A$ or $C$ is chaotic or not in the sense of Devaney, and how to study
its topological behavior. The last thing to investigate, when
comparing neural networks and Devaney's chaos, is to determine whether
an artificial neural network $C$ is able to learn or predict some
automatically detecting the presence of hidden messages inside media)
is called steganalysis. Among the deployed strategies inside
detectors, there are support vectors
-machines~\cite{Qiao:2009:SM:1704555.1704664}, neural
+machines~\cite{Qiao:2009:SM:1704555.1704664}, neural
networks~\cite{10.1109/ICME.2003.1221665,10.1109/CIMSiM.2010.36}, and
Markov chains~\cite{Sullivan06steganalysisfor}. Most of these
detectors give quite good results and are rather competitive when
\label{section:experiments}
To study if chaotic iterations can be predicted, we choose to train
-the multiLayer perceptron. As stated before, this kind of network is
+the multilayer perceptron. As stated before, this kind of network is
in particular well-known for its universal approximation property
\cite{Cybenko89,DBLP:journals/nn/HornikSW89}. Furthermore, MLPs have
been already considered for chaotic time series prediction. For
are compared.
Thereafter we give, for the different learning setups and data sets,
-the mean prediction success rate obtained for each output. A such rate
-represent the percentage of input-output pairs belonging to the test
+the mean prediction success rate obtained for each output. Such a rate
+represents the percentage of input-output pairs belonging to the test
subset for which the corresponding output value was correctly
-predicted. These values are computed considering 10~trainings with
+predicted. These values are computed considering 10~trainings with
random subsets construction, weights and biases initialization.
Firstly, neural networks having 10 and 25~hidden neurons are trained,
with a maximum number of epochs that takes its value in
\centering {\small
\begin{tabular}{|c|c||c|c|c|}
\hline
-\multicolumn{5}{|c|}{Networks topology: 6~inputs, 5~outputs and one hidden layer} \\
+\multicolumn{5}{|c|}{Networks topology: 6~inputs, 5~outputs, and one hidden layer} \\
\hline
\hline
\multicolumn{2}{|c||}{Hidden neurons} & \multicolumn{3}{c|}{10 neurons} \\
25~neurons and 500~epochs). We also notice that the learning of
outputs~(2) and~(3) is more difficult. Conversely, for the
non-chaotic case the simplest training setup is enough to predict
-configurations. For all those feedforward network topologies and all
+configurations. For all these feedforward network topologies and all
outputs the obtained results for the non-chaotic case outperform the
chaotic ones. Finally, the rates for the strategies show that the
-different networks are unable to learn them.
-
-\newpage
+different feedforward networks are unable to learn them.
For the second coding scheme (\textit{i.e.}, with Gray Codes)
Table~\ref{tab2} shows that any network learns about five times more
-non-chaotic configurations than chaotic ones. As in the previous
-scheme, the strategies cannot be predicted.
+non-chaotic configurations than chaotic ones. As in the previous
+scheme, the strategies cannot be predicted.
+Figures~\ref{Fig:chaotic_predictions} and
+\ref{Fig:non-chaotic_predictions} present the predictions given by two
+feedforward multilayer perceptrons that were respectively trained to
+learn chaotic and non-chaotic data, using the second coding scheme.
+Each figure shows for each sample of the test subset (577~samples,
+representing 25\% of the 2304~samples) the configuration that should
+have been predicted and the one given by the multilayer perceptron. It
+can be seen that for the chaotic data the predictions are far away
+from the expected configurations. Obviously, the better predictions
+for the non-chaotic data reflect their regularity.
Let us now compare the two coding schemes. Firstly, the second scheme
disturbs the learning process. In fact in this scheme the
configuration is always expressed as a natural number, whereas in the
-first one the number of inputs follows the increase of the boolean
+first one the number of inputs follows the increase of the Boolean
vectors coding configurations. In this latter case, the coding gives a
finer information on configuration evolution.
-\JFC{Je n'ai pas compris le paragraphe precedent. Devrait être repris}
\begin{table}[b]
\caption{Prediction success rates for configurations expressed with Gray code}
\label{tab2}
\centering
\begin{tabular}{|c|c||c|c|c|}
\hline
-\multicolumn{5}{|c|}{Networks topology: 3~inputs, 2~outputs and one hidden layer} \\
+\multicolumn{5}{|c|}{Networks topology: 3~inputs, 2~outputs, and one hidden layer} \\
\hline
\hline
& Hidden neurons & \multicolumn{3}{c|}{10 neurons} \\
\end{tabular}
\end{table}
+\begin{figure}
+ \centering
+ \includegraphics[scale=0.5]{chaotic_trace2}
+ \caption {Second coding scheme - Predictions obtained for a chaotic test subset.}
+ \label{Fig:chaotic_predictions}
+\end{figure}
+
+\begin{figure}
+ \centering
+ \includegraphics[scale=0.5]{non-chaotic_trace2}
+ \caption{Second coding scheme - Predictions obtained for a non-chaotic test subset.}
+ \label{Fig:non-chaotic_predictions}
+\end{figure}
+
Unfortunately, in practical applications the number of components is
usually unknown. Hence, the first coding scheme cannot be used
systematically. Therefore, we provide a refinement of the second
scheme: each output is learned by a different ANN. Table~\ref{tab3}
presents the results for this approach. In any case, whatever the
-considered feedforward network topologies, the maximum epoch number
+considered feedforward network topologies, the maximum epoch number,
and the kind of iterations, the configuration success rate is slightly
improved. Moreover, the strategies predictions rates reach almost
12\%, whereas in Table~\ref{tab2} they never exceed 1.5\%. Despite of
\centering
\begin{tabular}{|c||c|c|c|}
\hline
-\multicolumn{4}{|c|}{Networks topology: 3~inputs, 1~output and one hidden layer} \\
+\multicolumn{4}{|c|}{Networks topology: 3~inputs, 1~output, and one hidden layer} \\
\hline
\hline
Epochs & 125 & 250 & 500 \\
\end{tabular}
\end{table}
-%
-% TO BE COMPLETED
-%
-
\section{Conclusion}
In this paper, we have established an equivalence between chaotic
when facing a truly chaotic phenomenon. A comparison between learning
rate success and prediction quality will be realized. Concrete
consequences in biology, physics, and computer science security fields
-will be stated.
+will then be stated.
% \appendix{}
-
-
% \begin{definition} \label{def2}
% A continuous function $f$ on a topological space $(\mathcal{X},\tau)$
% is defined to be {\emph{topologically transitive}} if for any pair of
% $f^k(U) \cap V \neq \emptyset$.
% \end{definition}
-
\bibliography{chaos-paper}% Produces the bibliography via BibTeX.
\end{document}