\maketitle
\begin{quotation}
Chaotic neural networks have received a lot of attention due to the
appealing properties of deterministic chaos (unpredictability,
sensitivity, and so on). However, such networks are often claimed to be
chaotic without any rigorous proof of how chaos is introduced.
iterations and Devaney's chaos, we firstly show how to build a
recurrent neural network that is equivalent to a chaotic map and
secondly how to check whether an already available network is
chaotic or not. We also study different topological properties of
these truly chaotic neural networks. Finally, we show that the
learning, with neural networks having a classical feedforward
\label{S1}
In recent years, several research works have proposed or used chaotic
neural networks. The complex dynamics of such networks lead to
various potential application areas: associative
memories~\cite{Crook2007267} and digital security tools like hash
functions~\cite{Xiao10}, digital
their universal approximator capacity
\cite{Cybenko89,DBLP:journals/nn/HornikSW89}. Thus, this kind of
network can be trained to model a physical phenomenon known to be
chaotic, such as Chua's circuit \cite{dalkiran10}. Sometimes a neural
network, which is built by combining transfer functions and initial
conditions that are both chaotic, is itself claimed to be chaotic
\cite{springerlink:10.1007/s00521-010-0432-2}.
More precisely, in this paper, which is an extension of a previous work
\cite{bgs11:ip}, we establish the equivalence between chaotic
iterations and a class of globally recurrent MLPs. The second
contribution is a study of the converse problem: we investigate the
ability of classical multilayer perceptrons to learn a particular
family of discrete chaotic dynamical systems. This family is defined
by a Boolean vector, an update function, and a sequence defining the
component to update at each iteration. It has been previously
according to Devaney} on $(\mathcal{X},\tau)$ if $f$ is regular,
topologically transitive, and has sensitive dependence on initial
conditions. In what follows, iterations are said to be chaotic
(according to Devaney) when the corresponding dynamical system is
chaotic, as defined in Devaney's formulation.
%Let us notice that for a metric space the last condition follows from
%the two first ones~\cite{Banks92}.
\right.
\label{eq:Gf}
\end{equation}
where $\sigma$ is the so-called shift function that removes the first
term of the strategy ({\it i.e.},~$S^0$). This definition allows us to
link asynchronous iterations with classical iterations of a dynamical
system. Note that it can be extended by considering subsets for $S^t$.
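
As an illustration (this sketch is not part of the formal development
above), the following Python snippet implements one iteration of $G_f$,
under the assumption that $F_f$ replaces only the component designated
by the first strategy term with the corresponding component of $f(x)$;
the code uses 0-based indices, whereas the text uses
$\llbracket 1;n\rrbracket$.
\begin{verbatim}
# Hedged sketch of one asynchronous iteration of G_f (0-based indices).
def F_f(f, s, x):
    # Assumed form: only component s of x is replaced by f(x)[s].
    y = list(x)
    y[s] = f(x)[s]
    return tuple(y)

def G_f(f, S, x):
    # sigma removes the first term S^0 of the strategy.
    s0, sigma_S = S[0], S[1:]
    return sigma_S, F_f(f, s0, x)

# Example: vectorial negation on n = 3 components (illustrative only).
f0 = lambda x: tuple(1 - b for b in x)
S, x = (0, 2, 1), (1, 0, 1)
S, x = G_f(f0, S, x)   # x is now (0, 0, 1)
\end{verbatim}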
To study topological properties of these iterations, we are then left
to introduce a {\bf distance} $d$ between two points $(S,x)$ and
$(\check{S},\check{x})$ in $\mathcal{X} =
\llbracket1;n\rrbracket^\Nats \times \Bool^{n}$. Let $\Delta(x,y) = 0$
if $x=y$, and $\Delta(x,y) = 1$ otherwise, be a distance on
$\mathds{B}$. The distance $d$ is defined by
\begin{equation}
d((S,x);(\check{S},\check{x}))=d_{e}(x,\check{x})+d_{s}(S,\check{S})
\enspace ,
\item When the network is activated at the $t^{th}$ iteration, the
state of the system $x^t \in \mathds{B}^n$ received from the output
layer and the initial term of the sequence $(S^t)^{t \in \Nats}$
  (\textit{i.e.}, $S^0 \in \llbracket 1;n\rrbracket$) are used to
  compute the new output vector. This new vector, which represents
  the new state of the dynamical system, satisfies:
\begin{equation}
x^{t+1}=F_{f_0}(S^0, x^t) \in \mathds{B}^n \enspace .
\end{equation}
The behavior of the neural network is such that when the initial state
is $x^0~\in~\mathds{B}^n$ and a sequence $(S^t)^{t \in \Nats}$ is
given as an external input, then the sequence of successive published
output vectors $\left(x^t\right)^{t \in \mathds{N}^{\ast}}$ is exactly
the one produced by the chaotic iterations formally described in
Eq.~(\ref{eq:Gf}). This means that, mathematically, if we use the same
input vectors, they both generate the same successive outputs
$\left(x^t\right)^{t \in \mathds{N}^{\ast}}$, and therefore that they
are equivalent reformulations of the iterations of $G_{f_0}$ in
$\mathcal{X}$. Finally, since the proposed neural network is built to
model the behavior of $G_{f_0}$, whose iterations are chaotic
according to Devaney's definition of chaos, we can conclude that the
network is also chaotic in this sense.
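
This equivalence can be checked numerically on small examples. The
following sketch (illustrative only; it assumes that $f_0$ is the
vectorial negation, which is not recalled here, and again uses 0-based
indices) iterates $F_{f_0}$ along a given strategy and collects the
successive vectors $x^t$, which is exactly the sequence published by
the corresponding recurrent network.
\begin{verbatim}
# Illustrative loop reproducing the published outputs (x^t) of the network.
def iterate(f, x0, strategy):
    x, outputs = tuple(x0), []
    for s in strategy:          # s plays the role of S^t
        y = list(x)
        y[s] = f(x)[s]          # asynchronous update of one component
        x = tuple(y)
        outputs.append(x)       # output vector x^{t+1}
    return outputs

f0 = lambda x: tuple(1 - b for b in x)   # assumed vectorial negation
print(iterate(f0, (1, 0, 1), [0, 2, 1]))
# [(0, 0, 1), (0, 0, 0), (0, 1, 0)]
\end{verbatim}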
The previous construction scheme is not restricted to function $f_0$.
It can be extended to any function $f$ such that $G_f$ is a chaotic
\rrbracket^{\mathds{N}} \times \mathds{B}^n$.
Theoretically speaking, such iterations of $F_f$ are thus a formal
model of this kind of recurrent neural network. In the rest of this
paper, we will call such multilayer perceptrons ``CI-MLP($f$)'', which
stands for ``Chaotic Iterations based MultiLayer Perceptron''.
Checking if CI-MLP($f$) behaves chaotically according to Devaney's
chaos problems. In Section~\ref{S2} we have explained how to
construct a truly chaotic neural network, $A$ for
instance. Section~\ref{S3} has shown how to check whether a given MLP
$A$ or $C$ is chaotic or not in the sense of Devaney, and how to study
its topological behavior. The last thing to investigate, when
comparing neural networks and Devaney's chaos, is to determine whether
an artificial neural network $C$ is able to learn or predict some
automatically detecting the presence of hidden messages inside media)
is called steganalysis. Among the deployed strategies inside
detectors, there are support vector
machines~\cite{Qiao:2009:SM:1704555.1704664}, neural
networks~\cite{10.1109/ICME.2003.1221665,10.1109/CIMSiM.2010.36}, and
Markov chains~\cite{Sullivan06steganalysisfor}. Most of these
detectors give quite good results and are rather competitive when
\label{section:experiments}
To study whether chaotic iterations can be predicted, we choose to train
the multilayer perceptron. As stated before, this kind of network is
in particular well-known for its universal approximation property
\cite{Cybenko89,DBLP:journals/nn/HornikSW89}. Furthermore, MLPs have
already been considered for chaotic time series prediction. For
Thereafter we give, for the different learning setups and data sets,
the mean prediction success rate obtained for each output. Such a rate
represents the percentage of input-output pairs belonging to the test
subset for which the corresponding output value was correctly
predicted. These values are computed considering 10~trainings with
random subset construction and random initialization of weights and
biases.
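
The following sketch outlines this evaluation protocol with
scikit-learn; it is not the code used to produce the reported figures,
and the arrays \texttt{X} and \texttt{Y} are hypothetical placeholders
for the encoded input-output pairs (one Boolean output per column of
\texttt{Y}).
\begin{verbatim}
# Hedged sketch of the evaluation protocol (X, Y are hypothetical arrays;
# the default hidden size and epoch number are illustrative values).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

def mean_success_rates(X, Y, hidden=10, epochs=500, runs=10):
    rates = []
    for seed in range(runs):                        # 10 trainings with random
        X_tr, X_te, Y_tr, Y_te = train_test_split(  # subsets and initializations
            X, Y, test_size=0.25, random_state=seed)
        net = MLPClassifier(hidden_layer_sizes=(hidden,),
                            max_iter=epochs, random_state=seed)
        net.fit(X_tr, Y_tr)
        rates.append((net.predict(X_te) == Y_te).mean(axis=0))
    return np.mean(rates, axis=0)                   # mean success rate per output
\end{verbatim}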
Firstly, neural networks having 10 and 25~hidden neurons are trained,
with a maximum number of epochs that takes its value in
configurations. For all these feedforward network topologies and all
outputs, the results obtained for the non-chaotic case outperform the
chaotic ones. Finally, the rates for the strategies show that the
different feedforward networks are unable to learn them.
For the second coding scheme (\textit{i.e.}, with Gray codes),
Table~\ref{tab2} shows that any network learns about five times more
non-chaotic configurations than chaotic ones. As in the previous
scheme, the strategies cannot be predicted.
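
For completeness, the binary-reflected Gray code underlying this scheme
can be computed as follows; this generic illustration assumes that each
configuration is encoded through the Gray code of its integer value,
and the helper names below are hypothetical.
\begin{verbatim}
# Binary-reflected Gray code (generic illustration; assumed encoding).
def gray(k):
    return k ^ (k >> 1)               # integer k -> its Gray code

def config_to_gray_bits(x):
    # Boolean configuration x (tuple of 0/1) -> Gray-coded bit list.
    k = int("".join(map(str, x)), 2)  # configuration read as an integer
    g = gray(k)
    return [(g >> i) & 1 for i in reversed(range(len(x)))]

print(config_to_gray_bits((1, 0, 1)))  # [1, 1, 1]: 5 is Gray-coded as 7
\end{verbatim}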
Figures~\ref{Fig:chaotic_predictions} and
\ref{Fig:non-chaotic_predictions} present the predictions given by two
feedforward multilayer perceptrons that were respectively trained to
learn chaotic and non-chaotic data, using the second coding scheme.
Each figure shows, for each sample of the test subset (577~samples,
representing 25\% of the 2304~samples), the configuration that should
have been predicted and the one given by the multilayer perceptron. It
can be seen that for the chaotic data the predictions are far away
from the expected configurations. Obviously, the better predictions
for the non-chaotic data reflect their regularity.
Let us now compare the two coding schemes. Firstly, the second scheme
disturbs the learning process. In fact, in this scheme the
first one the number of inputs follows the increase of the Boolean
vectors coding configurations. In this latter case, the coding gives
finer information on configuration evolution.
\begin{table}[b]
\caption{Prediction success rates for configurations expressed with Gray code}
\label{tab2}
\end{tabular}
\end{table}
\begin{figure}
  \centering
  \includegraphics[scale=0.5]{chaotic_trace2}
  \caption{Second coding scheme - Predictions obtained for a chaotic test subset.}
  \label{Fig:chaotic_predictions}
\end{figure}

\begin{figure}
  \centering
  \includegraphics[scale=0.5]{non-chaotic_trace2}
  \caption{Second coding scheme - Predictions obtained for a non-chaotic test subset.}
  \label{Fig:non-chaotic_predictions}
\end{figure}

Unfortunately, in practical applications the number of components is
usually unknown. Hence, the first coding scheme cannot be used
systematically. Therefore, we provide a refinement of the second
scheme: each output is learned by a different ANN. Table~\ref{tab3}
presents the results for this approach. In any case, whatever the
considered feedforward network topologies, the maximum epoch number,
and the kind of iterations, the configuration success rate is slightly
improved. Moreover, the strategy prediction rates reach almost
12\%, whereas in Table~\ref{tab2} they never exceed 1.5\%. Despite
\end{tabular}
\end{table}
\section{Conclusion}
In this paper, we have established an equivalence between chaotic
when facing a truly chaotic phenomenon. A comparison between learning
success rate and prediction quality will be carried out. Concrete
consequences in biology, physics, and computer science security fields
will then be stated.
% \appendix{}
% \begin{definition} \label{def2}
% A continuous function $f$ on a topological space $(\mathcal{X},\tau)$
% is defined to be {\emph{topologically transitive}} if for any pair of
% $f^k(U) \cap V \neq \emptyset$.
% \end{definition}
\bibliography{chaos-paper}% Produces the bibliography via BibTeX.
\end{document}