+% \section{Introduction}
+% \label{S1}
+
+Les réseaux de neurones chaotiques ont été étudiés à de maintes reprises
+par le passé en raison notamment de leurs applications potentielles:
+%les mémoires associatives~\cite{Crook2007267}
+les composants utils à la sécurité comme les fonctions de
+hachage~\cite{Xiao10},
+le tatouage numérique~\cite{1309431,Zhang2005759}
+ou les schémas de chiffrement~\cite{Lian20091296}.
+Dans tous ces cas, l'emploi de fonctions chaotiques est motivé par
+leur comportement imprévisibile et proche de l'aléa.
+
+
+Les réseaux de neurones chaotiques peuvent être conçus selon plusieurs
+principes. Des neurones modifiant leur état en suivant une fonction non
+linéaire son par exemple appelés neurones chaotiques~\cite{Crook2007267}.
+L'architecture de réseaux de neurones de type Perceptron multi-couches
+(MLP) n'iterent quant à eux, pas nécesssairement de fonctions chaotiques.
+Il a cependant été démontré que ce sont des approximateurs
+universels~\cite{Cybenko89,DBLP:journals/nn/HornikSW89}.
+Ils permettent, dans certains cas, de simuler des comportements
+physiques chaotiques comme le circuit de Chua~\cite{dalkiran10}.
+Parfois~\cite{springerlink:10.1007/s00521-010-0432-2},
+la fonction de transfert de cette famille de réseau celle
+d'initialisation sont toutes les deux définies à l'aide de
+fonctions chaotiques.
+
+
+
+
+
+Ces réseaux de neurones partagent le fait qu'ils sont qualifiés de
+``chaotiques'' sous prétexte qu'ils embarquent une fonction de ce type
+et ce sans aucune preuve rigoureuse. Ce chapitre caractérise la
+classe des réseaux de neurones MLP chaotiques. Il
+s'intéresse ensuite à l'étude de prévisibilité de systèmes dynamiques
+discrets chaotiques par cette famille de MLP.
+
+\JFC{revoir plan}
+
+The remainder of this research work is organized as follows. The next
+section is devoted to the basics of Devaney's chaos. Section~\ref{S2}
+formally describes how to build a neural network that operates
+chaotically. Section~\ref{S3} is devoted to the dual case of checking
+whether an existing neural network is chaotic or not. Topological
+properties of chaotic neural networks are discussed in Sect.~\ref{S4}.
+The Section~\ref{section:translation} shows how to translate such
+iterations into an Artificial Neural Network (ANN), in order to
+evaluate the capability for this latter to learn chaotic behaviors.
+This ability is studied in Sect.~\ref{section:experiments}, where
+various ANNs try to learn two sets of data: the first one is obtained
+by chaotic iterations while the second one results from a non-chaotic
+system. Prediction success rates are given and discussed for the two
+sets. The paper ends with a conclusion section where our contribution
+is summed up and intended future work is exposed.
+
+
+\section{Un réseau de neurones chaotique au sens de Devaney}
+\label{S2}
+
+On considère une fonction
+$f:\Bool^n\to\Bool^n$ telle que $\textsc{giu}(f)$ est fortement connexe.
+Ainsi $G_{f_u}$ est chaotique d'après le théorème~\ref{Th:CaracIC}.
+
+On considère ici le schéma unaire défini par l'équation (\ref{eq:asyn}).
+On construit un perceptron multi-couche associé à la fonction
+$F_{f_u}$.
+Plus précisément, pour chaque entrée
+ $(x,s) \in \mathds{B}^n \times [n]$,
+la couche de sortie doit générer $F_{f_u}(x,s)$.
+On peut ainsi lier la couche de sortie avec celle d'entrée pour représenter
+les dépendance entre deux itérations successives.
+On obtient une réseau de neurones dont le comportement est le suivant
+(voir Figure.~\ref{Fig:perceptron}):
+
+\begin{itemize}
+\item Le réseau est initialisé avec le vecteur d'entrée
+ $\left(x^0,S^0\right) \mathds{B}^n \times [n]$
+ et calcule le vecteur de sortie
+ $x^1=F_{f_u}\left(x^0,S^0\right)$.
+ Ce vecteur est publié comme une sortie et est aussi retournée sur la couche d'entrée
+ à travers les liens de retours.
+\item Lorsque le réseau est activé à la $t^{th}$ itération, l'état du
+ système $x^t \in \mathds{B}^n$ reçu depuis la couche de sortie ainsi que le
+ premier terme de la sequence $(S^t)^{t \in \Nats}$
+ (\textit{i.e.}, $S^0 \in [n]$) servent à construire le nouveau vecteur de sortie.
+ Ce nouveau vecteur, qui représente le nouvel état du système dynamique, satisfait:
+ \begin{equation}
+ x^{t+1}=F_{f_u}(x^t,S^0) \in \mathds{B}^n \enspace .
+ \end{equation}
+\end{itemize}
+
+\begin{figure}
+ \centering
+ \includegraphics[scale=0.625]{images/perceptron}
+ \caption{Un perceptron équivalent aux itérations unitaires}
+ \label{Fig:perceptron}
+\end{figure}
+
+Le comportement de ce réseau de neurones est tel que lorsque l'état
+initial est composé de $x^0~\in~\mathds{B}^n$ et d'une séquence
+ $(S^t)^{t \in \Nats}$, alors la séquence contenant les vecteurs successifs
+publiés $\left(x^t\right)^{t \in \mathds{N}^{\ast}}$ est exactement celle produite
+par les itérations unaires décrites à la section~\ref{sec:TIPE12}.
+Mathématiquement, cela signifie que si on utilise les mêmes vecteurs d'entrées
+les deux approches génèrent successivement les mêmes sorties.
+En d'autres termes ce réseau de neurones modélise le comportement de
+$G_{f_u}$, dont les itérations sont chaotiques sur $\mathcal{X}_u$.
+On peut donc le qualifier de chaotique au sens de Devaney.
+
+\section{Vérifier si un réseau de neurones est chaotique}
+\label{S3}
+On s'intéresse maintenant au cas où l'on dispose d'un
+réseau de neurones de type perceptron multi-couches
+dont on cherche à savoir s'il est chaotique (parce qu'il a par exemple été
+déclaré comme tel) au sens de Devaney.
+On considère de plus que sa topologie est la suivante:
+l'entrée est constituée de $n$ bits et un entier, la sortie est constituée de $n$ bits
+et chaque sortie est liée à une entrée par une boucle.
+
+\begin{itemize}
+\item Le réseau est initialisé avec $n$~bits
+ $\left(x^0_1,\dots,x^0_n\right)$ et une valeur entière $S^0 \in [n]$.
+\item A l'itération~$t$, le vecteur
+ $\left(x^t_1,\dots,x^t_n\right)$ permet de construire les $n$~bits
+ servant de sortie $\left(x^{t+1}_1,\dots,x^{t+1}_n\right)$.
+\end{itemize}
+
+Le comportement de ce type de réseau de neurones peut être prouvé comme
+étant chaotique en suivant la démarche énoncée maintenant.
+On nomme tout d'abord $F: \mathds{B}^n \times [n] \rightarrow
+\mathds{B}^n$ la fonction qui associe
+au vecteur
+$\left(\left(x_1,\dots,x_n\right),s\right) \in \mathds{B}^n \times[n]$
+le vecteur
+$\left(y_1,\dots,y_n\right) \in \mathds{B}^n$, où
+$\left(y_1,\dots,y_n\right)$ sont les sorties du réseau neuronal
+àaprès l'initialisation de la couche d'entrée avec
+$\left(s,\left(x_1,\dots, x_n\right)\right)$. Ensuite, on définie $f:
+\mathds{B}^n \rightarrow \mathds{B}^n$ telle que
+$f\left(x_1,x_2,\dots,x_n\right)$ est égal à
+\begin{equation}
+\left(F\left(\left(x_1,x_2,\dots,x_n\right),1\right),\dots,
+ F\left(\left(x_1,x_2,\dots,x_n\right)\right),n\right) \enspace .
+\end{equation}
+Ainsi pour chaque $j$, $1 \le j \le n$, on a
+$f_j\left(x_1,x_2,\dots,x_n\right) =
+F\left(\left(x_1,x_2,\dots,x_n\right),j\right)$.
+Si ce réseau de neurones est initialisé avec
+$\left(x_1^0,\dots,x_n^0\right)$ et $S \in [n]^{\mathds{N}}$,
+il produit exactement les même sorties que les itérations de $F_{f_u}$ avec une
+condition initiale $\left((x_1^0,\dots, x_n^0),S\right) \in \mathds{B}^n \times [n]^{\mathds{N}}$.
+Les itérations de $F_{f_u}$
+sont donc un modèle formel de ce genre de réseau de neurones.
+Pour vérifier s'il est chaotique, il suffit ainsi
+de vérifier si le graphe d'itérations
+$\textsc{giu}(f)$ est fortement connexe.
+
+
+\section{Suitability of Feedforward Neural Networks
+for Predicting Chaotic and Non-chaotic Behaviors}
+
+In the context of computer science different topic areas have an
+interest in chaos, as for steganographic
+techniques~\cite{1309431,Zhang2005759}. Steganography consists in
+embedding a secret message within an ordinary one, while the secret
+extraction takes place once at destination. The reverse ({\it i.e.},
+automatically detecting the presence of hidden messages inside media)
+is called steganalysis. Among the deployed strategies inside
+detectors, there are support vectors
+machines~\cite{Qiao:2009:SM:1704555.1704664}, neural
+networks~\cite{10.1109/ICME.2003.1221665,10.1109/CIMSiM.2010.36}, and
+Markov chains~\cite{Sullivan06steganalysisfor}. Most of these
+detectors give quite good results and are rather competitive when
+facing steganographic tools. However, to the best of our knowledge
+none of the considered information hiding schemes fulfills the Devaney
+definition of chaos~\cite{Devaney}. Indeed, one can wonder whether
+detectors continue to give good results when facing truly chaotic
+schemes. More generally, there remains the open problem of deciding
+whether artificial intelligence is suitable for predicting topological
+chaotic behaviors.
+
+\subsection{Representing Chaotic Iterations for Neural Networks}
+\label{section:translation}
+
+The problem of deciding whether classical feedforward ANNs are
+suitable to approximate topological chaotic iterations may then be
+reduced to evaluate such neural networks on iterations of functions
+with Strongly Connected Component (SCC)~graph of iterations. To
+compare with non-chaotic iterations, the experiments detailed in the
+following sections are carried out using both kinds of function
+(chaotic and non-chaotic). Let us emphasize on the difference between
+this kind of neural networks and the Chaotic Iterations based
+multilayer peceptron.
+
+We are then left to compute two disjoint function sets that contain
+either functions with topological chaos properties or not, depending
+on the strong connectivity of their iterations graph. This can be
+achieved for instance by removing a set of edges from the iteration
+graph $\Gamma(f_0)$ of the vectorial negation function~$f_0$. One can
+deduce whether a function verifies the topological chaos property or
+not by checking the strong connectivity of the resulting graph of
+iterations.
+
+For instance let us consider the functions $f$ and $g$ from $\Bool^4$
+to $\Bool^4$ respectively defined by the following lists:
+$$[0, 0, 2, 3, 13, 13, 6, 3, 8, 9, 10, 11, 8, 13, 14,
+ 15]$$ $$\mbox{and } [11, 14, 13, 14, 11, 10, 1, 8, 7, 6, 5, 4, 3, 2,
+ 1, 0] \enspace.$$ In other words, the image of $0011$ by $g$ is
+$1110$: it is obtained as the binary value of the fourth element in
+the second list (namely~14). It is not hard to verify that
+$\Gamma(f)$ is not SCC (\textit{e.g.}, $f(1111)$ is $1111$) whereas
+$\Gamma(g)$ is. The remaining of this section shows how to translate
+iterations of such functions into a model amenable to be learned by an
+ANN. Formally, input and output vectors are pairs~$((S^t)^{t \in
+ \Nats},x)$ and $\left(\sigma((S^t)^{t \in
+ \Nats}),F_{f}(S^0,x)\right)$ as defined in~Eq.~(\ref{eq:Gf}).
+
+Firstly, let us focus on how to memorize configurations. Two distinct
+translations are proposed. In the first case, we take one input in
+$\Bool$ per component; in the second case, configurations are
+memorized as natural numbers. A coarse attempt to memorize
+configuration as natural number could consist in labeling each
+configuration with its translation into decimal numeral system.
+However, such a representation induces too many changes between a
+configuration labeled by a power of two and its direct previous
+configuration: for instance, 16~(10000) and 15~(01111) are close in a
+decimal ordering, but their Hamming distance is 5. This is why Gray
+codes~\cite{Gray47} have been preferred.
+
+Secondly, let us detail how to deal with strategies. Obviously, it is
+not possible to translate in a finite way an infinite strategy, even
+if both $(S^t)^{t \in \Nats}$ and $\sigma((S^t)^{t \in \Nats})$ belong
+to $\{1,\ldots,n\}^{\Nats}$. Input strategies are then reduced to
+have a length of size $l \in \llbracket 2,k\rrbracket$, where $k$ is a
+parameter of the evaluation. Notice that $l$ is greater than or equal
+to $2$ since we do not want the shift $\sigma$~function to return an
+empty strategy. Strategies are memorized as natural numbers expressed
+in base $n+1$. At each iteration, either none or one component is
+modified (among the $n$ components) leading to a radix with $n+1$
+entries. Finally, we give an other input, namely $m \in \llbracket
+1,l-1\rrbracket$, which is the number of successive iterations that
+are applied starting from $x$. Outputs are translated with the same
+rules.
+
+To address the complexity issue of the problem, let us compute the
+size of the data set an ANN has to deal with. Each input vector of an
+input-output pair is composed of a configuration~$x$, an excerpt $S$
+of the strategy to iterate of size $l \in \llbracket 2, k\rrbracket$,
+and a number $m \in \llbracket 1, l-1\rrbracket$ of iterations that
+are executed.
+
+Firstly, there are $2^n$ configurations $x$, with $n^l$ strategies of
+size $l$ for each of them. Secondly, for a given configuration there
+are $\omega = 1 \times n^2 + 2 \times n^3 + \ldots+ (k-1) \times n^k$
+ways of writing the pair $(m,S)$. Furthermore, it is not hard to
+establish that
+\begin{equation}
+\displaystyle{(n-1) \times \omega = (k-1)\times n^{k+1} - \sum_{i=2}^k n^i} \nonumber
+\end{equation}
+then
+\begin{equation}
+\omega =
+\dfrac{(k-1)\times n^{k+1}}{n-1} - \dfrac{n^{k+1}-n^2}{(n-1)^2} \enspace . \nonumber
+\end{equation}
+\noindent And then, finally, the number of input-output pairs for our
+ANNs is
+$$
+2^n \times \left(\dfrac{(k-1)\times n^{k+1}}{n-1} - \dfrac{n^{k+1}-n^2}{(n-1)^2}\right) \enspace .
+$$
+For instance, for $4$ binary components and a strategy of at most
+$3$~terms we obtain 2304~input-output pairs.
+
+\subsection{Experiments}
+\label{section:experiments}
+
+To study if chaotic iterations can be predicted, we choose to train
+the multilayer perceptron. As stated before, this kind of network is
+in particular well-known for its universal approximation property
+\cite{Cybenko89,DBLP:journals/nn/HornikSW89}. Furthermore, MLPs have
+been already considered for chaotic time series prediction. For
+example, in~\cite{dalkiran10} the authors have shown that a
+feedforward MLP with two hidden layers, and trained with Bayesian
+Regulation back-propagation, can learn successfully the dynamics of
+Chua's circuit.
+
+In these experiments we consider MLPs having one hidden layer of
+sigmoidal neurons and output neurons with a linear activation
+function. They are trained using the Limited-memory
+Broyden-Fletcher-Goldfarb-Shanno quasi-newton algorithm in combination
+with the Wolfe linear search. The training process is performed until
+a maximum number of epochs is reached. To prevent overfitting and to
+estimate the generalization performance we use holdout validation by
+splitting the data set into learning, validation, and test subsets.
+These subsets are obtained through random selection such that their
+respective size represents 65\%, 10\%, and 25\% of the whole data set.
+
+Several neural networks are trained for both iterations coding
+schemes. In both cases iterations have the following layout:
+configurations of four components and strategies with at most three
+terms. Thus, for the first coding scheme a data set pair is composed
+of 6~inputs and 5~outputs, while for the second one it is respectively
+3~inputs and 2~outputs. As noticed at the end of the previous section,
+this leads to data sets that consist of 2304~pairs. The networks
+differ in the size of the hidden layer and the maximum number of
+training epochs. We remember that to evaluate the ability of neural
+networks to predict a chaotic behavior for each coding scheme, the
+trainings of two data sets, one of them describing chaotic iterations,
+are compared.
+
+Thereafter we give, for the different learning setups and data sets,
+the mean prediction success rate obtained for each output. Such a rate
+represents the percentage of input-output pairs belonging to the test
+subset for which the corresponding output value was correctly
+predicted. These values are computed considering 10~trainings with
+random subsets construction, weights and biases initialization.
+Firstly, neural networks having 10 and 25~hidden neurons are trained,
+with a maximum number of epochs that takes its value in
+$\{125,250,500\}$ (see Tables~\ref{tab1} and \ref{tab2}). Secondly,
+we refine the second coding scheme by splitting the output vector such
+that each output is learned by a specific neural network
+(Table~\ref{tab3}). In this last case, we increase the size of the
+hidden layer up to 40~neurons and we consider larger number of epochs.
+
+\begin{table}[htbp!]
+\caption{Prediction success rates for configurations expressed as boolean vectors.}
+\label{tab1}
+\centering {\small
+\begin{tabular}{|c|c||c|c|c|}
+\hline
+\multicolumn{5}{|c|}{Networks topology: 6~inputs, 5~outputs, and one hidden layer} \\
+\hline
+\hline
+\multicolumn{2}{|c||}{Hidden neurons} & \multicolumn{3}{c|}{10 neurons} \\
+\cline{3-5}
+\multicolumn{2}{|c||}{Epochs} & 125 & 250 & 500 \\
+\hline
+\multirow{6}{*}{\rotatebox{90}{Chaotic}}&Output~(1) & 90.92\% & 91.75\% & 91.82\% \\
+& Output~(2) & 69.32\% & 78.46\% & 82.15\% \\
+& Output~(3) & 68.47\% & 78.49\% & 82.22\% \\
+& Output~(4) & 91.53\% & 92.37\% & 93.4\% \\
+& Config. & 36.10\% & 51.35\% & 56.85\% \\
+& Strategy~(5) & 1.91\% & 3.38\% & 2.43\% \\
+\hline
+\multirow{6}{*}{\rotatebox{90}{Non-chaotic}}&Output~(1) & 97.64\% & 98.10\% & 98.20\% \\
+& Output~(2) & 95.15\% & 95.39\% & 95.46\% \\
+& Output~(3) & 100\% & 100\% & 100\% \\
+& Output~(4) & 97.47\% & 97.90\% & 97.99\% \\
+& Config. & 90.52\% & 91.59\% & 91.73\% \\
+& Strategy~(5) & 3.41\% & 3.40\% & 3.47\% \\
+\hline
+\hline
+\multicolumn{2}{|c||}{Hidden neurons} & \multicolumn{3}{c|}{25 neurons} \\ %& \multicolumn{3}{|c|}{40 neurons} \\
+\cline{3-5}
+\multicolumn{2}{|c||}{Epochs} & 125 & 250 & 500 \\ %& 125 & 250 & 500 \\
+\hline
+\multirow{6}{*}{\rotatebox{90}{Chaotic}}&Output~(1) & 91.65\% & 92.69\% & 93.93\% \\ %& 91.94\% & 92.89\% & 94.00\% \\
+& Output~(2) & 72.06\% & 88.46\% & 90.5\% \\ %& 74.97\% & 89.83\% & 91.14\% \\
+& Output~(3) & 79.19\% & 89.83\% & 91.59\% \\ %& 76.69\% & 89.58\% & 91.84\% \\
+& Output~(4) & 91.61\% & 92.34\% & 93.47\% \\% & 82.77\% & 92.93\% & 93.48\% \\
+& Config. & 48.82\% & 67.80\% & 70.97\% \\%& 49.46\% & 68.94\% & 71.11\% \\
+& Strategy~(5) & 2.62\% & 3.43\% & 3.78\% \\% & 3.10\% & 3.10\% & 3.03\% \\
+\hline
+\multirow{6}{*}{\rotatebox{90}{Non-chaotic}}&Output~(1) & 97.87\% & 97.99\% & 98.03\% \\ %& 98.16\% \\
+& Output~(2) & 95.46\% & 95.84\% & 96.75\% \\ % & 97.4\% \\
+& Output~(3) & 100\% & 100\% & 100\% \\%& 100\% \\
+& Output~(4) & 97.77\% & 97.82\% & 98.06\% \\%& 98.31\% \\
+& Config. & 91.36\% & 91.99\% & 93.03\% \\%& 93.98\% \\
+& Strategy~(5) & 3.37\% & 3.44\% & 3.29\% \\%& 3.23\% \\
+\hline
+\end{tabular}
+}
+\end{table}
+
+Table~\ref{tab1} presents the rates obtained for the first coding
+scheme. For the chaotic data, it can be seen that as expected
+configuration prediction becomes better when the number of hidden
+neurons and maximum epochs increases: an improvement by a factor two
+is observed (from 36.10\% for 10~neurons and 125~epochs to 70.97\% for
+25~neurons and 500~epochs). We also notice that the learning of
+outputs~(2) and~(3) is more difficult. Conversely, for the
+non-chaotic case the simplest training setup is enough to predict
+configurations. For all these feedforward network topologies and all
+outputs the obtained results for the non-chaotic case outperform the
+chaotic ones. Finally, the rates for the strategies show that the
+different feedforward networks are unable to learn them.
+
+For the second coding scheme (\textit{i.e.}, with Gray Codes)
+Table~\ref{tab2} shows that any network learns about five times more
+non-chaotic configurations than chaotic ones. As in the previous
+scheme, the strategies cannot be predicted.
+Figures~\ref{Fig:chaotic_predictions} and
+\ref{Fig:non-chaotic_predictions} present the predictions given by two
+feedforward multilayer perceptrons that were respectively trained to
+learn chaotic and non-chaotic data, using the second coding scheme.
+Each figure shows for each sample of the test subset (577~samples,
+representing 25\% of the 2304~samples) the configuration that should
+have been predicted and the one given by the multilayer perceptron. It
+can be seen that for the chaotic data the predictions are far away
+from the expected configurations. Obviously, the better predictions
+for the non-chaotic data reflect their regularity.
+
+Let us now compare the two coding schemes. Firstly, the second scheme
+disturbs the learning process. In fact in this scheme the
+configuration is always expressed as a natural number, whereas in the
+first one the number of inputs follows the increase of the Boolean
+vectors coding configurations. In this latter case, the coding gives a
+finer information on configuration evolution.
+\begin{table}[b]
+\caption{Prediction success rates for configurations expressed with Gray code}
+\label{tab2}
+\centering
+\begin{tabular}{|c|c||c|c|c|}
+\hline
+\multicolumn{5}{|c|}{Networks topology: 3~inputs, 2~outputs, and one hidden layer} \\
+\hline
+\hline
+& Hidden neurons & \multicolumn{3}{c|}{10 neurons} \\
+\cline{2-5}
+& Epochs & 125 & 250 & 500 \\ %& 1000
+\hline
+\multirow{2}{*}{Chaotic}& Config.~(1) & 13.29\% & 13.55\% & 13.08\% \\ %& 12.5\%
+& Strategy~(2) & 0.50\% & 0.52\% & 1.32\% \\ %& 1.42\%
+\hline
+\multirow{2}{*}{Non-Chaotic}&Config.~(1) & 77.12\% & 74.00\% & 72.60\% \\ %& 75.81\%
+& Strategy~(2) & 0.42\% & 0.80\% & 1.16\% \\ %& 1.42\%
+\hline
+\hline
+& Hidden neurons & \multicolumn{3}{c|}{25 neurons} \\
+\cline{2-5}
+& Epochs & 125 & 250 & 500 \\ %& 1000
+\hline
+\multirow{2}{*}{Chaotic}& Config.~(1) & 12.27\% & 13.15\% & 13.05\% \\ %& 15.44\%
+& Strategy~(2) & 0.71\% & 0.66\% & 0.88\% \\ %& 1.73\%
+\hline
+\multirow{2}{*}{Non-Chaotic}&Config.~(1) & 73.60\% & 74.70\% & 75.89\% \\ %& 68.32\%
+& Strategy~(2) & 0.64\% & 0.97\% & 1.23\% \\ %& 1.80\%
+\hline
+\end{tabular}
+\end{table}
+
+\begin{figure}
+ \centering
+ \includegraphics[scale=0.5]{images/chaotic_trace2}
+ \caption {Second coding scheme - Predictions obtained for a chaotic test subset.}
+ \label{Fig:chaotic_predictions}
+\end{figure}
+
+\begin{figure}
+ \centering
+ \includegraphics[scale=0.5]{images/non-chaotic_trace2}
+ \caption{Second coding scheme - Predictions obtained for a non-chaotic test subset.}
+ \label{Fig:non-chaotic_predictions}
+\end{figure}
+
+Unfortunately, in practical applications the number of components is
+usually unknown. Hence, the first coding scheme cannot be used
+systematically. Therefore, we provide a refinement of the second
+scheme: each output is learned by a different ANN. Table~\ref{tab3}
+presents the results for this approach. In any case, whatever the
+considered feedforward network topologies, the maximum epoch number,
+and the kind of iterations, the configuration success rate is slightly
+improved. Moreover, the strategies predictions rates reach almost
+12\%, whereas in Table~\ref{tab2} they never exceed 1.5\%. Despite of
+this improvement, a long term prediction of chaotic iterations still
+appear to be an open issue.
+
+\begin{table}
+\caption{Prediction success rates for split outputs.}
+\label{tab3}
+\centering
+\begin{tabular}{|c||c|c|c|}
+\hline
+\multicolumn{4}{|c|}{Networks topology: 3~inputs, 1~output, and one hidden layer} \\
+\hline
+\hline
+Epochs & 125 & 250 & 500 \\
+\hline
+\hline
+Chaotic & \multicolumn{3}{c|}{Output = Configuration} \\
+\hline
+10~neurons & 12.39\% & 14.06\% & 14.32\% \\
+25~neurons & 13.00\% & 14.28\% & 14.58\% \\
+40~neurons & 11.58\% & 13.47\% & 14.23\% \\
+\hline
+\hline
+Non chaotic & \multicolumn{3}{c|}{Output = Configuration} \\
+\cline{2-4}
+%Epochs & 125 & 250 & 500 \\
+\hline
+10~neurons & 76.01\% & 74.04\% & 78.16\% \\
+25~neurons & 76.60\% & 72.13\% & 75.96\% \\
+40~neurons & 76.34\% & 75.63\% & 77.50\% \\
+\hline
+\hline
+Chaotic/non chaotic & \multicolumn{3}{c|}{Output = Strategy} \\
+\cline{2-4}
+%Epochs & 125 & 250 & 500 \\
+\hline
+10~neurons & 0.76\% & 0.97\% & 1.21\% \\
+25~neurons & 1.09\% & 0.73\% & 1.79\% \\
+40~neurons & 0.90\% & 1.02\% & 2.15\% \\
+\hline
+\multicolumn{4}{c}{} \\
+\hline
+Epochs & 1000 & 2500 & 5000 \\
+\hline
+\hline
+Chaotic & \multicolumn{3}{c|}{Output = Configuration} \\
+\hline
+10~neurons & 14.51\% & 15.22\% & 15.22\% \\
+25~neurons & 16.95\% & 17.57\% & 18.46\% \\
+40~neurons & 17.73\% & 20.75\% & 22.62\% \\
+\hline
+\hline
+Non chaotic & \multicolumn{3}{c|}{Output = Configuration} \\
+\cline{2-4}
+%Epochs & 1000 & 2500 & 5000 \\
+\hline
+10~neurons & 78.98\% & 80.02\% & 79.97\% \\
+25~neurons & 79.19\% & 81.59\% & 81.53\% \\
+40~neurons & 79.64\% & 81.37\% & 81.37\% \\
+\hline
+\hline
+Chaotic/non chaotic & \multicolumn{3}{c|}{Output = Strategy} \\
+\cline{2-4}
+%Epochs & 1000 & 2500 & 5000 \\
+\hline
+10~neurons & 3.47\% & 9.98\% & 11.66\% \\
+25~neurons & 3.92\% & 8.63\% & 10.09\% \\
+40~neurons & 3.29\% & 7.19\% & 7.18\% \\
+\hline
+\end{tabular}
+\end{table}
+
+\section{Conclusion}
+
+In this paper, we have established an equivalence between chaotic
+iterations, according to the Devaney's definition of chaos, and a
+class of multilayer perceptron neural networks. Firstly, we have
+described how to build a neural network that can be trained to learn a
+given chaotic map function. Secondly, we found a condition that allow
+to check whether the iterations induced by a function are chaotic or
+not, and thus if a chaotic map is obtained. Thanks to this condition
+our approach is not limited to a particular function. In the dual
+case, we show that checking if a neural network is chaotic consists in
+verifying a property on an associated graph, called the graph of
+iterations. These results are valid for recurrent neural networks
+with a particular architecture. However, we believe that a similar
+work can be done for other neural network architectures. Finally, we
+have discovered at least one family of problems with a reasonable
+size, such that artificial neural networks should not be applied in
+the presence of chaos, due to their inability to learn chaotic
+behaviors in this context. Such a consideration is not reduced to a
+theoretical detail: this family of discrete iterations is concretely
+implemented in a new steganographic method \cite{guyeux10ter}. As
+steganographic detectors embed tools like neural networks to
+distinguish between original and stego contents, our studies tend to
+prove that such detectors might be unable to tackle with chaos-based
+information hiding schemes.
+
+In future work we intend to enlarge the comparison between the
+learning of truly chaotic and non-chaotic behaviors. Other
+computational intelligence tools such as support vector machines will
+be investigated too, to discover which tools are the most relevant
+when facing a truly chaotic phenomenon. A comparison between learning
+rate success and prediction quality will be realized. Concrete
+consequences in biology, physics, and computer science security fields
+will then be stated.
+
+% \appendix{}
+
+% \begin{Def} \label{def2}
+% A continuous function $f$ on a topological space $(\mathcal{X},\tau)$
+% is defined to be {\emph{topologically transitive}} if for any pair of
+% open sets $U$, $V \in \mathcal{X}$ there exists
+% $k \in
+% \mathds{N}^{\ast}$
+% such that
+% $f^k(U) \cap V \neq \emptyset$.
+% \end{Def}
+
+%\bibliography{chaos-paper}% Produces the bibliography via BibTeX.
+
+%\end{document}
+%
+% ****** End of file chaos-paper.tex ******