X-Git-Url: https://bilbo.iut-bm.univ-fcomte.fr/and/gitweb/chaos1.git/blobdiff_plain/486652fa99fdbcf0992d32f910b729e648e33644..refs/heads/master:/main.tex diff --git a/main.tex b/main.tex index 016243a..49564db 100644 --- a/main.tex +++ b/main.tex @@ -88,7 +88,6 @@ far more difficult to learn. \maketitle \begin{quotation} - Chaotic neural networks have received a lot of attention due to the appealing properties of deterministic chaos (unpredictability, sensitivity, and so on). However, such networks are often claimed @@ -97,7 +96,7 @@ work a theoretical framework based on the Devaney's definition of chaos is introduced. Starting with a relationship between discrete iterations and Devaney's chaos, we firstly show how to build a recurrent neural network that is equivalent to a chaotic map and -secondly a way to check whether an already available network is +secondly a way to check whether an already available network is chaotic or not. We also study different topological properties of these truly chaotic neural networks. Finally, we show that the learning, with neural networks having a classical feedforward @@ -110,7 +109,7 @@ chaotic maps, is far more difficult than non chaotic behaviors. \label{S1} Several research works have proposed or used chaotic neural networks -these last years. The complex dynamics of such networks lead to +these last years. The complex dynamics of such networks lead to various potential application areas: associative memories~\cite{Crook2007267} and digital security tools like hash functions~\cite{Xiao10}, digital @@ -136,8 +135,8 @@ are suitable to model nonlinear relationships between data, due to their universal approximator capacity \cite{Cybenko89,DBLP:journals/nn/HornikSW89}. Thus, this kind of networks can be trained to model a physical phenomenon known to be -chaotic such as Chua's circuit \cite{dalkiran10}. Sometime a neural -network, which is build by combining transfer functions and initial +chaotic such as Chua's circuit \cite{dalkiran10}. Sometime a neural +network, which is build by combining transfer functions and initial conditions that are both chaotic, is itself claimed to be chaotic \cite{springerlink:10.1007/s00521-010-0432-2}. @@ -151,8 +150,8 @@ a dynamical system: ergodicity, expansivity, and so on. More precisely, in this paper, which is an extension of a previous work \cite{bgs11:ip}, we establish the equivalence between chaotic iterations and a class of globally recurrent MLP. The second -contribution is a study of the converse problem, indeed we investigate the -ability of classical multiLayer perceptrons to learn a particular +contribution is a study of the converse problem, indeed we investigate +the ability of classical multilayer perceptrons to learn a particular family of discrete chaotic dynamical systems. This family is defined by a Boolean vector, an update function, and a sequence defining the component to update at each iteration. It has been previously @@ -266,7 +265,8 @@ Finally, the dynamical system that iterates $f$ is {\bf chaotic according to Devaney} on $(\mathcal{X},\tau)$ if $f$ is regular, topologically transitive, and has sensitive dependence to its initial conditions. In what follows, iterations are said to be chaotic -(according to Devaney) when the corresponding dynamical system is chaotic, as it is defined in the Devaney's formulation. +(according to Devaney) when the corresponding dynamical system is +chaotic, as it is defined in the Devaney's formulation. %Let us notice that for a metric space the last condition follows from %the two first ones~\cite{Banks92}. @@ -373,15 +373,17 @@ X^{k+1}& = & G_{f}(X^{k})\\ \right. \label{eq:Gf} \end{equation} -where $\sigma$ is the so-called shift function that removes the first term of the -strategy ({\it i.e.},~$S^0$). This definition allows to link -asynchronous iterations with classical iterations of a dynamical +where $\sigma$ is the so-called shift function that removes the first +term of the strategy ({\it i.e.},~$S^0$). This definition allows to +link asynchronous iterations with classical iterations of a dynamical system. Note that it can be extended by considering subsets for $S^t$. To study topological properties of these iterations, we are then left to introduce a {\bf distance} $d$ between two points $(S,x)$ and -$(\check{S},\check{x})$ in $\mathcal{X} = \llbracket1;n\rrbracket^\Nats -\times \Bool^{n}$. Let $\Delta(x,y) = 0$ if $x=y$, and $\Delta(x,y) = 1$ else, be a distance on $\mathds{B}$. The distance $d$ is defined by +$(\check{S},\check{x})$ in $\mathcal{X} = +\llbracket1;n\rrbracket^\Nats \times \Bool^{n}$. Let $\Delta(x,y) = 0$ +if $x=y$, and $\Delta(x,y) = 1$ else, be a distance on +$\mathds{B}$. The distance $d$ is defined by \begin{equation} d((S,x);(\check{S},\check{x}))=d_{e}(x,\check{x})+d_{s}(S,\check{S}) \enspace , @@ -460,9 +462,9 @@ we obtain a global recurrent neural network that behaves as follows \item When the network is activated at the $t^{th}$ iteration, the state of the system $x^t \in \mathds{B}^n$ received from the output layer and the initial term of the sequence $(S^t)^{t \in \Nats}$ - (\textit{i.e.}, $S^0 \in \llbracket 1;n\rrbracket$) are used to compute the new - output vector. This new vector, which represents the new state of - the dynamical system, satisfies: + (\textit{i.e.}, $S^0 \in \llbracket 1;n\rrbracket$) are used to + compute the new output vector. This new vector, which represents + the new state of the dynamical system, satisfies: \begin{equation} x^{t+1}=F_{f_0}(S^0, x^t) \in \mathds{B}^n \enspace . \end{equation} @@ -477,19 +479,17 @@ we obtain a global recurrent neural network that behaves as follows The behavior of the neural network is such that when the initial state is $x^0~\in~\mathds{B}^n$ and a sequence $(S^t)^{t \in \Nats}$ is -given as outside input, -then the sequence of successive published +given as outside input, then the sequence of successive published output vectors $\left(x^t\right)^{t \in \mathds{N}^{\ast}}$ is exactly the one produced by the chaotic iterations formally described in -Eq.~(\ref{eq:Gf}). It means that mathematically if we use similar +Eq.~(\ref{eq:Gf}). It means that mathematically if we use similar input vectors they both generate the same successive outputs $\left(x^t\right)^{t \in \mathds{N}^{\ast}}$, and therefore that they are equivalent reformulations of the iterations of $G_{f_0}$ in $\mathcal{X}$. Finally, since the proposed neural network is built to -model the behavior of $G_{f_0}$, whose iterations are - chaotic according to the -Devaney's definition of chaos, we can conclude that the network is -also chaotic in this sense. +model the behavior of $G_{f_0}$, whose iterations are chaotic +according to the Devaney's definition of chaos, we can conclude that +the network is also chaotic in this sense. The previous construction scheme is not restricted to function $f_0$. It can be extended to any function $f$ such that $G_f$ is a chaotic @@ -526,7 +526,6 @@ to an input one. compute the new output one $\left(x^{t+1}_1,\dots,x^{t+1}_n\right)$. While the remaining input receives a new integer value $S^t \in \llbracket1;n\rrbracket$, which is provided by the outside world. -\JFC{en dire davantage sur l'outside world} \end{itemize} The topological behavior of these particular neural networks can be @@ -557,7 +556,7 @@ condition $\left(S,(x_1^0,\dots, x_n^0)\right) \in \llbracket 1;n \rrbracket^{\mathds{N}} \times \mathds{B}^n$. Theoretically speaking, such iterations of $F_f$ are thus a formal model of these kind of recurrent neural networks. In the rest of this -paper, we will call such multilayer perceptrons CI-MLP($f$), which +paper, we will call such multilayer perceptrons ``CI-MLP($f$)'', which stands for ``Chaotic Iterations based MultiLayer Perceptron''. Checking if CI-MLP($f$) behaves chaotically according to Devaney's @@ -639,7 +638,7 @@ of the output space can be discarded when studying CI-MLPs: this space is intrinsically complicated and it cannot be decomposed or simplified. -Furthermore, those recurrent neural networks exhibit the instability +Furthermore, these recurrent neural networks exhibit the instability property: \begin{definition} A dynamical system $\left( \mathcal{X}, f\right)$ is {\bf unstable} @@ -700,7 +699,7 @@ Figure~\ref{Fig:scheme} is a summary of addressed neural networks and chaos problems. In Section~\ref{S2} we have explained how to construct a truly chaotic neural networks, $A$ for instance. Section~\ref{S3} has shown how to check whether a given MLP -$A$ or $C$ is chaotic or not in the sens of Devaney, and how to study +$A$ or $C$ is chaotic or not in the sense of Devaney, and how to study its topological behavior. The last thing to investigate, when comparing neural networks and Devaney's chaos, is to determine whether an artificial neural network $C$ is able to learn or predict some @@ -719,7 +718,7 @@ extraction takes place once at destination. The reverse ({\it i.e.}, automatically detecting the presence of hidden messages inside media) is called steganalysis. Among the deployed strategies inside detectors, there are support vectors -machines~\cite{Qiao:2009:SM:1704555.1704664}, neural +machines~\cite{Qiao:2009:SM:1704555.1704664}, neural networks~\cite{10.1109/ICME.2003.1221665,10.1109/CIMSiM.2010.36}, and Markov chains~\cite{Sullivan06steganalysisfor}. Most of these detectors give quite good results and are rather competitive when @@ -826,7 +825,7 @@ $3$~terms we obtain 2304~input-output pairs. \label{section:experiments} To study if chaotic iterations can be predicted, we choose to train -the multiLayer perceptron. As stated before, this kind of network is +the multilayer perceptron. As stated before, this kind of network is in particular well-known for its universal approximation property \cite{Cybenko89,DBLP:journals/nn/HornikSW89}. Furthermore, MLPs have been already considered for chaotic time series prediction. For @@ -860,10 +859,10 @@ trainings of two data sets, one of them describing chaotic iterations, are compared. Thereafter we give, for the different learning setups and data sets, -the mean prediction success rate obtained for each output. A such rate -represent the percentage of input-output pairs belonging to the test +the mean prediction success rate obtained for each output. Such a rate +represents the percentage of input-output pairs belonging to the test subset for which the corresponding output value was correctly -predicted. These values are computed considering 10~trainings with +predicted. These values are computed considering 10~trainings with random subsets construction, weights and biases initialization. Firstly, neural networks having 10 and 25~hidden neurons are trained, with a maximum number of epochs that takes its value in @@ -879,7 +878,7 @@ hidden layer up to 40~neurons and we consider larger number of epochs. \centering {\small \begin{tabular}{|c|c||c|c|c|} \hline -\multicolumn{5}{|c|}{Networks topology: 6~inputs, 5~outputs and one hidden layer} \\ +\multicolumn{5}{|c|}{Networks topology: 6~inputs, 5~outputs, and one hidden layer} \\ \hline \hline \multicolumn{2}{|c||}{Hidden neurons} & \multicolumn{3}{c|}{10 neurons} \\ @@ -931,17 +930,25 @@ is observed (from 36.10\% for 10~neurons and 125~epochs to 70.97\% for 25~neurons and 500~epochs). We also notice that the learning of outputs~(2) and~(3) is more difficult. Conversely, for the non-chaotic case the simplest training setup is enough to predict -configurations. For all those feedforward network topologies and all +configurations. For all these feedforward network topologies and all outputs the obtained results for the non-chaotic case outperform the chaotic ones. Finally, the rates for the strategies show that the -different networks are unable to learn them. - -\newpage +different feedforward networks are unable to learn them. For the second coding scheme (\textit{i.e.}, with Gray Codes) Table~\ref{tab2} shows that any network learns about five times more -non-chaotic configurations than chaotic ones. As in the previous -scheme, the strategies cannot be predicted. +non-chaotic configurations than chaotic ones. As in the previous +scheme, the strategies cannot be predicted. +Figures~\ref{Fig:chaotic_predictions} and +\ref{Fig:non-chaotic_predictions} present the predictions given by two +feedforward multilayer perceptrons that were respectively trained to +learn chaotic and non-chaotic data, using the second coding scheme. +Each figure shows for each sample of the test subset (577~samples, +representing 25\% of the 2304~samples) the configuration that should +have been predicted and the one given by the multilayer perceptron. It +can be seen that for the chaotic data the predictions are far away +from the expected configurations. Obviously, the better predictions +for the non-chaotic data reflect their regularity. Let us now compare the two coding schemes. Firstly, the second scheme disturbs the learning process. In fact in this scheme the @@ -949,14 +956,13 @@ configuration is always expressed as a natural number, whereas in the first one the number of inputs follows the increase of the Boolean vectors coding configurations. In this latter case, the coding gives a finer information on configuration evolution. -\JFC{Je n'ai pas compris le paragraphe precedent. Devrait être repris} \begin{table}[b] \caption{Prediction success rates for configurations expressed with Gray code} \label{tab2} \centering \begin{tabular}{|c|c||c|c|c|} \hline -\multicolumn{5}{|c|}{Networks topology: 3~inputs, 2~outputs and one hidden layer} \\ +\multicolumn{5}{|c|}{Networks topology: 3~inputs, 2~outputs, and one hidden layer} \\ \hline \hline & Hidden neurons & \multicolumn{3}{c|}{10 neurons} \\ @@ -983,12 +989,26 @@ finer information on configuration evolution. \end{tabular} \end{table} +\begin{figure} + \centering + \includegraphics[scale=0.5]{chaotic_trace2} + \caption {Second coding scheme - Predictions obtained for a chaotic test subset.} + \label{Fig:chaotic_predictions} +\end{figure} + +\begin{figure} + \centering + \includegraphics[scale=0.5]{non-chaotic_trace2} + \caption{Second coding scheme - Predictions obtained for a non-chaotic test subset.} + \label{Fig:non-chaotic_predictions} +\end{figure} + Unfortunately, in practical applications the number of components is usually unknown. Hence, the first coding scheme cannot be used systematically. Therefore, we provide a refinement of the second scheme: each output is learned by a different ANN. Table~\ref{tab3} presents the results for this approach. In any case, whatever the -considered feedforward network topologies, the maximum epoch number +considered feedforward network topologies, the maximum epoch number, and the kind of iterations, the configuration success rate is slightly improved. Moreover, the strategies predictions rates reach almost 12\%, whereas in Table~\ref{tab2} they never exceed 1.5\%. Despite of @@ -1001,7 +1021,7 @@ appear to be an open issue. \centering \begin{tabular}{|c||c|c|c|} \hline -\multicolumn{4}{|c|}{Networks topology: 3~inputs, 1~output and one hidden layer} \\ +\multicolumn{4}{|c|}{Networks topology: 3~inputs, 1~output, and one hidden layer} \\ \hline \hline Epochs & 125 & 250 & 500 \\ @@ -1063,10 +1083,6 @@ Chaotic/non chaotic & \multicolumn{3}{c|}{Output = Strategy} \\ \end{tabular} \end{table} -% -% TO BE COMPLETED -% - \section{Conclusion} In this paper, we have established an equivalence between chaotic @@ -1100,12 +1116,10 @@ be investigated too, to discover which tools are the most relevant when facing a truly chaotic phenomenon. A comparison between learning rate success and prediction quality will be realized. Concrete consequences in biology, physics, and computer science security fields -will be stated. +will then be stated. % \appendix{} - - % \begin{definition} \label{def2} % A continuous function $f$ on a topological space $(\mathcal{X},\tau)$ % is defined to be {\emph{topologically transitive}} if for any pair of @@ -1116,7 +1130,6 @@ will be stated. % $f^k(U) \cap V \neq \emptyset$. % \end{definition} - \bibliography{chaos-paper}% Produces the bibliography via BibTeX. \end{document}