X-Git-Url: https://bilbo.iut-bm.univ-fcomte.fr/and/gitweb/chaos1.git/blobdiff_plain/e2b5fffa698ba0b8084b270892f0d20ed6cf64c0..HEAD:/main.tex?ds=inline diff --git a/main.tex b/main.tex index 82e99bf..49564db 100644 --- a/main.tex +++ b/main.tex @@ -88,7 +88,6 @@ far more difficult to learn. \maketitle \begin{quotation} - Chaotic neural networks have received a lot of attention due to the appealing properties of deterministic chaos (unpredictability, sensitivity, and so on). However, such networks are often claimed @@ -97,7 +96,7 @@ work a theoretical framework based on the Devaney's definition of chaos is introduced. Starting with a relationship between discrete iterations and Devaney's chaos, we firstly show how to build a recurrent neural network that is equivalent to a chaotic map and -secondly a way to check whether an already available network is +secondly a way to check whether an already available network is chaotic or not. We also study different topological properties of these truly chaotic neural networks. Finally, we show that the learning, with neural networks having a classical feedforward @@ -110,7 +109,7 @@ chaotic maps, is far more difficult than non chaotic behaviors. \label{S1} Several research works have proposed or used chaotic neural networks -these last years. The complex dynamics of such networks lead to +these last years. The complex dynamics of such networks lead to various potential application areas: associative memories~\cite{Crook2007267} and digital security tools like hash functions~\cite{Xiao10}, digital @@ -136,8 +135,8 @@ are suitable to model nonlinear relationships between data, due to their universal approximator capacity \cite{Cybenko89,DBLP:journals/nn/HornikSW89}. Thus, this kind of networks can be trained to model a physical phenomenon known to be -chaotic such as Chua's circuit \cite{dalkiran10}. Sometime a neural -network, which is build by combining transfer functions and initial +chaotic such as Chua's circuit \cite{dalkiran10}. Sometime a neural +network, which is build by combining transfer functions and initial conditions that are both chaotic, is itself claimed to be chaotic \cite{springerlink:10.1007/s00521-010-0432-2}. @@ -151,10 +150,10 @@ a dynamical system: ergodicity, expansivity, and so on. More precisely, in this paper, which is an extension of a previous work \cite{bgs11:ip}, we establish the equivalence between chaotic iterations and a class of globally recurrent MLP. The second -contribution is a study of the converse problem, indeed we investigate the -ability of classical multiLayer perceptrons to learn a particular +contribution is a study of the converse problem, indeed we investigate +the ability of classical multilayer perceptrons to learn a particular family of discrete chaotic dynamical systems. This family is defined -by a Boolean vector, an update function, and a sequence defining which +by a Boolean vector, an update function, and a sequence defining the component to update at each iteration. It has been previously established that such dynamical systems are chaotically iterated (as it is defined by Devaney) when the chosen function has a strongly @@ -234,14 +233,15 @@ point dependence on initial conditions. neighborhood of its future evolution eventually overlap with any other given region. This property implies that a dynamical system cannot be broken into simpler subsystems. Intuitively, its complexity does not -allow any simplification. On the contrary, a dense set of periodic -points is an element of regularity that a chaotic dynamical system has -to exhibit. +allow any simplification. However, chaos needs some regularity to ``counteracts'' the effects of -transitivity. +transitivity. % Thus, two points close to each other can behave in a completely different manner, the first one visiting the whole space whereas the second one has a regular orbit. +In the Devaney's formulation, a dense set of periodic +points is the element of regularity that a chaotic dynamical system has +to exhibit. %\begin{definition} \label{def3} -We recall that a point $x$ is {\bf periodic point} for $f$ of +We recall that a point $x$ is a {\bf periodic point} for $f$ of period~$n \in \mathds{N}^{\ast}$ if $f^{n}(x)=x$. %\end{definition} Then, the map @@ -261,12 +261,12 @@ Let us recall that $f$ has {\bf sensitive dependence on initial $n > 0$ such that $d\left(f^{n}(x), f^{n}(y)\right) >\delta $. The value $\delta$ is called the {\bf constant of sensitivity} of $f$. -Finally, The dynamical system that iterates $f$ is {\bf chaotic +Finally, the dynamical system that iterates $f$ is {\bf chaotic according to Devaney} on $(\mathcal{X},\tau)$ if $f$ is regular, topologically transitive, and has sensitive dependence to its initial conditions. In what follows, iterations are said to be chaotic -according Devaney when corresponding dynamical system is chaotic -according Devaney. +(according to Devaney) when the corresponding dynamical system is +chaotic, as it is defined in the Devaney's formulation. %Let us notice that for a metric space the last condition follows from %the two first ones~\cite{Banks92}. @@ -311,15 +311,15 @@ $i$ from $x$ to $N(i,x)$ if and only if $f_i(x)$ is $N(i,x)$. In the sequel, the {\it strategy} $S=(S^{t})^{t \in \Nats}$ is the sequence defining which component to update at time $t$ and $S^{t}$ denotes its $t-$th term. This iteration scheme that only modifies one -element at each iteration is classically referred as {\it asynchronous +element at each iteration is usually referred as {\it asynchronous iterations}. More precisely, we have for any $i$, $1\le i \le n$, \begin{equation} \left\{ \begin{array}{l} x^{0} \in \Bool^n \\ x^{t+1}_i = \left\{ \begin{array}{l} - f_i(x^t) \textrm{ if $S^t = i$} \\ - x_i^t \textrm{ otherwise} + f_i(x^t) \textrm{ if $S^t = i$}\enspace , \\ + x_i^t \textrm{ otherwise}\enspace . \end{array} \right. \end{array} \right. @@ -373,15 +373,17 @@ X^{k+1}& = & G_{f}(X^{k})\\ \right. \label{eq:Gf} \end{equation} -where $\sigma$ is the function that removes the first term of the -strategy ({\it i.e.},~$S^0$). This definition allows to links -asynchronous iterations with classical iterations of a dynamical +where $\sigma$ is the so-called shift function that removes the first +term of the strategy ({\it i.e.},~$S^0$). This definition allows to +link asynchronous iterations with classical iterations of a dynamical system. Note that it can be extended by considering subsets for $S^t$. To study topological properties of these iterations, we are then left to introduce a {\bf distance} $d$ between two points $(S,x)$ and -$(\check{S},\check{x})\in \mathcal{X} = \llbracket1;n\rrbracket^\Nats -\times \Bool^{n}$. It is defined by +$(\check{S},\check{x})$ in $\mathcal{X} = +\llbracket1;n\rrbracket^\Nats \times \Bool^{n}$. Let $\Delta(x,y) = 0$ +if $x=y$, and $\Delta(x,y) = 1$ else, be a distance on +$\mathds{B}$. The distance $d$ is defined by \begin{equation} d((S,x);(\check{S},\check{x}))=d_{e}(x,\check{x})+d_{s}(S,\check{S}) \enspace , @@ -406,10 +408,10 @@ these requirements: on the one hand its floor value reflects the difference between the cells, on the other hand its fractional part measures the difference between the strategies. -The relation between $\Gamma(f)$ and $G_f$ is clear: there exists a +The relation between $\Gamma(f)$ and $G_f$ is obvious: there exists a path from $x$ to $x'$ in $\Gamma(f)$ if and only if there exists a strategy $s$ such that iterations of $G_f$ from the initial point -$(s,x)$ reaches the configuration $x'$. Using this link, +$(s,x)$ reach the configuration $x'$. Using this link, Guyeux~\cite{GuyeuxThese10} has proven that, \begin{theorem}%[Characterization of $\mathcal{C}$] \label{Th:Caracterisation des IC chaotiques} @@ -424,7 +426,7 @@ case, iterations of the function $G_f$ as defined in Eq.~(\ref{eq:Gf}) are chaotic according to Devaney. -Let us then define two function $f_0$ and $f_1$ both in +Let us then define two functions $f_0$ and $f_1$ both in $\Bool^n\to\Bool^n$ that are used all along this paper. The former is the vectorial negation, \textit{i.e.}, $f_{0}(x_{1},\dots,x_{n}) =(\overline{x_{1}},\dots,\overline{x_{n}})$. The latter is @@ -460,9 +462,9 @@ we obtain a global recurrent neural network that behaves as follows \item When the network is activated at the $t^{th}$ iteration, the state of the system $x^t \in \mathds{B}^n$ received from the output layer and the initial term of the sequence $(S^t)^{t \in \Nats}$ - ($S^0 \in \llbracket 1;n\rrbracket$) are used to compute the new - output vector. This new vector, which represents the new state of - the dynamical system, satisfies: + (\textit{i.e.}, $S^0 \in \llbracket 1;n\rrbracket$) are used to + compute the new output vector. This new vector, which represents + the new state of the dynamical system, satisfies: \begin{equation} x^{t+1}=F_{f_0}(S^0, x^t) \in \mathds{B}^n \enspace . \end{equation} @@ -477,20 +479,17 @@ we obtain a global recurrent neural network that behaves as follows The behavior of the neural network is such that when the initial state is $x^0~\in~\mathds{B}^n$ and a sequence $(S^t)^{t \in \Nats}$ is -given as outside input, -\JFC{en dire davantage sur l'outside world} %% TO BE UPDATED -then the sequence of successive published +given as outside input, then the sequence of successive published output vectors $\left(x^t\right)^{t \in \mathds{N}^{\ast}}$ is exactly the one produced by the chaotic iterations formally described in -Eq.~(\ref{eq:Gf}). It means that mathematically if we use similar +Eq.~(\ref{eq:Gf}). It means that mathematically if we use similar input vectors they both generate the same successive outputs $\left(x^t\right)^{t \in \mathds{N}^{\ast}}$, and therefore that they are equivalent reformulations of the iterations of $G_{f_0}$ in $\mathcal{X}$. Finally, since the proposed neural network is built to -model the behavior of $G_{f_0}$, whose iterations are - chaotic according to -Devaney's definition of chaos, we can conclude that the network is -also chaotic in this sense. +model the behavior of $G_{f_0}$, whose iterations are chaotic +according to the Devaney's definition of chaos, we can conclude that +the network is also chaotic in this sense. The previous construction scheme is not restricted to function $f_0$. It can be extended to any function $f$ such that $G_f$ is a chaotic @@ -527,7 +526,6 @@ to an input one. compute the new output one $\left(x^{t+1}_1,\dots,x^{t+1}_n\right)$. While the remaining input receives a new integer value $S^t \in \llbracket1;n\rrbracket$, which is provided by the outside world. -\JFC{en dire davantage sur l'outside world} \end{itemize} The topological behavior of these particular neural networks can be @@ -558,7 +556,7 @@ condition $\left(S,(x_1^0,\dots, x_n^0)\right) \in \llbracket 1;n \rrbracket^{\mathds{N}} \times \mathds{B}^n$. Theoretically speaking, such iterations of $F_f$ are thus a formal model of these kind of recurrent neural networks. In the rest of this -paper, we will call such multilayer perceptrons CI-MLP($f$), which +paper, we will call such multilayer perceptrons ``CI-MLP($f$)'', which stands for ``Chaotic Iterations based MultiLayer Perceptron''. Checking if CI-MLP($f$) behaves chaotically according to Devaney's @@ -640,7 +638,7 @@ of the output space can be discarded when studying CI-MLPs: this space is intrinsically complicated and it cannot be decomposed or simplified. -Furthermore, those recurrent neural networks exhibit the instability +Furthermore, these recurrent neural networks exhibit the instability property: \begin{definition} A dynamical system $\left( \mathcal{X}, f\right)$ is {\bf unstable} @@ -701,7 +699,7 @@ Figure~\ref{Fig:scheme} is a summary of addressed neural networks and chaos problems. In Section~\ref{S2} we have explained how to construct a truly chaotic neural networks, $A$ for instance. Section~\ref{S3} has shown how to check whether a given MLP -$A$ or $C$ is chaotic or not in the sens of Devaney, and how to study +$A$ or $C$ is chaotic or not in the sense of Devaney, and how to study its topological behavior. The last thing to investigate, when comparing neural networks and Devaney's chaos, is to determine whether an artificial neural network $C$ is able to learn or predict some @@ -720,7 +718,7 @@ extraction takes place once at destination. The reverse ({\it i.e.}, automatically detecting the presence of hidden messages inside media) is called steganalysis. Among the deployed strategies inside detectors, there are support vectors -machines~\cite{Qiao:2009:SM:1704555.1704664}, neural +machines~\cite{Qiao:2009:SM:1704555.1704664}, neural networks~\cite{10.1109/ICME.2003.1221665,10.1109/CIMSiM.2010.36}, and Markov chains~\cite{Sullivan06steganalysisfor}. Most of these detectors give quite good results and are rather competitive when @@ -827,7 +825,7 @@ $3$~terms we obtain 2304~input-output pairs. \label{section:experiments} To study if chaotic iterations can be predicted, we choose to train -the multiLayer perceptron. As stated before, this kind of network is +the multilayer perceptron. As stated before, this kind of network is in particular well-known for its universal approximation property \cite{Cybenko89,DBLP:journals/nn/HornikSW89}. Furthermore, MLPs have been already considered for chaotic time series prediction. For @@ -861,10 +859,10 @@ trainings of two data sets, one of them describing chaotic iterations, are compared. Thereafter we give, for the different learning setups and data sets, -the mean prediction success rate obtained for each output. A such rate -represent the percentage of input-output pairs belonging to the test +the mean prediction success rate obtained for each output. Such a rate +represents the percentage of input-output pairs belonging to the test subset for which the corresponding output value was correctly -predicted. These values are computed considering 10~trainings with +predicted. These values are computed considering 10~trainings with random subsets construction, weights and biases initialization. Firstly, neural networks having 10 and 25~hidden neurons are trained, with a maximum number of epochs that takes its value in @@ -880,7 +878,7 @@ hidden layer up to 40~neurons and we consider larger number of epochs. \centering {\small \begin{tabular}{|c|c||c|c|c|} \hline -\multicolumn{5}{|c|}{Networks topology: 6~inputs, 5~outputs and one hidden layer} \\ +\multicolumn{5}{|c|}{Networks topology: 6~inputs, 5~outputs, and one hidden layer} \\ \hline \hline \multicolumn{2}{|c||}{Hidden neurons} & \multicolumn{3}{c|}{10 neurons} \\ @@ -932,32 +930,39 @@ is observed (from 36.10\% for 10~neurons and 125~epochs to 70.97\% for 25~neurons and 500~epochs). We also notice that the learning of outputs~(2) and~(3) is more difficult. Conversely, for the non-chaotic case the simplest training setup is enough to predict -configurations. For all those feedforward network topologies and all +configurations. For all these feedforward network topologies and all outputs the obtained results for the non-chaotic case outperform the chaotic ones. Finally, the rates for the strategies show that the -different networks are unable to learn them. - -\newpage +different feedforward networks are unable to learn them. For the second coding scheme (\textit{i.e.}, with Gray Codes) Table~\ref{tab2} shows that any network learns about five times more -non-chaotic configurations than chaotic ones. As in the previous -scheme, the strategies cannot be predicted. +non-chaotic configurations than chaotic ones. As in the previous +scheme, the strategies cannot be predicted. +Figures~\ref{Fig:chaotic_predictions} and +\ref{Fig:non-chaotic_predictions} present the predictions given by two +feedforward multilayer perceptrons that were respectively trained to +learn chaotic and non-chaotic data, using the second coding scheme. +Each figure shows for each sample of the test subset (577~samples, +representing 25\% of the 2304~samples) the configuration that should +have been predicted and the one given by the multilayer perceptron. It +can be seen that for the chaotic data the predictions are far away +from the expected configurations. Obviously, the better predictions +for the non-chaotic data reflect their regularity. Let us now compare the two coding schemes. Firstly, the second scheme disturbs the learning process. In fact in this scheme the configuration is always expressed as a natural number, whereas in the -first one the number of inputs follows the increase of the boolean +first one the number of inputs follows the increase of the Boolean vectors coding configurations. In this latter case, the coding gives a finer information on configuration evolution. -\JFC{Je n'ai pas compris le paragraphe precedent. Devrait être repris} \begin{table}[b] \caption{Prediction success rates for configurations expressed with Gray code} \label{tab2} \centering \begin{tabular}{|c|c||c|c|c|} \hline -\multicolumn{5}{|c|}{Networks topology: 3~inputs, 2~outputs and one hidden layer} \\ +\multicolumn{5}{|c|}{Networks topology: 3~inputs, 2~outputs, and one hidden layer} \\ \hline \hline & Hidden neurons & \multicolumn{3}{c|}{10 neurons} \\ @@ -984,12 +989,26 @@ finer information on configuration evolution. \end{tabular} \end{table} +\begin{figure} + \centering + \includegraphics[scale=0.5]{chaotic_trace2} + \caption {Second coding scheme - Predictions obtained for a chaotic test subset.} + \label{Fig:chaotic_predictions} +\end{figure} + +\begin{figure} + \centering + \includegraphics[scale=0.5]{non-chaotic_trace2} + \caption{Second coding scheme - Predictions obtained for a non-chaotic test subset.} + \label{Fig:non-chaotic_predictions} +\end{figure} + Unfortunately, in practical applications the number of components is usually unknown. Hence, the first coding scheme cannot be used systematically. Therefore, we provide a refinement of the second scheme: each output is learned by a different ANN. Table~\ref{tab3} presents the results for this approach. In any case, whatever the -considered feedforward network topologies, the maximum epoch number +considered feedforward network topologies, the maximum epoch number, and the kind of iterations, the configuration success rate is slightly improved. Moreover, the strategies predictions rates reach almost 12\%, whereas in Table~\ref{tab2} they never exceed 1.5\%. Despite of @@ -1002,7 +1021,7 @@ appear to be an open issue. \centering \begin{tabular}{|c||c|c|c|} \hline -\multicolumn{4}{|c|}{Networks topology: 3~inputs, 1~output and one hidden layer} \\ +\multicolumn{4}{|c|}{Networks topology: 3~inputs, 1~output, and one hidden layer} \\ \hline \hline Epochs & 125 & 250 & 500 \\ @@ -1064,10 +1083,6 @@ Chaotic/non chaotic & \multicolumn{3}{c|}{Output = Strategy} \\ \end{tabular} \end{table} -% -% TO BE COMPLETED -% - \section{Conclusion} In this paper, we have established an equivalence between chaotic @@ -1101,12 +1116,10 @@ be investigated too, to discover which tools are the most relevant when facing a truly chaotic phenomenon. A comparison between learning rate success and prediction quality will be realized. Concrete consequences in biology, physics, and computer science security fields -will be stated. +will then be stated. % \appendix{} - - % \begin{definition} \label{def2} % A continuous function $f$ on a topological space $(\mathcal{X},\tau)$ % is defined to be {\emph{topologically transitive}} if for any pair of @@ -1117,7 +1130,6 @@ will be stated. % $f^k(U) \cap V \neq \emptyset$. % \end{definition} - \bibliography{chaos-paper}% Produces the bibliography via BibTeX. \end{document}