IH/iihmsp10.tex

   1 \section{Chaotic iterations versus Spread-spectrum: chaos and stego security~\cite{gfb10:ip}}
   2 \label{sec:iihmsp10}
   3 The first investigations in steganography since our thesis was published in
   4 IIH-MSP'10, 6-th Int. Conf. on Intelligent Information Hiding and
   5 Multimedia Signal Processing~\cite{gfb10:ip}, whose contribution is summarized in this section.
   6
   7
   8 \subsection{Introduction}\label{sec:introduction}
   9
  10 Information hiding has recently become a major digital technology~\cite{1411349,Xie09:BASecurity}, especially with the increasing importance and widespread distribution of digital media through the Internet.
  11 Spread-spectrum data-hiding techniques have been widely studied in recent years
  12 under the scope of security. These techniques encompass several schemes, such as
  13 Improved Spread Spectrum~(ISS), Circular Watermarking~(CW), and Natural
  14 Watermarking~(NW). Some of these schemes have revealed various
  15 security issues. On the contrary, it has
  16 been proven in~\cite{Cayre2008} that the Natural Watermarking technique is
  17 stego-secure. This stego-security is one of the security classes defined
  18 in~\cite{Cayre2008}. In this paper, probabilistic models are used to
  19 categorize the security of data hiding algorithms in the Watermark Only
  20 Attack~(WOA) framework.
  21
  22 We have explained at the beginning of this manuscript
  23 that any algorithm can be rewritten as an iterative
  24 process, leading to the possibility to study its
  25 topological behavior. As a concrete example,
  26 we have shown in~\cite{gfb10:ip} that the security level of these information hiding algorithms can be studied into a
  27 novel framework based on unpredictability, as it is understood in the
  28 theory of chaos~\cite{Devaney}. To do so and in the continuation of our thesis, a new class of security has been
  29 introduced in~\cite{gfb10:ip}, namely the topological security. This new class can be used to study
  30 some categories of attacks that are difficult to investigate in the existing
  31 security approach. It also enriches the variety of qualitative and quantitative
  32 tools that evaluate how strong the security is, thus reinforcing the confidence
  33 that can be added in a given scheme.
  34
  35 In addition of being stego-secure, we have proven during our thesis and published later in~\cite{gfb10:ip} that
  36 Natural Watermarking technique is topologically secure. Moreover, this technique
  37 possesses additional properties of unpredictability, namely, strong transitivity,
  38 topological mixing, and a constant of sensitivity equal to $\frac{N}{2}$.
  39 However NW are not expansive, which is problematic in the Constant-Message
  40 Attack (CMA) and Known Message Attack (KMA) setups~\cite{GuyeuxThese10}. In this section, it is recalled
  41 by using the new topological security framework, that a more secure scheme than NW
  42 can be found to withstand attacks in these setups. This scheme, introduced in~\cite{guyeux10ter}, is
  43 based another time on the chaotic iterations. The aim of~\cite{gfb10:ip} was to
  44 prove that this algorithm is stego-secure and topologically secure, to study its
  45 qualitative and quantitative properties of unpredictability, and then to compare
  46 it with Natural Watermarking (the topological study was realized at the end of
  47 our thesis while the stego-security has been proven later in~\cite{gfb10:ip}).
  48
  49
  50
  51
  52
  53
  54
  55 \subsection{Using chaotic iterations as information hiding schemes}
  56 \label{sec:data-hiding-algo-chaotic iterations}
  57
  58 To explain how to use chaotic iterations for information hiding, we must firstly define the
  59 significance of a given coefficient. This definition, given in our thesis,
  60 has been published later (in Secrypt'2011~\cite{fgb11:ip}).
  61
  62 \paragraph{Most and least significant coefficients}\label{sec:msc-lsc}
  63
  64
  65
  66 We first notice that terms of the original content $x$ that may be replaced by terms issued
  67 from the watermark $y$ are less important than other: they could be changed
  68 without be perceived as such. More generally, a
  69 \emph{signification function}
  70 attaches a weight to each term defining a digital media,
  71 depending on its position $t$.
  72
  73 \begin{Def}%[Signification function]
  74 A \emph{signification function} is a real sequence
  75 $(u^k)^{k \in \mathds{N}}$. % with a limit equal to 0.
  76 \end{Def}
  77
  78
  79 \begin{Ex}\label{Exemple LSC}
  80 Let us consider a set of
  81 grayscale images stored into portable graymap format (P3-PGM):
  82 each pixel ranges between 256 gray levels, \textit{i.e.},
  83 is memorized with eight bits.
  84 In that context, we consider
  85 $u^k = 8 - (k  \mod  8)$  to be the $k$-th term of a signification function
  86 $(u^k)^{k \in \mathds{N}}$.
  87 Intuitively, in each group of eight bits (\textit{i.e.}, for each pixel)
  88 the first bit has an importance equal to 8, whereas the last bit has an
  89 importance equal to 1. This is compliant with the idea that
  90 changing the first bit affects more the image than changing the last one.
  91 \end{Ex}
  92
  93 \begin{Def}%[Significance of coefficients]
  94 \label{def:msc,lsc}
  95 Let $(u^k)^{k \in \mathds{N}}$ be a signification function,
  96 $m$ and $M$ be two reals s.t. $m < M$.
  97 \begin{itemize}
  98 \item The \emph{most significant coefficients (MSCs)} of $x$ is the finite
  99   vector  $$u_M = \left( k ~ \big|~ k \in \mathds{N} \textrm{ and } u^k
 100     \geqslant M \textrm{ and }  k \le \mid x \mid \right);$$
 101  \item The \emph{least significant coefficients (LSCs)} of $x$ is the
 102 finite vector
 103 $$u_m = \left( k ~ \big|~ k \in \mathds{N} \textrm{ and } u^k
 104   \le m \textrm{ and }  k \le \mid x \mid \right);$$
 105  \item The \emph{passive coefficients} of $x$ is the finite vector
 106    $$u_p = \left( k ~ \big|~ k \in \mathds{N} \textrm{ and }
 107 u^k \in ]m;M[ \textrm{ and }  k \le \mid x \mid \right).$$
 108  \end{itemize}
 109  \end{Def}
 110
 111 For a given host content $x$,
 112 MSCs are then ranks of $x$  that describe the relevant part
 113 of the image, whereas LSCs translate its less significant parts.
 114 These two definitions are illustrated on Figure~\ref{fig:MSCLSC},
 115  where the significance function $(u^k)$ is defined as in Example \ref{Exemple LSC}, $M=5$, and $m=6$.
 116
 117
 118
 119 \begin{figure}[htb]
 120
 121
 122 \begin{minipage}[b]{.98\linewidth}
 123   \centering
 124   \centerline{\includegraphics[width=3.cm]{IH/img/lena512}}
 125   \centerline{(a) Lena.}
 126 \end{minipage}
 127 \begin{minipage}[b]{.49\linewidth}
 128   \centering
 129     \centerline{\includegraphics[width=3.cm]{IH/img/lena_msb_678}}
 130   \centerline{(b) MSCs of Lena.}
 131 \end{minipage}
 132 \hfill
 133 \begin{minipage}[b]{0.49\linewidth}
 134   \centering
 135     \centerline{\includegraphics[width=3.cm]{IH/img/lena_lsb_1234_facteur17}}
 136
 137  %\centerline{\epsfig{figure=img/lena_lsb_1234_facteur17.pdf,width=4cm}}
 138   \centerline{(c) LSCs of Lena ($\times 17$).}
 139 \end{minipage}
 140 %
 141 \caption{Most and least significant coefficients of Lena.}
 142 \label{fig:MSCLSC}
 143 %
 144 \end{figure}
 145
 146
 147
 148
 149
 150 \subsubsection{Presentation of the $\mathcal{CIW}_1$ dhCI scheme}
 151
 152 We have proposed in SECRYPT10~\cite{guyeux10ter} to use chaotic iterations as an information hiding scheme.
 153 It has been called dhCI (for data hiding based on chaotic
 154 iterations) in our thesis, but it is denoted as
 155 $\mathcal{CIW}_1$ in later publications.
 156 It operates as follows (see Figure~\ref{fig:DWT}). Let:
 157
 158
 159 \begin{itemize}
 160   \item $(K,N) \in [0;1]\times \mathds{N}$ be an embedding key,
 161   \item $X \in \mathbb{B}^\mathsf{N}$ be the $\mathsf{N}$ least significant coefficients (LSCs) of a given cover media $C$,
 162   \item $(S^n)_{n \in \mathds{N}} \in \llbracket 1, \mathsf{N} \rrbracket^{\mathds{N}}$ be a strategy, which depends on the message to hide $M \in [0;1]$ and $K$,
 163   \item $f_0 : \mathbb{B}^\mathsf{N} \rightarrow \mathbb{B}^\mathsf{N}$ be the vectorial logical negation.
 164 \end{itemize}
 165
 166
 167
 168 \begin{figure}[h!]
 169 \centering
 170 \subfigure[Original Lena.]{\includegraphics[scale=0.3]{IH/Medias/lena512}}\hspace{1cm}\subfigure[Watermarked Lena.]{\includegraphics[scale=0.3]{IH/Medias/lena512marqueDWT}}
 171 \caption{Data hiding with chaotic iterations}
 172 \label{fig:DWT}
 173 \end{figure}
 174
 175
 176 So the watermarked media is $C$ whose LSCs are replaced by $Y_K=X^{N}$, where:
 177
 178 \begin{equation*}
 179 \left\{
 180  \begin{array}{l}
 181 X^0 = X\\
 182 \forall n < N, X^{n+1} = G_{f_0}\left(X^n\right).\\
 183 \end{array} \right.
 184 \end{equation*}
 185
 186 In the following section, two ways to generate $(S^n)_{n \in \mathds{N}}$ are given, namely
 187 Chaotic Iterations with Independent Strategy~(CIIS) and Chaotic Iterations with Dependent
 188 Strategy~(CIDS). In CIIS, the strategy is independent from the cover media $X$,
 189 whereas in CIDS the strategy will be dependent on $X$. These
 190 strategies have been introduced in~\cite{gfb10:ip}. Their stego-security are studied in Section~\ref{sec:chaotic iterations-security-level} and their topological security in Section~\ref{sec:topological security-evaluation}.
 191
 192
 193
 194
 195
 196
 197 \subsubsection{Examples of strategies}
 198 \label{sec:ciis-cids-example}
 199 \paragraph{CIIS strategy}
 200
 201 Let us first introduce the Piecewise Linear Chaotic Map~(PLCM, see~\cite{Shujun1}), defined by:
 202
 203 \begin{Def}[PLCM]\label{Def:PLCM}
 204 \begin{equation*}
 205 F(x,p)=\left\{
 206  \begin{array}{ccc}
 207 x/p & \text{if} & x \in [0;p] \\
 208 (x-p)/(\frac{1}{2} - p) & \text{if} & x \in \left[ p; \frac{1}{2} \right]
 209 \\
 210 F(1-x,p) & \text{else.} & \\
 211 \end{array} \right.
 212 \end{equation*}
 213 \end{Def}
 214
 215
 216 \noindent where $p \in \left] 0; \frac{1}{2} \right[$ is a ``control parameter''. Contrary to well-known chaotic maps like the logistic
 217 map, this PLCM is unbiased and does not present obvious security
 218 flaws~\cite{Shujun1}.
 219
 220 We define the general term of the strategy $(S^n)_n$ in CIIS setup by
 221 the following expression: $S^n = \left \lfloor \mathsf{N} \times K^n \right \rfloor +
 222 1$, where:
 223
 224 \begin{equation*}\label{eq:PLCM-for-strategy}
 225 \left\{
 226  \begin{array}{l}
 227 p \in \left[ 0 ; \frac{1}{2} \right] \\
 228 K^0 = M \otimes K\\
 229 K^{n+1} = F(K^n,p), \forall n \leq N_0\\ \end{array} \right.
 230 \end{equation*}
 231
 232
 233
 234 \noindent in which $\otimes$ denotes the bitwise exclusive or (XOR) between two floating part numbers (\emph{i.e.}, between their binary digits representation). Lastly, to be certain to enter into the chaotic regime of PLCM~\cite{Shujun1}, the strategy can be preferably defined by: $S^n = \left \lfloor \mathsf{N} \times K^{n+D} \right \rfloor + 1$, where $D \in \mathds{N}$.
 235
 236
 237
 238
 239 \paragraph{CIDS strategy}\label{sec:cids-example}
 240 The same notations as above are used.
 241 We define CIDS strategy as in~\cite{gfb10:ip}: $\forall k \leqslant N$,
 242 \begin{itemize}
 243 \item if $k \leqslant \mathsf{N}$ and $X^k = 1$, then $S^k=k$,
 244 \item else $S^k=1$.
 245 \end{itemize}
 246 In this situation, if $N \geqslant \mathsf{N}$, then only two watermarked contents are possible with the scheme proposed in Section~\ref{sec:data-hiding-algo-chaotic iterations}, namely: $Y_K=(0,0,\cdots,0)$ and $Y_K=(1,0,\cdots,0)$.
 247
 248
 249 \subsection{Security versus robustness}\label{sec:chaotic iterations-security-level}
 250
 251
 252 \subsubsection{Presentation}
 253
 254 Even if security and robustness are neighboring concepts without clearly
 255 established definitions~\cite{Perez-Freire06}, robustness is often considered
 256 to be mostly concerned with blind elementary attacks, whereas security is not
 257 limited to certain specific attacks. Indeed, security encompasses robustness
 258 and intentional attacks~\cite{Kalker2001,ComesanaPP05bis}. The best attempt to
 259 give an elegant and concise definition for each of these two terms was
 260 proposed in \cite{Kalker2001}. Following Kalker, we will consider in this
 261 research work the two following definitions:
 262
 263 \begin{Def}[Security~\cite{Kalker2001}]\label{def:security}
 264 Watermarking security refers to the
 265 inability by unauthorized users to have access to the raw watermarking channel
 266 [...] to remove, detect and estimate, write or modify the raw watermarking
 267 bits.
 268 \end{Def}
 269
 270 \begin{Def}[Robustness~\cite{Kalker2001}]\label{def:robustness}
 271 Robust watermarking is a mechanism to create a communication channel that is
 272 multiplexed into original content [...] It is required that, firstly, the
 273 perceptual degradation of the marked content [...] is minimal and, secondly,
 274 that the capacity of the watermark channel degrades as a smooth function of the
 275 degradation of the marked content.
 276 \end{Def}
 277
 278
 279 \subsubsection{Classification of attacks}
 280 \label{sec:attack-classes}
 281
 282 In the steganography framework, attacks have been classified
 283 in~\cite{Cayre2008} as follows.
 284
 285
 286 \begin{Def}
 287 Watermark-Only Attack (WOA) occurs when an attacker has only access to several watermarked contents.
 288 \end{Def}
 289
 290 \begin{Def}
 291 Known-Message Attack (KMA) occurs when an attacker has access to several pairs of watermarked contents and
 292 corresponding hidden messages.
 293 \end{Def}
 294
 295 \begin{Def}
 296 Known-Original Attack (KOA) is when an attacker has access to several pairs of watermarked contents and
 297 their corresponding original versions.
 298 \end{Def}
 299
 300 \begin{Def}
 301 Constant-Message Attack (CMA) occurs when the attacker observes several watermarked contents and only knows that the
 302 unknown hidden message is the same in all contents.
 303 \end{Def}
 304
 305
 306 A synthesis of this classification is given in
 307 Table~\ref{table:attack-classification}.
 308 \begin{table}[h]
 309 \begin{center}
 310
 311  \begin{tabular}{|c||c|c|c|}
 312   \hline
 313    \textbf{Class} &  \textbf{Original content} &  \textbf{Stego content} &
 314    \textbf{Hidden message}\\
 315     \hline
 316     \hline
 317  \textbf{WOA} &   &    $\times$ & \\
 318   \hline
 319  \textbf{KMA}  & & $\times$ & $\times$ \\
 320   \hline
 321  \textbf{KOA} & $\times$ & $\times$ & \\
 322  \hline
 323  \textbf{CMA} & & & $\times$ \\
 324   \hline
 325   \end{tabular}
 326   \caption{Watermarking attacks classification in context of~\cite{Kalker2001}}
 327   \label{table:attack-classification}
 328   \end{center}
 329   \end{table}
 330
 331
 332
 333 \subsubsection{Definition of stego-security}\label{sec:stego-security-definition}
 334 \r
 335 \r
 336 In the Simmons' prisoner problem~\cite{Simmons83}, Alice and Bob are in jail and\r
 337 they want to,  possibly, devise an escape plan by  exchanging hidden messages in\r
 338 innocent-looking  cover contents.  These  messages  are to  be  conveyed to  one\r
 339 another by a common warden named Eve, who eavesdrops all contents and can choose\r
 340 to interrupt the communication if they appear to be stego-contents.\r
 341 \r
 342 Stego-security,  defined in  this well-known  context, is  the  highest security\r
 343 class in Watermark-Only  Attack setup, which occurs when Eve  has only access to\r
 344 several marked contents~\cite{Cayre2008}.\r
 345 \r
 346 \r
 347 Let $\mathds{K}$ be the set of embedding keys, $p(X)$ the probabilistic model of\r
 348 $N_0$ initial  host contents,  and $p(Y|K)$ the  probabilistic model  of $N_0$\r
 349 marked contents s.t. each host  content has  been marked\r
 350 with the same key $K$ and the same embedding function.\r
 351 \r
 352 \begin{Def}[Stego-Security~\cite{Cayre2008}]\r
 353 \label{Def:Stego-security}  The embedding  function  is \emph{stego-secure}\r
 354 if  $\forall K \in \mathds{K}, p(Y|K)=p(X)$ is established.\r
 355 \end{Def}\r
 356 \r
 357 \r
 358 \r
 359  Stego-security  states that  the knowledge  of  $K$ does  not help  to make  the\r
 360  difference  between $p(X)$ and  $p(Y)$.  This  definition implies  the following\r
 361  property:\r
 362  $$p(Y|K_1)= \cdots = p(Y|K_{N_k})=p(Y)=p(X)$$ \r
 363  This property is equivalent to  a zero Kullback-Leibler divergence, which is the\r
 364  accepted definition of the "perfect secrecy" in steganography~\cite{Cachin2004}.\r
 365 \r
 366
 367
 368 \subsection{Security evaluation}
 369
 370
 371 \subsubsection{Evaluation of the stego-security}
 372 \label{sec:stego-security-proof}
 373
 374 We have proven in~\cite{gfb10:ip} the following proposition.
 375
 376 \begin{Prop}
 377 CIIS are stego-secure, while CIDS do not satisfy this security property.
 378 \end{Prop}
 379
 380 \begin{Pre}
 381 See~\cite{gfb10:ip}.
 382 \end{Pre}
 383
 384
 385
 386 \subsubsection{Evaluation of the topological security}
 387 \label{sec:topological security-evaluation}
 388
 389 To check whether an information hiding scheme $S$ is topologically secure or not, we have proposed in our thesis, and published later
 390 in~\cite{gfb10:ip}, to write
 391 $S$ as an iterate process $x^{n+1}=f(x^n)$ on a metric space $(\mathcal{X},d)$. As recalled in the first
 392 chapter of this manuscript, this formulation is always possible. So,
 393
 394 \begin{Def}%[topological security]
 395 \label{Def:topological security-definition}
 396 An information hiding scheme $S$ is said to be topologically secure on
 397 $(\mathcal{X},d)$ if its iterative process has a chaotic
 398 behavior according to Devaney.
 399 \end{Def}
 400
 401
 402
 403 Due to the chaos properties of the so-called chaotic iterations,
 404 we have then deduced in~\cite{gfb10:ip} that,
 405
 406 \begin{Prop}
 407 CIIS and CIDS are topologically secure.
 408 \end{Prop}
 409
 410 We have then deduced qualitative and quantitative
 411 properties of topological security for this information
 412 hiding scheme in~\cite{gfb10:ip}: it is expansive (with a constant of
 413 expansiveness equal to 1), topologically mixing,
 414 etc. These properties can measure the
 415 disorder generated by our scheme, giving by doing so an important information about the
 416 unpredictability level of such a process, which helps to compare it
 417 to other data hiding methods. Such a comparison is outlined in
 418 the next section~\cite{gfb10:ip}.
 419
 420
 421
 422 \subsection{Comparison between spread-spectrum and chaotic
 423 iterations}\label{sec:comparison-application-context}
 424
 425 The consequences of topological mixing for data hiding are multiple. Firstly,
 426 security can be largely improved by considering
 427 the number of iterations as a secret key. An attacker
 428 will reach all of the possible media when iterating without this key.
 429 Additionally, he cannot benefit from a KOA setup, by studying media in the
 430 neighborhood of the original cover. Moreover, as in a topological mixing
 431 situation, it is possible that any hidden message (the initial condition), is
 432 sent to the same fixed watermarked content (with different numbers of
 433 iterations), the interest to be in a KMA setup is drastically reduced. Lastly, as
 434 all of the watermarked contents are possible for a given hidden message, depending
 435 on the number of iterations, CMA attacks will fail.
 436
 437
 438
 439 The property of expansiveness reinforces drastically the sensitivity in the aims of reducing the benefits that Eve can obtain from an attack in KMA or KOA setup. For example, it is impossible to have an estimation of the watermark by moving the message (or the cover) as a cursor in situation of expansiveness: this cursor will be too much sensitive and the changes will be too important to be useful.
 440 On the contrary, a very large constant of expansiveness $\varepsilon$ is unsuitable: the cover media will be strongly altered whereas the watermark would be undetectable.
 441 Finally, spread-spectrum is relevant when a discrete and secure data hiding technique is required in WOA setup. However, this technique should not be used in KOA and KMA setup, due to its lack of expansiveness.