
%\section{A Robust Data Hiding Process Contributing to the Development of a Semantic Web~\cite{bcfg12b:ip}}


%
% \subsection{Introduction}
%
% This contribution focuses once more on robustness, which is defined by
% Kalker in~\cite{Kalker2001} as follows: ``Robust watermarking is a mechanism to
% create a communication channel that is multiplexed into
% original content [...] It is required that, firstly, the perceptual
% degradation of the marked content [...] is minimal and, secondly, that the capacity of the watermark channel degrades as
% a smooth function of the degradation of the marked content''. The context
% of~\cite{bcfg12b:ip} is the development of social web search engines:
% the key idea is to embed tags or metadata directly into internet
% media, to realize social search without any websites or databases,
% but with ad hoc engines able to extract these descriptions.
% As descriptions are directly embedded into media, whatever their formats,
% robustness of the
% chosen watermarking scheme is required in this situation, as descriptions should resist user modifications
% like resizing, compression, format conversion, or other
% classical user transformations in the field.
%
% To achieve such a goal,
% a new algorithm named $\mathcal{DI}_3$ is presented in~\cite{bcfg12b:ip}. It is
% inspired by $\mathcal{CIW}_1$ and $\mathcal{CIS}_2$, respectively published
% in~\cite{fgb11:ip} and~\cite{gfb10:ip}, and recalled previously in this chapter.
% Compared to the first one, $\mathcal{DI}_3$ is a steganographic scheme,
% not just a watermarking technique. That is, in our understanding, it can embed more than one bit. Unlike
% $\mathcal{CIS}_2$, which requires embedding keys with three strategies, only one sequence
% is required for $\mathcal{DI}_3$, so it is easier to implement. Indeed,
% $\mathcal{DI}_3$ is a faster instance of $\mathcal{CIS}_2$, as
% there is no message mixing in it.
% $\mathcal{DI}_3$ is well-defined through detailed algorithms in~\cite{bcfg12b:ip}, evaluated, and compared to
% some well-known watermarking schemes, namely the
% YASS~\cite{DBLP:conf/ih/SolankiSM07},
% nsF5~\cite{DBLP:conf/mmsec/FridrichPK07}, MMx~\cite{DBLP:conf/ih/KimDR06}, and HUGO~\cite{DBLP:conf/ih/PevnyFB10} algorithms
% detailed in Appendix~\ref{AppendixIH}.
%


\subsection{Implementing the new steganographic process}

\subsubsection{Implementation}

In the following algorithms, the following notations are used:
$S$ denotes the embedding and extraction strategy, and
$H$ the host content or the stego-content, depending on the context.
$LSC$ denotes the old or new
LSCs of the host or stego-content $H$, also depending on the context,
$N$ denotes the number of LSCs,
$\lambda$ the number of iterations to perform,
$M$ the secret message, and
$P$ the width of the message (number of bits).

Our new scheme, theoretically presented in~\cite{bcfg12a:ip}, is described here
by three main algorithms:
\begin{enumerate}
  \item The first one, detailed in Algorithm~\ref{algo:strategy},
generates the embedding strategy of the system, which
is part of the embedding key together with the choice of the LSCs and
the number of iterations to perform.

\item The second one, detailed
in Algorithm~\ref{algo:embed}, embeds the message into
the LSCs of the cover media using the strategy. The
strategy has been generated by the first algorithm, and the same number of
iterations is used.
\item The last one, detailed in Algorithm~\ref{algo:extract}, extracts
the secret message from the LSCs of the media (the stego-content) using the
strategy, which is part of the extraction key together with the width of the
message.
\end{enumerate}

In addition to these three functions, two complementary functions have
to be used:

\begin{enumerate}
  \item The first one, detailed in Algorithm~\ref{algo:signification-function},
  extracts MSCs, LSCs, and passive coefficients from the host content.
  Its implementation is based on the concept of signification
  function described in Definition~\ref{def:msc-lsc}.
 \item The last one, detailed in Algorithm~\ref{algo:build-function},
 rebuilds the new host content (the stego-content) from the corresponding MSCs,
 LSCs, and passive coefficients. Its implementation is also based on the concept
 of signification function described in Definition~\ref{def:msc-lsc}. This
 function performs the inverse operation of the previous one.
\end{enumerate}

\begin{Rem}
The two previous algorithms have
to be implemented by the user and adjusted to each application context:
in a spatial description, in a frequency description, or in another description. They
correspond to the theoretical concept described in Definition~\ref{def:msc-lsc}.
\end{Rem}
\begin{Ex}
For example, Algorithm~\ref{algo:signification-function} in the spatial domain
can correspond to the extraction of the 3 last bits of each pixel as LSCs, the 3
first bits as MSCs, and the 2 center bits as passive coefficients.
\end{Ex}
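
As an illustration of this spatial-domain example, the following Python sketch splits 8-bit greyscale pixel values into MSCs (3 first bits), passive coefficients (2 center bits), and LSCs (3 last bits), and then rebuilds the pixels from these three lists. The names \texttt{signification\_function} and \texttt{build\_function}, the 8-bit pixel depth, and the flat bit-list representation are assumptions made for this sketch only; other application contexts would require other implementations.

```python
def signification_function(pixels):
    """Split 8-bit pixels into (MSC, PC, LSC) bit lists.

    Assumed layout per pixel, most significant bit first:
    3 bits of MSC, 2 bits of passive coefficients, 3 bits of LSC.
    """
    msc, pc, lsc = [], [], []
    for p in pixels:
        bits = [(p >> (7 - j)) & 1 for j in range(8)]
        msc.extend(bits[0:3])
        pc.extend(bits[3:5])
        lsc.extend(bits[5:8])
    return msc, pc, lsc


def build_function(msc, pc, lsc):
    """Inverse operation: rebuild 8-bit pixels from the three bit lists."""
    pixels = []
    for n in range(len(lsc) // 3):
        bits = (msc[3 * n:3 * n + 3]
                + pc[2 * n:2 * n + 2]
                + lsc[3 * n:3 * n + 3])
        value = 0
        for b in bits:
            value = (value << 1) | b
        pixels.append(value)
    return pixels
```

Applying \texttt{build\_function} to the output of \texttt{signification\_function} returns the original pixel list, which is the inverse relationship required between the two complementary functions.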

\begin{algorithm}[h]
\tcc{$S$ is a sequence of
integers in $\llbracket 0,P-1 \rrbracket$, such that $(S_{n_0},\ldots,S_{n_0+P-1})$ is injective on
$\llbracket 0,P-1 \rrbracket$.}
\KwResult{$S$: The strategy, integer sequence $(S_0,S_1,\ldots)$.}
\Begin{
$n_0 \longleftarrow \lambda - P + 1$\;
\If{$P > N$ OR $n_0 < 0$}{
\Return{ERROR}}
$S \longleftarrow$ Array of width $\lambda + 1$, all values initialized to 0\;
$cpt \longleftarrow 0$\;
\While{$cpt < n_0$}{
$S_{cpt} \longleftarrow$ Random integer in $\llbracket 0,P-1
\rrbracket$\;
$cpt \longleftarrow cpt + 1$\;}
$A \longleftarrow$ Generated arrangement (permutation) of $\llbracket 0,P-1
\rrbracket$\;
\For{$k \in \llbracket 0,P-1\rrbracket$}{
$S_{n_0 + k} \longleftarrow A_k$\;
}
\Return{$S$}
}
\caption{$strategy(N,P,\lambda)$}
\label{algo:strategy}
\end{algorithm}
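
The strategy generation can be sketched in Python as follows. This is a minimal sketch under stated assumptions: it takes $n_0 = \lambda - P + 1$ and indexes the strategy from $0$ to $\lambda$ (which matches the loops over $\llbracket 0,\lambda\rrbracket$ in the embedding and extraction algorithms), it draws the arrangement with a random shuffle, and it uses the operating system's random source through \texttt{random.SystemRandom} as a stand-in for the secure PRNG. The optional \texttt{rng} parameter is hypothetical and only eases reproducible testing.

```python
import random


def strategy(N, P, lam, rng=None):
    """Return a strategy S of lam + 1 integers in [0, P-1].

    The last P terms form a permutation of [0, P-1], so every message
    index is visited at least once. Assumes n0 = lam - P + 1.
    """
    rng = rng or random.SystemRandom()  # assumption: OS-backed randomness
    n0 = lam - P + 1
    if P > N or n0 < 0:
        raise ValueError("need P <= N and lam >= P - 1")
    # First n0 terms: unconstrained random indices in [0, P-1].
    S = [rng.randrange(P) for _ in range(n0)]
    # Last P terms: a random arrangement of [0, P-1].
    tail = list(range(P))
    rng.shuffle(tail)
    return S + tail
```

For instance, \texttt{strategy(10, 4, 8)} returns a list of 9 integers whose last 4 entries are a permutation of $0,1,2,3$.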

\begin{algorithm}[h]
\KwResult{New LSCs with embedded message.}
\Begin{
$N \longleftarrow$ Number of LSCs in $LSC$\;
$P \longleftarrow$ Width of the message $M$\;
\For{$k \in \llbracket 0,\lambda\rrbracket$}{
$i \longleftarrow S_k$\;
$LSC_{i} \longleftarrow M_i$\;
}
\Return{$LSC$}
}
\caption{$embed(LSC, M, S, \lambda)$}
\label{algo:embed}
\end{algorithm}

\begin{algorithm}[h]
\KwResult{The message to extract from $LSC$.}
\Begin{
$RS \longleftarrow$ The strategy $S$ written in reverse order\;
$M \longleftarrow$ Array of width $P$, all values initialized to 0\;
\For{$k \in \llbracket 0,\lambda\rrbracket$}{
$i \longleftarrow RS_k$\;
$M_{i} \longleftarrow LSC_i$\;
}
\Return{$M$}
}
\caption{$extract(LSC, S, \lambda,P)$}
\label{algo:extract}
\end{algorithm}
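
The embedding and extraction loops can be sketched in Python as shown below. This is an illustrative sketch, not a reference implementation: LSCs and message bits are assumed to be plain lists of 0/1 values, and the parameter $\lambda$ is dropped from both signatures because it is taken to be implied by the length of the strategy ($\lambda + 1$ terms).

```python
def embed(lsc, message, S):
    """Return a copy of lsc in which LSC[i] = M[i] for every i visited by S."""
    out = list(lsc)  # keep the caller's list untouched
    for i in S:
        out[i] = message[i]
    return out


def extract(lsc, S, P):
    """Read the strategy in reverse order and copy LSC[i] into M[i]."""
    message = [0] * P
    for i in reversed(S):
        message[i] = lsc[i]
    return message
```

With $P = 4$ and the strategy \texttt{[2, 0, 2, 1, 3, 0, 2]}, whose last four terms are a permutation of $0,\ldots,3$, extracting from the embedded LSCs gives back the embedded message as long as the stego-content is not modified in between.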

\begin{algorithm}[h]
\KwData{$H$: The original host content.}
\KwResult{$MSC$: MSCs of the host content $H$.}
\KwResult{$PC$: Passive coefficients of the host content $H$.}
\KwResult{$LSC$: LSCs of the host content $H$.}
\Begin{
\tcc{Implemented by the user.}
\Return{$(MSC,PC,LSC)$}
}
\caption{$significationFunction(H)$}
\label{algo:signification-function}
\end{algorithm}

\begin{algorithm}[h]
\KwResult{$H$: The new rebuilt host content.}
\Begin{
\tcc{Implemented by the user.}
\Return{$H$}
}
\caption{$buildFunction(MSC,PC,LSC)$}
\label{algo:build-function}
\end{algorithm}

\subsubsection{Discussion}

We first notice that our $\mathcal{DI}_3$ scheme embeds the message in the LSBs, as
all the other approaches do.
Furthermore, among all the LSBs, the choice of those that are modified
according to the message is based on a secure PRNG, whereas F5, and thus nsF5,
only require a PRNG.
Finally, in this scheme, we have postponed the optimization that consists in
restricting the modified LSBs to a subset chosen according to the distortion their modification
may induce. In our view, further theoretical studies
are necessary to take this feature into consideration.
In future work, it is planned to compare the robustness and efficiency of all the
schemes in the context of the semantic web. To initiate this study in this first
article, the robustness of $\mathcal{DI}_3$ is detailed in the next section.

\subsection{Robustness Study}\label{sec:robustness-study}

This section evaluates the robustness of our approach~\cite{bcg11:ij}.

Each experiment is built on a set of 50 images randomly selected
from the database of the BOSS contest~\cite{DBLP:conf/ih/BasFP11}.
Each cover is a $512\times 512$ greyscale digital image.
The relative payload
is always set to 0.1 bit per pixel. Under that constraint,
the embedded message $m$ is a sequence of 26214 randomly generated bits.
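
This message width follows directly from the cover size and the relative payload, assuming the fractional part is simply dropped:
\[
P = \lfloor 0.1 \times 512 \times 512 \rfloor = \lfloor 26214.4 \rfloor = 26214 \ \mbox{bits}.
\]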

Following the model of robustness studies in previous similar work in the
field of information hiding, we choose some classical attacks: cropping,
compression, and rotation. Other attacks
and geometric transformations will be explored in a complementary study.
The robustness of the approach is tested by successively applying attacks to the stego-content images. Differences between the message extracted from the attacked image and the original one are computed and expressed as a percentage.
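
The difference measure can be sketched in Python as follows, assuming it is the proportion of bit positions at which the extracted message differs from the original one. The function name \texttt{bit\_difference\_percentage} is hypothetical, and messages are assumed to be equal-length lists of 0/1 values.

```python
def bit_difference_percentage(original, extracted):
    """Percentage of positions where the two bit sequences differ."""
    if len(original) != len(extracted):
        raise ValueError("messages must have the same width")
    if not original:
        raise ValueError("messages must not be empty")
    differing = sum(a != b for a, b in zip(original, extracted))
    return 100.0 * differing / len(original)
```

A value of 0 means that the message is recovered exactly, whereas two unrelated random bit sequences differ in about half of their positions on average.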


To deal with the cropping attack, different percentages of cropping
(from 1\% to 81\%) are applied to the stego-content image.
Fig.~\ref{fig:robustness-results}~(c) presents the effects of such an attack.


We address robustness against JPEG and JPEG 2000 compression.
Results are respectively presented in Fig.~\ref{fig:robustness-results}~(a) and
in Fig.~\ref{fig:robustness-results}~(b).


\begin{figure*}[Htb]

\begin{minipage}[b]{.24\linewidth}
  \centering
    \centerline{\includegraphics[width=5cm]{IH/graphs/atq-jpg}}
    \centerline{(a) JPEG effect.}
\end{minipage}
\hfill
\begin{minipage}[b]{0.24\linewidth}
  \centering
    \centerline{\includegraphics[width=5cm]{IH/graphs/atq-jp2}}
    \centerline{(b) JPEG 2000 effect.}
\end{minipage}
\hfill
\begin{minipage}[b]{.24\linewidth}
  \centering
    \centerline{\includegraphics[width=5cm]{IH/graphs/atq-dec}}
    \centerline{(c) Cropping attack.}
\end{minipage}
\hfill
\begin{minipage}[b]{0.24\linewidth}
  \centering
    \centerline{\includegraphics[width=5cm]{IH/graphs/atq-rot}}
    \centerline{(d) Rotation attack.}
\end{minipage}
\caption{Robustness of $\mathcal{DI}_3$ scheme facing several attacks (50 images
from the BOSS repository)}
\label{fig:robustness-results}
\end{figure*}



Attacks based on geometric transformations are addressed through
rotation attacks: two opposite rotations
of angle $\theta$ are successively applied around the center of the image.
In these geometric transformations, angles range from 2 to 20
degrees.
The effects of such an attack are also presented in
Fig.~\ref{fig:robustness-results}~(d).


From all these experiments, one can first conclude that
the steganographic scheme does not present any obvious drawback and
resists all the attacks:
all the percentage differences are so far less than 50\%.

The comparison with the robustness of the other steganographic schemes presented in this
work will be carried out in a complementary study, and the best use of each
one in several contexts will be discussed.