1 %\section{Application of Steganography for Anonymity through the Internet~\cite{bcfg12a:ip}}
3 \section{The $\mathcal{DI}_3$ steganographic process~\cite{bcfg12a:ip,bcfg12b:ip}}
5 In~\cite{bcfg12a:ip,bcfg12b:ip}, a new steganographic algorithm named $\mathcal{DI}_3$ is presented. It is
6 inspired from $\mathcal{CIW}_1$ and $\mathcal{CIS}_2$ respectively published
\r
7 in~\cite{fgb11:ip} and~\cite{gfb10:ip}, and recalled previously in this chapter.
8 Compared to the first one, $\mathcal{DI}_3$ is a steganographic scheme,
9 not just a watermarking technique. That is, in our understanding, it can embed more than one bit. Unlike
\r
10 $\mathcal{CIS}_2$, which requires embedding keys with three strategies, only one sequence
\r
11 is required for $\mathcal{DI}_3$, so it is easier to implement. Indeed
12 $\mathcal{DI}_3$ is a faster instance of $\mathcal{CIS}_2$, as
13 there is no message mixing in it.
14 $\mathcal{DI}_3$ is well-defined mathematically and its security is evaluated in~\cite{bcfg12a:ip},
15 whereas~\cite{bcfg12b:ip} provides algorithms and investigates its robustness, comparing it to
16 some well-known watermarking schemes, namely the
17 YASS~\cite{DBLP:conf/ih/SolankiSM07},
\r
18 nsF5~\cite{DBLP:conf/mmsec/FridrichPK07}, MMx~\cite{DBLP:conf/ih/KimDR06}, and HUGO~\cite{DBLP:conf/ih/PevnyFB10} algorithms
19 detailed in the Appendix~\ref{AppendixIH}.
22 \subsection{Mathematical definitions and notations}
25 New notations and terminologies must be introduced another time in order to be able to define
26 mathematically the $\mathcal{DI}_3$ steganographic process. They are provided thereafter.
30 The \emph{support of a finite sequence} $S$ of $n$ terms is the finite set
31 $\mathsf{S}(S)=\left\{ S^k, k < n \right\}$ containing all the distinct
32 values of $S$. Its cardinality is s.t. $\#\mathsf{S}(S) \leqslant n$.
35 \begin{Def}\label{def:injective-sequence}
36 A finite sequence $S \in \mathds{S}_\mathsf{N}$ of $n$ terms is \emph{injective}
37 if $n = \#\mathsf{S}(S)$.
39 if $N = \#\mathsf{S}(S)$. Finally, it is bijective if and only if it is both
40 injective and onto, so $n=N = \#\mathsf{S}(S)$.
43 ``$S$ is injective'' reflects the fact that all the $n$ terms of
44 the sequence $S$ are distinct, while ``$S$ is onto'' means
45 that all the values of the set $\llbracket 1;\mathsf{N}\rrbracket$
46 are reached at least once.
51 \subsection{The new $\mathcal{DI}_3$ process}
52 \label{sec:new-process-di-1}
54 In this section, the new algorithm introduced in~\cite{bcfg12a:ip} and studied
55 in~\cite{bcfg12b:ip} is recalled.
56 Let $\mathsf{P} \in \mathds{N}^\ast$ be the width, in term of bits, of the
57 message to embed into the cover media.
58 $\lambda \in \mathds{N}^{\ast}$ is the number of iterations to realize,
59 which is s.t. $\lambda > \mathsf{P}$.
60 $x^0 \in \mathbb{B}^\mathsf{N}$ is for the $\mathsf{N}$ LSCs of a given
61 cover media $C$ supposed to be uniformly distributed.
62 $m \in \mathbb{B}^\mathsf{P}$ is the message to hide into $x^0$.
63 Finally, $S \in \mathbb{S}_\mathsf{P}$ is a strategy such that the
64 finite sequence $\left\{S^k, k \in \llbracket \lambda - \mathsf{P}
65 +1;\lambda \rrbracket\right\}$ is injective.
69 The width $\mathsf{P}$ of the message to hide into the LSCs of the cover media
70 $x^0$ has to be far smaller than the number of LSCs.
75 The proposed information hiding scheme is defined by:
76 \begin{Def}[$\mathcal{DI}_3$ Data hiding scheme]\
77 \label{def:process-DI-1}
79 \mathds{N}^{\ast} \times \llbracket 0;\mathsf{N-1}\rrbracket \times \llbracket
80 0;\mathsf{P-1}\rrbracket$:
86 x_i^{n-1} & \text{ if }S^n\neq i \\
87 m_{S^n} & \text{ if }S^n=i.
95 The stego-content is the Boolean vector $y = x^\lambda \in
96 \mathbb{B}^\mathsf{N}$, which will replace the former LSCs,
97 that is, LSCs of the cover media are replaced by the
101 \subsection{Security study}
103 A security study of the $\mathcal{DI}_3$ steganographic process has been realized
104 in~\cite{bcfg12a:ip}. Conclusion of this study is summarized thereafter.
107 $\mathcal{DI}_3$ is stego-secure.
110 Thie proof of this proposition, provided in~\cite{bcfg12a:ip}, holds for
111 the following restrictive hypotheses:
114 \item[Distribution of LSCs.]
115 We have supposed that $x^0 \sim
116 \mathcal{U}\left(\mathbb{B}^\mathsf{N}\right)$
117 to prove the stego-security of the data hiding process
119 This hypothesis of the uniform distribution
120 of the least significant coefficients is obviously
121 the most restrictive one, but it can
122 be obtained at least partially in two possible manners.
123 Either a channel that appears to be random (for instance, when
124 applying a chi squared test, or for test batteries recalled in a
125 previous chapter) can be found in the media. Or a systematic process
126 can be applied on the images to obtain this uniformity, as follows.
127 Before embedding the hidden message, all the original LSCs must be replaced by randomly
128 generated ones, hoping so that such cover media will be considered to be noisy by any given attacker.
129 Let us remark that, in
130 the field of data anonymity for privacy on the
131 Internet, we are in the
132 ``watermark-only attack'' framework.
133 As it has been recalled in
134 Table~\ref{table:attack-classification},
136 the attacker has only access to stego-contents, having so no knowledge
137 of the original media (\emph{i.e.}, before introducing the message in the LSCs random channel).
138 These considerations, which have been
139 deepened in later publications, will be discussed more largely at
140 the end of this chapter.
141 \item[Distribution of the messages $m$.]
142 In order to prove the stego-security of the data hiding
144 $\mathcal{DI}_3$, we have supposed that $m \sim
145 \mathcal{U}\left(\mathbb{B}^\mathsf{P}\right)$. This hypothesis
146 of the uniform repartition of the message to hide is not
147 really restrictive. Indeed, to encrypt the message before its
148 embedding into the LSCs of cover media, which is usually required
149 for obvious security reasons, is sufficient to achieve this goal. To say
150 it different, in order to be in the conditions of applications of the process
151 $\mathcal{DI}_3$, the hidden message must be encrypted.
152 \item[Distribution of the strategies $S$.]
153 To prove the stego-security of the data hiding
155 $\mathcal{DI}_3$, we have finally supposed that $S \sim
156 \mathcal{U}\left(\mathbb{S}_\mathsf{P}\right)$. This hypothesis is not
157 restrictive too, as any cryptographically secure
158 pseudorandom generator (PRNG)
159 satisfies this property. With such PRNGs, it is impossible in
160 polynomial time, to make the distinction between random numbers and numbers provided by
161 these generators. For instance, \emph{Blum Blum Shub
162 (BBS)}~\cite{junod1999cryptographic}, \emph{Blum Goldwasser (BG)}~\cite{prng-Blum-goldwasser},
163 or \emph{ISAAC}~\cite{Jenkins1996}, recalled in the chapter focusing
164 on PRNGs, are convenient here.
167 \subsection{Evaluation against steganalyzers}\label{section:steganalysis}
170 The steganographic scheme detailed in~\cite{bcfg12a:ip}
172 state of the art steganographic approaches, namely
173 YASS~\cite{DBLP:conf/ih/SolankiSM07},
174 HUGO~\cite{DBLP:conf/ih/PevnyFB10}, and
175 nsF5~\cite{DBLP:conf/mmsec/FridrichPK07} detailed in the Appendix~\ref{AppendixIH}.
176 This study, realized in~\cite{bcfg12a:ip}, is summarized thereafter.
178 The steganalysis is based on the BOSS image
179 database~\cite{DBLP:conf/ih/BasFP11},
180 which consists in a set of 10 000 512x512 greyscale images.
181 We have randomly selected 50 of them to compute the cover set.
182 Since YASS and nsF5 are dedicated to JPEG
183 support, all these images have been firstly translated into JPEG format
184 thanks to the \verb+mogrify+ command line.
185 To allow the comparison between steganographic schemes, the relative payload
186 is always set with 0.1 bit per pixel. Under that constrain,
187 the embedded message $m$ is a sequence of 26214 randomly generated bits.
188 This step has led to distinguish four sets of stego contents, one for
189 each steganographic approach.
191 We have next used in~\cite{bcfg12a:ip} the steganalysis tool developed by the HugoBreakers
192 team~\cite{holmes11,ensemble11} based on AI classifier and which won
193 the BOSS competition~\cite{DBLP:conf/ih/BasFP11}.
194 Table~\ref{table:holme} summarizes these steganalysis results expressed
195 as the error probabilities of the steganalyser, as they are given in~\cite{bcfg12a:ip}. The errors are the mean
196 of the false alarms and of the missed detections. An error that is closed
197 to 0.5 signifies that deciding whether an image contains
198 a stego content is a random choice for the steganalyser.
199 Conversely, a tiny error denotes that the steganalyser can easily classify
200 stego content and non stego content.
204 \begin{tabular}{|l|l|l|l|l|}
206 Steganographic Tool & $\mathcal{DI}_1$ & YASS & HUGO & NsF5 \\
208 Error Probability & 0.4133& 0.0067& 0.495 & 0.47 \\
212 \caption{Steganalysis results of HugoBreakers steganalyser applied on
213 steganographic scheme}\label{table:holme}
217 The best result is obtained by HUGO, which is closed to the perfect
218 steganographic approach to the considered steganalyser, since the error is about 0.5. However, even if
219 the approach detailed in~\cite{bcfg12a:ip} has not any optimization, these
220 first experiments show promising results.
221 We finally notice that the HugoBreakers's steganalyser should outperform
222 these results on larger image databases, \textit{e.g.}, when applied on the
223 whole BOSS image database.