quelques modifs

[canny.git] / ourapproach.tex
diff --git a/ourapproach.tex b/ourapproach.tex

index f832695dc830ca094bc87b1cbfd1aa9c50ddfece..36fefc31a4cba7ed397041b6eee2884ba28482bb 100644 (file)
--- a/ourapproach.tex
+++ b/ourapproach.tex
@@ -1,158 +1,399 @@
-The flowcharts given in Fig.~\ref{fig:sch} summarize our steganography scheme denoted as to
-STABYLO for STeganography with cAnny, Bbs, binarY embedding at LOw cost.
-What follows successively details all the inner steps and flow inside 
-the embedding stage (Fig.\ref{fig:sch:emb}) 
-and inside the extraction one (Fig.~\ref{fig:sch:ext}).
+This section first presents the embedding scheme through its 
+four main steps: the data encryption (Sect.~\ref{sub:bbs}),
+the cover pixel selection (Sect.~\ref{sub:edge}),
+the adaptive payload considerations (Sect.~\ref{sub:adaptive}),
+and how the distortion has been minimized (Sect.~\ref{sub:stc}).
+The message extraction is then presented  (Sect.~\ref{sub:extract}) while a running example ends this section.
+
  
  
+The flowcharts given in Fig.~\ref{fig:sch}
+summarize our steganography scheme denoted by
+STABYLO, which stands for STe\-ga\-no\-gra\-phy with 
+Adaptive, Bbs, binarY embedding at LOw cost.
+What follows are successively some details of the inner steps and the flows both inside 
+ the embedding stage (Fig.~\ref{fig:sch:emb}) 
+and inside the extraction one (Fig.~\ref{fig:sch:ext}).
+Let us first focus on the data embedding. 
  
  
-\begin{figure*}[t]
+\begin{figure*}%[t]
    \begin{center}
    \begin{center}
-    \subfloat[Data Embedding.]{
-      \begin{minipage}{0.49\textwidth}
+    \subfloat[Data Embedding]{
+      \begin{minipage}{0.4\textwidth}
          \begin{center}
          \begin{center}
-          \includegraphics[width=5cm]{emb.pdf}
-          %\includegraphics[width=5cm]{emb.ps}
+            %\includegraphics[scale=0.45]{emb}
+            \includegraphics[scale=0.4]{emb}
          \end{center}
        \end{minipage}
        \label{fig:sch:emb}
          \end{center}
        \end{minipage}
        \label{fig:sch:emb}
-    }%\hfill
-    \subfloat[Data Extraction.]{
+    } 
+\hfill
+    \subfloat[Data Extraction]{
        \begin{minipage}{0.49\textwidth}
          \begin{center}
        \begin{minipage}{0.49\textwidth}
          \begin{center}
-          \includegraphics[width=5cm]{rec.pdf}
-          %\includegraphics[width=5cm]{rec.ps}
+            \includegraphics[scale=0.4]{dec}
          \end{center}
        \end{minipage}
        \label{fig:sch:ext}
      }%\hfill
    \end{center}
          \end{center}
        \end{minipage}
        \label{fig:sch:ext}
      }%\hfill
    \end{center}
-  \caption{The STABYLO Scheme.}
+  \caption{The STABYLO scheme}
    \label{fig:sch}
  \end{figure*}
  
  
  
  
    \label{fig:sch}
  \end{figure*}
  
  
  
  
-\subsection{Data Embedding} 
-This section describes the main three steps of the STABYLO data embedding
-scheme. 
  
  
  
  
  
  
-\subsubsection{Edge Based Image Steganography}
  
  
+\subsection{Security considerations}\label{sub:bbs}
+To provide a self-contained article without any bias, we shor\-tly 
+present the selected encryption process.
+Among the methods of  message encryption/decryption 
+(see~\cite{DBLP:journals/ejisec/FontaineG07} for a survey)
+we implement the asymmetric 
+Blum-Goldwasser cryptosystem~\cite{Blum:1985:EPP:19478.19501}
+that is based on the Blum Blum Shub~\cite{DBLP:conf/crypto/ShubBB82} 
+pseudorandom number generator (PRNG) and the 
+XOR binary function.
+The main justification of this choice 
+is that it has been proven~\cite{DBLP:conf/crypto/ShubBB82} that this PRNG 
+has the property of cryptographical security, \textit{i.e.}, 
+for any sequence of $L$ output bits $x_i$, $x_{i+1}$, \ldots, $x_{i+L-1}$,
+there is no algorithm, whose time complexity is polynomial  in $L$, and 
+which allows to find $x_{i-1}$ or $x_{i+L}$ with a probability greater
+than $1/2$.
+Equivalent formulations of such a property can
+be found. They all lead to the fact that,
+even if the encrypted message is extracted, 
+it is impossible to retrieve the original one in 
+polynomial time.   
  
  
-Edge Based Image Steganography schemes 
-already studied~\cite{Luo:2010:EAI:1824719.1824720,DBLP:journals/eswa/ChenCL10,DBLP:conf/ih/PevnyFB10} differ 
-how they select edge pixels, and  
-how they modify these ones.
+Starting thus with a key $k$ and the message \textit{mess} to hide, 
+this step computes a message $m$, which is the encrypted version  of \textit{mess}.
  
  
-Image Quality: Edge Image Steganography
-\JFC{Raphael, les fuzzy edge detection sont souvent utilisés. 
-  il faudrait comparer les approches en terme de nombre de bits retournés,
-  en terme de complexité. } \cite{KF11}
-\RC{Ben, à voir car on peut choisir le nombre de pixel avec canny. Supposons que les fuzzy edge soient retourne un peu plus de points, on sera probablement plus détectable...  Finalement on devrait surement vendre notre truc en : on a choisi cet algo car il est performant en vitesse/qualité. Mais on peut aussi en utilisé d'autres :-)}
  
  
-There are  many techniques to  detect edges in  images. Main methods  are filter
-edge detection methods such as Sobel  or Canny filter, low order methods such as
-first order  and second order  methods, these methods  are based on  gradient or
-Laplace  operators and  fuzzy edge  methods which  are based  on fuzzy  logic to
-highlight edges.
+\subsection{Edge-based image steganography}\label{sub:edge}
  
  
-Of course, all the algorithms have  advantages and drawbacks which depend on the
-motivation  to  highlight  edges.   Unfortunately  unless testing  most  of  the
-algorithms, which  would require many  times, it is  quite difficult to  have an
-accurate idea on what would produce  such algorithm compared to another. That is
-why we have  chosen canny algorithm which is well  known, fast and implementable
-on many  kinds of architecture, such  as FPGA, smartphone,  desktop machines and
-GPU. And of course, we do not pretend that this is the best solution.
  
  
+The edge-based image
+steganography schemes 
+already presented \cite{Luo:2010:EAI:1824719.1824720,DBLP:journals/eswa/ChenCL10} differ 
+in how carefully they select edge pixels, and  
+how they modify them.
  
  
-First of all, let us discuss about compexity of edge detetction methods.
-Let then $M$ and $N$ be the dimension of the original image. 
-According to~\cite{Hu:2007:HPE:1282866.1282944},
-even if the fuzzy logic based edge detection methods~\cite{Tyan1993} 
-have promising results, its complexity is in $C_3 \times O(M \times N)$
-whereas the complexity on the Canny method~\cite{Canny:1986:CAE:11274.11275} 
-is in $C_1 \times O(M \times N)$ where  $C_1 < C_3$.
-\JFC{Verifier ceci...}
-In experiments detailled in this article, the canny method has been retained 
-but the whole approach can be updated to consider 
-the fuzzy logic edge detector.   
+%Image Quality: Edge Image Steganography
+%\JFC{Raphael, les fuzzy edge detection sont souvent utilisés. 
+%  il faudrait comparer les approches en terme de nombre de bits retournés,
+%  en terme de complexité. } \cite{KF11}
+%\RC{Ben, à voir car on peut choisir le nombre de pixel avec Canny. Supposons que les fuzzy edge soient retourne un peu plus de points, on sera probablement plus détectable...  Finalement on devrait surement vendre notre truc en : on a choisi cet algo car il est performant en vitesse/qualité. Mais on peut aussi en utilisé d'autres :-)}
  
  
-Next, following~\cite{Luo:2010:EAI:1824719.1824720}, our scheme automatically
-modifies canny parameters to get a sufficiently large set of edge bits: this 
-one is practically enlarged untill its size is at least twice as many larger 
-than the size of embedded message.
+Many techniques have been proposed in the literature to  detect 
+edges in  images (whose noise has been initially reduced). 
+They can be separated in two categories: first and second order detection
+methods on the one hand, and fuzzy detectors on the other  hand~\cite{KF11}.
+In first order methods like Sobel, Canny~\cite{Canny:1986:CAE:11274.11275}, and so on, 
+a first-order derivative (gradient magnitude, etc.) is computed 
+to search for local maxima, whereas in second order ones, zero crossings in a second-order derivative, like the Laplacian computed from the image,
+are searched in order to find edges.
+As far as fuzzy edge methods are concerned, they are obviously based on fuzzy logic to highlight edges.
  
  
-Edge Based Image Steganography schemes 
-already studied~\cite{Luo:2010:EAI:1824719.1824720,DBLP:journals/eswa/ChenCL10,DBLP:conf/ih/PevnyFB10} differ 
-how they select edge pixels, and  
-how they modify these ones.
+Canny filters, on their parts, are an old family of algorithms still remaining a state of the art edge detector. They can be well-approximated by first-order derivatives of Gaussians.
+As the Canny algorithm is fast, well known, has been studied in depth, and is implementable
+on many  kinds of architectures like FPGAs, smart phones,  desktop machines, and
+GPUs, we have chosen this edge detector for illustrative purpose.
  
  
-First of all, let us discuss about compexity of edge detetction methods.
-Let then $M$ and $N$ be the dimension of the original image. 
-According to~\cite{Hu:2007:HPE:1282866.1282944},
-even if the fuzzy logic based edge detection methods~\cite{Tyan1993} 
-have promising results, its complexity is in $C_3 \times O(M \times N)$
-whereas the complexity on the Canny method~\cite{Canny:1986:CAE:11274.11275} 
-is in $C_1 \times O(M \times N)$ where  $C_1 < C_3$.
-\JFC{Verifier ceci...}
-In experiments detailled in this article, the canny method has been retained 
-but the whole approach can be updated to consider 
-the fuzzy logic edge detector.   
  
  
-Next, following~\cite{Luo:2010:EAI:1824719.1824720}, our scheme automatically
-modifies canny parameters to get a sufficiently large set of edge bits: this 
-one is practically enlarged untill its size is at least twice as many larger 
-than the size of embedded message.
  
  
  
  
-\subsubsection{Security Considerations}
-Among methods of message encryption/decryption 
-(see~\cite{DBLP:journals/ejisec/FontaineG07} for a survey)
-we implement the Blum-Goldwasser cryptosystem~\cite{Blum:1985:EPP:19478.19501}
-which is based on the Blum Blum Shub~\cite{DBLP:conf/crypto/ShubBB82} Pseudo Random Number Generator (PRNG) 
-for security reasons.
-It has been indeed proven~\cite{DBLP:conf/crypto/ShubBB82} that this PRNG 
-has the cryptographically security property, \textit{i.e.}, 
-for any sequence $L$ of output bits $x_i$, $x_{i+1}$, \ldots, $x_{i+L-1}$,
-there is no algorithm, whose time complexity is polynomial  in $L$, and 
-which allows to find $x_{i-1}$ and $x_{i+L}$ with a probability greater
-than $1/2$.
-Thus, even if the encrypted message would be extracted, 
-it would thus be not possible to retrieve the original one in a 
-polynomial time.   
+This edge detection is applied on a filtered version of the image given 
+as input.
+More precisely, only $b$ most significant bits are concerned by this step, 
+where the parameter $b$ is practically set with $6$ or $7$. 
+Notice that only the 2 LSBs of pixels in the set of edges
+are returned if $b$ is 6, and the LSB of pixels if $b$ is 7.
+If set with the same value $b$, the edge detection returns thus the same 
+set of pixels for both the cover and the stego image.   
+Moreover, to provide edge gradient value, 
+the Canny algorithm computes derivatives  
+in the two directions with respect to a mask of size $T$. 
+The higher $T$ is, the coarse the approach is. Practically, 
+$T$ is set with $3$, $5$, or $7$.
+In our flowcharts, this step is represented by ``Edge Detection(b, T, X)''.
+
+
+Let $x$ be the sequence of these bits. 
+The next  section presents how to adapt our scheme 
+with respect to the size
+of the message $m$ to embed and the size of the cover $x$.
+
+
+
+ 
  
  
  
  
-\subsubsection{Security Considerations}
-Among methods of message encryption/decryption 
-(see~\cite{DBLP:journals/ejisec/FontaineG07} for a survey)
-we implement the Blum-Goldwasser cryptosystem~\cite{Blum:1985:EPP:19478.19501}
-which is based on the Blum Blum Shub~\cite{DBLP:conf/crypto/ShubBB82} Pseudo Random Number Generator (PRNG) 
-for security reasons.
-It has been indeed proven~\cite{DBLP:conf/crypto/ShubBB82} that this PRNG 
-has the cryptographically security property, \textit{i.e.}, 
-for any sequence $L$ of output bits $x_i$, $x_{i+1}$, \ldots, $x_{i+L-1}$,
-there is no algorithm, whose time complexity is polynomial  in $L$, and 
-which allows to find $x_{i-1}$ and $x_{i+L}$ with a probability greater
-than $1/2$.
-Thus, even if the encrypted message would be extracted, 
-it would thus be not possible to retrieve the original one in a 
-polynomial time.   
  
  
  
  
+\subsection{Adaptive embedding rate}\label{sub:adaptive}
+Two strategies have been developed in our approach, 
+depending on the embedding rate that is either \emph{Adaptive} or \emph{Fixed}.
+In the former the embedding rate depends on the number of edge pixels.
+The higher it is, the larger the message length that can be inserted is.
+Practically, a set of edge pixels is computed according to the 
+Canny algorithm with parameters $b=7$ and $T=3$.
+The message length is thus defined to be lesser than 
+half of this set cardinality.
+If $x$ is too short for $m$, the message is split into sufficient parts
+and a new cover image should be used for the remaining part of the message. 
  
  
+In the latter, the embedding rate is defined as a percentage between the 
+number of modified pixels and the length of the bit message.
+This is the classical approach adopted in steganography.
+Practically, the Canny algorithm generates  
+a set of edge pixels related to increasing values of $T$ and 
+until its cardinality
+is sufficient. Even in this situation, our scheme adapts
+its algorithm to meet all the user's requirements. 
  
  
-\subsubsection{Minimizing Distortion with Syndrome-Treillis Codes} 
+
+Once the map of possibly modified pixels is computed, 
+two methods may further be applied to extract bits that 
+are really changed. 
+The first one randomly chooses the subset of pixels to modify by 
+applying the BBS PRNG again. This method is further denoted  as a \emph{sample}.
+Once this set is selected, a classical LSB replacement is applied to embed the 
+stego content.
+The second method considers the last significant bits of all the pixels 
+inside the previous map. It next directly applies the STC 
+algorithm~\cite{DBLP:journals/tifs/FillerJF11}. 
+It  is further referred to as \emph{STC} and is detailed in the next section.
+
+
+
+
+
+
+
+
+\subsection{Minimizing distortion with Syndrome-Trellis Codes}\label{sub:stc}
  \input{stc}
  
  
  \input{stc}
  
  
-\subsection{Data Extraction}
-Message extraction summarized in Fig.~\ref{fig:sch:ext} follows data embedding
+
+% Edge Based Image Steganography schemes 
+% already studied~\cite{Luo:2010:EAI:1824719.1824720,DBLP:journals/eswa/ChenCL10,DBLP:conf/ih/PevnyFB10} differ 
+% how they select edge pixels, and  
+% how they modify these ones.
+
+% First of all, let us discuss about compexity of edge detetction methods.
+% Let then $M$ and $N$ be the dimension of the original image. 
+% According to~\cite{Hu:2007:HPE:1282866.1282944},
+% even if the fuzzy logic based edge detection methods~\cite{Tyan1993} 
+% have promising results, its complexity is in $C_3 \times O(M \times N)$
+% whereas the complexity on the Canny method~\cite{Canny:1986:CAE:11274.11275} 
+% is in $C_1 \times O(M \times N)$ where  $C_1 < C_3$.
+% \JFC{Verifier ceci...}
+% In experiments detailled in this article, the Canny method has been retained 
+% but the whole approach can be updated to consider 
+% the fuzzy logic edge detector.   
+
+For a given set of parameters, 
+the Canny algorithm returns a numerical value and
+states whether a given pixel is an edge or not.
+In this article, in the Adaptive strategy 
+we consider that all the edge pixels that 
+have been selected by this algorithm have the same
+distortion cost, \textit{i.e.}, $\rho_X$ is always 1 for these bits. 
+In the Fixed strategy, since pixels that are detected to be edge
+with small values of $T$ (e.g., when $T=3$) 
+are more accurate than these with higher values of $T$,
+we give to STC the following distortion map of the corresponding bits
+$$  
+\rho_X= \left\{ 
+\begin{array}{l}
+1 \textrm{ if an edge for $T=3$,} \\
+10 \textrm{ if an edge for $T=5$,} \\
+100 \textrm{ if an edge for $T=7$.}
+\end{array}
+\right.
+$$
+
+
+
+
+\subsection{Data extraction}\label{sub:extract}
+The message extraction summarized in Fig.~\ref{fig:sch:ext} 
+follows the data embedding approach 
  since there exists a reverse function for all its steps.
  since there exists a reverse function for all its steps.
-First of all, the same edge detection is applied to get set,
-which is  sufficiently large with respect to the message size given as a key.  
-Then the STC reverse algorithm is applied to retrieve the encrypted message.
+
+More precisely,  let $b$ be the most significant bits and 
+$T$ be the size of the Canny mask, both be given as a key.
+Thus, the same edge detection is applied on a stego content $Y$ to 
+produce the sequence $y$ of LSBs. 
+If the STC approach has been selected in embedding, the STC reverse
+algorithm is directly executed to retrieve the encrypted message. 
+This inverse function takes the $\hat{H}$ matrix as a parameter.
+Otherwise, \textit{i.e.}, if the \emph{sample} strategy is retained,
+the same random bit selection than in the embedding step 
+is executed with the same seed, given as a key.
  Finally, the Blum-Goldwasser decryption function is executed and the original
  message is extracted.
  Finally, the Blum-Goldwasser decryption function is executed and the original
  message is extracted.
+
+
+\subsection{Running example}\label{sub:xpl}
+In this example, the cover image is  Lena, 
+which is a $512\times512$  image with 256 grayscale levels.
+The message is the poem Ulalume (E. A. Poe), which is constituted by 104 lines, 667
+words, and 3,754 characters, \textit{i.e.},  30,032 bits.
+Lena and the first verses are given in Fig.~\ref{fig:lena}.
+
+\begin{figure}
+\begin{center}
+\begin{minipage}{0.49\linewidth}
+\begin{center}
+\includegraphics[scale=0.20]{lena512}
+\end{center}
+\end{minipage}
+\begin{minipage}{0.49\linewidth}
+\begin{flushleft}
+\begin{scriptsize}
+The skies they were ashen and sober;\linebreak
+$\qquad$ The leaves they were crisped and sere—\linebreak
+$\qquad$ The leaves they were withering and sere;\linebreak
+It was night in the lonesome October\linebreak
+$\qquad$ Of my most immemorial year;\linebreak
+It was hard by the dim lake of Auber,\linebreak
+$\qquad$ In the misty mid region of Weir—\linebreak
+It was down by the dank tarn of Auber,\linebreak
+$\qquad$ In the ghoul-haunted woodland of Weir.
+\end{scriptsize}
+\end{flushleft}
+\end{minipage}
+\end{center}
+\caption{Cover and message examples} \label{fig:lena}
+\end{figure}
+
+The edge detection returns 18,641 and 18,455 pixels when $b$ is
+respectively 7 and 6 and $T=3$.
+These edges are represented in Figure~\ref{fig:edge}.
+When $b$ is 7, it remains one bit per pixel to build the cover vector.
+This configuration leads to a cover vector of size  18,641 if b is 7 
+and 36,910 if $b$ is 6.  
+
+\begin{figure}[t]
+  \begin{center}
+    \subfloat[$b$ is 7.]{
+      \begin{minipage}{0.49\linewidth}
+        \begin{center}
+          %\includegraphics[width=5cm]{emb.pdf}
+          \includegraphics[scale=0.20]{edge7}
+        \end{center}
+      \end{minipage}
+      %\label{fig:sch:emb}
+    }%\hfill
+    \subfloat[$b$ is 6.]{
+      \begin{minipage}{0.49\linewidth}
+        \begin{center}
+          %\includegraphics[width=5cm]{rec.pdf}
+          \includegraphics[scale=0.20]{edge6}
+        \end{center}
+      \end{minipage}
+      %\label{fig:sch:ext}
+    }%\hfill
+  \end{center}
+  \caption{Edge detection wrt $b$ with $T=3$}
+  \label{fig:edge}
+\end{figure}
+
+
+
+The STC algorithm is optimized when the rate between message length and 
+cover vector length is lower than 1/2. 
+So, only 9,320 bits  are available for embedding 
+in the  configuration where $b$ is 7.
+
+When $b$ is 6, we could have considered 18,455 bits for the message.
+However, first experiments have shown that modifying this number of bits is too 
+easily detectable. 
+So, we choose to modify the same amount of bits (9,320) and keep STC optimizing
+which bits to change among  the 36,910 ones.
+
+In the two cases, about the third part of the poem is hidden into the cover. 
+Results with {Adaptive} and {STC} strategies are presented in 
+Fig.~\ref{fig:lenastego}.
+
+\begin{figure}[t]
+  \begin{center}
+    \subfloat[$b$ is 7.]{
+      \begin{minipage}{0.49\linewidth}
+        \begin{center}
+          %\includegraphics[width=5cm]{emb.pdf}
+          \includegraphics[scale=0.20]{lena7}
+        \end{center}
+      \end{minipage}
+      %\label{fig:sch:emb}
+    }%\hfill
+    \subfloat[$b$ is 6.]{
+      \begin{minipage}{0.49\linewidth}
+        \begin{center}
+          %\includegraphics[width=5cm]{rec.pdf}
+          \includegraphics[scale=0.20]{lena6}
+        \end{center}
+      \end{minipage}
+      %\label{fig:sch:ext}
+    }%\hfill
+  \end{center}
+  \caption{Stego images wrt $b$}
+  \label{fig:lenastego}
+\end{figure}
+
+
+Finally, differences between the original cover and the stego images  
+are presented in Fig.~\ref{fig:lenadiff}. For each pair of pixel $X_{ij}$ and  $Y_{ij}$ ($X$ and $Y$ being the cover and the stego content respectively), 
+the pixel value $V_{ij}$ of the difference is defined with the following map
+$$
+V_{ij}= \left\{
+\begin{array}{rcl}
+0 & \textrm{if} &  X_{ij} = Y_{ij} \\
+75 & \textrm{if} &  \vert X_{ij} - Y_{ij} \vert = 1 \\
+150 & \textrm{if} &  \vert X_{ij} - Y_{ij} \vert = 2 \\
+225 & \textrm{if} &  \vert X_{ij} - Y_{ij} \vert = 3 
+\end{array}
+\right.
+$$
+This function allows to emphasize differences between contents.
+Notice that since $b$ is 7 in Fig.~\ref{fig:diff7}, the embedding is binary 
+and this image only contains 0 and 75 values.
+Similarly, if $b$ is 6 as in Fig.~\ref{fig:diff6}, the embedding is ternary 
+and the image contains all the values in $\{0,75,150,225\}$.
+
+
+
+\begin{figure}[t]
+  \begin{center}
+    \subfloat[$b$ is 7.]{
+      \begin{minipage}{0.49\linewidth}
+        \begin{center}
+          %\includegraphics[width=5cm]{emb.pdf}
+          \includegraphics[scale=0.20]{diff7}
+        \end{center}
+      \end{minipage}
+      \label{fig:diff7}
+    }%\hfill
+    \subfloat[$b$ is 6.]{
+      \begin{minipage}{0.49\linewidth}
+        \begin{center}
+          %\includegraphics[width=5cm]{rec.pdf}
+          \includegraphics[scale=0.20]{diff6}
+        \end{center}
+      \end{minipage}
+      \label{fig:diff6}
+    }%\hfill
+  \end{center}
+  \caption{Differences  with Lena's cover  wrt $b$}
+  \label{fig:lenadiff}
+\end{figure}
+
+
+