From 03f2db2f776786ee6e9435d3c0f603b248cdd2ae Mon Sep 17 00:00:00 2001 From: couchot Date: Mon, 15 Sep 2014 09:55:28 +0200 Subject: [PATCH] reprise STC --- complexity.tex | 29 +++++++++++++-------------- emb.ps | 2 +- experiments.tex | 11 +++++----- intro.tex | 28 ++++++++++++-------------- main.tex | 12 +++++------ ourapproach.tex | 53 ++++++++++++++++++------------------------------- rec.ps | 2 +- stc.tex | 36 +++++++++++++++++---------------- 8 files changed, 77 insertions(+), 96 deletions(-) diff --git a/complexity.tex b/complexity.tex index 4da4291..304ab32 100644 --- a/complexity.tex +++ b/complexity.tex @@ -1,43 +1,43 @@ This section aims at justifying the lightweight attribute of our approach. To be more precise, we compare the complexity of our schemes to the -state of the art steganography, namely HUGO~\cite{DBLP:conf/ih/PevnyFB10}. + best available steganographic scheme, namely HUGO~\cite{DBLP:conf/ih/PevnyFB10}. In what follows, we consider a $n \times n$ square image. First of all, HUGO starts with computing the second order SPAM Features. -This steps is in $O(n^2 + 2.343^2)$ due to the calculation +This step is in $O(n^2 + 2\times 343^2)$ due to the calculation of the difference arrays and next of the 686 features (of size 343). Next for each pixel, the distortion measure is calculated by +1/-1 modifying its value and computing again the SPAM features. Pixels are thus selected according to their ability to provide -an image whose SPAM features are close to the original one. -The algorithm is thus computing a distance between each computed feature, -and the original ones -which is at least in $O(343)$ and an overall distance between these -metrics which is in $O(686)$. Computing the distance is thus in +an image whose SPAM features are close to the original ones. +The algorithm thus computes a distance between each feature +and the original ones, +which is at least in $O(343)$, and an overall distance between these +metrics, which is in $O(686)$. 
Computing the distance is thus in $O(2\times 343^2)$ and this modification is thus in $O(2\times 343^2 \times n^2)$. -Ranking these results may be achieved with an insertion sort which is in +Ranking these results may be achieved with an insertion sort, which is in $2.n^2 \ln(n)$. -The overall complexity of the pixel selection is thus -$O(n^2 +2.343^2 + 2\times 343^2 \times n^2 + 2.n^2 \ln(n))$, \textit{i.e} +The overall complexity of the pixel selection is finally +$O(n^2 + 2\times 343^2 + 2\times 343^2 \times n^2 + 2.n^2 \ln(n))$, \textit{i.e.}, $O(2.n^2(343^2 + \ln(n)))$. Our edge selection is based on a Canny Filter. When applied on a -$n \times n$ square image the Noise reduction steps is in $O(5^3 n^2)n$. +$n \times n$ square image, the noise reduction step is in $O(5^3 n^2)$. Next, let $T$ be the size of the canny mask. Computing gradients is in $O(4Tn)$ since derivatives of each direction (vertical or horizontal) are in $O(2Tn)$. Finally, thresholding with hysteresis is in $O(n^2)$. The overall complexity is thus in $O((5^3+4T+1)n^2)$. To summarize, for the embedding map construction, the complexity of Hugo is -dramatically higher than our scheme. +dramatically larger than that of our scheme. We are then left to express the complexity of the STC algorithm. According to~\cite{DBLP:journals/tifs/FillerJF11}, it is in $O(2^h.n)$ where $h$ is the size of the duplicated -matrix. Its complexity is thus negligeable compared with the embedding map +matrix. Its complexity is thus negligible compared with the embedding map construction. @@ -45,8 +45,7 @@ construction. -Thanks to these complexity result, we claim that STABYLO is lightweight. - +Thanks to these complexity results, we claim that STABYLO is lightweight. 
diff --git a/emb.ps b/emb.ps index ff26651..d0bdec5 100644 --- a/emb.ps +++ b/emb.ps @@ -521,7 +521,7 @@ newpath 388 588 moveto stroke 0 0 0 nodecolor 14 /Times-Roman set_font -342 578.4 moveto 38 (Key k) alignedtext +342 578.4 moveto 38 (key k) alignedtext grestore % key->encrypt gsave diff --git a/experiments.tex b/experiments.tex index 3babfb3..8194a53 100644 --- a/experiments.tex +++ b/experiments.tex @@ -1,20 +1,20 @@ -For whole experiments, the whole set of 10,000 images +For all the experiments, the whole set of 10,000 images of the BOSS contest~\cite{Boss10} database is taken. In this set, each cover is a $512\times 512$ grayscale digital image in a RAW format. We restrict experiments to this set of cover images since this paper is more focused on -the methodology than on benchmarking. +the methodology than on benchmarks. We use the matrices $\hat{H}$ generated by the integers given -in table~\ref{table:matrices:H} +in Table~\ref{table:matrices:H} as introduced in~\cite{FillerJF11}, since these ones have experimentally be proven to have the best modification efficiency. For instance if the rate between the size of the message and the size of the cover vector is 1/4, each number in $\{81, 95, 107, 121\}$ is translated into a binary number -and each one consitutes thus a column of $\hat{H}$. +and each one constitutes thus a column of $\hat{H}$. \begin{table} $$ @@ -141,7 +141,7 @@ in STC(6), data are hidden in the last two significant bits. The quality variance between HUGO and STABYLO for these parameters is given in bold font. It is always close to 1\% which confirms the objective presented in the motivations: -providing an efficient steganography approach with a lightweight manner. +providing an efficient steganography approach in a lightweight manner. 
Let us now compare the STABYLO approach with other edge based steganography @@ -215,4 +215,3 @@ Compared to EAILSBMR, we obtain better results when the strategy is However due to its huge number of integration features, it is not lightweight, which justifies in the authors' opinion the consideration of the proposed method. - diff --git a/intro.tex b/intro.tex index 751fae8..167b7dd 100644 --- a/intro.tex +++ b/intro.tex @@ -1,4 +1,4 @@ -This research work takes place in the field of information hiding, considerably developed for the last two decades. The proposed method for +This research work takes place in the field of information hiding, considerably developed these last two decades. The proposed method for steganography considers digital images as covers. It belongs to the well-known large category of spatial least significant bits (LSBs) replacement schemes. @@ -11,15 +11,15 @@ are never decreased (resp. increased), thus such schemes may break the structural symmetry of the host images. And these structural alterations can be detected by -well-designed statistical investigations, leading to well -known steganalysis methods~\cite{DBLP:journals/tsp/DumitrescuWW03,DBLP:conf/mmsec/FridrichGD01,Dumitrescu:2005:LSB:1073170.1073176}. +well-designed statistical investigations, leading to well-known +steganalysis methods~\cite{DBLP:journals/tsp/DumitrescuWW03,DBLP:conf/mmsec/FridrichGD01,Dumitrescu:2005:LSB:1073170.1073176}. Let us recall too that this drawback -can be corrected considering the LSB matching (LSBM) subcategory, in which -the $+1$ or $-1$ is randomly added to the cover pixel LSB value +can be fixed by considering the LSB matching (LSBM) subcategory, in which +the $+1$ or $-1$ is randomly added to the cover pixel's LSB value only if this one does not correspond to the secret bit. %TODO : modifier ceci -By considering well-encrypted hidden messages, the probabilities of increasing or decreasing the alue of pixels are equal. 
Then usual statistical approaches +By considering well-encrypted hidden messages, the probabilities of increasing or decreasing the value of pixels are equal. Then usual statistical approaches cannot be applied here to discover stego-contents in LSBM. The most accurate detectors for this matching are universal steganalysers such as~\cite{LHS08,DBLP:conf/ih/Ker05,FK12}, which classify images according to extracted features from neighboring elements of residual noise. @@ -43,17 +43,17 @@ LSBM approach. % based on our experiments -Instead of (efficiently) modifying LSBs, there is also a need to select pixels whose value +In addition to (efficiently) modifying LSBs, there is also a need to select pixels whose value modification minimizes a distortion function. This distortion may be computed thanks to feature vectors that are embedded for instance in the steganalysers referenced above. The Highly Undetectable steGO (HUGO) method~\cite{DBLP:conf/ih/PevnyFB10} is one of the most efficient instance of such a scheme. It takes into account so-called SPAM features %(whose size is larger than $10^7$) -to avoid overfitting a particular +to avoid over-fitting a particular steganalyser. Thus a distortion measure for each pixel is individually determined as the sum of the differences between the features of the SPAM computed from the cover and from the stego images. -Thanks to this features set, HUGO allows to embed messages that are $7\times$ longer than the former ones with the same level of +Due to this feature set, HUGO allows embedding messages that are $7$ times longer than the former ones with the same level of indetectability as LSB matching. However, this improvement is time consuming, mainly due to the distortion function computation. @@ -78,12 +78,12 @@ EAISLSBMR. This approach selects sharper edge regions with respect to a given embedding rate: the larger the number of bits to be embedded is, the coarser the edge regions are. 
-Then the data hiding algorithm is achieved by applying LSBMR on some of the pixels of these regions. +Then the data hiding algorithm is achie\-ved by applying LSBMR on some of the pixels of these regions. The authors show that their proposed method is more efficient than all the LSB, LSBM, and LSBMR approaches -thanks to extensive experiments. +through extensive experiments. However, it has been shown that the distinguishing error with LSB embedding is lower than the one with some binary embedding~\cite{DBLP:journals/tifs/FillerJF11}. -We thus propose to take advantage of these optimized embeddings, provided they are not too time consuming. +We thus propose to take advantage of this optimized embedding, provided it is not too time consuming. In the latter, an hybrid edge detector is presented followed by an ad hoc embedding. The Edge detection is computed by combining fuzzy logic~\cite{Tyan1993} @@ -93,7 +93,7 @@ is to enlarge the set of modified bits to increase the payload of the data hidin One can notice that all the previously referenced -schemes~\cite{Luo:2010:EAI:1824719.1824720,DBLP:journals/eswa/ChenCL10,DBLP:conf/ih/PevnyFB10} +sche\-mes~\cite{Luo:2010:EAI:1824719.1824720,DBLP:journals/eswa/ChenCL10,DBLP:conf/ih/PevnyFB10} produce stego contents by only considering the payload, not the type of image signal: the higher the payload is, the better the approach is said to be. @@ -132,5 +132,3 @@ Finally, concluding notes and future work are given in Section~\ref{sec:concl}. 
- - diff --git a/main.tex b/main.tex index 249436f..7d0132e 100755 --- a/main.tex +++ b/main.tex @@ -33,7 +33,7 @@ \title{STABYLO: STeganography with -Adaptive, Bbs, and binarY embedding at LOw cost.} +Adaptive, Bbs, and binarY embedding at LOw cost} \author{Jean-Fran\c cois Couchot, Raphael Couturier, and Christophe Guyeux\thanks{Authors in alphabetic order}} @@ -86,7 +86,7 @@ Its main advantage is to be much lighter than the so-called Highly Undetectable steGO (HUGO) scheme, a well-known state of the art steganographic process in the spatial domain. Additionally to this effectiveness, -quite comparable results through noise measures like PSNR-HVS-M, +quite comparable results through noise measures like PSNR-HVS-M and weighted PSNR (wPSNR) are obtained. To achieve the proposed goal, famous experimented components of signal processing, @@ -126,8 +126,8 @@ detection filter, the Blum-Blum-Shub cryptographically secure pseudorandom number generator, together with Syndrome-Trellis Codes for minimizing distortion. After having introduced with details the proposed method, -we have evaluated it through noise measures (namely, the PSNR, PSNR-HVS-M, -BIQI, and weighted PSNR), we have used well-established steganalysers. +we have evaluated it through noise measures (namely, the PSNR, PSNR-HVS-M, +and weighted PSNR), we have used well-established steganalysers. % Of course, other detectors like the fuzzy edge methods % deserve much further attention, which is why we intend @@ -145,11 +145,9 @@ replacement, in terms of security, will be discussed. Furthermore, we plan to investigate information hiding on other models, such as high frequency for JPEG encoding. 
-\bibliographystyle{compj} +\bibliographystyle{spbasic} \bibliography{abbrev,biblioand} \end{document} - - diff --git a/ourapproach.tex b/ourapproach.tex index 2378139..636b8ba 100644 --- a/ourapproach.tex +++ b/ourapproach.tex @@ -3,12 +3,12 @@ four main steps: the data encryption (Sect.~\ref{sub:bbs}), the cover pixel selection (Sect.~\ref{sub:edge}), the adaptive payload considerations (Sect.~\ref{sub:adaptive}), and how the distortion has been minimized (Sect.~\ref{sub:stc}). -The message extraction is then presented (Sect.~\ref{sub:extract}) and a running example ends this section (Sect.~\ref{sub:xpl}). +The message extraction is then presented (Sect.~\ref{sub:extract}) while a running example ends this section (Sect.~\ref{sub:xpl}). The flowcharts given in Fig.~\ref{fig:sch} summarize our steganography scheme denoted by -STABYLO, which stands for STeganography with +STABYLO, which stands for STe\-ga\-no\-gra\-phy with Adaptive, Bbs, binarY embedding at LOw cost. What follows are successively some details of the inner steps and the flows both inside the embedding stage (Fig.~\ref{fig:sch:emb}) @@ -17,7 +17,7 @@ Let us first focus on the data embedding. \begin{figure*}%[t] \begin{center} - \subfloat[Data Embedding.]{ + \subfloat[Data Embedding]{ \begin{minipage}{0.49\textwidth} \begin{center} %\includegraphics[width=5cm]{emb.pdf} @@ -27,7 +27,7 @@ Let us first focus on the data embedding. \label{fig:sch:emb} } - \subfloat[Data Extraction.]{ + \subfloat[Data Extraction]{ \begin{minipage}{0.49\textwidth} \begin{center} %\includegraphics[width=5cm]{rec.pdf} @@ -55,7 +55,7 @@ we implement the Blum-Goldwasser cryptosystem~\cite{Blum:1985:EPP:19478.19501} that is based on the Blum Blum Shub~\cite{DBLP:conf/crypto/ShubBB82} pseudorandom number generator (PRNG) and the XOR binary function. 
-It has been indeed proven~\cite{DBLP:conf/crypto/ShubBB82} that this PRNG +It has been proven~\cite{DBLP:conf/crypto/ShubBB82} that this PRNG has the property of cryptographical security, \textit{i.e.}, for any sequence of $L$ output bits $x_i$, $x_{i+1}$, \ldots, $x_{i+L-1}$, there is no algorithm, whose time complexity is polynomial in $L$, and @@ -88,9 +88,9 @@ how they modify them. Many techniques have been proposed in the literature to detect edges in images (whose noise has been initially reduced). -They can be separated into two categories: first and second order detection +They can be separated in two categories: first and second order detection methods on the one hand, and fuzzy detectors on the other hand~\cite{KF11}. -In first order methods like Sobel, Canny~\cite{Canny:1986:CAE:11274.11275}, \ldots, +In first order methods like Sobel, Canny~\cite{Canny:1986:CAE:11274.11275}, and so on, a first-order derivative (gradient magnitude, etc.) is computed to search for local maxima, whereas in second order ones, zero crossings in a second-order derivative, like the Laplacian computed from the image, are searched in order to find edges. @@ -98,7 +98,7 @@ As far as fuzzy edge methods are concerned, they are obviously based on fuzzy lo Canny filters, on their parts, are an old family of algorithms still remaining a state of the art edge detector. They can be well-approximated by first-order derivatives of Gaussians. As the Canny algorithm is fast, well known, has been studied in depth, and is implementable -on many kinds of architectures like FPGAs, smartphones, desktop machines, and +on many kinds of architectures like FPGAs, smart phones, desktop machines, and GPUs, we have chosen this edge detector for illustrative purpose. %\JFC{il faudrait comparer les complexites des algo fuzy and canny} @@ -113,15 +113,15 @@ If set with the same value $b$, the edge detection returns thus the same set of pixels for both the cover and the stego image. 
In our flowcharts, this is represented by ``edgeDetection(b bits)''. Then only the 2 LSBs of pixels in the set of edges are returned if $b$ is 6, -and the LSB of pixels if $b$ is 7. +and the LSBs of pixels if $b$ is 7. Let $x$ be the sequence of these bits. -The next section presents how our scheme -adapts when the size of $x$ is not sufficient for the message $m$ to embed. +The next section presents how to adapt our scheme + when the size of $x$ is not sufficient for the message $m$ to embed. @@ -130,7 +130,7 @@ adapts when the size of $x$ is not sufficient for the message $m$ to embed. \subsection{Adaptive embedding rate}\label{sub:adaptive} -Two strategies have been developed in our scheme, +Two strategies have been developed in our approach, depending on the embedding rate that is either \emph{adaptive} or \emph{fixed}. In the former the embedding rate depends on the number of edge pixels. The higher it is, the larger the message length that can be inserted is. @@ -138,7 +138,7 @@ Practically, a set of edge pixels is computed according to the Canny algorithm with a high threshold. The message length is thus defined to be less than half of this set cardinality. -If $x$ is then too short for $m$, the message is split into sufficient parts +If $x$ is too short for $m$, the message is split into sufficient parts and a new cover image should be used for the remaining part of the message. @@ -159,29 +159,15 @@ The first one randomly chooses the subset of pixels to modify by applying the BBS PRNG again. This method is further denoted as a \emph{sample}. Once this set is selected, a classical LSB replacement is applied to embed the stego content. -The second method is a direct application of the -STC algorithm~\cite{DBLP:journals/tifs/FillerJF11}. +The second method considers the last significant bits of all the pixels +inside the previous map. It next directly applies the STC +algorithm~\cite{DBLP:journals/tifs/FillerJF11}. 
It is further referred to as \emph{STC} and is detailed in the next section. -% First of all, let us discuss about compexity of edge detetction methods. -% Let then $M$ and $N$ be the dimension of the original image. -% According to~\cite{Hu:2007:HPE:1282866.1282944}, -% even if the fuzzy logic based edge detection methods~\cite{Tyan1993} -% have promising results, its complexity is in $C_3 \times O(M \times N)$ -% whereas the complexity on the Canny method~\cite{Canny:1986:CAE:11274.11275} -% is in $C_1 \times O(M \times N)$ where $C_1 < C_3$. -% \JFC{Verifier ceci...} -% In experiments detailled in this article, the Canny method has been retained -% but the whole approach can be updated to consider -% the fuzzy logic edge detector. - - - - @@ -272,7 +258,7 @@ $\qquad$ In the ghoul-haunted woodland of Weir. The edge detection returns 18,641 and 18,455 pixels when $b$ is respectively 7 and 6. These edges are represented in Figure~\ref{fig:edge}. When $b$ is 7, it remains one bit per pixel to build the cover vector. -in this configuration, this leads to a cover vector of size 18,641 if b is 7 +This configuration leads to a cover vector of size 18,641 if b is 7 and 36,910 if $b$ is 6. \begin{figure}[t] @@ -303,7 +289,7 @@ and 36,910 if $b$ is 6. The STC algorithm is optimized when the rate between message length and -cover vector length is less than 1/2. +cover vector length is lower than 1/2. So, only 9,320 bits are available for embedding in the configuration where $b$ is 7. @@ -311,7 +297,7 @@ When $b$ is 6, we could have considered 18,455 bits for the message. However, first experiments have shown that modifying this number of bits is too easily detectable. So, we choose to modify the same amount of bits (9,320) and keep STC optimizing -which bits to change among the 36,910 bits. +which bits to change among the 36,910 ones. In the two cases, about the third part of the poem is hidden into the cover. 
Results with \emph{adaptive+STC} strategy are presented in @@ -384,4 +370,3 @@ This function allows to emphasize differences between contents. \end{figure} - diff --git a/rec.ps b/rec.ps index 6a35845..92b2b06 100644 --- a/rec.ps +++ b/rec.ps @@ -340,7 +340,7 @@ newpath 54 270 moveto stroke 0 0 0 nodecolor 14 /Times-Roman set_font -8 260.4 moveto 38 (Key k) alignedtext +8 260.4 moveto 38 (key k) alignedtext grestore % decrypt gsave diff --git a/stc.tex b/stc.tex index 26ac732..db98585 100644 --- a/stc.tex +++ b/stc.tex @@ -1,15 +1,14 @@ To make this article self-contained, this section recalls the basis of the Syndrome Treillis Codes (STC). + Let -$x=(x_1,\ldots,x_n)$ be the $n$-bits cover vector of the image $X$, +$x=(x_1,\ldots,x_n)$ be the $n$-bits cover vector issued from an image $X$, $m$ be the message to embed, and $y=(y_1,\ldots,y_n)$ be the $n$-bits stego vector. The usual additive embedding impact of replacing $x$ by $y$ in $X$ is given by a distortion function $D_X(x,y)= \Sigma_{i=1}^n \rho_X(i,x,y)$, where the function $\rho_X$ expresses the cost of replacing $x_i$ by $y_i$ in $X$. -Let us consider that $x$ is fixed: -this is for instance the LSBs of the image edge bits. The objective is thus to find $y$ that minimizes $D_X(x,y)$. Hamming embedding proposes a solution to this problem. @@ -19,10 +18,12 @@ Furthermore this code provides a vector $y$ s.t. $Hy$ is equal to $m$ for a given binary matrix $H$. Let us explain this embedding on a small illustrative example where -$\rho_X(i,x,y)$ is equal to 1, -whereas $m$ and $x$ are respectively a 3 bits column -vector and a 7 bits column vector. -Let then $H$ be the binary Hamming matrix +$m$ and $x$ are respectively a 3 bits column +vector and a 7 bits column vector and where +$\rho_X(i,x,y)$ is equal to 1 for any modified position $i$ +(\textit{i.e.}, $\rho_X(i,x,y) = 0$ if $x_i = y_i$ and $1$ otherwise). 
+ +Let $H$ be the binary Hamming matrix $$ H = \left( \begin{array}{lllllll} @@ -53,17 +54,18 @@ switching the $j-$th component of $x$, that is, $\overline{x}^j = (x_1 , \ldots, \overline{x_j},\ldots, x_n )$. It is not hard to see that if $y$ is $\overline{x}^j$, then $m = Hy$. -It is then possible to embed 3 bits in only 7 LSBs of pixels by modifying +It is then possible to embed 3 bits in 7 LSBs of pixels by modifying at most 1 bit. -In the general case, communicating $n$ message bits in -$2^n-1$ pixels needs $1-1/2^n$ average changes. - +In the general case, communicating a message of $p$ bits in a cover of +$n=2^p-1$ pixels needs $1-1/2^p$ average changes. - -Unfortunately, for any given $H$, finding $y$ that solves $Hy=m$ and +This Hamming embedding is really efficient for very small payloads and is +not well suited when the size of the message is larger, as in real situations. +The matrix $H$ should be changed to deal with higher payloads. +Moreover, for any given $H$, finding $y$ that solves $Hy=m$ and that minimizes $D_X(x,y)$, has an exponential complexity with respect to $n$. The Syndrome-Trellis Codes -presented by Filler \emph{et al.} in~\cite{DBLP:conf/mediaforensics/FillerJF10} +presented by Filler \emph{et al.} in~\cite{FillerJF11} is a practical solution to this complexity. Thanks to this contribution, the solving algorithm has a linear complexity with respect to $n$. @@ -75,10 +77,10 @@ any solution of $m=Hy$ as a path through a trellis. Next, the process of finding $y$ consists in two stages: a forward and a backward part. \begin{enumerate} -\item Forward construction of the trellis that depends on $\hat{H}$, on $x$, on $m$, and on $\rho$. +\item Forward construction of the trellis that depends on $\hat{H}$, on $x$, on $m$, and on $\rho$. This step is linear in $n$. \item Backward determination of $y$ that minimizes $D$, starting with -the complete path having the minimal weight. +the complete path having the minimal weight. 
This corresponds to traversing +a graph and has a complexity which is linear in $n$. \end{enumerate} - -- 2.39.5