\documentclass{article}

\title{Answers to the questions asked by the reviewers}

\begin{document}
\maketitle

We would first like to thank the reviewers for analyzing our article and making suggestions to improve it. We did our best to comply with their requirements and hope the article now meets the required standard. The changes are detailed below.

\section*{Reviewer 1}

\begin{verbatim}
First, please define all your acronyms before using them.
\end{verbatim}

Unforgivable, and fixed.

\begin{verbatim}
Second Figure 8 was referred to in page 4 and displyed in page 8.
Please display at least the upper part of this figure in page 4 even
though it is slightly redundant.
\end{verbatim}

We managed to display this figure early in the paper, on page 3, shortly after we first refer to it, on page 2.

\begin{verbatim}
Also, I had to wait till section 6.2 and 6.3 until I understood the
main contribution of the paper. If you refer to this earlier (maybe
at the end of the introduction) you will be able to hock your reader
in a better way.
\end{verbatim}

The abstract and the introduction have been rewritten accordingly.

\section*{Reviewer 2}

\begin{verbatim}
1. There are a few English errors and typos sprinkled throughout the
paper. These need to be corrected. Most of them are spelling errors.
\end{verbatim}

We carefully proofread the whole article and hope no errors remain.

\begin{verbatim}
2. The term "Lengthenable" is not a word.
\end{verbatim}

The same remark was made by reviewer 3. We replaced it, as suggested, with the word \textit{extendable}. Please note, however, that the term \textit{lengthenable} is used, for example, in the titles of some US patents: \textit{Balance type minute lengthenable adjuster} (US patent 4995847 A) and \textit{Lengthenable garment} (US patent 2602163).

\begin{verbatim}
3. In formulae, the use of a dot to represent multiplication is
distracting. It can be interpreted as a dot product or a period. I
would suggest these be removed.
\end{verbatim}

Done.

\begin{verbatim}
4. The second paragraph in the introduction (p. 1) starts with a
reference to human vision, but the paragraph describes various noise
filters that have been accelerated. The sentence is irrelevant.
\end{verbatim}

The introduction has been rewritten. The reference to human vision is now part of the paragraph explaining why no universal filter exists.

\begin{verbatim}
5. Sections 2 and 3 should be collapsed into the introduction.
\end{verbatim}

Done.

\begin{verbatim}
6. The second paragraph on p. 5 is unclear. Where does the formula
and numbers given for the number of segments come from?
\end{verbatim}

The formulas have been detailed and the first sentences of the paragraph have been made more concise.

\section*{Reviewer 3}

\begin{verbatim}
It is also difficult to follow because the introduction doesn't
clearly explain how the isolines and de-noising are connected. This
leads to the confusion that will be evident from my comments below.
An *iso*line would normally be expected to delineate areas of the
*same* intensity, so it's not obvious why a std devn should be
associated with one. This doesn't become clear until well into
section 6!! The authors need to explain (even if only briefly) how
associating a pixel with an isoline removes noise *early* in their
paper. I would suggest moving figure 4(a,b,c) to the introduction
and adding the set of isolines that their process adds to the that
noisy window. Since it's only an 11x11 window, there should be only
a small number of them.
If the diagram becomes too cluttered with neighbouring lines in the
edge region, then something like the isoline through every 2nd pixel
in the centre column should enable someone to discover immediately
why *iso*lines have different intensities in them!
\end{verbatim}

As a preliminary remark, we would like to stress that what we call isolines is not exactly what Matheron calls level lines. We have tried to explain the principle of the filter more clearly and earlier (beginning of section 3). We do not ``add'' any isoline: we only search locally, around each pixel, for the pattern that best matches the shape of the level line in the noiseless image. Isolines are built by combining those patterns and, once an isoline is terminated, the output gray level is the average of all gray level values along that isoline.\\
The level lines exist in the noiseless image model, but we only have access to the noisy images corrupted by Gaussian noise, so a probability density function (pdf) has been `applied' to all pixels, including those belonging to the level line we are trying to estimate.

\begin{verbatim}
There is a key inconsistency in presentation .. there is a statement
that 'most common' images are continuous and continuous few edges.
Such images would contain little information and be of little
interest. However, tests have been conducted on 'real' images with
no shortage of edges - see figure 7b. If the continuity and few
edges condition is important, then the effect of deviations
(textured images with large edge counts) should be discussed. If the
condition is not actually important (as results suggest), then these
references need to be adjusted.
\end{verbatim}

We removed the expression `few edges'. It was actually an ill-advised attempt to discuss the limitations of the method. Instead, we now refer to the definition given by Caselles of a `natural image'.

\begin{verbatim}
Circular poly-isolines are apparently excluded which would seem to
be a key limitation and needs discussion.
\end{verbatim}

We did not mean ``circular'' but ``lines that could roll back onto themselves''. Once again, our language was not precise enough.

\begin{verbatim}
In general, the paper could be a lot better written. The authors
might remove the 'noise' (in the sense of unnecessary words and
phrases) from the text also and make it more precise and concise!
Some examples of this verbal noise are given below. The whole paper
should be checked for similar excesses.
\end{verbatim}

We hope that the modifications we made are satisfactory.

\begin{verbatim}
Similarly, figure 9 contains excess noise in the form of meaningless
digits in numbers presented. By convention, if you write 19.49, you
imply 19.49 ± 0.005 (half the least significant digit). It's certain
that the last 9 means nothing here and you should therefore write
19.5.
\end{verbatim}

That is not exact. Consider a noisy $512\times 512$ pixel image with a PSNR of $19.50$~dB. The corresponding sum of squared errors is then $SE_{19.5}=750032$. If the PSNR is $19.49$~dB, then $SE_{19.49}=751761$. The difference is $1729$, which could correspond, for instance, to 1729 pixels each having its squared error increased by one. It is perfectly measurable and meaningful, even if it is not visible to the human eye (a short numerical check is given a little further below).

\begin{verbatim}
This table would be further improved by reporting the *improvement*
in SNR (ie the difference between noisy and improved images). This
would result in smaller (easier and faster to read) numbers and
highlight differences.
It's not clear to me why, if noise of the same sigma was added to
each image, how the SNR's for the noisy images are not the same too.
\end{verbatim}

We did so. As for the PSNR, it depends on the content of each image, especially the average gray level and the particular noise realization. For example, if one tries to add a noise value of 25 to a reference gray level of 240, the resulting corrupted value will be truncated to 255, but this won't be the case if the reference value is below 230.

\begin{verbatim}
It starts off badly in the abstract, where the authors state that
they 'propose to address the issue ..'. If they just 'propose to
address' something, then this paper should be in a work-in-progess
workshop. I presume what they really mean to say is: 'We describe a
GPU-based filter for image denoising.' Abstracts are supposed to be
concise, so the rest of the sentence is redundant. It is reasonable
to assume that readers understand that modern GPUs are parallel
processing devices! The next sentence could similarly be shortened
and made more appropriate for an abstract to .. We use Matherton's
level set theory which has rarely been implemented because of high
computation cost. Reference numbers should *not* appear in the
abstract, because it will often be read by itself without the
accompanying text and reference list - just use the author(s)' names
as in the suggested improvement. I'm not sure of the significance of
'try to guess' .. but again more precise wording is needed: I don't
think anyone is interested in a guess .. except as a first step in a
refinement. 'What we actually do' should be deleted altogether as
colloquial and excess verbiage. Try 'We initially guess ...' (or
something similarly concise). Next sentence contains some historical
notes which are probably best in the text alone: the abstract would
normally concentrate on results, but it should not be vague - 'the
optimization heuristics' tells me nothing. This is where a *concise*
description of the heuristics should be included. The final sentence
is almost inviting a reader to skip reading the paper altogether
(and your reviewer to reject it!). It would be better if it gave
some actual results achieved by the authors (both for denoising
level *and* time) and compared with state-of-the-art denoising
levels and time from other work. The abstract should also give an
example of denoising performance - perhaps some numbers from Fig 9
(but SNR improvements, not actual SNRs!). An average from fig 9
would also be reasonable.

Some more examples of writing style unacceptable in a journal ..

Denoising has been a much studied research issue since electronic
transmission was first used. The wide range of applications that
involve denoising makes it uneasy to propose a universal filtering
method. Among them, digital image processing is a major field of
interest as the number of digital devices able to take pictures or
make movies is growing fast and shooting is rarely done in optimal
conditions.

to

Denoising has been extensively studied since images have been
transmitted electronically. The wide range of applications requiring
noise removal makes it difficult to find a universal filter. The
fast growth in digital devices has made digital processing become
more important.
\end{verbatim}

We modified our text accordingly and hope it is now more understandable. As for the optimization heuristics, the most interesting ones are described in the sections on isoline segments, PI-LD and PI-PD.
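As announced above, here is a short numerical check of the significant-digits remark on figure 9. The only assumption is that $SE\propto 10^{-\mathrm{PSNR}/10}$, which holds for any of the usual PSNR normalizations:
\[
\frac{SE_{19.49}}{SE_{19.5}}=10^{(19.50-19.49)/10}=10^{0.001}\approx 1.002305,
\]
so the expected gap is $SE_{19.5}\times\bigl(10^{0.001}-1\bigr)\approx 750032\times 0.002305\approx 1729$, in agreement with the values quoted above: the second decimal of the PSNR does carry measurable information.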
\begin{verbatim}
What's the significance of 'shooting is rarely done in optimal
conditions'? It's certainly not linked to digital images alone as it
affects images however they are acquired and stored.
\end{verbatim}

Pictures taken with mobile phones or digital cameras are most often taken in poor lighting conditions and without a stabilizing stand. Such pictures are quite noisy and require denoising.

\begin{verbatim}
higher noise effects -> higher noise levels
imposes high output flow rates .. -> requires high data rates in the
processing algorithms
is subject to high variation ... -> varies significantly from person
to person
\end{verbatim}

Done, with a minor revision: \textit{data transfer rates}.

\begin{verbatim}
Many researchers have successfully speeded up image processing
algorithms with GPUs.
\end{verbatim}

Our spell checker (aspell / TeX Live / American dictionary) says: sped up. We have re-phrased the sentence.

\begin{verbatim}
Don't use reference numbers as substitutes for author's names ..
For example in [11],[7] and [15], authors managed to design quite
fast median filters. Bilateral filtering has also been successfully
proposed in [17].
reads much better as
Fast median filters have been reported[11],[7]; bilateral filtering
was also speeded up[17].
or
McGuire[11] and Chen et al[7] reported ... ; Yang et al sped up a
bilateral filter[17].
Giving the actual rates (in terms, say, of frames per second) would
be even better in putting the author's current work into context!
\end{verbatim}

Right.

\begin{verbatim}
What do you mean by 'even apparently sub-optimal solutions'?
'apparently sub-optimal' implies that they might really be optimal
.. However, from the rest of the sentence, it would seem that you're
referring to the usual trade-off between performance and speed -
lower quality algortithms that run faster .. so say that .. in 5
words instead of 5 lines ..
\end{verbatim}

Right.

\begin{verbatim}
Section 2
I cannot find the 'conditions mentioned in section 5' clearly set
out there!! So I find the claim that 'real life images fulfill the
above conditions' untenable. List the conditions (or assumptions
that your method relies upon) *here*.
such level lines based algorithms as in [6] and [12] -> level lines
algorithms[6,12].
A few years ago, in [3], authors proposed an original method ->
Bertaux et al described a method which ... [3]. Don't put in vague
things like 'a few'!! It doesn't really matter how long ago anyway
.. if someone is curious, they can look up the reference date!!
preserve -> preserving
\end{verbatim}

Done.

\begin{verbatim}
reference images taken from [1] .. but [1] is hardly a complete
reference Are you referring to the Matlab package or Stanford's Lab?
In which case, say .. images from Denoise Lab[1] and give a complete
reference ..
.. have been published. Where?? At least some sample references
required.
\end{verbatim}

The page of Steven Lansel at the Stanford lab, from which the benchmark set of images could be downloaded, is no longer available. We removed the reference and added a footnote in the text.

\begin{verbatim}
Section 4
This section would seem to be partly propaganda for Nvidia. They are
not the only manufacturers of GPUs and this section should at least
recognize this. The important distinction between what's on a 'card'
and on the silicon die on the card is lost .. even though the chip
is probably the only major component of the C2070 card there will be
a large difference between memory access times for on-chip and
off-chip memory.
Nvidia themselves are rather vague about this, but that's no excuse
for propagating this vagueness into (hopefully) carefully written
and objective papers. At least the clock rate of the card should be
noted here. Some other parameters like memory bandwidth are
important too, but I can at least place the device used in
historical perspective with the clock rate. I shouldn't have to look
this up from Nvidia literature.
\end{verbatim}

It is not propaganda: we simply do not have any GPU from another manufacturer at our disposal. We added clock rates and details about the different types of memory, as requested.

\begin{verbatim}
What does 're-serialization' mean? If you mean the inclusion of
necessary barriers, then say so, rather than invent a new term (or
follow someone else's unnecessary invention!) for a really simple
concept.
\end{verbatim}

Please note that we are not talking about synchronization barriers. When parallelizing an existing sequential process on a GPU, if the parallel code causes divergent branches or shared memory bank conflicts, some thread instructions simply cannot be executed in parallel: the warps run the branches sequentially or replay the instructions. This backward step is what led us to speak of \textit{re-serialization}. We have nevertheless changed it to the single word \textit{serialization}.

\begin{verbatim}
requires -> requires that ..
The 'A' of CUDA is 'architecture' .. adding 'model' is not necessary
.. it's already a virtual architecture.
There is no way to know how .. -> The order in which threads are
scheduled is not defined. ('how' is wrong .. threads will certainly
be scheduled .. the key point is you don't know the order!)
The point about coalesced memory accesses is badly phrased and
probably not correct. Again Nvidia is not very explicit, but almost
certainly coalesced accesses must lie in the same n mod 2^7 address
block .. not an arbitrary 2^7 byte range.
\end{verbatim}

Modified.

\begin{verbatim}
The point about shared memory is just wrong. Threads within a 'warp'
must access the same shared memory block. The use of the term
'parallel thread' here is confusing and probably should be replaced
by something more explicit. Of course, data must be distributed
carefully among shared memory blocks - probably this is what you are
trying to say.
\end{verbatim}

Our first statement was actually incomplete. We have reproduced the corresponding sentence from the CUDA programming guide instead.

\begin{verbatim}
Last para is a general statement that applies to *any* parallel
processing architecture: the authors seem to imply that they
discovered this for GPUs!
\end{verbatim}

Removed.

\begin{verbatim}
constraining -> difficult
non-suited -> badly designed
probably -> may
\end{verbatim}

Adjusted.

\begin{verbatim}
Section 5
IID .. Each image is corrupted by ONE noise distribution .. so
Identically distributed is meaningless, there's only one. However,
if you meant *images* corrupted by noise .. then IID has meaning.
Small point, but this is a journal paper .. so you should get it
right!
\end{verbatim}

$I$ and $\widehat{I}$ denote the families of reference and corrupted images, respectively.\\
Adjusted.

\begin{verbatim}
'As introduced above' .. where ?? In the previous para, you've just
set down some notation.
'most common images are continuous and contain few edges' ?? This is
an extremely contentious claim!! If you want to persist with this,
then you should give some examples of images which satisfy this
criterion!
There are many images used for testing image processing algorithms
which (deliberately) would fail this criterion.
\end{verbatim}

As stated before, we now focus on `natural images' as defined by Caselles.\\
Adjusted.

\begin{verbatim}
Section 5.1
Here you introduce 'fixed length' isolines. You don't justify this
or explain its significance - except the fixed length obviously
helps comparisons (as in previous para) .. Does an image with large
bands of the same colour still have fixed length (ie short!)
isolines? Note that a 'continous' image with few edges actually
contains little information so it's hard to see how interesting
images satisfy this criterion.
\end{verbatim}

The length of the isolines depends on the noise power: lower noise power leads to shorter isolines.

\begin{verbatim}
You use the term 'isoline part' without definition .. is this a
fixed length segment of a longer isoline?
\end{verbatim}

An isoline part was meant to be a non-terminated isoline.\\
Modified.

\begin{verbatim}
Z is a *set* of grey levels. P is the likelihood of what ?
\end{verbatim}

First point: modified. As for the likelihood, it is the one defined by the expression that follows.\\
We added the definition of $Z$.

\begin{verbatim}
developped as -> re-written
Last sentence .. The best isoline is the one which maximizes (5).
Section 5.2
Lengthenable is not an English word .. use 'extendable'
larger -> longer
lengthening -> extension
512x512 -> $512 \times 512$
could be seen as possible valid .. You're defining a notation here
.. don't use words implying vagueness! 'candidate' is a good word if
you're going to make a choice among possibilities at some stage.
hypothesis -> hypotheses
'share the same mean value'? You mean that the two segments S^n and
S^p must have the same mean?
In place of 'First' 'Second' write
If S^n+p is an isoline then ...
Alternatively, if S^n and S^p are ...
There is no 'third', so this is clearer ..
to validate lengthening -> to extend ...
depends -> depends on
\end{verbatim}

We agree and have adjusted our text. As for \textit{lengthenable}, as stated before, it appears in some US patent titles; we nevertheless followed the suggestion. When a segment $A$ is supposed to extend another segment $B$, and both are considered as a single isoline part, they ``share'' the same mean value inside the $(A+B)$ polyline, by hypothesis (a generic sketch of the corresponding test is given at the end of this set of responses).

\begin{verbatim}
You should define what you mean by an isoline carefully. 'iso'
implies same, yet you describe the grey levels along S^n (an isoline
part) as having a distribution. This whole section needs careful
re-write. A diagram explaining how a pixel and an isoline are
related might help. Figure 1 does not seem to help much .. it just
shows an isoline which appears to have the *same* intensity along
its whole length.
\end{verbatim}

Inside a noisy image, the level lines of the reference scene are corrupted by the noise. Modifications have been made; we hope this is clearer now.

\begin{verbatim}
Section 6
combinating -> combining
Figures should appear (roughly) in the order to which they are
referred! You refer to fig 8 *before* fig 2 has appeared!
\end{verbatim}

Adjusted.

\begin{verbatim}
To fit the GPU-specific ... Significance? Did you mean to say that
you chose the number of directions, D, to be 2^k ? This is hardly
GPU-specific, it would be done for *any* hardware implementation!
\end{verbatim}

What is actually GPU-specific is the value 32, the number of threads per warp.\\
We re-phrased.
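To make the shared-mean hypothesis mentioned above more concrete, here is the standard form of such a test under Gaussian noise with known variance $\sigma^2$ (a generic sketch written for this letter; the exact statistic and the thresholds we use are given in the paper). For candidate segments $S^n$ and $S^p$ of lengths $n$ and $p$, with empirical means $\widehat{\mu}_n$ and $\widehat{\mu}_p$, the generalized likelihood ratio reduces to
\[
T=\frac{1}{\sigma^{2}}\,\frac{np}{n+p}\,\bigl(\widehat{\mu}_{n}-\widehat{\mu}_{p}\bigr)^{2},
\]
and the extension $S^{n+p}$ is accepted when $T$ remains below a chosen threshold.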
\begin{verbatim}
because of the necessary reduction stage for which GPU efficiency is
not maximum -> because reduction is not efficient in a GPU
\end{verbatim}

Reduction can actually be efficient on a GPU: summing the ten million values of a single vector, for example, is efficient. Here, however, we are dealing with thousands of small and irregular sums, which are not efficient on a GPU. That is what led us to the above expression; we have tried to improve it.

\begin{verbatim}
If you were inspired by Bertaux et al's work, it would be nice to
use their name(s) in the text, instead of reducing them to a number!
This also saves someone who is familiar with their work having to go
to the reference list to confirm whose work you're talking about!
\end{verbatim}

Bertaux is actually mentioned as one of the authors. The remark is justified, though we honestly never tried to reduce him to a number.\\
Modified.

\begin{verbatim}
Section 6.1
When considering S^n *under construction*, how do you have
C_x(Z(S^n)) .. having been obtained in the *previous* extension
step?
\end{verbatim}

Because those sums are computed and accumulated at each extension stage.

\begin{verbatim}
wether -> whether
beeing -> being
Please use a spelling checker!
deviation -> orientation change (deviation can imply many things ..
orientation change is more specific and appropriate here)
\end{verbatim}

Done.

\begin{verbatim}
It's not at all clear why circular lines are not allowed? Surely
this is a common situation?
\end{verbatim}

This was imprecise English on our part. As stated before, we meant `lines rolling back onto themselves'. It does not concern isolines with a circular shape of large radius.

\begin{verbatim}
You *use* a test (GLRT) not perform it.
For each allowed pattern, GLRT is performed in order to decide if
the corresponding segment could likely be added to the end of the
poly-isoline Sn. If none is validated by GLRT, the poly-isoline Sn
is stopped.
to
For each pattern, we use the GLRT to decide if the segment could
extend the poly-isoline Sn. If none satisfies the test, Sn is
terminated.
\end{verbatim}

From our point of view, a test is performed rather than used. We modified the other point.

\begin{verbatim}
In order to avoid critical situations where the first selected
segment would not share the primary direction ....
Presumably you are trying to say that for the first segment of a
line, you have to consider all directions so you have D lines which
could be extended?
To ensure isotropy ....
'isotropy' is not the right word here .. nor is 'shares' .. I
suspect you mean to say that each of the D lines has the direction
of the pattern p_{l,d} ??
\end{verbatim}

Done, with adjustments.

\begin{verbatim}
Fig 2 caption
indexes -> indices
It implies -> This implies
\end{verbatim}

Our spell checker says: indexes.

\begin{verbatim}
standard deviation -> standard deviation of grey levels for (you use
deviation in another sense also .. so you need to be specific here)
\end{verbatim}

Changed to avoid any ambiguity.

\begin{verbatim}
Section 6.2
'a bit weak' is colloquial .. use 'inferior to'
Remove 'In order to be performed'
Branches stall parts of the pipeline or leave GPU ALUs idle .. be
specific instead of the vague 'do not fit'.
candidate -> candidates
The notation [0;D] for the set [0..D] is also unusual and probably
better changed (everywhere) to a more common one.
\end{verbatim}

Adjusted.

\begin{verbatim}
Section 6.3
Discussion associated with edge detection and fig 5 is confusing and
needs to be re-written. Fig 5 does not seem to help at all.
\end{verbatim}

We are sorry, but we do not see where the confusion comes from.

\begin{verbatim}
Section 7
There is little mention of the use of shared memory - only for P_l,
not for the images themselves, which are much larger. This is
usually a critical factor in achieving good speed-up with a GPU, so
one assumes that higher speed-ups than reported here are achievable
with a little work to allocate tasks in such a way as to allow
efficient use of shared memory. The thread / pixel model used here
is easy to program but probably less than optimal. Blocking images
to use the shared memory effectively is relatively easy but needs a
little more programming effort. In this case, it would appear that
there are a sufficient number of accessses to each pixel to make
this worthwhile.
\end{verbatim}

As we have shown in our previous work, shared memory is not \textit{the} solution for obtaining speed on a GPU. An optimized sequence of texture fetches is most often faster as soon as the halos of neighboring pixels overlap (a minimal sketch of this technique is appended at the end of this letter). We applied it, for example, to design the fastest median filter known to date (accepted paper, to be available soon), able to output more than 1.85 billion pixels per second \textit{without} using shared memory, unlike most other existing implementations. While designing this filter, we tested the shared memory solution but gave it up.

\begin{verbatim}
Results should ideally separate out time to transfer images between
the CPU and GPU and GPU computation times, since for applications
(most presumably) that want to process the de-noised image in the
GPU, only the GPU time is of interest.
\end{verbatim}

Adjusted.

\begin{verbatim}
There is a disappointing lack of data for values of l other than 5.
For real-time applications, the performance-time trade-off is always
important. Choice of l affects both time and performance so some
data would be valuable.
\end{verbatim}

All images of the set have the same size ($512\times 512$) and the same level of relevant detail. These conditions lead to an optimal value of $l=5$. Moreover, the noise power $\sigma^2$ leads to a maximum length of $n=25$. This is an interesting point because, that way, $l$ and $n$ are no longer parameters to be adjusted; the same holds for both GLRT threshold values.

\begin{verbatim}
'around' 3.5ms is vague and unacceptable in a scientific paper.
Either specify the error explicitly or simply use the half least
significant digit convention (assumed by default). Adding 'around'
suggests vagueness or unexplained error!
\end{verbatim}

As the runtime varies with the content of the image, we used the word ``around''. We agree that it can be confusing and have removed all occurrences of the word.

\begin{verbatim}
Conclusion
It's generally understood now that, to obtain speed-up, you need to
consider the architecture. The disparaging comment in the first
sentence and the implication that the authors discovered the need to
link solution and architecture is unrealistic and should be removed.
\end{verbatim}

Modified.
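As promised in our response on shared memory above, here is a minimal illustrative CUDA sketch of the texture-fetch technique. It is a generic example written for this letter, not the code of our filter: the kernel name and the $5\times 5$ averaging are purely illustrative, and it uses the texture-object API of recent CUDA versions.

\begin{verbatim}
// Minimal illustrative sketch: each thread reads its 5x5 neighborhood
// through the texture cache. Overlapping halos between neighboring
// threads are served by the cache, with no shared memory staging.
__global__ void average5x5(cudaTextureObject_t tex, float *out,
                           int width, int height)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= width || y >= height) return;
    float sum = 0.0f;
    for (int dy = -2; dy <= 2; ++dy)
        for (int dx = -2; dx <= 2; ++dx)
            // +0.5f addresses the texel center (unnormalized coordinates)
            sum += tex2D<float>(tex, x + dx + 0.5f, y + dy + 0.5f);
    out[y * width + x] = sum / 25.0f;
}
\end{verbatim}

\end{document}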