+Starting from the LSQ algorithm, we wrote a C program that uses only
+integer values. Quantization is deliberately simple: each
+double-precision value is multiplied by a power of two and only the integer
+part is kept. For example, all values stored in lut$_s$, lut$_c$, $\ldots$ are
+scaled by 1,024 ($2^{10}$). Since LSQ also computes averages, variances, $\ldots$ to
+remove the slope, the implied Euclidean divisions may give
+noticeably inaccurate results. To avoid that, we also scale the pixel intensities
+by a power of two. Furthermore, assuming $nb_s$ is fixed, these
+divisions have a known denominator and can therefore be replaced by
+a multiplication followed by a shift. Finally, all other
+multiplications or divisions by a power of two have been replaced by
+left or right bit shifts. The resulting code thus contains only
+additions, subtractions, multiplications and shifts of signed integers,
+operations that are well suited to FPGAs.
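+The two integer techniques described above can be illustrated with a minimal C
+sketch. It is not taken from our program: the value of \texttt{NB\_S}, the
+16-bit reciprocal precision and the function names are assumptions chosen for
+illustration. Only the scale factor of 1,024 comes from the text.

```c
#include <stdint.h>

#define SCALE_SHIFT 10                 /* scale factor 2^10 = 1,024, as in the text */
#define NB_S        20                 /* hypothetical fixed value of nb_s */
#define RECIP_SHIFT 16                 /* assumed precision of the reciprocal */
/* ceil(2^16 / NB_S): integer constant that stands in for 1/NB_S */
#define RECIP_NB_S  (((1 << RECIP_SHIFT) + NB_S - 1) / NB_S)

/* Quantize a double: multiply by a power of two and keep the integer part
 * (the cast truncates toward zero). */
static int32_t quantize(double x)
{
    return (int32_t)(x * (double)(1 << SCALE_SHIFT));
}

/* Divide by the known denominator NB_S with one multiplication and one
 * right shift. The 64-bit intermediate avoids overflow of the product.
 * Note: right-shifting a negative value is implementation-defined in C
 * (arithmetic shift on common compilers), so this sketch is only
 * exercised on non-negative inputs. */
static int32_t div_nb_s(int32_t sum)
{
    return (int32_t)(((int64_t)sum * RECIP_NB_S) >> RECIP_SHIFT);
}
```

+With these assumed constants, \texttt{div\_nb\_s} matches the exact integer
+quotient only over a limited input range. Widening \texttt{RECIP\_SHIFT}
+extends that range at the cost of a wider multiplier.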
+As mentioned above, hardware constraints strongly influence the VHDL
+implementation. Consequently, we determined the maximum value of each variable as
+a function of the scale factors and the profile size, which gives
+its maximum size in bits. These sizes determine the largest scale factors that
+still allow us to use as few RAMs and DSPs as possible. Currently, our
+implementation uses these maximum sizes, and ongoing work studies the impact of
+quantization on the precision of the results and on design complexity. We compared
+the results of the integer and double-precision versions of LSQ and observed that
+their precision is similar.
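+As an illustration of how a maximum value translates into a size in bits, the
+following C sketch bounds a hypothetical accumulator. The accumulator (a sum of
+$nb_s$ scaled pixel intensities), the 12-bit pixel depth used in the example
+and the function names are assumptions, not the variables of our design.

```c
#include <stdint.h>

/* Number of bits needed to store, in two's complement, any signed value v
 * with |v| <= max_abs: one sign bit plus the bits of the magnitude. */
static unsigned signed_bits(uint64_t max_abs)
{
    unsigned n = 1;                    /* sign bit */
    while (max_abs > 0) {
        n++;
        max_abs >>= 1;
    }
    return n;
}

/* Hypothetical accumulator: the sum of nb_s pixel intensities, each below
 * 2^pixel_bits and scaled by 2^scale_shift. Returns its largest value. */
static uint64_t max_scaled_sum(unsigned nb_s, unsigned pixel_bits,
                               unsigned scale_shift)
{
    uint64_t max_pixel = ((uint64_t)1 << pixel_bits) - 1;
    return (uint64_t)nb_s * (max_pixel << scale_shift);
}
```

+For instance, with 20 samples of 12-bit pixels scaled by $2^4$, the bound is
+$20 \times 4095 \times 16 = 1{,}310{,}400$, which needs 22 bits as a signed value.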
+We then built two versions of the VHDL code: one coded directly by hand
+and the other generated with Matlab using the Simulink HDL Coder
+feature~\cite{HDLCoder}. Although the two approaches are completely different,
+the resulting VHDL codes are quite comparable. Each approach has
+advantages and drawbacks. Roughly speaking, hand coding yields
+cleaner and much better structured code, while Simulink lets us
+produce code faster. In terms of throughput and latency,
+simulations show that the two approaches are close with a slight
+advantage for hand coding. We hope that real experiments will confirm