
[dmems12.git] / dmems12.tex
diff --git a/dmems12.tex b/dmems12.tex

index c5ee6cf6f77386663b9f5a6e2b1737f43f834c72..01b56c5f61cebd2ce4f0e735076992728213d69e 100644 (file)
--- a/dmems12.tex
+++ b/dmems12.tex
@@ -63,15 +63,16 @@
  
  \begin{abstract}
  
-Atomic force  microscope (AFM) provides  high resolution images of  surfaces. We
-focus  our attention  on an  interferometry method  to estimate  the cantilevers
-deflection.   This method  was based  on the  spline method  to  interpolate the
-deflection and  the computations were performed  on a PC with  LabView.  In this
-paper, we propose a  new method based on the least square  method and we present
-the implementation that we developped on a FPGA.  Our method can be pipelined on
-a FPGA in order to manipulate image profiles very quickly.  Simulations and real
-tests we have performed showed us that this implementation is very efficient and
-should allow us to control of a cantilevers array in real time.
+  An atomic force microscope (AFM) provides high-resolution images of
+  surfaces. We focus our attention on an interferometry method to
+  estimate the cantilever deflection.  The initial method was based
+  on splines to determine the phase of the interference fringes, and
+  thus the deflection. Computations were performed on a PC with LabView.
+  In this paper, we propose a new approach based on the least squares
+  method and its implementation, which we developed on an FPGA using
+  the pipelining technique. Simulations and real tests showed that
+  this implementation is very efficient and should allow us to control
+  a cantilever array in real time.
  
  \end{abstract}
  
@@ -697,23 +698,15 @@ will include real experiments in the final version of this paper.
  
  \subsection{VHDL implementation}
  
  
-
-
-% - write C code using integers
-% - compute the maximum bit width of each variable as a function of the quantization.
-% - quantization tests: balance between precision and FPGA constraints
-% - in parallel: Simulink and hand-written VHDL
-
-
  From the LSQ algorithm, we have written a C program that uses only
  integer values. We use a very simple quantization by multiplying
  double precision values by a power of two, keeping the integer
  part. For example, all values stored in lut$_s$, lut$_c$, $\ldots$ are
-scaled by 1024.  Since LSQ also computes average, variance, ... to
+scaled by 1024. Since LSQ also computes average, variance, ... to
  remove the slope, the results of the implied Euclidean divisions may be
  quite inaccurate. To avoid that, we also scale the pixel intensities
  by a power of two. Furthermore, assuming $nb_s$ is fixed, these
-divisions have a knonw denominator. Thus, they can be replaced by
+divisions have a known denominator. Thus, they can be replaced by
  their multiplication/shift counterpart. Finally, all other
  multiplications or divisions by a power of two have been replaced by
  left or right bit shifts. By the way, the code only contains
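The integer-only scheme described in this hunk can be illustrated with a minimal sketch. It assumes Python rather than the authors' C program, which is not shown in the diff. Only the scale factor 1024 comes from the text; the helper names (`quantize`, `reciprocal`, `div_by_const`), the 20-bit shift, the value chosen for `nb_s`, the pixel scaling by $2^4$ and the sample intensities are all hypothetical.

```python
SCALE_BITS = 10  # 2**10 = 1024, the scale factor quoted in the text
SHIFT = 20       # hypothetical precision of the precomputed reciprocal


def quantize(value: float, bits: int = SCALE_BITS) -> int:
    """Multiply by 2**bits and keep the integer part."""
    return int(value * (1 << bits))


def reciprocal(denominator: int, shift: int = SHIFT) -> int:
    """Precompute ceil(2**shift / denominator) for a denominator known in advance."""
    return -(-(1 << shift) // denominator)


def div_by_const(numerator: int, recip: int, shift: int = SHIFT) -> int:
    """Approximate numerator // denominator with one multiply and one right shift."""
    return (numerator * recip) >> shift


nb_s = 20                                  # hypothetical number of samples
recip_nb_s = reciprocal(nb_s)              # computed once, ahead of time
pixels = [12, 40, 97, 130, 255] * 4        # made-up 8-bit intensities
scaled_sum = sum(p << 4 for p in pixels)   # intensities scaled by 2**4
mean_scaled = div_by_const(scaled_sum, recip_nb_s)
```

With these assumed values the multiply/shift result agrees with the Euclidean division `scaled_sum // nb_s`. In general the agreement only holds for non-negative numerators that are small enough relative to the chosen shift, so the shift has to be sized from the maximum bit width of each variable.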
@@ -744,24 +737,40 @@ that.
  
  \subsection{Simulation}
  
  
-Currently, we have only simulated our VHDL codes with GHDL and GTKWave (two free
-tools with linux).  Both approaches led to correct results.  At the beginning of
-our simulations, our  pipiline could compute a new phase each  33 cycles and the
-length of the  pipeline was equal to  95 cycles.  When we tried  to generate the
-corresponding bitsream  with ISE environment  we had many problems  because many
-stages required  more than the  10$n$s required by  the clock frequency.   So we
-needed to decompose  some part of the  pipeline in order to add  some cycles and
-simplify some parts between a clock top.
-% ghdl + gtkwave
-% at best: one phase every 33 cycles, latency of 95 cycles.
-% but placement/routing impossible.
+Before experimental tests on the board, we simulated our two VHDL
+codes with GHDL and GTKWave (two free tools available on Linux). For
+that, we built a testbench based on profiles taken from experiments and
+compared the results with the values given by the SPL algorithm. Both
+versions led to correct results.
+
+Our first code was highly optimized: the pipeline could compute a
+new phase every 33 cycles and its latency was equal to 95 cycles. Since
+the Spartan6 is clocked at 100\,MHz (10\,ns per cycle), estimating the
+deflection of 100 cantilevers would take about $(95 + 200\times 33)
+\times 10$\,ns $= 66.95\,\mu$s, i.e. nearly 15000 estimations per second.
+
  \subsection{Bitstream creation}
  
-Currently both  approaches provide synthesable  bitstreams with ISE.   We expect
-that the  pipeline will  have a latency  of 112  cycles, i.e. 1.12$\mu$s  and it
-could accept new profiles of pixel each 48 cycles, i.e. 480$n$s.
+In order to test our code on the SP Vision board, the design was
+extended with a component that keeps profiles in RAM, flushes them into
+the phase computation component and stores its output in another
+RAM. We also added a Wishbone interface: a component that can ``drive''
+signals to communicate between the i.MX and other components. It is mainly
+used to start flushing profiles and to retrieve the computed phases from RAM.
+
+Unfortunately, the first designs could not be placed and routed with ISE
+on the Spartan6 with a 100\,MHz clock. The main problems came from
+routing values from RAMs to DSPs and obtaining a result in under 10\,ns.
+We therefore needed to decompose some parts of the pipeline, which adds
+some cycles. For example, some delays have been introduced between
+the RAM outputs and the DSPs. Finally, we obtained a bitstream that has a
+latency of 112 cycles and computes a new phase every 40 cycles. For
+100 cantilevers, it takes $(112 + 200\times 40)\times 10$\,ns $=
+81.12\,\mu$s to compute their deflection.
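Both timing estimates in this hunk follow the same simple pipeline model: total cycles equal the latency plus the number of profiles times the interval between two results, multiplied by the clock period. The sketch below, in Python, restates that arithmetic. The function names and parameters are hypothetical, and the figure of 200 profiles for 100 cantilevers is taken as-is from the formulas in the text.

```python
def pipeline_time_us(latency_cycles: int, interval_cycles: int,
                     n_profiles: int, clock_mhz: int = 100) -> float:
    """Time in microseconds to push n_profiles through the pipeline."""
    total_cycles = latency_cycles + n_profiles * interval_cycles
    # At clock_mhz MHz there are clock_mhz cycles per microsecond.
    return total_cycles / clock_mhz


def estimations_per_second(time_us: float) -> float:
    """Number of complete array estimations that fit in one second."""
    return 1e6 / time_us


simulated = pipeline_time_us(95, 33, 200)    # first, highly optimized code
placed = pipeline_time_us(112, 40, 200)      # final bitstream
```

Under this model `simulated` is 66.95 and `placed` is 81.12 (both in microseconds), matching the values quoted in the text, and `estimations_per_second(simulated)` is a little under 15000.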
+
+This bitstream has been successfully tested on the board.
+
  
  
-% not done yet, but we expect one output every 480ns with a latency of 1120
  
  \label{sec:results}
  
  
@@ -769,13 +778,15 @@ could accept new profiles of pixel each 48 cycles, i.e. 480$n$s.
  
  
  \section{Conclusion and perspectives}
  
  
-In  this paper  we  have presented  a  new method  to  estimate the  cantilevers
-deflection in  an AFM.  This  method is based  on least square methods.  We have
-studied  the  quantization  of this  algorithm  and  have  implemented it  on  a
-FPGA. Our method gives comparable  results compared to the initial version based
-on splines.   Our solution has been be  implemented with a  pipeline technique.
-Consequently, it enables  to handle a new profile  image very quickly. Currently
-we have performed simulations and real tests on a Spartan6 FPGA.
+In this paper we have presented a new method to estimate the
+cantilever deflection in an AFM.  This method is based on the least
+squares method.  We have used quantization to produce an algorithm
+based exclusively on integer values, which is well suited to an FPGA
+implementation. The precision of the results is similar to that of the
+initial version based on splines.  Our solution has been implemented
+with a pipeline technique.  Consequently, it can process a new
+image profile very quickly. So far, we have performed simulations
+and real tests on a Spartan6 FPGA.
  
  In future  work, we want to couple  our algorithm with a  high speed camera
  and we plan to control the whole AFM system.
  