7th commit

author Stéphane Domas <sdomas@prodigy.iut-bm.univ-fcomte.fr>

Wed, 19 Oct 2011 15:20:06 +0000 (17:20 +0200)

committer Stéphane Domas <sdomas@prodigy.iut-bm.univ-fcomte.fr>

Wed, 19 Oct 2011 15:20:06 +0000 (17:20 +0200)
diff --git a/dmems12.tex b/dmems12.tex

index 3fd1ca1636cf186c473dce04b75c496ee6f81b0d..647111e2ddd028eb5724ffce565500cfa85b5bd1 100644 (file)
--- a/dmems12.tex
+++ b/dmems12.tex
@@ -226,10 +226,12 @@ In fact, this timing is a very hard constraint. Let consider a very
  small programm that initializes twenty million of doubles in memory
  and then does 1000000 cumulated sums on 20 contiguous values
  (experimental profiles have about this size). On an intel Core 2 Duo
-E6650 at 2.33GHz, this program reaches an average of 155Mflops. It
-implies that the phase computation algorithm should not take more than
-$155\times 12.5 = 1937$ floating operations. For integers, it gives
-$3000$ operations. Obviously, some cache effects and optimizations on
+E6650 at 2.33GHz, this program reaches an average of 155Mflops. 
+
+%%Itimplies that the phase computation algorithm should not take more than
+%%$155\times 12.5 = 1937$ floating operations. For integers, it gives $3000$ operations. 
+
+Obviously, some cache effects and optimizations on
  huge amount of computations can drastically increase these
  performances : peak efficiency is about 2.5Gflops for the considered
  CPU. But this is not the case for phase computation that used only few
@@ -304,6 +306,7 @@ simultaneously. When  it is  possible, using  a pipeline is  a good  solution to
  manipulate  new  data  at  each  clock  top  and  using  parallelism  to  handle
  simultaneously several data streams.
  
+%% discuss VHDL, synthesis and bitstream
  \subsection{The board}
  
  The board we use is designed by the Armadeus compagny, under the name
@@ -653,15 +656,30 @@ mapping and routing the design on the Spartan6. By the way,
  extra-latency is generated and there must be idle times between two
  profiles entering into the pipeline.
  
-Before obtaining the least bitstream, the crucial question is : how to
-translate the C code the LSQ into VHDL ?
+%%Before obtaining the least bitstream, the crucial question is : how to
+%%translate the C code the LSQ into VHDL ?
+
  
+%\subsection{VHDL design paradigms}
  
-\subsection{VHDL design paradigms}
+\section{Experimental tests}
  
  \subsection{VHDL implementation}
  
-\section{Experimental results}
+% - writing C code using integers
+% - computing the maximum bit width of each variable as a function of the quantization.
+% - quantization tests: trade-off between precision and FPGA constraints
+% - in parallel: Simulink and hand-written VHDL
+%
+\subsection{Simulation}
+
+% ghdl + gtkwave
+% at best: one phase every 33 cycles, latency of 95 cycles.
+% but place and route is impossible.
+\subsection{Bitstream creation}
+
+% not done yet, but one output is expected every 480ns with a latency of 1120
+
  \label{sec:results}