Commit c27d271058517c5e6ec91558cb9d78154a236b3e

Authored by jfriedt
1 parent 56f7c40c96
Exists in master

relecture (proofreading)

Showing 1 changed file with 37 additions and 31 deletions

ifcs2018_journal.tex
... ... @@ -174,7 +174,7 @@
174 174 that all solutions found by the solver are synthesized and executed on hardware at the end
175 175 of the analysis.
176 176  
177   -In this demonstration , we focus on only two operations: filtering and shifting the number of
  177 +In this demonstration, we focus on only two operations: filtering and shifting the number of
178 178 bits needed to represent the data along the processing chain.
179 179 We have chosen these basic operations because shifting and filtering have already been studied
180 180 in the literature \cite{lim_1996, lim_1988, young_1992, smith_1998} providing a framework for
... ... @@ -298,7 +298,8 @@
298 298 Our criterion to compute the filter rejection considers
299 299 % r2.8 et r2.2 r2.3
300 300 the maximum magnitude within the stopband, from which the {\color{red}sum of the absolute values
301   -within the passband is subtracted to avoid filters with excessive ripples}. With this
  301 +within the passband is subtracted to avoid filters with excessive ripples, normalized to the
  302 +bin width to remain consistent with the passband criterion (dBc/Hz units in all cases)}. With this
302 303 criterion, we meet the expected rejection capability of low pass filters as shown in figure~\ref{fig:custom_criterion}.
303 304  
304 305 % \begin{figure}
... ... @@ -355,9 +356,10 @@
355 356 \end{figure}
356 357  
357 358 % r2.6
358   -Finally in our case, we consider that the input signal are fully known. So the
359   -resolution of the data stream are fixed and still the same for all experiments
360   -in this paper.
  359 +{\color{red}
  360 +Finally, in our case, we consider that the input signal is fully known. The
  361 +resolution of the input data stream is fixed and remains the same for all experiments
  362 +in this paper.}
361 363  
362 364 Based on this analysis, we address the estimate of resource consumption (called
363 365 % r2.11
364 366  
... ... @@ -407,13 +409,15 @@
407 409  
408 410 {\color{red}
409 411 This model is non-linear since we multiply some variable with another variable
410   -and it is even non-quadratic, as $F$ does not have a known
  412 +and it is even non-quadratic, as the cost function $F$ does not have a known
411 413 linear or quadratic expression. To linearize this problem, we introduce $p$ FIR configurations.
412   -This variable must be defined by the user, it represent the number of different
413   -set of coefficients generated (for memory, we use \texttt{firls} and \texttt{fir1}
414   -functions from GNU Octave). So $C_{ij}$ and $\pi_{ij}^C$ become constant and
415   -we defined $1 \leq j \leq p$ and the function $F$ can be estimate for each configurations
416   -thanks our rejection criterion. We also defined binary
  414 +This variable $p$ is defined by the user, and represents the number of different
  415 +sets of coefficients generated (recall that we use the \texttt{firls} and \texttt{fir1}
  416 +functions from GNU Octave) based on the targeted filter characteristics and implementation
  417 +assumptions (estimated number of bits defining the coefficients). Hence, $C_{ij}$ and
  418 +$\pi_{ij}^C$ become constants and
  419 +we define $1 \leq j \leq p$ so that the function $F$ can be estimated (as a look-up table)
  420 +for each configuration thanks to the rejection criterion. We also define the binary
417 421 variable $\delta_{ij}$ that has value 1 if stage~$i$ is in configuration~$j$
418 422 and 0 otherwise. The new equations are as follows:
419 423 }
... ... @@ -430,15 +434,15 @@
430 434 Equation~\ref{eq:config} states that at most a single configuration is chosen for each stage.
431 435  
432 436 {\color{red}
433   -However the problem still quadratic since in the constraint~\ref{eq:areadef2} we multiply
434   -$\delta_{ij}$ and $\pi_i^-$. But like $\delta_{ij}$ is a binary variable we can
435   -linearise this multiplication if we can bound $\pi_i^-$. As $\pi_i^-$ is the data size
436   -we define $0 < \pi_i^- \leq 128$ which is the maximal data size that we can process.
437   -}
438   -Moreover the Gurobi
439   -(\url{www.gurobi.com}) optimization software is used to solve this quadratic
440   -model, and since Gurobi is able to linearize, the model is left as is. This model
441   -has $O(np)$ variables and $O(n)$ constraints.
  437 +However, the problem remains quadratic at this stage since, in constraint~\ref{eq:areadef2},
  438 +we multiply
  439 +$\delta_{ij}$ and $\pi_i^-$. Since $\delta_{ij}$ is a binary variable, we can
  440 +linearize this multiplication if we can bound $\pi_i^-$. As $\pi_i^-$ is the data size,
  441 +we define $0 < \pi_i^- \leq 128$, where the upper bound is the maximum data size, estimated
  442 +from hardware characteristics.
  443 +The Gurobi (\url{www.gurobi.com}) optimization software used to solve this quadratic
  444 +model is able to perform this linearization, so the model is provided as is. This model
  445 +has $O(np)$ variables and $O(n)$ constraints.}
442 446  
443 447 % This model is non-linear and even non-quadratic, as $F$ does not have a known
444 448 % linear or quadratic expression. We introduce $p$ FIR configurations
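
% Note on the linearization discussed in the hunk above (illustrative sketch, not part of the commit).
% The product of the binary variable $\delta_{ij}$ and the bounded variable $\pi_i^-$ can be
% replaced by a standard big-$M$ construction. The auxiliary variable $z_{ij}$ and the constant
% $M$ are hypothetical names introduced here for illustration; $M = 128$ is the upper bound on
% $\pi_i^-$ stated in the text. This is one common textbook formulation and is not necessarily
% the one applied internally by the solver or intended by the authors.
\begin{align*}
  z_{ij} &\leq M\,\delta_{ij}, \\
  z_{ij} &\leq \pi_i^-, \\
  z_{ij} &\geq \pi_i^- - M\,(1 - \delta_{ij}), \\
  z_{ij} &\geq 0.
\end{align*}
% With $\delta_{ij} = 1$ the second and third constraints force $z_{ij} = \pi_i^-$; with
% $\delta_{ij} = 0$ the first and fourth force $z_{ij} = 0$. Hence $z_{ij} = \delta_{ij}\,\pi_i^-$
% in both cases, and every constraint is linear.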
... ... @@ -1047,17 +1051,19 @@
1047 1051 previous one.
1048 1052  
1049 1053 {\color{red} % r1.4
1050   -To conclude we have compared our monolithic filters with the FIR Compiler form
1051   -Xilinx. For each experimentation we use the same coefficient set and we compare the
1052   -resources consumption. The table~\ref{tbl:xilinx_resources} exhibits the results.
1053   -The FIR Compiler never use BRAM while our filter use one block. This difference
1054   -can be explain be our wish to have a reconfigurable FIR filter. In our case, we can
1055   -configure the coefficients set without to have to change the FPGA design. With
1056   -the FIR compiler, the coefficients set are given during the FPGA design conception
1057   -so we have to change the coefficients, we need to regenerate the design. The
1058   -difference with the LUT consumption is also related to the reconfigurability
1059   -logic. However the DSP consumption, the most restricted resource, are the same between the FIR compiler end
1060   -our FIR block. Our solutions are as good as the Xilinx implementation.
  1054 +To conclude, we compare our monolithic filters with the FIR Compiler provided by
  1055 +Xilinx in the Vivado software suite (v.2018.2). For each experiment, we use the
  1056 +same coefficient set and we compare the resource consumption, having checked that
  1057 +the transfer functions are indeed the same with both implementations.
  1058 +Table~\ref{tbl:xilinx_resources} shows the results.
  1059 +The FIR Compiler never uses BRAM, while our filter implementation uses one block. This difference
  1060 +is explained by our wish to have a dynamically reconfigurable FIR filter whose
  1061 +coefficients can be updated from the processing system without having to update the FPGA design.
  1062 +With the FIR Compiler, the coefficients are defined during the FPGA design so that
  1063 +changing coefficients requires generating a new design. The difference in LUT consumption
  1064 +is also attributed to the reconfigurability logic. However, the DSP consumption, the scarcest
  1065 +resource, is the same for the Xilinx FIR Compiler and
  1066 +our FIR block: we hence conclude that our solutions are as good as the Xilinx implementation.
1061 1067  
1062 1068 \renewcommand{\arraystretch}{1.2}
1063 1069 \begin{table}