Commit c27d271058517c5e6ec91558cb9d78154a236b3e
1 parent
56f7c40c96
Exists in
master
relecture
Showing 1 changed file with 37 additions and 31 deletions
ifcs2018_journal.tex
... | ... | @@ -174,7 +174,7 @@ |
174 | 174 | that all solutions found by the solver are synthesized and executed on hardware at the end |
175 | 175 | of the analysis. |
176 | 176 | |
177 | -In this demonstration , we focus on only two operations: filtering and shifting the number of | |
177 | +In this demonstration, we focus on only two operations: filtering and shifting the number of | |
178 | 178 | bits needed to represent the data along the processing chain. |
179 | 179 | We have chosen these basic operations because shifting and filtering have already been studied |
180 | 180 | in the literature \cite{lim_1996, lim_1988, young_1992, smith_1998} providing a framework for |
... | ... | @@ -298,7 +298,8 @@ |
298 | 298 | Our criterion to compute the filter rejection considers |
299 | 299 | % r2.8 et r2.2 r2.3 |
300 | 300 | the maximum magnitude within the stopband, from which the {\color{red}sum of the absolute values |
301 | -within the passband is subtracted to avoid filters with excessive ripples}. With this | |
301 | +within the passband is subtracted to avoid filters with excessive ripples, normalized to the | |
302 | +bin width to remain consistent with the passband criterion (dBc/Hz units in all cases)}. With this | |
302 | 303 | criterion, we meet the expected rejection capability of low pass filters as shown in figure~\ref{fig:custom_criterion}. |
303 | 304 | |
304 | 305 | % \begin{figure} |
... | ... | @@ -355,9 +356,10 @@ |
355 | 356 | \end{figure} |
356 | 357 | |
357 | 358 | % r2.6 |
358 | -Finally in our case, we consider that the input signal are fully known. So the | |
359 | -resolution of the data stream are fixed and still the same for all experiments | |
360 | -in this paper. | |
359 | +{\color{red} | |
360 | +Finally, in our case, we consider that the input signal is fully known. The | |
361 | +resolution of the input data stream is fixed and remains the same for all experiments | |
362 | +in this paper.} | |
361 | 363 | |
362 | 364 | Based on this analysis, we address the estimate of resource consumption (called |
363 | 365 | % r2.11 |
364 | 366 | |
... | ... | @@ -407,13 +409,15 @@ |
407 | 409 | |
408 | 410 | {\color{red} |
409 | 411 | This model is non-linear since we multiply some variable with another variable |
410 | -and it is even non-quadratic, as $F$ does not have a known | |
412 | +and it is even non-quadratic, as the cost function $F$ does not have a known | |
411 | 413 | linear or quadratic expression. To linearize this problem, we introduce $p$ FIR configurations. |
412 | -This variable must be defined by the user, it represent the number of different | |
413 | -set of coefficients generated (for memory, we use \texttt{firls} and \texttt{fir1} | |
414 | -functions from GNU Octave). So $C_{ij}$ and $\pi_{ij}^C$ become constant and | |
415 | -we defined $1 \leq j \leq p$ and the function $F$ can be estimate for each configurations | |
416 | -thanks our rejection criterion. We also defined binary | |
414 | +This variable $p$ is defined by the user, and represents the number of different | |
415 | +sets of coefficients generated (recall that we use the \texttt{firls} and \texttt{fir1} | |
416 | +functions from GNU Octave) based on the targeted filter characteristics and implementation | |
417 | +assumptions (estimated number of bits defining the coefficients). Hence, $C_{ij}$ and | |
418 | +$\pi_{ij}^C$ become constants and | |
419 | +we define $1 \leq j \leq p$ so that the function $F$ can be estimated (look-up table) | |
420 | +for each configuration thanks to the rejection criterion. We also define the binary | |
417 | 421 | variable $\delta_{ij}$ that has value 1 if stage~$i$ is in configuration~$j$ |
418 | 422 | and 0 otherwise. The new equations are as follows: |
419 | 423 | } |
... | ... | @@ -430,15 +434,15 @@ |
430 | 434 | Equation~\ref{eq:config} states that for each stage, a single configuration is chosen at most. |
431 | 435 | |
432 | 436 | {\color{red} |
433 | -However the problem still quadratic since in the constraint~\ref{eq:areadef2} we multiply | |
434 | -$\delta_{ij}$ and $\pi_i^-$. But like $\delta_{ij}$ is a binary variable we can | |
435 | -linearise this multiplication if we can bound $\pi_i^-$. As $\pi_i^-$ is the data size | |
436 | -we define $0 < \pi_i^- \leq 128$ which is the maximal data size that we can process. | |
437 | -} | |
438 | -Moreover the Gurobi | |
439 | -(\url{www.gurobi.com}) optimization software is used to solve this quadratic | |
440 | -model, and since Gurobi is able to linearize, the model is left as is. This model | |
441 | -has $O(np)$ variables and $O(n)$ constraints. | |
437 | +However, the problem remains quadratic at this stage since in constraint~\ref{eq:areadef2} | |
438 | +we multiply | |
439 | +$\delta_{ij}$ and $\pi_i^-$. Since $\delta_{ij}$ is a binary variable, we can | |
440 | +linearize this multiplication if we can bound $\pi_i^-$. As $\pi_i^-$ is the data size, | |
441 | +we define $0 < \pi_i^- \leq 128$ which is the maximum data size, estimated from | |
442 | +hardware characteristics. | |
443 | +The Gurobi (\url{www.gurobi.com}) optimization software used to solve this quadratic | |
444 | +model is able to linearize it, so the model is provided as is. This model | |
445 | +has $O(np)$ variables and $O(n)$ constraints.} | |
442 | 446 | |
443 | 447 | % This model is non-linear and even non-quadratic, as $F$ does not have a known |
444 | 448 | % linear or quadratic expression. We introduce $p$ FIR configurations |
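% Note on the linearization discussed in this hunk: a minimal sketch, not taken from the manuscript.
% It assumes an auxiliary variable $z_{ij}$ (a hypothetical name) standing for the product
% $\delta_{ij}\,\pi_i^-$, and uses only the bound $0 < \pi_i^- \leq 128$ stated above.
% With $\delta_{ij}$ binary, the product can be replaced by four linear constraints:
\begin{align*}
z_{ij} &\leq 128\,\delta_{ij}, \\
z_{ij} &\leq \pi_i^-, \\
z_{ij} &\geq \pi_i^- - 128\,(1 - \delta_{ij}), \\
z_{ij} &\geq 0.
\end{align*}
% If $\delta_{ij} = 1$, the second and third constraints force $z_{ij} = \pi_i^-$;
% if $\delta_{ij} = 0$, the first and fourth force $z_{ij} = 0$, and the third is slack
% because $\pi_i^- \leq 128$. This is presumably the kind of reformulation the solver
% applies internally, which is why the quadratic model can be passed to it unchanged.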
... | ... | @@ -1047,17 +1051,19 @@ |
1047 | 1051 | previous one. |
1048 | 1052 | |
1049 | 1053 | {\color{red} % r1.4 |
1050 | -To conclude we have compared our monolithic filters with the FIR Compiler form | |
1051 | -Xilinx. For each experimentation we use the same coefficient set and we compare the | |
1052 | -resources consumption. The table~\ref{tbl:xilinx_resources} exhibits the results. | |
1053 | -The FIR Compiler never use BRAM while our filter use one block. This difference | |
1054 | -can be explain be our wish to have a reconfigurable FIR filter. In our case, we can | |
1055 | -configure the coefficients set without to have to change the FPGA design. With | |
1056 | -the FIR compiler, the coefficients set are given during the FPGA design conception | |
1057 | -so we have to change the coefficients, we need to regenerate the design. The | |
1058 | -difference with the LUT consumption is also related to the reconfigurability | |
1059 | -logic. However the DSP consumption, the most restricted resource, are the same between the FIR compiler end | |
1060 | -our FIR block. Our solutions are as good as the Xilinx implementation. | |
1054 | +To conclude, we compare our monolithic filters with the FIR Compiler provided by | |
1055 | +Xilinx in the Vivado software suite (v.2018.2). For each experiment we use the | |
1056 | +same coefficient set and we compare the resource consumption, having checked that | |
1057 | +the transfer functions are indeed the same with both implementations. | |
1058 | +Table~\ref{tbl:xilinx_resources} exhibits the results. | |
1059 | +The FIR Compiler never uses BRAM while our filter implementation uses one block. This difference | |
1060 | +is explained by our wish to have a dynamically reconfigurable FIR filter whose | |
1061 | +coefficients can be updated from the processing system without having to update the FPGA design. | |
1062 | +With the FIR Compiler, the coefficients are defined during the FPGA design so that | |
1063 | +changing coefficients requires generating a new design. The difference in LUT consumption | |
1064 | +is also attributed to the reconfigurability logic. However, the DSP consumption, the scarcest | |
1065 | +resource, is the same between the Xilinx FIR Compiler and | |
1066 | +our FIR block: we hence conclude that our solutions are as good as the Xilinx implementation. | |
1061 | 1067 | |
1062 | 1068 | \renewcommand{\arraystretch}{1.2} |
1063 | 1069 | \begin{table} |