Commit efde7e84966811c5f4a9444ae5323f867b7ccd5b
Exists in
master
Merge branch 'master' of https://lxsd.femto-st.fr/gitlab/jfriedt/ifcs2018-article
Showing 1 changed file Side-by-side Diff
ifcs2018_journal.tex
... | ... | @@ -174,7 +174,7 @@ |
174 | 174 | that all solutions found by the solver are synthesized and executed on hardware at the end |
175 | 175 | of the analysis. |
176 | 176 | |
177 | -In this demonstration , we focus on only two operations: filtering and shifting the number of | |
177 | +In this demonstration, we focus on only two operations: filtering and shifting the number of | |
178 | 178 | bits needed to represent the data along the processing chain. |
179 | 179 | We have chosen these basic operations because shifting and filtering have already been studied |
180 | 180 | in the literature \cite{lim_1996, lim_1988, young_1992, smith_1998} providing a framework for |
... | ... | @@ -298,7 +298,8 @@ |
298 | 298 | Our criterion to compute the filter rejection considers |
299 | 299 | % r2.8 et r2.2 r2.3 |
300 | 300 | the {\color{red}minimal} rejection within the stopband, from which the {\color{red}sum of the absolute values |
301 | -within the passband is subtracted to avoid filters with excessive ripples}. With this | |
301 | +within the passband is subtracted to avoid filters with excessive ripples, normalized to the | |
302 | +bin width to remain consistent with the passband criterion (dBc/Hz units in all cases)}. With this | |
302 | 303 | criterion, we meet the expected rejection capability of low pass filters as shown in figure~\ref{fig:custom_criterion}. |
303 | 304 | |
304 | 305 | % \begin{figure} |
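The rejection criterion described in this hunk can be sketched as follows. This is only an illustration under stated assumptions, not the authors' script: the helper names (`lowpass`, `response_db`, `custom_criterion`), the band edges (0.4 and 0.6 of the Nyquist frequency), the number of frequency bins and the windowed-sinc design are all hypothetical, and the normalization is taken here as a division of the passband sum by the number of passband bins.

```python
import cmath
import math


def lowpass(num_taps, cutoff):
    """Hamming-windowed sinc low-pass; cutoff is relative to Nyquist (0..1)."""
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = cutoff * (n - mid)
        sinc = 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(cutoff * sinc * window)
    gain = sum(taps)
    return [t / gain for t in taps]  # unit gain at DC


def response_db(coeffs, n_bins=512):
    """Magnitude response in dB on n_bins points from 0 up to Nyquist."""
    out = []
    for k in range(n_bins):
        w = math.pi * k / n_bins
        h = sum(c * cmath.exp(-1j * w * n) for n, c in enumerate(coeffs))
        out.append(20 * math.log10(max(abs(h), 1e-12)))
    return out


def custom_criterion(coeffs, pass_edge=0.4, stop_edge=0.6, n_bins=512):
    """Minimal stopband rejection minus the normalized passband deviation (dB)."""
    db = response_db(coeffs, n_bins)
    passband = [v for k, v in enumerate(db) if k / n_bins <= pass_edge]
    stopband = [v for k, v in enumerate(db) if k / n_bins >= stop_edge]
    rejection = -max(stopband)  # worst case, i.e. minimal rejection
    # Assumption: sum of absolute passband values divided by the bin count.
    ripple = sum(abs(v) for v in passband) / len(passband)
    return rejection - ripple


if __name__ == "__main__":
    print(custom_criterion(lowpass(61, 0.5)))
```

A filter with a flat passband and a deep stopband scores high, while excessive passband ripple lowers the score, which is the behaviour the text asks of the criterion.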
... | ... | @@ -311,7 +312,8 @@ |
311 | 312 | \begin{figure} |
312 | 313 | \centering |
313 | 314 | \includegraphics[width=\linewidth]{images/colored_custom_criterion} |
314 | -\caption{Custom criterion (maximum rejection in the stopband minus the mean of the absolute value of the passband rejection) | |
315 | +\caption{Custom criterion (maximum rejection in the stopband minus the {\color{red} sum of the | |
316 | +absolute values of the passband rejection normalized to the bandwidth}) | |
315 | 317 | comparison between monolithic filter and cascaded filters} |
316 | 318 | \label{fig:custom_criterion} |
317 | 319 | \end{figure} |
... | ... | @@ -327,7 +329,9 @@ |
327 | 329 | \begin{figure} |
328 | 330 | \centering |
329 | 331 | \includegraphics[width=\linewidth]{images/rejection_pyramid} |
330 | -\caption{Rejection as a function of number of coefficients and number of bits} | |
332 | +\caption{{\color{red}{Filter}} rejection as a function of number of coefficients and number of bits | |
333 | +{\color{red}: this lookup table will be used to identify which filter parameters -- number of bits | |
334 | +representing coefficients and number of coefficients -- best match the targeted transfer function.}} | |
331 | 335 | \label{fig:rejection_pyramid} |
332 | 336 | \end{figure} |
333 | 337 | |
... | ... | @@ -342,7 +346,7 @@ |
342 | 346 | with respect to a basic sum of the rejection criteria shown as the dotted yellow line. |
343 | 347 | % r2.9 |
344 | 348 | Thus, estimating the rejection of filter cascades is more complex than taking the sum of all the rejection |
345 | -criteria of each filter. However since the this sum underestimates the rejection capability of the cascade, | |
349 | +criteria of each filter. However, since the {\color{red}individual filter rejection} sum underestimates the rejection capability of the cascade, | |
346 | 350 | % r2.10 |
347 | 351 | this upper bound is considered as a conservative and acceptable criterion for deciding on the suitability |
348 | 352 | of the filter cascade to meet design criteria. |
349 | 353 | |
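The claim that summing the individual rejections is a conservative estimate for the cascade can be checked numerically. The sketch below is an illustration only: the two moving-average filters, the stopband edge (0.6 of the Nyquist frequency) and the helper names `stopband_rejection` and `cascade` are assumptions, not taken from the article.

```python
import cmath
import math


def stopband_rejection(coeffs, stop_edge=0.6, n_bins=512):
    """Worst-case (minimal) rejection in dB over the stopband."""
    worst = -math.inf
    for k in range(n_bins):
        if k / n_bins < stop_edge:
            continue
        w = math.pi * k / n_bins
        h = sum(c * cmath.exp(-1j * w * n) for n, c in enumerate(coeffs))
        worst = max(worst, 20 * math.log10(max(abs(h), 1e-12)))
    return -worst


def cascade(a, b):
    """Coefficients of two FIR filters in series (linear convolution)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


# Hypothetical example filters: moving averages of length 4 and 6.
first = [1 / 4] * 4
second = [1 / 6] * 6

sum_of_rejections = stopband_rejection(first) + stopband_rejection(second)
cascade_rejection = stopband_rejection(cascade(first, second))
print(sum_of_rejections, cascade_rejection)
```

The inequality holds in general: the cascade response in dB is the sum of the individual responses at each frequency, and the maximum of a sum never exceeds the sum of the maxima. The two are equal only when both filters have their worst stopband level at the same frequency.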
... | ... | @@ -350,14 +354,19 @@ |
350 | 354 | \begin{figure} |
351 | 355 | \centering |
352 | 356 | \includegraphics[width=\linewidth]{images/cascaded_criterion} |
353 | -\caption{Rejection of two cascaded filters} | |
357 | +\caption{{\color{red}Transfer function of individual filters and after cascading} the two filters, | |
358 | +{\color{red}demonstrating that the selected criterion of maximum rejection in the bandstop (horizontal | |
359 | +lines) is met. Notice that the cascaded filter has better rejection than summing the bandstop | |
360 | +maximum of each individual filter.} | |
361 | +} | |
354 | 362 | \label{fig:sum_rejection} |
355 | 363 | \end{figure} |
356 | 364 | |
357 | 365 | % r2.6 |
358 | -Finally in our case, we consider that the input signal are fully known. So the | |
359 | -resolution of the data stream are fixed and still the same for all experiments | |
360 | -in this paper. | |
366 | +{\color{red} | |
367 | +Finally, in our case, we consider that the input signals are fully known. The | |
368 | +resolution of the input data stream is fixed and remains the same for all experiments | |
369 | +in this paper.} | |
361 | 370 | |
362 | 371 | Based on this analysis, we address the estimate of resource consumption (called |
363 | 372 | % r2.11 |
364 | 373 | |
... | ... | @@ -407,16 +416,24 @@ |
407 | 416 | |
408 | 417 | {\color{red} |
409 | 418 | This model is non-linear since we multiply some variable with another variable |
410 | -and it is even non-quadratic, as $F$ does not have a known | |
419 | +and it is even non-quadratic, as the cost function $F$ does not have a known | |
411 | 420 | linear or quadratic expression. To linearize this problem, we introduce $p$ FIR configurations. |
412 | -This variable must be defined by the user, it represent the number of different | |
413 | -set of coefficients generated (for memory, we use \texttt{firls} and \texttt{fir1} | |
414 | -functions from GNU Octave). To choose this value, we consider a subset of the figure~\ref{fig:rejection_pyramid} | |
415 | -to restrict the number of configurations. Indeed, it is useless to have too many coefficients or | |
416 | -too many bits, hence we take the configurations close to edge of pyramid. Thank to theses | |
417 | -configurations $C_{ij}$ and $\pi_{ij}^C$ ($1 \leq j \leq p$) become constant | |
418 | -and the function $F$ can be estimate for each configurations | |
419 | -thanks our rejection criterion. We also defined binary | |
421 | +% AH: conflit merge | |
422 | +% This variable must be defined by the user, it represent the number of different | |
423 | +% set of coefficients generated (for memory, we use \texttt{firls} and \texttt{fir1} | |
424 | +% functions from GNU Octave). To choose this value, we consider a subset of the figure~\ref{fig:rejection_pyramid} | |
425 | +% to restrict the number of configurations. Indeed, it is useless to have too many coefficients or | |
426 | +% too many bits, hence we take the configurations close to edge of pyramid. Thank to theses | |
427 | +% configurations $C_{ij}$ and $\pi_{ij}^C$ ($1 \leq j \leq p$) become constant | |
428 | +% and the function $F$ can be estimate for each configurations | |
429 | +% thanks our rejection criterion. We also defined binary | |
430 | +This variable $p$ is defined by the user, and represents the number of different | |
431 | +sets of coefficients generated (remember, we use the \texttt{firls} and \texttt{fir1} | |
432 | +functions from GNU Octave) based on the targeted filter characteristics and implementation | |
433 | +assumptions (estimated number of bits defining the coefficients). Hence, $C_{ij}$ and | |
434 | +$\pi_{ij}^C$ become constants and | |
435 | +we define $1 \leq j \leq p$ so that the function $F$ can be estimated (lookup table) | |
436 | +for each configuration thanks to the rejection criterion. We also define the binary | |
420 | 437 | variable $\delta_{ij}$ that has value 1 if stage~$i$ is in configuration~$j$ |
421 | 438 | and 0 otherwise. The new equations are as follows: |
422 | 439 | } |
... | ... | @@ -433,9 +450,20 @@ |
433 | 450 | Equation~\ref{eq:config} states that for each stage, a single configuration is chosen at most. |
434 | 451 | |
435 | 452 | {\color{red} |
436 | -However the problem still quadratic since in the constraint~\ref{eq:areadef2} we multiply | |
437 | -$\delta_{ij}$ and $\pi_i^-$. But like $\delta_{ij}$ is a binary variable we can | |
438 | -linearize this multiplication. The following formula shows how to linearize | |
453 | +% JM: conflict merge | |
454 | +% However the problem remains quadratic at this stage since in the constraint~\ref{eq:areadef2} | |
455 | +% we multiply | |
456 | +% $\delta_{ij}$ and $\pi_i^-$. However, since $\delta_{ij}$ is a binary variable we can | |
457 | +% linearise this multiplication if we can bound $\pi_i^-$. As $\pi_i^-$ is the data size, | |
458 | +% we define $0 < \pi_i^- \leq 128$ which is the maximum data size whose estimation is | |
459 | +% assumed on hardware characteristics. | |
460 | +% The Gurobi (\url{www.gurobi.com}) optimization software used to solve this quadratic | |
461 | +% model is able to linearize the model provided as is. This model | |
462 | +% has $O(np)$ variables and $O(n)$ constraints.} | |
463 | +However, the problem remains quadratic at this stage since in the constraint~\ref{eq:areadef2} | |
464 | +we multiply | |
465 | +$\delta_{ij}$ and $\pi_i^-$. Since $\delta_{ij}$ is a binary variable, we can | |
466 | +linearize this multiplication. The following formula shows how to linearize | |
439 | 467 | this situation in the general case with $y$ a binary variable and $x$ a real variable ($0 \leq x \leq X^{max}$): |
440 | 468 | \begin{equation*} |
441 | 469 | m = x \times y \implies |
442 | 470 | |
... | ... | @@ -448,12 +476,11 @@ |
448 | 476 | \end{split} |
449 | 477 | \right . |
450 | 478 | \end{equation*} |
451 | - | |
452 | -So if we bound up $\pi_i^-$ by 128~bits to represent the maximum data size tolerated, | |
479 | +So if we bound $\pi_i^-$ from above by 128~bits, the maximum data size assumed from | |
480 | +hardware characteristics, | |
453 | 481 | the Gurobi (\url{www.gurobi.com}) optimization software will be able to linearize |
454 | -for us the quadratic problem so the model is left as is. | |
455 | -} | |
456 | -This model has $O(np)$ variables and $O(n)$ constraints. | |
482 | +the quadratic problem for us, so the model is left as is. This model | |
483 | +has $O(np)$ variables and $O(n)$ constraints.} | |
457 | 484 | |
458 | 485 | % This model is non-linear and even non-quadratic, as $F$ does not have a known |
459 | 486 | % linear or quadratic expression. We introduce $p$ FIR configurations |
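The linearization of a product between a binary variable and a bounded real variable can be checked with a short enumeration. The sketch uses the standard big-M constraints ($m \geq 0$, $m \leq X^{max} y$, $m \leq x$, $m \geq x - (1 - y) X^{max}$); since this hunk elides the constraint set of the article, the exact form written there may differ, and the function name `product_bounds` is hypothetical.

```python
def product_bounds(x, y, x_max):
    """Feasible interval for m under the standard big-M linearization of m = x * y.

    Constraints (assumed here, standard textbook form):
        m >= 0
        m <= x_max * y
        m <= x
        m >= x - (1 - y) * x_max
    """
    lower = max(0, x - (1 - y) * x_max)
    upper = min(x_max * y, x)
    return lower, upper


# Assumption taken from the text: the data size is bounded by 128 bits.
X_MAX = 128

for y in (0, 1):
    for x in range(X_MAX + 1):
        # The linear constraints leave exactly one feasible value: x * y.
        assert product_bounds(x, y, X_MAX) == (x * y, x * y)

print("linear constraints pin m to x * y for every tested (x, y)")
```

For $y = 0$ the constraints force $m = 0$, and for $y = 1$ they force $m = x$, so the product disappears from the model as long as a finite upper bound on $x$ is available.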
... | ... | @@ -532,7 +559,8 @@ |
532 | 559 | \draw[->] (Deploy) edge node [left] { (5) } (Postproc) ; |
533 | 560 | \draw[->] (Postproc) -- (Results) ; |
534 | 561 | \end{tikzpicture} |
535 | - \caption{Design workflow from the input parameters to the results} | |
562 | + \caption{Design workflow from the input parameters to the results {\color{red} allowing for | |
563 | +a fully automated optimal solution search.}} | |
536 | 564 | \label{fig:workflow} |
537 | 565 | \end{figure} |
538 | 566 | |
539 | 567 | |
540 | 568 | |
541 | 569 | |
... | ... | @@ -710,22 +738,28 @@ |
710 | 738 | \centering |
711 | 739 | \begin{subfigure}{\linewidth} |
712 | 740 | \includegraphics[width=\linewidth]{images/max_500} |
713 | - \caption{Signal spectrum for MAX/500} | |
741 | + \caption{\color{red}Filter transfer functions for a varying number of cascaded filters solving | |
742 | +the MAX/500 problem of maximizing rejection for a given resource allocation (500~arbitrary units).} | |
714 | 743 | \label{fig:max_500_result} |
715 | 744 | \end{subfigure} |
716 | 745 | |
717 | 746 | \begin{subfigure}{\linewidth} |
718 | 747 | \includegraphics[width=\linewidth]{images/max_1000} |
719 | - \caption{Signal spectrum for MAX/1000} | |
748 | + \caption{\color{red}Filter transfer functions for a varying number of cascaded filters solving | |
749 | +the MAX/1000 problem of maximizing rejection for a given resource allocation (1000~arbitrary units).} | |
720 | 750 | \label{fig:max_1000_result} |
721 | 751 | \end{subfigure} |
722 | 752 | |
723 | 753 | \begin{subfigure}{\linewidth} |
724 | 754 | \includegraphics[width=\linewidth]{images/max_1500} |
725 | - \caption{Signal spectrum for MAX/1500} | |
755 | + \caption{\color{red}Filter transfer functions for a varying number of cascaded filters solving | |
756 | +the MAX/1500 problem of maximizing rejection for a given resource allocation (1500~arbitrary units).} | |
726 | 757 | \label{fig:max_1500_result} |
727 | 758 | \end{subfigure} |
728 | - \caption{Signal spectrum of each experimental configurations MAX/500, MAX/1000 and MAX/1500} | |
759 | + \caption{\color{red}Solutions for the MAX/500, MAX/1000 and MAX/1500 problems of maximizing | |
760 | +rejection for a given resource allocation. | |
761 | +The filter shape constraint (bandpass and bandstop) is shown as thick | |
762 | +horizontal lines on each chart.} | |
729 | 763 | \end{figure} |
730 | 764 | |
731 | 765 | In all cases, we observe that the actual rejection is close to the rejection computed by the solver. |
... | ... | @@ -747,7 +781,8 @@ |
747 | 781 | Logic (PL -- FPGA) to Processing System (PS -- general purpose processor) communication. |
748 | 782 | |
749 | 783 | \begin{table}[h!tb] |
750 | - \caption{Resource occupation. The last column refers to available resources on a Zynq-7010 as found on the Redpitaya.} | |
784 | + \caption{Resource occupation {\color{red}following synthesis of the solutions found for | |
785 | +the problem of maximizing rejection for a given resource allocation}. The last column refers to available resources on a Zynq-7010 as found on the Redpitaya.} | |
751 | 786 | \label{tbl:resources_usage} |
752 | 787 | \centering |
753 | 788 | \begin{tabular}{|c|c|ccc|c|} |
754 | 789 | |
755 | 790 | |
756 | 791 | |
757 | 792 | |
... | ... | @@ -964,29 +999,36 @@ |
964 | 999 | \begin{figure} |
965 | 1000 | \centering |
966 | 1001 | \begin{subfigure}{\linewidth} |
967 | - \includegraphics[width=\linewidth]{images/min_40} | |
968 | - \caption{Signal spectrum for MIN/40} | |
1002 | + \includegraphics[width=.91\linewidth]{images/min_40} | |
1003 | + \caption{\color{red}Filter transfer functions for a varying number of cascaded filters solving | |
1004 | +the MIN/40 problem of minimizing resource allocation for reaching a 40~dB rejection.} | |
969 | 1005 | \label{fig:min_40} |
970 | 1006 | \end{subfigure} |
971 | 1007 | |
972 | 1008 | \begin{subfigure}{\linewidth} |
973 | - \includegraphics[width=\linewidth]{images/min_60} | |
974 | - \caption{Signal spectrum for MIN/60} | |
1009 | + \includegraphics[width=.91\linewidth]{images/min_60} | |
1010 | + \caption{\color{red}Filter transfer functions for a varying number of cascaded filters solving | |
1011 | +the MIN/60 problem of minimizing resource allocation for reaching a 60~dB rejection.} | |
975 | 1012 | \label{fig:min_60} |
976 | 1013 | \end{subfigure} |
977 | 1014 | |
978 | 1015 | \begin{subfigure}{\linewidth} |
979 | - \includegraphics[width=\linewidth]{images/min_80} | |
980 | - \caption{Signal spectrum for MIN/80} | |
1016 | + \includegraphics[width=.91\linewidth]{images/min_80} | |
1017 | + \caption{\color{red}Filter transfer functions for a varying number of cascaded filters solving | |
1018 | +the MIN/80 problem of minimizing resource allocation for reaching an 80~dB rejection.} | |
981 | 1019 | \label{fig:min_80} |
982 | 1020 | \end{subfigure} |
983 | 1021 | |
984 | 1022 | \begin{subfigure}{\linewidth} |
985 | - \includegraphics[width=\linewidth]{images/min_100} | |
986 | - \caption{Signal spectrum for MIN/100} | |
1023 | + \includegraphics[width=.91\linewidth]{images/min_100} | |
1024 | + \caption{\color{red}Filter transfer functions for a varying number of cascaded filters solving | |
1025 | +the MIN/100 problem of minimizing resource allocation for reaching a 100~dB rejection.} | |
987 | 1026 | \label{fig:min_100} |
988 | 1027 | \end{subfigure} |
989 | - \caption{Signal spectrum of each experimental configurations MIN/40, MIN/60, MIN/80 and MIN/100} | |
1028 | + \caption{\color{red}Solutions for the MIN/40, MIN/60, MIN/80 and MIN/100 problems of reaching a | |
1029 | +given rejection while minimizing resource allocation. The filter shape constraint (bandpass and | |
1030 | +bandstop) is shown as thick | |
1031 | +horizontal lines on each chart.} | |
990 | 1032 | \end{figure} |
991 | 1033 | |
992 | 1034 | We observe that all rejections given by the quadratic solver are close to the experimentally |
... | ... | @@ -1062,17 +1104,19 @@ |
1062 | 1104 | previous one. |
1063 | 1105 | |
1064 | 1106 | {\color{red} % r1.4 |
1065 | -To conclude we have compared our monolithic filters with the FIR Compiler form | |
1066 | -Xilinx. For each experimentation we use the same coefficient set and we compare the | |
1067 | -resources consumption. The table~\ref{tbl:xilinx_resources} exhibits the results. | |
1068 | -The FIR Compiler never use BRAM while our filter use one block. This difference | |
1069 | -can be explain be our wish to have a reconfigurable FIR filter. In our case, we can | |
1070 | -configure the coefficients set without to have to change the FPGA design. With | |
1071 | -the FIR compiler, the coefficients set are given during the FPGA design conception | |
1072 | -so we have to change the coefficients, we need to regenerate the design. The | |
1073 | -difference with the LUT consumption is also related to the reconfigurability | |
1074 | -logic. However the DSP consumption, the most restricted resource, are the same between the FIR compiler end | |
1075 | -our FIR block. Our solutions are as good as the Xilinx implementation. | |
1107 | +To conclude, we compare our monolithic filters with the FIR Compiler provided by | |
1108 | +Xilinx in the Vivado software suite (v.2018.2). For each experiment we use the | |
1109 | +same coefficient set and we compare the resource consumption, having checked that | |
1110 | +the transfer functions are indeed the same with both implementations. | |
1111 | +Table~\ref{tbl:xilinx_resources} exhibits the results. | |
1112 | +The FIR Compiler never uses BRAM while our filter implementation uses one block. This difference | |
1113 | +is explained by our wish to have a dynamically reconfigurable FIR filter whose | |
1114 | +coefficients can be updated from the processing system without having to update the FPGA design. | |
1115 | +With the FIR Compiler, the coefficients are defined during the FPGA design so that | |
1116 | +changing coefficients requires generating a new design. The difference in LUT consumption | |
1117 | +is also attributed to the reconfigurability logic. However, the DSP consumption, the scarcest | |
1118 | +resource, is the same between the Xilinx FIR Compiler and | |
1119 | +our FIR block: we hence conclude that our solutions are as good as the Xilinx implementation. | |
1076 | 1120 | |
1077 | 1121 | \renewcommand{\arraystretch}{1.2} |
1078 | 1122 | \begin{table} |