Commit efde7e84966811c5f4a9444ae5323f867b7ccd5b

Authored by Arthur HUGEAT
Exists in master

Merge branch 'master' of https://lxsd.femto-st.fr/gitlab/jfriedt/ifcs2018-article


ifcs2018_journal.tex
... ... @@ -174,7 +174,7 @@
174 174 that all solutions found by the solver are synthesized and executed on hardware at the end
175 175 of the analysis.
176 176  
177   -In this demonstration , we focus on only two operations: filtering and shifting the number of
  177 +In this demonstration, we focus on only two operations: filtering and shifting the number of
178 178 bits needed to represent the data along the processing chain.
179 179 We have chosen these basic operations because shifting and filtering have already been studied
180 180 in the literature \cite{lim_1996, lim_1988, young_1992, smith_1998} providing a framework for
... ... @@ -298,7 +298,8 @@
298 298 Our criterion to compute the filter rejection considers
299 299 % r2.8 et r2.2 r2.3
300 300 the {\color{red}minimal} rejection within the stopband, to which the {\color{red}sum of the absolute values
301   -within the passband is subtracted to avoid filters with excessive ripples}. With this
  301 +within the passband is subtracted to avoid filters with excessive ripples, normalized to the
  302 +bin width to remain consistent with the passband criterion (dBc/Hz units in all cases)}. With this
302 303 criterion, we meet the expected rejection capability of low pass filters as shown in figure~\ref{fig:custom_criterion}.
303 304  
304 305 % \begin{figure}
... ... @@ -311,7 +312,8 @@
311 312 \begin{figure}
312 313 \centering
313 314 \includegraphics[width=\linewidth]{images/colored_custom_criterion}
314   -\caption{Custom criterion (maximum rejection in the stopband minus the mean of the absolute value of the passband rejection)
  315 +\caption{Custom criterion (maximum rejection in the stopband minus the {\color{red} sum of the
  316 +absolute values of the passband rejection normalized to the bandwidth})
315 317 comparison between monolithic filter and cascaded filters}
316 318 \label{fig:custom_criterion}
317 319 \end{figure}
... ... @@ -327,7 +329,9 @@
327 329 \begin{figure}
328 330 \centering
329 331 \includegraphics[width=\linewidth]{images/rejection_pyramid}
330   -\caption{Rejection as a function of number of coefficients and number of bits}
  332 +\caption{{\color{red}{Filter}} rejection as a function of number of coefficients and number of bits
  333 +{\color{red}: this lookup table will be used to identify which filter parameters -- number of bits
  334 +representing coefficients and number of coefficients -- best match the targeted transfer function.}}
331 335 \label{fig:rejection_pyramid}
332 336 \end{figure}
333 337  
... ... @@ -342,7 +346,7 @@
342 346 with respect to a basic sum of the rejection criteria shown as the dotted yellow line.
343 347 % r2.9
344 348 Thus, estimating the rejection of filter cascades is more complex than taking the sum of all the rejection
345   -criteria of each filter. However since the this sum underestimates the rejection capability of the cascade,
  349 +criteria of each filter. However, since the {\color{red}individual filter rejection} sum underestimates the rejection capability of the cascade,
346 350 % r2.10
347 351 this upper bound is considered as a conservative and acceptable criterion for deciding on the suitability
348 352 of the filter cascade to meet design criteria.
349 353  
... ... @@ -350,14 +354,19 @@
350 354 \begin{figure}
351 355 \centering
352 356 \includegraphics[width=\linewidth]{images/cascaded_criterion}
353   -\caption{Rejection of two cascaded filters}
  357 +\caption{{\color{red}Transfer function of individual filters and after cascading} the two filters,
  358 +{\color{red}demonstrating that the selected criterion of maximum rejection in the bandstop (horizontal
  359 +lines) is met. Notice that the cascaded filter has better rejection than summing the bandstop
  360 +maximum of each individual filter.}
  361 +}
354 362 \label{fig:sum_rejection}
355 363 \end{figure}
356 364  
357 365 % r2.6
358   -Finally in our case, we consider that the input signal are fully known. So the
359   -resolution of the data stream are fixed and still the same for all experiments
360   -in this paper.
  366 +{\color{red}
  367 +Finally, in our case, we consider that the input signals are fully known. The
  368 +resolution of the input data stream is fixed and remains the same for all experiments
  369 +in this paper.}
361 370  
362 371 Based on this analysis, we address the estimate of resource consumption (called
363 372 % r2.11
364 373  
... ... @@ -407,16 +416,24 @@
407 416  
408 417 {\color{red}
409 418 This model is non-linear since we multiply some variable with another variable
410   -and it is even non-quadratic, as $F$ does not have a known
  419 +and it is even non-quadratic, as the cost function $F$ does not have a known
411 420 linear or quadratic expression. To linearize this problem, we introduce $p$ FIR configurations.
412   -This variable must be defined by the user, it represent the number of different
413   -set of coefficients generated (for memory, we use \texttt{firls} and \texttt{fir1}
414   -functions from GNU Octave). To choose this value, we consider a subset of the figure~\ref{fig:rejection_pyramid}
415   -to restrict the number of configurations. Indeed, it is useless to have too many coefficients or
416   -too many bits, hence we take the configurations close to edge of pyramid. Thank to theses
417   -configurations $C_{ij}$ and $\pi_{ij}^C$ ($1 \leq j \leq p$) become constant
418   -and the function $F$ can be estimate for each configurations
419   -thanks our rejection criterion. We also defined binary
  421 +% AH: merge conflict
  422 +% This variable must be defined by the user, it represent the number of different
  423 +% set of coefficients generated (for memory, we use \texttt{firls} and \texttt{fir1}
  424 +% functions from GNU Octave). To choose this value, we consider a subset of the figure~\ref{fig:rejection_pyramid}
  425 +% to restrict the number of configurations. Indeed, it is useless to have too many coefficients or
  426 +% too many bits, hence we take the configurations close to edge of pyramid. Thank to theses
  427 +% configurations $C_{ij}$ and $\pi_{ij}^C$ ($1 \leq j \leq p$) become constant
  428 +% and the function $F$ can be estimate for each configurations
  429 +% thanks our rejection criterion. We also defined binary
  430 +This variable $p$ is defined by the user, and represents the number of different
  431 +sets of coefficients generated (remember, we use the \texttt{firls} and \texttt{fir1}
  432 +functions from GNU Octave) based on the targeted filter characteristics and implementation
  433 +assumptions (estimated number of bits defining the coefficients). Hence, $C_{ij}$ and
  434 +$\pi_{ij}^C$ become constants and
  435 +we define $1 \leq j \leq p$ so that the function $F$ can be estimated (Look Up Table)
  436 +for each configuration thanks to the rejection criterion. We also define the binary
420 437 variable $\delta_{ij}$ that has value 1 if stage~$i$ is in configuration~$j$
421 438 and 0 otherwise. The new equations are as follows:
422 439 }
... ... @@ -433,9 +450,20 @@
433 450 Equation~\ref{eq:config} states that for each stage, a single configuration is chosen at most.
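
The role of the binary selection variables $\delta_{ij}$ can be illustrated with a minimal brute-force sketch in Python. It is not the model solved in the paper: the lookup table values, the names `CONFIGS` and `best_cascade`, and the assumption that each configuration has a fixed area are all hypothetical simplifications (in the actual model the area also depends on the data size $\pi_i^-$). The cascade rejection is approximated by the sum of the individual rejections, which is the conservative criterion retained above.

```python
from itertools import product

# Hypothetical lookup table: one (rejection in dB, area in arbitrary units)
# pair per FIR configuration j. These numbers are made up for illustration.
CONFIGS = [(20.0, 100), (35.0, 220), (50.0, 400)]


def best_cascade(n_stages, area_budget):
    """Exhaustively pick at most one configuration per stage.

    None means the stage is unused (all delta_ij are 0 for that stage).
    Returns (total rejection, total area, assignment).
    """
    choices = [None] + list(range(len(CONFIGS)))
    best = (0.0, 0, tuple([None] * n_stages))
    for assignment in product(choices, repeat=n_stages):
        used = [CONFIGS[j] for j in assignment if j is not None]
        area = sum(a for _, a in used)
        rejection = sum(r for r, _ in used)
        # Keep the first assignment with strictly higher rejection within budget.
        if area <= area_budget and rejection > best[0]:
            best = (rejection, area, assignment)
    return best


rejection, area, assignment = best_cascade(n_stages=3, area_budget=500)
print(rejection, area, assignment)
```

With this toy table and a budget of 500 units, two small filters followed by a medium one outperform any single filter, which is the kind of trade-off the solver explores over a much larger configuration space.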
434 451  
435 452 {\color{red}
436   -However the problem still quadratic since in the constraint~\ref{eq:areadef2} we multiply
437   -$\delta_{ij}$ and $\pi_i^-$. But like $\delta_{ij}$ is a binary variable we can
438   -linearize this multiplication. The following formula shows how to linearize
  453 +% JM: conflict merge
  454 +% However the problem remains quadratic at this stage since in the constraint~\ref{eq:areadef2}
  455 +% we multiply
  456 +% $\delta_{ij}$ and $\pi_i^-$. However, since $\delta_{ij}$ is a binary variable we can
  457 +% linearise this multiplication if we can bound $\pi_i^-$. As $\pi_i^-$ is the data size,
  458 +% we define $0 < \pi_i^- \leq 128$ which is the maximum data size whose estimation is
  459 +% assumed on hardware characteristics.
  460 +% The Gurobi (\url{www.gurobi.com}) optimization software used to solve this quadratic
  461 +% model is able to linearize the model provided as is. This model
  462 +% has $O(np)$ variables and $O(n)$ constraints.}
  463 +However, the problem remains quadratic at this stage since in the constraint~\ref{eq:areadef2}
  464 +we multiply
  465 +$\delta_{ij}$ and $\pi_i^-$. However, since $\delta_{ij}$ is a binary variable we can
  466 +linearize this multiplication. The following formula shows how to linearize
439 467 this situation in the general case with $y$ a binary variable and $x$ a real variable ($0 \leq x \leq X^{max}$):
440 468 \begin{equation*}
441 469 m = x \times y \implies
442 470  
... ... @@ -448,12 +476,11 @@
448 476 \end{split}
449 477 \right .
450 478 \end{equation*}
451   -
452   -So if we bound up $\pi_i^-$ by 128~bits to represent the maximum data size tolerated,
  479 +So if we bound $\pi_i^-$ from above by 128~bits, the maximum data size assumed from the
  480 +hardware characteristics,
453 481 the Gurobi (\url{www.gurobi.com}) optimization software will be able to linearize
454   -for us the quadratic problem so the model is left as is.
455   -}
456   -This model has $O(np)$ variables and $O(n)$ constraints.
  482 +for us the quadratic problem so the model is left as is. This model
  483 +has $O(np)$ variables and $O(n)$ constraints.}
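
As a sanity check of this kind of linearization, the following Python sketch uses the textbook big-M constraints for the product of a binary variable $y$ and a bounded real variable $x$: $m \geq 0$, $m \leq y X^{max}$, $m \leq x$ and $m \geq x - (1 - y) X^{max}$. This is the standard form and is assumed here for illustration; it is not necessarily written exactly as in the paper, and the helper name `feasible_m_interval` is hypothetical. The sketch verifies numerically that these four linear constraints leave $m = x \times y$ as the only feasible value.

```python
def feasible_m_interval(x, y, x_max):
    """Return the (lowest, highest) value of m allowed by the four
    linear constraints of the assumed big-M formulation."""
    lowest = max(0.0, x - (1 - y) * x_max)   # m >= 0 and m >= x - (1 - y) * x_max
    highest = min(y * x_max, x)              # m <= y * x_max and m <= x
    return lowest, highest


X_MAX = 128.0  # upper bound on the data size, as in the text

for y in (0, 1):
    for x in (0.0, 1.0, 16.5, 128.0):
        lowest, highest = feasible_m_interval(x, y, X_MAX)
        # The interval collapses to a single point equal to the product.
        assert lowest == highest == x * y
```

The bound $X^{max}$ matters: without a finite upper bound on $x$, the constraints involving $X^{max}$ cannot be written, which is why bounding $\pi_i^-$ is a prerequisite for the linearization.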
457 484  
458 485 % This model is non-linear and even non-quadratic, as $F$ does not have a known
459 486 % linear or quadratic expression. We introduce $p$ FIR configurations
... ... @@ -532,7 +559,8 @@
532 559 \draw[->] (Deploy) edge node [left] { (5) } (Postproc) ;
533 560 \draw[->] (Postproc) -- (Results) ;
534 561 \end{tikzpicture}
535   - \caption{Design workflow from the input parameters to the results}
  562 + \caption{Design workflow from the input parameters to the results {\color{red} allowing for
  563 +a fully automated optimal solution search.}}
536 564 \label{fig:workflow}
537 565 \end{figure}
538 566  
539 567  
540 568  
541 569  
... ... @@ -710,22 +738,28 @@
710 738 \centering
711 739 \begin{subfigure}{\linewidth}
712 740 \includegraphics[width=\linewidth]{images/max_500}
713   - \caption{Signal spectrum for MAX/500}
  741 + \caption{\color{red}Filter transfer functions for varying number of cascaded filters solving
  742 +the MAX/500 problem of maximizing rejection for a given resource allocation (500~arbitrary units).}
714 743 \label{fig:max_500_result}
715 744 \end{subfigure}
716 745  
717 746 \begin{subfigure}{\linewidth}
718 747 \includegraphics[width=\linewidth]{images/max_1000}
719   - \caption{Signal spectrum for MAX/1000}
  748 + \caption{\color{red}Filter transfer functions for varying number of cascaded filters solving
  749 +the MAX/1000 problem of maximizing rejection for a given resource allocation (1000~arbitrary units).}
720 750 \label{fig:max_1000_result}
721 751 \end{subfigure}
722 752  
723 753 \begin{subfigure}{\linewidth}
724 754 \includegraphics[width=\linewidth]{images/max_1500}
725   - \caption{Signal spectrum for MAX/1500}
  755 + \caption{\color{red}Filter transfer functions for varying number of cascaded filters solving
  756 +the MAX/1500 problem of maximizing rejection for a given resource allocation (1500~arbitrary units).}
726 757 \label{fig:max_1500_result}
727 758 \end{subfigure}
728   - \caption{Signal spectrum of each experimental configurations MAX/500, MAX/1000 and MAX/1500}
  759 + \caption{\color{red}Solutions for the MAX/500, MAX/1000 and MAX/1500 problems of maximizing
  760 +rejection for a given resource allocation.
  761 +The filter shape constraint (bandpass and bandstop) is shown as thick
  762 +horizontal lines on each chart.}
729 763 \end{figure}
730 764  
731 765 In all cases, we observe that the actual rejection is close to the rejection computed by the solver.
... ... @@ -747,7 +781,8 @@
747 781 Logic (PL -- FPGA) to Processing System (PS -- general purpose processor) communication.
748 782  
749 783 \begin{table}[h!tb]
750   - \caption{Resource occupation. The last column refers to available resources on a Zynq-7010 as found on the Redpitaya.}
  784 + \caption{Resource occupation {\color{red}following synthesis of the solutions found for
  785 +the problem of maximizing rejection for a given resource allocation}. The last column refers to available resources on a Zynq-7010 as found on the Redpitaya.}
751 786 \label{tbl:resources_usage}
752 787 \centering
753 788 \begin{tabular}{|c|c|ccc|c|}
754 789  
755 790  
756 791  
757 792  
... ... @@ -964,29 +999,36 @@
964 999 \begin{figure}
965 1000 \centering
966 1001 \begin{subfigure}{\linewidth}
967   - \includegraphics[width=\linewidth]{images/min_40}
968   - \caption{Signal spectrum for MIN/40}
  1002 + \includegraphics[width=.91\linewidth]{images/min_40}
  1003 + \caption{\color{red}Filter transfer functions for varying number of cascaded filters solving
  1004 +the MIN/40 problem of minimizing resource allocation for reaching a 40~dB rejection.}
969 1005 \label{fig:min_40}
970 1006 \end{subfigure}
971 1007  
972 1008 \begin{subfigure}{\linewidth}
973   - \includegraphics[width=\linewidth]{images/min_60}
974   - \caption{Signal spectrum for MIN/60}
  1009 + \includegraphics[width=.91\linewidth]{images/min_60}
  1010 + \caption{\color{red}Filter transfer functions for varying number of cascaded filters solving
  1011 +the MIN/60 problem of minimizing resource allocation for reaching a 60~dB rejection.}
975 1012 \label{fig:min_60}
976 1013 \end{subfigure}
977 1014  
978 1015 \begin{subfigure}{\linewidth}
979   - \includegraphics[width=\linewidth]{images/min_80}
980   - \caption{Signal spectrum for MIN/80}
  1016 + \includegraphics[width=.91\linewidth]{images/min_80}
  1017 + \caption{\color{red}Filter transfer functions for varying number of cascaded filters solving
  1018 +the MIN/80 problem of minimizing resource allocation for reaching an 80~dB rejection.}
981 1019 \label{fig:min_80}
982 1020 \end{subfigure}
983 1021  
984 1022 \begin{subfigure}{\linewidth}
985   - \includegraphics[width=\linewidth]{images/min_100}
986   - \caption{Signal spectrum for MIN/100}
  1023 + \includegraphics[width=.91\linewidth]{images/min_100}
  1024 + \caption{\color{red}Filter transfer functions for varying number of cascaded filters solving
  1025 +the MIN/100 problem of minimizing resource allocation for reaching a 100~dB rejection.}
987 1026 \label{fig:min_100}
988 1027 \end{subfigure}
989   - \caption{Signal spectrum of each experimental configurations MIN/40, MIN/60, MIN/80 and MIN/100}
  1028 + \caption{\color{red}Solutions for the MIN/40, MIN/60, MIN/80 and MIN/100 problems of reaching a
  1029 +given rejection while minimizing resource allocation. The filter shape constraint (bandpass and
  1030 +bandstop) is shown as thick
  1031 +horizontal lines on each chart.}
990 1032 \end{figure}
991 1033  
992 1034 We observe that all rejections given by the quadratic solver are close to the experimentally
... ... @@ -1062,17 +1104,19 @@
1062 1104 previous one.
1063 1105  
1064 1106 {\color{red} % r1.4
1065   -To conclude we have compared our monolithic filters with the FIR Compiler form
1066   -Xilinx. For each experimentation we use the same coefficient set and we compare the
1067   -resources consumption. The table~\ref{tbl:xilinx_resources} exhibits the results.
1068   -The FIR Compiler never use BRAM while our filter use one block. This difference
1069   -can be explain be our wish to have a reconfigurable FIR filter. In our case, we can
1070   -configure the coefficients set without to have to change the FPGA design. With
1071   -the FIR compiler, the coefficients set are given during the FPGA design conception
1072   -so we have to change the coefficients, we need to regenerate the design. The
1073   -difference with the LUT consumption is also related to the reconfigurability
1074   -logic. However the DSP consumption, the most restricted resource, are the same between the FIR compiler end
1075   -our FIR block. Our solutions are as good as the Xilinx implementation.
  1107 +To conclude, we compare our monolithic filters with the FIR Compiler provided by
  1108 +Xilinx in the Vivado software suite (v.2018.2). For each experiment we use the
  1109 +same coefficient set and we compare the resource consumption, having checked that
  1110 +the transfer functions are indeed the same with both implementations.
  1111 +Table~\ref{tbl:xilinx_resources} exhibits the results.
  1112 +The FIR Compiler never uses BRAM while our filter implementation uses one block. This difference
  1113 +is explained by our wish to have a dynamically reconfigurable FIR filter whose
  1114 +coefficients can be updated from the processing system without having to update the FPGA design.
  1115 +With the FIR Compiler, the coefficients are defined during the FPGA design so that
  1116 +changing coefficients requires generating a new design. The difference in LUT consumption
  1117 +is also attributed to the reconfigurability logic. However, the DSP consumption, the scarcest
  1118 +resource, is the same between the Xilinx FIR Compiler and
  1119 +our FIR block: we hence conclude that our solutions are as good as the Xilinx implementation.
1076 1120  
1077 1121 \renewcommand{\arraystretch}{1.2}
1078 1122 \begin{table}