jfriedt / IFCS2018 article

Commits (2)

9c253d6d2 Correction of the selection criterion.

Arthur HUGEAT
2019-08-02 12:30:22 +0200
efde7e849 Merge branch 'master' of https://lxsd.femto-st.fr/gitlab/jfriedt/ifcs2018-article

Arthur HUGEAT
2019-08-02 12:41:22 +0200

Diff

Showing 4 changed files

ifcs2018_journal.tex
ifcs2018_journal_reponse.tex
images/letter_max_criterion.pdf
images/letter_sum_criterion.pdf

ifcs2018_journal.tex

% merge max rejection at given area vs. minimizing area at given rejection
% demonstrate how quantization pushes noise towards high frequencies => 6 dB of
% rejection per bit, and loss if fewer bits than rejection/6
% develop the linear program including the bit shift
% emphasize that previously the design was synthesizable but not implementable, whereas now we
% implement it and demonstrate that it runs
% gwen: why is the FIR now implementable when it was not even on the zedboard -> new FIR?
% Gwen: can we build a real phase noise test bench with this FIR, i.e. add ADC, NCO and mixer
% (zedboard or redpit)

% schema label: check that "argue for the FIR cascade" is done

\documentclass[a4paper,journal]{IEEEtran/IEEEtran}
\usepackage{graphicx,color,hyperref}
\usepackage{amsfonts}
\usepackage{amsthm}
\usepackage{amssymb}
\usepackage{amsmath}
\usepackage{algorithm2e}
\usepackage{url,balance}
\usepackage[normalem]{ulem}
\usepackage{tikz}
\usetikzlibrary{positioning,fit}
\usepackage{multirow}
\usepackage{scalefnt}
\usepackage{caption}
\usepackage{subcaption}

% correct bad hyphenation here
\hyphenation{op-tical net-works semi-conduc-tor}
\textheight=26cm
\setlength{\footskip}{30pt}
\pagenumbering{gobble}
\begin{document}
\title{Filter optimization for real time digital processing of radiofrequency signals: application
to oscillator metrology}

\author{\IEEEauthorblockN{A. Hugeat\IEEEauthorrefmark{1}\IEEEauthorrefmark{2}, J. Bernard\IEEEauthorrefmark{2},
G. Goavec-M\'erou\IEEEauthorrefmark{1},
P.-Y. Bourgeois\IEEEauthorrefmark{1}, J.-M. Friedt\IEEEauthorrefmark{1}}\\
\IEEEauthorblockA{\IEEEauthorrefmark{1}FEMTO-ST, Time \& Frequency department, Besan\c con, France }\\
\IEEEauthorblockA{\IEEEauthorrefmark{2}FEMTO-ST, Computer Science department DISC, Besan\c con, France \\
Email: \{pyb2,jmfriedt\}@femto-st.fr}
}
\maketitle
\thispagestyle{plain}
\pagestyle{plain}
\newtheorem{definition}{Definition}

\begin{abstract}
Software Defined Radio (SDR) provides stability, flexibility and reconfigurability to
radiofrequency signal processing. When SDR is applied to oscillator characterization in the context
of ultrastable clocks, stringent filtering requirements are set by spurious signal or
noise rejection needs. Since real time radiofrequency processing must be performed in a
Field Programmable Gate Array (FPGA) to meet timing constraints, we investigate optimization strategies
to design filters meeting rejection characteristics while limiting the hardware resources
required and keeping timing constraints within the targeted measurement bandwidths. The
presented technique is applicable to scheduling any sequence of processing blocks characterized
by a throughput, resource occupation and performance tabulated as a function of configuration
characteristics, as is the case for filters with their coefficients and resolution yielding
rejection and number of multipliers.
\end{abstract}

\begin{IEEEkeywords}
Software Defined Radio, Mixed-Integer Linear Programming, Finite Impulse Response filter
\end{IEEEkeywords}

\section{Digital signal processing of ultrastable clock signals}

Analog oscillator phase noise characterization is classically performed by downconverting
the radiofrequency signal to baseband using a saturated mixer,
followed by a Fourier analysis of the beat signal to analyze phase fluctuations close to the carrier. In
a fully digital approach, the radiofrequency signal is digitized and numerically downconverted by
multiplying the samples with a local numerically controlled oscillator (Fig. \ref{schema}) \cite{rsi}.

\begin{figure}[h!tb]
\begin{center}
\includegraphics[width=.8\linewidth]{images/schema}
\end{center}
\caption{Fully digital oscillator phase noise characterization: the Device Under Test
(DUT) signal is sampled by the radiofrequency grade Analog to Digital Converter (ADC) and
downconverted by mixing with a Numerically Controlled Oscillator (NCO). Unwanted signals
and noise aliases are rejected by a Low Pass Filter (LPF) implemented as a cascade of Finite
Impulse Response (FIR) filters. The signal is then decimated before a Fourier analysis displays
the spectral characteristics of the phase fluctuations.}
\label{schema}
\end{figure}

As with the analog mixer,
the non-linear behavior of the downconverter introduces noise or spurious signal aliasing as
well as the generation of the frequency sum signal in addition to the frequency difference.
These unwanted spectral components must be rejected before decimating the data stream
for the phase noise spectral characterization \cite{andrich2018high}. The filtering introduced between the
downconverter
and the decimation processing blocks is a core characteristic of an oscillator characterization
system, and must reject out-of-band signals below the targeted phase noise -- typically in the
sub $-170$~dBc/Hz range for the ultrastable oscillators we aim at characterizing. The filter blocks will
use most of the resources of the Field Programmable Gate Array (FPGA) used to process the radiofrequency
datastream: optimizing the performance of the filter while reducing the needed resources is
hence tackled in a systematic approach using optimization techniques. Most significantly, we
tackle the issue by attempting to cascade multiple Finite Impulse Response (FIR) filters with
a tunable number of coefficients and a tunable number of bits representing the coefficients and the
data being processed.

\section{Finite impulse response filter}

We select FIR filters for their unconditional stability and ease of design. A FIR filter is defined
by a set of weights $b_k$ applied to the inputs $x_n$ through a convolution to generate the
outputs $y_n$:
\begin{align}
y_n=\sum_{k=0}^N b_k x_{n-k}
\label{eq:fir_equation}
\end{align}

As opposed to an implementation on a general purpose processor in which word size is defined by the
processor architecture, implementing such a filter on an FPGA offers more degrees of freedom since
not only the coefficient values and number of taps must be defined, but also the number of bits
defining the coefficients and the sample size. For this reason, and because we consider pipeline
processing (as opposed to First-In, First-Out (FIFO) memory batch processing) of radiofrequency
signals, High Level Synthesis (HLS) languages \cite{kasbah2008multigrid} are not considered; instead,
the problem is tackled at the Very-high-speed-integrated-circuit Hardware Description Language
(VHDL) level.
{\color{red}Since latency is not an issue in an open-loop phase noise characterization instrument,
the large
number of taps in the FIR, as opposed to the shorter Infinite Impulse Response (IIR) filter,
is not considered an issue, as it would be in a closed-loop system.} % r2.4
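
As a minimal illustration of Eq. (\ref{eq:fir_equation}), assuming $N=2$ (a value chosen here only for the example), each output sample expands to
\begin{align*}
y_n = b_0 x_n + b_1 x_{n-1} + b_2 x_{n-2}
\end{align*}
so that $N+1=3$ multiplications and $N=2$ additions are needed per output sample. In a pipelined implementation producing one output per clock cycle, and assuming that multipliers are neither shared between taps nor saved by exploiting coefficient symmetry, the multiplier count hence grows linearly with the number of taps.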

The coefficients are classically expressed as floating point values. However, this binary
number representation is not efficient for fast arithmetic computation by an FPGA. Instead,
we choose to quantize these floating point values into integer values. This quantization
results in some precision loss.

\begin{figure}[h!tb]
\includegraphics[width=\linewidth]{images/zero_values}
\caption{Impact of the quantization resolution of the coefficients: the quantization is
set to 6~bits -- with the horizontal black lines indicating $\pm$1 least significant bit -- setting
the 30~first and 30~last coefficients out of the initial 128~band-pass
filter coefficients to 0 (red dots).}
\label{float_vs_int}
\end{figure}

The tradeoff between quantization resolution and number of coefficients when considering
integer operations is not trivial. As an illustration of the
relation between the number of filter taps and quantization, Fig. \ref{float_vs_int} exhibits
a 128-coefficient FIR bandpass filter designed using floating point numbers (blue). Upon
quantization to 6-bit integers, 60 of the 128~coefficients at the beginning and end of the
tap sequence become null, {\color{red}making the large number of coefficients irrelevant: processing
resources % r1.1
are hence saved by shrinking the filter length.} This tradeoff, aimed at minimizing resources
to reach a given rejection level or maximizing out-of-band rejection for a given computational
resource, will drive the investigation on cascading filters designed with varying tap resolution
and tap length, as will be shown in the next section. Indeed, our development strategy closely
follows the skeleton approach \cite{crookes1998environment, crookes2000design, benkrid2002towards}
in which basic blocks are defined and characterized before being assembled \cite{hide}
in a complete processing chain. In our case, assembling the filter blocks is a simpler block
combination process since we assume a single value to be processed and a single value to be
generated at each clock cycle. The FIR filters are not considered to decimate in the
current implementation: for now, the decimation is assumed to be located after the FIR cascade.

\section{Methodology description}

Our objective is to develop a new methodology applicable to any Digital Signal Processing (DSP)
chain obtained by assembling basic processing blocks, with hardware and manufacturer independence.
Achieving such a target requires defining an abstract model to represent some basic properties
of DSP blocks such as performance (i.e. rejection or ripples in the bandpass for filters) and
resource occupation. These abstract properties, not necessarily related to the detailed hardware
implementation of a given platform, will feed a scheduler solver aimed at assembling the optimum
target, whether in terms of maximizing performance for a given arbitrary resource occupation, or
minimizing resource occupation for a given performance. In our approach, the solution of the
solver is then synthesized using the dedicated tool provided by each platform manufacturer
to assess the validity of our abstract resource occupation indicator, and the result of running
the DSP chain on the FPGA allows for assessing the performance of the scheduler. We emphasize
that all solutions found by the solver are synthesized and executed on hardware at the end
of the analysis.
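
The two objectives mentioned in this paragraph can be stated compactly. As a sketch only, with symbols introduced here for illustration and not taken from the detailed model, let $\mathbf{c}$ denote a configuration of the processing chain, $R(\mathbf{c})$ its performance (e.g. rejection in dB) and $A(\mathbf{c})$ its abstract resource occupation:
\begin{align*}
\max_{\mathbf{c}} \; R(\mathbf{c}) \quad &\text{subject to} \quad A(\mathbf{c}) \leq A_{\max}, \\
\min_{\mathbf{c}} \; A(\mathbf{c}) \quad &\text{subject to} \quad R(\mathbf{c}) \geq R_{\min},
\end{align*}
where $A_{\max}$ is the allowed resource budget and $R_{\min}$ the targeted performance.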

In this demonstration, we focus on only two operations: filtering and shifting the number of
bits needed to represent the data along the processing chain.
We have chosen these basic operations because shifting and filtering have already been studied
in the literature \cite{lim_1996, lim_1988, young_1992, smith_1998}, providing a framework for
assessing our results. Furthermore, filtering is a core step in any radiofrequency frontend
requiring pipelined processing at full bandwidth for the earliest steps, including for
time and frequency transfer or characterization \cite{carolina1,carolina2,rsi}.

Addressing only two operations allows for demonstrating the methodology but should not be
considered as a limitation of the framework, which can be extended to assembling any number
of skeleton blocks as long as performance and resource occupation can be determined. {\color{red}
Hence,
in this paper we will apply our methodology to simple DSP chains: a white noise input signal % r1.2
is generated using a Pseudo-Random Number (PRN) generator or by sampling with a wideband (125~MS/s)
14-bit Analog to Digital Converter (ADC) loaded by a 50~$\Omega$ resistor.} Once samples have been
digitized at a rate of 125~MS/s, filtering is applied to qualify the processing block performance --
practically meeting the radiofrequency frontend requirement of noise and bandwidth reduction
by filtering and decimating. Finally, bursts of filtered samples are stored for post-processing,
allowing us either to assess the filter rejection for a given resource usage, or to validate the rejection
when implementing a solution minimizing resource occupation.

{\color{red}
The first step of our approach is to model the DSP chain. Since we aim at only optimizing % r1.3
the filtering part of the signal processing chain, we have not included the PRN generator or the
ADC in the model: the input data size and rate are considered fixed and defined by the hardware.
The filtering can be done in two ways, either by considering a single monolithic FIR filter
requiring many coefficients to reach the targeted noise rejection ratio, or by
cascading multiple FIR filters, each with fewer coefficients than found in the monolithic filter.}

After each filter we leave the possibility of shifting the filtered data to consume
fewer resources. Hence, in the case of cascaded filters, we define a stage as a filter
and a shifter (the shift could be omitted if we do not need to divide the filtered data).

\subsection{Model of a FIR filter}

A cascade of filters is composed of $n$ FIR stages. In stage $i$ ($1 \leq i \leq n$)
the FIR has $C_i$ coefficients and each coefficient is an integer value with $\pi^C_i$
bits while the filtered data are shifted by $\pi^S_i$ bits. We also define $\pi^-_i$ as
the size of input data and $\pi^+_i$ as the size of output data. Figure~\ref{fig:fir_stage}
shows a filtering stage.
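
As a worked illustration of these data sizes, and under assumptions that are ours rather than part of the model (signed integer arithmetic, full-precision accumulation, and the example values $\pi_i^-=14$, $\pi_i^C=8$ and $C_i=32$), each product of a sample by a coefficient fits on $\pi_i^- + \pi_i^C = 22$~bits, and the sum of $C_i$ such products fits on at most
\begin{align*}
\pi_i^- + \pi_i^C + \lceil \log_2 C_i \rceil = 14 + 8 + 5 = 27 \text{ bits}.
\end{align*}
This is only a worst-case bound. Discarding the $\pi_i^S$ least significant bits of this result with the shifter reduces the word size handed to the next stage, at the cost of resolution.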

\begin{figure}
\centering
\begin{tikzpicture}[node distance=2cm]
\node[draw,minimum size=1.3cm] (FIR) { $C_i, \pi_i^C$ } ;
\node[draw,minimum size=1.3cm] (Shift) [right of=FIR, ] { $\pi_i^S$ } ;
\node (Start) [left of=FIR] { } ;
\node (End) [right of=Shift] { } ;

\node[draw,fit=(FIR) (Shift)] (Filter) { } ;

\draw[->] (Start) edge node [above] { $\pi_i^-$ } (FIR) ;
\draw[->] (FIR) -- (Shift) ;
\draw[->] (Shift) edge node [above] { $\pi_i^+$ } (End) ;
\end{tikzpicture}
\caption{A single filter is composed of a FIR (on the left) and a Shifter (on the right)}
\label{fig:fir_stage}
\end{figure}

FIR $i$ has been characterized through numerical simulation as able to reject $F(C_i, \pi_i^C)$ dB.
This rejection has been computed using the FIR coefficient design functions of the GNU Octave software
(\texttt{firls} and \texttt{fir1}).
For each configuration $(C_i, \pi_i^C)$, we first create a FIR with floating point coefficients and a given number $C_i$ of coefficients.
Then, the floating point coefficients are discretized into integers. In order to ensure that the coefficients are effectively coded on $\pi_i^C$~bits,
the coefficients are normalized by their absolute maximum before being scaled to integer coefficients.
At least one coefficient is coded on $\pi_i^C$~bits, and in practice only $b_{C_i/2}$ is coded on $\pi_i^C$~bits while the others are coded on much fewer bits.
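
The normalization and scaling step can be sketched as follows. The exact rounding rule is not specified here, so we assume rounding to the nearest integer and a signed representation in which the largest coefficient maps to $2^{\pi_i^C-1}-1$; $\tilde b_k$ is a notation introduced for this sketch to denote the floating point coefficients, $b_k$ being the resulting integers:
\begin{align*}
b_k = \mathrm{round}\left( \frac{\tilde b_k}{\max_j |\tilde b_j|} \left(2^{\pi_i^C-1}-1\right) \right).
\end{align*}
With $\pi_i^C=6$, the scale factor is $2^5-1=31$, and under these assumptions any coefficient with $|\tilde b_k| < \max_j |\tilde b_j| / 62$ is rounded to 0, which is one way the outer taps of Fig. \ref{float_vs_int} can vanish.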

With these coefficients, the \texttt{freqz} function is used to estimate the magnitude of the filter
transfer function.
Comparing the performance of FIRs however requires defining a unique criterion. As shown in figure~\ref{fig:fir_mag},
the FIR magnitude exhibits two parts: we focus here on the transition width and the rejection rather than on the
bandpass ripples as emphasized in \cite{lim_1988,lim_1996}. {\color{red}Throughout this demonstration,
we arbitrarily set a passband covering the first 40\% of the Nyquist frequency and a stopband from 60\%
of the Nyquist frequency to the end of the band, as would be typically selected to prevent
aliasing before decimating the dataflow by 2. The method can however be generalized to any filter
shape as long as it is defined from the initial modelling steps: Fig. \ref{fig:rejection_pyramid}
as described below is indeed unique for each filter shape.}
	254	254
\begin{figure}	255	255	\begin{figure}
\begin{center}	256	256	\begin{center}
\scalebox{0.8}{	257	257	\scalebox{0.8}{
\centering	258	258	\centering
\begin{tikzpicture}[scale=0.3]	259	259	\begin{tikzpicture}[scale=0.3]
\draw[<->] (0,15) -- (0,0) -- (21,0) ;	260	260	\draw[<->] (0,15) -- (0,0) -- (21,0) ;
\draw[thick] (0,12) -- (8,12) -- (20,0) ;	261	261	\draw[thick] (0,12) -- (8,12) -- (20,0) ;
	262	262
\draw (0,14) node [left] { $P$ } ;	263	263	\draw (0,14) node [left] { $P$ } ;
\draw (20,0) node [below] { $f$ } ;	264	264	\draw (20,0) node [below] { $f$ } ;
	265	265
\draw[>=latex,<->] (0,14) -- (8,14) ;	266	266	\draw[>=latex,<->] (0,14) -- (8,14) ;
\draw (4,14) node [above] { passband } node [below] { $40\%$ } ;	267	267	\draw (4,14) node [above] { passband } node [below] { $40\%$ } ;
	268	268
\draw[>=latex,<->] (8,14) -- (12,14) ;	269	269	\draw[>=latex,<->] (8,14) -- (12,14) ;
\draw (10,14) node [above] { transition } node [below] { $20\%$ } ;	270	270	\draw (10,14) node [above] { transition } node [below] { $20\%$ } ;
	271	271
\draw[>=latex,<->] (12,14) -- (20,14) ;	272	272	\draw[>=latex,<->] (12,14) -- (20,14) ;
\draw (16,14) node [above] { stopband } node [below] { $40\%$ } ;	273	273	\draw (16,14) node [above] { stopband } node [below] { $40\%$ } ;
	274	274
\draw[>=latex,<->] (16,12) -- (16,8) ;	275	275	\draw[>=latex,<->] (16,12) -- (16,8) ;
\draw (16,10) node [right] { rejection } ;	276	276	\draw (16,10) node [right] { rejection } ;
	277	277
\draw[dashed] (8,-1) -- (8,14) ;	278	278	\draw[dashed] (8,-1) -- (8,14) ;
\draw[dashed] (12,-1) -- (12,14) ;	279	279	\draw[dashed] (12,-1) -- (12,14) ;
	280	280
\draw[dashed] (8,12) -- (16,12) ;	281	281	\draw[dashed] (8,12) -- (16,12) ;
\draw[dashed] (12,8) -- (16,8) ;	282	282	\draw[dashed] (12,8) -- (16,8) ;
	283	283
\end{tikzpicture}	284	284	\end{tikzpicture}
}	285	285	}
\end{center}	286	286	\end{center}
\caption{Shape of the filter transmitted power $P$ as a function of frequency $f$:	287	287	\caption{Shape of the filter transmitted power $P$ as a function of frequency $f$:
the passband is considered to occupy the initial 40\% of the Nyquist frequency range,	288	288	the passband is considered to occupy the initial 40\% of the Nyquist frequency range,
the stopband the last 40\%, allowing 20\% transition width.}	289	289	the stopband the last 40\%, allowing 20\% transition width.}
\label{fig:fir_mag}	290	290	\label{fig:fir_mag}
\end{figure}	291	291	\end{figure}
	292	292
In the transition band, the behavior of the filter is left free: we only {\color{red}define} the passband and the stopband characteristics.	293	293	In the transition band, the behavior of the filter is left free: we only {\color{red}define} the passband and the stopband characteristics.
% r2.7	294	294	% r2.7
% Our initial criterion considered the mean value of the stopband rejection, as shown in figure~\ref{fig:mean_criterion}. This criterion	295	295	% Our initial criterion considered the mean value of the stopband rejection, as shown in figure~\ref{fig:mean_criterion}. This criterion
% yields unacceptable results since notches overestimate the rejection capability of the filter. Furthermore, the losses within	296	296	% yields unacceptable results since notches overestimate the rejection capability of the filter. Furthermore, the losses within
% the passband are not considered and might be excessive for excessively wide transitions widths introduced for filters with few coefficients.	297	297	% the passband are not considered and might be excessive for excessively wide transitions widths introduced for filters with few coefficients.
Our criterion to compute the filter rejection considers	298	298	Our criterion to compute the filter rejection considers
% r2.8 et r2.2 r2.3	299	299	% r2.8 et r2.2 r2.3
the {\color{red}minimal} rejection within the stopband, from which the {\color{red}sum of the absolute values	300	300	the {\color{red}minimal} rejection within the stopband, from which the {\color{red}sum of the absolute values
within the passband is subtracted to avoid filters with excessive ripples}. With this	301	301	within the passband is subtracted to avoid filters with excessive ripples, normalized to the
criterion, we meet the expected rejection capability of low pass filters as shown in figure~\ref{fig:custom_criterion}.	302	302	bin width to remain consistent with the passband criterion (dBc/Hz units in all cases)}. With this
	303	303	criterion, we meet the expected rejection capability of low pass filters as shown in figure~\ref{fig:custom_criterion}.
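
One possible reading of this criterion is sketched below. It is only an interpretation: we assume the response is expressed in dB relative to the DC gain, that the passband term is the mean of the absolute passband values (the sum normalized by the number of bins), and that the band edges are the 40\% and 60\% of Nyquist used throughout. The function name and the windowed-sinc test filters are hypothetical.

```python
import numpy as np

def rejection_criterion(b, nfft=2048, passband=0.4, stopband=0.6):
    """Worst-case stopband rejection (dB) minus mean absolute passband deviation (dB).

    Band edges are given as fractions of the Nyquist frequency.
    """
    h = np.fft.rfft(b, nfft)                      # zero-padded frequency response
    f = np.linspace(0.0, 1.0, h.size)             # 0 .. Nyquist
    mag_db = 20 * np.log10(np.maximum(np.abs(h), 1e-12))
    mag_db = mag_db - mag_db[0]                   # refer to the DC gain
    stop_rejection = -np.max(mag_db[f >= stopband])
    pass_penalty = np.mean(np.abs(mag_db[f <= passband]))
    return stop_rejection - pass_penalty

def windowed_sinc(ntaps, cutoff=0.5):
    """Hamming-windowed sinc low-pass, cutoff as a fraction of Nyquist."""
    n = np.arange(ntaps) - (ntaps - 1) / 2
    return cutoff * np.sinc(cutoff * n) * np.hamming(ntaps)

short_score = rejection_criterion(windowed_sinc(11))
long_score = rejection_criterion(windowed_sinc(41))
print(short_score, long_score)
```

Taking the maximum of the stopband magnitude, rather than its mean, keeps notches from inflating the score.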
% \begin{figure}	304	304
% \centering	305	305	% \begin{figure}
% \includegraphics[width=\linewidth]{images/colored_mean_criterion}	306	306	% \centering
% \caption{Mean stopband rejection criterion comparison between monolithic filter and cascaded filters}	307	307	% \includegraphics[width=\linewidth]{images/colored_mean_criterion}
% \label{fig:mean_criterion}	308	308	% \caption{Mean stopband rejection criterion comparison between monolithic filter and cascaded filters}
% \end{figure}	309	309	% \label{fig:mean_criterion}
	310	310	% \end{figure}
\begin{figure}	311	311
\centering	312	312	\begin{figure}
\includegraphics[width=\linewidth]{images/colored_custom_criterion}	313	313	\centering
\caption{Custom criterion (maximum rejection in the stopband minus the mean of the absolute value of the passband rejection)	314	314	\includegraphics[width=\linewidth]{images/colored_custom_criterion}
comparison between monolithic filter and cascaded filters}	315	315	\caption{Custom criterion (maximum rejection in the stopband minus the {\color{red} sum of the
\label{fig:custom_criterion}	316	316	absolute values of the passband rejection normalized to the bandwidth})
\end{figure}	317	317	comparison between monolithic filter and cascaded filters}
	318	318	\label{fig:custom_criterion}
Thanks to the latter criterion which will be used in the remainder of this paper, we are able to automatically generate multiple FIR taps	319	319	\end{figure}
and estimate their rejection. Figure~\ref{fig:rejection_pyramid} exhibits the	320	320
rejection as a function of the number of coefficients and the number of bits representing these coefficients.	321	321	Thanks to the latter criterion which will be used in the remainder of this paper, we are able to automatically generate multiple FIR taps
The curve shaped as a pyramid exhibits optimum configuration sets at the vertex where both edges meet.	322	322	and estimate their rejection. Figure~\ref{fig:rejection_pyramid} exhibits the
Indeed, for a given number of coefficients, increasing the number of bits beyond the edge will not improve the rejection.	323	323	rejection as a function of the number of coefficients and the number of bits representing these coefficients.
Conversely, for a given number of bits, increasing the number of coefficients will not improve	324	324	The curve shaped as a pyramid exhibits optimum configuration sets at the vertex where both edges meet.
the rejection. Hence the best coefficient sets lie on the vertex of the pyramid.	325	325	Indeed, for a given number of coefficients, increasing the number of bits beyond the edge will not improve the rejection.
	326	326	Conversely, for a given number of bits, increasing the number of coefficients will not improve
\begin{figure}	327	327	the rejection. Hence the best coefficient sets lie on the vertex of the pyramid.
\centering	328	328
\includegraphics[width=\linewidth]{images/rejection_pyramid}	329	329	\begin{figure}
\caption{Rejection as a function of number of coefficients and number of bits}	330	330	\centering
\label{fig:rejection_pyramid}	331	331	\includegraphics[width=\linewidth]{images/rejection_pyramid}
\end{figure}	332	332	\caption{{\color{red}{Filter}} rejection as a function of number of coefficients and number of bits
	333	333	{\color{red}: this lookup table will be used to identify which filter parameters -- number of bits
Although we have an efficient criterion to estimate the rejection of one set of coefficients (taps),	334	334	representing coefficients and number of coefficients -- best match the targeted transfer function.}}
we have a problem when we cascade filters and estimate the criterion as a sum of two or more individual criteria.	335	335	\label{fig:rejection_pyramid}
If the FIR filter coefficients are the same between the stages, we have:	336	336	\end{figure}
$$F_{total} = F_1 + F_2$$	337	337
But selecting two different sets of coefficients will yield a more complex situation in which	338	338	Although we have an efficient criterion to estimate the rejection of one set of coefficients (taps),
the previous relation is no longer valid as illustrated in figure~\ref{fig:sum_rejection}. The red and blue curves	339	339	we have a problem when we cascade filters and estimate the criterion as a sum of two or more individual criteria.
are two different filters with maximums and notches not located at the same frequency offsets.	340	340	If the FIR filter coefficients are the same between the stages, we have:
Hence when summing the transfer functions, the resulting rejection shown as the dashed yellow line is improved	341	341	$$F_{total} = F_1 + F_2$$
with respect to a basic sum of the rejection criteria shown as the dotted yellow line.	342	342	But selecting two different sets of coefficients will yield a more complex situation in which
% r2.9	343	343	the previous relation is no longer valid as illustrated on figure~\ref{fig:sum_rejection}. The red and blue curves
Thus, estimating the rejection of filter cascades is more complex than taking the sum of all the rejection	344	344	are two different filters with maximums and notches not located at the same frequency offsets.
criteria of each filter. However, since this sum underestimates the rejection capability of the cascade,	345	345	Hence when summing the transfer functions, the resulting rejection shown as the dashed yellow line is improved
% r2.10	346	346	with respect to a basic sum of the rejection criteria shown as the dotted yellow line.
this upper bound is considered as a conservative and acceptable criterion for deciding on the suitability	347	347	% r2.9
of the filter cascade to meet design criteria.	348	348	Thus, estimating the rejection of filter cascades is more complex than taking the sum of all the rejection
	349	349	criteria of each filter. However since the {\color{red}individual filter rejection} sum underestimates the rejection capability of the cascade,
\begin{figure}	350	350	% r2.10
\centering	351	351	this upper bound is considered as a conservative and acceptable criterion for deciding on the suitability
\includegraphics[width=\linewidth]{images/cascaded_criterion}	352	352	of the filter cascade to meet design criteria.
\caption{Rejection of two cascaded filters}	353	353
\label{fig:sum_rejection}	354	354	\begin{figure}
\end{figure}	355	355	\centering
	356	356	\includegraphics[width=\linewidth]{images/cascaded_criterion}
% r2.6	357	357	\caption{{\color{red}Transfer function of individual filters and after cascading} the two filters,
Finally, in our case, we consider that the input signal is fully known, so the	358	358	{\color{red}demonstrating that the selected criterion of maximum rejection in the bandstop (horizontal
resolution of the data stream is fixed and remains the same for all experiments	359	359	lines) is met. Notice that the cascaded filter has better rejection than summing the bandstop
in this paper.	360	360	maximum of each individual filter.}
	361	361	}
Based on this analysis, we address the estimate of resource consumption (called	362	362	\label{fig:sum_rejection}
% r2.11	363	363	\end{figure}
silicon area -- in the case of FPGAs this means processing cells) as a function of	364	364
filter characteristics. As a reminder, we do not aim at matching actual hardware	365	365	% r2.6
configuration but consider an arbitrary silicon area occupied by each processing function,	366	366	{\color{red}
and will assess after synthesis how well this arbitrary unit matches actual	367	367	Finally, in our case, we consider that the input signal is fully known. The
hardware resources provided by FPGA manufacturers. The sum of individual processing	368	368	resolution of the input data stream is fixed and remains the same for all experiments
unit areas is constrained by a total silicon area representative of FPGA global resources.	369	369	in this paper.}
Formally, variable $a_i$ is the area taken by filter~$i$	370	370
(in arbitrary unit). Variable $r_i$ is the rejection of filter~$i$ (in dB).	371	371	Based on this analysis, we address the estimate of resource consumption (called
Constant $\mathcal{A}$ is the total available area. We model our problem as follows:	372	372	% r2.11
	373	373	silicon area -- in the case of FPGAs this means processing cells) as a function of
\begin{align}	374	374	filter characteristics. As a reminder, we do not aim at matching actual hardware
\text{Maximize } & \sum_{i=1}^n r_i \notag \\	375	375	configuration but consider an arbitrary silicon area occupied by each processing function,
\sum_{i=1}^n a_i & \leq \mathcal{A} & \label{eq:area} \\	376	376	and will assess after synthesis how well this arbitrary unit matches actual
a_i & = C_i \times (\pi_i^C + \pi_i^-), & \forall i \in [1, n] \label{eq:areadef} \\	377	377	hardware resources provided by FPGA manufacturers. The sum of individual processing
r_i & = F(C_i, \pi_i^C), & \forall i \in [1, n] \label{eq:rejectiondef} \\	378	378	unit areas is constrained by a total silicon area representative of FPGA global resources.
\pi_i^+ & = \pi_i^- + \pi_i^C - \pi_i^S, & \forall i \in [1, n] \label{eq:bits} \\	379	379	Formally, variable $a_i$ is the area taken by filter~$i$
\pi_{i - 1}^+ & = \pi_i^-, & \forall i \in [2, n] \label{eq:inout} \\	380	380	(in arbitrary unit). Variable $r_i$ is the rejection of filter~$i$ (in dB).
\pi_i^+ & \geq 1 + \sum_{k=1}^{i} \left(1 + \frac{r_k}{6}\right), & \forall i \in [1, n] \label{eq:maxshift} \\	381	381	Constant $\mathcal{A}$ is the total available area. We model our problem as follows:
\pi_1^- &= \Pi^I \label{eq:init}	382	382
\end{align}	383	383	\begin{align}
	384	384	\text{Maximize } & \sum_{i=1}^n r_i \notag \\
Equation~\ref{eq:area} states that the total area taken by the filters must be	385	385	\sum_{i=1}^n a_i & \leq \mathcal{A} & \label{eq:area} \\
less than the available area. Equation~\ref{eq:areadef} gives the definition of	386	386	a_i & = C_i \times (\pi_i^C + \pi_i^-), & \forall i \in [1, n] \label{eq:areadef} \\
the area used by a filter, considered as the area of the FIR since the Shifter is	387	387	r_i & = F(C_i, \pi_i^C), & \forall i \in [1, n] \label{eq:rejectiondef} \\
assumed not to require significant resources. We consider that the FIR needs $C_i$ registers of size	388	388	\pi_i^+ & = \pi_i^- + \pi_i^C - \pi_i^S, & \forall i \in [1, n] \label{eq:bits} \\
$\pi_i^C + \pi_i^-$~bits to store the results of the multiplications of the	389	389	\pi_{i - 1}^+ & = \pi_i^-, & \forall i \in [2, n] \label{eq:inout} \\
input data with the coefficients. Equation~\ref{eq:rejectiondef} gives the	390	390	\pi_i^+ & \geq 1 + \sum_{k=1}^{i} \left(1 + \frac{r_k}{6}\right), & \forall i \in [1, n] \label{eq:maxshift} \\
definition of the rejection of the filter thanks to the tabulated function~$F$ that we defined	391	391	\pi_1^- &= \Pi^I \label{eq:init}
previously. The Shifter does not introduce negative rejection as we will explain later,	392	392	\end{align}
so the rejection only comes from the FIR. Equation~\ref{eq:bits} states the	393	393
relation between $\pi_i^+$ and $\pi_i^-$. The multiplications in the FIR add	394	394	Equation~\ref{eq:area} states that the total area taken by the filters must be
$\pi_i^C$ bits as most coefficients are close to zero, and the Shifter removes	395	395	less than the available area. Equation~\ref{eq:areadef} gives the definition of
$\pi_i^S$ bits. Equation~\ref{eq:inout} states that the output number of bits of	396	396	the area used by a filter, considered as the area of the FIR since the Shifter is
a filter is the same as the input number of bits of the next filter.	397	397	assumed not to require significant resources. We consider that the FIR needs $C_i$ registers of size
Equation~\ref{eq:maxshift} ensures that the Shifter does not introduce negative	398	398	$\pi_i^C + \pi_i^-$~bits to store the results of the multiplications of the
rejection. Indeed, the results of the FIR can be right shifted without compromising	399	399	input data with the coefficients. Equation~\ref{eq:rejectiondef} gives the
the quality of the rejection until a threshold. Each bit of the output data	400	400	definition of the rejection of the filter thanks to the tabulated function~$F$ that we defined
increases the maximum rejection level by 6~dB. We add one to take the sign bit	401	401	previously. The Shifter does not introduce negative rejection as we will explain later,
into account. If equation~\ref{eq:maxshift} was not present, the Shifter could	402	402	so the rejection only comes from the FIR. Equation~\ref{eq:bits} states the
shift too much and introduce some noise in the output data. Each supplementary	403	403	relation between $\pi_i^+$ and $\pi_i^-$. The multiplications in the FIR add
shift bit would cause an additional 6~dB rejection rise. A totally equivalent equation is:	404	404	$\pi_i^C$ bits as most coefficients are close to zero, and the Shifter removes
$\pi_i^S \leq \pi_i^- + \pi_i^C - 1 - \sum_{k=1}^{i} \left(1 + \frac{r_k}{6}\right)$.	405	405	$\pi_i^S$ bits. Equation~\ref{eq:inout} states that the output number of bits of
Finally, equation~\ref{eq:init} gives the number of bits of the global input.	406	406	a filter is the same as the input number of bits of the next filter.
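
To make the bit-width bookkeeping of equations~\ref{eq:areadef}, \ref{eq:bits}, \ref{eq:inout} and \ref{eq:maxshift} concrete, the sketch below propagates the data width through a cascade, choosing at each stage the smallest output width allowed by equation~\ref{eq:maxshift}, i.e. the largest shift. The function name, the two-stage example and its rejection values are hypothetical, and rounding the bound up to an integer number of bits is our assumption.

```python
import math

def propagate_bits(stages, input_bits):
    """stages: list of (C, pi_C, r_dB) tuples. Returns one dict per stage."""
    results = []
    pi_in = input_bits                     # pi_1^- = Pi^I
    bound = 1.0                            # leading 1: sign bit
    for C, pi_C, r in stages:
        bound += 1 + r / 6                 # add this stage's term of the sum
        pi_out = math.ceil(bound)          # smallest admissible pi_i^+
        shift = pi_in + pi_C - pi_out      # largest admissible pi_i^S
        if shift < 0:
            raise ValueError("stage cannot provide the required output width")
        results.append({"area": C * (pi_C + pi_in), "pi_in": pi_in,
                        "pi_out": pi_out, "shift": shift})
        pi_in = pi_out                     # pi_{i+1}^- = pi_i^+
    return results

# Hypothetical cascade: (number of taps, coefficient bits, rejection in dB)
stages = [(16, 8, 30.0), (32, 10, 36.0)]
result = propagate_bits(stages, 16)
for stage in result:
    print(stage)
```

In this example the second stage costs $32 \times (10 + 7)$ area units because its input width is the first stage's output width rather than the global input width.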
	407	407	Equation~\ref{eq:maxshift} ensures that the Shifter does not introduce negative
{\color{red}	408	408	rejection. Indeed, the results of the FIR can be right shifted without compromising
This model is non-linear since we multiply some variable with another variable	409	409	the quality of the rejection until a threshold. Each bit of the output data
and it is even non-quadratic, as $F$ does not have a known	410	410	increases the maximum rejection level by 6~dB. We add one to take the sign bit
linear or quadratic expression. To linearize this problem, we introduce $p$ FIR configurations.	411	411	into account. If equation~\ref{eq:maxshift} was not present, the Shifter could
This variable must be defined by the user; it represents the number of different	412	412	shift too much and introduce some noise in the output data. Each supplementary
sets of coefficients generated (as a reminder, we use \texttt{firls} and \texttt{fir1}	413	413	shift bit would cause an additional 6~dB rejection rise. A totally equivalent equation is:
functions from GNU Octave). To choose this value, we consider a subset of figure~\ref{fig:rejection_pyramid}	414	414	$\pi_i^S \leq \pi_i^- + \pi_i^C - 1 - \sum_{k=1}^{i} \left(1 + \frac{r_k}{6}\right)$.
to restrict the number of configurations. Indeed, it is useless to have too many coefficients or	415	415	Finally, equation~\ref{eq:init} gives the number of bits of the global input.
too many bits, hence we take the configurations close to the edge of the pyramid. Thanks to these	416	416	
configurations, $C_{ij}$ and $\pi_{ij}^C$ ($1 \leq j \leq p$) become constants	417	417	{\color{red}
and the function $F$ can be estimated for each configuration	418	418	This model is non-linear since we multiply some variable with another variable
thanks to our rejection criterion. We also define the binary	419	419	and it is even non-quadratic, as the cost function $F$ does not have a known
variable $\delta_{ij}$ that has value 1 if stage~$i$ is in configuration~$j$	420	420	linear or quadratic expression. To linearize this problem, we introduce $p$ FIR configurations.
		421	% AH: merge conflict
		422	% This variable must be defined by the user, it represent the number of different
		423	% set of coefficients generated (for memory, we use \texttt{firls} and \texttt{fir1}
		424	% functions from GNU Octave). To choose this value, we consider a subset of the figure~\ref{fig:rejection_pyramid}
		425	% to restrict the number of configurations. Indeed, it is useless to have too many coefficients or
		426	% too many bits, hence we take the configurations close to edge of pyramid. Thank to theses
		427	% configurations $C_{ij}$ and $\pi_{ij}^C$ ($1 \leq j \leq p$) become constant
		428	% and the function $F$ can be estimate for each configurations
		429	% thanks our rejection criterion. We also defined binary
and 0 otherwise. The new equations are as follows:	421	430	This variable $p$ is defined by the user, and represents the number of different
}	422	431	sets of coefficients generated (as a reminder, we use \texttt{firls} and \texttt{fir1}
	423	432	functions from GNU Octave) based on the targeted filter characteristics and implementation
\begin{align}	424	433	assumptions (estimated number of bits defining the coefficients). Hence, $C_{ij}$ and
a_i & = \sum_{j=1}^p \delta_{ij} \times C_{ij} \times (\pi_{ij}^C + \pi_i^-), & \forall i \in [1, n] \label{eq:areadef2} \\	425	434	$\pi_{ij}^C$ become constants and
r_i & = \sum_{j=1}^p \delta_{ij} \times F(C_{ij}, \pi_{ij}^C), & \forall i \in [1, n] \label{eq:rejectiondef2} \\	426	435	we define $1 \leq j \leq p$ so that the function $F$ can be estimated (Look Up Table)
\pi_i^+ & = \pi_i^- + \left(\sum_{j=1}^p \delta_{ij} \pi_{ij}^C\right) - \pi_i^S, & \forall i \in [1, n] \label{eq:bits2} \\	427	436	for each configuration thanks to the rejection criterion. We also define the binary
\sum_{j=1}^p \delta_{ij} & \leq 1, & \forall i \in [1, n] \label{eq:config}	428	437	variable $\delta_{ij}$ that has value 1 if stage~$i$ is in configuration~$j$
\end{align}	429	438	and 0 otherwise. The new equations are as follows:
	430	439	}
Equations \ref{eq:areadef2}, \ref{eq:rejectiondef2} and \ref{eq:bits2} replace	431	440
respectively equations \ref{eq:areadef}, \ref{eq:rejectiondef} and \ref{eq:bits}.	432	441	\begin{align}
Equation~\ref{eq:config} states that for each stage, a single configuration is chosen at most.	433	442	a_i & = \sum_{j=1}^p \delta_{ij} \times C_{ij} \times (\pi_{ij}^C + \pi_i^-), & \forall i \in [1, n] \label{eq:areadef2} \\
	434	443	r_i & = \sum_{j=1}^p \delta_{ij} \times F(C_{ij}, \pi_{ij}^C), & \forall i \in [1, n] \label{eq:rejectiondef2} \\
{\color{red}	435	444	\pi_i^+ & = \pi_i^- + \left(\sum_{j=1}^p \delta_{ij} \pi_{ij}^C\right) - \pi_i^S, & \forall i \in [1, n] \label{eq:bits2} \\
However, the problem is still quadratic since in constraint~\ref{eq:areadef2} we multiply	436	445	\sum_{j=1}^p \delta_{ij} & \leq 1, & \forall i \in [1, n] \label{eq:config}
$\delta_{ij}$ and $\pi_i^-$. But since $\delta_{ij}$ is a binary variable we can	437	446	\end{align}
linearize this multiplication. The following formula shows how to linearize	438	447
this situation in the general case with $y$ a binary variable and $x$ a real variable ($0 \leq x \leq X^{max}$):	439	448	Equations \ref{eq:areadef2}, \ref{eq:rejectiondef2} and \ref{eq:bits2} replace
\begin{equation*}	440	449	respectively equations \ref{eq:areadef}, \ref{eq:rejectiondef} and \ref{eq:bits}.
m = x \times y \implies	441	450	Equation~\ref{eq:config} states that for each stage, a single configuration is chosen at most.
\left \{	442	451
\begin{split}	443	452	{\color{red}
m & \geq 0 \\	444	453	% JM: conflict merge
		454	% However the problem remains quadratic at this stage since in the constraint~\ref{eq:areadef2}
		455	% we multiply
		456	% $\delta_{ij}$ and $\pi_i^-$. However, since $\delta_{ij}$ is a binary variable we can
		457	% linearise this multiplication if we can bound $\pi_i^-$. As $\pi_i^-$ is the data size,
		458	% we define $0 < \pi_i^- \leq 128$ which is the maximum data size whose estimation is
		459	% assumed on hardware characteristics.
		460	% The Gurobi (\url{www.gurobi.com}) optimization software used to solve this quadratic
		461	% model is able to linearize the model provided as is. This model
		462	% has $O(np)$ variables and $O(n)$ constraints.}
		463	However the problem remains quadratic at this stage since in the constraint~\ref{eq:areadef2}
m & \leq y \times X^{max} \\	445	464	we multiply
m & \leq x \\	446	465	$\delta_{ij}$ and $\pi_i^-$. However, since $\delta_{ij}$ is a binary variable we can
m & \geq x - (1 - y) \times X^{max} \\	447	466	linearize this multiplication. The following formula shows how to linearize
\end{split}	448	467	this situation in general case with $y$ a binary variable and $x$ a real variable ($0 \leq x \leq X^{max}$):
\right .	449	468	\begin{equation*}
\end{equation*}	450	469	m = x \times y \implies
	451	470	\left \{
		471	\begin{split}
		472	m & \geq 0 \\
		473	m & \leq y \times X^{max} \\
		474	m & \leq x \\
		475	m & \geq x - (1 - y) \times X^{max} \\
		476	\end{split}
		477	\right .
		478	\end{equation*}
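
As a sanity check of the four inequalities above, the following sketch enumerates every integer $x$ in $[0, X^{max}]$ and both values of $y$, and lists the values of $m$ that satisfy all constraints. The function name is hypothetical and the check is restricted to an integer grid for simplicity; the same argument holds for real $x$.

```python
def feasible_products(x, y, x_max):
    """Return every integer m in [0, x_max] satisfying the four linear constraints."""
    return [m for m in range(x_max + 1)
            if m >= 0
            and m <= y * x_max
            and m <= x
            and m >= x - (1 - y) * x_max]

X_MAX = 128  # same bound as the one placed on the data size
all_match = all(feasible_products(x, y, X_MAX) == [x * y]
                for x in range(X_MAX + 1) for y in (0, 1))
print(all_match)
```

For $y = 0$ the first two constraints force $m = 0$, and for $y = 1$ the last two force $m = x$, so the only feasible value is always the product $x \times y$.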
		479	So if we bound $\pi_i^-$ by 128~bits, the maximum data size estimated from
		480	hardware characteristics,
		481	the Gurobi (\url{www.gurobi.com}) optimization software will be able to linearize
		482	for us the quadratic problem so the model is left as is. This model
So if we bound $\pi_i^-$ by 128~bits to represent the maximum data size tolerated,	452	483	has $O(np)$ variables and $O(n)$ constraints.}
the Gurobi (\url{www.gurobi.com}) optimization software will be able to linearize	453	484
for us the quadratic problem so the model is left as is.	454	485	% This model is non-linear and even non-quadratic, as $F$ does not have a known
}	455	486	% linear or quadratic expression. We introduce $p$ FIR configurations
This model has $O(np)$ variables and $O(n)$ constraints.	456	487	% $(C_{ij}, \pi_{ij}^C), 1 \leq j \leq p$ that are constants.
	457	488	% % r2.12
% This model is non-linear and even non-quadratic, as $F$ does not have a known	458	489	% This variable must be defined by the user, it represent the number of different
% linear or quadratic expression. We introduce $p$ FIR configurations	459	490	% set of coefficients generated (for memory, we use \texttt{firls} and \texttt{fir1}
% $(C_{ij}, \pi_{ij}^C), 1 \leq j \leq p$ that are constants.	460	491	% functions from GNU Octave).
% % r2.12	461	492	% We define binary
% This variable must be defined by the user, it represent the number of different	462	493	% variable $\delta_{ij}$ that has value 1 if stage~$i$ is in configuration~$j$
% set of coefficients generated (for memory, we use \texttt{firls} and \texttt{fir1}	463	494	% and 0 otherwise. The new equations are as follows:
% functions from GNU Octave).	464	495	%
% We define binary	465	496	% \begin{align}
% variable $\delta_{ij}$ that has value 1 if stage~$i$ is in configuration~$j$	466	497	% a_i & = \sum_{j=1}^p \delta_{ij} \times C_{ij} \times (\pi_{ij}^C + \pi_i^-), & \forall i \in [1, n] \label{eq:areadef2} \\
% and 0 otherwise. The new equations are as follows:	467	498	% r_i & = \sum_{j=1}^p \delta_{ij} \times F(C_{ij}, \pi_{ij}^C), & \forall i \in [1, n] \label{eq:rejectiondef2} \\
%	468	499	% \pi_i^+ & = \pi_i^- + \left(\sum_{j=1}^p \delta_{ij} \pi_{ij}^C\right) - \pi_i^S, & \forall i \in [1, n] \label{eq:bits2} \\
% \begin{align}	469	500	% \sum_{j=1}^p \delta_{ij} & \leq 1, & \forall i \in [1, n] \label{eq:config}
% a_i & = \sum_{j=1}^p \delta_{ij} \times C_{ij} \times (\pi_{ij}^C + \pi_i^-), & \forall i \in [1, n] \label{eq:areadef2} \\	470	501	% \end{align}
% r_i & = \sum_{j=1}^p \delta_{ij} \times F(C_{ij}, \pi_{ij}^C), & \forall i \in [1, n] \label{eq:rejectiondef2} \\	471	502	%
% \pi_i^+ & = \pi_i^- + \left(\sum_{j=1}^p \delta_{ij} \pi_{ij}^C\right) - \pi_i^S, & \forall i \in [1, n] \label{eq:bits2} \\	472	503	% Equations \ref{eq:areadef2}, \ref{eq:rejectiondef2} and \ref{eq:bits2} replace
% \sum_{j=1}^p \delta_{ij} & \leq 1, & \forall i \in [1, n] \label{eq:config}	473	504	% respectively equations \ref{eq:areadef}, \ref{eq:rejectiondef} and \ref{eq:bits}.
% \end{align}	474	505	% Equation~\ref{eq:config} states that for each stage, a single configuration is chosen at most.
%	475	506	%
% Equations \ref{eq:areadef2}, \ref{eq:rejectiondef2} and \ref{eq:bits2} replace	476	507	% % r2.13
% respectively equations \ref{eq:areadef}, \ref{eq:rejectiondef} and \ref{eq:bits}.	477	508	% This modified model is quadratic since we multiply two variables in the
% Equation~\ref{eq:config} states that for each stage, a single configuration is chosen at most.	478	509	% equation~\ref{eq:areadef2} ($\delta_{ij}$ by $\pi_{ij}^-$) but it can be linearised if necessary.
%	479	510	% The Gurobi
% % r2.13	480	511	% (\url{www.gurobi.com}) optimization software is used to solve this quadratic
% This modified model is quadratic since we multiply two variables in the	481	512	% model, and since Gurobi is able to linearize, the model is left as is. This model
% equation~\ref{eq:areadef2} ($\delta_{ij}$ by $\pi_{ij}^-$) but it can be linearised if necessary.	482	513	% has $O(np)$ variables and $O(n)$ constraints.
% The Gurobi	483	514
% (\url{www.gurobi.com}) optimization software is used to solve this quadratic	484	515	Two problems will be addressed using the workflow described in the next section: on the one
% model, and since Gurobi is able to linearize, the model is left as is. This model	485	516	hand maximizing the rejection capability of a set of cascaded filters occupying a fixed arbitrary
% has $O(np)$ variables and $O(n)$ constraints.	486	517	silicon area (section~\ref{sec:fixed_area}) and on the other hand the dual problem of minimizing the silicon area
	487	518	for a fixed rejection criterion (section~\ref{sec:fixed_rej}). In the latter case, the
Two problems will be addressed using the workflow described in the next section: on the one	488	519	objective function is replaced with:
hand maximizing the rejection capability of a set of cascaded filters occupying a fixed arbitrary	489	520	\begin{align}
silicon area (section~\ref{sec:fixed_area}) and on the other hand the dual problem of minimizing the silicon area	490	521	\text{Minimize } & \sum_{i=1}^n a_i \notag
for a fixed rejection criterion (section~\ref{sec:fixed_rej}). In the latter case, the	491	522	\end{align}
objective function is replaced with:	492	523	We adapt the constraints of our quadratic program, replacing equation~\ref{eq:area}
\begin{align}	493	524	with equation \ref{eq:rejection_min} where $\mathcal{R}$ is the minimal
\text{Minimize } & \sum_{i=1}^n a_i \notag	494	525	rejection required.
\end{align}	495	526
We adapt the constraints of our quadratic program, replacing equation~\ref{eq:area}	496	527	\begin{align}
with equation \ref{eq:rejection_min} where $\mathcal{R}$ is the minimal	497	528	\sum_{i=1}^n r_i & \geq \mathcal{R} & \label{eq:rejection_min}
rejection required.	498	529	\end{align}
	499	530
\begin{align}	500	531	\section{Design workflow}
\sum_{i=1}^n r_i & \geq \mathcal{R} & \label{eq:rejection_min}	501	532	\label{sec:workflow}
\end{align}	502	533
	503	534	In this section, we describe the workflow to compute all the results presented in sections~\ref{sec:fixed_area}
\section{Design workflow}	504	535	and \ref{sec:fixed_rej}. Figure~\ref{fig:workflow} shows the global workflow and the different steps involved
\label{sec:workflow}	505	536	in the computation of the results.
	506	537
In this section, we describe the workflow to compute all the results presented in sections~\ref{sec:fixed_area}	507	538	\begin{figure}
and \ref{sec:fixed_rej}. Figure~\ref{fig:workflow} shows the global workflow and the different steps involved	508	539	\centering
in the computation of the results.	509	540	\begin{tikzpicture}[node distance=0.75cm and 2cm]
	510	541	\node[draw,minimum size=1cm] (Solver) { Filter Solver } ;
\begin{figure}	511	542	\node (Start) [left= 3cm of Solver] { } ;
\centering	512	543	\node[draw,minimum size=1cm] (TCL) [right= of Solver] { TCL Script } ;
\begin{tikzpicture}[node distance=0.75cm and 2cm]	513	544	\node (Input) [above= of TCL] { } ;
\node[draw,minimum size=1cm] (Solver) { Filter Solver } ;	514	545	\node[draw,minimum size=1cm] (Deploy) [below= of Solver] { Deploy Script } ;
\node (Start) [left= 3cm of Solver] { } ;	515	546	\node[draw,minimum size=1cm] (Bitstream) [below= of TCL] { Bitstream } ;
\node[draw,minimum size=1cm] (TCL) [right= of Solver] { TCL Script } ;	516	547	\node[draw,minimum size=1cm,rounded corners] (Board) [below right= of Deploy] { Board } ;
\node (Input) [above= of TCL] { } ;	517	548	\node[draw,minimum size=1cm] (Postproc) [below= of Deploy] { Post-Processing } ;
\node[draw,minimum size=1cm] (Deploy) [below= of Solver] { Deploy Script } ;	518	549	\node (Results) [left= of Postproc] { } ;
\node[draw,minimum size=1cm] (Bitstream) [below= of TCL] { Bitstream } ;	519	550
\node[draw,minimum size=1cm,rounded corners] (Board) [below right= of Deploy] { Board } ;	520	551	\draw[->] (Start) edge node [above] { $\mathcal{A}, n, \Pi^I$ } node [below] { $(C_{ij}, \pi_{ij}^C), F$ } (Solver) ;
\node[draw,minimum size=1cm] (Postproc) [below= of Deploy] { Post-Processing } ;	521	552	\draw[->] (Input) edge node [left] { ADC or PRN } (TCL) ;
\node (Results) [left= of Postproc] { } ;	522	553	\draw[->] (Solver) edge node [below] { (1a) } (TCL) ;
	523	554	\draw[->] (Solver) edge node [right] { (1b) } (Deploy) ;
\draw[->] (Start) edge node [above] { $\mathcal{A}, n, \Pi^I$ } node [below] { $(C_{ij}, \pi_{ij}^C), F$ } (Solver) ;	524	555	\draw[->] (TCL) edge node [left] { (2) } (Bitstream) ;
\draw[->] (Input) edge node [left] { ADC or PRN } (TCL) ;	525	556	\draw[->,dashed] (Bitstream) -- (Deploy) ;
\draw[->] (Solver) edge node [below] { (1a) } (TCL) ;	526	557	\draw[->] (Deploy) to[out=-30,in=120] node [above] { (3) } (Board) ;
\draw[->] (Solver) edge node [right] { (1b) } (Deploy) ;	527	558	\draw[->] (Board) to[out=150,in=-60] node [below] { (4) } (Deploy) ;
\draw[->] (TCL) edge node [left] { (2) } (Bitstream) ;	528	559	\draw[->] (Deploy) edge node [left] { (5) } (Postproc) ;
\draw[->,dashed] (Bitstream) -- (Deploy) ;	529	560	\draw[->] (Postproc) -- (Results) ;
\draw[->] (Deploy) to[out=-30,in=120] node [above] { (3) } (Board) ;	530	561	\end{tikzpicture}
\draw[->] (Board) to[out=150,in=-60] node [below] { (4) } (Deploy) ;	531	562	\caption{Design workflow from the input parameters to the results {\color{red} allowing for
\draw[->] (Deploy) edge node [left] { (5) } (Postproc) ;	532	563	a fully automated optimal solution search.}}
\draw[->] (Postproc) -- (Results) ;	533	564	\label{fig:workflow}
\end{tikzpicture}	534	565	\end{figure}
\caption{Design workflow from the input parameters to the results}	535	566
\label{fig:workflow}	536	567	The filter solver is a C++ program that takes as input the maximum area
\end{figure}	537	568	$\mathcal{A}$, the number of stages $n$, the size of the input signal $\Pi^I$,
	538	569	the FIR configurations $(C_{ij}, \pi_{ij}^C)$ and the function $F$. It creates
The filter solver is a C++ program that takes as input the maximum area	539	570	the quadratic programs and uses the Gurobi solver to estimate the optimal results.
$\mathcal{A}$, the number of stages $n$, the size of the input signal $\Pi^I$,	540	571	Then it produces two scripts: a TCL script ((1a) on figure~\ref{fig:workflow})
the FIR configurations $(C_{ij}, \pi_{ij}^C)$ and the function $F$. It creates	541	572	and a deploy script ((1b) on figure~\ref{fig:workflow}).
the quadratic programs and uses the Gurobi solver to estimate the optimal results.	542	573
Then it produces two scripts: a TCL script ((1a) on figure~\ref{fig:workflow})	543	574	The TCL script describes the whole digital processing chain from the beginning
and a deploy script ((1b) on figure~\ref{fig:workflow}).	544	575	(the raw signal data) to the end (the filtered data) in a language compatible
	545	576	with proprietary synthesis software, namely Vivado for Xilinx and Quartus for
The TCL script describes the whole digital processing chain from the beginning	546	577	Intel/Altera. The raw input data is generated from a 20-bit Pseudo Random Number (PRN)
(the raw signal data) to the end (the filtered data) in a language compatible	547	578	generator inside the FPGA and $\Pi^I$ is fixed at 16~bits.
with proprietary synthesis software, namely Vivado for Xilinx and Quartus for	548	579	Then the script builds each stage of the chain with a generic FIR task that
Intel/Altera. The raw input data is generated from a 20-bit Pseudo Random Number (PRN)	549	580	comes from a skeleton library. The generic FIR is highly configurable
generator inside the FPGA and $\Pi^I$ is fixed at 16~bits.	550	581	with the number of coefficients and the size of the coefficients. The coefficients
Then the script builds each stage of the chain with a generic FIR task that	551	582	themselves are not stored in the script.
comes from a skeleton library. The generic FIR is highly configurable	552	583	As the signal is processed in real-time, the output signal is stored as
with the number of coefficients and the size of the coefficients. The coefficients	553	584	consecutive bursts of data for post-processing, mainly assessing the consistency of the
themselves are not stored in the script.	554	585	implemented FIR cascade transfer function with the design criteria and the expected
As the signal is processed in real-time, the output signal is stored as	555	586	transfer function.
consecutive bursts of data for post-processing, mainly assessing the consistency of the	556	587
implemented FIR cascade transfer function with the design criteria and the expected	557	588	The TCL script is used by Vivado to produce the FPGA bitstream ((2) on figure~\ref{fig:workflow}).
transfer function.	558	589	We use the 2018.2 version of Xilinx Vivado and we execute the synthesized
	559	590	bitstream on a Redpitaya board fitted with a Xilinx Zynq-7010 series
The TCL script is used by Vivado to produce the FPGA bitstream ((2) on figure~\ref{fig:workflow}).	560	591	FPGA (xc7z010clg400-1) and two LTC2145 14-bit 125~MS/s ADCs, loaded with 50~$\Omega$ resistors to
We use the 2018.2 version of Xilinx Vivado and we execute the synthesized	561	592	provide a broadband noise source.
bitstream on a Redpitaya board fitted with a Xilinx Zynq-7010 series	562	593	The board runs the Linux kernel and surrounding environment produced from the
FPGA (xc7z010clg400-1) and two LTC2145 14-bit 125~MS/s ADCs, loaded with 50~$\Omega$ resistors to	563	594	Buildroot framework available at \url{https://github.com/trabucayre/redpitaya/}: configuring
provide a broadband noise source.	564	595	the Zynq FPGA, feeding the FIR with the set of coefficients, executing the simulation and
The board runs the Linux kernel and surrounding environment produced from the	565	596	fetching the results is automated.
Buildroot framework available at \url{https://github.com/trabucayre/redpitaya/}: configuring	566	597
the Zynq FPGA, feeding the FIR with the set of coefficients, executing the simulation and	567	598	The deploy script uploads the bitstream to the board ((3) on
fetching the results is automated.	568	599	figure~\ref{fig:workflow}), flashes the FPGA, loads the different drivers,
	569	600	configures the coefficients of the FIR filters. It then waits for the results
The deploy script uploads the bitstream to the board ((3) on	570	601	and retrieves the data to the main computer ((4) on figure~\ref{fig:workflow}).
figure~\ref{fig:workflow}), flashes the FPGA, loads the different drivers,	571	602
configures the coefficients of the FIR filters. It then waits for the results	572	603	Finally, an Octave post-processing script computes the final results from
and retrieves the data to the main computer ((4) on figure~\ref{fig:workflow}).	573	604	the output data ((5) on figure~\ref{fig:workflow}).
	574	605	The results are normalized so that the Power Spectral Density (PSD) starts at zero
Finally, an Octave post-processing script computes the final results from	575	606	and the different configurations can be compared.
the output data ((5) on figure~\ref{fig:workflow}).	576	607
The results are normalized so that the Power Spectral Density (PSD) starts at zero	577	608	\section{Maximizing the rejection at fixed silicon area}
and the different configurations can be compared.	578	609	\label{sec:fixed_area}
	579	610	This section presents the output of the filter solver {\em i.e.} the computed
\section{Maximizing the rejection at fixed silicon area}	580	611	configurations for each stage, the computed rejection and the computed silicon area.
\label{sec:fixed_area}	581	612	Such results allow for understanding the choices made by the solver to compute its solutions.
This section presents the output of the filter solver {\em i.e.} the computed	582	613
configurations for each stage, the computed rejection and the computed silicon area.	583	614	The experimental setup is composed of three cases. The raw input is generated
Such results allow for understanding the choices made by the solver to compute its solutions.	584	615	by a Pseudo Random Number (PRN) generator, which fixes the input data size $\Pi^I$.
	585	616	Then the total silicon area $\mathcal{A}$ has been fixed to either 500, 1000 or 1500
The experimental setup is composed of three cases. The raw input is generated	586	617	arbitrary units. Hence, the three cases have been named: MAX/500, MAX/1000, MAX/1500.
by a Pseudo Random Number (PRN) generator, which fixes the input data size $\Pi^I$.	587	618	The number of configurations $p$ is 1827, with $C_i$ ranging from 3 to 60 and $\pi^C$
Then the total silicon area $\mathcal{A}$ has been fixed to either 500, 1000 or 1500	588	619	ranging from 2 to 22. In each case, the quadratic program has been able to give a
arbitrary units. Hence, the three cases have been named: MAX/500, MAX/1000, MAX/1500.	589	620	result up to five stages ($n = 5$) in the cascaded filter.
The number of configurations $p$ is 1827, with $C_i$ ranging from 3 to 60 and $\pi^C$	590	621
ranging from 2 to 22. In each case, the quadratic program has been able to give a	591	622	Table~\ref{tbl:gurobi_max_500} shows the results obtained by the filter solver for MAX/500.
result up to five stages ($n = 5$) in the cascaded filter.	592	623	Table~\ref{tbl:gurobi_max_1000} shows the results obtained by the filter solver for MAX/1000.
	593	624	Table~\ref{tbl:gurobi_max_1500} shows the results obtained by the filter solver for MAX/1500.
Table~\ref{tbl:gurobi_max_500} shows the results obtained by the filter solver for MAX/500.	594	625
Table~\ref{tbl:gurobi_max_1000} shows the results obtained by the filter solver for MAX/1000.	595	626	\renewcommand{\arraystretch}{1.4}
Table~\ref{tbl:gurobi_max_1500} shows the results obtained by the filter solver for MAX/1500.	596	627
	597	628	\begin{table}
\renewcommand{\arraystretch}{1.4}	598	629	\caption{Configurations $(C_i, \pi_i^C, \pi_i^S)$, rejections and areas (in arbitrary units) for MAX/500}
	599	630	\label{tbl:gurobi_max_500}
\begin{table}	600	631	\centering
\caption{Configurations $(C_i, \pi_i^C, \pi_i^S)$, rejections and areas (in arbitrary units) for MAX/500}	601	632	{\scalefont{0.77}
\label{tbl:gurobi_max_500}	602	633	\begin{tabular}{|c|ccccc|c|c|}
\centering	603	634	\hline
{\scalefont{0.77}	604	635	$n$ & $i = 1$ & $i = 2$ & $i = 3$ & $i = 4$ & $i = 5$ & Rejection & Area \\
\begin{tabular}{|c|ccccc|c|c|}	605	636	\hline
\hline	606	637	1 & (21, 7, 0) & - & - & - & - & 32~dB & 483 \\
$n$ & $i = 1$ & $i = 2$ & $i = 3$ & $i = 4$ & $i = 5$ & Rejection & Area \\	607	638	2 & (3, 3, 15) & (31, 9, 0) & - & - & - & 58~dB & 460 \\
\hline	608	639	3 & (3, 3, 15) & (27, 9, 0) & (5, 3, 0) & - & - & 66~dB & 488 \\
1 & (21, 7, 0) & - & - & - & - & 32~dB & 483 \\	609	640	4 & (3, 3, 15) & (19, 7, 0) & (11, 5, 0) & (3, 3, 0) & - & 74~dB & 499 \\
2 & (3, 3, 15) & (31, 9, 0) & - & - & - & 58~dB & 460 \\	610	641	5 & (3, 3, 15) & (23, 8, 0) & (3, 3, 1) & (3, 3, 0) & (3, 3, 0) & 78~dB & 489 \\
3 & (3, 3, 15) & (27, 9, 0) & (5, 3, 0) & - & - & 66~dB & 488 \\	611	642	\hline
4 & (3, 3, 15) & (19, 7, 0) & (11, 5, 0) & (3, 3, 0) & - & 74~dB & 499 \\	612	643	\end{tabular}
5 & (3, 3, 15) & (23, 8, 0) & (3, 3, 1) & (3, 3, 0) & (3, 3, 0) & 78~dB & 489 \\	613	644	}
\hline	614	645	\end{table}
\end{tabular}	615	646
}	616	647	\begin{table}
\end{table}	617	648	\caption{Configurations $(C_i, \pi_i^C, \pi_i^S)$, rejections and areas (in arbitrary units) for MAX/1000}
	618	649	\label{tbl:gurobi_max_1000}
\begin{table}	619	650	\centering
\caption{Configurations $(C_i, \pi_i^C, \pi_i^S)$, rejections and areas (in arbitrary units) for MAX/1000}	620	651	{\scalefont{0.77}
\label{tbl:gurobi_max_1000}	621	652	\begin{tabular}{|c|ccccc|c|c|}
\centering	622	653	\hline
{\scalefont{0.77}	623	654	$n$ & $i = 1$ & $i = 2$ & $i = 3$ & $i = 4$ & $i = 5$ & Rejection & Area \\
\begin{tabular}{|c|ccccc|c|c|}	624	655	\hline
\hline	625	656	1 & (37, 11, 0) & - & - & - & - & 56~dB & 999 \\
$n$ & $i = 1$ & $i = 2$ & $i = 3$ & $i = 4$ & $i = 5$ & Rejection & Area \\	626	657	2 & (3, 3, 15) & (51, 14, 0) & - & - & - & 87~dB & 975 \\
\hline	627	658	3 & (3, 3, 15) & (35, 11, 0) & (19, 7, 0) & - & - & 99~dB & 1000 \\
1 & (37, 11, 0) & - & - & - & - & 56~dB & 999 \\	628	659	4 & (3, 4, 16) & (27, 8, 0) & (19, 7, 1) & (11, 5, 0) & - & 103~dB & 998 \\
2 & (3, 3, 15) & (51, 14, 0) & - & - & - & 87~dB & 975 \\	629	660	5 & (3, 3, 15) & (31, 9, 0) & (19, 7, 0) & (3, 3, 1) & (3, 3, 0) & 111~dB & 984 \\
3 & (3, 3, 15) & (35, 11, 0) & (19, 7, 0) & - & - & 99~dB & 1000 \\	630	661	\hline
4 & (3, 4, 16) & (27, 8, 0) & (19, 7, 1) & (11, 5, 0) & - & 103~dB & 998 \\	631	662	\end{tabular}
5 & (3, 3, 15) & (31, 9, 0) & (19, 7, 0) & (3, 3, 1) & (3, 3, 0) & 111~dB & 984 \\	632	663	}
\hline	633	664	\end{table}
\end{tabular}	634	665
}	635	666	\begin{table}
\end{table}	636	667	\caption{Configurations $(C_i, \pi_i^C, \pi_i^S)$, rejections and areas (in arbitrary units) for MAX/1500}
	637	668	\label{tbl:gurobi_max_1500}
\begin{table}	638	669	\centering
\caption{Configurations $(C_i, \pi_i^C, \pi_i^S)$, rejections and areas (in arbitrary units) for MAX/1500}	639	670	{\scalefont{0.77}
\label{tbl:gurobi_max_1500}	640	671	\begin{tabular}{|c|ccccc|c|c|}
\centering	641	672	\hline
{\scalefont{0.77}	642	673	$n$ & $i = 1$ & $i = 2$ & $i = 3$ & $i = 4$ & $i = 5$ & Rejection & Area \\
\begin{tabular}{|c|ccccc|c|c|}	643	674	\hline
\hline	644	675	1 & (47, 15, 0) & - & - & - & - & 71~dB & 1457 \\
$n$ & $i = 1$ & $i = 2$ & $i = 3$ & $i = 4$ & $i = 5$ & Rejection & Area \\	645	676	2 & (19, 6, 15) & (51, 14, 0) & - & - & - & 103~dB & 1489 \\
\hline	646	677	3 & (3, 3, 15) & (35, 11, 0) & (35, 11, 0) & - & - & 122~dB & 1492 \\
1 & (47, 15, 0) & - & - & - & - & 71~dB & 1457 \\	647	678	4 & (3, 3, 15) & (27, 8, 0) & (19, 7, 0) & (27, 9, 0) & - & 129~dB & 1498 \\
2 & (19, 6, 15) & (51, 14, 0) & - & - & - & 103~dB & 1489 \\	648	679	5 & (3, 3, 15) & (23, 9, 2) & (27, 9, 0) & (19, 7, 0) & (3, 3, 0) & 136~dB & 1499 \\
3 & (3, 3, 15) & (35, 11, 0) & (35, 11, 0) & - & - & 122~dB & 1492 \\	649	680	\hline
4 & (3, 3, 15) & (27, 8, 0) & (19, 7, 0) & (27, 9, 0) & - & 129~dB & 1498 \\	650	681	\end{tabular}
5 & (3, 3, 15) & (23, 9, 2) & (27, 9, 0) & (19, 7, 0) & (3, 3, 0) & 136~dB & 1499 \\	651	682	}
\hline	652	683	\end{table}
\end{tabular}	653	684
}	654	685	\renewcommand{\arraystretch}{1}
\end{table}	655	686
	656	687	From these tables, we can first state that the more stages are used to define
\renewcommand{\arraystretch}{1}	657	688	the cascaded FIR filters, the better the rejection. It was an expected result as it has
	658	689	been previously observed that many small filters are better than
From these tables, we can first state that the more stages are used to define	659	690	a single large filter \cite{lim_1988, lim_1996, young_1992}, despite such conclusions
the cascaded FIR filters, the better the rejection. It was an expected result as it has	660	691	being hardly used in practice due to the lack of tools for identifying individual filter
been previously observed that many small filters are better than	661	692	coefficients in the cascaded approach.
a single large filter \cite{lim_1988, lim_1996, young_1992}, despite such conclusions	662	693
being hardly used in practice due to the lack of tools for identifying individual filter	663	694	Second, the larger the silicon area, the better the rejection. This was also an
coefficients in the cascaded approach.	664	695	expected result as more area means a filter of better quality with more coefficients
	665	696	or more bits per coefficient.
Second, the larger the silicon area, the better the rejection. This was also an	666	697
expected result as more area means a filter of better quality with more coefficients	667	698	Then, we also observe that the first stage can have a larger shift than the other
or more bits per coefficient.	668	699	stages. This is explained by the fact that the solver tries to use just enough
	669	700	bits for the computed rejection after each stage. In the first stage, a
Then, we also observe that the first stage can have a larger shift than the other	670	701	balance between a strong rejection and a low number of bits is targeted. Equation~\ref{eq:maxshift}
stages. This is explained by the fact that the solver tries to use just enough	671	702	gives the relation between both values.
bits for the computed rejection after each stage. In the first stage, a	672	703
balance between a strong rejection and a low number of bits is targeted. Equation~\ref{eq:maxshift}	673	704	Finally, we note that the solver consumes nearly all of the given silicon area (at least 92\% of the budget in every case).
gives the relation between both values.	674	705
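
As a quick consistency check on the three tables above, the following Python sketch computes which fraction of the area budget each solution uses. The numbers are copied from the Area columns of the MAX/500, MAX/1000 and MAX/1500 tables; the variable and function names are illustrative only and are not part of the filter solver.

```python
# Areas (arbitrary units) reported by the solver for n = 1..5 stages,
# copied from the Area columns of the MAX/500, MAX/1000 and MAX/1500 tables.
areas = {
    500: [483, 460, 488, 499, 489],
    1000: [999, 975, 1000, 998, 984],
    1500: [1457, 1489, 1492, 1498, 1499],
}


def utilisation(budget, used):
    """Return the fraction of the area budget consumed by each solution."""
    return [a / budget for a in used]


# Hypothetical helper structures, introduced here for illustration.
ratios = {budget: utilisation(budget, used) for budget, used in areas.items()}
worst = min(min(r) for r in ratios.values())

for budget, r in ratios.items():
    print(f"MAX/{budget}: " + ", ".join(f"{100 * x:.1f}%" for x in r))
print(f"lowest utilisation: {100 * worst:.1f}%")
```

No solution exceeds its budget, and the least-filled case is the two-stage MAX/500 solution.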
	675	706	The following graphs present the rejection for real data on the FPGA. In all the following
Finally, we note that the solver consumes nearly all of the given silicon area (at least 92\% of the budget in every case).	676	707	figures, the solid lines represent the actual rejection of the filtered
	677	708	data on the FPGA as measured experimentally and the dashed lines are the noise levels
The following graphs present the rejection for real data on the FPGA. In all the following	678	709	given by the quadratic solver. The configurations are those computed in the previous section.
figures, the solid lines represent the actual rejection of the filtered	679	710
data on the FPGA as measured experimentally and the dashed lines are the noise levels	680	711	Figure~\ref{fig:max_500_result} shows the rejection of the different configurations in the case of MAX/500.
given by the quadratic solver. The configurations are those computed in the previous section.	681	712	Figure~\ref{fig:max_1000_result} shows the rejection of the different configurations in the case of MAX/1000.
	682	713	Figure~\ref{fig:max_1500_result} shows the rejection of the different configurations in the case of MAX/1500.
Figure~\ref{fig:max_500_result} shows the rejection of the different configurations in the case of MAX/500.	683	714
Figure~\ref{fig:max_1000_result} shows the rejection of the different configurations in the case of MAX/1000.	684	715	% \begin{figure}
Figure~\ref{fig:max_1500_result} shows the rejection of the different configurations in the case of MAX/1500.	685	716	% \centering
	686	717	% \includegraphics[width=\linewidth]{images/max_500}
% \begin{figure}	687	718	% \caption{Signal spectrum for MAX/500}
% \centering	688	719	% \label{fig:max_500_result}
% \includegraphics[width=\linewidth]{images/max_500}	689	720	% \end{figure}
% \caption{Signal spectrum for MAX/500}	690	721	%
% \label{fig:max_500_result}	691	722	% \begin{figure}
% \end{figure}	692	723	% \centering
%	693	724	% \includegraphics[width=\linewidth]{images/max_1000}
% \begin{figure}	694	725	% \caption{Signal spectrum for MAX/1000}
% \centering	695	726	% \label{fig:max_1000_result}
% \includegraphics[width=\linewidth]{images/max_1000}	696	727	% \end{figure}
% \caption{Signal spectrum for MAX/1000}	697	728	%
% \label{fig:max_1000_result}	698	729	% \begin{figure}
% \end{figure}	699	730	% \centering
%	700	731	% \includegraphics[width=\linewidth]{images/max_1500}
% \begin{figure}	701	732	% \caption{Signal spectrum for MAX/1500}
% \centering	702	733	% \label{fig:max_1500_result}
% \includegraphics[width=\linewidth]{images/max_1500}	703	734	% \end{figure}
% \caption{Signal spectrum for MAX/1500}	704	735
% \label{fig:max_1500_result}	705	736	% r2.14 and r2.15 and r2.16
% \end{figure}	706	737	\begin{figure}
	707	738	\centering
% r2.14 and r2.15 and r2.16	708	739	\begin{subfigure}{\linewidth}
\begin{figure}	709	740	\includegraphics[width=\linewidth]{images/max_500}
\centering	710	741	\caption{\color{red}Filter transfer functions for varying number of cascaded filters solving
\begin{subfigure}{\linewidth}	711	742	the MAX/500 problem of maximizing rejection for a given resource allocation (500~arbitrary units).}
\includegraphics[width=\linewidth]{images/max_500}	712	743	\label{fig:max_500_result}
\caption{Signal spectrum for MAX/500}	713	744	\end{subfigure}
\label{fig:max_500_result}	714	745
\end{subfigure}	715	746	\begin{subfigure}{\linewidth}
	716	747	\includegraphics[width=\linewidth]{images/max_1000}
\begin{subfigure}{\linewidth}	717	748	\caption{\color{red}Filter transfer functions for varying number of cascaded filters solving
\includegraphics[width=\linewidth]{images/max_1000}	718	749	the MAX/1000 problem of maximizing rejection for a given resource allocation (1000~arbitrary units).}
\caption{Signal spectrum for MAX/1000}	719	750	\label{fig:max_1000_result}
\label{fig:max_1000_result}	720	751	\end{subfigure}
\end{subfigure}	721	752
	722	753	\begin{subfigure}{\linewidth}
\begin{subfigure}{\linewidth}	723	754	\includegraphics[width=\linewidth]{images/max_1500}
\includegraphics[width=\linewidth]{images/max_1500}	724	755	\caption{\color{red}Filter transfer functions for varying number of cascaded filters solving
\caption{Signal spectrum for MAX/1500}	725	756	the MAX/1500 problem of maximizing rejection for a given resource allocation (1500~arbitrary units).}
\label{fig:max_1500_result}	726	757	\label{fig:max_1500_result}
\end{subfigure}	727	758	\end{subfigure}
\caption{Signal spectra of the experimental configurations MAX/500, MAX/1000 and MAX/1500}	728	759	\caption{\color{red}Solutions for the MAX/500, MAX/1000 and MAX/1500 problems of maximizing
\end{figure}	729	760	rejection for a given resource allocation.
	730	761	The filter shape constraint (bandpass and bandstop) is shown as thick
In all cases, we observe that the actual rejection is close to the rejection computed by the solver.	731	762	horizontal lines on each chart.}
	732	763	\end{figure}
We compare the actual silicon resources given by Vivado to the	733	764
resources in arbitrary units.	734	765	In all cases, we observe that the actual rejection is close to the rejection computed by the solver.
The goal is to check that our arbitrary units of silicon area model well enough	735	766
the real resources on the FPGA. In particular, we want to verify that, for a given	736	767	We compare the actual silicon resources given by Vivado to the
number of arbitrary units, the actual silicon resources do not depend on the	737	768	resources in arbitrary units.
number of stages $n$. Most significantly, our approach aims	738	769	The goal is to check that our arbitrary units of silicon area model well enough
at remaining far enough from the practical logic gate implementation used by	739	770	the real resources on the FPGA. In particular, we want to verify that, for a given
various vendors to remain platform independent and be portable from one	740	771	number of arbitrary units, the actual silicon resources do not depend on the
architecture to another.	741	772	number of stages $n$. Most significantly, our approach aims
	742	773	at remaining far enough from the practical logic gate implementation used by
Table~\ref{tbl:resources_usage} shows the resource usage in the case of MAX/500, MAX/1000 and	743	774	various vendors to remain platform independent and be portable from one
MAX/1500 \emph{i.e.} when the maximum allowed silicon area is fixed to 500, 1000	744	775	architecture to another.
and 1500 arbitrary units. We have taken care to extract solely the resources used by	745	776
the FIR filters and remove additional processing blocks including FIFO and Programmable	746	777	Table~\ref{tbl:resources_usage} shows the resource usage in the case of MAX/500, MAX/1000 and
Logic (PL -- FPGA) to Processing System (PS -- general purpose processor) communication.	747	778	MAX/1500 \emph{i.e.} when the maximum allowed silicon area is fixed to 500, 1000
	748	779	and 1500 arbitrary units. We have taken care to extract solely the resources used by
\begin{table}[h!tb]	749	780	the FIR filters and remove additional processing blocks including FIFO and Programmable
\caption{Resource occupation. The last column refers to available resources on a Zynq-7010 as found on the Redpitaya.}	750	781	Logic (PL -- FPGA) to Processing System (PS -- general purpose processor) communication.
\label{tbl:resources_usage}	751	782
\centering	752	783	\begin{table}[h!tb]
\begin{tabular}{|c|c|ccc|c|}	753	784	\caption{Resource occupation {\color{red}following synthesis of the solutions found for
\hline	754	785	the problem of maximizing rejection for a given resource allocation}. The last column refers to available resources on a Zynq-7010 as found on the Redpitaya.}
$n$ & & MAX/500 & MAX/1000 & MAX/1500 & \emph{Zynq 7010} \\ \hline\hline	755	786	\label{tbl:resources_usage}
& LUT & 249 & 453 & 627 & \emph{17600} \\	756	787	\centering
1 & BRAM & 1 & 1 & 1 & \emph{120} \\	757	788	\begin{tabular}{|c|c|ccc|c|}
& DSP & 21 & 37 & 47 & \emph{80} \\ \hline	758	789	\hline
& LUT & 2374 & 5494 & 691 & \emph{17600} \\	759	790	$n$ & & MAX/500 & MAX/1000 & MAX/1500 & \emph{Zynq 7010} \\ \hline\hline
2 & BRAM & 2 & 2 & 2 & \emph{120} \\	760	791	& LUT & 249 & 453 & 627 & \emph{17600} \\
& DSP & 0 & 0 & 70 & \emph{80} \\ \hline	761	792	1 & BRAM & 1 & 1 & 1 & \emph{120} \\
& LUT & 2443 & 3304 & 3521 & \emph{17600} \\	762	793	& DSP & 21 & 37 & 47 & \emph{80} \\ \hline
3 & BRAM & 3 & 3 & 3 & \emph{120} \\	763	794	& LUT & 2374 & 5494 & 691 & \emph{17600} \\
& DSP & 0 & 19 & 35 & \emph{80} \\ \hline	764	795	2 & BRAM & 2 & 2 & 2 & \emph{120} \\
& LUT & 2634 & 3753 & 2557 & \emph{17600} \\	765	796	& DSP & 0 & 0 & 70 & \emph{80} \\ \hline
4 & BRAM & 4 & 4 & 4 & \emph{120} \\	766	797	& LUT & 2443 & 3304 & 3521 & \emph{17600} \\
& DSP & 0 & 19 & 46 & \emph{80} \\ \hline	767	798	3 & BRAM & 3 & 3 & 3 & \emph{120} \\
& LUT & 2423 & 3047 & 2847 & \emph{17600} \\	768	799	& DSP & 0 & 19 & 35 & \emph{80} \\ \hline
5 & BRAM & 5 & 5 & 5 & \emph{120} \\	769	800	& LUT & 2634 & 3753 & 2557 & \emph{17600} \\
& DSP & 0 & 22 & 46 & \emph{80} \\ \hline	770	801	4 & BRAM & 4 & 4 & 4 & \emph{120} \\
\end{tabular}	771	802	& DSP & 0 & 19 & 46 & \emph{80} \\ \hline
\end{table}	772	803	& LUT & 2423 & 3047 & 2847 & \emph{17600} \\
	773	804	5 & BRAM & 5 & 5 & 5 & \emph{120} \\
In some cases, Vivado replaces the DSPs by Look Up Tables (LUTs). We assume that,	774	805	& DSP & 0 & 22 & 46 & \emph{80} \\ \hline
when the filter coefficients are small enough, or when the input size is small	775	806	\end{tabular}
enough, Vivado optimizes resource consumption by selecting multiplexers to	776	807	\end{table}
implement the multiplications instead of a DSP. In this case, it is quite difficult	777	808
to compare the whole silicon budget.	778	809	In some cases, Vivado replaces the DSPs by Look Up Tables (LUTs). We assume that,
	779	810	when the filter coefficients are small enough, or when the input size is small
However, a rough estimation can be made with a simple equivalence: looking at	780	811	enough, Vivado optimizes resource consumption by selecting multiplexers to
the first column (MAX/500), where the number of LUTs is quite stable for $n \geq 2$,	781	812	implement the multiplications instead of a DSP. In this case, it is quite difficult
we can deduce that a DSP is roughly equivalent to 100~LUTs in terms of silicon	782	813	to compare the whole silicon budget.
area use. With this equivalence, our 500 arbitrary units correspond to 2500 LUTs,	783	814
1000 arbitrary units correspond to 5000 LUTs and 1500 arbitrary units correspond	784	815	However, a rough estimation can be made with a simple equivalence: looking at
to 7300 LUTs. The conclusion is that the orders of magnitude of our arbitrary	785	816	the first column (MAX/500), where the number of LUTs is quite stable for $n \geq 2$,
unit map well to actual hardware resources. The relatively small differences can probably be explained	786	817	we can deduce that a DSP is roughly equivalent to 100~LUTs in terms of silicon
by the optimizations done by Vivado based on the detailed map of available processing resources.	787	818	area use. With this equivalence, our 500 arbitrary units correspond to 2500 LUTs,
	788	819	1000 arbitrary units correspond to 5000 LUTs and 1500 arbitrary units correspond
We now present the computation time needed to solve the quadratic problem.	789	820	to 7300 LUTs. The conclusion is that the orders of magnitude of our arbitrary
For each case, the filter solver software is executed on an Intel(R) Xeon(R) CPU E5606	790	821	unit map well to actual hardware resources. The relatively small differences can probably be explained
clocked at 2.13~GHz. The CPU has 8 cores that are used by Gurobi to solve	791	822	by the optimizations done by Vivado based on the detailed map of available processing resources.
the quadratic problem. Table~\ref{tbl:area_time} shows the time needed to solve the quadratic	792	823
problem when the maximal area is fixed to 500, 1000 and 1500 arbitrary units.	793	824	We now present the computation time needed to solve the quadratic problem.
	794	825	For each case, the filter solver software is executed on an Intel(R) Xeon(R) CPU E5606
\begin{table}[h!tb]	795	826	clocked at 2.13~GHz. The CPU has 8 cores that are used by Gurobi to solve
\caption{Time needed to solve the quadratic program with Gurobi}	796	827	the quadratic problem. Table~\ref{tbl:area_time} shows the time needed to solve the quadratic
\label{tbl:area_time}	797	828	problem when the maximal area is fixed to 500, 1000 and 1500 arbitrary units.
\centering	798	829
\begin{tabular}{|c|c|c|c|}\hline	799	830	\begin{table}[h!tb]
$n$ & Time (MAX/500) & Time (MAX/1000) & Time (MAX/1500) \\\hline\hline	800	831	\caption{Time needed to solve the quadratic program with Gurobi}
1 & 0.1~s & 0.1~s & 0.3~s \\	801	832	\label{tbl:area_time}
2 & 1.1~s & 2.2~s & 12~s \\	802	833	\centering
3 & 17~s & 137~s ($\approx$ 2~min) & 275~s ($\approx$ 4~min) \\	803	834	\begin{tabular}{|c|c|c|c|}\hline
4 & 52~s & 5448~s ($\approx$ 90~min) & 5505~s ($\approx$ 17~h) \\	804	835	$n$ & Time (MAX/500) & Time (MAX/1000) & Time (MAX/1500) \\\hline\hline
5 & 286~s ($\approx$ 4~min) & 4119~s ($\approx$ 68~min) & 235479~s ($\approx$ 3~days) \\\hline	805	836	1 & 0.1~s & 0.1~s & 0.3~s \\
\end{tabular}	806	837	2 & 1.1~s & 2.2~s & 12~s \\
\end{table}	807	838	3 & 17~s & 137~s ($\approx$ 2~min) & 275~s ($\approx$ 4~min) \\
	808	839	4 & 52~s & 5448~s ($\approx$ 90~min) & 5505~s ($\approx$ 17~h) \\
As expected, the computation time seems to rise exponentially with the number of stages. % TODO: exponential?	809	840	5 & 286~s ($\approx$ 4~min) & 4119~s ($\approx$ 68~min) & 235479~s ($\approx$ 3~days) \\\hline
When the area is limited, the design exploration space is more limited and the solver is able to	810	841	\end{tabular}
find an optimal solution faster.	811	842	\end{table}
	812	843
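The exponential trend suggested by Table~\ref{tbl:area_time} can be checked with a rough log-linear fit. The following Python sketch is only an illustration and is not part of the filter solver software: it uses the five MAX/500 timings copied from the table, and the helper name \texttt{log\_linear\_fit} is hypothetical. With only five data points, the resulting growth factor is indicative at best.

```python
import math

# Solve times in seconds, copied from the MAX/500 column of the timing table.
stages = [1, 2, 3, 4, 5]
times_max500 = [0.1, 1.1, 17.0, 52.0, 286.0]


def log_linear_fit(xs, ys):
    """Least-squares fit of ln(y) = a + b * x. Returns (a, b).

    Hypothetical helper for illustration only.
    """
    logs = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_log = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (l - mean_log) for x, l in zip(xs, logs))
    b = sxy / sxx
    a = mean_log - b * mean_x
    return a, b


a, b = log_linear_fit(stages, times_max500)

# If time ~ exp(a) * exp(b)^n, then exp(b) is the multiplicative cost
# of adding one more stage to the cascade.
growth_per_stage = math.exp(b)
print(f"estimated growth factor per added stage: {growth_per_stage:.1f}")
```

A growth factor well above 1 that stays roughly constant across the range would be consistent with exponential scaling. The same fit can be repeated on the MAX/1000 and MAX/1500 columns by swapping the list of timings.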
\subsection{Minimizing resource occupation at fixed rejection}\label{sec:fixed_rej}	813	844	As expected, the computation time seems to rise exponentially with the number of stages. % TODO: exponential?
	814	845	When the area budget is tighter, the design space to explore is smaller and the solver is able to
This section presents the results of the complementary quadratic program aimed at	815	846	find an optimal solution faster.
minimizing the area occupation for a targeted rejection level.	816	847
	817	848	\subsection{Minimizing resource occupation at fixed rejection}\label{sec:fixed_rej}
The experimental setup is composed of four cases. The raw input is the same	818	849
as in the previous section, from a PRN generator, which fixes the input data size $\Pi^I$.	819	850	This section presents the results of the complementary quadratic program aimed at
Then the targeted rejection $\mathcal{R}$ has been fixed to either 40, 60, 80 or 100~dB.	820	851	minimizing the area occupation for a targeted rejection level.
Hence, the four cases have been named: MIN/40, MIN/60, MIN/80 and MIN/100.	821	852
The number of configurations $p$ is the same as in the previous section.	822	853	The experimental setup is composed of four cases. The raw input is the same
	823	854	as in the previous section, from a PRN generator, which fixes the input data size $\Pi^I$.
Table~\ref{tbl:gurobi_min_40} shows the results obtained by the filter solver for MIN/40.	824	855	Then the targeted rejection $\mathcal{R}$ has been fixed to either 40, 60, 80 or 100~dB.
Table~\ref{tbl:gurobi_min_60} shows the results obtained by the filter solver for MIN/60.	825	856	Hence, the four cases have been named: MIN/40, MIN/60, MIN/80 and MIN/100.
Table~\ref{tbl:gurobi_min_80} shows the results obtained by the filter solver for MIN/80.	826	857	The number of configurations $p$ is the same as in the previous section.
Table~\ref{tbl:gurobi_min_100} shows the results obtained by the filter solver for MIN/100.	827	858
	828	859	Table~\ref{tbl:gurobi_min_40} shows the results obtained by the filter solver for MIN/40.
\renewcommand{\arraystretch}{1.4}	829	860	Table~\ref{tbl:gurobi_min_60} shows the results obtained by the filter solver for MIN/60.
	830	861	Table~\ref{tbl:gurobi_min_80} shows the results obtained by the filter solver for MIN/80.
\begin{table}[h!tb]	831	862	Table~\ref{tbl:gurobi_min_100} shows the results obtained by the filter solver for MIN/100.
\caption{Configurations $(C_i, \pi_i^C, \pi_i^S)$, rejections and areas (in arbitrary units) for MIN/40}	832	863
\label{tbl:gurobi_min_40}	833	864	\renewcommand{\arraystretch}{1.4}
\centering	834	865
{\scalefont{0.77}	835	866	\begin{table}[h!tb]
\begin{tabular}{|c|ccccc|c|c|}	836	867	\caption{Configurations $(C_i, \pi_i^C, \pi_i^S)$, rejections and areas (in arbitrary units) for MIN/40}
\hline	837	868	\label{tbl:gurobi_min_40}
$n$ & $i = 1$ & $i = 2$ & $i = 3$ & $i = 4$ & $i = 5$ & Rejection & Area \\	838	869	\centering
\hline	839	870	{\scalefont{0.77}
1 & (27, 8, 0) & - & - & - & - & 41~dB & 648 \\	840	871	\begin{tabular}{|c|ccccc|c|c|}
2 & (3, 2, 14) & (19, 7, 0) & - & - & - & 40~dB & 263 \\	841	872	\hline
3 & (3, 3, 15) & (11, 5, 0) & (3, 3, 0) & - & - & 41~dB & 192 \\	842	873	$n$ & $i = 1$ & $i = 2$ & $i = 3$ & $i = 4$ & $i = 5$ & Rejection & Area \\
4 & (3, 3, 15) & (3, 3, 0) & (3, 3, 0) & (3, 3, 0) & - & 42~dB & 147 \\	843	874	\hline
\hline	844	875	1 & (27, 8, 0) & - & - & - & - & 41~dB & 648 \\
\end{tabular}	845	876	2 & (3, 2, 14) & (19, 7, 0) & - & - & - & 40~dB & 263 \\
}	846	877	3 & (3, 3, 15) & (11, 5, 0) & (3, 3, 0) & - & - & 41~dB & 192 \\
\end{table}	847	878	4 & (3, 3, 15) & (3, 3, 0) & (3, 3, 0) & (3, 3, 0) & - & 42~dB & 147 \\
	848	879	\hline
\begin{table}[h!tb]	849	880	\end{tabular}
\caption{Configurations $(C_i, \pi_i^C, \pi_i^S)$, rejections and areas (in arbitrary units) for MIN/60}	850	881	}
\label{tbl:gurobi_min_60}	851	882	\end{table}
\centering	852	883
{\scalefont{0.77}	853	884	\begin{table}[h!tb]
\begin{tabular}{|c|ccccc|c|c|}	854	885	\caption{Configurations $(C_i, \pi_i^C, \pi_i^S)$, rejections and areas (in arbitrary units) for MIN/60}
\hline	855	886	\label{tbl:gurobi_min_60}
$n$ & $i = 1$ & $i = 2$ & $i = 3$ & $i = 4$ & $i = 5$ & Rejection & Area \\	856	887	\centering
\hline	857	888	{\scalefont{0.77}
1 & (39, 13, 0) & - & - & - & - & 60~dB & 1131 \\	858	889	\begin{tabular}{|c|ccccc|c|c|}
2 & (3, 3, 15) & (35, 10, 0) & - & - & - & 60~dB & 547 \\	859	890	\hline
3 & (3, 3, 15) & (27, 8, 0) & (3, 3, 0) & - & - & 62~dB & 426 \\	860	891	$n$ & $i = 1$ & $i = 2$ & $i = 3$ & $i = 4$ & $i = 5$ & Rejection & Area \\
4 & (3, 2, 14) & (11, 5, 1) & (11, 5, 0) & (3, 3, 0) & - & 60~dB & 344 \\	861	892	\hline
5 & (3, 2, 14) & (3, 3, 1) & (3, 3, 0) & (3, 3, 0) & (3, 3, 0) & 60~dB & 279 \\	862	893	1 & (39, 13, 0) & - & - & - & - & 60~dB & 1131 \\
\hline	863	894	2 & (3, 3, 15) & (35, 10, 0) & - & - & - & 60~dB & 547 \\
\end{tabular}	864	895	3 & (3, 3, 15) & (27, 8, 0) & (3, 3, 0) & - & - & 62~dB & 426 \\
}	865	896	4 & (3, 2, 14) & (11, 5, 1) & (11, 5, 0) & (3, 3, 0) & - & 60~dB & 344 \\
\end{table}	866	897	5 & (3, 2, 14) & (3, 3, 1) & (3, 3, 0) & (3, 3, 0) & (3, 3, 0) & 60~dB & 279 \\
	867	898	\hline
\begin{table}[h!tb]	868	899	\end{tabular}
\caption{Configurations $(C_i, \pi_i^C, \pi_i^S)$, rejections and areas (in arbitrary units) for MIN/80}	869	900	}
\label{tbl:gurobi_min_80}	870	901	\end{table}
\centering	871	902
{\scalefont{0.77}	872	903	\begin{table}[h!tb]
\begin{tabular}{|c|ccccc|c|c|}	873	904	\caption{Configurations $(C_i, \pi_i^C, \pi_i^S)$, rejections and areas (in arbitrary units) for MIN/80}
\hline	874	905	\label{tbl:gurobi_min_80}
$n$ & $i = 1$ & $i = 2$ & $i = 3$ & $i = 4$ & $i = 5$ & Rejection & Area \\	875	906	\centering
\hline	876	907	{\scalefont{0.77}
1 & (55, 16, 0) & - & - & - & - & 81~dB & 1760 \\	877	908	\begin{tabular}{|c|ccccc|c|c|}
2 & (3, 3, 15) & (47, 14, 0) & - & - & - & 80~dB & 903 \\	878	909	\hline
3 & (3, 3, 15) & (23, 9, 0) & (19, 7, 0) & - & - & 80~dB & 698 \\	879	910	$n$ & $i = 1$ & $i = 2$ & $i = 3$ & $i = 4$ & $i = 5$ & Rejection & Area \\
4 & (3, 3, 15) & (27, 9, 0) & (7, 7, 4) & (3, 3, 0) & - & 80~dB & 605 \\	880	911	\hline
5 & (3, 2, 14) & (27, 8, 0) & (3, 3, 1) & (3, 3, 0) & (3, 3, 0) & 81~dB & 534 \\	881	912	1 & (55, 16, 0) & - & - & - & - & 81~dB & 1760 \\
\hline	882	913	2 & (3, 3, 15) & (47, 14, 0) & - & - & - & 80~dB & 903 \\
\end{tabular}	883	914	3 & (3, 3, 15) & (23, 9, 0) & (19, 7, 0) & - & - & 80~dB & 698 \\
}	884	915	4 & (3, 3, 15) & (27, 9, 0) & (7, 7, 4) & (3, 3, 0) & - & 80~dB & 605 \\
\end{table}	885	916	5 & (3, 2, 14) & (27, 8, 0) & (3, 3, 1) & (3, 3, 0) & (3, 3, 0) & 81~dB & 534 \\
	886	917	\hline
\begin{table}[h!tb]	887	918	\end{tabular}
\caption{Configurations $(C_i, \pi_i^C, \pi_i^S)$, rejections and areas (in arbitrary units) for MIN/100}	888	919	}
\label{tbl:gurobi_min_100}	889	920	\end{table}
\centering	890	921
{\scalefont{0.77}	891	922	\begin{table}[h!tb]
\begin{tabular}{|c|ccccc|c|c|}	892	923	\caption{Configurations $(C_i, \pi_i^C, \pi_i^S)$, rejections and areas (in arbitrary units) for MIN/100}
\hline	893	924	\label{tbl:gurobi_min_100}
$n$ & $i = 1$ & $i = 2$ & $i = 3$ & $i = 4$ & $i = 5$ & Rejection & Area \\	894	925	\centering
\hline	895	926	{\scalefont{0.77}
1 & - & - & - & - & - & - & - \\	896	927	\begin{tabular}{|c|ccccc|c|c|}
2 & (15, 7, 17) & (51, 14, 0) & - & - & - & 100~dB & 1365 \\	897	928	\hline
3 & (3, 3, 15) & (27, 9, 0) & (27, 9, 0) & - & - & 100~dB & 1002 \\	898	929	$n$ & $i = 1$ & $i = 2$ & $i = 3$ & $i = 4$ & $i = 5$ & Rejection & Area \\
4 & (3, 3, 15) & (31, 9, 0) & (19, 7, 0) & (3, 3, 0) & - & 101~dB & 909 \\	899	930	\hline
5 & (3, 3, 15) & (23, 8, 1) & (19, 7, 0) & (3, 3, 0) & (3, 3, 0) & 101~dB & 810 \\	900	931	1 & - & - & - & - & - & - & - \\
\hline	901	932	2 & (15, 7, 17) & (51, 14, 0) & - & - & - & 100~dB & 1365 \\
\end{tabular}	902	933	3 & (3, 3, 15) & (27, 9, 0) & (27, 9, 0) & - & - & 100~dB & 1002 \\
}	903	934	4 & (3, 3, 15) & (31, 9, 0) & (19, 7, 0) & (3, 3, 0) & - & 101~dB & 909 \\
\end{table}	904	935	5 & (3, 3, 15) & (23, 8, 1) & (19, 7, 0) & (3, 3, 0) & (3, 3, 0) & 101~dB & 810 \\
\renewcommand{\arraystretch}{1}	905	936	\hline
	906	937	\end{tabular}
From these tables, we can first state that almost all configurations reach the targeted rejection	907	938	}
level or even better thanks to our underestimate of the cascade rejection as the sum of the	908	939	\end{table}
individual filter rejections. The only exception is for the monolithic case ($n = 1$) in	909	940	\renewcommand{\arraystretch}{1}
MIN/100: no solution is found for a single monolithic filter to reach a 100~dB rejection.	910	941
Furthermore, the area of the monolithic filter is about twice that of the two-filter cascade	911	942	From these tables, we can first state that almost all configurations reach the targeted rejection
(1131 and 1760 arbitrary units vs. 547 and 903 arbitrary units for 60 and 80~dB rejection	912	943	level or even better thanks to our underestimate of the cascade rejection as the sum of the
respectively). More generally, the more filters are cascaded, the lower the occupied area.	913	944	individual filter rejections. The only exception is for the monolithic case ($n = 1$) in
	914	945	MIN/100: no solution is found for a single monolithic filter to reach a 100~dB rejection.
As in the previous section, the solver always chooses a small filter as the first	915	946	Furthermore, the area of the monolithic filter is about twice that of the two-filter cascade
filter stage and the second one is often the biggest filter. This choice can be explained	916	947	(1131 and 1760 arbitrary units vs. 547 and 903 arbitrary units for 60 and 80~dB rejection
as in the previous section, with the solver using just enough bits not to degrade the input	917	948	respectively). More generally, the more filters are cascaded, the lower the occupied area.
signal and in the second filter selecting a better filter to improve rejection without	918	949
having too many bits in the output data.	919	950	As in the previous section, the solver always chooses a small filter as the first
	920	951	filter stage and the second one is often the biggest filter. This choice can be explained
For the specific case of MIN/40 for $n = 5$, the solver has determined that the optimal	921	952	as in the previous section, with the solver using just enough bits not to degrade the input
number of filters is 4, so it did not choose any configuration for the last filter. Hence this	922	953	signal and in the second filter selecting a better filter to improve rejection without
solution is equivalent to the result for $n = 4$.	923	954	having too many bits in the output data.
	924	955
The following graphs present the rejection for real data on the FPGA. In all the following	925	956	For the specific case of MIN/40 for $n = 5$, the solver has determined that the optimal
figures, the solid line represents the actual rejection of the filtered	926	957	number of filters is 4, so it did not choose any configuration for the last filter. Hence this
data on the FPGA as measured experimentally and the dashed line is the noise level	927	958	solution is equivalent to the result for $n = 4$.
given by the quadratic solver.	928	959
	929	960	The following graphs present the rejection for real data on the FPGA. In all the following
Figure~\ref{fig:min_40} shows the rejection of the different configurations in the case of MIN/40.	930	961	figures, the solid line represents the actual rejection of the filtered
Figure~\ref{fig:min_60} shows the rejection of the different configurations in the case of MIN/60.	931	962	data on the FPGA as measured experimentally and the dashed line is the noise level
Figure~\ref{fig:min_80} shows the rejection of the different configurations in the case of MIN/80.	932	963	given by the quadratic solver.
Figure~\ref{fig:min_100} shows the rejection of the different configurations in the case of MIN/100.	933	964
	934	965	Figure~\ref{fig:min_40} shows the rejection of the different configurations in the case of MIN/40.
% \begin{figure}	935	966	Figure~\ref{fig:min_60} shows the rejection of the different configurations in the case of MIN/60.
% \centering	936	967	Figure~\ref{fig:min_80} shows the rejection of the different configurations in the case of MIN/80.
% \includegraphics[width=\linewidth]{images/min_40}	937	968	Figure~\ref{fig:min_100} shows the rejection of the different configurations in the case of MIN/100.
% \caption{Signal spectrum for MIN/40}	938	969
% \label{fig:min_40}	939	970	% \begin{figure}
% \end{figure}	940	971	% \centering
%	941	972	% \includegraphics[width=\linewidth]{images/min_40}
% \begin{figure}	942	973	% \caption{Signal spectrum for MIN/40}
% \centering	943	974	% \label{fig:min_40}
% \includegraphics[width=\linewidth]{images/min_60}	944	975	% \end{figure}
% \caption{Signal spectrum for MIN/60}	945	976	%
% \label{fig:min_60}	946	977	% \begin{figure}
% \end{figure}	947	978	% \centering
%	948	979	% \includegraphics[width=\linewidth]{images/min_60}
% \begin{figure}	949	980	% \caption{Signal spectrum for MIN/60}
% \centering	950	981	% \label{fig:min_60}
% \includegraphics[width=\linewidth]{images/min_80}	951	982	% \end{figure}
% \caption{Signal spectrum for MIN/80}	952	983	%
% \label{fig:min_80}	953	984	% \begin{figure}
% \end{figure}	954	985	% \centering
%	955	986	% \includegraphics[width=\linewidth]{images/min_80}
% \begin{figure}	956	987	% \caption{Signal spectrum for MIN/80}
% \centering	957	988	% \label{fig:min_80}
% \includegraphics[width=\linewidth]{images/min_100}	958	989	% \end{figure}
% \caption{Signal spectrum for MIN/100}	959	990	%
% \label{fig:min_100}	960	991	% \begin{figure}
% \end{figure}	961	992	% \centering
	962	993	% \includegraphics[width=\linewidth]{images/min_100}
% r2.14 et r2.15 et r2.16	963	994	% \caption{Signal spectrum for MIN/100}
\begin{figure}	964	995	% \label{fig:min_100}
\centering	965	996	% \end{figure}
\begin{subfigure}{\linewidth}	966	997
\includegraphics[width=\linewidth]{images/min_40}	967	998	% r2.14 et r2.15 et r2.16
\caption{Signal spectrum for MIN/40}	968	999	\begin{figure}
\label{fig:min_40}	969	1000	\centering
\end{subfigure}	970	1001	\begin{subfigure}{\linewidth}
	971	1002	\includegraphics[width=.91\linewidth]{images/min_40}
\begin{subfigure}{\linewidth}	972	1003	\caption{\color{red}Filter transfer functions for varying number of cascaded filters solving
\includegraphics[width=\linewidth]{images/min_60}	973	1004	the MIN/40 problem of minimizing resource allocation for reaching a 40~dB rejection.}
\caption{Signal spectrum for MIN/60}	974	1005	\label{fig:min_40}
\label{fig:min_60}	975	1006	\end{subfigure}
\end{subfigure}	976	1007
	977	1008	\begin{subfigure}{\linewidth}
\begin{subfigure}{\linewidth}	978	1009	\includegraphics[width=.91\linewidth]{images/min_60}
\includegraphics[width=\linewidth]{images/min_80}	979	1010	\caption{\color{red}Filter transfer functions for varying number of cascaded filters solving
\caption{Signal spectrum for MIN/80}	980	1011	the MIN/60 problem of minimizing resource allocation for reaching a 60~dB rejection.}
\label{fig:min_80}	981	1012	\label{fig:min_60}
\end{subfigure}	982	1013	\end{subfigure}
	983	1014
\begin{subfigure}{\linewidth}	984	1015	\begin{subfigure}{\linewidth}
\includegraphics[width=\linewidth]{images/min_100}	985	1016	\includegraphics[width=.91\linewidth]{images/min_80}
\caption{Signal spectrum for MIN/100}	986	1017	\caption{\color{red}Filter transfer functions for varying number of cascaded filters solving
\label{fig:min_100}	987	1018	the MIN/80 problem of minimizing resource allocation for reaching a 80~dB rejection.}
\end{subfigure}	988	1019	\label{fig:min_80}
\caption{Signal spectrum of each experimental configurations MIN/40, MIN/60, MIN/80 and MIN/100}	989	1020	\end{subfigure}
\end{figure}	990	1021
	991	1022	\begin{subfigure}{\linewidth}
We observe that all rejections given by the quadratic solver are close to the experimentally	992	1023	\includegraphics[width=.91\linewidth]{images/min_100}
measured rejection. All curves show that the constraint to reach the target rejection is	993	1024	\caption{\color{red}Filter transfer functions for varying number of cascaded filters solving
respected with both monolithic (except in MIN/100, which has no monolithic solution) and cascaded filters.	994	1025	the MIN/100 problem of minimizing resource allocation for reaching a 100~dB rejection.}
	995	1026	\label{fig:min_100}
Table~\ref{tbl:resources_usage_comp} shows the resource usage in the case of MIN/40, MIN/60,	996	1027	\end{subfigure}
MIN/80 and MIN/100, \emph{i.e.} when the target rejection is fixed to 40, 60, 80 and 100~dB. We	997	1028	\caption{\color{red}Solutions for the MIN/40, MIN/60, MIN/80 and MIN/100 problems of reaching a
have taken care to extract solely the resources used by	998	1029	given rejection while minimizing resource allocation. The filter shape constraint (bandpass and
the FIR filters and remove additional processing blocks including FIFO and PL to	999	1030	bandstop) is shown as thick
PS communication.	1000	1031	horizontal lines on each chart.}
	1001	1032	\end{figure}
\renewcommand{\arraystretch}{1.2}	1002	1033
\begin{table}	1003	1034	We observe that all rejections given by the quadratic solver are close to the experimentally
\caption{Resource occupation. The last column refers to available resources on a Zynq-7010 as found on the Redpitaya.}	1004	1035	measured rejection. All curves show that the constraint to reach the target rejection is
\label{tbl:resources_usage_comp}	1005	1036	respected with both monolithic (except in MIN/100, which has no monolithic solution) and cascaded filters.
\centering	1006	1037
{\scalefont{0.90}	1007	1038	Table~\ref{tbl:resources_usage_comp} shows the resource usage in the case of MIN/40, MIN/60,
\begin{tabular}{|c|c|cccc|c|}	1008	1039	MIN/80 and MIN/100, \emph{i.e.} when the target rejection is fixed to 40, 60, 80 and 100~dB. We
\hline	1009	1040	have taken care to extract solely the resources used by
$n$ & & MIN/40 & MIN/60 & MIN/80 & MIN/100 & \emph{Zynq 7010} \\ \hline\hline	1010	1041	the FIR filters and remove additional processing blocks including FIFO and PL to
& LUT & 343 & 334 & 772 & - & \emph{17600} \\	1011	1042	PS communication.
1 & BRAM & 1 & 1 & 1 & - & \emph{120} \\	1012	1043
& DSP & 27 & 39 & 55 & - & \emph{80} \\ \hline	1013	1044	\renewcommand{\arraystretch}{1.2}
& LUT & 1252 & 2862 & 5099 & 640 & \emph{17600} \\	1014	1045	\begin{table}
2 & BRAM & 2 & 2 & 2 & 2 & \emph{120} \\	1015	1046	\caption{Resource occupation. The last column refers to available resources on a Zynq-7010 as found on the Redpitaya.}
& DSP & 0 & 0 & 0 & 66 & \emph{80} \\ \hline	1016	1047	\label{tbl:resources_usage_comp}
& LUT & 891 & 2148 & 2023 & 2448 & \emph{17600} \\	1017	1048	\centering
3 & BRAM & 3 & 3 & 3 & 3 & \emph{120} \\	1018	1049	{\scalefont{0.90}
& DSP & 0 & 0 & 19 & 27 & \emph{80} \\ \hline	1019	1050	\begin{tabular}{|c|c|cccc|c|}
& LUT & 662 & 1729 & 2451 & 2893 & \emph{17600} \\	1020	1051	\hline
4 & BRAM & 4 & 4 & 4 & 4 & \emph{120} \\	1021	1052	$n$ & & MIN/40 & MIN/60 & MIN/80 & MIN/100 & \emph{Zynq 7010} \\ \hline\hline
& DSP & 0 & 0 & 7 & 19 & \emph{80} \\ \hline	1022	1053	& LUT & 343 & 334 & 772 & - & \emph{17600} \\
& LUT & - & 1259 & 2602 & 2505 & \emph{17600} \\	1023	1054	1 & BRAM & 1 & 1 & 1 & - & \emph{120} \\

ifcs2018_journal_reponse.tex

%Minor Revision - TUFFC-09469-2019	1	1	%Minor Revision - TUFFC-09469-2019
%Transactions on Ultrasonics, Ferroelectrics, and Frequency	2	2	%Transactions on Ultrasonics, Ferroelectrics, and Frequency
%Control (July 23, 2019 9:29 PM)	3	3	%Control (July 23, 2019 9:29 PM)
%To: arthur.hugeat@femto-st.fr, julien.bernard@femto-st.fr,	4	4	%To: arthur.hugeat@femto-st.fr, julien.bernard@femto-st.fr,
%gwenhael.goavec@femto-st.fr, pyb2@femto-st.fr, pierre-yves.bourgeois@femto-st.fr,	5	5	%gwenhael.goavec@femto-st.fr, pyb2@femto-st.fr, pierre-yves.bourgeois@femto-st.fr,
%jmfriedt@femto-st.fr	6	6	%jmfriedt@femto-st.fr
%CC: giorgio.santarelli@institutoptique.fr, lewin@ece.drexel.edu	7	7	%CC: giorgio.santarelli@institutoptique.fr, lewin@ece.drexel.edu
%	8	8	%
%Dear Mr. Arthur HUGEAT	9	9	%Dear Mr. Arthur HUGEAT
%	10	10	%
%Congratulations! Your manuscript	11	11	%Congratulations! Your manuscript
%	12	12	%
%MANUSCRIPT NO. TUFFC-09469-2019	13	13	%MANUSCRIPT NO. TUFFC-09469-2019
%MANUSCRIPT TYPE: Papers	14	14	%MANUSCRIPT TYPE: Papers
%TITLE: Filter optimization for real time digital processing of radiofrequency	15	15	%TITLE: Filter optimization for real time digital processing of radiofrequency
%signals: application to oscillator metrology	16	16	%signals: application to oscillator metrology
%AUTHOR(S): HUGEAT, Arthur; BERNARD, Julien; Goavec-Mérou, Gwenhaël; Bourgeois,	17	17	%AUTHOR(S): HUGEAT, Arthur; BERNARD, Julien; Goavec-Mérou, Gwenhaël; Bourgeois,
%Pierre-Yves; Friedt, Jean-Michel	18	18	%Pierre-Yves; Friedt, Jean-Michel
%	19	19	%
%has been reviewed and it has been suggested that it be accepted for publication	20	20	%has been reviewed and it has been suggested that it be accepted for publication
%after minor revisions. In your revision, you must respond to the reviewer’s	21	21	%after minor revisions. In your revision, you must respond to the reviewer’s
%comments at the end of this e-mail or attached.	22	22	%comments at the end of this e-mail or attached.
%	23	23	%
%Your revised manuscript must be submitted within the next THREE WEEKS. If you	24	24	%Your revised manuscript must be submitted within the next THREE WEEKS. If you
%are not able to submit your manuscript in this time frame, you must contact the	25	25	%are not able to submit your manuscript in this time frame, you must contact the
%Editor in Chief (Peter Lewin, lewinpa@drexel.edu).	26	26	%Editor in Chief (Peter Lewin, lewinpa@drexel.edu).
%	27	27	%
%Please resubmit your revised manuscript to the Transactions on Ultrasonics,	28	28	%Please resubmit your revised manuscript to the Transactions on Ultrasonics,
%Ferroelectrics, and Frequency Control Manuscript Central website at	29	29	%Ferroelectrics, and Frequency Control Manuscript Central website at
%http://mc.manuscriptcentral.com/tuffc-ieee. From the “Author Center” select	30	30	%http://mc.manuscriptcentral.com/tuffc-ieee. From the “Author Center” select
%“Manuscripts with Decisions” and under the appropriate manuscript ID select	31	31	%“Manuscripts with Decisions” and under the appropriate manuscript ID select
%“create a revision”.	32	32	%“create a revision”.
%	33	33	%
%To expedite the review of your resubmission:	34	34	%To expedite the review of your resubmission:
%	35	35	%
%(1) Include or attach a point by point response to reviewer’s comments and	36	36	%(1) Include or attach a point by point response to reviewer’s comments and
%detail all changes made in your manuscript under “Response to Decision Letter”.	37	37	%detail all changes made in your manuscript under “Response to Decision Letter”.
%Failure to address reviewers comments can still lead to a rejection of your	38	38	%Failure to address reviewers comments can still lead to a rejection of your
%manuscript.	39	39	%manuscript.
%(2) Submit a PDF of the revised manuscript using the “Formatted (Double Column)	40	40	%(2) Submit a PDF of the revised manuscript using the “Formatted (Double Column)
%Main File - PDF Document Only” file type with all changes highlighted in yellow	41	41	%Main File - PDF Document Only” file type with all changes highlighted in yellow
%under “File Upload”.	42	42	%under “File Upload”.
%(3) Original TeX, LaTeX, or Microsoft Word file of the final manuscript as	43	43	%(3) Original TeX, LaTeX, or Microsoft Word file of the final manuscript as
%Supporting Document.	44	44	%Supporting Document.
%(4) High quality source files of your figures in Word, Tiff, Postscript,	45	45	%(4) High quality source files of your figures in Word, Tiff, Postscript,
%EPS, Excel or Power Point (if figures are not already embedded in your source	46	46	%EPS, Excel or Power Point (if figures are not already embedded in your source
%file above) as Supporting Document.	47	47	%file above) as Supporting Document.
%(5) Author photos and biographies (papers only) as Supporting Document.	48	48	%(5) Author photos and biographies (papers only) as Supporting Document.
%(6) Graphical Abstract to accompany your text abstract on IEEE Xplore (image,	49	49	%(6) Graphical Abstract to accompany your text abstract on IEEE Xplore (image,
%animation, movie, or audio clip) uploaded as Multimedia.	50	50	%animation, movie, or audio clip) uploaded as Multimedia.
%	51	51	%
%*Please make sure that all final files have unique file names in order for	52	52	%*Please make sure that all final files have unique file names in order for
%them to be processed correctly by IEEE*	53	53	%them to be processed correctly by IEEE*
%Please note that a PDF is NOT sufficient for publication, the PDF is used	54	54	%Please note that a PDF is NOT sufficient for publication, the PDF is used
%for review.	55	55	%for review.
%	56	56	%
%During the resubmission process if you do not see a confirmation screen and	57	57	%During the resubmission process if you do not see a confirmation screen and
%receive a confirmation e-mail, your revised manuscript was not transmitted	58	58	%receive a confirmation e-mail, your revised manuscript was not transmitted
%to us and we will not be able to continue to process your manuscript.	59	59	%to us and we will not be able to continue to process your manuscript.
%	60	60	%
%Please refer to the policies regarding the voluntary page charges and	61	61	%Please refer to the policies regarding the voluntary page charges and
%mandatory page charges in the "Guideline for Authors" at	62	62	%mandatory page charges in the "Guideline for Authors" at
%http://ieee-uffc.org/publications/transactions-on-uffc/information-for-authors	63	63	%http://ieee-uffc.org/publications/transactions-on-uffc/information-for-authors
%Note over-length charge of US$175 per page is applied for published pages in	64	64	%Note over-length charge of US$175 per page is applied for published pages in
%excess of 8 pages.	65	65	%excess of 8 pages.
%	66	66	%
%Sincerely,	67	67	%Sincerely,
%	68	68	%
%Giorgio Santarelli	69	69	%Giorgio Santarelli
%Associate Editor in Chief	70	70	%Associate Editor in Chief
%Transactions on Ultrasonics, Ferroelectrics, and Frequency Control	71	71	%Transactions on Ultrasonics, Ferroelectrics, and Frequency Control
%	72	72	%
%****************************************************	73	73	%****************************************************
%REVIEWERS' COMMENTS:	74	74	%REVIEWERS' COMMENTS:
	75	75
\documentclass[a4paper]{article}	76	76	\documentclass[a4paper]{article}
\usepackage{fullpage,graphicx,amsmath, subcaption}	77	77	\usepackage{fullpage,graphicx,amsmath, subcaption}
\begin{document}	78	78	\begin{document}
{\bf Reviewer: 1}	79	79	{\bf Reviewer: 1}
	80	80
%Comments to the Author	81	81	%Comments to the Author
%In general, the language/grammar is adequate.	82	82	%In general, the language/grammar is adequate.
	83	83
{\bf	84	84	{\bf
On page 2, "...allowing to save processing resource..." could be improved. % r1.1 - fait	85	85	On page 2, "...allowing to save processing resource..." could be improved. % r1.1 - fait
}	86	86	}
	87	87
The sentence was split and now reads ``number of coefficients irrelevant: processing	88	88	The sentence was split and now reads ``number of coefficients irrelevant: processing
resources are hence saved by shrinking the filter length.''	89	89	resources are hence saved by shrinking the filter length.''
	90	90
{\bf	91	91	{\bf
On page 2, "... or thanks at a radiofrequency-grade..." isn't at all clear what % r1.2 - fait	92	92	On page 2, "... or thanks at a radiofrequency-grade..." isn't at all clear what % r1.2 - fait
the author meant.}	93	93	the author meant.}
	94	94
Grammatical error: this sentence now reads ``or by sampling a wideband (125~MS/s)	95	95	Grammatical error: this sentence now reads ``or by sampling a wideband (125~MS/s)
Analog to Digital Converter (ADC) loaded by a 50~$\Omega$ resistor.''	96	96	Analog to Digital Converter (ADC) loaded by a 50~$\Omega$ resistor.''
	97	97
{\bf	98	98	{\bf
On page 2, the whole paragraph "The first step of our approach is to model..." % r1.3 - fait	99	99	On page 2, the whole paragraph "The first step of our approach is to model..." % r1.3 - fait
could be improved.	100	100	could be improved.
}	101	101	}
	102	102
Indeed, this paragraph has been rewritten and now reads as\\	103	103	Indeed, this paragraph has been rewritten and now reads as\\
``The first step of our approach is to model the DSP chain. Since we aim at only optimizing	104	104	``The first step of our approach is to model the DSP chain. Since we aim at only optimizing
the filtering part of the signal processing chain, we have not included the PRN generator or the	105	105	the filtering part of the signal processing chain, we have not included the PRN generator or the
ADC in the model: the input data size and rate are considered fixed and defined by the hardware.	106	106	ADC in the model: the input data size and rate are considered fixed and defined by the hardware.
The filtering can be done in two ways, either by considering a single monolithic FIR filter	107	107	The filtering can be done in two ways, either by considering a single monolithic FIR filter
requiring many coefficients to reach the targeted noise rejection ratio, or by	108	108	requiring many coefficients to reach the targeted noise rejection ratio, or by
cascading multiple FIR filters, each with fewer coefficients than found in the monolithic filter.	109	109	cascading multiple FIR filters, each with fewer coefficients than found in the monolithic filter.
''	110	110	''
	111	111
{\bf	112	112	{\bf
I appreciate that the authors attempted and document two optimizations: that % r1.4 - fait	113	113	I appreciate that the authors attempted and document two optimizations: that % r1.4 - fait
of maximum rejection ratio at fixed silicon area, as well as minimum silicon	114	114	of maximum rejection ratio at fixed silicon area, as well as minimum silicon
area for a fixed minimum rejection ratio. For non-experts, it might be very	115	115	area for a fixed minimum rejection ratio. For non-experts, it might be very
useful to compare the results of both optimization paths to the performance and	116	116	useful to compare the results of both optimization paths to the performance and
resource-utilization of generic low-pass filter gateware offered by device	117	117	resource-utilization of generic low-pass filter gateware offered by device
manufacturers. I appreciate also that the authors have presented source code	118	118	manufacturers. I appreciate also that the authors have presented source code
for examination online.	119	119	for examination online.
}	120	120	}
	121	121
To compare the performance of our FIR filters with that of the device	122	122	To compare the performance of our FIR filters with that of the device
manufacturers' generic filters, we have added a paragraph and a table at the	123	123	manufacturers' generic filters, we have added a paragraph and a table at the
end of the experiments section. We compare the resource consumption for the same	124	124	end of the experiments section. We compare the resource consumption for the same
set of FIR coefficients.	125	125	set of FIR coefficients.
	126	126
{\bf	127	127	{\bf
Reviewer: 2	128	128	Reviewer: 2
}	129	129	}
	130	130
%Comments to the Author	131	131	%Comments to the Author
%In the Manuscript, the Authors describe an optimization methodology for filter	132	132	%In the Manuscript, the Authors describe an optimization methodology for filter
%design to be used in phase noise metrology. The methodology is general and can	133	133	%design to be used in phase noise metrology. The methodology is general and can
%be used for many aspects of the processing chain. In the Manuscript, the Authors	134	134	%be used for many aspects of the processing chain. In the Manuscript, the Authors
%focus on filtering and shifting while the other aspects, in particular decimation,	135	135	%focus on filtering and shifting while the other aspects, in particular decimation,
%will be considered in a future work. The optimization problem is modelled	136	136	%will be considered in a future work. The optimization problem is modelled
%theoretically and then solved by means of a commercial software. The solutions	137	137	%theoretically and then solved by means of a commercial software. The solutions
%are tested experimentally on the Redpitaya platform with synthetic and real	138	138	%are tested experimentally on the Redpitaya platform with synthetic and real
%white noises. Two cases are considered as a function of the number of filters:	139	139	%white noises. Two cases are considered as a function of the number of filters:
%maximum rejection given a fixed amount of resources and minimum resource	140	140	%maximum rejection given a fixed amount of resources and minimum resource
%utilization given a fixed amount of rejection.	141	141	%utilization given a fixed amount of rejection.
%The Authors find that filtering improves significantly when the number of	142	142	%The Authors find that filtering improves significantly when the number of
%filters increases.	143	143	%filters increases.
%A lot of work has been done in generalizing and automating the procedure so	144	144	%A lot of work has been done in generalizing and automating the procedure so
%that different approaches can be investigated quickly and efficiently. The	145	145	%that different approaches can be investigated quickly and efficiently. The
%results presented in the Manuscript seem to be just a case study based on	146	146	%results presented in the Manuscript seem to be just a case study based on
%the particular criterion chosen by the Authors. Different criteria, in	147	147	%the particular criterion chosen by the Authors. Different criteria, in
%general, could lead to different results and it is important to consider	148	148	%general, could lead to different results and it is important to consider
%carefully the criterion adopted by the Authors, in order to check if it	149	149	%carefully the criterion adopted by the Authors, in order to check if it
%is adequate to compare the performance of filters and if multi-stage	150	150	%is adequate to compare the performance of filters and if multi-stage
%filters are really superior than monolithic filters.	151	151	%filters are really superior than monolithic filters.
	152	152
{\bf	153	153	{\bf
By observing the results presented in fig. 10-16, it is clear that the % r2.1	154	154	By observing the results presented in fig. 10-16, it is clear that the % r2.1
performances of multi-stage filters are obtained at the expense of their	155	155	performances of multi-stage filters are obtained at the expense of their
selectivity and, in this sense, the filters presented in these figures	156	156	selectivity and, in this sense, the filters presented in these figures
are not equivalent. For example, in Fig. 14, at the limit of the pass band,	157	157	are not equivalent. For example, in Fig. 14, at the limit of the pass band,
the attenuation is almost 15 dB for n = 5, while it is not noticeable for	158	158	the attenuation is almost 15 dB for n = 5, while it is not noticeable for
n = 1.	159	159	n = 1.
}	160	160	}
	161	161
We have added to Figs.~10--16 (now Fig.~9(a)--(c)) the templates used to define	162	162	We have added to Figs.~10--16 (now Fig.~9(a)--(c)) the templates used to define
the passband and the stopband of the filter.	163	163	the passband and the stopband of the filter.
	164	164
% We are aware of this non equivalence but we think that difference is not due to	165	165	% We are aware of this non equivalence but we think that difference is not due to
% the cascaded filters but due to the definition of rejection criterion on the passband.	166	166	% the cascaded filters but due to the definition of rejection criterion on the passband.
% Indeed, in this article we have choose to take the summation of absolute values divide	167	167	% Indeed, in this article we have choose to take the summation of absolute values divide
% by the bandwidth but this criterion is maybe too permissive and when we cascade	168	168	% by the bandwidth but this criterion is maybe too permissive and when we cascade
% some filters this impact is more important.	169	169	% some filters this impact is more important.
%	170	170	%
% However if we change the passband	171	171	% However if we change the passband
% criterion by the summation of absolute value in passband, weighting given to the	172	172	% criterion by the summation of absolute value in passband, weighting given to the
% passband ripples are too strong and the solver are too restricted to provide	173	173	% passband ripples are too strong and the solver are too restricted to provide
% any interesting solution but the ripples in passband will be minimal. And if we take the maximum absolute value in	174	174	% any interesting solution but the ripples in passband will be minimal. And if we take the maximum absolute value in
% passband, the rejection evaluation are too close form the original criterion and	175	175	% passband, the rejection evaluation are too close form the original criterion and
% the result will not be improved.	176	176	% the result will not be improved.
%	177	177	%
% In this article, we will highlight the methodology instead of the filter conception.	178	178	% In this article, we will highlight the methodology instead of the filter conception.
% Even if our rejection criterion is not the best, our methodology was not impacted	179	179	% Even if our rejection criterion is not the best, our methodology was not impacted
% by this. So to improve the results, we can choose another criterion to be more	180	180	% by this. So to improve the results, we can choose another criterion to be more
% selective in passband but it is not the main objective of our article.	181	181	% selective in passband but it is not the main objective of our article.
		182
		183	We are aware of this non-equivalence, but limiting these ripples in the passband requires
		184	a stricter passband criterion. A strong constraint such as the sum of absolute values
		185	in the passband is too selective, because it considers every bin of the passband,
		186	whereas in the stopband we consider only the bin with the minimal rejection.
		187	Figure~\ref{fig:letter_sum_criterion} shows the results with this
		188	criterion for the case MAX/1000. With this criterion, the solver finds an optimal
		189	solution with only two filters, at the expense of resource consumption.
		190
		191
		192
		193	If we slightly relax the passband criterion by taking only the maximum absolute
		194	value, we penalize only the peak ripple in the passband. Figure~\ref{fig:letter_max_criterion}
		195	shows the results for the case MAX/1000. There is almost no difference from the
		196	results in the article: the only small changes are in the cases $i = 4$ and $i = 5$,
		197	which show minor differences in the choice of coefficients.
		198
		199	\begin{figure}[h!tb]
		200	\centering
		201	\begin{subfigure}{0.48\linewidth}
		202	\includegraphics[width=\linewidth]{images/letter_sum_criterion}
		203	\caption{Results for the case MAX/1000 with the sum of absolute values as the passband criterion}
		204	\label{fig:letter_sum_criterion}
		205	\end{subfigure}
		206	\begin{subfigure}{0.48\linewidth}
		207	\includegraphics[width=\linewidth]{images/letter_max_criterion}
		208	\caption{Results for the case MAX/1000 with the maximum absolute value as the passband criterion}
		209	\label{fig:letter_max_criterion}
		210	\end{subfigure}
		211	\end{figure}
		212
		213	Finally, weighting the maximum absolute value in the passband might be expected to improve the result.
		214	We arbitrarily weighted the maximum by a factor of 5. Even with this weighting, the solver
		215	chooses the same coefficient set.
		216
		217	To conclude, finding a better criterion to avoid ripples in the passband is difficult.
		218	In this article we focus on the methodology, so even if our criterion could
		219	be improved, our methodology remains the same and works independently of the rejection criterion.
	182	220
We are aware of this non-equivalence, but limiting these ripples in the passband requires	183	221	% %Perhaps rerun a series of simulations in which we impose a cutoff
a stricter passband criterion. A strong constraint such as the sum of absolute values	184	222	% %not between 40 and 60\% but between 50 and 60\%, to demonstrate that the tool adapts
in the passband is too selective, because it considers every bin of the passband,	185	223	% %to the criterion imposed on it, and that the less steep cutoff is not intrinsic
whereas in the stopband we consider only the bin with the minimal rejection.	186	224	% %to the cascade of filters.
Figure~\ref{fig:letter_sum_criterion} shows the results with this	187	225	% %AH: I am finishing the corrections and posting the revised article; in the meantime I will try to
criterion for the case MAX/1000. With this criterion, the solver finds an optimal	188	226	% %rerun experiments. If I manage to finish them in time, I will include them
solution with only two filters, at the expense of resource consumption.	189	227	%
	190	228	% spectral density of the passband
	191	229	% sum of absolute values / passband width (1/N) vs. max in the stopband
	192	230	%
If we slightly relax the passband criterion by taking only the maximum absolute	193	231	% JMF: he is not wrong, the cutoff is much less sharp with 5 filters than with 1. It was less
value, we penalize only the peak ripple in the passband. Figure~\ref{fig:letter_max_criterion}	194	232	% visible before averaging the transfer functions, but there are indeed about 15 dB
shows the results for the case MAX/1000. There is almost no difference from the	195	233	% when 5 filters are cascaded!
results in the article: the only small changes are in the cases $i = 4$ and $i = 5$,	196	234	%
which show minor differences in the choice of coefficients.	197	235	% State that the drop is not due to the cascade but to our rejection criterion
	198	236
\begin{figure}[h!tb]	199	237	{\bf
\centering	200	238	The reason is in the criterion that considers the average attenuation in % r2.2
\begin{subfigure}{0.48\linewidth}	201	239	the pass band. This criterion does not take into account the maximum attenuation
\includegraphics[width=\linewidth]{images/letter_sum_criterion}	202	240	in this region, which is a very important parameter for specifying a filter
\caption{Results for the case MAX/1000 with the sum of absolute values as the passband criterion}	203	241	and for evaluating its performance. For example, with this criterion, a
\label{fig:letter_sum_criterion}	204	242	filter with 0.1 dB of ripple is considered equivalent to a filter with
\end{subfigure}	205	243	10 dB of ripple. This point has a strong impact in the optimization process
\begin{subfigure}{0.48\linewidth}	206	244	and in the results that are obtained and has to be reconsidered.
\includegraphics[width=\linewidth]{images/letter_max_criterion}	207	245	}
\caption{Results for the case MAX/1000 with the maximum absolute value as the passband criterion}	208	246
\label{fig:letter_max_criterion}	209	247	See above: choosing a criterion is difficult and depends on the context. The main
\end{subfigure}	210	248	contribution of this paper is the methodology, not the criterion used to quantify the
		249	rejection.
\end{figure}	211	250
	212	251	% The manuscript erroneously stated that we considered the mean of the absolute
Finally, weighting the maximum absolute value in the passband might be expected to improve the result.	213	252	% value within the bandpass: the manuscript has now been corrected to properly state
We arbitrarily weighted the maximum by a factor of 5. Even with this weighting, the solver	214	253	% the selected criterion, namely the {\em sum} of the absolute value, so that any
chooses the same coefficient set.	215	254	% ripple in the bandpass will reduce the chances of a given filter set from being
	216	255	% selected. The manuscript now states ``Our criterion to compute the filter rejection considers
To conclude, finding a better criterion to avoid ripples in the passband is difficult.	217	256	% % r2.8 et r2.2 r2.3
In this article we focus on the methodology, so even if our criterion could	218	257	% the maximum magnitude within the stopband, to which the {sum of the absolute values
be improved, our methodology remains the same and works independently of the rejection criterion.	219	258	% within the passband is subtracted to avoid filters with excessive ripples}.''
	220	259
% %Perhaps rerun a series of simulations in which we impose a cutoff	221	260	{\bf
% %not between 40 and 60\% but between 50 and 60\%, to demonstrate that the tool adapts	222	261	I strongly suggest to re-run the analysis with a criterion that takes also % r2.3 -fait
% %to the criterion imposed on it, and that the less steep cutoff is not intrinsic	223	262	into account the maximum allowed attenuation in pass band, for example by
% %to the cascade of filters.	224	263	fixing its value to a typical one, as it has been done for the transition
% %AH: I am finishing the corrections and posting the revised article; in the meantime I will try to	225	264	bandwidth.
% %rerun experiments. If I manage to finish them in time, I will include them	226	265	}
%	227	266
% spectral density of the passband	228	267	See above: the absolute value within the passband will reject filters with
% sum of absolute values / passband width (1/N) vs. max in the stopband	229	268	excessive ripples, including excessive attenuation, within the passband.
%	230	269
% JMF: he is not wrong, the cutoff is much less sharp with 5 filters than with 1. It was less	231	270	% TODO: test max(stopband) - max(abs(passband))
% visible before averaging the transfer functions, but there are indeed about 15 dB	232	271
% when 5 filters are cascaded!	233	272	{\bf
%	234	273	In addition, I suggest to address the following points: % r2.4 - fait
% State that the drop is not due to the cascade but to our rejection criterion	235	274	- Page 1, line 50: the Authors state that IIR have shorter impulse response
	236	275	than FIR. This is not true in general. The sentence should be reconsidered.
{\bf	237	276	}
The reason is in the criterion that considers the average attenuation in % r2.2	238	277
the pass band. This criterion does not take into account the maximum attenuation	239	278	We have not stated that the IIR has a shorter impulse response but a shorter lag.
in this region, which is a very important parameter for specifying a filter	240	279	Indeed while a typical FIR filter will have 32 to 128~coefficients, few IIR filters
and for evaluating its performance. For example, with this criterion, a	241	280	have more than 5~coefficients. Hence, while a FIR requires 128 inputs before providing
filter with 0.1 dB of ripple is considered equivalent to a filter with	242	281	the first output, an IIR will start providing outputs only 5 time steps after the initial
10 dB of ripple. This point has a strong impact in the optimization process	243	282	input starts feeding the IIR. Hence, the issue we address here is lag and not impulse
and in the results that are obtained and has to be reconsidered.	244	283	response. We aimed at making this sentence clearer by stating that ``Since latency is not an issue
}	245	284	in a openloop phase noise characterization instrument, the large
	246	285	numbre of taps in the FIR, as opposed to the shorter Infinite Impulse Response (IIR) filter,
See above: choosing a criterion is difficult and depends on the context. The main	247	286	is not considered as an issue as would be in a closed loop system in which lag aims at being
contribution of this paper is the methodology, not the criterion used to quantify the	248	287	minimized to avoid oscillation conditions.''
rejection.	249	288
	250	289	{\bf
% The manuscript erroneously stated that we considered the mean of the absolute	251	290	- Fig. 4: the Author should motivate in the text why it has been chosen % r2.5 - fait
% value within the bandpass: the manuscript has now been corrected to properly state	252	291	this transition bandwidth and if it is a typical requirement for phase-noise
% the selected criterion, namely the {\em sum} of the absolute value, so that any	253	292	metrology.
% ripple in the bandpass will reduce the chances of a given filter set from being	254	293	}
% selected. The manuscript now states ``Our criterion to compute the filter rejection considers	255	294
% % r2.8 et r2.2 r2.3	256	295	The purpose of the paper is to demonstrate how a given filter shape can be achieved by
% the maximum magnitude within the stopband, to which the {sum of the absolute values	257	296	minimizing various resource criteria. Indeed the stopband and bandpass boundaries can
% within the passband is subtracted to avoid filters with excessive ripples}.''	258	297	be questioned: we have selected this filter shape as a typical anti-aliasing filter considering
	259	298	that the dataflow is to be halved. Hence, selecting a cutoff frequency of 40\% of the initial
{\bf	260	299	Nyquist frequency prevents noise from reaching baseband after decimating the dataflow by a
I strongly suggest to re-run the analysis with a criterion that takes also % r2.3 -fait	261	300	factor of 2. Such ideas are now stated explicitly in the text as ``Throughout this demonstration,
into account the maximum allowed attenuation in pass band, for example by	262	301	we arbitrarily set a bandpass of 40\% of the Nyquist frequency and a bandstop from 60\%
fixing its value to a typical one, as it has been done for the transition	263	302	of the Nyquist frequency to the end of the band, as would be typically selected to prevent
bandwidth.	264	303	aliasing before decimating the dataflow by 2. The method is however generalized to any filter
}	265	304	shape as long as it is defined from the initial modelling steps: Fig. \ref{fig:rejection_pyramid}
	266	305	as described below is indeed unique for each filter shape.''
See above: the absolute value within the passband will reject filters with	267	306
excessive ripples, including excessive attenuation, within the passband.	268	307	{\bf
	269	308	- The impact of the coefficient resolution is discussed. What about the % r2.6 - fait
% TODO: test max(stopband) - max(abs(passband))	270	309	resolution of the data stream? Is it fixed? If so, which value has been
	271	310	used in the analysis? If not, how is it changed with respect to the
{\bf	272	311	coefficient resolution?
In addition, I suggest to address the following points: % r2.4 - fait	273	312	}
- Page 1, line 50: the Authors state that IIR have shorter impulse response	274	313
than FIR. This is not true in general. The sentence should be reconsidered.	275	314	We have now stated in the beginning of the document that ``we have not included the PRN generator
}	276	315	or the ADC in the model: the input data size and rate are considered fixed and defined by the
	277	316	hardware.'' so indeed the input datastream resolution is considered as a given.
We have not stated that the IIR has a shorter impulse response but a shorter lag.	278	317
Indeed while a typical FIR filter will have 32 to 128~coefficients, few IIR filters	279	318	{\bf
have more than 5~coefficients. Hence, while a FIR requires 128 inputs before providing	280	319	- Page 3, line 47: the initial criterion can be omitted and, consequently, % r2.7 - fait
the first output, an IIR will start providing outputs only 5 time steps after the initial	281	320	Fig. 5 can be removed.
		321	}
		322
		323	Just add a sentence saying that the mean did not give good results
		324
		325	{\bf
input starts feeding the IIR. Hence, the issue we address here is lag and not impulse	282	326	- Page 3, line 55: ``maximum rejection'' is not compatible with fig. 4. % r2.8 - fait
response. We aimed at making this sentence clearer by stating that ``Since latency is not an issue	283	327	It should be ``minimum''
in a openloop phase noise characterization instrument, the large	284	328	}
numbre of taps in the FIR, as opposed to the shorter Infinite Impulse Response (IIR) filter,	285
is not considered as an issue as would be in a closed loop system in which lag aims at being	286
minimized to avoid oscillation conditions.''	287	329
	288	330	This typo has been corrected.
{\bf	289	331
- Fig. 4: the Author should motivate in the text why it has been chosen % r2.5 - fait	290	332	{\bf
this transition bandwidth and if it is a typical requirement for phase-noise	291	333	- Page e, line 55, second column: ``takin'' % r2.9 - fait
metrology.	292	334	- Page 3, line 58: ``pessimistic'' should be replaced with ``conservative'' % r2.10 - fait
}	293	335	- Page 4, line 17: ``meaning'' $\rightarrow$ ``this means'' % r2.11 - fait
	294	336	}
The purpose of the paper is to demonstrate how a given filter shape can be achieved by	295	337
minimizing various resource criteria. Indeed the stopband and bandpass boundaries can	296	338	All typos and grammatical errors have been corrected.
be questioned: we have selected this filter shape as a typical anti-aliasing filter considering	297	339
that the dataflow is to be halved. Hence, selecting a cutoff frequency of 40\% of the initial	298	340	{\bf
Nyquist frequency prevents noise from reaching baseband after decimating the dataflow by a	299	341	- Page 4, line 10: how $p$ is chosen? Which is the criterion used to choose % r2.12 - fait
factor of 2. Such ideas are now stated explicitly in the text as ``Throughout this demonstration,	300	342	these particular configurations? Are they chosen automatically?
we arbitrarily set a bandpass of 40\% of the Nyquist frequency and a bandstop from 60\%	301	343	}
of the Nyquist frequency to the end of the band, as would be typically selected to prevent	302	344	It is the number of coefficients and a reasonable size
aliasing before decimating the dataflow by 2. The method is however generalized to any filter	303	345	Truncation of the pyramid
shape as long as it is defined from the initial modelling steps: Fig. \ref{fig:rejection_pyramid}	304	346
as described below is indeed unique for each filter shape.''	305	347	See below: we have added a better description of $p$ during the transformation explanation.
	306	348	``we introduce $p$ FIR configurations.
{\bf	307	349	This variable must be defined by the user, it represent the number of different
- The impact of the coefficient resolution is discussed. What about the % r2.6 - fait	308	350	set of coefficients generated (for memory, we use \texttt{firls} and \texttt{fir1}
resolution of the data stream? Is it fixed? If so, which value has been	309	351	functions from GNU Octave)''
used in the analysis? If not, how is it changed with respect to the	310	352
coefficient resolution?	311	353	{\bf
}	312	354	- Page 4, line 31: how does the delta function transform model from non-linear % r2.13
	313	355	and non-quadratic to a quadratic?}
We have now stated in the beginning of the document that ``we have not included the PRN generator	314	356
or the ADC in the model: the input data size and rate are considered fixed and defined by the	315	357	The first model is non-quadratic, but once we introduce the $p$ configurations
hardware.'' so indeed the input datastream resolution is considered as a given.	316	358	we can estimate the function $F$ by computing
	317	359	the rejection for each configuration; the model then becomes quadratic because it still contains
{\bf	318	360	products of variables. With the definition of $\delta_{ij}$ we can
- Page 3, line 47: the initial criterion can be omitted and, consequently, % r2.7 - fait	319	361	replace each product of variables by a product with a binary variable, and
Fig. 5 can be removed.	320	362	such a product can be linearized as follows:\\
}	321	363	$y$ is a binary variable \\
	322	364	$x$ is a real variable bounded by $X^{max}$ \\
Just add a sentence saying that the mean did not give good results	323	365	\begin{equation*}
	324	366	m = x \times y \implies
{\bf	325	367	\left \{
- Page 3, line 55: ``maximum rejection'' is not compatible with fig. 4. % r2.8 - fait	326	368	\begin{split}
It should be ``minimum''	327	369	m & \geq 0 \\
}	328	370	m & \leq y \times X^{max} \\
	329	371	m & \leq x \\
This typo has been corrected.	330	372	m & \geq x - (1 - y) \times X^{max} \\
	331	373	\end{split}
{\bf	332	374	\right .
- Page e, line 55, second column: ``takin'' % r2.9 - fait	333	375	\end{equation*}
- Page 3, line 58: ``pessimistic'' should be replaced with ``conservative'' % r2.10 - fait	334	376	Gurobi performs this linearization, so we do not detail this step in order to keep the model
- Page 4, line 17: ``meaning'' $\rightarrow$ ``this means'' % r2.11 - fait	335	377	simple. However, to improve the explanation of the transformation we have rewritten the
}	336	378	paragraph ``This model is non-linear and even non-quadratic...''.
	337	379
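As a sanity check on the linearization above, the two possible values of $y$ can be examined separately. This is only a sketch, under the assumption (not stated explicitly above) that $0 \leq x \leq X^{max}$:
\begin{equation*}
\begin{split}
y = 0 &: \quad 0 \leq m \leq y \times X^{max} = 0 \implies m = 0 = x \times y \\
y = 1 &: \quad x - (1 - y) \times X^{max} = x \leq m \leq x \implies m = x = x \times y
\end{split}
\end{equation*}
In both cases the two remaining inequalities hold automatically, so the four constraints are satisfied only by $m = x \times y$.
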
All typos and grammatical errors have been corrected.	338	380	% JMF: a sentence explaining this will be needed; reading this response in the article,
	339	381	% I do not understand how it answers the question
{\bf	340	382	%
- Page 4, line 10: how $p$ is chosen? Which is the criterion used to choose % r2.12 - fait	341	383	% AH: I am writing the idea down in French; I will try to translate it as well as I can.
these particular configurations? Are they chosen automatically?	342	384	%
}	343	385	% The problem is not linear because we multiply variables
It is the number of coefficients and a reasonable size	344	386	% with each other. To remedy this, we consider that $\pi_{ij}^C$ and $C_{ij}$ become
Truncation of the pyramid	345	387	% constants. We therefore introduce the binary variable $\delta_{ij}$, which indicates
	346	388	% which filter is selected at each stage. Despite this, our program is still
See below: we have added a better description of $p$ during the transformation explanation.	347	389	% quadratic because constraint~\ref{eq:areadef2} still contains a product between
``we introduce $p$ FIR configurations.	348	390	% $\delta_{ij}$ and $\pi_i^-$. But since $\delta_{ij}$ is binary, it is possible
This variable must be defined by the user, it represent the number of different	349	391	% to linearize this product provided that $\pi_i^-$ can be bounded. In our
set of coefficients generated (for memory, we use \texttt{firls} and \texttt{fir1}	350	392	% case defining the bound is easy because $\pi_i^-$ represents a data size,
functions from GNU Octave)''	351	393	% so we set $0 < \pi_i^- \leq 128$ since this is the largest value
	352	394	% we can handle. Moreover, we will use Gurobi, which will take care of the
{\bf	353	395	% linearization for us.
- Page 4, line 31: how does the delta function transform model from non-linear % r2.13	354	396
and non-quadratic to a quadratic?}	355	397
	356	398	{\bf
The first model is non-quadratic, but once we introduce the $p$ configurations	357	399	- Captions of figure and tables are too minimal. % r2.14
we can estimate the function $F$ by computing	358	400	}
the rejection for each configuration; the model then becomes quadratic because it still contains	359	401	We have changed the captions of Figs.~10--16.
products of variables. With the definition of $\delta_{ij}$ we can	360	402
replace each product of variables by a product with a binary variable, and	361	403	{\bf
such a product can be linearized as follows:\\	362	404	- Figures can be grouped: fig. 10-12 can be grouped as three subplots (a, b, c) % r2.15 - fait
$y$ is a binary variable \\	363	405	of a single figure. Same for fig. 13-16.
$x$ is a real variable bounded by $X^{max}$ \\	364	406	}
\begin{equation*}	365	407	We have grouped Figs.~10--12 and Figs.~13--16 into two figures with subfigures.
m = x \times y \implies	366	408
\left \{	367	409	{\bf
\begin{split}	368	410	- Please increase the number of averages for the spectrum. Currently the noise % r2.16 - fait
m & \geq 0 \\	369	411	of the curves is about 20 dBpk-pk and it doesn’t allow to appreciate the
m & \leq y \times X^{max} \\	370	412	differences among the curves. I suggest to reduce the noise below 1 dBpk-pk.
m & \leq x \\	371	413	}
m & \geq x - (1 - y) \times X^{max} \\	372	414
\end{split}	373	415	Indeed averaging had been omitted during post-processing and figure generation: we
\right .	374	416	are grateful to the reviewer for emphasizing this point which has now been corrected. All spectra
\end{equation*}	375	417	now exhibit sub-dBpk-pk line thickness.
Gurobi performs this linearization, so we do not detail this step in order to keep the model	376	418
simple. However, to improve the explanation of the transformation we have rewritten the	377	419	We believe these updates to the manuscript have improved the presentation and made clearer
paragraph ``This model is non-linear and even non-quadratic...''.	378	420	some of the shortcomings of the initial draft: we are grateful to the reviewers for pointing
	379	421	out these issues.
% JMF: a sentence explaining this will be needed; reading this response in the article,	380	422
% I do not understand how it answers the question	381	423	Best wishes, A. Hugeat
%	382	424
% AH: I am writing the idea down in French; I will try to translate it as well as I can.	383	425	%In conclusion, my opinion is that the methodology presented in the Manuscript
%	384	426	%deserve to be published, provided that the criterion is changed according
% The problem is not linear because we multiply variables	385	427	%the indications mentioned above.
% with each other. To remedy this, we consider that $\pi_{ij}^C$ and $C_{ij}$ become	386	428	\end{document}
% constants. We therefore introduce the binary variable $\delta_{ij}$, which indicates	387	429	%****************************************************
% which filter is selected at each stage. Despite this, our program is still	388	430	%
% quadratic because constraint~\ref{eq:areadef2} still contains a product between	389	431	%For information about the IEEE Ultrasonics, Ferroelectrics, and Frequency
% $\delta_{ij}$ and $\pi_i^-$. But since $\delta_{ij}$ is binary, it is possible	390	432	%Control Society, please visit the website: http://www.ieee-uffc.org. The
% to linearize this product provided that $\pi_i^-$ can be bounded. In our	391	433	%website of the Transactions on Ultrasonics, Ferroelectrics, and Frequency
% case defining the bound is easy because $\pi_i^-$ represents a data size,	392	434	%Control is at: http://ieee-uffc.org/publications/transactions-on-uffc
% so we set $0 < \pi_i^- \leq 128$ since this is the largest value	393	435
% we can handle. Moreover, we will use Gurobi, which will take care of the	394
% linearization for us.	395
	396
	397
{\bf	398
- Captions of figure and tables are too minimal. % r2.14	399
}	400
We have changed the captions of Figs.~10--16.	401
	402
{\bf	403
- Figures can be grouped: fig. 10-12 can be grouped as three subplots (a, b, c) % r2.15 - fait	404
of a single figure. Same for fig. 13-16.	405
}	406
We have grouped Figs.~10--12 and Figs.~13--16 into two figures with subfigures.	407
	408
{\bf	409
- Please increase the number of averages for the spectrum. Currently the noise % r2.16 - fait	410
of the curves is about 20 dBpk-pk and it doesn’t allow to appreciate the	411
differences among the curves. I suggest to reduce the noise below 1 dBpk-pk.	412
}	413
	414
Indeed averaging had been omitted during post-processing and figure generation: we	415
are grateful to the reviewer for emphasizing this point which has now been corrected. All spectra	416
now exhibit sub-dBpk-pk line thickness.	417
	418
We believe these updates to the manuscript have improved the presentation and made clearer	419
some of the shortcomings of the initial draft: we are grateful to the reviewers for pointing	420
out these issues.	421
	422
Best wishes, A. Hugeat	423
	424
%In conclusion, my opinion is that the methodology presented in the Manuscript	425

images/letter_max_criterion.pdf


images/letter_sum_criterion.pdf
