jfriedt / IFCS2018 article

Commit 0096d462550a4852259c89b83e36e613d0ed73eb

Authored by jfriedt 2019-04-01 17:53:07 +0200

2 parents 17d9a84344 8d9489b3bc

Exists in master

Merge branch 'master' of https://lxsd.femto-st.fr/gitlab/jfriedt/ifcs2018-article

Showing 11 changed files Inline Diff

ifcs2018_journal.tex
images/max_1000.pdf
images/max_1500.pdf
images/max_500.pdf
images/max_rejection/prn_1000.pdf
images/max_rejection/prn_2000.pdf
images/max_rejection/prn_500.pdf
images/min_40.pdf
images/min_60.pdf
images/min_80.pdf
references.bib

ifcs2018_journal.tex

\documentclass[a4paper,conference]{IEEEtran/IEEEtran}
\usepackage{graphicx,color,hyperref}
\usepackage{amsfonts}
\usepackage{amsthm}
\usepackage{amssymb}
\usepackage{amsmath}
\usepackage{algorithm2e}
\usepackage{url,balance}
\usepackage[normalem]{ulem}
\usepackage{tikz}
\usetikzlibrary{positioning,fit}
\usepackage{multirow}
\usepackage{scalefnt}

% correct bad hyphenation here
\hyphenation{op-tical net-works semi-conduc-tor}
\textheight=26cm
\setlength{\footskip}{30pt}
\pagenumbering{gobble}
\begin{document}
\title{Filter optimization for real time digital processing of radiofrequency signals: application
to oscillator metrology}

\author{\IEEEauthorblockN{A. Hugeat\IEEEauthorrefmark{1}\IEEEauthorrefmark{2}, J. Bernard\IEEEauthorrefmark{2},
G. Goavec-M\'erou\IEEEauthorrefmark{1},
P.-Y. Bourgeois\IEEEauthorrefmark{1}, J.-M. Friedt\IEEEauthorrefmark{1}}
\IEEEauthorblockA{\IEEEauthorrefmark{1}FEMTO-ST, Time \& Frequency department, Besan\c con, France }
\IEEEauthorblockA{\IEEEauthorrefmark{2}FEMTO-ST, Computer Science department DISC, Besan\c con, France \\
Email: \{pyb2,jmfriedt\}@femto-st.fr}
}
\maketitle
\thispagestyle{plain}
\pagestyle{plain}
\newtheorem{definition}{Definition}

\begin{abstract}
Software Defined Radio (SDR) provides stability, flexibility and reconfigurability to
radiofrequency signal processing. When applied to oscillator characterization in the context
of ultrastable clocks, stringent filtering requirements are set by spurious signal or
noise rejection needs. Since real time radiofrequency processing must be performed in a
Field Programmable Gate Array (FPGA) to meet timing constraints, we investigate optimization strategies
to design filters meeting rejection characteristics while limiting the hardware resources
required and keeping timing constraints within the targeted measurement bandwidths.
\end{abstract}

\begin{IEEEkeywords}
Software Defined Radio, Mixed-Integer Linear Programming, Finite Impulse Response filter
\end{IEEEkeywords}

\section{Digital signal processing of ultrastable clock signals}

Analog oscillator phase noise characterization is classically performed by downconverting
the radiofrequency signal to baseband using a saturated mixer,
followed by a Fourier analysis of the beat signal to analyze phase fluctuations close to carrier. In
a fully digital approach, the radiofrequency signal is digitized and numerically downconverted by
multiplying the samples with a local numerically controlled oscillator (Fig. \ref{schema}) \cite{rsi}.

\begin{figure}[h!tb]
\begin{center}
\includegraphics[width=.8\linewidth]{images/schema}
\end{center}
\caption{Fully digital oscillator phase noise characterization: the Device Under Test
(DUT) signal is sampled by the radiofrequency grade Analog to Digital Converter (ADC) and
downconverted by mixing with a Numerically Controlled Oscillator (NCO). Unwanted signals
and noise aliases are rejected by a Low Pass Filter (LPF) implemented as a cascade of Finite
Impulse Response (FIR) filters. The signal is then decimated before a Fourier analysis displays
the spectral characteristics of the phase fluctuations.}
\label{schema}
\end{figure}

As with the analog mixer,
the non-linear behavior of the downconverter introduces noise or spurious signal aliasing as
well as the generation of the frequency sum signal in addition to the frequency difference.
These unwanted spectral components must be rejected before decimating the data stream
for the phase noise spectral characterization \cite{andrich2018high}. The filtering stages introduced between the
downconverter
and the decimation processing blocks are core elements of an oscillator characterization
system, and must reject out-of-band signals below the targeted phase noise -- typically in the
sub $-170$~dBc/Hz range for the ultrastable oscillators we aim at characterizing. The filter blocks will
use most resources of the Field Programmable Gate Array (FPGA) used to process the radiofrequency
datastream: optimizing the performance of the filter while reducing the needed resources is
hence tackled in a systematic approach using optimization techniques. Most significantly, we
tackle the issue by attempting to cascade multiple Finite Impulse Response (FIR) filters with
tunable number of coefficients and tunable number of bits representing the coefficients and the
data being processed.

\section{Finite impulse response filter}

We select FIR filters for their unconditional stability and ease of design. A FIR filter is defined
by a set of weights $b_k$ applied to the inputs $x_n$ through a convolution to generate the
outputs $y_n$
\begin{align}
y_n=\sum_{k=0}^N b_k x_{n-k}
\label{eq:fir_equation}
\end{align}

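As an illustration of Eq.~\ref{eq:fir_equation}, the following minimal Python sketch evaluates the
convolution sample by sample. It is not the VHDL implementation considered in this work: the
function name \texttt{fir\_filter} and the convention $x_m = 0$ for $m < 0$ are assumptions made
for the example only.

```python
def fir_filter(b, x):
    """Direct-form FIR: y[n] = sum_k b[k] * x[n-k].

    Assumption for this sketch: samples before the start of x are zero.
    """
    y = []
    for n in range(len(x)):
        acc = 0
        for k, bk in enumerate(b):
            if n - k >= 0:
                acc += bk * x[n - k]
        y.append(acc)
    return y


# Integer weights and samples, as in a fixed-point implementation.
b = [1, 2, 1]
x = [1, 0, 0, 4]
y = fir_filter(b, x)
print(y)  # [1, 2, 1, 4]
```

The first three outputs reproduce the weights $b_k$ since the input starts with a unit impulse.
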
As opposed to an implementation on a general purpose processor in which word size is defined by the
processor architecture, implementing such a filter on an FPGA offers more degrees of freedom since
not only the coefficient values and number of taps must be defined, but also the number of bits
defining the coefficients and the sample size. For this reason, and because we consider pipeline
processing (as opposed to First-In, First-Out FIFO memory batch processing) of radiofrequency
signals, High Level Synthesis (HLS) languages \cite{kasbah2008multigrid} are not considered but
the problem is tackled at the Very-high-speed-integrated-circuit Hardware Description Language (VHDL) level.
Since latency is not an issue in an open-loop phase noise characterization instrument, the large
number of taps in the FIR, as opposed to the shorter Infinite Impulse Response (IIR) filter,
is not considered an issue as it would be in a closed-loop system.

The coefficients are classically expressed as floating point values. However, this binary
number representation is not efficient for fast arithmetic computation by an FPGA. Instead,
we choose to quantize these floating point values into integer values. This quantization
will result in some precision loss.

\begin{figure}[h!tb]
\includegraphics[width=\linewidth]{images/demo_filtre}
\caption{Impact of the quantization resolution of the coefficients: the quantization is
set to 6~bits -- with the horizontal black lines indicating $\pm$1 least significant bit -- setting
the 30~first and 30~last coefficients out of the initial 128~band-pass
filter coefficients to 0 (red dots).}
\label{float_vs_int}
\end{figure}

The tradeoff between quantization resolution and number of coefficients when considering
integer operations is not trivial. As an illustration of the issue related to the
relation between number of filter taps and quantization, Fig. \ref{float_vs_int} exhibits
a 128-coefficient FIR bandpass filter designed using floating point numbers (blue). Upon
quantization on 6~bit integers, 60 of the 128~coefficients at the beginning and end of the
taps become null, making the large number of coefficients irrelevant and allowing
processing resources to be saved by shrinking the filter length. This tradeoff, aimed at minimizing resources
to reach a given rejection level, or maximizing out of band rejection for a given computational
resource, will drive the investigation on cascading filters designed with varying tap resolution
and tap length, as will be shown in the next section. Indeed, our development strategy closely
follows the skeleton approach \cite{crookes1998environment, crookes2000design, benkrid2002towards}
in which basic blocks are defined and characterized before being assembled \cite{hide}
in a complete processing chain. In our case, assembling the filter blocks is a simpler block
combination process since we assume a single value to be processed and a single value to be
generated at each clock cycle. The FIR filters are not considered to decimate in the
current implementation: for now, the decimation is assumed to be located after the FIR
cascade.

\section{Methodology description}
We aim to create a new methodology to develop any Digital Signal Processing (DSP) chain
on any hardware platform (Altera, Xilinx...). To this end, we have defined an
abstract model representing some basic DSP operations.

For the moment, we focus on only two operations: filtering and shifting of data.
We have chosen these basic operations because shifting and filtering have already been studied
extensively \cite{lim_1996, lim_1988, young_1992, smith_1998}, hence it will be easier
to check and validate our results.

Two operations are of course insufficient to describe complex DSP chains, but
in this paper we only aim to demonstrate the relevance and the efficiency of our approach.
In future work, more operations can be added so that any DSP chain
can be modeled.

We apply our methodology to a very simple DSP chain. A digital signal is generated
either by a Pseudo-Random Number (PRN) generator or by an Analog to Digital
Converter (ADC). Once the digital signal is available, we filter it to decrease the noise level.
Finally, bursts of filtered samples are stored before post-processing.
% TODO: add a diagram
In this particular case, we want to optimize the filtering step to reach the best noise
rejection for a constrained amount of resources, or to minimize resource
consumption for a given rejection objective.

The first step of our approach is to model the DSP chain. Since we only optimize
the filtering, we do not model the PRN generator or the ADC. The filtering can be
done in two ways. In the first one, a single FIR filter with many coefficients
rejects the noise: we call this the monolithic approach. In the second one,
we select several FIR filters, each with fewer coefficients than the monolithic filter, and cascade
them to filter the signal.

After each filter, we leave the possibility of shifting the filtered data to consume
fewer resources. Hence, in the case of cascaded filters, we define a stage as a filter
and a shifter (the shift can be omitted if the filtered data need not be divided).

\subsection{Model of a FIR filter}
A cascade of filters is composed of $n$ stages. In stage $i$ ($1 \leq i \leq n$),
the FIR has $C_i$ coefficients, each coefficient being an integer value coded on $\pi^C_i$
bits, and the filtered data are shifted by $\pi^S_i$ bits. We also define $\pi^-_i$ as
the size of the input data and $\pi^+_i$ as the size of the output data. Figure~\ref{fig:fir_stage}
shows a filtering stage.

\begin{figure}
\centering
\begin{tikzpicture}[node distance=2cm]
\node[draw,minimum size=1.3cm] (FIR) { $C_i, \pi_i^C$ } ;
\node[draw,minimum size=1.3cm] (Shift) [right of=FIR, ] { $\pi_i^S$ } ;
\node (Start) [left of=FIR] { } ;
\node (End) [right of=Shift] { } ;

\node[draw,fit=(FIR) (Shift)] (Filter) { } ;

\draw[->] (Start) edge node [above] { $\pi_i^-$ } (FIR) ;
\draw[->] (FIR) -- (Shift) ;
\draw[->] (Shift) edge node [above] { $\pi_i^+$ } (End) ;
\end{tikzpicture}
\caption{A single filter is composed of a FIR (on the left) and a Shifter (on the right)}
\label{fig:fir_stage}
\end{figure}

FIR $i$ can reject $F(C_i, \pi_i^C)$ dB, where $F$ is determined numerically.
To measure this rejection, we use GNU Octave to design the FIR filter coefficients with two
algorithms (\texttt{firls} and \texttt{fir1}).
For each configuration $(C_i, \pi_i^C)$, we first create a FIR with $C_i$ floating point coefficients.
Then, the floating point coefficients are discretized into integers. To ensure that the coefficients are effectively coded on $\pi_i^C$~bits,
they are normalized by their absolute maximum before being scaled to integer coefficients.
At least one coefficient is coded on $\pi_i^C$~bits, and in practice only $b_{C_i/2}$ is coded on $\pi_i^C$~bits while the others are coded on far fewer bits.

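The normalization and scaling step can be sketched as follows. This is only an illustration under
stated assumptions: a Hamming-windowed sinc low-pass (the hypothetical helper \texttt{lowpass\_taps})
stands in for the Octave \texttt{firls} and \texttt{fir1} designs, and the coefficients are assumed
to be signed, so that the full scale on $\pi_i^C$~bits is $2^{\pi_i^C-1}-1$.

```python
import math


def lowpass_taps(num_taps, cutoff):
    """Hamming-windowed sinc low-pass (stand-in for fir1/firls in this sketch).

    cutoff is expressed as a fraction of the Nyquist frequency.
    """
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        t = n - mid
        ideal = cutoff if t == 0 else math.sin(math.pi * cutoff * t) / (math.pi * t)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    return taps


def quantize(taps, bits):
    """Normalize by the absolute maximum, then scale to signed integers on `bits` bits."""
    peak = max(abs(t) for t in taps)
    full_scale = 2 ** (bits - 1) - 1  # assumption: signed coefficients
    return [round(t / peak * full_scale) for t in taps]


q = quantize(lowpass_taps(128, 0.5), 6)
print(max(abs(c) for c in q))       # largest coefficient uses the full 6-bit range
print(sum(1 for c in q if c == 0))  # number of taps rounded to zero
```

With 128 taps quantized on 6~bits, the central coefficients reach the full scale while the
outermost ones round to zero, which is the effect discussed with Fig.~\ref{float_vs_int}.
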
With these coefficients, the \texttt{freqz} function is used to estimate the magnitude response of the filter.
Comparing the performance of different FIRs however requires a single criterion. As shown in Figure~\ref{fig:fir_mag},
the FIR magnitude exhibits two parts of interest, the passband and the stopband.

\begin{figure}
\centering
\begin{tikzpicture}[scale=0.3]
\draw[<->] (0,15) -- (0,0) -- (21,0) ;
\draw[thick] (0,12) -- (8,12) -- (20,0) ;

\draw (0,14) node [left] { $P$ } ;
\draw (20,0) node [below] { $f$ } ;

\draw[>=latex,<->] (0,14) -- (8,14) ;
\draw (4,14) node [above] { passband } node [below] { $40\%$ } ;

\draw[>=latex,<->] (8,14) -- (12,14) ;
\draw (10,14) node [above] { transition } node [below] { $20\%$ } ;

\draw[>=latex,<->] (12,14) -- (20,14) ;
\draw (16,14) node [above] { stopband } node [below] { $40\%$ } ;

\draw[>=latex,<->] (16,12) -- (16,8) ;
\draw (16,10) node [right] { rejection } ;

\draw[dashed] (8,-1) -- (8,14) ;
\draw[dashed] (12,-1) -- (12,14) ;

\draw[dashed] (8,12) -- (16,12) ;
\draw[dashed] (12,8) -- (16,8) ;

\end{tikzpicture}

% \includegraphics[width=.5\linewidth]{images/fir_magnitude}
\caption{Shape of the filter transmitted power $P$ as a function of frequency $f$:
the passband is considered to occupy the initial 40\% of the Nyquist frequency range,
the stopband the last 40\%, allowing 20\% transition width.}
\label{fig:fir_mag}
\end{figure}

In the transition band, the behavior of the filter is left free: we only care about the passband and the stopband.
Our first criterion considers the mean value of the stopband rejection, as shown in Figure~\ref{fig:mean_criterion}. This criterion is not satisfactory because it ignores the shape of the passband.
A second criterion considers the maximum rejection within the stopband minus the mean of the absolute value of the passband rejection. With this criterion, the results are significantly improved, as shown in Figure~\ref{fig:custom_criterion}.

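A possible reading of the second criterion is sketched below in Python. Several choices are
assumptions of this example and are not stated above: the magnitude is normalized to the DC gain,
the rejection is counted as a positive number of dB, and the response is evaluated by a direct sum
(in \texttt{magnitude\_db}) rather than with Octave's \texttt{freqz}. The band edges follow
Figure~\ref{fig:fir_mag}.

```python
import cmath
import math


def magnitude_db(taps, num_points=512):
    """Magnitude response in dB at num_points frequencies from 0 up to (not including) Nyquist."""
    out = []
    for i in range(num_points):
        w = math.pi * i / num_points
        h = sum(b * cmath.exp(-1j * w * k) for k, b in enumerate(taps))
        out.append(20 * math.log10(max(abs(h), 1e-12)))
    return out


def rejection_criterion(taps, num_points=512):
    """Worst-case stopband rejection minus mean absolute passband deviation, in dB."""
    mag = magnitude_db(taps, num_points)
    mag = [m - mag[0] for m in mag]            # assumption: normalize to the DC gain
    passband = mag[: int(0.4 * num_points)]    # first 40% of the Nyquist range
    stopband = mag[int(0.6 * num_points):]     # last 40% of the Nyquist range
    worst_stop = max(stopband)                 # highest residual level in the stopband
    ripple = sum(abs(m) for m in passband) / len(passband)
    return -worst_stop - ripple


c1 = rejection_criterion([1, 2, 1])
c2 = rejection_criterion([1, 4, 6, 4, 1])  # [1, 2, 1] convolved with itself
print(round(c1, 2), round(c2, 2))
```

Since \texttt{[1, 4, 6, 4, 1]} is \texttt{[1, 2, 1]} cascaded with itself, its response in dB is
doubled at every frequency and so is the criterion.
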
\begin{figure}
\centering
\includegraphics[width=\linewidth]{images/mean_criterion}
\caption{Mean criterion comparison between monolithic filter and cascade filters}
\label{fig:mean_criterion}
\end{figure}

\begin{figure}
\centering
\includegraphics[width=\linewidth]{images/custom_criterion}
\caption{Custom criterion comparison between monolithic filter and cascade filters}
\label{fig:custom_criterion}
\end{figure}

Although we have an efficient criterion to estimate the rejection of one set of coefficients,
a problem arises when summing two or more criteria. If the FIR filter coefficients are the same
in both stages, we have:
\[ F_{total} = F_1 + F_2 \]
But when two different sets of coefficients are chosen, the previous equality no longer
holds. Figure~\ref{fig:sum_rejection} illustrates the problem. The red and blue curves
are two different sets of filter coefficients, and their maxima in the stopband
are not located at the same frequency. Hence, the sum of the rejection criteria (dotted yellow line)
does not match the dashed yellow line. Defining the rejection of cascaded filters
is thus more difficult than simply summing the rejection criteria of each filter.
However, this sum provides a conservative bound on the rejection: in practice the cascade
rejects more than the sum predicts.

\begin{figure}	279	279	\begin{figure}
\centering	280	280	\centering
\includegraphics[width=\linewidth]{images/sum_rejection}	281	281	\includegraphics[width=\linewidth]{images/sum_rejection}
\caption{Rejection of two cascaded filters}	282	282	\caption{Rejection of two cascaded filters}
\label{fig:sum_rejection}	283	283	\label{fig:sum_rejection}
\end{figure}	284	284	\end{figure}
	285	285
		286	Finally, we can describe our abstract model with the following expressions:
		287	\begin{align}
		288	\text{Maximize } & \sum_{i=1}^n r_i \notag \\
		289	\sum_{i=1}^n a_i & \leq \mathcal{A} & \label{eq:area} \\
		290	a_i & = C_i \times (\pi_i^C + \pi_i^-), & \forall i \in [1, n] \label{eq:areadef} \\
		291	r_i & = F(C_i, \pi_i^C), & \forall i \in [1, n] \label{eq:rejectiondef} \\
		292	\pi_i^+ & = \pi_i^- + \pi_i^C - \pi_i^S, & \forall i \in [1, n] \label{eq:bits} \\
		293	\pi_{i - 1}^+ & = \pi_i^-, & \forall i \in [2, n] \label{eq:inout} \\
		294	\pi_i^+ & \geq 1 + \sum_{k=1}^{i} \left(1 + \frac{r_k}{6}\right), & \forall i \in [1, n] \label{eq:maxshift} \\
		295	\pi_1^- &= \Pi^I \label{eq:init}
		296	\end{align}
		297
		298	{\color{red} I know the idea is not to discuss the linear program, but
		299	it still seems essential to me. At worst, I will try to revisit this if we
		300	are really short on space.}
		301
		302	Equation~\ref{eq:area} states that the total area taken by the filters must be
		303	less than the available area. Equation~\ref{eq:areadef} gives the definition of
		304	the area for a filter. More precisely, it is the area of the FIR as the Shifter
		305	does not need any circuitry. We consider that the FIR needs $C_i$ registers of size
		306	$\pi_i^C + \pi_i^-$~bits to store the results of the multiplications of the
		307	input data and the coefficients. Equation~\ref{eq:rejectiondef} gives the
		308	definition of the rejection of the filter thanks to function~$F$ that we defined
		309	previously. The Shifter does not introduce negative rejection as we explain later,
		310	so the rejection only comes from the FIR. Equation~\ref{eq:bits} states the
		311	relation between $\pi_i^+$ and $\pi_i^-$. The multiplications in the FIR add
		312	$\pi_i^C$ bits as most coefficients are close to zero, and the Shifter removes
		313	$\pi_i^S$ bits. Equation~\ref{eq:inout} states that the output number of bits of
		314	a filter is the same as the input number of bits of the next filter.
		315	Equation~\ref{eq:maxshift} ensures that the Shifter does not introduce negative
		316	rejection. Indeed, the results of the FIR can be right shifted without compromising
		317	the quality of the rejection up to a threshold. Each bit of the output data
		318	increases the maximum rejection level by 6~dB. We add one to take the sign bit
		319	into account. If equation~\ref{eq:maxshift} were not present, the Shifter could
		320	shift too much and introduce some noise in the output data. Each additional
		321	shift bit would cause 6~dB of noise. A strictly equivalent inequality is:
		322	$\pi_i^S \leq \pi_i^- + \pi_i^C - 1 - \sum_{k=1}^{i} \left(1 + \frac{r_k}{6}\right) $.
		323	Finally, equation~\ref{eq:init} gives the global input's number of bits.
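
The equivalence stated above can be checked by substituting equation~\ref{eq:bits} into
equation~\ref{eq:maxshift}. A short derivation, writing the summation index as $k$ both in
the sum and in the rejection term, reads:
\begin{align*}
\pi_i^- + \pi_i^C - \pi_i^S & \geq 1 + \sum_{k=1}^{i} \left(1 + \frac{r_k}{6}\right) \\
\Longleftrightarrow \quad \pi_i^S & \leq \pi_i^- + \pi_i^C - 1 - \sum_{k=1}^{i} \left(1 + \frac{r_k}{6}\right)
\end{align*}
For the first stage ($i = 1$), and using equation~\ref{eq:init}, this reduces to
$\pi_1^S \leq \Pi^I + \pi_1^C - 2 - r_1 / 6$.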
		324
		325	This model is non-linear and even non-quadratic, as $F$ does not have a known
		326	linear or quadratic expression. We introduce $p$ FIR configurations
		327	$(C_{ij}, \pi_{ij}^C), 1 \leq j \leq p$ that are constants. We define binary
		328	variable $\delta_{ij}$ that has value 1 if stage~$i$ is in configuration~$j$
		329	and 0 otherwise. The new equations are as follows:
		330
		331	\begin{align}
		332	a_i & = \sum_{j=1}^p \delta_{ij} \times C_{ij} \times (\pi_{ij}^C + \pi_i^-), & \forall i \in [1, n] \label{eq:areadef2} \\
		333	r_i & = \sum_{j=1}^p \delta_{ij} \times F(C_{ij}, \pi_{ij}^C), & \forall i \in [1, n] \label{eq:rejectiondef2} \\
		334	\pi_i^+ & = \pi_i^- + \left(\sum_{j=1}^p \delta_{ij} \pi_{ij}^C\right) - \pi_i^S, & \forall i \in [1, n] \label{eq:bits2} \\
		335	\sum_{j=1}^p \delta_{ij} & \leq 1, & \forall i \in [1, n] \label{eq:config}
		336	\end{align}
		337
		338	Equations \ref{eq:areadef2}, \ref{eq:rejectiondef2} and \ref{eq:bits2} replace
		339	respectively equations \ref{eq:areadef}, \ref{eq:rejectiondef} and \ref{eq:bits}.
		340	Equation~\ref{eq:config} states that for each stage, at most one configuration is chosen.
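
To see where the quadratic terms of the program come from, equation~\ref{eq:areadef2}
can be expanded as follows. This is only a rearrangement of the terms already defined,
given as a sketch:
\begin{equation*}
a_i = \sum_{j=1}^p C_{ij} \, \pi_{ij}^C \, \delta_{ij} + \sum_{j=1}^p C_{ij} \, \delta_{ij} \, \pi_i^-
\end{equation*}
The first sum is linear in $\delta_{ij}$ because $C_{ij}$ and $\pi_{ij}^C$ are constants.
The second sum contains the product $\delta_{ij} \, \pi_i^-$ of two variables, since
$\pi_i^-$ depends on the shifts chosen in the previous stages when $i \geq 2$.
Equations~\ref{eq:rejectiondef2} and~\ref{eq:bits2} remain linear, as
$F(C_{ij}, \pi_{ij}^C)$ is a constant for each configuration.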
		341
		342	Section~\ref{sec:fixed_area} shows the results for this quadratic program, while section~\ref{sec:fixed_rej}
		343	presents the results for the complementary problem. In this case we want to
		344	minimize the occupied area for a targeted rejection level. Hence we replace
		345	the objective function with:
		346	\begin{align}
		347	\text{Minimize } & \sum_{i=1}^n a_i \notag
		348	\end{align}
		349	We adapt the constraints of the quadratic program by replacing equation~\ref{eq:area}
		350	with equation~\ref{eq:rejection_min}, where $\mathcal{R}$ is the minimal
		351	rejection required.
		352
		353	\begin{align}
		354	\sum_{i=1}^n r_i & \geq \mathcal{R} & \label{eq:rejection_min}
		355	\end{align}
		356
		357	\section{Design workflow}
		358	\label{sec:workflow}
		359
		360	In this section, we describe the workflow to compute all the results presented in section~\ref{sec:fixed_area}.
		361	Figure~\ref{fig:workflow} shows the global workflow and the different steps involved in the computations of the results.
		362
		363	\begin{figure}
		364	\centering
		365	\begin{tikzpicture}[node distance=0.75cm and 2cm]
		366	\node[draw,minimum size=1cm] (Solver) { Filter Solver } ;
		367	\node (Start) [left= 3cm of Solver] { } ;
		368	\node[draw,minimum size=1cm] (TCL) [right= of Solver] { TCL Script } ;
		369	\node (Input) [above= of TCL] { } ;
		370	\node[draw,minimum size=1cm] (Deploy) [below= of Solver] { Deploy Script } ;
		371	\node[draw,minimum size=1cm] (Bitstream) [below= of TCL] { Bitstream } ;
		372	\node[draw,minimum size=1cm,rounded corners] (Board) [below right= of Deploy] { Board } ;
		373	\node[draw,minimum size=1cm] (Postproc) [below= of Deploy] { Post-Processing } ;
		374	\node (Results) [left= of Postproc] { } ;
		375
		376	\draw[->] (Start) edge node [above] { $\mathcal{A}, n, \Pi^I$ } node [below] { $(C_{ij}, \pi_{ij}^C), F$ } (Solver) ;
		377	\draw[->] (Input) edge node [left] { ADC or PRN } (TCL) ;
		378	\draw[->] (Solver) edge node [below] { (1a) } (TCL) ;
		379	\draw[->] (Solver) edge node [right] { (1b) } (Deploy) ;
		380	\draw[->] (TCL) edge node [left] { (2) } (Bitstream) ;
		381	\draw[->,dashed] (Bitstream) -- (Deploy) ;
		382	\draw[->] (Deploy) to[out=-30,in=120] node [above] { (3) } (Board) ;
		383	\draw[->] (Board) to[out=150,in=-60] node [below] { (4) } (Deploy) ;
		384	\draw[->] (Deploy) edge node [left] { (5) } (Postproc) ;
		385	\draw[->] (Postproc) -- (Results) ;
		386	\end{tikzpicture}
		387	\caption{Design workflow from the input parameters to the results}
		388	\label{fig:workflow}
		389	\end{figure}
		390
		391	The filter solver is a C++ program that takes as input the maximum area
		392	$\mathcal{A}$, the number of stages $n$, the size of the input signal $\Pi^I$,
		393	the FIR configurations $(C_{ij}, \pi_{ij}^C)$ and the function $F$. It creates
		394	the quadratic programs and uses the Gurobi solver to get the optimal results.
		395	Then it produces two scripts: a TCL script ((1a) on figure~\ref{fig:workflow})
		396	and a deploy script ((1b) on figure~\ref{fig:workflow}).
		397
		398	The TCL script describes the whole digital processing chain from the beginning
		399	(the raw signal data) to the end (the filtered data).
		400	The raw input data are generated by a Pseudo Random Number (PRN)
		401	generator inside the FPGA and $\Pi^I$ is fixed at 16~bits.
		402	Then the script builds each stage of the chain with a generic FIR task that
		403	comes from a skeleton library. The generic FIR is highly configurable
		404	with the number of coefficients and the size of the coefficients. The coefficients
		405	themselves are not stored in the script.
		406	Whereas the signal is processed in real-time, the output signal is stored as
		407	consecutive bursts of data.
		408
		409	The TCL script is used by Vivado to produce the FPGA bitstream ((2) on figure~\ref{fig:workflow}).
		410	We use the 2018.2 version of Xilinx Vivado and we execute the synthesized
		411	bitstream on a Redpitaya board fitted with a Xilinx Zynq-7010 series
		412	FPGA (xc7z010clg400-1) and two 125~MS/s ADCs.
		413	The board works with a Buildroot Linux image. We have developed some tools and
		414	drivers to flash and communicate with the FPGA. They are used to automate all
		415	the workflow inside the board: load the filter coefficients and retrieve the
		416	computed data.
		417
		418	The deploy script uploads the bitstream to the board ((3) on
		419	figure~\ref{fig:workflow}), flashes the FPGA, loads the different drivers,
		420	configures the coefficients of the FIR filters. It then waits for the results
		421	and retrieves the data to the main computer ((4) on figure~\ref{fig:workflow}).
		422
		423	Finally, an Octave post-processing script computes the final results thanks to
		424	the output data ((5) on figure~\ref{fig:workflow}).
		425	The results are normalized so that the Power Spectrum Density (PSD) starts at zero
		426	and the different configurations can be compared.
		427
		428	To compute the results in section~\ref{sec:fixed_rej}, we
		429	have only adapted the quadratic program; the rest of the workflow is unchanged.
		430
\section{Experiments with fixed area space}	286	431	\section{Experiments with fixed area space}
		432	\label{sec:fixed_area}
		433	This section presents the output of the filter solver {\em i.e.} the computed
		434	configurations for each stage, the computed rejection and the computed silicon area.
		435	This is interesting to understand the choices made by the solver to compute its solutions.
	287	436
		437	The experimental setup is composed of three cases. The raw input is generated
		438	by a Pseudo Random Number (PRN) generator, which fixes the input data size $\Pi^I$.
		439	Then the total silicon area $\mathcal{A}$ has been fixed to either 500, 1000 or 1500
		440	arbitrary units. Hence, the three cases have been named: MAX/500, MAX/1000, MAX/1500.
		441	The number of configurations $p$ is 1827, with $C_i$ ranging from 3 to 60 and $\pi^C$
		442	ranging from 2 to 22. In each case, the quadratic program has been able to give a
		443	result up to five stages ($n = 5$) in the cascaded filter.
		444
		445	Table~\ref{tbl:gurobi_max_500} shows the results obtained by the filter solver for MAX/500.
		446	Table~\ref{tbl:gurobi_max_1000} shows the results obtained by the filter solver for MAX/1000.
		447	Table~\ref{tbl:gurobi_max_1500} shows the results obtained by the filter solver for MAX/1500.
		448
		449	\renewcommand{\arraystretch}{1.4}
		450
		451	\begin{table}
		452	\caption{Configurations $(C_i, \pi_i^C, \pi_i^S)$, rejections and areas (in arbitrary units) for MAX/500}
		453	\label{tbl:gurobi_max_500}
		454	\centering
		455	{\scalefont{0.77}
		456	\begin{tabular}{\|c\|ccccc\|c\|c\|}
		457	\hline
		458	$n$ & $i = 1$ & $i = 2$ & $i = 3$ & $i = 4$ & $i = 5$ & Rejection & Area \\
		459	\hline
		460	1 & (21, 7, 0) & - & - & - & - & 32~dB & 483 \\
		461	2 & (3, 3, 15) & (31, 9, 0) & - & - & - & 58~dB & 460 \\
		462	3 & (3, 3, 15) & (27, 9, 0) & (5, 3, 0) & - & - & 66~dB & 488 \\
		463	4 & (3, 3, 15) & (19, 7, 0) & (11, 5, 0) & (3, 3, 0) & - & 74~dB & 499 \\
		464	5 & (3, 3, 15) & (23, 8, 0) & (3, 3, 1) & (3, 3, 0) & (3, 3, 0) & 78~dB & 489 \\
		465	\hline
		466	\end{tabular}
		467	}
		468	\end{table}
		469
		470	\begin{table}
		471	\caption{Configurations $(C_i, \pi_i^C, \pi_i^S)$, rejections and areas (in arbitrary units) for MAX/1000}
		472	\label{tbl:gurobi_max_1000}
		473	\centering
		474	{\scalefont{0.77}
		475	\begin{tabular}{\|c\|ccccc\|c\|c\|}
		476	\hline
		477	$n$ & $i = 1$ & $i = 2$ & $i = 3$ & $i = 4$ & $i = 5$ & Rejection & Area \\
		478	\hline
		479	1 & (37, 11, 0) & - & - & - & - & 56~dB & 999 \\
		480	2 & (3, 3, 15) & (51, 14, 0) & - & - & - & 87~dB & 975 \\
		481	3 & (3, 3, 15) & (35, 11, 0) & (19, 7, 0) & - & - & 99~dB & 1000 \\
		482	4 & (3, 4, 16) & (27, 8, 0) & (19, 7, 1) & (11, 5, 0) & - & 103~dB & 998 \\
		483	5 & (3, 3, 15) & (31, 9, 0) & (19, 7, 0) & (3, 3, 1) & (3, 3, 0) & 111~dB & 984 \\
		484	\hline
		485	\end{tabular}
		486	}
		487	\end{table}
		488
		489	\begin{table}
		490	\caption{Configurations $(C_i, \pi_i^C, \pi_i^S)$, rejections and areas (in arbitrary units) for MAX/1500}
		491	\label{tbl:gurobi_max_1500}
		492	\centering
		493	{\scalefont{0.77}
		494	\begin{tabular}{\|c\|ccccc\|c\|c\|}
		495	\hline
		496	$n$ & $i = 1$ & $i = 2$ & $i = 3$ & $i = 4$ & $i = 5$ & Rejection & Area \\
		497	\hline
		498	1 & (47, 15, 0) & - & - & - & - & 71~dB & 1457 \\
		499	2 & (19, 6, 15) & (51, 14, 0) & - & - & - & 103~dB & 1489 \\
		500	3 & (3, 3, 15) & (35, 11, 0) & (35, 11, 0) & - & - & 122~dB & 1492 \\
		501	4 & (3, 3, 15) & (27, 8, 0) & (19, 7, 0) & (27, 9, 0) & - & 129~dB & 1498 \\
		502	5 & (3, 3, 15) & (23, 9, 2) & (27, 9, 0) & (19, 7, 0) & (3, 3, 0) & 136~dB & 1499 \\
		503	\hline
		504	\end{tabular}
		505	}
		506	\end{table}
		507
		508	\renewcommand{\arraystretch}{1}
		509
		510	From these tables, we can first state that the more stages are used to define
		511	the cascaded FIR filters, the better the rejection. It was an expected result as it has
		512	been previously observed that many small filters are better than
		513	a single large filter \cite{lim_1988, lim_1996, young_1992}, despite such conclusion
		514	being hardly used in practice due to the lack of tools for identifying individual filter
		515	coefficients in the cascaded approach.
		516
		517	Second, the larger the silicon area, the better the rejection. This was also an
		518	expected result as more area means a filter of better quality (more coefficients
		519	or more bits per coefficient).
		520
		521	Then, we also observe that the first stage can have a larger shift than the other
		522	stages. This is explained by the fact that the solver tries to use just enough
		523	bits for the computed rejection after each stage. In the first stage, a
		524	balance between a strong rejection and a low number of bits is targeted. Equation~\ref{eq:maxshift}
		525	gives the relation between both values.
		526
		527	Finally, we note that the solver consumes nearly all the given silicon area.
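
As a consistency check, the areas reported in Table~\ref{tbl:gurobi_max_500} can be
recomputed by hand from equations~\ref{eq:areadef}, \ref{eq:bits} and~\ref{eq:inout}.
The following worked example assumes $\Pi^I = 16$~bits, as stated in
section~\ref{sec:workflow}, and takes the $n = 2$ row, $(3, 3, 15)$ followed by $(31, 9, 0)$:
\begin{align*}
a_1 & = 3 \times (3 + 16) = 57, & \pi_1^+ & = 16 + 3 - 15 = 4, \\
a_2 & = 31 \times (9 + 4) = 403, & \pi_2^+ & = 4 + 9 - 0 = 13, \\
a_1 + a_2 & = 460 \leq 500.
\end{align*}
The same computation applied to the $n = 1$ row gives $21 \times (7 + 16) = 483$,
which also matches the table.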
		528
		529	The following graphs present the rejection for real data on the FPGA. In all following
		530	figures, the solid line represents the actual rejection of the filtered
		531	data on the FPGA as measured experimentally and the dashed lines are the noise levels
		532	given by the quadratic solver. The configurations are those computed in the previous section.
		533
		534	Figure~\ref{fig:max_500_result} shows the rejection of the different configurations in the case of MAX/500.
		535	Figure~\ref{fig:max_1000_result} shows the rejection of the different configurations in the case of MAX/1000.
		536	Figure~\ref{fig:max_1500_result} shows the rejection of the different configurations in the case of MAX/1500.
		537
\begin{figure}	288	538	\begin{figure}
\centering	289	539	\centering
\includegraphics[width=\linewidth]{images/max_rejection/prn_500}	290	540	\includegraphics[width=\linewidth]{images/max_500}
\caption{Experimental results for design with PRN as data input and 500 a.u. as max arbitrary space}	291	541	\caption{Signal spectrum for MAX/500}
\label{fig:prn_500}	292	542	\label{fig:max_500_result}
\end{figure}	293	543	\end{figure}
	294	544
\begin{figure}	295	545	\begin{figure}
\centering	296	546	\centering
\includegraphics[width=\linewidth]{images/max_rejection/prn_1000}	297	547	\includegraphics[width=\linewidth]{images/max_1000}
\caption{Experimental results for design with PRN as data input and 1000 a.u. as max arbitrary space}	298	548	\caption{Signal spectrum for MAX/1000}
\label{fig:prn_1000}	299	549	\label{fig:max_1000_result}
\end{figure}	300	550	\end{figure}
	301	551
\begin{figure}	302	552	\begin{figure}
\centering	303	553	\centering
\includegraphics[width=\linewidth]{images/max_rejection/prn_2000}	304	554	\includegraphics[width=\linewidth]{images/max_1500}
\caption{Experimental results for design with PRN as data input and 2000 a.u. as max arbitrary space}	305	555	\caption{Signal spectrum for MAX/1500}
\label{fig:prn_2000}	306	556	\label{fig:max_1500_result}
\end{figure}	307	557	\end{figure}
	308	558
\begin{table}	309	559	In all cases, we observe that the actual rejection is close to the rejection computed by the solver.
\centering	310
\begin{tabular}{\|c\|c\|ccc\|c\|c\|}	311
\hline	312
\multicolumn{2}{\|c\|}{\multirow{2}{}{Stage}} & \multicolumn{3}{c\|}{Stage} & \multirow{2}{}{Rejection} & \multirow{2}{*}{Area} \\ \cline{3-5}	313
\multicolumn{2}{\|c\|}{} & i = 1 & i = 2 & i = 3 & & \\ \hline	314
& C & 19 & - & - & & \\	315
n = 1 & $pi^C$ & 7 & - & - & 33 dB & 437 a.u. \\	316
& $pi^S$ & 0 & - & - & & \\ \hline	317
& C & 11 & 19 & - & & \\	318
n = 2 & $pi^C$ & 5 & 7 & - & 53 dB & 478 a.u. \\	319
& $pi^S$ & 16 & 0 & - & & \\ \hline	320
& C & 9 & 15 & 11 & & \\	321
n = 3 & $pi^C$ & 4 & 6 & 5 & 57 dB & 499 a.u. \\	322
& $pi^S$ & 16 & 3 & 0 & & \\ \hline	323
\end{tabular}	324
\caption{Solver results for design with PRN as data input and 500 a.u. as max arbitrary space}	325
\label{tbl:prn_500}	326
\end{table}	327
	328	560
		561	We compare the actual silicon resources given by Vivado to the
		562	resources in arbitrary units.
		563	The goal is to check that our arbitrary units of silicon area model well enough
		564	the real resources on the FPGA. Especially we want to verify that, for a given
		565	number of arbitrary units, the actual silicon resources do not depend on the
		566	number of stages $n$. Most significantly, our approach aims
		567	at remaining far enough from the practical logic gate implementation used by
		568	various vendors to remain platform independent and be portable from one
		569	architecture to another.
		570
		571	Table~\ref{tbl:resources_usage} shows the resources usage in the case of MAX/500, MAX/1000 and
		572	MAX/1500 \emph{i.e.} when the maximum allowed silicon area is fixed to 500, 1000
		573	and 1500 arbitrary units. We have taken care to extract solely the resources used by
		574	the FIR filters and remove additional processing blocks including FIFO and PL to
		575	PS communication.
		576
\begin{table}	329	577	\begin{table}
\centering	330	578	\caption{Resource occupation. The last column refers to available resources on a Zynq-7010 as found on the Redpitaya.}
{\scalefont{0.85}	331	579	\label{tbl:resources_usage}
\begin{tabular}{\|c\|c\|ccccc\|c\|c\|}	332	580	\centering
\hline	333	581	\begin{tabular}{\|c\|c\|ccc\|c\|}
\multicolumn{2}{\|c\|}{\multirow{2}{}{Stage}} & \multicolumn{5}{c\|}{Stage} & \multirow{2}{}{Rejection} & \multirow{2}{*}{Area} \\ \cline{3-7}	334	582	\hline
\multicolumn{2}{\|c\|}{} & i = 1 & i = 2 & i = 3 & i = 4 & i = 5 & & \\ \hline	335	583	$n$ & & MAX/500 & MAX/1000 & MAX/1500 & \emph{Zynq 7010} \\ \hline\hline
& C & 37 & - & - & - & - & & \\	336	584	& LUT & 249 & 453 & 627 & \emph{17600} \\
n = 1 & $pi^C$ & 11 & - & - & - & - & 56 dB & 999 a.u. \\	337	585	1 & BRAM & 1 & 1 & 1 & \emph{120} \\
& $pi^S$ & 0 & - & - & - & - & & \\ \hline	338	586	& DSP & 21 & 37 & 47 & \emph{80} \\ \hline
& C & 11 & 39 & - & - & - & & \\	339	587	& LUT & 2374 & 5494 & 691 & \emph{17600} \\
n = 2 & $pi^C$ & 5 & 13 & - & - & - & 82 dB & 972 a.u. \\	340	588	2 & BRAM & 2 & 2 & 2 & \emph{120} \\
& $pi^S$ & 16 & 0 & - & - & - & & \\ \hline	341	589	& DSP & 0 & 0 & 70 & \emph{80} \\ \hline
& C & 9 & 31 & 19 & - & - & & \\	342	590	& LUT & 2443 & 3304 & 3521 & \emph{17600} \\
n = 3 & $pi^C$ & 7 & 8 & 7 & - & - & 93 dB & 990 a.u. \\	343	591	3 & BRAM & 3 & 3 & 3 & \emph{120} \\
& $pi^S$ & 19 & 2 & 0 & - & - & & \\ \hline	344	592	& DSP & 0 & 19 & 35 & \emph{80} \\ \hline
& C & 9 & 19 & 17 & 11 & - & & \\	345	593	& LUT & 2634 & 3753 & 2557 & \emph{17600} \\
n = 4 & $pi^C$ & 4 & 7 & 7 & 5 & - & 99 dB & 992 a.u. \\	346	594	4 & BRAM & 4 & 4 & 4 & \emph{120} \\
& $pi^S$ & 16 & 3 & 3 & 0 & - & & \\ \hline	347	595	& DSP & 0 & 19 & 46 & \emph{80} \\ \hline
& C & 9 & 15 & 11 & 11 & 11 & & \\	348	596	& LUT & 2423 & 3047 & 2847 & \emph{17600} \\
n = 5 & $pi^C$ & 4 & 7 & 5 & 5 & 5 & 99 dB & 998 a.u. \\	349	597	5 & BRAM & 5 & 5 & 5 & \emph{120} \\
& $pi^S$ & 16 & 3 & 2 & 1 & 1 & & \\ \hline	350	598	& DSP & 0 & 22 & 46 & \emph{80} \\ \hline
\end{tabular}	351	599	\end{tabular}
}	352
\caption{Solver results for design with PRN as data input and 1000 a.u. as max arbitrary space}	353
\label{tbl:prn_1000}	354
\end{table}	355	600	\end{table}
	356	601
		602	In some cases, Vivado replaces the DSPs by Look Up Tables (LUTs). We assume that,
		603	when the filter coefficients are small enough, or when the input size is small
		604	enough, Vivado optimizes resource consumption by selecting multiplexers to
		605	implement the multiplications instead of a DSP. In this case, it is quite difficult
		606	to compare the whole silicon budget.
		607
		608	However, a rough estimation can be made with a simple equivalence. Looking at
		609	the first column (MAX/500), where the number of LUTs is quite stable for $n \geq 2$,
		610	we can deduce that a DSP is roughly equivalent to 100~LUTs in terms of silicon
		611	area use. With this equivalence, our 500 arbitrary units correspond to 2500 LUTs,
		612	1000 arbitrary units correspond to 5000 LUTs and 1500 arbitrary units correspond
		613	to 7300 LUTs. The conclusion is that the orders of magnitude of our arbitrary
		614	unit are quite good. The relatively small differences can probably be explained
		615	by the optimizations done by Vivado based on the detailed map of available processing resources.
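
The equivalence used above can be reproduced from Table~\ref{tbl:resources_usage}.
As a rough sketch, we compare the $n = 1$ and $n = 2$ designs of MAX/500, which are
assumed here to occupy a comparable silicon area, the former using 21~DSPs and the
latter none:
\begin{equation*}
\frac{2374 - 249}{21} \approx 101 \text{ LUTs per DSP,}
\end{equation*}
which is consistent with the figure of 100~LUTs per DSP. Applying this ratio to
designs that mix both resources gives, for instance:
\begin{align*}
\text{MAX/1000, } n = 3: & \quad 3304 + 19 \times 100 = 5204 \text{ LUTs,} \\
\text{MAX/1500, } n = 4: & \quad 2557 + 46 \times 100 = 7157 \text{ LUTs.}
\end{align*}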
		616
		617	We present the computation time to solve the quadratic problem.
		618	For each case, the filter solver software is executed on an Intel(R) Xeon(R) CPU E5606
		619	clocked at 2.13~GHz. The CPU has 8 cores that are used by Gurobi to solve
		620	the quadratic problem.
		621
		622	Table~\ref{tbl:area_time} shows the time needed to solve the quadratic
		623	problem when the maximal area is fixed to 500, 1000 and 1500 arbitrary units.
		624
\begin{table}	357	625	\begin{table}
		626	\caption{Time to solve the quadratic program with Gurobi}
		627	\label{tbl:area_time}
\centering	358	628	\centering
{\scalefont{0.85}	359	629	\begin{tabular}{\|c\|c\|c\|c\|}\hline
\begin{tabular}{\|c\|c\|ccccc\|c\|c\|}	360	630	$n$ & Time (MAX/500) & Time (MAX/1000) & Time (MAX/1500) \\\hline\hline
\hline	361	631	1 & 0.1~s & 0.1~s & 0.3~s \\
\multicolumn{2}{\|c\|}{\multirow{2}{}{Stage}} & \multicolumn{5}{c\|}{Stage} & \multirow{2}{}{Rejection} & \multirow{2}{*}{Area} \\ \cline{3-7}	362	632	2 & 1.1~s & 2.2~s & 12~s \\
\multicolumn{2}{\|c\|}{} & i = 1 & i = 2 & i = 3 & i = 4 & i = 5 & & \\ \hline	363	633	3 & 17~s & 137~s ($\approx$ 2~min) & 275~s ($\approx$ 4~min) \\
& C & 39 & - & - & - & - & & \\	364	634	4 & 52~s & 5448~s ($\approx$ 90~min) & 5505~s ($\approx$ 17~h) \\
n = 1 & $pi^C$ & 13 & - & - & - & - & 61 dB & 1131 a.u. \\	365	635	5 & 286~s ($\approx$ 4~min) & 4119~s ($\approx$ 68~min) & 235479~s ($\approx$ 3~days) \\\hline
& $pi^S$ & 0 & - & - & - & - & & \\ \hline	366
& C & 37 & 39 & - & - & - & & \\	367
n = 2 & $pi^C$ & 11 & 13 & - & - & - & 117 dB & 1974 a.u. \\	368
& $pi^S$ & 17 & 0 & - & - & - & & \\ \hline	369
& C & 15 & 35 & 35 & - & - & & \\	370
n = 3 & $pi^C$ & 9 & 11 & 11 & - & - & 138 dB & 1985 a.u. \\	371
& $pi^S$ & 19 & 3 & 0 & - & - & & \\ \hline	372
& C & 11 & 27 & 27 & 23 & - & & \\	373
n = 4 & $pi^C$ & 5 & 9 & 9 & 9 & - & 148 dB & 1993 a.u. \\	374
& $pi^S$ & 16 & 3 & 2 & 0 & - & & \\ \hline	375
& C & 11 & 27 & 31 & 11 & 11 & & \\	376
n = 5 & $pi^C$ & 5 & 9 & 8 & 5 & 5 & 153 dB & 2000 a.u. \\	377
& $pi^S$ & 16 & 3 & 1 & 0 & 1 & & \\ \hline	378
\end{tabular}	379	636	\end{tabular}
}	380
\caption{Solver results for design with PRN as data input and 2000 a.u. as max arbitrary space}	381
\label{tbl:prn_2000}	382
\end{table}	383	637	\end{table}
	384	638
		639	As expected, the computation time seems to rise exponentially with the number of stages. % TODO: exponential?
		640	When the area is limited, the design exploration space is more limited and the solver is able to
		641	find an optimal solution faster. On the contrary, in the case of MAX/1500 with
		642	5~stages, we were not able to obtain a result after 40~hours of computation so we decided to stop.
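
A quick way to examine this growth, using only the MAX/500 column of
Table~\ref{tbl:area_time}, is to take the ratio of consecutive solving times. The
notation $t_n$ for the solving time with $n$ stages is introduced here for
illustration only, and the ratios give no more than a rough indication:
\begin{align*}
\frac{t_2}{t_1} & = \frac{1.1}{0.1} = 11, & \frac{t_3}{t_2} & = \frac{17}{1.1} \approx 15.5, \\
\frac{t_4}{t_3} & = \frac{52}{17} \approx 3.1, & \frac{t_5}{t_4} & = \frac{286}{52} = 5.5 .
\end{align*}
Every ratio is well above one, so each additional stage multiplies the solving time,
but the factor is not constant from one stage to the next.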
		643
		644	\section{Experiments with fixed rejection target}
		645	\label{sec:fixed_rej}
		646	This section presents the results of the complementary quadratic program, in which we
		647	minimize the area occupation for a targeted noise level.
		648
		649	The experimental setup is also composed of three cases. The raw input is the same
		650	as in the previous section, a PRN generator, which fixes the input data size $\Pi^I$.
		651	Then the targeted rejection $\mathcal{R}$ has been fixed to either 40, 60 or 80~dB.
		652	Hence, the three cases have been named: MIN/40, MIN/60, MIN/80.
		653	The number of configurations $p$ is the same as in the previous section.
		654
		655	Table~\ref{tbl:gurobi_min_40} shows the results obtained by the filter solver for MIN/40.
		656	Table~\ref{tbl:gurobi_min_60} shows the results obtained by the filter solver for MIN/60.
		657	Table~\ref{tbl:gurobi_min_80} shows the results obtained by the filter solver for MIN/80.
		658
		659	\renewcommand{\arraystretch}{1.4}
		660
\begin{table}	385	661	\begin{table}
\centering	386	662	\caption{Configurations $(C_i, \pi_i^C, \pi_i^S)$, rejections and areas (in arbitrary units) for MIN/40}
\begin{tabular}{\|c\|c\|c\|c\|c\|}\hline	387	663	\label{tbl:gurobi_min_40}
Input & Stages & Computation time & Vivado time & Redpitaya time \\\hline\hline	388	664	\centering
& 1 & 0.02~s & $\approx$ 20 min & $\approx$ 1 min \\	389	665	{\scalefont{0.77}
PRN & 2 & 1.70~s & $\approx$ 20 min & $\approx$ 1 min \\	390	666	\begin{tabular}{\|c\|ccccc\|c\|c\|}
& 3 & 19~s & $\approx$ 20 min & $\approx$ 1 min \\\hline	391	667	\hline
\end{tabular}	392	668	$n$ & $i = 1$ & $i = 2$ & $i = 3$ & $i = 4$ & $i = 5$ & Rejection & Area \\
\caption{Time to compute and deploy the designs for PRN 500}	393	669	\hline
\label{tbl:time_prn_500}	394	670	1 & (27, 8, 0) & - & - & - & - & 41~dB & 648 \\
		671	2 & (3, 2, 14) & (19, 7, 0) & - & - & - & 40~dB & 263 \\
		672	3 & (3, 3, 15) & (11, 5, 0) & (3, 3, 0) & - & - & 41~dB & 192 \\
		673	4 & (3, 3, 15) & (3, 3, 0) & (3, 3, 0) & (3, 3, 0) & - & 42~dB & 147 \\
		674	\hline
		675	\end{tabular}
		676	}
\end{table}	395	677	\end{table}
	396	678
\begin{table}	397	679	\begin{table}
\centering	398	680	\caption{Configurations $(C_i, \pi_i^C, \pi_i^S)$, rejections and areas (in arbitrary units) for MIN/60}
\begin{tabular}{\|c\|c\|c\|c\|c\|}\hline	399	681	\label{tbl:gurobi_min_60}
Input & Stages & Computation time & Vivado time & Redpitaya time \\\hline\hline	400	682	\centering
& 1 & 0.07~s & $\approx$ 20 min & $\approx$ 1 min \\	401	683	{\scalefont{0.77}
& 2 & 1.31~s & $\approx$ 20 min & $\approx$ 1 min \\	402	684	\begin{tabular}{\|c\|ccccc\|c\|c\|}
PRN & 3 & 119~s ($\approx$ 2~min) & $\approx$ 20 min & $\approx$ 1 min \\	403	685	\hline
& 4 & 270~s ($\approx$ 5~min) & $\approx$ 20 min & $\approx$ 1 min \\	404	686	$n$ & $i = 1$ & $i = 2$ & $i = 3$ & $i = 4$ & $i = 5$ & Rejection & Area \\
& 5 & 5998~s ($\approx$ 2~h) & $\approx$ 20 min & $\approx$ 1 min \\\hline	405	687	\hline
\end{tabular}	406	688	1 & (39, 13, 0) & - & - & - & - & 60~dB & 1131 \\
\caption{Time to compute and deploy the designs for PRN 1000}	407	689	2 & (3, 3, 15) & (35, 10, 0) & - & - & - & 60~dB & 547 \\
\label{tbl:time_prn_1000}	408	690	3 & (3, 3, 15) & (27, 8, 0) & (3, 3, 0) & - & - & 62~dB & 426 \\
		691	4 & (3, 2, 14) & (11, 5, 1) & (11, 5, 0) & (3, 3, 0) & - & 60~dB & 344 \\
		692	5 & (3, 2, 14) & (3, 3, 1) & (3, 3, 0) & (3, 3, 0) & (3, 3, 0) & 60~dB & 279 \\
		693	\hline
		694	\end{tabular}
		695	}
\end{table}	409	696	\end{table}
	410	697
\begin{table}	411	698	\begin{table}
\centering	412	699	\caption{Configurations $(C_i, \pi_i^C, \pi_i^S)$, rejections and areas (in arbitrary units) for MIN/80}
\begin{tabular}{\|c\|c\|c\|c\|c\|}\hline	413	700	\label{tbl:gurobi_min_80}
Input & Stages & Computation time & Vivado time & Redpitaya time \\\hline\hline	414	701	\centering
& 1 & 0.07~s & $\approx$ 20 min & $\approx$ 1 min \\	415	702	{\scalefont{0.77}
& 2 & 0.75~s & $\approx$ 20 min & $\approx$ 1 min \\	416	703	\begin{tabular}{\|c\|ccccc\|c\|c\|}
PRN & 3 & 36~s & - & - \\	417	704	\hline
& 4 & 14500~s ($\approx$ 4~h) & $\approx$ 20 min & $\approx$ 1 min \\	418	705	$n$ & $i = 1$ & $i = 2$ & $i = 3$ & $i = 4$ & $i = 5$ & Rejection & Area \\
& 5 & 74237~s ($\approx$ 20~h) & $\approx$ 20 min & $\approx$ 1 min \\\hline	419	706	\hline
\end{tabular}	420	707	1 & (55, 16, 0) & - & - & - & - & 81~dB & 1760 \\
\caption{Time to compute and deploy the designs for PRN 2000}	421	708	2 & (3, 3, 15) & (47, 14, 0) & - & - & - & 80~dB & 903 \\
\label{tbl:time_prn_2000}	422	709	3 & (3, 3, 15) & (23, 9, 0) & (19, 7, 0) & - & - & 80~dB & 698 \\
		710	4 & (3, 3, 15) & (27, 9, 0) & (7, 7, 4) & (3, 3, 0) & - & 80~dB & 605 \\
		711	5 & (3, 2, 14) & (27, 8, 0) & (3, 3, 1) & (3, 3, 0) & (3, 3, 0) & 81~dB & 534 \\
		712	\hline
		713	\end{tabular}
		714	}
\end{table}	423	715	\end{table}
		716	\renewcommand{\arraystretch}{1}
	424	717
\section{Experiments with fixed rejection target}	425	718	From these tables, we can first state that all configurations reach the target rejection
		719	level, and that the more stages we have, the smaller the occupied area in arbitrary units.
		720	Furthermore, the area of the monolithic filter is about twice that of the two-stage cascade.
		721	This trend holds more generally: each additional stage lowers the occupied area.
	426	722
		723	As in the previous section, the solver always chooses a small filter as the first
		724	filter stage and the second one is often the biggest filter. This choice can be explained
		725	as in the previous section. The solver uses just enough bits not to degrade the input
		726	signal and in the second filter it can choose a better filter to improve rejection without
		727	having too many bits in the output data.
		728
		729	For the specific case of MIN/40 with $n = 5$, the solver has determined that the optimal
		730	number of filters is 4, so it did not choose any configuration for the last filter. Hence this
		731	solution is equivalent to the result for $n = 4$.
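
Constraint~\ref{eq:maxshift} can be checked by hand on the $n = 4$ solution of MIN/40,
$(3, 3, 15)$ followed by three $(3, 3, 0)$ stages. This sketch assumes $\Pi^I = 16$~bits
and reads the sum in equation~\ref{eq:maxshift} as running over the rejections of
stages $1$ to $i$. Since the four stages share the same $(C_i, \pi_i^C)$, and hence the
same value of $F$, it takes $r_i \approx 42 / 4 = 10.5$~dB from the rounded total of
Table~\ref{tbl:gurobi_min_40}:
\begin{align*}
\pi_1^+ & = 16 + 3 - 15 = 4 \geq 1 + 1 + \tfrac{10.5}{6} = 3.75, \\
\pi_2^+ & = 4 + 3 - 0 = 7 \geq 1 + 2 + \tfrac{21}{6} = 6.5, \\
\pi_3^+ & = 7 + 3 - 0 = 10 \geq 1 + 3 + \tfrac{31.5}{6} = 9.25, \\
\pi_4^+ & = 10 + 3 - 0 = 13 \geq 1 + 4 + \tfrac{42}{6} = 12.
\end{align*}
The corresponding area is $3 \times 19 + 3 \times 7 + 3 \times 10 + 3 \times 13 = 147$,
which matches the table.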
		732
		733	The following graphs present the rejection for real data on the FPGA. In all following
		734	figures, the solid line represents the actual rejection of the filtered
		735	data on the FPGA as measured experimentally and the dashed lines are the noise levels
		736	given by the quadratic solver.
		737
		738	Figure~\ref{fig:min_40} shows the rejection of the different configurations in the case of MIN/40.
		739	Figure~\ref{fig:min_60} shows the rejection of the different configurations in the case of MIN/60.
		740	Figure~\ref{fig:min_80} shows the rejection of the different configurations in the case of MIN/80.
		741
\begin{figure}	427	742	\begin{figure}
\centering	428	743	\centering
\includegraphics[width=\linewidth]{images/min_area/prn_50}	429	744	\includegraphics[width=\linewidth]{images/min_40}
\caption{Results for design with PRN as data input and 50 dB as aimed rejection level}	430	745	\caption{Signal spectrum for MIN/40}
\label{fig:prn_500}	431	746	\label{fig:min_40}
\end{figure}	432	747	\end{figure}
	433	748
\begin{figure}	434	749	\begin{figure}
\centering	435	750	\centering
\includegraphics[width=\linewidth]{images/min_area/prn_100}	436	751	\includegraphics[width=\linewidth]{images/min_60}
\caption{Results for design with PRN as data input and 50 dB as aimed rejection level}	437	752	\caption{Signal spectrum for MIN/60}
\label{fig:prn_100}	438	753	\label{fig:min_60}
\end{figure}	439	754	\end{figure}
	440	755
\begin{figure}	441	756	\begin{figure}
\centering	442	757	\centering
\includegraphics[width=\linewidth]{images/min_area/prn_150}	443	758	\includegraphics[width=\linewidth]{images/min_80}
\caption{Results for design with PRN as data input and 2000 a.u. as max arbitrary space}	444	759	\caption{Signal spectrum for MIN/80}
\label{fig:prn_150}	445	760	\label{fig:min_80}
\end{figure}	446	761	\end{figure}
	447	762
		763	We observe that all rejections given by the quadratic solver are close to the measured
		764	rejection. All curves show that the constraint to reach the target rejection is
		765	met for both the monolithic filter and the cascaded filters.
		766
		767	Table~\ref{tbl:resources_usage_comp} shows the resource usage in the case of MIN/40, MIN/60 and
		768	MIN/80, \emph{i.e.} when the target rejection is set to 40, 60 and 80~dB. We
		769	have taken care to extract solely the resources used by
		770	the FIR filters and remove additional processing blocks including FIFO and PL to
		771	PS communication.
		772
		773	\begin{table}
		774	\caption{Resource occupation. The last column refers to available resources on a Zynq-7010 as found on the Redpitaya.}
		775	\label{tbl:resources_usage_comp}
		776	\centering
		777	\begin{tabular}{|c|c|ccc|c|}
		778	\hline
		779	$n$ & & MIN/40 & MIN/60 & MIN/80 & \emph{Zynq 7010} \\ \hline\hline
		780	& LUT & 343 & 334 & 772 & \emph{17600} \\
		781	1 & BRAM & 1 & 1 & 1 & \emph{120} \\
		782	& DSP & 27 & 39 & 55 & \emph{80} \\ \hline
		783	& LUT & 1252 & 2862 & 5099 & \emph{17600} \\
		784	2 & BRAM & 2 & 2 & 2 & \emph{120} \\
		785	& DSP & 0 & 0 & 0 & \emph{80} \\ \hline
		786	& LUT & 891 & 2148 & 2023 & \emph{17600} \\
		787	3 & BRAM & 3 & 3 & 3 & \emph{120} \\
		788	& DSP & 0 & 0 & 19 & \emph{80} \\ \hline
		789	& LUT & 662 & 1729 & 2451 & \emph{17600} \\
		790	4 & BRAM & 4 & 4 & 4 & \emph{120} \\
		791	& DSP & 0 & 0 & 7 & \emph{80} \\ \hline
		792	& LUT & - & 1259 & 2602 & \emph{17600} \\
		793	5 & BRAM & - & 5 & 5 & \emph{120} \\
		794	& DSP & - & 0 & 0 & \emph{80} \\ \hline
		795	\end{tabular}
		796	\end{table}
		797
		798	If we keep the previous estimate of the cost of one DSP in terms of LUTs (1 DSP $\approx$ 100 LUT),
		799	the real resource consumption decreases with the number of filter stages, in agreement
		800	with the solution given by the quadratic solver. Indeed, the consumption always
		801	decreases, even if the difference between the monolithic and the two cascaded
		802	filters is smaller than expected.
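			
			As an illustration, under the same assumption that 1 DSP $\approx$ 100 LUT, the MIN/80 column of
			Table~\ref{tbl:resources_usage_comp} converts to the following LUT-equivalent costs, where the
			symbol $C_n$ is introduced only for this worked example and BRAM usage is left out of the estimate:
			\begin{align*}
			C_1 &\approx 772 + 55 \times 100 = 6272, \\
			C_2 &\approx 5099 + 0 \times 100 = 5099, \\
			C_3 &\approx 2023 + 19 \times 100 = 3923, \\
			C_4 &\approx 2451 + 7 \times 100 = 3151, \\
			C_5 &\approx 2602 + 0 \times 100 = 2602.
			\end{align*}
			With this rough conversion, $C_n$ decreases at every step from $n = 1$ to $n = 5$.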
		803
		804	Finally, Table~\ref{tbl:area_time_comp} shows the computation time needed to solve
		805	the quadratic program.
		806
		807	\begin{table}
		808	\caption{Time to solve the quadratic program with Gurobi}
		809	\label{tbl:area_time_comp}
		810	\centering
		811	\begin{tabular}{|c|c|c|c|}\hline
		812	$n$ & Time (MIN/40) & Time (MIN/60) & Time (MIN/80) \\\hline\hline
		813	1 & 0.07~s & 0.02~s & 0.01~s \\
		814	2 & 7.8~s & 16~s & 14~s \\
		815	3 & 4.7~s & 14~s & 28~s \\
		816	4 & 39~s & 20~s & 193~s \\
		817	5 & 126~s & 12~s & 170~s \\\hline
		818	\end{tabular}
		819	\end{table}
		820
		821	The time needed to solve these configurations is substantially shorter than the time
		822	needed in the previous section. Indeed, the worst case here is only about 3~minutes,
		823	compared with 3~days in the previous section. This problem is therefore much easier
		824	to solve than the previous one.
		825
\section{Conclusion}	448	826	\section{Conclusion}
		827
		828	In this paper, we have proposed a new approach to work with a cascade of FIR filters inside an FPGA.
		829	This method aims to be hardware independent and focuses on a high level of abstraction.
		830	We have modeled the FIR filter operation and the data shift impact. With this model
		831	we have created a quadratic program to select the optimal FIR coefficient set that maximizes
		832	noise rejection. In our experiments we have deliberately chosen some common tools
		833	to design the filter coefficients, but any other method can be used.
		834
		835	Our experimental results are very promising in providing a rational approach to selecting
		836	the coefficients of each FIR filter in the context of a performance target for a chain of
		837	such filters. The FPGA design that is produced automatically by our
		838	workflow is able to filter an input signal as expected, which validates our model and our approach.
		839	The quadratic program can easily be changed to adapt it to another problem.
		840
		841	A perspective is to model the decimators and add them to the processing chain, to obtain a classical
		842	FIR filter and decimator pair. The impact of the decimator is not trivial, especially in terms of silicon
		843	area for the subsequent stages since some hardware optimization can be applied in
		844	this case.
		845
		846	The software used to demonstrate the concepts developed in this paper is based on the
		847	CPU-FPGA co-design framework available at \url{https://github.com/oscimp/oscimpDigital}.
	449	848
\section*{Acknowledgement}	450	849	\section*{Acknowledgement}
	451	850
This work is supported by the ANR Programme d'Investissement d'Avenir in	452	851	This work is supported by the ANR Programme d'Investissement d'Avenir in
progress at the Time and Frequency Departments of the FEMTO-ST Institute	453	852	progress at the Time and Frequency Departments of the FEMTO-ST Institute
(Oscillator IMP, First-TF and Refimeve+), and by R\'egion de Franche-Comt\'e.	454	853	(Oscillator IMP, First-TF and Refimeve+), and by R\'egion de Franche-Comt\'e.
The authors would like to thank E. Rubiola, F. Vernotte, and G. Cabodevila	455	854	The authors would like to thank E. Rubiola, F. Vernotte, and G. Cabodevila
for support and fruitful discussions.	456	855	for support and fruitful discussions.
	457	856
\bibliographystyle{IEEEtran}	458	857	\bibliographystyle{IEEEtran}
\balance	459	858	\balance
\bibliography{references,biblio}	460	859	\bibliography{references,biblio}
\end{document}	461	860	\end{document}
	462	861

references.bib

Diff comments View file @ 0096d46

@article{yu2007design,	1	1	@article{yu2007design,
title={Design of linear phase {FIR} filters in subexpression space using mixed integer linear programming},	2	2	title={Design of linear phase {FIR} filters in subexpression space using mixed integer linear programming},
author={Yu, Ya Jun and Lim, Yong Ching},	3	3	author={Yu, Ya Jun and Lim, Yong Ching},
journal={IEEE Transactions on Circuits and Systems I: Regular Papers},	4	4	journal={IEEE Transactions on Circuits and Systems I: Regular Papers},
volume={54},	5	5	volume={54},
number={10},	6	6	number={10},
pages={2330--2338},	7	7	pages={2330--2338},
year={2007},	8	8	year={2007},
publisher={IEEE}	9	9	publisher={IEEE}
}	10	10	}
	11	11
@article{kodek1980design,	12	12	@article{kodek1980design,
title={Design of optimal finite wordlength {FIR} digital filters using integer	13	13	title={Design of optimal finite wordlength {FIR} digital filters using integer
programming techniques},	14	14	programming techniques},
author={Kodek, Dusan},	15	15	author={Kodek, Dusan},
journal={IEEE Transactions on Acoustics, Speech, and Signal Processing},	16	16	journal={IEEE Transactions on Acoustics, Speech, and Signal Processing},
volume={28},	17	17	volume={28},
number={3},	18	18	number={3},
pages={304--308},	19	19	pages={304--308},
year={1980},	20	20	year={1980},
publisher={IEEE}	21	21	publisher={IEEE}
}	22	22	}
	23	23
@book{leung2004handbook,	24	24	@book{leung2004handbook,
title={Handbook of scheduling: algorithms, models, and performance analysis},	25	25	title={Handbook of scheduling: algorithms, models, and performance analysis},
author={Leung, Joseph YT},	26	26	author={Leung, Joseph YT},
year={2004},	27	27	year={2004},
publisher={CRC Press}	28	28	publisher={CRC Press}
}	29	29	}
	30	30
@misc{glpk,	31	31	@misc{glpk,
title={\url{https://www.gnu.org/software/glpk/}},	32	32	title={\url{https://www.gnu.org/software/glpk/}},
note={available online, accessed May 2018}	33	33	note={available online, accessed May 2018}
}	34	34	}
	35	35
@article{rsi,	36	36	@article{rsi,
title={Oscillator metrology with software defined radio},	37	37	title={Oscillator metrology with software defined radio},
author={Sherman, Jeff A and J{\"o}rdens, Robert},	38	38	author={Sherman, Jeff A and J{\"o}rdens, Robert},
journal={Review of Scientific Instruments},	39	39	journal={Review of Scientific Instruments},
volume={87},	40	40	volume={87},
number={5},	41	41	number={5},
pages={054711},	42	42	pages={054711},
year={2016},	43	43	year={2016},
publisher={AIP Publishing}	44	44	publisher={AIP Publishing}
}	45	45	}
		46
		47	@inproceedings{lim_1996,
		48	author={Y.-C. Lim and R. Yang and B. Liu},
		49	booktitle={1996 IEEE International Symposium on Circuits and Systems. Circuits and Systems Connecting the World. ISCAS 96},
		50	title={The design of cascaded FIR filters},
		51	year={1996},
		52	volume={2},
		53	number={},
		54	pages={181-184 vol.2},
		55	keywords={cascade networks;digital filters;FIR filters;filtering theory;linear programming;frequency response;cascaded FIR filters;stopband response;minimum attenuation requirement;passband ripple magnitude;linear-programming technique;FIR filter design;filter optimisation;Finite impulse response filter;IIR filters;Passband;Frequency;Signal sampling;Band pass filters;Digital filters;Attenuation;Image sampling;Linear programming},
		56	doi={10.1109/ISCAS.1996.540382},
		57	ISSN={},
		58	month={May},}
		59
		60	@article{lim_1988,
		61	author={Y. C. {Lim} and B. {Liu}},
		62	journal={IEEE Transactions on Acoustics, Speech, and Signal Processing},
		63	title={Design of cascade form FIR filters with discrete valued coefficients},
		64	year={1988},
		65	volume={36},
		66	number={11},
		67	pages={1735-1739},
		68	keywords={cascade networks;digital filters;filtering and prediction theory;iterative equalisation strategy;cascade form FIR filters;discrete valued coefficients;peak ripple;prototype filter;roundoff noise property;Finite impulse response filter;Low pass filters;Band pass filters;Passband;Prototypes;Frequency;Digital filters;Digital arithmetic;Design optimization;Sampling methods},
		69	doi={10.1109/29.9010},
		70	ISSN={0096-3518},
		71	month={Nov},}
		72
		73	@inproceedings{young_1992,
		74	author={C. {Young} and D. L. {Jones}},
		75	booktitle={[Proceedings] ICASSP-92: 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing},
		76	title={Improvement in finite wordlength FIR digital filter design by cascading},
		77	year={1992},
		78	volume={5},
		79	number={},
		80	pages={109-112 vol.5},
		81	keywords={approximation theory;digital filters;integer programming;series (mathematics);finite wordlength filter;quantization;FIR digital filter design;finite impulse response;digital systems;finite wordlength coefficients;cascaded subfilters;stopband suppression;Taylor series approximation;linear integer program;passband deviation;Finite impulse response filter;Digital filters;Linear programming;Passband;Quantization;Frequency response;Digital systems;Taylor series;Minimax techniques;Design optimization},
		82	doi={10.1109/ICASSP.1992.226646},
		83	ISSN={1520-6149},
		84	month={March},}
		85
		86	@article{smith_1998,
		87	author={L. M. {Smith}},
		88	journal={IEEE Transactions on Signal Processing},
		89	title={Decomposition of FIR digital filters for realization via the cascade connection of subfilters},
		90	year={1998},
		91	volume={46},
		92	number={6},
		93	pages={1681-1684},
		94	keywords={FIR filters;digital filters;cascade networks;Z transforms;transfer functions;frequency response;Newton-Raphson method;convergence of numerical methods;search problems;poles and zeros;FIR digital filters;subfilters cascade connection;even-order linear-phase FIR filters;filter decomposition;fourth-order subfilters;second-order subfilters;roots;z-domain filter transfer function;complex z plane;impulse response symmetry;unit circle;perimeter;complex values;real values;impulse response coefficients;root-finding algorithm;Newton-Raphson method;2D search;Cauchy-Riemann relations;convergence speed;frequency response characteristics;Finite impulse response filter;Digital filters;Polynomials;Programmable logic arrays;Transfer functions;Testing;Frequency response;Application specific integrated circuits;Nonlinear filters;Passband},