
% merge: maximize rejection for a given area vs. minimize area for a given rejection
% demonstrate how quantization pushes noise towards high frequencies => 6 dB of
% rejection per bit, and loss if fewer bits than rejection/6
% develop the linear program including the bit shift
% stress that previously the design was synthesizable but not implementable, whereas now we
% implement it and demonstrate that it runs
% gwen: why is the FIR now implementable when it was not before, even on the zedboard -> new FIR?
% Gwen: can we build a real phase noise test bench with this FIR, i.e. add ADC, NCO and mixer
% (zedboard or redpit)

% label schema: check that "argue for the FIR cascade" is done

\documentclass[a4paper,journal]{IEEEtran/IEEEtran}
\usepackage{graphicx,color,hyperref}
\usepackage{amsfonts}
\usepackage{amsthm}
\usepackage{amssymb}
\usepackage{amsmath}
\usepackage{algorithm2e}
\usepackage{url,balance}
\usepackage[normalem]{ulem}
\usepackage{tikz}
\usetikzlibrary{positioning,fit}
\usepackage{multirow}
\usepackage{scalefnt}
\usepackage{caption}
\usepackage{subcaption}

% correct bad hyphenation here
\hyphenation{op-tical net-works semi-conduc-tor}
\textheight=26cm
\setlength{\footskip}{30pt}
\pagenumbering{gobble}

\begin{document}


\begin{document}

\title{Filter optimization for real time digital processing of radiofrequency signals: application
to oscillator metrology}

\author{\IEEEauthorblockN{A. Hugeat\IEEEauthorrefmark{1}\IEEEauthorrefmark{2}, J. Bernard\IEEEauthorrefmark{2},
G. Goavec-M\'erou\IEEEauthorrefmark{1},
P.-Y. Bourgeois\IEEEauthorrefmark{1}, J.-M. Friedt\IEEEauthorrefmark{1}}\\
\IEEEauthorblockA{\IEEEauthorrefmark{1}FEMTO-ST, Time \& Frequency department, Besan\c con, France }\\
\IEEEauthorblockA{\IEEEauthorrefmark{2}FEMTO-ST, Computer Science department DISC, Besan\c con, France \\
Email: \{pyb2,jmfriedt\}@femto-st.fr}
}
\maketitle
\thispagestyle{plain}
\pagestyle{plain}
\newtheorem{definition}{Definition}

\begin{abstract}
Software Defined Radio (SDR) provides stability, flexibility and reconfigurability to
radiofrequency signal processing. When applied to oscillator characterization in the context
of ultrastable clocks, SDR faces stringent filtering requirements defined by spurious signal or
noise rejection needs. Since real time radiofrequency processing must be performed in a
Field Programmable Gate Array (FPGA) to meet timing constraints, we investigate optimization strategies
to design filters meeting rejection characteristics while limiting the hardware resources
required and keeping timing constraints within the targeted measurement bandwidths. The
presented technique is applicable to scheduling any sequence of processing blocks characterized
by a throughput, resource occupation and performance tabulated as a function of configuration
characteristics, as is the case for filters with their coefficients and resolution yielding
rejection and number of multipliers.
\end{abstract}

\begin{IEEEkeywords}
Software Defined Radio, Mixed-Integer Linear Programming, Finite Impulse Response filter
\end{IEEEkeywords}

\section{Digital signal processing of ultrastable clock signals}

Oscillator phase noise characterization is classically performed in the analog domain by
downconverting the radiofrequency signal to baseband using a saturated mixer,
followed by a Fourier analysis of the beat signal to analyze phase fluctuations close to the carrier. In
a fully digital approach, the radiofrequency signal is digitized and numerically downconverted by
multiplying the samples with a local numerically controlled oscillator (Fig. \ref{schema}) \cite{rsi}.

\begin{figure}[h!tb]
\begin{center}
\includegraphics[width=.8\linewidth]{images/schema}
\end{center}
\caption{Fully digital oscillator phase noise characterization: the Device Under Test
(DUT) signal is sampled by the radiofrequency grade Analog to Digital Converter (ADC) and
downconverted by mixing with a Numerically Controlled Oscillator (NCO). Unwanted signals
and noise aliases are rejected by a Low Pass Filter (LPF) implemented as a cascade of Finite
Impulse Response (FIR) filters. The signal is then decimated before a Fourier analysis displays
the spectral characteristics of the phase fluctuations.}
\label{schema}
\end{figure}

As with the analog mixer,
the non-linear behavior of the downconverter introduces noise or spurious signal aliasing as
well as the generation of the frequency sum signal in addition to the frequency difference.
These unwanted spectral components must be rejected before decimating the data stream
for the phase noise spectral characterization \cite{andrich2018high}. The filters introduced between the
downconverter
and the decimation processing blocks are core elements of an oscillator characterization
system, and must reject out-of-band signals below the targeted phase noise, typically in the
sub $-170$~dBc/Hz range for the ultrastable oscillators we aim at characterizing. The filter blocks will
use most resources of the Field Programmable Gate Array (FPGA) used to process the radiofrequency
datastream: optimizing the performance of the filter while reducing the needed resources is
hence tackled in a systematic approach using optimization techniques. Most significantly, we
tackle the issue by attempting to cascade multiple Finite Impulse Response (FIR) filters with
a tunable number of coefficients and a tunable number of bits representing the coefficients and the
data being processed.

\section{Finite impulse response filter}

We select FIR filters for their unconditional stability and ease of design. A FIR filter is defined
by a set of weights $b_k$ applied to the inputs $x_n$ through a convolution to generate the
outputs $y_n$
\begin{align}
y_n=\sum_{k=0}^N b_k x_{n-k}
\label{eq:fir_equation}
\end{align}
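
As a reminder of a standard result, not specific to this work, the frequency response associated
with Eq.~\eqref{eq:fir_equation} follows from applying a complex exponential input
$x_n=e^{2j\pi n f/f_s}$, where the sampling frequency $f_s$ and the transfer function $H(f)$ are
notations introduced here for illustration only:
\begin{align*}
y_n=\sum_{k=0}^N b_k e^{2j\pi (n-k) f/f_s}=H(f)\,x_n
\quad\mbox{with}\quad
H(f)=\sum_{k=0}^N b_k e^{-2j\pi k f/f_s}
\end{align*}
Out-of-band rejection is then assessed from the magnitude $20\log_{10}|H(f)|$, in dB, over the
frequency band to be attenuated.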

As opposed to an implementation on a general purpose processor in which word size is defined by the
processor architecture, implementing such a filter on an FPGA offers more degrees of freedom since
not only the coefficient values and number of taps must be defined, but also the number of bits
defining the coefficients and the sample size. For this reason, and because we consider pipeline
processing (as opposed to First-In, First-Out FIFO memory batch processing) of radiofrequency
signals, High Level Synthesis (HLS) languages \cite{kasbah2008multigrid} are not considered but
the problem is tackled at the Very-high-speed-integrated-circuit Hardware Description Language
(VHDL) level.
{\color{red}Since latency is not an issue in an open-loop phase noise characterization instrument,
the large
number of taps in the FIR, as opposed to the shorter Infinite Impulse Response (IIR) filter,
is not considered an issue as it would be in a closed-loop system.} % r2.4

The coefficients are classically expressed as floating point values. However, this binary
number representation is not efficient for fast arithmetic computation by an FPGA. Instead,
we choose to quantize these floating point values into integer values. This quantization
will result in some precision loss.
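
As a hedged illustration of this precision loss, assume (an assumption made here, not a statement
about the implementation) that each quantized coefficient $\tilde b_k=b_k+e_k$ deviates from its
floating point value by at most half a quantization step $\Delta$, i.e. $|e_k|\leq\Delta/2$. The
output of Eq.~\eqref{eq:fir_equation} computed with $\tilde b_k$, written $\tilde y_n$, then
deviates from the ideal output by at most
\begin{align*}
|\tilde y_n-y_n|=\left|\sum_{k=0}^N e_k x_{n-k}\right|
\leq (N+1)\,\frac{\Delta}{2}\,\max_{0\leq k\leq N}|x_{n-k}|
\end{align*}
Each additional bit used to represent the coefficients halves $\Delta$ and hence lowers this bound
by $20\log_{10}2\simeq 6.02$~dB.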

\begin{figure}[h!tb]
\includegraphics[width=\linewidth]{images/zero_values}
\caption{Impact of the quantization resolution of the coefficients: the quantization is
set to 6~bits, with the horizontal black lines indicating $\pm$1 least significant bit, which sets
the first 30 and last 30 of the initial 128~band-pass
filter coefficients to 0 (red dots).}
\label{float_vs_int}
\end{figure}

The tradeoff between quantization resolution and number of coefficients when considering
integer operations is not trivial. As an illustration of the
relation between the number of filter taps and quantization, Fig. \ref{float_vs_int} exhibits
a 128-coefficient FIR bandpass filter designed using floating point numbers (blue). Upon
quantization on 6~bit integers, 60 of the 128~coefficients at the beginning and end of the
taps become null, {\color{red}making the large number of coefficients irrelevant: processing
resources % r1.1
are hence saved by shrinking the filter length.} This tradeoff, aimed at minimizing resources
to reach a given rejection level or maximizing out-of-band rejection for a given computational
resource, will drive the investigation on cascading filters designed with varying tap resolution
and tap length, as will be shown in the next section. Indeed, our development strategy closely
follows the skeleton approach \cite{crookes1998environment, crookes2000design, benkrid2002towards}
in which basic blocks are defined and characterized before being assembled \cite{hide}
in a complete processing chain. In our case, assembling the filter blocks is a simpler block
combination process since we assume a single value to be processed and a single value to be
generated at each clock cycle. The FIR filters are not considered to decimate in the
current implementation: the decimation is assumed to be located after the FIR cascade for the
moment.

\section{Methodology description}

Our objective is to develop a new methodology applicable to any Digital Signal Processing (DSP)
chain obtained by assembling basic processing blocks, with hardware and manufacturer independence.
Achieving such a target requires defining an abstract model to represent some basic properties
of DSP blocks such as performance (e.g. rejection or passband ripples for filters) and
resource occupation. These abstract properties, not necessarily related to the detailed hardware
implementation of a given platform, will feed a scheduler solver aimed at assembling the optimum
target, whether in terms of maximizing performance for a given arbitrary resource occupation, or
minimizing resource occupation for a given performance. In our approach, the solution of the
solver is then synthesized using the dedicated tool provided by each platform manufacturer
to assess the validity of our abstract resource occupation indicator, and the result of running
the DSP chain on the FPGA allows for assessing the performance of the scheduler. We emphasize
that all solutions found by the solver are synthesized and executed on hardware at the end
of the analysis.

In this demonstration, we focus on only two operations: filtering and shifting the number of
bits needed to represent the data along the processing chain.
We have chosen these basic operations because shifting and filtering have already been studied
in the literature \cite{lim_1996, lim_1988, young_1992, smith_1998}, providing a framework for
assessing our results. Furthermore, filtering is a core step in any radiofrequency frontend
requiring pipelined processing at full bandwidth for the earliest steps, including for
time and frequency transfer or characterization \cite{carolina1,carolina2,rsi}.

Addressing only two operations allows for demonstrating the methodology but should not be
considered as a limitation of the framework, which can be extended to assembling any number
of skeleton blocks as long as performance and resource occupation can be determined. {\color{red}
Hence,
in this paper we will apply our methodology to simple DSP chains: a white noise input signal % r1.2
is generated using a Pseudo-Random Number (PRN) generator or by sampling with a wideband (125~MS/s)
14-bit Analog to Digital Converter (ADC) loaded by a 50~$\Omega$ resistor.} Once samples have been
digitized at a rate of 125~MS/s, filtering is applied to qualify the processing block performance,
practically meeting the radiofrequency frontend requirement of noise and bandwidth reduction
by filtering and decimating. Finally, bursts of filtered samples are stored for post-processing,
allowing us either to assess filter rejection for a given resource usage, or to validate the rejection
when implementing a solution minimizing resource occupation.

{\color{red}
The first step of our approach is to model the DSP chain. Since we aim at optimizing only % r1.3
the filtering part of the signal processing chain, we have not included the PRN generator or the
ADC in the model: the input data size and rate are considered fixed and defined by the hardware.
The filtering can be done in two ways, either by considering a single monolithic FIR filter
requiring many coefficients to reach the targeted noise rejection ratio, or by
cascading multiple FIR filters, each with fewer coefficients than found in the monolithic filter.}

After each filter we leave the possibility of shifting the filtered data to consume
fewer resources. Hence in the case of cascaded filters, we define a stage as a filter
and a shifter (the shift could be omitted if we do not need to divide the filtered data).
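
The benefit expected from such a cascade can be sketched under the idealized assumption that each
stage $i$ behaves as a linear filter of transfer function $H_i(f)$ (a notation introduced here for
illustration) and that the truncation introduced by the shifters is neglected. The transfer
functions of $n$ cascaded stages then multiply, so that their magnitudes expressed in dB add:
\begin{align*}
20\log_{10}\left|\prod_{i=1}^{n}H_i(f)\right|=\sum_{i=1}^{n}20\log_{10}\left|H_i(f)\right|
\end{align*}
For example, two stages each attenuating a given out-of-band frequency by 40~dB ideally attenuate
it by 80~dB, provided the data word keeps enough bits to represent the attenuated signal.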

\subsection{Model of a FIR filter}

A cascade of filters is composed of $n$ FIR stages. In stage $i$ ($1 \leq i \leq n$)
the FIR has $C_i$ coefficients and each coefficient is an integer value with $\pi^C_i$
bits while the filtered data are shifted by $\pi^S_i$ bits. We also define $\pi^-_i$ as
the size of input data and $\pi^+_i$ as the size of output data. Figure~\ref{fig:fir_stage}
shows a filtering stage.

\begin{figure}
\centering
\begin{tikzpicture}[node distance=2cm]
\node[draw,minimum size=1.3cm] (FIR) { $C_i, \pi_i^C$ } ;
\node[draw,minimum size=1.3cm] (Shift) [right of=FIR, ] { $\pi_i^S$ } ;
\node (Start) [left of=FIR] { } ;
\node (End) [right of=Shift] { } ;

\node[draw,fit=(FIR) (Shift)] (Filter) { } ;

\draw[->] (Start) edge node [above] { $\pi_i^-$ } (FIR) ;
\draw[->] (FIR) -- (Shift) ;
\draw[->] (Shift) edge node [above] { $\pi_i^+$ } (End) ;
\end{tikzpicture}
\caption{A single filtering stage is composed of a FIR (on the left) and a shifter (on the right)}
\label{fig:fir_stage}
\end{figure}
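
A worst-case estimate of the data size within a stage, given here as a hedged sketch and not
necessarily the relation retained in the optimization model, follows from standard fixed-point
arithmetic: the product of a $\pi_i^-$-bit sample by a $\pi_i^C$-bit coefficient fits on
$\pi_i^-+\pi_i^C$ bits, and accumulating $C_i$ such products adds at most
$\lceil\log_2 C_i\rceil$ bits. After discarding $\pi_i^S$ bits in the shifter, the output size
needed to avoid overflow hence satisfies
\begin{align*}
\pi_i^+\leq\pi_i^-+\pi_i^C+\lceil\log_2 C_i\rceil-\pi_i^S
\end{align*}
As an arbitrary numerical example, $\pi_i^-=14$, $\pi_i^C=8$ and $C_i=32$ yield at most
$14+8+5=27$~bits at the FIR output, brought back to 14~bits with $\pi_i^S=13$.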

FIR $i$ has been characterized through numerical simulation as able to reject $F(C_i, \pi_i^C)$ dB.
This rejection has been computed using GNU Octave software FIR coefficient design functions
(\texttt{firls} and \texttt{fir1}).
For each configuration $(C_i, \pi_i^C)$, we first create a FIR with floating point coefficients and a given $C_i$ number of coefficients.
Then, the floating point coefficients are discretized into integers. In order to ensure that the coefficients are effectively coded on $\pi_i^C$~bits,
the coefficients are normalized by their absolute maximum before being scaled to integer coefficients.
At least one coefficient is coded on $\pi_i^C$~bits, and in practice only $b_{C_i/2}$ is coded on $\pi_i^C$~bits while the others are coded on far fewer bits.
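
The normalization and scaling steps can be sketched as follows, assuming signed integers and
rounding to the nearest integer (both are assumptions made here for illustration, as is the
notation $\tilde b_k$ for the integer coefficients):
\begin{align*}
\tilde b_k=\operatorname{round}\left(\frac{b_k}{\max_j|b_j|}\left(2^{\pi_i^C-1}-1\right)\right)
\end{align*}
Under these assumptions, the largest coefficient maps to $\pm(2^{\pi_i^C-1}-1)$ and any coefficient
whose magnitude relative to the maximum is below $1/(2(2^{\pi_i^C-1}-1))$ rounds to 0. For
$\pi_i^C=6$, this threshold is $1/62\simeq 0.016$, about 36~dB below the largest coefficient.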

With these coefficients, the \texttt{freqz} function is used to estimate the magnitude of the filter
transfer function.
Comparing the performance between FIRs however requires defining a unique criterion. As shown in Fig.~\ref{fig:fir_mag},
the FIR magnitude exhibits two parts: we focus here on the transition width and the rejection rather than on the
passband ripples as emphasized in \cite{lim_1988,lim_1996}. {\color{red}Throughout this demonstration,
we arbitrarily set a passband of 40\% of the Nyquist frequency and a stopband from 60\%
of the Nyquist frequency to the end of the band, as would be typically selected to prevent
aliasing before decimating the dataflow by 2. The method is however generalized to any filter
shape as long as it is defined from the initial modelling steps: Fig. \ref{fig:rejection_pyramid}
as described below is indeed unique for each filter shape.}

\begin{figure}
\begin{center}
\scalebox{0.8}{
\centering
\begin{tikzpicture}[scale=0.3]
\draw[<->] (0,15) -- (0,0) -- (21,0) ;
\draw[thick] (0,12) -- (8,12) -- (20,0) ;
\draw (0,14) node [left] { $P$ } ;
\draw (20,0) node [below] { $f$ } ;
\draw[>=latex,<->] (0,14) -- (8,14) ;
\draw (4,14) node [above] { passband } node [below] { $40\%$ } ;
\draw[>=latex,<->] (8,14) -- (12,14) ;
\draw (10,14) node [above] { transition } node [below] { $20\%$ } ;
\draw[>=latex,<->] (12,14) -- (20,14) ;
\draw (16,14) node [above] { stopband } node [below] { $40\%$ } ;
\draw[>=latex,<->] (16,12) -- (16,8) ;
\draw (16,10) node [right] { rejection } ;
\draw[dashed] (8,-1) -- (8,14) ;
\draw[dashed] (12,-1) -- (12,14) ;
\draw[dashed] (8,12) -- (16,12) ;
\draw[dashed] (12,8) -- (16,8) ;
\end{tikzpicture}
}
\end{center}
\caption{Shape of the filter transmitted power $P$ as a function of frequency $f$:
the passband is considered to occupy the initial 40\% of the Nyquist frequency range,
the stopband the last 40\%, allowing 20\% transition width.}
\label{fig:fir_mag}
\end{figure}

In the transition band, the behavior of the filter is left free: we only {\color{red}define} the passband and the stopband characteristics.
% r2.7
% Our initial criterion considered the mean value of the stopband rejection, as shown in figure~\ref{fig:mean_criterion}. This criterion
% yields unacceptable results since notches overestimate the rejection capability of the filter. Furthermore, the losses within
% the passband are not considered and might be excessive for excessively wide transitions widths introduced for filters with few coefficients.
Our criterion to compute the filter rejection considers
% r2.8, r2.2 and r2.3
the maximum magnitude within the stopband, from which the {\color{red}sum of the absolute values
within the passband is subtracted to avoid filters with excessive ripples}. With this
criterion, we meet the expected rejection capability of low pass filters as shown in figure~\ref{fig:custom_criterion}.
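The criterion can be sketched numerically as follows. This is an illustrative Python version, not the Octave script used for this paper, and it relies on several assumptions: the band edges are those of figure~\ref{fig:fir_mag}, the response is normalized to its DC gain, the result is counted positive when the filter rejects, and the passband penalty is the mean of the absolute passband values in dB, following the caption of figure~\ref{fig:custom_criterion} (the text above mentions a sum). The names \texttt{rejection\_criterion} and \texttt{windowed\_sinc} are hypothetical.

```python
import cmath
import math


def magnitude_db(taps, omega):
    """Magnitude in dB of the FIR response at pulsation omega (rad/sample)."""
    h = sum(b * cmath.exp(-1j * omega * k) for k, b in enumerate(taps))
    return 20.0 * math.log10(max(abs(h), 1e-12))  # floor avoids log10(0) in notches


def rejection_criterion(taps, pass_edge=0.4, stop_edge=0.6, npoints=512):
    """Stopband rejection penalized by the passband deviation, in dB.

    Band edges are fractions of the Nyquist frequency (assumed values).
    """
    dc = magnitude_db(taps, 0.0)
    grid = [i / (npoints - 1) for i in range(npoints)]  # 0 .. 1 (Nyquist)
    resp = [(f, magnitude_db(taps, math.pi * f) - dc) for f in grid]
    passband = [m for f, m in resp if f <= pass_edge]
    stopband = [m for f, m in resp if f >= stop_edge]
    worst_stop = max(stopband)  # weakest attenuation found in the stopband
    ripple = sum(abs(m) for m in passband) / len(passband)
    return -worst_stop - ripple


def windowed_sinc(ntaps, cutoff=0.5):
    """Hypothetical test filter: Hamming-windowed sinc, cutoff relative to Nyquist."""
    mid = (ntaps - 1) / 2.0
    taps = []
    for k in range(ntaps):
        x = k - mid
        ideal = cutoff if x == 0 else math.sin(math.pi * cutoff * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * k / (ntaps - 1))
        taps.append(ideal * window)
    return taps


short_filter = rejection_criterion(windowed_sinc(11))
long_filter = rejection_criterion(windowed_sinc(41))
```

Taking the maximum over the stopband, rather than the mean, prevents deep notches from inflating the score, and the longer filter scores higher because its transition fits within the 20\% band left free.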

% \begin{figure}
% \centering
% \includegraphics[width=\linewidth]{images/colored_mean_criterion}
% \caption{Mean stopband rejection criterion comparison between monolithic filter and cascaded filters}
% \label{fig:mean_criterion}
% \end{figure}

\begin{figure}
\centering
\includegraphics[width=\linewidth]{images/colored_custom_criterion}
\caption{Custom criterion (maximum rejection in the stopband minus the mean of the absolute value of the passband rejection)
comparison between monolithic filter and cascaded filters}
\label{fig:custom_criterion}
\end{figure}

Thanks to the latter criterion, which is used in the remainder of this paper, we are able to automatically generate multiple sets of FIR taps
and estimate their rejection. Figure~\ref{fig:rejection_pyramid} exhibits the
rejection as a function of the number of coefficients and the number of bits representing these coefficients.
The pyramid-shaped curve exhibits the optimum configuration sets at the vertex where both edges meet.
Indeed, for a given number of coefficients, increasing the number of bits beyond the edge will not improve the rejection.
Conversely, for a given number of bits, increasing the number of coefficients will not improve
the rejection. Hence the best coefficient sets are on the vertex of the pyramid.

\begin{figure}
\centering
\includegraphics[width=\linewidth]{images/rejection_pyramid}
\caption{Rejection as a function of number of coefficients and number of bits}
\label{fig:rejection_pyramid}
\end{figure}

Although we have an efficient criterion to estimate the rejection of one set of coefficients (taps),
a difficulty arises when we cascade filters and estimate the criterion as the sum of two or more individual criteria.
If the FIR filter coefficients are the same between the stages, we have:
$$F_{total} = F_1 + F_2$$
But selecting two different sets of coefficients yields a more complex situation in which
the previous relation is no longer valid, as illustrated in figure~\ref{fig:sum_rejection}. The red and blue curves
are two different filters whose maxima and notches are not located at the same frequency offsets.
Hence when summing the transfer functions (in dB), the resulting rejection, shown as the dashed yellow line, is improved
with respect to a basic sum of the rejection criteria, shown as the dotted yellow line.
% r2.9
Thus, estimating the rejection of filter cascades is more complex than taking the sum of the rejection
criteria of each filter. However, since this sum underestimates the rejection capability of the cascade,
% r2.10
this bound is considered as a conservative and acceptable criterion for deciding on the suitability
of the filter cascade to meet design criteria.
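The conservative nature of this sum can be made explicit for the stopband term of the criterion. As a sketch, writing $H_1$ and $H_2$ for the transfer functions of two cascaded stages and $\mathcal{S}$ for the stopband (notations introduced here for illustration, the passband term being left aside), the cascade response is the product $H_1 H_2$, so that
\begin{align*}
\max_{f \in \mathcal{S}} 20\log_{10}\left|H_1(f) H_2(f)\right|
 & = \max_{f \in \mathcal{S}} \left( 20\log_{10}|H_1(f)| + 20\log_{10}|H_2(f)| \right) \\
 & \leq \max_{f \in \mathcal{S}} 20\log_{10}|H_1(f)| + \max_{f \in \mathcal{S}} 20\log_{10}|H_2(f)| .
\end{align*}
Equality requires both stages to reach their stopband maximum at a common frequency, which is the case when both stages use identical coefficients. Otherwise the residual stopband level of the cascade is lower than the sum of the individual levels, meaning that the cascade rejects more than the sum of the individual rejections suggests.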

\begin{figure}
\centering
\includegraphics[width=\linewidth]{images/cascaded_criterion}
\caption{Rejection of two cascaded filters}
\label{fig:sum_rejection}
\end{figure}

% r2.6
Finally, in our case, we consider that the input signal is fully known. The
resolution of the data stream is therefore fixed and identical for all experiments
in this paper.

Based on this analysis, we address the estimate of resource consumption (called
% r2.11
silicon area -- in the case of FPGAs this means processing cells) as a function of
filter characteristics. As a reminder, we do not aim at matching an actual hardware
configuration but consider an arbitrary silicon area occupied by each processing function,
and will assess after synthesis the adequacy of this arbitrary unit with the actual
hardware resources provided by FPGA manufacturers. The sum of individual processing
unit areas is constrained by a total silicon area representative of FPGA global resources.
Formally, variable $a_i$ is the area taken by filter~$i$
(in arbitrary units). Variable $r_i$ is the rejection of filter~$i$ (in dB).
Constant $\mathcal{A}$ is the total available area. We model our problem as follows:

\begin{align}
\text{Maximize } & \sum_{i=1}^n r_i \notag \\
\sum_{i=1}^n a_i & \leq \mathcal{A} & \label{eq:area} \\
a_i & = C_i \times (\pi_i^C + \pi_i^-), & \forall i \in [1, n] \label{eq:areadef} \\
r_i & = F(C_i, \pi_i^C), & \forall i \in [1, n] \label{eq:rejectiondef} \\
\pi_i^+ & = \pi_i^- + \pi_i^C - \pi_i^S, & \forall i \in [1, n] \label{eq:bits} \\
\pi_{i - 1}^+ & = \pi_i^-, & \forall i \in [2, n] \label{eq:inout} \\
\pi_i^+ & \geq 1 + \sum_{k=1}^{i} \left(1 + \frac{r_k}{6}\right), & \forall i \in [1, n] \label{eq:maxshift} \\
\pi_1^- &= \Pi^I \label{eq:init}
\end{align}

Equation~\ref{eq:area} states that the total area taken by the filters must not
exceed the available area. Equation~\ref{eq:areadef} gives the definition of
the area used by a filter, considered as the area of the FIR since the Shifter is
assumed not to require significant resources. We consider that the FIR needs $C_i$ registers of size
$\pi_i^C + \pi_i^-$~bits to store the results of the multiplications of the
input data with the coefficients. Equation~\ref{eq:rejectiondef} gives the
definition of the rejection of the filter thanks to the tabulated function~$F$ that we defined
previously. The Shifter does not introduce negative rejection as we will explain later,
so the rejection only comes from the FIR. Equation~\ref{eq:bits} states the
relation between $\pi_i^+$ and $\pi_i^-$. The multiplications in the FIR add
$\pi_i^C$ bits as most coefficients are close to zero, and the Shifter removes
$\pi_i^S$ bits. Equation~\ref{eq:inout} states that the output number of bits of
a filter is the same as the input number of bits of the next filter.
Equation~\ref{eq:maxshift} ensures that the Shifter does not introduce negative
rejection. Indeed, the results of the FIR can be right shifted without compromising
the quality of the rejection up to a threshold. Each bit of the output data
increases the maximum rejection level by 6~dB. We add one to take the sign bit
into account. If equation~\ref{eq:maxshift} were not present, the Shifter could
shift too much and introduce some noise in the output data. Each supplementary
shift bit would degrade the rejection by an additional 6~dB. An equivalent formulation is:
$\pi_i^S \leq \pi_i^- + \pi_i^C - 1 - \sum_{k=1}^{i} \left(1 + \frac{r_k}{6}\right)$.
Finally, equation~\ref{eq:init} gives the number of bits of the global input.
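As a worked illustration of equations~\ref{eq:areadef}, \ref{eq:bits} and \ref{eq:maxshift}, consider a single stage ($i = 1$) with hypothetical values chosen only for the arithmetic: $\Pi^I = 16$~bits, $C_1 = 20$ coefficients coded on $\pi_1^C = 8$~bits, and a tabulated rejection $r_1 = 30$~dB.
\begin{align*}
a_1 & = C_1 \times (\pi_1^C + \pi_1^-) = 20 \times (8 + 16) = 480 \\
\pi_1^+ & \geq 1 + \left(1 + \frac{30}{6}\right) = 7 \\
\pi_1^S & = \pi_1^- + \pi_1^C - \pi_1^+ \leq 16 + 8 - 7 = 17
\end{align*}
Under these assumed values, the Shifter may drop up to 17 of the 24~bits produced by the FIR, and the next stage receives at least 7~bits.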

This model is non-linear and even non-quadratic, as $F$ does not have a known
linear or quadratic expression. We introduce $p$ FIR configurations
$(C_{ij}, \pi_{ij}^C), 1 \leq j \leq p$ that are constants.
% r2.12
The number $p$ must be defined by the user: it represents the number of different
sets of coefficients generated (as a reminder, we use the \texttt{firls} and \texttt{fir1}
functions from GNU Octave).
We define the binary
variable $\delta_{ij}$ that has value 1 if stage~$i$ is in configuration~$j$
and 0 otherwise. The new equations are as follows:

\begin{align}
a_i & = \sum_{j=1}^p \delta_{ij} \times C_{ij} \times (\pi_{ij}^C + \pi_i^-), & \forall i \in [1, n] \label{eq:areadef2} \\
r_i & = \sum_{j=1}^p \delta_{ij} \times F(C_{ij}, \pi_{ij}^C), & \forall i \in [1, n] \label{eq:rejectiondef2} \\
\pi_i^+ & = \pi_i^- + \left(\sum_{j=1}^p \delta_{ij} \pi_{ij}^C\right) - \pi_i^S, & \forall i \in [1, n] \label{eq:bits2} \\
\sum_{j=1}^p \delta_{ij} & \leq 1, & \forall i \in [1, n] \label{eq:config}
\end{align}

Equations \ref{eq:areadef2}, \ref{eq:rejectiondef2} and \ref{eq:bits2} replace
respectively equations \ref{eq:areadef}, \ref{eq:rejectiondef} and \ref{eq:bits}.
Equation~\ref{eq:config} states that for each stage, at most one configuration is chosen.

% r2.13
This modified model is quadratic since we multiply two variables in
equation~\ref{eq:areadef2} ($\delta_{ij}$ by $\pi_i^-$), but it can be linearised if necessary.
The Gurobi
(\url{www.gurobi.com}) optimization software is used to solve this quadratic
model, and since Gurobi is able to linearize it, the model is left as is. This model
has $O(np)$ variables and $O(n)$ constraints.
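For reference, one standard way of linearising the product of a binary variable and a bounded non-negative variable is sketched below. This is the textbook big-$M$ reformulation, not necessarily the transformation applied internally by Gurobi. The auxiliary variable $z_{ij}$ and the constant $M$ (any upper bound on $\pi_i^-$) are notations introduced here.
\begin{align*}
z_{ij} & \leq M \, \delta_{ij}, \\
z_{ij} & \leq \pi_i^-, \\
z_{ij} & \geq \pi_i^- - M \, (1 - \delta_{ij}), \\
z_{ij} & \geq 0 .
\end{align*}
These four constraints force $z_{ij} = \delta_{ij} \times \pi_i^-$, so that equation~\ref{eq:areadef2} can be written as the linear expression $a_i = \sum_{j=1}^p C_{ij} \times (\delta_{ij} \times \pi_{ij}^C + z_{ij})$, in which $C_{ij}$ and $\pi_{ij}^C$ are constants.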

Two problems will be addressed using the workflow described in the next section: on the one
hand maximizing the rejection capability of a set of cascaded filters occupying a fixed arbitrary
silicon area (section~\ref{sec:fixed_area}) and on the other hand the dual problem of minimizing the silicon area
for a fixed rejection criterion (section~\ref{sec:fixed_rej}). In the latter case, the
objective function is replaced with:
\begin{align}
\text{Minimize } & \sum_{i=1}^n a_i \notag
\end{align}
We adapt the constraints of the quadratic program by replacing equation~\ref{eq:area}
with equation~\ref{eq:rejection_min}, where $\mathcal{R}$ is the minimal
rejection required.
\begin{align}
\sum_{i=1}^n r_i & \geq \mathcal{R} & \label{eq:rejection_min}
\end{align}
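To make the structure of the model concrete, the following Python sketch solves a toy instance of the rejection maximization problem by exhaustive search. It is not the C++/Gurobi solver used in this paper. The configuration table, the function name \texttt{best\_cascade} and the simplification that every stage uses exactly one configuration are assumptions made for illustration, and the approach only scales to very small $n$ and $p$.

```python
def best_cascade(configs, n_stages, input_bits, max_area):
    """Exhaustively maximize the summed rejection under the area constraint.

    configs: list of (C, pi_C, rejection_dB) tuples, i.e. tabulated values of F.
    Returns (total_rejection, total_area, stages) where each stage is
    (C, pi_C, shift), or None when no cascade fits in max_area.
    Simplification: every stage uses exactly one configuration.
    """
    best = None

    def explore(stage, bits_in, area, rejection, floor_bits, chosen):
        nonlocal best
        if stage == n_stages:
            # Keep the highest rejection; break ties with the smallest area.
            if best is None or rejection > best[0] or (
                    rejection == best[0] and area < best[1]):
                best = (rejection, area, list(chosen))
            return
        for coeffs, coeff_bits, rej in configs:
            stage_area = coeffs * (coeff_bits + bits_in)   # area definition
            if area + stage_area > max_area:               # total area constraint
                continue
            new_floor = floor_bits + 1 + rej / 6.0         # running sum of (1 + r_k / 6)
            full_width = bits_in + coeff_bits
            for shift in range(full_width + 1):
                bits_out = full_width - shift              # output width after the Shifter
                if bits_out < 1 + new_floor:               # maximum shift constraint
                    break  # larger shifts only reduce bits_out further
                chosen.append((coeffs, coeff_bits, shift))
                # The output width of this stage is the input width of the next one.
                explore(stage + 1, bits_out, area + stage_area,
                        rejection + rej, new_floor, chosen)
                chosen.pop()

    explore(0, input_bits, 0, 0.0, 0.0, [])
    return best


# Hypothetical configuration table (C, pi_C, rejection in dB).
toy_configs = [(8, 6, 18.0), (16, 8, 30.0), (32, 12, 54.0)]
result = best_cascade(toy_configs, 2, 16, 500)
```

In this toy instance the search places the small filter first and shifts its output as far as allowed, since the area of the second stage grows with its input width. The dual problem can be explored with the same enumeration by keeping only the cascades whose summed rejection reaches $\mathcal{R}$ and retaining the one with the smallest area.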

\section{Design workflow}
\label{sec:workflow}

In this section, we describe the workflow used to compute all the results presented in sections~\ref{sec:fixed_area}
and \ref{sec:fixed_rej}. Figure~\ref{fig:workflow} shows the global workflow and the different steps involved
in the computation of the results.

\begin{figure}
\centering
\begin{tikzpicture}[node distance=0.75cm and 2cm]
\node[draw,minimum size=1cm] (Solver) { Filter Solver } ;
\node (Start) [left= 3cm of Solver] { } ;
\node[draw,minimum size=1cm] (TCL) [right= of Solver] { TCL Script } ;
\node (Input) [above= of TCL] { } ;
\node[draw,minimum size=1cm] (Deploy) [below= of Solver] { Deploy Script } ;
\node[draw,minimum size=1cm] (Bitstream) [below= of TCL] { Bitstream } ;
\node[draw,minimum size=1cm,rounded corners] (Board) [below right= of Deploy] { Board } ;
\node[draw,minimum size=1cm] (Postproc) [below= of Deploy] { Post-Processing } ;
\node (Results) [left= of Postproc] { } ;

\draw[->] (Start) edge node [above] { $\mathcal{A}, n, \Pi^I$ } node [below] { $(C_{ij}, \pi_{ij}^C), F$ } (Solver) ;
\draw[->] (Input) edge node [left] { ADC or PRN } (TCL) ;
\draw[->] (Solver) edge node [below] { (1a) } (TCL) ;
\draw[->] (Solver) edge node [right] { (1b) } (Deploy) ;
\draw[->] (TCL) edge node [left] { (2) } (Bitstream) ;
\draw[->,dashed] (Bitstream) -- (Deploy) ;
\draw[->] (Deploy) to[out=-30,in=120] node [above] { (3) } (Board) ;
\draw[->] (Board) to[out=150,in=-60] node [below] { (4) } (Deploy) ;
\draw[->] (Deploy) edge node [left] { (5) } (Postproc) ;
\draw[->] (Postproc) -- (Results) ;
\end{tikzpicture}
\caption{Design workflow from the input parameters to the results}
\label{fig:workflow}
\end{figure}

The filter solver is a C++ program that takes as input the maximum area
$\mathcal{A}$, the number of stages $n$, the size of the input signal $\Pi^I$,
the FIR configurations $(C_{ij}, \pi_{ij}^C)$ and the function $F$. It creates
the quadratic programs and uses the Gurobi solver to estimate the optimal results.
Then it produces two scripts: a TCL script ((1a) on figure~\ref{fig:workflow})
and a deploy script ((1b) on figure~\ref{fig:workflow}).

The TCL script describes the whole digital processing chain from the beginning
(the raw signal data) to the end (the filtered data) in a language compatible
with proprietary synthesis software, namely Vivado for Xilinx and Quartus for
Intel/Altera. The raw input data are generated from a 20-bit Pseudo Random Number (PRN)
generator inside the FPGA and $\Pi^I$ is fixed at 16~bits.
Then the script builds each stage of the chain with a generic FIR task that
comes from a skeleton library. The generic FIR is highly configurable
through the number of coefficients and the size of the coefficients. The coefficients
themselves are not stored in the script.
As the signal is processed in real-time, the output signal is stored as
consecutive bursts of data for post-processing, mainly assessing the consistency of the
implemented FIR cascade transfer function with the design criteria and the expected
transfer function.

The TCL script is used by Vivado to produce the FPGA bitstream ((2) on figure~\ref{fig:workflow}).
We use the 2018.2 version of Xilinx Vivado and we execute the synthesized
bitstream on a Redpitaya board fitted with a Xilinx Zynq-7010 series
FPGA (xc7z010clg400-1) and two LTC2145 14-bit 125~MS/s ADCs, loaded with 50~$\Omega$ resistors to
provide a broadband noise source.
The board runs the Linux kernel and surrounding environment produced from the
Buildroot framework available at \url{https://github.com/trabucayre/redpitaya/}: configuring
the Zynq FPGA, feeding the FIR with the set of coefficients, executing the simulation and
fetching the results are automated.

The deploy script uploads the bitstream to the board ((3) on
figure~\ref{fig:workflow}), flashes the FPGA, loads the different drivers,
and configures the coefficients of the FIR filters. It then waits for the results
and retrieves the data to the main computer ((4) on figure~\ref{fig:workflow}).

Finally, an Octave post-processing script computes the final results from
the output data ((5) on figure~\ref{fig:workflow}).
The results are normalized so that the Power Spectral Density (PSD) starts at zero
and the different configurations can be compared.
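The normalization can be sketched as follows, assuming that ``starts at zero'' means that the lowest-frequency bin of each PSD is taken as the 0~dB reference. This Python fragment is an illustration only, not the Octave post-processing script. The function name \texttt{normalize\_psd\_db} and the sample values are hypothetical.

```python
import math


def normalize_psd_db(psd_linear, floor=1e-20):
    """Convert a linear PSD to dB and shift it so that its first bin reads 0 dB.

    Assumption: the first bin lies in the passband and is a valid reference.
    The floor avoids taking the logarithm of zero.
    """
    psd_db = [10.0 * math.log10(max(p, floor)) for p in psd_linear]
    reference = psd_db[0]
    return [value - reference for value in psd_db]


# Hypothetical PSD estimates of two configurations with different gains.
psd_a = [4.0, 4.0, 0.04, 0.0004]
psd_b = [100.0, 100.0, 0.1, 0.001]
norm_a = normalize_psd_db(psd_a)
norm_b = normalize_psd_db(psd_b)
```

After this step the two curves share the same passband level, so that the stopband levels of cascades with different gains can be compared directly.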

\section{Maximizing the rejection at fixed silicon area}
\label{sec:fixed_area}
This section presents the output of the filter solver {\em i.e.} the computed
configurations for each stage, the computed rejection and the computed silicon area.
Such results allow for understanding the choices made by the solver to compute its solutions.

The experimental setup is composed of three cases. The raw input is generated
by a Pseudo Random Number (PRN) generator, which fixes the input data size $\Pi^I$.
The total silicon area $\mathcal{A}$ is then fixed to either 500, 1000 or 1500
arbitrary units. Hence, the three cases are named MAX/500, MAX/1000 and MAX/1500.
The number of configurations $p$ is 1827, with $C_i$ ranging from 3 to 60 and $\pi^C$
ranging from 2 to 22. In each case, the quadratic program was able to return a
result for up to five stages ($n = 5$) in the cascaded filter.

Tables~\ref{tbl:gurobi_max_500}, \ref{tbl:gurobi_max_1000} and \ref{tbl:gurobi_max_1500}
show the results obtained by the filter solver for MAX/500, MAX/1000 and MAX/1500, respectively.

\renewcommand{\arraystretch}{1.4}

\begin{table}
\caption{Configurations $(C_i, \pi_i^C, \pi_i^S)$, rejections and areas (in arbitrary units) for MAX/500}
\label{tbl:gurobi_max_500}
\centering
{\scalefont{0.77}
\begin{tabular}{|c|ccccc|c|c|}
\hline
$n$ & $i = 1$ & $i = 2$ & $i = 3$ & $i = 4$ & $i = 5$ & Rejection & Area \\
\hline
1 & (21, 7, 0) & - & - & - & - & 32~dB & 483 \\
2 & (3, 3, 15) & (31, 9, 0) & - & - & - & 58~dB & 460 \\
3 & (3, 3, 15) & (27, 9, 0) & (5, 3, 0) & - & - & 66~dB & 488 \\
4 & (3, 3, 15) & (19, 7, 0) & (11, 5, 0) & (3, 3, 0) & - & 74~dB & 499 \\
5 & (3, 3, 15) & (23, 8, 0) & (3, 3, 1) & (3, 3, 0) & (3, 3, 0) & 78~dB & 489 \\
\hline
\end{tabular}
}
\end{table}

\begin{table}
\caption{Configurations $(C_i, \pi_i^C, \pi_i^S)$, rejections and areas (in arbitrary units) for MAX/1000}
\label{tbl:gurobi_max_1000}
\centering
{\scalefont{0.77}
\begin{tabular}{|c|ccccc|c|c|}
\hline
$n$ & $i = 1$ & $i = 2$ & $i = 3$ & $i = 4$ & $i = 5$ & Rejection & Area \\
\hline
1 & (37, 11, 0) & - & - & - & - & 56~dB & 999 \\
2 & (3, 3, 15) & (51, 14, 0) & - & - & - & 87~dB & 975 \\
3 & (3, 3, 15) & (35, 11, 0) & (19, 7, 0) & - & - & 99~dB & 1000 \\
4 & (3, 4, 16) & (27, 8, 0) & (19, 7, 1) & (11, 5, 0) & - & 103~dB & 998 \\
5 & (3, 3, 15) & (31, 9, 0) & (19, 7, 0) & (3, 3, 1) & (3, 3, 0) & 111~dB & 984 \\
\hline
\end{tabular}
}
\end{table}

\begin{table}
\caption{Configurations $(C_i, \pi_i^C, \pi_i^S)$, rejections and areas (in arbitrary units) for MAX/1500}
\label{tbl:gurobi_max_1500}
\centering
{\scalefont{0.77}
\begin{tabular}{|c|ccccc|c|c|}
\hline
$n$ & $i = 1$ & $i = 2$ & $i = 3$ & $i = 4$ & $i = 5$ & Rejection & Area \\
\hline
1 & (47, 15, 0) & - & - & - & - & 71~dB & 1457 \\
2 & (19, 6, 15) & (51, 14, 0) & - & - & - & 103~dB & 1489 \\
3 & (3, 3, 15) & (35, 11, 0) & (35, 11, 0) & - & - & 122~dB & 1492 \\
4 & (3, 3, 15) & (27, 8, 0) & (19, 7, 0) & (27, 9, 0) & - & 129~dB & 1498 \\
5 & (3, 3, 15) & (23, 9, 2) & (27, 9, 0) & (19, 7, 0) & (3, 3, 0) & 136~dB & 1499 \\
\hline
\end{tabular}
}
\end{table}

\renewcommand{\arraystretch}{1}

From these tables, we can first state that the more stages are used to define
the cascaded FIR filters, the better the rejection. This was an expected result, as it has
been previously observed that many small filters are better than
a single large filter \cite{lim_1988, lim_1996, young_1992}, although such conclusions
are rarely applied in practice due to the lack of tools for identifying individual filter
coefficients in the cascaded approach.

Second, the larger the silicon area, the better the rejection. This was also an
expected result, as more area allows for a filter of better quality, with more coefficients
or more bits per coefficient.

Third, we observe that the first stage can have a larger shift than the other
stages. This is explained by the fact that the solver tries to use just enough
bits for the computed rejection after each stage. In the first stage, a
balance between strong rejection and a low number of bits is targeted. Equation~\ref{eq:maxshift}
gives the relation between both values.

Finally, we note that the solver consumes nearly all of the allotted silicon area.
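
These trends can be read directly from Tables~\ref{tbl:gurobi_max_500}, \ref{tbl:gurobi_max_1000} and \ref{tbl:gurobi_max_1500}.
As a simple worked check, using only the tabulated values, the rejection gained by moving from the
monolithic filter ($n = 1$) to five stages ($n = 5$) is
\begin{align*}
\text{MAX/500:}  \quad & 78 - 32 = 46~\text{dB}, \\
\text{MAX/1000:} \quad & 111 - 56 = 55~\text{dB}, \\
\text{MAX/1500:} \quad & 136 - 71 = 65~\text{dB},
\end{align*}
while the least-filled solutions still use $460/500 = 92\,\%$, $975/1000 = 97.5\,\%$ and
$1457/1500 \approx 97.1\,\%$ of the allowed area.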

The following graphs present the rejection for real data on the FPGA. In all the following
figures, the solid lines represent the actual rejection of the filtered
data on the FPGA as measured experimentally and the dashed lines are the noise levels
given by the quadratic solver. The configurations are those listed in the tables above.

Figures~\ref{fig:max_500_result}, \ref{fig:max_1000_result} and \ref{fig:max_1500_result} show
the rejection of the different configurations for MAX/500, MAX/1000 and MAX/1500, respectively.

% \begin{figure}
% \centering
% \includegraphics[width=\linewidth]{images/max_500}
% \caption{Signal spectrum for MAX/500}
% \label{fig:max_500_result}
% \end{figure}
%
% \begin{figure}
% \centering
% \includegraphics[width=\linewidth]{images/max_1000}
% \caption{Signal spectrum for MAX/1000}
% \label{fig:max_1000_result}
% \end{figure}
%
% \begin{figure}
% \centering
% \includegraphics[width=\linewidth]{images/max_1500}
% \caption{Signal spectrum for MAX/1500}
% \label{fig:max_1500_result}
% \end{figure}

% r2.14, r2.15 and r2.16

\begin{figure}
\centering
\begin{subfigure}{\linewidth}
\includegraphics[width=\linewidth]{images/max_500}
\caption{Signal spectrum for MAX/500}
\label{fig:max_500_result}
\end{subfigure}

\begin{subfigure}{\linewidth}
\includegraphics[width=\linewidth]{images/max_1000}
\caption{Signal spectrum for MAX/1000}
\label{fig:max_1000_result}
\end{subfigure}

\begin{subfigure}{\linewidth}
\includegraphics[width=\linewidth]{images/max_1500}
\caption{Signal spectrum for MAX/1500}
\label{fig:max_1500_result}
\end{subfigure}
\caption{Signal spectra for the experimental configurations MAX/500, MAX/1000 and MAX/1500}
\end{figure}

In all cases, we observe that the actual rejection is close to the rejection computed by the solver.

We compare the actual silicon resources reported by Vivado to the
resources in arbitrary units.
The goal is to check that our arbitrary units of silicon area model the real
resources on the FPGA well enough. In particular, we want to verify that, for a given
number of arbitrary units, the actual silicon resources do not depend on the
number of stages $n$. Most significantly, our approach aims
at remaining far enough from the practical logic gate implementation used by
various vendors to remain platform independent and be portable from one
architecture to another.

Table~\ref{tbl:resources_usage} shows the resource usage in the case of MAX/500, MAX/1000 and
MAX/1500, \emph{i.e.} when the maximum allowed silicon area is fixed to 500, 1000
and 1500 arbitrary units. We have taken care to extract solely the resources used by
the FIR filters and to exclude additional processing blocks, including the FIFO and the Programmable
Logic (PL -- FPGA) to Processing System (PS -- general purpose processor) communication.

\begin{table}[h!tb]
\caption{Resource occupation. The last column refers to available resources on a Zynq-7010 as found on the Redpitaya.}
\label{tbl:resources_usage}
\centering
\begin{tabular}{|c|c|ccc|c|}
\hline
$n$ & & MAX/500 & MAX/1000 & MAX/1500 & \emph{Zynq 7010} \\ \hline\hline
& LUT & 249 & 453 & 627 & \emph{17600} \\
1 & BRAM & 1 & 1 & 1 & \emph{120} \\
& DSP & 21 & 37 & 47 & \emph{80} \\ \hline
& LUT & 2374 & 5494 & 691 & \emph{17600} \\
2 & BRAM & 2 & 2 & 2 & \emph{120} \\
& DSP & 0 & 0 & 70 & \emph{80} \\ \hline
& LUT & 2443 & 3304 & 3521 & \emph{17600} \\
3 & BRAM & 3 & 3 & 3 & \emph{120} \\
& DSP & 0 & 19 & 35 & \emph{80} \\ \hline
& LUT & 2634 & 3753 & 2557 & \emph{17600} \\
4 & BRAM & 4 & 4 & 4 & \emph{120} \\
& DSP & 0 & 19 & 46 & \emph{80} \\ \hline
& LUT & 2423 & 3047 & 2847 & \emph{17600} \\
5 & BRAM & 5 & 5 & 5 & \emph{120} \\
& DSP & 0 & 22 & 46 & \emph{80} \\ \hline
\end{tabular}
\end{table}

In some cases, Vivado replaces the DSPs by Look Up Tables (LUTs). We assume that,
when the filter coefficients are small enough, or when the input size is small
enough, Vivado optimizes resource consumption by selecting multiplexers to
implement the multiplications instead of a DSP. In this case, it is quite difficult
to compare the whole silicon budget.

However, a rough estimation can be made with a simple equivalence: looking at
the first column (MAX/500), where the number of LUTs is quite stable for $n \geq 2$,
we can deduce that a DSP is roughly equivalent to 100~LUTs in terms of silicon
area use. With this equivalence, our 500 arbitrary units correspond to about 2500 LUTs,
1000 arbitrary units correspond to about 5000 LUTs and 1500 arbitrary units correspond
to about 7300 LUTs. The conclusion is that the orders of magnitude of our arbitrary
units map well to actual hardware resources. The relatively small differences can probably be explained
by the optimizations done by Vivado based on the detailed map of available processing resources.
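
To make this equivalence explicit, the LUT-equivalent area of a design can be sketched as
$\text{LUT} + 100 \times \text{DSP}$, where the factor 100 is the rough estimate stated above and not a
vendor figure. Applied to the $n = 1$ and $n = 5$ rows of Table~\ref{tbl:resources_usage}, this gives
\begin{align*}
\text{MAX/500},\ n = 1:  \quad & 249 + 100 \times 21 = 2349, \\
\text{MAX/500},\ n = 5:  \quad & 2423 + 100 \times 0 = 2423, \\
\text{MAX/1000},\ n = 1: \quad & 453 + 100 \times 37 = 4153, \\
\text{MAX/1000},\ n = 5: \quad & 3047 + 100 \times 22 = 5247, \\
\text{MAX/1500},\ n = 1: \quad & 627 + 100 \times 47 = 5327, \\
\text{MAX/1500},\ n = 5: \quad & 2847 + 100 \times 46 = 7447.
\end{align*}
These values remain within the same order of magnitude as the 2500, 5000 and 7300 LUTs quoted above.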

We now present the computation time needed to solve the quadratic problem.
For each case, the filter solver software is executed on an Intel(R) Xeon(R) CPU E5606
clocked at 2.13~GHz. The CPU has 8 cores that are used by Gurobi to solve
the quadratic problem. Table~\ref{tbl:area_time} shows the time needed to solve the quadratic
problem when the maximal area is fixed to 500, 1000 and 1500 arbitrary units.

\begin{table}[h!tb]
\caption{Time needed to solve the quadratic program with Gurobi}
\label{tbl:area_time}
\centering
\begin{tabular}{|c|c|c|c|}\hline
$n$ & Time (MAX/500) & Time (MAX/1000) & Time (MAX/1500) \\\hline\hline
1 & 0.1~s & 0.1~s & 0.3~s \\
2 & 1.1~s & 2.2~s & 12~s \\
3 & 17~s & 137~s ($\approx$ 2~min) & 275~s ($\approx$ 4~min) \\
4 & 52~s & 5448~s ($\approx$ 90~min) & 5505~s ($\approx$ 17~h) \\
5 & 286~s ($\approx$ 4~min) & 4119~s ($\approx$ 68~min) & 235479~s ($\approx$ 3~days) \\\hline
\end{tabular}
\end{table}

As expected, the computation time appears to rise exponentially with the number of stages. % TODO: exponential?
When the area is limited, the design exploration space is smaller and the solver is able to
find an optimal solution faster.
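
As a rough check of this trend, using only the $n = 1$ and $n = 5$ entries of Table~\ref{tbl:area_time},
the average growth factor per additional stage, $g = (t_5 / t_1)^{1/4}$ (a notation introduced here for
illustration only), is
\begin{align*}
\text{MAX/500:}  \quad & g = (286 / 0.1)^{1/4} \approx 7.3, \\
\text{MAX/1000:} \quad & g = (4119 / 0.1)^{1/4} \approx 14, \\
\text{MAX/1500:} \quad & g = (235479 / 0.3)^{1/4} \approx 30.
\end{align*}
The solving time is thus multiplied by roughly one order of magnitude for each added stage, and the factor
increases with the allowed area. The individual steps are not uniform, so this is an indication and not a fit.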

\subsection{Minimizing resource occupation at fixed rejection}\label{sec:fixed_rej}

This section presents the results of the complementary quadratic program aimed at
minimizing the area occupation for a targeted rejection level.

The experimental setup is composed of four cases. The raw input is the same
as in the previous section, from a PRN generator, which fixes the input data size $\Pi^I$.
The targeted rejection $\mathcal{R}$ is then fixed to either 40, 60, 80 or 100~dB.
Hence, the four cases are named MIN/40, MIN/60, MIN/80 and MIN/100.
The number of configurations $p$ is the same as in the previous section.

Tables~\ref{tbl:gurobi_min_40}, \ref{tbl:gurobi_min_60}, \ref{tbl:gurobi_min_80} and
\ref{tbl:gurobi_min_100} show the results obtained by the filter solver for MIN/40, MIN/60,
MIN/80 and MIN/100, respectively.

\renewcommand{\arraystretch}{1.4}

\begin{table}[h!tb]
\caption{Configurations $(C_i, \pi_i^C, \pi_i^S)$, rejections and areas (in arbitrary units) for MIN/40}
\label{tbl:gurobi_min_40}
\centering
{\scalefont{0.77}
\begin{tabular}{|c|ccccc|c|c|}
\hline
$n$ & $i = 1$ & $i = 2$ & $i = 3$ & $i = 4$ & $i = 5$ & Rejection & Area \\
\hline
1 & (27, 8, 0) & - & - & - & - & 41~dB & 648 \\
2 & (3, 2, 14) & (19, 7, 0) & - & - & - & 40~dB & 263 \\
3 & (3, 3, 15) & (11, 5, 0) & (3, 3, 0) & - & - & 41~dB & 192 \\
4 & (3, 3, 15) & (3, 3, 0) & (3, 3, 0) & (3, 3, 0) & - & 42~dB & 147 \\
\hline
\end{tabular}
}
\end{table}

\begin{table}[h!tb]
\caption{Configurations $(C_i, \pi_i^C, \pi_i^S)$, rejections and areas (in arbitrary units) for MIN/60}
\label{tbl:gurobi_min_60}
\centering
{\scalefont{0.77}
\begin{tabular}{|c|ccccc|c|c|}
\hline
$n$ & $i = 1$ & $i = 2$ & $i = 3$ & $i = 4$ & $i = 5$ & Rejection & Area \\
\hline
1 & (39, 13, 0) & - & - & - & - & 60~dB & 1131 \\
2 & (3, 3, 15) & (35, 10, 0) & - & - & - & 60~dB & 547 \\
3 & (3, 3, 15) & (27, 8, 0) & (3, 3, 0) & - & - & 62~dB & 426 \\
4 & (3, 2, 14) & (11, 5, 1) & (11, 5, 0) & (3, 3, 0) & - & 60~dB & 344 \\
5 & (3, 2, 14) & (3, 3, 1) & (3, 3, 0) & (3, 3, 0) & (3, 3, 0) & 60~dB & 279 \\
\hline
\end{tabular}
}
\end{table}

\begin{table}[h!tb]
\caption{Configurations $(C_i, \pi_i^C, \pi_i^S)$, rejections and areas (in arbitrary units) for MIN/80}
\label{tbl:gurobi_min_80}
\centering
{\scalefont{0.77}
\begin{tabular}{|c|ccccc|c|c|}
\hline
$n$ & $i = 1$ & $i = 2$ & $i = 3$ & $i = 4$ & $i = 5$ & Rejection & Area \\
\hline
1 & (55, 16, 0) & - & - & - & - & 81~dB & 1760 \\
2 & (3, 3, 15) & (47, 14, 0) & - & - & - & 80~dB & 903 \\
3 & (3, 3, 15) & (23, 9, 0) & (19, 7, 0) & - & - & 80~dB & 698 \\
4 & (3, 3, 15) & (27, 9, 0) & (7, 7, 4) & (3, 3, 0) & - & 80~dB & 605 \\
5 & (3, 2, 14) & (27, 8, 0) & (3, 3, 1) & (3, 3, 0) & (3, 3, 0) & 81~dB & 534 \\
\hline
\end{tabular}
}
\end{table}

\begin{table}[h!tb]
\caption{Configurations $(C_i, \pi_i^C, \pi_i^S)$, rejections and areas (in arbitrary units) for MIN/100}
\label{tbl:gurobi_min_100}
\centering
{\scalefont{0.77}
\begin{tabular}{|c|ccccc|c|c|}
\hline
$n$ & $i = 1$ & $i = 2$ & $i = 3$ & $i = 4$ & $i = 5$ & Rejection & Area \\
\hline
1 & - & - & - & - & - & - & - \\
2 & (15, 7, 17) & (51, 14, 0) & - & - & - & 100~dB & 1365 \\
3 & (3, 3, 15) & (27, 9, 0) & (27, 9, 0) & - & - & 100~dB & 1002 \\
4 & (3, 3, 15) & (31, 9, 0) & (19, 7, 0) & (3, 3, 0) & - & 101~dB & 909 \\
5 & (3, 3, 15) & (23, 8, 1) & (19, 7, 0) & (3, 3, 0) & (3, 3, 0) & 101~dB & 810 \\
\hline
\end{tabular}
}
\end{table}
\renewcommand{\arraystretch}{1}

From these tables, we can first state that almost all configurations reach the targeted rejection
level or even exceed it, thanks to our underestimate of the cascade rejection as the sum of the
individual filter rejections. The only exception is the monolithic case ($n = 1$) in
MIN/100: no solution is found for a single monolithic filter to reach a 100~dB rejection.
Furthermore, the area of the monolithic filter is about twice as large as that of the two cascaded filters
(1131 and 1760 arbitrary units vs. 547 and 903 arbitrary units for 60 and 80~dB rejection,
respectively). More generally, the more filters are cascaded, the lower the occupied area.
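
As a worked check based only on the area columns of Tables~\ref{tbl:gurobi_min_40}, \ref{tbl:gurobi_min_60}
and \ref{tbl:gurobi_min_80}, the ratios between the monolithic filter and the two-stage cascade are
\begin{align*}
\text{MIN/40:} \quad & 648 / 263 \approx 2.46, \\
\text{MIN/60:} \quad & 1131 / 547 \approx 2.07, \\
\text{MIN/80:} \quad & 1760 / 903 \approx 1.95.
\end{align*}
Going from $n = 1$ to $n = 5$ reduces the area to $279 / 1131 \approx 25\,\%$ of the monolithic value for
MIN/60 and to $534 / 1760 \approx 30\,\%$ for MIN/80.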

As in the previous section, the solver almost always chooses a small filter as the first
filter stage, and the second one is often the largest filter. This choice can be explained
as in the previous section: the solver uses just enough bits in the first stage not to degrade the input
signal, and in the second stage selects a better filter to improve rejection without
having too many bits in the output data.

870

For the specific case of MIN/40 with $n = 5$, the solver has determined that the optimal
number of filters is 4, so it did not choose any configuration for the last filter. Hence this
solution is equivalent to the result for $n = 4$.

The following graphs present the rejection for real data on the FPGA. In all the following
figures, the solid line represents the actual rejection of the filtered
data on the FPGA as measured experimentally, and the dashed line is the noise level
given by the quadratic solver.

Figures~\ref{fig:min_40}, \ref{fig:min_60}, \ref{fig:min_80} and \ref{fig:min_100} show the
rejection of the different configurations in the cases of MIN/40, MIN/60, MIN/80 and MIN/100,
respectively.

% \begin{figure}
% \centering
% \includegraphics[width=\linewidth]{images/min_40}
% \caption{Signal spectrum for MIN/40}
% \label{fig:min_40}
% \end{figure}
%
% \begin{figure}
% \centering
% \includegraphics[width=\linewidth]{images/min_60}
% \caption{Signal spectrum for MIN/60}
% \label{fig:min_60}
% \end{figure}
%
% \begin{figure}
% \centering
% \includegraphics[width=\linewidth]{images/min_80}
% \caption{Signal spectrum for MIN/80}
% \label{fig:min_80}
% \end{figure}
%
% \begin{figure}
% \centering
% \includegraphics[width=\linewidth]{images/min_100}
% \caption{Signal spectrum for MIN/100}
% \label{fig:min_100}
% \end{figure}

% r2.14, r2.15 and r2.16
\begin{figure}
\centering
\begin{subfigure}{\linewidth}
\includegraphics[width=\linewidth]{images/min_40}
\caption{Signal spectrum for MIN/40}
\label{fig:min_40}
\end{subfigure}

\begin{subfigure}{\linewidth}
\includegraphics[width=\linewidth]{images/min_60}
\caption{Signal spectrum for MIN/60}
\label{fig:min_60}
\end{subfigure}

\begin{subfigure}{\linewidth}
\includegraphics[width=\linewidth]{images/min_80}
\caption{Signal spectrum for MIN/80}
\label{fig:min_80}
\end{subfigure}

\begin{subfigure}{\linewidth}
\includegraphics[width=\linewidth]{images/min_100}
\caption{Signal spectrum for MIN/100}
\label{fig:min_100}
\end{subfigure}
\caption{Signal spectra for each experimental configuration: MIN/40, MIN/60, MIN/80 and MIN/100}
\end{figure}

We observe that all rejections given by the quadratic solver are close to the experimentally
measured rejection. All curves show that the target rejection constraint is
met with both monolithic (except in MIN/100, which has no monolithic solution) and cascaded filters.

Table~\ref{tbl:resources_usage} shows the resource usage in the cases of MIN/40, MIN/60,
MIN/80 and MIN/100, \emph{i.e.}, when the target rejection is fixed to 40, 60, 80 and 100~dB. We
have taken care to extract solely the resources used by
the FIR filters and to exclude additional processing blocks, including the FIFO and PL to
PS communication.


Add comparison table between FIR Compiler and Oscimp