% merge "maximize rejection for a given area" vs. "minimize area for a given rejection"
% demonstrate how quantization pushes noise towards high frequencies => 6 dB of
% rejection per bit, and loss if fewer bits than rejection/6
% develop the linear program, including the bit shift
% emphasize that the design was previously synthesizable but not implementable, whereas now we
% implement it and demonstrate that it runs
% gwen: why is the FIR now implementable when it was not, even on the zedboard -> new FIR?
% Gwen: can we build a real phase noise test bench with this FIR, i.e. add ADC, NCO and mixer
% (zedboard or redpit)

% label schema: check that "argue for the FIR cascade" has been done

\documentclass[a4paper,journal]{IEEEtran/IEEEtran}
\usepackage{graphicx,color,hyperref}
\usepackage{amsfonts}
\usepackage{amsthm}
\usepackage{amssymb}
\usepackage{amsmath}
\usepackage{algorithm2e}
\usepackage{url,balance}
\usepackage[normalem]{ulem}
\usepackage{tikz}
\usetikzlibrary{positioning,fit}
\usepackage{multirow}
\usepackage{scalefnt}
\usepackage{caption}
\usepackage{subcaption}

% correct bad hyphenation here
\hyphenation{op-tical net-works semi-conduc-tor}
\textheight=26cm
\setlength{\footskip}{30pt}
\pagenumbering{gobble}

\begin{document}

\title{Filter optimization for real time digital processing of radiofrequency signals: application
to oscillator metrology}

\author{\IEEEauthorblockN{A. Hugeat\IEEEauthorrefmark{1}\IEEEauthorrefmark{2}, J. Bernard\IEEEauthorrefmark{2},
G. Goavec-M\'erou\IEEEauthorrefmark{1},
P.-Y. Bourgeois\IEEEauthorrefmark{1}, J.-M. Friedt\IEEEauthorrefmark{1}}\\
\IEEEauthorblockA{\IEEEauthorrefmark{1}FEMTO-ST, Time \& Frequency department, Besan\c con, France }\\
\IEEEauthorblockA{\IEEEauthorrefmark{2}FEMTO-ST, Computer Science department DISC, Besan\c con, France \\
Email: \{pyb2,jmfriedt\}@femto-st.fr}
}
\maketitle
\thispagestyle{plain}
\pagestyle{plain}
\newtheorem{definition}{Definition}

\begin{abstract}
Software Defined Radio (SDR) provides stability, flexibility and reconfigurability to
radiofrequency signal processing. When SDR is applied to oscillator characterization in the context
of ultrastable clocks, stringent filtering requirements are set by spurious signal or
noise rejection needs. Since real-time radiofrequency processing must be performed in a
Field Programmable Gate Array (FPGA) to meet timing constraints, we investigate optimization strategies
to design filters meeting rejection characteristics while limiting the hardware resources
required and keeping timing constraints within the targeted measurement bandwidths. The
presented technique is applicable to scheduling any sequence of processing blocks characterized
by a throughput, resource occupation and performance tabulated as a function of configuration
characteristics, as is the case for filters with their coefficients and resolution yielding
rejection and number of multipliers.
\end{abstract}

\begin{IEEEkeywords}
Software Defined Radio, Mixed-Integer Linear Programming, Finite Impulse Response filter
\end{IEEEkeywords}

\section{Digital signal processing of ultrastable clock signals}

Analog oscillator phase noise characterization is classically performed by downconverting
the radiofrequency signal to baseband using a saturated mixer,
followed by a Fourier analysis of the beat signal to analyze phase fluctuations close to the carrier. In
a fully digital approach, the radiofrequency signal is digitized and numerically downconverted by
multiplying the samples with a local numerically controlled oscillator (Fig.~\ref{schema}) \cite{rsi}.

\begin{figure}[h!tb]
\begin{center}
\includegraphics[width=.8\linewidth]{images/schema}
\end{center}
\caption{Fully digital oscillator phase noise characterization: the Device Under Test
(DUT) signal is sampled by the radiofrequency grade Analog to Digital Converter (ADC) and
downconverted by mixing with a Numerically Controlled Oscillator (NCO). Unwanted signals
and noise aliases are rejected by a Low Pass Filter (LPF) implemented as a cascade of Finite
Impulse Response (FIR) filters. The signal is then decimated before a Fourier analysis displays
the spectral characteristics of the phase fluctuations.}
\label{schema}
\end{figure}

As with the analog mixer,
the non-linear behavior of the downconverter introduces noise or spurious signal aliasing as
well as the generation of the frequency sum signal in addition to the frequency difference.
These unwanted spectral components must be rejected before decimating the data stream
for the phase noise spectral characterization \cite{andrich2018high}. The filters introduced between the
downconverter
and the decimation processing blocks are core components of an oscillator characterization
system, and must reject out-of-band signals below the targeted phase noise -- typically
below $-170$~dBc/Hz for the ultrastable oscillators we aim at characterizing. The filter blocks will
use most resources of the Field Programmable Gate Array (FPGA) used to process the radiofrequency
datastream: optimizing the performance of the filter while reducing the needed resources is
hence tackled in a systematic approach using optimization techniques. Most significantly, we
tackle the issue by attempting to cascade multiple Finite Impulse Response (FIR) filters with
tunable number of coefficients and tunable number of bits representing the coefficients and the
data being processed.
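
As a hedged illustration of the sum and difference products mentioned above, consider an idealized,
noiseless DUT signal of frequency $f_{DUT}$ and phase $\varphi(t)$ multiplied by an NCO output of
frequency $f_{NCO}$. The symbols $f_{DUT}$, $f_{NCO}$ and $\varphi$ are introduced here for
illustration only, and unit amplitudes are assumed:
\begin{align*}
&\cos\left(2\pi f_{DUT}t+\varphi(t)\right)\cos\left(2\pi f_{NCO}t\right)\\
&\quad=\frac{1}{2}\cos\left(2\pi (f_{DUT}-f_{NCO})t+\varphi(t)\right)\\
&\quad+\frac{1}{2}\cos\left(2\pi (f_{DUT}+f_{NCO})t+\varphi(t)\right)
\end{align*}
The first term, at the difference frequency, brings the phase fluctuations close to baseband,
whereas the second term, at the sum frequency, is one of the components that the low pass filter
must reject before decimation.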

\section{Finite impulse response filter}

We select FIR filters for their unconditional stability and ease of design. A FIR filter is defined
by a set of weights $b_k$ applied to the inputs $x_k$ through a convolution to generate the
outputs $y_n$
\begin{align}
y_n=\sum_{k=0}^N b_k x_{n-k}
\label{eq:fir_equation}
\end{align}
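
A short worked consequence of Eq.~\eqref{eq:fir_equation} is sketched here to link the weights to the
notion of rejection used throughout the text. The sampling rate $f_s$, the frequency $f$ and the
transfer function $H$ are symbols introduced for this illustration. Feeding the filter with the
complex exponential $x_n=e^{j2\pi f n/f_s}$ yields
\begin{align*}
y_n=\left(\sum_{k=0}^N b_k e^{-j2\pi f k/f_s}\right)x_n=H(f)\,x_n
\end{align*}
so that, for a low pass filter with non-zero gain at $f=0$, the attenuation at frequency $f$
relative to the DC gain reads
\begin{align*}
-20\log_{10}\left(\frac{|H(f)|}{|H(0)|}\right)\ \text{dB}.
\end{align*}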

As opposed to an implementation on a general purpose processor in which word size is defined by the
processor architecture, implementing such a filter on an FPGA offers more degrees of freedom since
not only the coefficient values and number of taps must be defined, but also the number of bits
defining the coefficients and the sample size. For this reason, and because we consider pipeline
processing (as opposed to First-In, First-Out (FIFO) memory batch processing) of radiofrequency
signals, High Level Synthesis (HLS) languages \cite{kasbah2008multigrid} are not considered but
the problem is tackled at the Very-high-speed-integrated-circuit Hardware Description Language
(VHDL) level.
Since latency is not an issue in an open-loop phase noise characterization instrument,
the large
number of taps in the FIR, as opposed to the shorter Infinite Impulse Response (IIR) filter,
is not considered an issue, as it would be in a closed-loop system.

The coefficients are classically expressed as floating point values. However, this binary
number representation is not efficient for fast arithmetic computation by an FPGA. Instead,
we choose to quantize these floating point values into integer values. This quantization
will result in some precision loss.

\begin{figure}[h!tb]
\includegraphics[width=\linewidth]{images/zero_values}
\caption{Impact of the quantization resolution of the coefficients: the quantization is
set to 6~bits -- with the horizontal black lines indicating $\pm$1 least significant bit -- setting
the 30~first and 30~last coefficients out of the initial 128~band-pass
filter coefficients to 0 (red dots).}
\label{float_vs_int}
\end{figure}

The tradeoff between quantization resolution and number of coefficients when considering
integer operations is not trivial. As an illustration of the issue related to the
relation between number of filter taps and quantization, Fig.~\ref{float_vs_int} exhibits
a 128-coefficient FIR bandpass filter designed using floating point numbers (blue). Upon
quantization on 6-bit integers, 60 of the 128~coefficients at the beginning and end of the
tap sequence become null, making the large number of coefficients irrelevant: processing
resources
are hence saved by shrinking the filter length. This tradeoff, aimed at minimizing resources
to reach a given rejection level, or maximizing out-of-band rejection for a given computational
resource, will drive the investigation on cascading filters designed with varying tap resolution
and tap length, as will be shown in the next section. Indeed, our development strategy closely
follows the skeleton approach \cite{crookes1998environment, crookes2000design, benkrid2002towards}
in which basic blocks are defined and characterized before being assembled \cite{hide}
in a complete processing chain. In our case, assembling the filter blocks is a simpler block
combination process since we assume a single value to be processed and a single value to be
generated at each clock cycle. The FIR filters are not considered to decimate in the
current implementation: the decimation is assumed, for the moment, to be located after the FIR
cascade.
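
The vanishing coefficients can be anticipated with a short calculation, sketched here under
assumptions that are not stated in the text: signed $B$-bit coefficients, scaled so that the largest
magnitude maps to $2^{B-1}-1$, and rounding to the nearest integer ($B$ is a symbol introduced for
this illustration). A coefficient $b_k$ is then quantized to zero when it is smaller than half a
least significant bit, i.e.
\begin{align*}
\frac{|b_k|}{\max_j |b_j|} < \frac{1}{2\left(2^{B-1}-1\right)}
\end{align*}
which for $B=6$ gives $1/62\simeq 0.016$, or about $-36$~dB relative to the largest coefficient.
Under these assumptions, taps in the tails of the impulse response lying below this level bring no
contribution, and each additional bit lowers the threshold by about 6~dB.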

\section{Methodology description}

Our objective is to develop a new methodology applicable to any Digital Signal Processing (DSP)
chain obtained by assembling basic processing blocks, with hardware and manufacturer independence.
Achieving such a target requires defining an abstract model to represent some basic properties
of DSP blocks such as performance (e.g. rejection or passband ripples for filters) and
resource occupation. These abstract properties, not necessarily related to the detailed hardware
implementation of a given platform, will feed a scheduler solver aimed at assembling the optimum
target, whether in terms of maximizing performance for a given arbitrary resource occupation, or
minimizing resource occupation for a given performance. In our approach, the solution of the
solver is then synthesized using the dedicated tool provided by each platform manufacturer
to assess the validity of our abstract resource occupation indicator, and the result of running
the DSP chain on the FPGA allows for assessing the performance of the scheduler. We emphasize
that all solutions found by the solver are synthesized and executed on hardware at the end
of the analysis.

In this demonstration, we focus on only two operations: filtering and shifting the number of
bits needed to represent the data along the processing chain.
We have chosen these basic operations because shifting and filtering have already been studied
in the literature \cite{lim_1996, lim_1988, young_1992, smith_1998}, providing a framework for
assessing our results. Furthermore, filtering is a core step in any radiofrequency frontend
requiring pipelined processing at full bandwidth for the earliest steps, including for
time and frequency transfer or characterization \cite{carolina1,carolina2,rsi}.

Addressing only two operations allows for demonstrating the methodology but should not be
considered as a limitation of the framework, which can be extended to assembling any number
of skeleton blocks as long as performance and resource occupation can be determined.
Hence,
in this paper we will apply our methodology to simple DSP chains: a white noise input signal
is generated using a Pseudo-Random Number (PRN) generator or by sampling a wideband (125~MS/s)
14-bit Analog to Digital Converter (ADC) loaded by a 50~$\Omega$ resistor. Once samples have been
digitized at a rate of 125~MS/s, filtering is applied to qualify the processing block performance --
practically meeting the radiofrequency frontend requirement of noise and bandwidth reduction
by filtering and decimating. Finally, bursts of filtered samples are stored for post-processing,
allowing us either to assess filter rejection for a given resource usage, or to validate the rejection
when implementing a solution minimizing resource occupation.

The first step of our approach is to model the DSP chain. Since we aim at optimizing only
the filtering part of the signal processing chain, we have not included the PRN generator or the
ADC in the model: the input data size and rate are considered fixed and defined by the hardware.
The filtering can be done in two ways, either by considering a single monolithic FIR filter
requiring many coefficients to reach the targeted noise rejection ratio, or by
cascading multiple FIR filters, each with fewer coefficients than found in the monolithic filter.
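
The interest of the cascade can be sketched with an idealized calculation that neglects the
quantization and the shifts between stages. Writing $H_i(f)$ for the transfer function of the
$i$-th of $n$ cascaded linear filters (notation introduced here for illustration), the overall
transfer function is the product of the individual ones, so that attenuations expressed in dB add up:
\begin{align*}
|H(f)|=\prod_{i=1}^{n}|H_i(f)|
\quad\Rightarrow\quad
-20\log_{10}|H(f)|=\sum_{i=1}^{n}\left(-20\log_{10}|H_i(f)|\right)
\end{align*}
For instance, in this idealized picture, two stages each rejecting 40~dB at a given frequency
would together reject 80~dB at that frequency.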

After each filter we leave the possibility of shifting the filtered data to consume
fewer resources. Hence in the case of cascaded filters, we define a stage as a filter
and a shifter (the shift could be omitted if we do not need to divide the filtered data).
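
As a rough guide to the cost of the shift, and assuming the usual model of a uniform quantizer
(an assumption introduced here, not a result of this paper), discarding $s$ least significant bits
divides the data by $2^s$ and multiplies the quantization step, referred to the unshifted data, by
the same factor ($s$ is a symbol introduced for this illustration). The corresponding rise of the
quantization noise floor is
\begin{align*}
20\log_{10}\left(2^{s}\right)=s\times 20\log_{10}(2)\simeq 6.02\,s\ \text{dB}
\end{align*}
so that each discarded bit costs about 6~dB of dynamic range, and a data path expected to preserve
a rejection of $R$~dB needs on the order of $R/6$ bits at least.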

\subsection{Model of a FIR filter}

A cascade of filters is composed of $n$ FIR stages. In stage $i$ ($1 \leq i \leq n$)
the FIR has $C_i$ coefficients and each coefficient is an integer value with $\pi^C_i$
bits while the filtered data are shifted by $\pi^S_i$ bits. We also define $\pi^-_i$ as
the size of input data and $\pi^+_i$ as the size of output data. Figure~\ref{fig:fir_stage}
shows a filtering stage.

\begin{figure}
\centering
\begin{tikzpicture}[node distance=2cm]
\node[draw,minimum size=1.3cm] (FIR) { $C_i, \pi_i^C$ } ;
\node[draw,minimum size=1.3cm] (Shift) [right of=FIR, ] { $\pi_i^S$ } ;
\node (Start) [left of=FIR] { } ;
\node (End) [right of=Shift] { } ;

\node[draw,fit=(FIR) (Shift)] (Filter) { } ;

\draw[->] (Start) edge node [above] { $\pi_i^-$ } (FIR) ;
\draw[->] (FIR) -- (Shift) ;
\draw[->] (Shift) edge node [above] { $\pi_i^+$ } (End) ;
\end{tikzpicture}
\caption{A single filter is composed of a FIR (on the left) and a Shifter (on the right)}
\label{fig:fir_stage}
\end{figure}

FIR $i$ has been characterized through numerical simulation as able to reject $F(C_i, \pi_i^C)$ dB.
This rejection has been computed using GNU Octave software FIR coefficient design functions
(\texttt{firls} and \texttt{fir1}).
For each configuration $(C_i, \pi_i^C)$, we first create a FIR with floating point coefficients and a given $C_i$ number of coefficients.
Then, the floating point coefficients are discretized into integers. In order to ensure that the coefficients are effectively coded on $\pi_i^C$~bits,
the coefficients are normalized by their absolute maximum before being scaled to integer coefficients.
At least one coefficient is coded on $\pi_i^C$~bits, and in practice only $b_{C_i/2}$ is coded on $\pi_i^C$~bits while the others are coded on far fewer bits.
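
The normalization and scaling described above can be written, as a sketch, for signed coefficients
rounded to the nearest integer. The scaling factor $2^{\pi_i^C-1}-1$, the rounding rule and the
notation $\tilde{b}_k$ are assumptions made for this illustration, the text only stating that the
coefficients are normalized by their absolute maximum:
\begin{align*}
\tilde{b}_k=\operatorname{round}\left(\frac{b_k}{\max_j |b_j|}\left(2^{\pi_i^C-1}-1\right)\right)
\end{align*}
With this convention the largest coefficient maps to $\pm\left(2^{\pi_i^C-1}-1\right)$ and uses the
full width, whereas a coefficient whose magnitude is $2^{-m}$ times the maximum only needs about
$\pi_i^C-m$ bits.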

With these coefficients, the \texttt{freqz} function is used to estimate the magnitude of the filter
transfer function.
However, comparing the performance between FIRs requires defining a single criterion. As shown in figure~\ref{fig:fir_mag},
the FIR magnitude exhibits two parts: we focus here on the transition width and the rejection rather than on the
passband ripples as emphasized in \cite{lim_1988,lim_1996}. Throughout this demonstration,
we arbitrarily set a passband covering the first 40\% of the Nyquist frequency and a stopband from 60\%
of the Nyquist frequency to the end of the band, as would be typically selected to prevent
aliasing before decimating the dataflow by 2. The method is however generalized to any filter
shape as long as it is defined from the initial modeling steps: Fig.~\ref{fig:rejection_pyramid}
as described below is indeed unique for each filter shape.

254

253

\begin{figure}
\begin{center}
\scalebox{0.8}{
\centering
\begin{tikzpicture}[scale=0.3]
\draw[<->] (0,15) -- (0,0) -- (21,0) ;
\draw[thick] (0,12) -- (8,12) -- (20,0) ;

\draw (0,14) node [left] { $P$ } ;
\draw (20,0) node [below] { $f$ } ;

\draw[>=latex,<->] (0,14) -- (8,14) ;
\draw (4,14) node [above] { passband } node [below] { $40\%$ } ;

\draw[>=latex,<->] (8,14) -- (12,14) ;
\draw (10,14) node [above] { transition } node [below] { $20\%$ } ;

\draw[>=latex,<->] (12,14) -- (20,14) ;
\draw (16,14) node [above] { stopband } node [below] { $40\%$ } ;

\draw[>=latex,<->] (16,12) -- (16,8) ;
\draw (16,10) node [right] { rejection } ;

\draw[dashed] (8,-1) -- (8,14) ;
\draw[dashed] (12,-1) -- (12,14) ;

\draw[dashed] (8,12) -- (16,12) ;
\draw[dashed] (12,8) -- (16,8) ;

\end{tikzpicture}
}
\end{center}
\caption{Shape of the filter transmitted power $P$ as a function of frequency $f$:
the passband is considered to occupy the initial 40\% of the Nyquist frequency range,
the stopband the last 40\%, allowing 20\% transition width.}
\label{fig:fir_mag}
\end{figure}

In the transition band, the behavior of the filter is left free: we only define the passband and the stopband characteristics.
% r2.7
The criteria initially considered include the mean value of the stopband rejection, which yields unacceptable results since notches
overestimate the rejection capability of the filter.
% Furthermore, the losses within
% the passband are not considered and might be excessive for excessively wide transitions widths introduced for filters with few coefficients.
Our final criterion to compute the filter rejection considers
% r2.8 and r2.2 r2.3
the minimal rejection within the stopband, from which the sum of the absolute values
within the passband is subtracted to avoid filters with excessive ripples, normalized to the
bin width to remain consistent with the passband criterion (dBc/Hz units in all cases). With this
criterion, we meet the expected rejection capability of low pass filters as shown in figure~\ref{fig:custom_criterion}.
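
As an illustration, the criterion can be written compactly with a notation that is introduced here and is not part of the original model: let $H_{\mathrm{dB}}(f)$ be the filter magnitude in dB relative to the passband level, $\mathcal{S}$ the set of stopband frequency bins and $\mathcal{P}$ the set of $N_{\mathcal{P}}$ passband bins. One plausible reading, assuming that the normalization amounts to dividing by the number of passband bins, is
\begin{equation*}
F = \min_{f \in \mathcal{S}} \left( - H_{\mathrm{dB}}(f) \right)
  - \frac{1}{N_{\mathcal{P}}} \sum_{f \in \mathcal{P}} \left| H_{\mathrm{dB}}(f) \right| .
\end{equation*}
The first term is the worst-case stopband rejection, which a narrow notch cannot inflate, and the second term penalizes passband ripples.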

% \begin{figure}
% \centering
% \includegraphics[width=\linewidth]{images/colored_mean_criterion}
% \caption{Mean stopband rejection criterion comparison between monolithic filter and cascaded filters}
% \label{fig:mean_criterion}
% \end{figure}

\begin{figure}
\centering
\includegraphics[width=\linewidth]{images/colored_custom_criterion}
\caption{Custom criterion (maximum rejection in the stopband minus the sum of the
absolute values of the passband rejection normalized to the bandwidth)
comparison between monolithic filter and cascaded filters}
\label{fig:custom_criterion}
\end{figure}

Thanks to the latter criterion, which will be used in the remainder of this paper, we are able to automatically generate multiple FIR taps
and estimate their rejection. Figure~\ref{fig:rejection_pyramid} exhibits the
rejection as a function of the number of coefficients and the number of bits representing these coefficients.
The pyramid-shaped surface exhibits optimum configuration sets at the vertex where both edges meet.
Indeed, for a given number of coefficients, increasing the number of bits beyond the edge will not improve the rejection.
Conversely, for a given number of bits, increasing the number of coefficients will not improve
the rejection. Hence the best coefficient sets lie on the vertex of the pyramid.

\begin{figure}
\centering
\includegraphics[width=\linewidth]{images/rejection_pyramid}
\caption{Filter rejection as a function of the number of coefficients and the number of bits:
this lookup table will be used to identify which filter parameters -- number of bits
representing coefficients and number of coefficients -- best match the targeted transfer function.}
\label{fig:rejection_pyramid}
\end{figure}

Although we have an efficient criterion to estimate the rejection of one set of coefficients (taps),
a difficulty arises when we cascade filters and estimate the criterion as a sum of two or more individual criteria.
If the FIR filter coefficients are the same between the stages, we have:
$$F_{total} = F_1 + F_2$$
But selecting two different sets of coefficients yields a more complex situation in which
the previous relation is no longer valid, as illustrated in figure~\ref{fig:sum_rejection}. The red and blue curves
are two different filters with maxima and notches not located at the same frequency offsets.
Hence when summing the transfer functions (expressed in dB), the resulting rejection shown as the dashed yellow line is improved
with respect to a basic sum of the rejection criteria shown as the dotted yellow line.
% r2.9
Thus, estimating the rejection of filter cascades is more complex than taking the sum of all the rejection
criteria of each filter. However, since the sum of the individual filter rejections underestimates the rejection capability of the cascade,
% r2.10
this lower bound is considered as a conservative and acceptable criterion for deciding on the suitability
of the filter cascade to meet design criteria.
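
The conservative nature of this bound can be checked with a short derivation, given here as a sketch with a notation that is ours: writing $R_1(f)$ and $R_2(f)$ for the rejections (in dB) of the two stages at a frequency $f$ of the stopband $\mathcal{S}$, and ignoring the passband ripple term of the criterion, the rejection of the cascade at $f$ is $R_1(f) + R_2(f)$, so that
\begin{equation*}
\min_{f \in \mathcal{S}} \left( R_1(f) + R_2(f) \right) \geq
\min_{f \in \mathcal{S}} R_1(f) + \min_{f \in \mathcal{S}} R_2(f),
\end{equation*}
with equality only when both filters reach their worst-case rejection at a common frequency.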

\begin{figure}
\centering
\includegraphics[width=\linewidth]{images/cascaded_criterion}
\caption{Transfer function of individual filters and after cascading the two filters,
demonstrating that the selected criterion of maximum rejection in the stopband (horizontal
lines) is met. Notice that the cascaded filter has better rejection than summing the stopband
maximum of each individual filter.
}
\label{fig:sum_rejection}
\end{figure}

% r2.6
Finally, in our case, we consider that the input signal is fully known: the
resolution of the input data stream is fixed and identical for all experiments
in this paper.

Based on this analysis, we address the estimate of resource consumption (called
% r2.11
silicon area -- in the case of FPGAs this means processing cells) as a function of
filter characteristics. As a reminder, we do not aim at matching actual hardware
configuration but consider an arbitrary silicon area occupied by each processing function,
and will assess after synthesis how well this arbitrary unit matches the actual
hardware resources provided by FPGA manufacturers. The sum of individual processing
unit areas is constrained by a total silicon area representative of FPGA global resources.
Formally, variable $a_i$ is the area taken by filter~$i$
(in arbitrary units). Variable $r_i$ is the rejection of filter~$i$ (in dB).
Constant $\mathcal{A}$ is the total available area. We model our problem as follows:

\begin{align}
\text{Maximize } & \sum_{i=1}^n r_i \notag \\
\sum_{i=1}^n a_i & \leq \mathcal{A} & \label{eq:area} \\
a_i & = C_i \times (\pi_i^C + \pi_i^-), & \forall i \in [1, n] \label{eq:areadef} \\
r_i & = F(C_i, \pi_i^C), & \forall i \in [1, n] \label{eq:rejectiondef} \\
\pi_i^+ & = \pi_i^- + \pi_i^C - \pi_i^S, & \forall i \in [1, n] \label{eq:bits} \\
\pi_{i - 1}^+ & = \pi_i^-, & \forall i \in [2, n] \label{eq:inout} \\
\pi_i^+ & \geq 1 + \sum_{k=1}^{i} \left(1 + \frac{r_k}{6}\right), & \forall i \in [1, n] \label{eq:maxshift} \\
\pi_1^- &= \Pi^I \label{eq:init}
\end{align}

Equation~\ref{eq:area} states that the total area taken by the filters must
not exceed the available area. Equation~\ref{eq:areadef} gives the definition of
the area used by a filter, considered as the area of the FIR since the Shifter is
assumed not to require significant resources. We consider that the FIR needs $C_i$ registers of size
$\pi_i^C + \pi_i^-$~bits to store the results of the multiplications of the
input data with the coefficients. Equation~\ref{eq:rejectiondef} gives the
definition of the rejection of the filter thanks to the tabulated function~$F$ that we defined
previously. The Shifter does not introduce negative rejection, as we will explain later,
so the rejection only comes from the FIR. Equation~\ref{eq:bits} states the
relation between $\pi_i^+$ and $\pi_i^-$. The multiplications in the FIR add
$\pi_i^C$ bits as most coefficients are close to zero, and the Shifter removes
$\pi_i^S$ bits. Equation~\ref{eq:inout} states that the output number of bits of
a filter is the same as the input number of bits of the next filter.
Equation~\ref{eq:maxshift} ensures that the Shifter does not introduce negative
rejection. Indeed, the results of the FIR can be right-shifted without compromising
the quality of the rejection up to a threshold. Each bit of the output data
increases the maximum rejection level by 6~dB. We add one to take the sign bit
into account. Without equation~\ref{eq:maxshift}, the Shifter could
shift too much and introduce noise in the output data: each additional
shifted bit would cost a further 6~dB of rejection. An equivalent formulation is:
$\pi_i^S \leq \pi_i^- + \pi_i^C - 1 - \sum_{k=1}^{i} \left(1 + \frac{r_k}{6}\right)$.
Finally, equation~\ref{eq:init} gives the number of bits of the global input.
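
As a worked example with purely hypothetical values (chosen for illustration and not taken from our experiments), consider a first stage with $C_1 = 21$ coefficients coded on $\pi_1^C = 8$~bits, fed with $\pi_1^- = \Pi^I = 16$-bit data and providing a rejection $r_1 = 30$~dB. Equation~\ref{eq:areadef} gives
\begin{equation*}
a_1 = 21 \times (8 + 16) = 504 \text{ arbitrary units},
\end{equation*}
and equation~\ref{eq:maxshift} requires
\begin{equation*}
\pi_1^+ \geq 1 + \left(1 + \frac{30}{6}\right) = 7 \text{ bits},
\end{equation*}
so that, from equation~\ref{eq:bits}, the Shifter may remove at most $\pi_1^S \leq 16 + 8 - 7 = 17$~bits. If a second stage with $r_2 = 42$~dB follows, its output must satisfy $\pi_2^+ \geq 1 + (1 + 5) + (1 + 7) = 15$~bits.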

This model is non-linear since it multiplies variables with one another,
and it is not even quadratic, as the cost function $F$ does not have a known
linear or quadratic expression. To linearize this problem, we introduce $p$ FIR configurations.
% AH: merge conflict
% This variable must be defined by the user, it represent the number of different
% set of coefficients generated (for memory, we use \texttt{firls} and \texttt{fir1}
% functions from GNU Octave). To choose this value, we consider a subset of the figure~\ref{fig:rejection_pyramid}
% to restrict the number of configurations. Indeed, it is useless to have too many coefficients or
% too many bits, hence we take the configurations close to edge of pyramid. Thank to theses
% configurations $C_{ij}$ and $\pi_{ij}^C$ ($1 \leq j \leq p$) become constant
% and the function $F$ can be estimate for each configurations
% thanks our rejection criterion. We also defined binary
The parameter $p$ is defined by the user and represents the number of different
sets of coefficients generated (recall that we use the \texttt{firls} and \texttt{fir1}
functions from GNU Octave) based on the targeted filter characteristics and implementation
assumptions (estimated number of bits defining the coefficients). Hence, $C_{ij}$ and
$\pi_{ij}^C$, with $1 \leq j \leq p$, become constants, and
the function $F$ can be tabulated (lookup table)
for each configuration thanks to the rejection criterion. We also define the binary
variable $\delta_{ij}$ that has value 1 if stage~$i$ is in configuration~$j$
and 0 otherwise. The new equations are as follows:

\begin{align}
a_i & = \sum_{j=1}^p \delta_{ij} \times C_{ij} \times (\pi_{ij}^C + \pi_i^-), & \forall i \in [1, n] \label{eq:areadef2} \\
r_i & = \sum_{j=1}^p \delta_{ij} \times F(C_{ij}, \pi_{ij}^C), & \forall i \in [1, n] \label{eq:rejectiondef2} \\
\pi_i^+ & = \pi_i^- + \left(\sum_{j=1}^p \delta_{ij} \pi_{ij}^C\right) - \pi_i^S, & \forall i \in [1, n] \label{eq:bits2} \\
\sum_{j=1}^p \delta_{ij} & \leq 1, & \forall i \in [1, n] \label{eq:config}
\end{align}

Equations \ref{eq:areadef2}, \ref{eq:rejectiondef2} and \ref{eq:bits2} respectively replace
equations \ref{eq:areadef}, \ref{eq:rejectiondef} and \ref{eq:bits}.
Equation~\ref{eq:config} states that at most one configuration is chosen for each stage.

% JM: conflict merge
% However the problem remains quadratic at this stage since in the constraint~\ref{eq:areadef2}
% we multiply
% $\delta_{ij}$ and $\pi_i^-$. However, since $\delta_{ij}$ is a binary variable we can
% linearise this multiplication if we can bound $\pi_i^-$. As $\pi_i^-$ is the data size,
% we define $0 < \pi_i^- \leq 128$ which is the maximum data size whose estimation is
% assumed on hardware characteristics.
% The Gurobi (\url{www.gurobi.com}) optimization software used to solve this quadratic
% model is able to linearize the model provided as is. This model
% has $O(np)$ variables and $O(n)$ constraints.}
The problem remains quadratic at this stage since in constraint~\ref{eq:areadef2}
we multiply
$\delta_{ij}$ and $\pi_i^-$. However, since $\delta_{ij}$ is a binary variable, we can
linearize this multiplication. The following formula shows how to linearize
this situation in the general case, with $y$ a binary variable and $x$ a real variable ($0 \leq x \leq X^{max}$):
\begin{equation*}
m = x \times y \implies
\left \{
\begin{split}
m & \geq 0 \\
m & \leq y \times X^{max} \\
m & \leq x \\
m & \geq x - (1 - y) \times X^{max}
\end{split}
\right .
\end{equation*}
Hence, if we bound $\pi_i^-$ above by 128~bits, a maximum data size assumed from hardware characteristics,
the Gurobi (\url{www.gurobi.com}) optimization software is able to linearize
the quadratic problem for us, so the model is left as is. This model
has $O(np)$ variables and $O(n)$ constraints.
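
Applied to constraint~\ref{eq:areadef2}, the general linearization formula given above can be sketched as follows, where the auxiliary variable $m_{ij}$ is a name introduced here for illustration only. Each product $\delta_{ij} \times \pi_i^-$ is replaced with $m_{ij}$, taking $X^{max} = 128$:
\begin{align*}
0 \leq m_{ij} & \leq 128 \times \delta_{ij}, \\
\pi_i^- - 128 \times (1 - \delta_{ij}) \leq m_{ij} & \leq \pi_i^-, \\
a_i & = \sum_{j=1}^p C_{ij} \times \left( \pi_{ij}^C \times \delta_{ij} + m_{ij} \right).
\end{align*}
When $\delta_{ij} = 0$ the first line forces $m_{ij} = 0$, and when $\delta_{ij} = 1$ the second line forces $m_{ij} = \pi_i^-$, so the expression of $a_i$ is linear and takes the same value as in constraint~\ref{eq:areadef2}.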

485

479

% This model is non-linear and even non-quadratic, as $F$ does not have a known

486

480

% This model is non-linear and even non-quadratic, as $F$ does not have a known

% linear or quadratic expression. We introduce $p$ FIR configurations

487

481

% linear or quadratic expression. We introduce $p$ FIR configurations

% $(C_{ij}, \pi_{ij}^C), 1 \leq j \leq p$ that are constants.

488

482

% $(C_{ij}, \pi_{ij}^C), 1 \leq j \leq p$ that are constants.

% % r2.12

489

483

% % r2.12

% This variable must be defined by the user, it represent the number of different

490

484

% This variable must be defined by the user, it represent the number of different

% set of coefficients generated (for memory, we use \texttt{firls} and \texttt{fir1}

491

485

% set of coefficients generated (for memory, we use \texttt{firls} and \texttt{fir1}

% functions from GNU Octave).

492

486

% functions from GNU Octave).

% We define binary

493

487

% We define binary

% variable $\delta_{ij}$ that has value 1 if stage~$i$ is in configuration~$j$

494

488

% variable $\delta_{ij}$ that has value 1 if stage~$i$ is in configuration~$j$

% and 0 otherwise. The new equations are as follows:

495

489

% and 0 otherwise. The new equations are as follows:

%

496

490

%

% \begin{align}

497

491

% \begin{align}

% a_i & = \sum_{j=1}^p \delta_{ij} \times C_{ij} \times (\pi_{ij}^C + \pi_i^-), & \forall i \in [1, n] \label{eq:areadef2} \\
% r_i & = \sum_{j=1}^p \delta_{ij} \times F(C_{ij}, \pi_{ij}^C), & \forall i \in [1, n] \label{eq:rejectiondef2} \\
% \pi_i^+ & = \pi_i^- + \left(\sum_{j=1}^p \delta_{ij} \pi_{ij}^C\right) - \pi_i^S, & \forall i \in [1, n] \label{eq:bits2} \\
% \sum_{j=1}^p \delta_{ij} & \leq 1, & \forall i \in [1, n] \label{eq:config}
% \end{align}
%
% Equations \ref{eq:areadef2}, \ref{eq:rejectiondef2} and \ref{eq:bits2} replace
% respectively equations \ref{eq:areadef}, \ref{eq:rejectiondef} and \ref{eq:bits}.
% Equation~\ref{eq:config} states that for each stage, a single configuration is chosen at most.
%
% % r2.13
% This modified model is quadratic since we multiply two variables in the
% equation~\ref{eq:areadef2} ($\delta_{ij}$ by $\pi_{ij}^-$) but it can be linearised if necessary.
% The Gurobi
% (\url{www.gurobi.com}) optimization software is used to solve this quadratic
% model, and since Gurobi is able to linearize, the model is left as is. This model
% has $O(np)$ variables and $O(n)$ constraints.

Two problems are addressed using the workflow described in the next section: on the one
hand, maximizing the rejection capability of a set of cascaded filters occupying a fixed arbitrary
silicon area (section~\ref{sec:fixed_area}) and, on the other hand, the dual problem of minimizing the silicon area
for a fixed rejection criterion (section~\ref{sec:fixed_rej}). In the latter case, the
objective function is replaced with:
\begin{align}
\text{Minimize } & \sum_{i=1}^n a_i \notag
\end{align}
We adapt the constraints of the quadratic program by replacing equation~\ref{eq:area}
with equation~\ref{eq:rejection_min}, where $\mathcal{R}$ is the minimal
rejection required.

\begin{align}
\sum_{i=1}^n r_i & \geq \mathcal{R} & \label{eq:rejection_min}
\end{align}

\section{Design workflow}
\label{sec:workflow}

In this section, we describe the workflow used to compute the results presented in sections~\ref{sec:fixed_area}
and \ref{sec:fixed_rej}. Figure~\ref{fig:workflow} shows the global workflow and the different steps involved
in the computation of the results.

\begin{figure}
\centering
\begin{tikzpicture}[node distance=0.75cm and 2cm]
\node[draw,minimum size=1cm] (Solver) { Filter Solver } ;
\node (Start) [left= 3cm of Solver] { } ;
\node[draw,minimum size=1cm] (TCL) [right= of Solver] { TCL Script } ;
\node (Input) [above= of TCL] { } ;
\node[draw,minimum size=1cm] (Deploy) [below= of Solver] { Deploy Script } ;
\node[draw,minimum size=1cm] (Bitstream) [below= of TCL] { Bitstream } ;
\node[draw,minimum size=1cm,rounded corners] (Board) [below right= of Deploy] { Board } ;
\node[draw,minimum size=1cm] (Postproc) [below= of Deploy] { Post-Processing } ;
\node (Results) [left= of Postproc] { } ;

\draw[->] (Start) edge node [above] { $\mathcal{A}, n, \Pi^I$ } node [below] { $(C_{ij}, \pi_{ij}^C), F$ } (Solver) ;
\draw[->] (Input) edge node [left] { ADC or PRN } (TCL) ;
\draw[->] (Solver) edge node [below] { (1a) } (TCL) ;
\draw[->] (Solver) edge node [right] { (1b) } (Deploy) ;
\draw[->] (TCL) edge node [left] { (2) } (Bitstream) ;
\draw[->,dashed] (Bitstream) -- (Deploy) ;
\draw[->] (Deploy) to[out=-30,in=120] node [above] { (3) } (Board) ;
\draw[->] (Board) to[out=150,in=-60] node [below] { (4) } (Deploy) ;
\draw[->] (Deploy) edge node [left] { (5) } (Postproc) ;
\draw[->] (Postproc) -- (Results) ;
\end{tikzpicture}
\caption{Design workflow from the input parameters to the results allowing for
a fully automated optimal solution search.}
\label{fig:workflow}
\end{figure}

The filter solver is a C++ program that takes as input the maximum area
$\mathcal{A}$, the number of stages $n$, the size of the input signal $\Pi^I$,
the FIR configurations $(C_{ij}, \pi_{ij}^C)$ and the function $F$. It creates
the quadratic programs and uses the Gurobi solver to estimate the optimal results.
Then it produces two scripts: a TCL script ((1a) in figure~\ref{fig:workflow})
and a deploy script ((1b) in figure~\ref{fig:workflow}).

The TCL script describes the whole digital processing chain from the beginning
(the raw signal data) to the end (the filtered data) in a language compatible
with proprietary synthesis software, namely Vivado for Xilinx and Quartus for
Intel/Altera. The raw input data are generated by a 20-bit Pseudo Random Number (PRN)
generator inside the FPGA and $\Pi^I$ is fixed at 16~bits.
Then the script builds each stage of the chain with a generic FIR task that
comes from a skeleton library. The generic FIR is highly configurable
through the number of coefficients and the size of the coefficients. The coefficients
themselves are not stored in the script.
As the signal is processed in real time, the output signal is stored as
consecutive bursts of data for post-processing, mainly assessing the consistency of the
implemented FIR cascade transfer function with the design criteria and the expected
transfer function.

The TCL script is used by Vivado to produce the FPGA bitstream ((2) in figure~\ref{fig:workflow}).
We use the 2018.2 version of Xilinx Vivado and we execute the synthesized
bitstream on a Redpitaya board fitted with a Xilinx Zynq-7010 series
FPGA (xc7z010clg400-1) and two LTC2145 14-bit 125~MS/s ADCs, loaded with 50~$\Omega$ resistors to
provide a broadband noise source.
The board runs the Linux kernel and surrounding environment produced from the
Buildroot framework available at \url{https://github.com/trabucayre/redpitaya/}: configuring
the Zynq FPGA, feeding the FIR with the set of coefficients, executing the simulation and
fetching the results are automated.

The deploy script uploads the bitstream to the board ((3) in
figure~\ref{fig:workflow}), flashes the FPGA, loads the different drivers and
configures the coefficients of the FIR filters. It then waits for the results
and retrieves the data to the main computer ((4) in figure~\ref{fig:workflow}).

Finally, an Octave post-processing script computes the final results from
the output data ((5) in figure~\ref{fig:workflow}).
The results are normalized so that the Power Spectral Density (PSD) starts at zero
and the different configurations can be compared.

\section{Maximizing the rejection at fixed silicon area}
\label{sec:fixed_area}
This section presents the output of the filter solver, {\em i.e.} the computed
configurations for each stage, the computed rejection and the computed silicon area.
Such results help in understanding the choices made by the solver to compute its solutions.

The experimental setup is composed of three cases. The raw input is generated
by a Pseudo Random Number (PRN) generator, which fixes the input data size $\Pi^I$.
The total silicon area $\mathcal{A}$ is then fixed to either 500, 1000 or 1500
arbitrary units. Hence, the three cases are named MAX/500, MAX/1000 and MAX/1500.
The number of configurations $p$ is 1827, with $C_i$ ranging from 3 to 60 and $\pi^C$
ranging from 2 to 22. In each case, the quadratic program has been able to give a
result for up to five stages ($n = 5$) in the cascaded filter.

Tables~\ref{tbl:gurobi_max_500}, \ref{tbl:gurobi_max_1000} and \ref{tbl:gurobi_max_1500} show
the results obtained by the filter solver for MAX/500, MAX/1000 and MAX/1500, respectively.

\renewcommand{\arraystretch}{1.4}

\begin{table}
\caption{Configurations $(C_i, \pi_i^C, \pi_i^S)$, rejections and areas (in arbitrary units) for MAX/500}
\label{tbl:gurobi_max_500}
\centering
{\scalefont{0.77}
\begin{tabular}{|c|ccccc|c|c|}
\hline
$n$ & $i = 1$ & $i = 2$ & $i = 3$ & $i = 4$ & $i = 5$ & Rejection & Area \\
\hline
1 & (21, 7, 0) & - & - & - & - & 32~dB & 483 \\
2 & (3, 3, 15) & (31, 9, 0) & - & - & - & 58~dB & 460 \\
3 & (3, 3, 15) & (27, 9, 0) & (5, 3, 0) & - & - & 66~dB & 488 \\
4 & (3, 3, 15) & (19, 7, 0) & (11, 5, 0) & (3, 3, 0) & - & 74~dB & 499 \\
5 & (3, 3, 15) & (23, 8, 0) & (3, 3, 1) & (3, 3, 0) & (3, 3, 0) & 78~dB & 489 \\
\hline
\end{tabular}
}
\end{table}

\begin{table}
\caption{Configurations $(C_i, \pi_i^C, \pi_i^S)$, rejections and areas (in arbitrary units) for MAX/1000}
\label{tbl:gurobi_max_1000}
\centering
{\scalefont{0.77}
\begin{tabular}{|c|ccccc|c|c|}
\hline
$n$ & $i = 1$ & $i = 2$ & $i = 3$ & $i = 4$ & $i = 5$ & Rejection & Area \\
\hline
1 & (37, 11, 0) & - & - & - & - & 56~dB & 999 \\
2 & (3, 3, 15) & (51, 14, 0) & - & - & - & 87~dB & 975 \\
3 & (3, 3, 15) & (35, 11, 0) & (19, 7, 0) & - & - & 99~dB & 1000 \\
4 & (3, 4, 16) & (27, 8, 0) & (19, 7, 1) & (11, 5, 0) & - & 103~dB & 998 \\
5 & (3, 3, 15) & (31, 9, 0) & (19, 7, 0) & (3, 3, 1) & (3, 3, 0) & 111~dB & 984 \\
\hline
\end{tabular}
}
\end{table}

\begin{table}
\caption{Configurations $(C_i, \pi_i^C, \pi_i^S)$, rejections and areas (in arbitrary units) for MAX/1500}
\label{tbl:gurobi_max_1500}
\centering
{\scalefont{0.77}
\begin{tabular}{|c|ccccc|c|c|}
\hline
$n$ & $i = 1$ & $i = 2$ & $i = 3$ & $i = 4$ & $i = 5$ & Rejection & Area \\
\hline
1 & (47, 15, 0) & - & - & - & - & 71~dB & 1457 \\
2 & (19, 6, 15) & (51, 14, 0) & - & - & - & 103~dB & 1489 \\
3 & (3, 3, 15) & (35, 11, 0) & (35, 11, 0) & - & - & 122~dB & 1492 \\
4 & (3, 3, 15) & (27, 8, 0) & (19, 7, 0) & (27, 9, 0) & - & 129~dB & 1498 \\
5 & (3, 3, 15) & (23, 9, 2) & (27, 9, 0) & (19, 7, 0) & (3, 3, 0) & 136~dB & 1499 \\
\hline
\end{tabular}
}
\end{table}

\renewcommand{\arraystretch}{1}

From these tables, we can first state that the more stages are used to define
the cascaded FIR filters, the better the rejection. This was an expected result, as it has
been previously observed that many small filters are better than
a single large filter \cite{lim_1988, lim_1996, young_1992}, despite such conclusions
being hardly used in practice due to the lack of tools for identifying individual filter
coefficients in the cascaded approach.

Second, the larger the silicon area, the better the rejection. This was also an
expected result as more area means a filter of better quality with more coefficients
or more bits per coefficient.

Then, we also observe that the first stage can have a larger shift than the other
stages. This is explained by the fact that the solver tries to use just enough
bits for the computed rejection after each stage. In the first stage, a
balance between a strong rejection and a low number of bits is targeted. Equation~\ref{eq:maxshift}
gives the relation between both values.
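
As an illustration of this bit budget, consider the $n = 3$ solution of MAX/1000 in
Table~\ref{tbl:gurobi_max_1000}. Assuming that the output width of stage $i$ is
$\pi_i^+ = \pi_i^- + \pi_i^C - \pi_i^S$, with $\pi_1^- = \Pi^I = 16$~bits and
$\pi_{i+1}^- = \pi_i^+$ (relations restated here as assumptions), the data widths along
the cascade would be:
\begin{align}
\pi_1^+ & = 16 + 3 - 15 = 4 \notag \\
\pi_2^+ & = 4 + 11 - 0 = 15 \notag \\
\pi_3^+ & = 15 + 7 - 0 = 22 \notag
\end{align}
Under these assumptions, the 15-bit shift of the first stage lets the second stage process
4-bit data instead of the 19-bit data it would receive without any shift.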

Finally, we note that the solver consumes nearly all of the given silicon area.
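
This observation can be checked against the tabulated areas. Assuming that the area of
stage $i$ is $a_i = C_i \times (\pi_i^C + \pi_i^-)$, with $\pi_1^- = \Pi^I = 16$~bits
(a relation restated here as an assumption), the single-stage ($n = 1$) solutions give:
\begin{align}
\text{MAX/500: } & 21 \times (7 + 16) = 483 \leq 500 \notag \\
\text{MAX/1000: } & 37 \times (11 + 16) = 999 \leq 1000 \notag \\
\text{MAX/1500: } & 47 \times (15 + 16) = 1457 \leq 1500 \notag
\end{align}
These values match the Area column of Tables~\ref{tbl:gurobi_max_500},
\ref{tbl:gurobi_max_1000} and \ref{tbl:gurobi_max_1500}, leaving between 1 and 43
arbitrary units unused.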

The following graphs present the rejection for real data on the FPGA. In all the following
figures, the solid lines represent the actual rejection of the filtered
data on the FPGA as measured experimentally and the dashed lines are the noise levels
given by the quadratic solver. The configurations are those listed in
Tables~\ref{tbl:gurobi_max_500}, \ref{tbl:gurobi_max_1000} and \ref{tbl:gurobi_max_1500}.

Figures~\ref{fig:max_500_result}, \ref{fig:max_1000_result} and \ref{fig:max_1500_result} show
the rejection of the different configurations for MAX/500, MAX/1000 and MAX/1500, respectively.

% \begin{figure}
% \centering
% \includegraphics[width=\linewidth]{images/max_500}
% \caption{Signal spectrum for MAX/500}
% \label{fig:max_500_result}
% \end{figure}
%
% \begin{figure}
% \centering
% \includegraphics[width=\linewidth]{images/max_1000}
% \caption{Signal spectrum for MAX/1000}
% \label{fig:max_1000_result}
% \end{figure}
%
% \begin{figure}
% \centering
% \includegraphics[width=\linewidth]{images/max_1500}
% \caption{Signal spectrum for MAX/1500}
% \label{fig:max_1500_result}
% \end{figure}

% r2.14 et r2.15 et r2.16
\begin{figure}
\centering
\begin{subfigure}{\linewidth}
\includegraphics[width=\linewidth]{images/max_500}
\caption{Filter transfer functions for varying number of cascaded filters solving
the MAX/500 problem of maximizing rejection for a given resource allocation (500~arbitrary units).}
\label{fig:max_500_result}
\end{subfigure}

\begin{subfigure}{\linewidth}
\includegraphics[width=\linewidth]{images/max_1000}
\caption{Filter transfer functions for varying number of cascaded filters solving
the MAX/1000 problem of maximizing rejection for a given resource allocation (1000~arbitrary units).}
\label{fig:max_1000_result}
\end{subfigure}

\begin{subfigure}{\linewidth}
\includegraphics[width=\linewidth]{images/max_1500}
\caption{Filter transfer functions for varying number of cascaded filters solving
the MAX/1500 problem of maximizing rejection for a given resource allocation (1500~arbitrary units).}
\label{fig:max_1500_result}
\end{subfigure}
\caption{Solutions for the MAX/500, MAX/1000 and MAX/1500 problems of maximizing
rejection for a given resource allocation.
The filter shape constraint (bandpass and bandstop) is shown as thick
horizontal lines on each chart.}
\end{figure}

In all cases, we observe that the actual rejection is close to the rejection computed by the solver.

We compare the actual silicon resources given by Vivado to the
resources in arbitrary units.
The goal is to check that our arbitrary units of silicon area model well enough
the real resources on the FPGA. In particular, we want to verify that, for a given
number of arbitrary units, the actual silicon resources do not depend on the
number of stages $n$. Most significantly, our approach aims
at remaining far enough from the practical logic gate implementation used by
various vendors to remain platform independent and be portable from one
architecture to another.

Table~\ref{tbl:resources_usage} shows the resource usage in the case of MAX/500, MAX/1000 and
MAX/1500, \emph{i.e.} when the maximum allowed silicon area is fixed to 500, 1000
and 1500 arbitrary units. We have taken care to extract solely the resources used by
the FIR filters and remove additional processing blocks including FIFO and Programmable
Logic (PL -- FPGA) to Processing System (PS -- general purpose processor) communication.

\begin{table}[h!tb]
\caption{Resource occupation following synthesis of the solutions found for
the problem of maximizing rejection for a given resource allocation. The last column refers to available resources on a Zynq-7010 as found on the Redpitaya.}
\label{tbl:resources_usage}
\centering
\begin{tabular}{|c|c|ccc|c|}
\hline
$n$ & & MAX/500 & MAX/1000 & MAX/1500 & \emph{Zynq 7010} \\ \hline\hline
& LUT & 249 & 453 & 627 & \emph{17600} \\
1 & BRAM & 1 & 1 & 1 & \emph{120} \\
& DSP & 21 & 37 & 47 & \emph{80} \\ \hline
& LUT & 2374 & 5494 & 691 & \emph{17600} \\
2 & BRAM & 2 & 2 & 2 & \emph{120} \\
& DSP & 0 & 0 & 70 & \emph{80} \\ \hline
& LUT & 2443 & 3304 & 3521 & \emph{17600} \\
3 & BRAM & 3 & 3 & 3 & \emph{120} \\
& DSP & 0 & 19 & 35 & \emph{80} \\ \hline
& LUT & 2634 & 3753 & 2557 & \emph{17600} \\
4 & BRAM & 4 & 4 & 4 & \emph{120} \\
& DSP & 0 & 19 & 46 & \emph{80} \\ \hline
& LUT & 2423 & 3047 & 2847 & \emph{17600} \\
5 & BRAM & 5 & 5 & 5 & \emph{120} \\
& DSP & 0 & 22 & 46 & \emph{80} \\ \hline
\end{tabular}
\end{table}

In some cases, Vivado replaces the DSPs by Look Up Tables (LUTs). We assume that,
when the filter coefficients are small enough, or when the input size is small
enough, Vivado optimizes resource consumption by selecting multiplexers to
implement the multiplications instead of a DSP. In this case, it is quite difficult
to compare the whole silicon budget.

815

809

However, a rough estimation can be made with a simple equivalence: looking at

816

810

However, a rough estimation can be made with a simple equivalence: looking at

the first column (MAX/500), where the number of LUTs is quite stable for $n \geq 2$,

817

811

the first column (MAX/500), where the number of LUTs is quite stable for $n \geq 2$,

we can deduce that a DSP is roughly equivalent to 100~LUTs in terms of silicon

818

812

we can deduce that a DSP is roughly equivalent to 100~LUTs in terms of silicon

area use. With this equivalence, our 500 arbitraty units correspond to 2500 LUTs,

819

813

area use. With this equivalence, our 500 arbitrary units correspond to 2500 LUTs,

1000 arbitrary units correspond to 5000 LUTs and 1500 arbitrary units correspond

820

814

1000 arbitrary units correspond to 5000 LUTs and 1500 arbitrary units correspond

to 7300 LUTs. The conclusion is that the orders of magnitude of our arbitrary

821

815

to 7300 LUTs. The conclusion is that the orders of magnitude of our arbitrary

unit map well to actual hardware resources. The relatively small differences can probably be explained

822

816

unit map well to actual hardware resources. The relatively small differences can probably be explained

by the optimizations done by Vivado based on the detailed map of available processing resources.

823

817

by the optimizations done by Vivado based on the detailed map of available processing resources.
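
As a rough check of this equivalence, assuming it is applied as
$\mathrm{LUT}_{\mathrm{eq}} = \mathrm{LUT} + 100 \times \mathrm{DSP}$ (our reading of the rule
above, not an exact silicon model), a few entries of Table~\ref{tbl:resources_usage} give
\begin{align*}
\text{MAX/500},\ n = 1: &\quad 249 + 100 \times 21 = 2349,\\
\text{MAX/1000},\ n = 3: &\quad 3304 + 100 \times 19 = 5204,\\
\text{MAX/1500},\ n = 4: &\quad 2557 + 100 \times 46 = 7157,
\end{align*}
which are of the same order as the 2500, 5000 and 7300 LUTs quoted above. The factor of 100
itself is consistent with the MAX/500 column, where $(2374 - 249)/21 \approx 101$ when comparing
$n = 1$ and $n = 2$.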

We now present the computation time needed to solve the quadratic problem.
For each case, the filter solver software is executed on an Intel(R) Xeon(R) CPU E5606
clocked at 2.13~GHz. The CPU has 8 cores that are used by Gurobi to solve
the quadratic problem. Table~\ref{tbl:area_time} shows the time needed to solve the quadratic
problem when the maximal area is fixed to 500, 1000 and 1500 arbitrary units.

\begin{table}[h!tb]
\caption{Time needed to solve the quadratic program with Gurobi}
\label{tbl:area_time}
\centering
\begin{tabular}{|c|c|c|c|}\hline
$n$ & Time (MAX/500) & Time (MAX/1000) & Time (MAX/1500) \\\hline\hline
1 & 0.1~s & 0.1~s & 0.3~s \\
2 & 1.1~s & 2.2~s & 12~s \\
3 & 17~s & 137~s ($\approx$ 2~min) & 275~s ($\approx$ 4~min) \\
4 & 52~s & 5448~s ($\approx$ 90~min) & 5505~s ($\approx$ 17~h) \\
5 & 286~s ($\approx$ 4~min) & 4119~s ($\approx$ 68~min) & 235479~s ($\approx$ 3~days) \\\hline
\end{tabular}
\end{table}

As expected, the computation time seems to rise exponentially with the number of stages. % TODO: exponential?
When the area is limited, the design exploration space is smaller and the solver is able to
find an optimal solution faster.
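
To give an order of magnitude, and only as an indicative estimate from the MAX/500 column of
Table~\ref{tbl:area_time}, the average growth factor per added stage between $n = 1$ and $n = 5$ is
\[
\left(\frac{286}{0.1}\right)^{1/4} \approx 7.3,
\]
although the individual ratios between successive values of $n$ range from about 3 to about 15,
so five data points do not allow us to conclude on a strictly exponential law.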

\subsection{Minimizing resource occupation at fixed rejection}\label{sec:fixed_rej}

This section presents the results of the complementary quadratic program aimed at
minimizing the area occupation for a targeted rejection level.

The experimental setup is composed of four cases. The raw input is the same
as in the previous section, from a PRN generator, which fixes the input data size $\Pi^I$.
The targeted rejection $\mathcal{R}$ is then fixed to either 40, 60, 80 or 100~dB.
Hence, the four cases are named MIN/40, MIN/60, MIN/80 and MIN/100.
The number of configurations $p$ is the same as in the previous section.

Tables~\ref{tbl:gurobi_min_40}, \ref{tbl:gurobi_min_60}, \ref{tbl:gurobi_min_80} and
\ref{tbl:gurobi_min_100} show the results obtained by the filter solver for MIN/40, MIN/60,
MIN/80 and MIN/100, respectively.

\renewcommand{\arraystretch}{1.4}

\begin{table}[h!tb]
\caption{Configurations $(C_i, \pi_i^C, \pi_i^S)$, rejections and areas (in arbitrary units) for MIN/40}
\label{tbl:gurobi_min_40}
\centering
{\scalefont{0.77}
\begin{tabular}{|c|ccccc|c|c|}
\hline
$n$ & $i = 1$ & $i = 2$ & $i = 3$ & $i = 4$ & $i = 5$ & Rejection & Area \\
\hline
1 & (27, 8, 0) & - & - & - & - & 41~dB & 648 \\
2 & (3, 2, 14) & (19, 7, 0) & - & - & - & 40~dB & 263 \\
3 & (3, 3, 15) & (11, 5, 0) & (3, 3, 0) & - & - & 41~dB & 192 \\
4 & (3, 3, 15) & (3, 3, 0) & (3, 3, 0) & (3, 3, 0) & - & 42~dB & 147 \\
\hline
\end{tabular}
}
\end{table}

\begin{table}[h!tb]
\caption{Configurations $(C_i, \pi_i^C, \pi_i^S)$, rejections and areas (in arbitrary units) for MIN/60}
\label{tbl:gurobi_min_60}
\centering
{\scalefont{0.77}
\begin{tabular}{|c|ccccc|c|c|}
\hline
$n$ & $i = 1$ & $i = 2$ & $i = 3$ & $i = 4$ & $i = 5$ & Rejection & Area \\
\hline
1 & (39, 13, 0) & - & - & - & - & 60~dB & 1131 \\
2 & (3, 3, 15) & (35, 10, 0) & - & - & - & 60~dB & 547 \\
3 & (3, 3, 15) & (27, 8, 0) & (3, 3, 0) & - & - & 62~dB & 426 \\
4 & (3, 2, 14) & (11, 5, 1) & (11, 5, 0) & (3, 3, 0) & - & 60~dB & 344 \\
5 & (3, 2, 14) & (3, 3, 1) & (3, 3, 0) & (3, 3, 0) & (3, 3, 0) & 60~dB & 279 \\
\hline
\end{tabular}
}
\end{table}

\begin{table}[h!tb]
\caption{Configurations $(C_i, \pi_i^C, \pi_i^S)$, rejections and areas (in arbitrary units) for MIN/80}
\label{tbl:gurobi_min_80}
\centering
{\scalefont{0.77}
\begin{tabular}{|c|ccccc|c|c|}
\hline
$n$ & $i = 1$ & $i = 2$ & $i = 3$ & $i = 4$ & $i = 5$ & Rejection & Area \\
\hline
1 & (55, 16, 0) & - & - & - & - & 81~dB & 1760 \\
2 & (3, 3, 15) & (47, 14, 0) & - & - & - & 80~dB & 903 \\
3 & (3, 3, 15) & (23, 9, 0) & (19, 7, 0) & - & - & 80~dB & 698 \\
4 & (3, 3, 15) & (27, 9, 0) & (7, 7, 4) & (3, 3, 0) & - & 80~dB & 605 \\
5 & (3, 2, 14) & (27, 8, 0) & (3, 3, 1) & (3, 3, 0) & (3, 3, 0) & 81~dB & 534 \\
\hline
\end{tabular}
}
\end{table}

\begin{table}[h!tb]
\caption{Configurations $(C_i, \pi_i^C, \pi_i^S)$, rejections and areas (in arbitrary units) for MIN/100}
\label{tbl:gurobi_min_100}
\centering
{\scalefont{0.77}
\begin{tabular}{|c|ccccc|c|c|}
\hline
$n$ & $i = 1$ & $i = 2$ & $i = 3$ & $i = 4$ & $i = 5$ & Rejection & Area \\
\hline
1 & - & - & - & - & - & - & - \\
2 & (15, 7, 17) & (51, 14, 0) & - & - & - & 100~dB & 1365 \\
3 & (3, 3, 15) & (27, 9, 0) & (27, 9, 0) & - & - & 100~dB & 1002 \\
4 & (3, 3, 15) & (31, 9, 0) & (19, 7, 0) & (3, 3, 0) & - & 101~dB & 909 \\
5 & (3, 3, 15) & (23, 8, 1) & (19, 7, 0) & (3, 3, 0) & (3, 3, 0) & 101~dB & 810 \\
\hline
\end{tabular}
}
\end{table}
\renewcommand{\arraystretch}{1}

From these tables, we can first state that almost all configurations reach the targeted rejection
level or even exceed it, thanks to our underestimate of the cascade rejection as the sum of the
individual filter rejections. The only exception is the monolithic case ($n = 1$) in
MIN/100: no solution is found for a single monolithic filter to reach a 100~dB rejection.
Furthermore, the area of the monolithic filter is about twice as large as that of the two cascaded filters
(1131 and 1760 arbitrary units vs. 547 and 903 arbitrary units for 60 and 80~dB rejection,
respectively). More generally, the more filters are cascaded, the lower the occupied area.
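
The tabulated areas are consistent with a simple per-stage cost model, which we state here as an
assumption inferred from the table values rather than as the exact objective used by the solver:
stage $i$ costs $C_i \times (\pi_i^C + \pi_i^-)$, where $\pi_i^-$ denotes the data size at the
input of stage $i$ (a notation introduced here for illustration only), with $\pi_1^- = \Pi^I$
taken to be 16~bits and $\pi_{i+1}^- = \pi_i^- + \pi_i^C - \pi_i^S$. For MIN/60 this gives
\begin{align*}
n = 1: &\quad 39 \times (13 + 16) = 1131,\\
n = 2: &\quad 3 \times (3 + 16) + 35 \times (10 + 4) = 57 + 490 = 547,
\end{align*}
so that the area ratio between the monolithic filter and the two-stage cascade is
$1131/547 \approx 2.07$; the same computation for MIN/80 gives $1760/903 \approx 1.95$.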

As in the previous section, the solver always chooses a small filter as the first
filter stage, and the second one is often the largest filter. This choice can be explained
as in the previous section: the solver uses just enough bits in the first stage not to degrade the input
signal, and in the second stage selects a better filter to improve rejection without
having too many bits in the output data.

For the specific case of MIN/40 with $n = 5$, the solver has determined that the optimal
number of filters is 4, so it did not choose any configuration for the last filter. Hence this
solution is equivalent to the result for $n = 4$.

The following graphs present the rejection for real data on the FPGA. In all the following
figures, the solid line represents the actual rejection of the filtered
data on the FPGA as measured experimentally and the dashed line is the noise level
given by the quadratic solver.

Figures~\ref{fig:min_40}, \ref{fig:min_60}, \ref{fig:min_80} and \ref{fig:min_100} show the
rejection of the different configurations in the case of MIN/40, MIN/60, MIN/80 and MIN/100,
respectively.

% \begin{figure}
% \centering
% \includegraphics[width=\linewidth]{images/min_40}
% \caption{Signal spectrum for MIN/40}
% \label{fig:min_40}
% \end{figure}
%
% \begin{figure}
% \centering
% \includegraphics[width=\linewidth]{images/min_60}
% \caption{Signal spectrum for MIN/60}
% \label{fig:min_60}
% \end{figure}
%
% \begin{figure}
% \centering
% \includegraphics[width=\linewidth]{images/min_80}
% \caption{Signal spectrum for MIN/80}
% \label{fig:min_80}
% \end{figure}
%
% \begin{figure}
% \centering
% \includegraphics[width=\linewidth]{images/min_100}
% \caption{Signal spectrum for MIN/100}
% \label{fig:min_100}
% \end{figure}

% r2.14, r2.15 and r2.16
\begin{figure}
\centering
\begin{subfigure}{\linewidth}
\includegraphics[width=.91\linewidth]{images/min_40}
\caption{Filter transfer functions for varying number of cascaded filters solving
the MIN/40 problem of minimizing resource allocation for reaching a 40~dB rejection.}
\label{fig:min_40}
\end{subfigure}

\begin{subfigure}{\linewidth}
\includegraphics[width=.91\linewidth]{images/min_60}
\caption{Filter transfer functions for varying number of cascaded filters solving
the MIN/60 problem of minimizing resource allocation for reaching a 60~dB rejection.}
\label{fig:min_60}
\end{subfigure}

\begin{subfigure}{\linewidth}
\includegraphics[width=.91\linewidth]{images/min_80}
\caption{Filter transfer functions for varying number of cascaded filters solving
the MIN/80 problem of minimizing resource allocation for reaching an 80~dB rejection.}
\label{fig:min_80}
\end{subfigure}

\begin{subfigure}{\linewidth}
\includegraphics[width=.91\linewidth]{images/min_100}
\caption{Filter transfer functions for varying number of cascaded filters solving
the MIN/100 problem of minimizing resource allocation for reaching a 100~dB rejection.}
\label{fig:min_100}
\end{subfigure}
\caption{Solutions for the MIN/40, MIN/60, MIN/80 and MIN/100 problems of reaching a
given rejection while minimizing resource allocation. The filter shape constraint (bandpass and
bandstop) is shown as thick
horizontal lines on each chart.}
\end{figure}

We observe that all rejections given by the quadratic solver are close to the experimentally
measured rejection. All curves confirm that the constraint to reach the target rejection is
met by both monolithic (except in MIN/100, which has no monolithic solution) and cascaded filters.

Table~\ref{tbl:resources_usage_comp} shows the resource usage in the case of MIN/40, MIN/60,
MIN/80 and MIN/100, \emph{i.e.} when the target rejection is fixed to 40, 60, 80 and 100~dB. We
have taken care to extract solely the resources used by
the FIR filters and remove additional processing blocks including FIFO and PL to
PS communication.

\renewcommand{\arraystretch}{1.2}
\begin{table}
\caption{Resource occupation. The last column refers to available resources on a Zynq-7010 as found on the Redpitaya.}
\label{tbl:resources_usage_comp}
\centering
{\scalefont{0.90}
\begin{tabular}{|c|c|cccc|c|}
\hline
$n$ & & MIN/40 & MIN/60 & MIN/80 & MIN/100 & \emph{Zynq 7010} \\ \hline\hline
  & LUT & 343 & 334 & 772 & - & \emph{17600} \\
1 & BRAM & 1 & 1 & 1 & - & \emph{120} \\
  & DSP & 27 & 39 & 55 & - & \emph{80} \\ \hline
  & LUT & 1252 & 2862 & 5099 & 640 & \emph{17600} \\
2 & BRAM & 2 & 2 & 2 & 2 & \emph{120} \\
  & DSP & 0 & 0 & 0 & 66 & \emph{80} \\ \hline
  & LUT & 891 & 2148 & 2023 & 2448 & \emph{17600} \\
3 & BRAM & 3 & 3 & 3 & 3 & \emph{120} \\
  & DSP & 0 & 0 & 19 & 27 & \emph{80} \\ \hline
  & LUT & 662 & 1729 & 2451 & 2893 & \emph{17600} \\
4 & BRAM & 4 & 4 & 4 & 4 & \emph{120} \\
  & DSP & 0 & 0 & 7 & 19 & \emph{80} \\ \hline
  & LUT & - & 1259 & 2602 & 2505 & \emph{17600} \\
5 & BRAM & - & 5 & 5 & 5 & \emph{120} \\
  & DSP & - & 0 & 0 & 19 & \emph{80} \\ \hline
\end{tabular}
}
\end{table}
\renewcommand{\arraystretch}{1}

If we keep the previous estimate of the cost of one DSP in terms of LUTs (1 DSP $\approx$ 100 LUTs),
the real resource consumption decreases as a function of the number of stages in the cascaded
filter, in agreement with the solution given by the quadratic solver. Indeed, the consumption
always decreases, even if the difference between the monolithic and the two cascaded
filters is less than expected.
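
As an illustration, applying the same rule (LUT-equivalent $=$ LUT $+\, 100 \times$ DSP, an
approximation rather than a measured figure) to the MIN/80 column of
Table~\ref{tbl:resources_usage_comp} gives
\begin{align*}
n = 1: &\quad 772 + 100 \times 55 = 6272,\\
n = 2: &\quad 5099 + 100 \times 0 = 5099,\\
n = 3: &\quad 2023 + 100 \times 19 = 3923,\\
n = 4: &\quad 2451 + 100 \times 7 = 3151,\\
n = 5: &\quad 2602 + 100 \times 0 = 2602,
\end{align*}
which decreases monotonically with $n$, whereas the step from $n = 1$ to $n = 2$ (a factor of
about 1.2) is smaller than the factor of about 2 predicted in arbitrary units.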

Finally, Table~\ref{tbl:area_time_comp} shows the computation time to solve
the quadratic program.

\renewcommand{\arraystretch}{1.2}
\begin{table}[h!tb]
\caption{Time to solve the quadratic program with Gurobi}
\label{tbl:area_time_comp}
\centering
{\scalefont{0.90}
\begin{tabular}{|c|c|c|c|c|}\hline
$n$ & Time (MIN/40) & Time (MIN/60) & Time (MIN/80) & Time (MIN/100) \\\hline\hline
1 & 0.07~s & 0.02~s & 0.01~s & - \\
2 & 7.8~s & 16~s & 14~s & 1.8~s \\
3 & 4.7~s & 14~s & 28~s & 39~s \\
4 & 39~s & 20~s & 193~s & 522~s ($\approx$ 9~min) \\
5 & - & 12~s & 170~s & 1048~s ($\approx$ 17~min) \\\hline
\end{tabular}
}
\end{table}
\renewcommand{\arraystretch}{1}
