% jfriedt / IFCS2018 article

% merge "maximize rejection for a given area" with "minimize area for a given rejection"
% demonstrate how quantization pushes noise towards high frequencies => 6 dB of
% rejection per bit, and loss if fewer bits than rejection/6
% develop the linear program including the bit shift
% emphasize that we were previously synthesizable but not implementable, whereas we now
% implement and demonstrate that it runs
% gwen: why is the FIR now implementable when it was not, even on the zedboard -> new FIR?
% Gwen: can we build a real phase noise test bench with this FIR, i.e. add ADC, NCO and mixer
% (zedboard or redpit)

% label schema: check that "argue for the FIR cascade" is done

\documentclass[a4paper,journal]{IEEEtran/IEEEtran}
\usepackage{graphicx,color,hyperref}
\usepackage{amsfonts}
\usepackage{amsthm}
\usepackage{amssymb}
\usepackage{amsmath}
\usepackage{algorithm2e}
\usepackage{url,balance}
\usepackage[normalem]{ulem}
\usepackage{tikz}
\usetikzlibrary{positioning,fit}
\usepackage{multirow}
\usepackage{scalefnt}
\usepackage{caption}
\usepackage{subcaption}

% correct bad hyphenation here
\hyphenation{op-tical net-works semi-conduc-tor}
\textheight=26cm
\setlength{\footskip}{30pt}
\pagenumbering{gobble}

\begin{document}

\begin{document}

\title{Filter optimization for real time digital processing of radiofrequency signals: application
to oscillator metrology}

\author{\IEEEauthorblockN{A. Hugeat\IEEEauthorrefmark{1}\IEEEauthorrefmark{2}, J. Bernard\IEEEauthorrefmark{2},
G. Goavec-M\'erou\IEEEauthorrefmark{1},
P.-Y. Bourgeois\IEEEauthorrefmark{1}, J.-M. Friedt\IEEEauthorrefmark{1}}\\
\IEEEauthorblockA{\IEEEauthorrefmark{1}FEMTO-ST, Time \& Frequency department, Besan\c con, France }\\
\IEEEauthorblockA{\IEEEauthorrefmark{2}FEMTO-ST, Computer Science department DISC, Besan\c con, France \\
Email: \{pyb2,jmfriedt\}@femto-st.fr}
}
\maketitle
\thispagestyle{plain}
\pagestyle{plain}
\newtheorem{definition}{Definition}

\begin{abstract}
Software Defined Radio (SDR) provides stability, flexibility and reconfigurability to
radiofrequency signal processing. Applied to oscillator characterization in the context
of ultrastable clocks, stringent filtering requirements are defined by spurious signal or
noise rejection needs. Since real time radiofrequency processing must be performed in a
Field Programmable Gate Array (FPGA) to meet timing constraints, we investigate optimization strategies
to design filters meeting rejection characteristics while limiting the hardware resources
required and keeping timing constraints within the targeted measurement bandwidths. The
presented technique is applicable to scheduling any sequence of processing blocks characterized
by a throughput, resource occupation and performance tabulated as a function of configuration
characteristics, as is the case for filters with their coefficients and resolution yielding
rejection and number of multipliers.
\end{abstract}

\begin{IEEEkeywords}
Software Defined Radio, Mixed-Integer Linear Programming, Finite Impulse Response filter
\end{IEEEkeywords}

\section{Digital signal processing of ultrastable clock signals}

Analog oscillator phase noise characterization is classically performed by downconverting
the radiofrequency signal to baseband using a saturated mixer,
followed by a Fourier analysis of the beat signal to analyze phase fluctuations close to the carrier. In
a fully digital approach, the radiofrequency signal is digitized and numerically downconverted by
multiplying the samples with a local numerically controlled oscillator (Fig. \ref{schema}) \cite{rsi}.

\begin{figure}[h!tb]
\begin{center}
\includegraphics[width=.8\linewidth]{images/schema}
\end{center}
\caption{Fully digital oscillator phase noise characterization: the Device Under Test
(DUT) signal is sampled by the radiofrequency grade Analog to Digital Converter (ADC) and
downconverted by mixing with a Numerically Controlled Oscillator (NCO). Unwanted signals
and noise aliases are rejected by a Low Pass Filter (LPF) implemented as a cascade of Finite
Impulse Response (FIR) filters. The signal is then decimated before a Fourier analysis displays
the spectral characteristics of the phase fluctuations.}
\label{schema}
\end{figure}

As with the analog mixer,
the non-linear behavior of the downconverter introduces noise or spurious signal aliasing as
well as the generation of the frequency sum signal in addition to the frequency difference.
These unwanted spectral components must be rejected before decimating the data stream
for the phase noise spectral characterization \cite{andrich2018high}. The filters introduced between the
downconverter
and the decimation processing blocks are core components of an oscillator characterization
system, and must reject out-of-band signals below the targeted phase noise -- typically in the
sub $-170$~dBc/Hz range for the ultrastable oscillators we aim at characterizing. The filter blocks will
use most resources of the Field Programmable Gate Array (FPGA) used to process the radiofrequency
datastream: optimizing the performance of the filter while reducing the needed resources is
hence tackled in a systematic approach using optimization techniques. Most significantly, we
tackle the issue by attempting to cascade multiple Finite Impulse Response (FIR) filters with
a tunable number of coefficients and a tunable number of bits representing the coefficients and the
data being processed.
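
As a reminder of where the sum and difference terms mentioned above come from, the following identity assumes an ideal multiplier fed with two unit-amplitude tones; the symbols $f_{\mathrm{DUT}}$, $f_{\mathrm{NCO}}$ and $\varphi(t)$ are introduced here for illustration only:
\begin{align*}
\cos\left(2\pi f_{\mathrm{DUT}} t+\varphi(t)\right)\cos\left(2\pi f_{\mathrm{NCO}} t\right)
 = &\frac{1}{2}\cos\left(2\pi (f_{\mathrm{DUT}}-f_{\mathrm{NCO}}) t+\varphi(t)\right) \\
 + &\frac{1}{2}\cos\left(2\pi (f_{\mathrm{DUT}}+f_{\mathrm{NCO}}) t+\varphi(t)\right)
\end{align*}
The first term carries the phase fluctuations $\varphi(t)$ at the beat frequency and is kept, while the second term, at the sum frequency, is the one that must be rejected by the low-pass filter before decimation.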

\section{Finite impulse response filter}

We select FIR filters for their unconditional stability and ease of design. A FIR filter is defined
by a set of weights $b_k$ applied to the inputs $x_n$ through a convolution to generate the
outputs $y_n$
\begin{align}
y_n=\sum_{k=0}^N b_k x_{n-k}
\label{eq:fir_equation}
\end{align}
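
Since the filter performance is later assessed from the magnitude of its transfer function, it may help to recall the standard frequency response associated with Eq.~\eqref{eq:fir_equation}. The normalized angular frequency $\omega=2\pi f/f_s$, with $f_s$ the sampling rate, is a notation introduced here and not taken from the original text:
\begin{align*}
H(\omega)=\sum_{k=0}^N b_k e^{-j\omega k}, \qquad
\left|H(\omega)\right|_{\mathrm{dB}} = 20\log_{10}\left|H(\omega)\right|
\end{align*}
In a direct evaluation of Eq.~\eqref{eq:fir_equation}, each output sample requires $N+1$ multiplications, which suggests why the number of taps weighs so heavily on resource usage when one sample is produced per clock cycle.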

As opposed to an implementation on a general purpose processor in which word size is defined by the
processor architecture, implementing such a filter on an FPGA offers more degrees of freedom since
not only the coefficient values and number of taps must be defined, but also the number of bits
defining the coefficients and the sample size. For this reason, and because we consider pipeline
processing (as opposed to First-In, First-Out (FIFO) memory batch processing) of radiofrequency
signals, High Level Synthesis (HLS) languages \cite{kasbah2008multigrid} are not considered but
the problem is tackled at the Very-high-speed-integrated-circuit Hardware Description Language
(VHDL) level.
{\color{red}Since latency is not an issue in an open-loop phase noise characterization instrument,
the large
number of taps in the FIR, as opposed to the shorter Infinite Impulse Response (IIR) filter,
is not considered an issue as it would be in a closed-loop system.} % r2.4

The coefficients are classically expressed as floating point values. However, this binary
number representation is not efficient for fast arithmetic computation by an FPGA. Instead,
we choose to quantize these floating point values into integer values. This quantization
will result in some precision loss.
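
To make this precision loss concrete, the following sketch assumes signed coefficients, normalized to a unit maximum and rounded to the nearest integer on $B$~bits; $B$, $q$ and $\tilde{b}_k$ are notations introduced here for illustration and do not come from the original text:
\begin{align*}
q=\frac{1}{2^{B-1}-1}, \qquad
\tilde{b}_k=\mathrm{round}\left(\frac{b_k}{q}\right), \qquad
\left|b_k-q\,\tilde{b}_k\right|\leq\frac{q}{2}
\end{align*}
Under these assumptions, any coefficient with $|b_k|<q/2$ is rounded to zero. For $B=6$, $q=1/31$, so coefficients smaller than $1/62\simeq 1.6\%$ of the largest one vanish. Each additional bit approximately halves $q$, lowering the quantization error bound by about $20\log_{10}2\simeq 6$~dB.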

\begin{figure}[h!tb]
\includegraphics[width=\linewidth]{images/zero_values}
\caption{Impact of the quantization resolution of the coefficients: the quantization is
set to 6~bits -- with the horizontal black lines indicating $\pm$1 least significant bit -- setting
the 30~first and 30~last coefficients out of the initial 128~band-pass
filter coefficients to 0 (red dots).}
\label{float_vs_int}
\end{figure}

The tradeoff between quantization resolution and number of coefficients when considering
integer operations is not trivial. As an illustration of the issue related to the
relation between number of filter taps and quantization, Fig. \ref{float_vs_int} exhibits
a 128-coefficient FIR bandpass filter designed using floating point numbers (blue). Upon
quantization to 6-bit integers, 60 of the 128~coefficients at the beginning and end of the
tap sequence become null, {\color{red}making the large number of coefficients irrelevant: processing
resources % r1.1
are hence saved by shrinking the filter length.} This tradeoff, aimed at minimizing resources
to reach a given rejection level or at maximizing out-of-band rejection for a given computational
resource, will drive the investigation on cascading filters designed with varying tap resolution
and tap length, as will be shown in the next section. Indeed, our development strategy closely
follows the skeleton approach \cite{crookes1998environment, crookes2000design, benkrid2002towards}
in which basic blocks are defined and characterized before being assembled \cite{hide}
in a complete processing chain. In our case, assembling the filter blocks is a simpler block
combination process since we assume a single value to be processed and a single value to be
generated at each clock cycle. In the current implementation, the FIR filters do not
decimate: decimation is for now assumed to take place after the FIR cascade.

\section{Methodology description}

Our objective is to develop a new methodology applicable to any Digital Signal Processing (DSP)
chain obtained by assembling basic processing blocks, with hardware and manufacturer independence.
Achieving such a target requires defining an abstract model to represent some basic properties
of DSP blocks such as performance (e.g. rejection or passband ripples for filters) and
resource occupation. These abstract properties, not necessarily related to the detailed hardware
implementation of a given platform, will feed a scheduler solver aimed at assembling the optimum
target, whether in terms of maximizing performance for a given arbitrary resource occupation, or
minimizing resource occupation for a given performance. In our approach, the solution of the
solver is then synthesized using the dedicated tool provided by each platform manufacturer
to assess the validity of our abstract resource occupation indicator, and the result of running
the DSP chain on the FPGA allows for assessing the performance of the scheduler. We emphasize
that all solutions found by the solver are synthesized and executed on hardware at the end
of the analysis.

In this demonstration, we focus on only two operations: filtering and shifting the number of
bits needed to represent the data along the processing chain.
We have chosen these basic operations because shifting and filtering have already been studied
in the literature \cite{lim_1996, lim_1988, young_1992, smith_1998}, providing a framework for
assessing our results. Furthermore, filtering is a core step in any radiofrequency frontend
requiring pipelined processing at full bandwidth for the earliest steps, including for
time and frequency transfer or characterization \cite{carolina1,carolina2,rsi}.

Addressing only two operations allows for demonstrating the methodology but should not be
considered as a limitation of the framework, which can be extended to assembling any number
of skeleton blocks as long as performance and resource occupation can be determined. {\color{red}
Hence,
in this paper we will apply our methodology to simple DSP chains: a white noise input signal % r1.2
is generated using a Pseudo-Random Number (PRN) generator or by sampling a wideband (125~MS/s)
14-bit Analog to Digital Converter (ADC) loaded by a 50~$\Omega$ resistor.} Once samples have been
digitized at a rate of 125~MS/s, filtering is applied to qualify the processing block performance --
practically meeting the radiofrequency frontend requirement of noise and bandwidth reduction
by filtering and decimating. Finally, bursts of filtered samples are stored for post-processing,
allowing us either to assess the filter rejection for a given resource usage, or to validate the rejection
when implementing a solution minimizing resource occupation.

{\color{red}
The first step of our approach is to model the DSP chain. Since we aim at only optimizing % r1.3
the filtering part of the signal processing chain, we have not included the PRN generator or the
ADC in the model: the input data size and rate are considered fixed and defined by the hardware.
The filtering can be done in two ways, either by considering a single monolithic FIR filter
requiring many coefficients to reach the targeted noise rejection ratio, or by
cascading multiple FIR filters, each with fewer coefficients than found in the monolithic filter.}

After each filter we leave the possibility of shifting the filtered data to consume
fewer resources. Hence in the case of cascaded filters, we define a stage as a filter
and a shifter (the shift could be omitted if we do not need to divide the filtered data).
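
The interest of cascading can be summarized by the following relation, a sketch that assumes ideal linear time-invariant stages and neglects rounding effects; $H_i$, $R_i$ and $s$ are notations introduced here for illustration, with the product running over all stages of the cascade:
\begin{align*}
\left|H(\omega)\right|=\prod_{i}\left|H_i(\omega)\right|
\quad\Longrightarrow\quad
R_{\mathrm{dB}}=\sum_{i}R_{i,\mathrm{dB}}
\end{align*}
For instance, two stages each rejecting 40~dB would ideally reject 80~dB together. Likewise, shifting the output right by $s$ bits divides every sample by $2^s$: passband and stopband are both attenuated by $20\,s\log_{10}2\simeq 6s$~dB, so the rejection, being their ratio, is unchanged up to the truncation noise introduced by the shift.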

\subsection{Model of a FIR filter}

A cascade of filters is composed of $n$ FIR stages. In stage $i$ ($1 \leq i \leq n$)
the FIR has $C_i$ coefficients and each coefficient is an integer value with $\pi^C_i$
bits while the filtered data are shifted by $\pi^S_i$ bits. We also define $\pi^-_i$ as
the size of input data and $\pi^+_i$ as the size of output data. Figure~\ref{fig:fir_stage}
shows a filtering stage.
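
To illustrate how these sizes relate to each other, the following bound is a sketch assuming full-precision integer accumulation without intermediate rounding. It is a worst-case arithmetic argument added here for illustration, and not necessarily the relation used in the model developed in this paper:
\begin{align*}
\pi_i^+ \leq \pi_i^- + \pi_i^C + \left\lceil \log_2 C_i \right\rceil - \pi_i^S
\end{align*}
Each product of a $\pi_i^-$-bit sample by a $\pi_i^C$-bit coefficient fits in $\pi_i^-+\pi_i^C$ bits, and summing $C_i$ such products adds at most $\lceil\log_2 C_i\rceil$ bits. As a hypothetical numerical example, 14-bit samples filtered by $C_i=32$ coefficients of 8~bits yield at most $14+8+5=27$ bits, so that a shift of $\pi_i^S=11$ bits brings the output back to 16~bits.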

\begin{figure}
\centering
\begin{tikzpicture}[node distance=2cm]
\node[draw,minimum size=1.3cm] (FIR) { $C_i, \pi_i^C$ } ;
\node[draw,minimum size=1.3cm] (Shift) [right of=FIR, ] { $\pi_i^S$ } ;
\node (Start) [left of=FIR] { } ;
\node (End) [right of=Shift] { } ;

\node[draw,fit=(FIR) (Shift)] (Filter) { } ;

\draw[->] (Start) edge node [above] { $\pi_i^-$ } (FIR) ;
\draw[->] (FIR) -- (Shift) ;
\draw[->] (Shift) edge node [above] { $\pi_i^+$ } (End) ;
\end{tikzpicture}
\caption{A single filter is composed of a FIR (on the left) and a Shifter (on the right)}
\label{fig:fir_stage}
\end{figure}

FIR $i$ has been characterized through numerical simulation as able to reject $F(C_i, \pi_i^C)$ dB.
This rejection has been computed using the FIR coefficient design functions of the GNU Octave software
(\texttt{firls} and \texttt{fir1}).
For each configuration $(C_i, \pi_i^C)$, we first create a FIR with floating point coefficients and a given number $C_i$ of coefficients.
Then, the floating point coefficients are quantized into integers. In order to ensure that the coefficients are effectively coded on $\pi_i^C$~bits,
the coefficients are normalized by their absolute maximum before being scaled to integer coefficients.
At least one coefficient is coded on $\pi_i^C$~bits, and in practice only $b_{C_i/2}$ is coded on $\pi_i^C$~bits while the others are coded on far fewer bits.
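
The normalization and scaling step can be written as follows. This is a sketch assuming signed integers and rounding to the nearest value, with $\tilde{b}_k$ denoting the integer coefficient (a notation introduced here); the exact scaling used to produce the results of this paper may differ:
\begin{align*}
\tilde{b}_k=\mathrm{round}\left(\left(2^{\pi_i^C-1}-1\right)\frac{b_k}{\max_j\left|b_j\right|}\right)
\end{align*}
With this scaling, the coefficient of largest magnitude maps to $\pm\left(2^{\pi_i^C-1}-1\right)$ and therefore uses all $\pi_i^C$ bits, sign included, whereas a coefficient ten times smaller needs about $\log_2 10\simeq 3.3$ fewer bits.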

With these coefficients, the \texttt{freqz} function is used to estimate the magnitude of the filter
transfer function.
Comparing the performance of FIRs however requires defining a unique criterion. As shown in Fig.~\ref{fig:fir_mag},
the FIR magnitude exhibits two parts: we focus here on the transition width and the rejection rather than on the
passband ripples emphasized in \cite{lim_1988,lim_1996}. {\color{red}Throughout this demonstration,
we arbitrarily set a passband of 40\% of the Nyquist frequency and a stopband from 60\%
of the Nyquist frequency to the end of the band, as would be typically selected to prevent
aliasing before decimating the dataflow by 2. The method however generalizes to any filter
shape as long as it is defined from the initial modelling steps: Fig. \ref{fig:rejection_pyramid}
as described below is indeed unique for each filter shape.}

254

\begin{figure}
\begin{center}
\scalebox{0.8}{
\centering
\begin{tikzpicture}[scale=0.3]
\draw[<->] (0,15) -- (0,0) -- (21,0) ;
\draw[thick] (0,12) -- (8,12) -- (20,0) ;

\draw (0,14) node [left] { $P$ } ;
\draw (20,0) node [below] { $f$ } ;

\draw[>=latex,<->] (0,14) -- (8,14) ;
\draw (4,14) node [above] { passband } node [below] { $40\%$ } ;

\draw[>=latex,<->] (8,14) -- (12,14) ;
\draw (10,14) node [above] { transition } node [below] { $20\%$ } ;

\draw[>=latex,<->] (12,14) -- (20,14) ;
\draw (16,14) node [above] { stopband } node [below] { $40\%$ } ;

\draw[>=latex,<->] (16,12) -- (16,8) ;
\draw (16,10) node [right] { rejection } ;

\draw[dashed] (8,-1) -- (8,14) ;
\draw[dashed] (12,-1) -- (12,14) ;

\draw[dashed] (8,12) -- (16,12) ;
\draw[dashed] (12,8) -- (16,8) ;

\end{tikzpicture}
}
\end{center}
\caption{Shape of the filter transmitted power $P$ as a function of frequency $f$:
the passband is considered to occupy the initial 40\% of the Nyquist frequency range,
the stopband the last 40\%, allowing 20\% transition width.}
\label{fig:fir_mag}
\end{figure}

In the transition band, the behavior of the filter is left free: we only {\color{red}define} the passband and the stopband characteristics.
% r2.7
% Our initial criterion considered the mean value of the stopband rejection, as shown in figure~\ref{fig:mean_criterion}. This criterion
% yields unacceptable results since notches overestimate the rejection capability of the filter. Furthermore, the losses within
% the passband are not considered and might be excessive for excessively wide transitions widths introduced for filters with few coefficients.
Our criterion to compute the filter rejection considers
% r2.8 and r2.2 r2.3
the {\color{red}minimal} rejection within the stopband, from which the {\color{red}sum of the absolute values
within the passband is subtracted to avoid filters with excessive ripples, normalized to the
bin width to remain consistent with the passband criterion (dBc/Hz units in all cases)}. With this
criterion, we meet the expected rejection capability of low pass filters as shown in figure~\ref{fig:custom_criterion}.
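
One possible formalization of this criterion is sketched here under assumptions that are not stated in the text: $H(f)$ denotes the magnitude in dB of the transfer function returned by \texttt{freqz}, normalized to 0~dB in the passband, $\mathcal{P}$ and $\mathcal{S}$ are the sets of passband and stopband frequency bins, and $N_{\mathcal{P}}$ is the number of passband bins. With these hypothetical notations, the criterion would read
\begin{equation*}
F = \min_{f \in \mathcal{S}} \left( -H(f) \right) - \frac{1}{N_{\mathcal{P}}} \sum_{f \in \mathcal{P}} \left| H(f) \right|
\end{equation*}
where the first term is the worst-case stopband rejection and the second term penalizes passband ripples. The exact normalization used to produce the figures may differ from this sketch.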

% \begin{figure}
% \centering
% \includegraphics[width=\linewidth]{images/colored_mean_criterion}
% \caption{Mean stopband rejection criterion comparison between monolithic filter and cascaded filters}
% \label{fig:mean_criterion}
% \end{figure}

\begin{figure}
\centering
\includegraphics[width=\linewidth]{images/colored_custom_criterion}
\caption{Custom criterion (minimal rejection in the stopband minus the {\color{red} sum of the
absolute values of the passband rejection normalized to the bandwidth})
comparison between monolithic filter and cascaded filters}
\label{fig:custom_criterion}
\end{figure}

Thanks to the latter criterion, which will be used in the remainder of this paper, we are able to automatically generate multiple sets of FIR taps
and estimate their rejection. Figure~\ref{fig:rejection_pyramid} exhibits the
rejection as a function of the number of coefficients and the number of bits representing these coefficients.
The pyramid-shaped surface exhibits optimum configuration sets at the vertex where both edges meet.
Indeed, for a given number of coefficients, increasing the number of bits beyond the edge will not improve the rejection.
Conversely, for a given number of bits, increasing the number of coefficients will not improve
the rejection. Hence the best coefficient sets are on the vertex of the pyramid.

\begin{figure}
\centering
\includegraphics[width=\linewidth]{images/rejection_pyramid}
\caption{{\color{red}{Filter}} rejection as a function of number of coefficients and number of bits
{\color{red}: this lookup table will be used to identify which filter parameters -- number of bits
representing coefficients and number of coefficients -- best match the targeted transfer function.}}
\label{fig:rejection_pyramid}
\end{figure}

Although we have an efficient criterion to estimate the rejection of one set of coefficients (taps),
a difficulty arises when we cascade filters and estimate the criterion as a sum of two or more individual criteria.
If the FIR filter coefficients are the same between the stages, we have:
$$F_{total} = F_1 + F_2$$
But selecting two different sets of coefficients yields a more complex situation in which
the previous relation is no longer valid, as illustrated in figure~\ref{fig:sum_rejection}. The red and blue curves
are two different filters whose maxima and notches are not located at the same frequency offsets.
Hence when summing the transfer functions, the resulting rejection, shown as the dashed yellow line, is improved
with respect to a basic sum of the rejection criteria, shown as the dotted yellow line.
% r2.9
Thus, estimating the rejection of filter cascades is more complex than taking the sum of the rejection
criteria of each filter. However since the {\color{red}individual filter rejection} sum underestimates the rejection capability of the cascade,
% r2.10
this upper bound on the stopband level of the cascade is considered a conservative and acceptable criterion for deciding on the suitability
of the filter cascade to meet design criteria.
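
The conservative nature of this sum can be checked with a short argument, given here as a sketch: the notations $H_1(f)$ and $H_2(f)$ for the magnitudes in dB of the two transfer functions, and $\mathcal{S}$ for the set of stopband frequencies, are introduced for this illustration only. Cascading multiplies the linear transfer functions, hence adds the magnitudes expressed in dB, and
\begin{equation*}
\max_{f \in \mathcal{S}} \left( H_1(f) + H_2(f) \right) \leq \max_{f \in \mathcal{S}} H_1(f) + \max_{f \in \mathcal{S}} H_2(f)
\end{equation*}
with equality only when both filters reach their stopband maximum at the same frequency, as is the case for identical filters. When the maxima of one filter face the notches of the other, the left-hand side is strictly lower, so that the cascade rejects at least as much as the sum of the individual stopband terms predicts. This argument only covers the stopband term of the criterion; the passband term is not addressed here.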

\begin{figure}
\centering
\includegraphics[width=\linewidth]{images/cascaded_criterion}
\caption{{\color{red}Transfer function of individual filters and after cascading} the two filters,
{\color{red}demonstrating that the selected criterion of minimal rejection in the stopband (horizontal
lines) is met. Notice that the cascaded filter has better rejection than the sum of the stopband
maxima of the individual filters.}
}
\label{fig:sum_rejection}
\end{figure}

% r2.6
{\color{red}
Finally, in our case, we consider that the input signal is fully known: the
resolution of the input data stream is fixed and remains the same for all experiments
in this paper.}

Based on this analysis, we address the estimate of resource consumption (called
% r2.11
silicon area -- in the case of FPGAs this means processing cells) as a function of
filter characteristics. As a reminder, we do not aim at matching the actual hardware
configuration but consider an arbitrary silicon area occupied by each processing function,
and will assess after synthesis how well this arbitrary unit matches the actual
hardware resources provided by FPGA manufacturers. The sum of individual processing
unit areas is constrained by a total silicon area representative of FPGA global resources.
Formally, variable $a_i$ is the area taken by filter~$i$
(in arbitrary units). Variable $r_i$ is the rejection of filter~$i$ (in dB).
Constant $\mathcal{A}$ is the total available area. We model our problem as follows:

\begin{align}
\text{Maximize } & \sum_{i=1}^n r_i \notag \\
\sum_{i=1}^n a_i & \leq \mathcal{A} & \label{eq:area} \\
a_i & = C_i \times (\pi_i^C + \pi_i^-), & \forall i \in [1, n] \label{eq:areadef} \\
r_i & = F(C_i, \pi_i^C), & \forall i \in [1, n] \label{eq:rejectiondef} \\
\pi_i^+ & = \pi_i^- + \pi_i^C - \pi_i^S, & \forall i \in [1, n] \label{eq:bits} \\
\pi_{i - 1}^+ & = \pi_i^-, & \forall i \in [2, n] \label{eq:inout} \\
\pi_i^+ & \geq 1 + \sum_{k=1}^{i} \left(1 + \frac{r_k}{6}\right), & \forall i \in [1, n] \label{eq:maxshift} \\
\pi_1^- &= \Pi^I \label{eq:init}
\end{align}

Equation~\ref{eq:area} states that the total area taken by the filters must not
exceed the available area. Equation~\ref{eq:areadef} gives the definition of
the area used by a filter, considered as the area of the FIR since the Shifter is
assumed not to require significant resources. We consider that the FIR needs $C_i$ registers of size
$\pi_i^C + \pi_i^-$~bits to store the results of the multiplications of the
input data with the coefficients. Equation~\ref{eq:rejectiondef} gives the
definition of the rejection of the filter thanks to the tabulated function~$F$ that we defined
previously. The Shifter does not introduce negative rejection as we will explain later,
so the rejection only comes from the FIR. Equation~\ref{eq:bits} states the
relation between $\pi_i^+$ and $\pi_i^-$. The multiplications in the FIR add
$\pi_i^C$ bits as most coefficients are close to zero, and the Shifter removes
$\pi_i^S$ bits. Equation~\ref{eq:inout} states that the output number of bits of
a filter is the same as the input number of bits of the next filter.
Equation~\ref{eq:maxshift} ensures that the Shifter does not introduce negative
rejection. Indeed, the results of the FIR can be right shifted without compromising
the quality of the rejection up to a threshold. Each bit of the output data
increases the maximum rejection level by 6~dB. We add one to take the sign bit
into account. If equation~\ref{eq:maxshift} were not present, the Shifter could
shift too much and introduce some noise in the output data. Each supplementary
shift bit would degrade the rejection by an additional 6~dB. A totally equivalent equation is:
$\pi_i^S \leq \pi_i^- + \pi_i^C - 1 - \sum_{k=1}^{i} \left(1 + \frac{r_k}{6}\right)$.
Finally, equation~\ref{eq:init} gives the number of bits of the global input.
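
As an illustration of how these constraints interact, consider the following worked example. All numerical values are hypothetical and chosen for readability only: they are not taken from figure~\ref{fig:rejection_pyramid}. We assume a 16-bit input ($\Pi^I = 16$) and two identical stages with $C_i = 21$ coefficients coded on $\pi_i^C = 8$~bits, each assumed to provide $r_i = 36$~dB. For the first stage, equations~\ref{eq:areadef}, \ref{eq:maxshift} and \ref{eq:bits} give
\begin{align*}
a_1 & = 21 \times (8 + 16) = 504 \\
\pi_1^+ & \geq 1 + \left(1 + \frac{36}{6}\right) = 8 \\
\pi_1^S & = \pi_1^- + \pi_1^C - \pi_1^+ \leq 16 + 8 - 8 = 16
\end{align*}
so up to 16~bits can be dropped by the first Shifter. If the output is indeed reduced to $\pi_1^+ = 8$~bits, equation~\ref{eq:inout} sets $\pi_2^- = 8$ and the second stage yields
\begin{align*}
a_2 & = 21 \times (8 + 8) = 336 \\
\pi_2^+ & \geq 1 + 2 \times \left(1 + \frac{36}{6}\right) = 15 \\
\pi_2^S & \leq 8 + 8 - 15 = 1
\end{align*}
for a total area of $504 + 336 = 840$ arbitrary units and a summed rejection of 72~dB. In this sketch, shifting aggressively after the first stage reduces the area of the second stage but leaves almost no room for its Shifter.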

{\color{red}
This model is non-linear since we multiply variables with one another,
and it is not even quadratic, as the cost function $F$ does not have a known
linear or quadratic expression. To linearize this problem, we introduce $p$ FIR configurations.
% AH: merge conflict
% This variable must be defined by the user, it represent the number of different
% set of coefficients generated (for memory, we use \texttt{firls} and \texttt{fir1}
% functions from GNU Octave). To choose this value, we consider a subset of the figure~\ref{fig:rejection_pyramid}
% to restrict the number of configurations. Indeed, it is useless to have too many coefficients or
% too many bits, hence we take the configurations close to edge of pyramid. Thank to theses
% configurations $C_{ij}$ and $\pi_{ij}^C$ ($1 \leq j \leq p$) become constant
% and the function $F$ can be estimate for each configurations
% thanks our rejection criterion. We also defined binary
The parameter $p$ is defined by the user and represents the number of different
sets of coefficients generated (as a reminder, we use the \texttt{firls} and \texttt{fir1}
functions from GNU Octave) based on the targeted filter characteristics and implementation
assumptions (estimated number of bits defining the coefficients). Hence, $C_{ij}$ and
$\pi_{ij}^C$ become constants for
$1 \leq j \leq p$, so that the function $F$ can be tabulated (lookup table)
for each configuration thanks to the rejection criterion. We also define the binary
variable $\delta_{ij}$ that has value 1 if stage~$i$ is in configuration~$j$
and 0 otherwise. The new equations are as follows:
}

\begin{align}
a_i & = \sum_{j=1}^p \delta_{ij} \times C_{ij} \times (\pi_{ij}^C + \pi_i^-), & \forall i \in [1, n] \label{eq:areadef2} \\
r_i & = \sum_{j=1}^p \delta_{ij} \times F(C_{ij}, \pi_{ij}^C), & \forall i \in [1, n] \label{eq:rejectiondef2} \\
\pi_i^+ & = \pi_i^- + \left(\sum_{j=1}^p \delta_{ij} \pi_{ij}^C\right) - \pi_i^S, & \forall i \in [1, n] \label{eq:bits2} \\
\sum_{j=1}^p \delta_{ij} & \leq 1, & \forall i \in [1, n] \label{eq:config}
\end{align}

Equations \ref{eq:areadef2}, \ref{eq:rejectiondef2} and \ref{eq:bits2} replace
respectively equations \ref{eq:areadef}, \ref{eq:rejectiondef} and \ref{eq:bits}.
Equation~\ref{eq:config} states that for each stage, at most a single configuration is chosen.

{\color{red}
% JM: merge conflict
% However the problem remains quadratic at this stage since in the constraint~\ref{eq:areadef2}
% we multiply
% $\delta_{ij}$ and $\pi_i^-$. However, since $\delta_{ij}$ is a binary variable we can
% linearise this multiplication if we can bound $\pi_i^-$. As $\pi_i^-$ is the data size,
% we define $0 < \pi_i^- \leq 128$ which is the maximum data size whose estimation is
% assumed on hardware characteristics.
% The Gurobi (\url{www.gurobi.com}) optimization software used to solve this quadratic
% model is able to linearize the model provided as is. This model
% has $O(np)$ variables and $O(n)$ constraints.}
However the problem remains quadratic at this stage since in constraint~\ref{eq:areadef2}
we multiply
$\delta_{ij}$ and $\pi_i^-$. Since $\delta_{ij}$ is a binary variable, we can
linearize this multiplication. The following formula shows how to linearize
this situation in the general case with $y$ a binary variable and $x$ a real variable ($0 \leq x \leq X^{max}$):
\begin{equation*}
m = x \times y \implies
\left \{
\begin{split}
m & \geq 0 \\
m & \leq y \times X^{max} \\
m & \leq x \\
m & \geq x - (1 - y) \times X^{max}
\end{split}
\right .
\end{equation*}
So if we bound $\pi_i^-$ from above by 128~bits, the maximum data size estimated
from hardware characteristics,
the Gurobi (\url{www.gurobi.com}) optimization software is able to linearize
the quadratic problem for us, so the model is left as is. This model
has $O(np)$ variables and $O(n)$ constraints.}
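
As a quick check of the four inequalities used to linearize the product of a binary and a bounded real variable, given as a sketch, the two values of the binary variable can be enumerated. The auxiliary variable $m_{ij}$ used afterwards is a notation introduced for this illustration only and does not appear in the original model:
\begin{equation*}
\begin{split}
y = 0 & \implies 0 \leq m \leq 0 \text{ and } x - X^{max} \leq m \leq x, \text{ hence } m = 0 \\
y = 1 & \implies 0 \leq m \leq X^{max} \text{ and } x \leq m \leq x, \text{ hence } m = x
\end{split}
\end{equation*}
Both cases are feasible because $0 \leq x \leq X^{max}$, and in both cases $m = x \times y$. Applied to constraint~\ref{eq:areadef2} with $y = \delta_{ij}$, $x = \pi_i^-$ and $X^{max} = 128$, writing $m_{ij} = \delta_{ij} \times \pi_i^-$ would give the linear expression
\begin{equation*}
a_i = \sum_{j=1}^p C_{ij} \times \left( \delta_{ij} \times \pi_{ij}^C + m_{ij} \right)
\end{equation*}
in which $C_{ij}$ and $\pi_{ij}^C$ are constants, so that only the binary variables $\delta_{ij}$ and the auxiliary variables $m_{ij}$ remain.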

% This model is non-linear and even non-quadratic, as $F$ does not have a known
% linear or quadratic expression. We introduce $p$ FIR configurations
% $(C_{ij}, \pi_{ij}^C), 1 \leq j \leq p$ that are constants.
% % r2.12
% This variable must be defined by the user, it represent the number of different
% set of coefficients generated (for memory, we use \texttt{firls} and \texttt{fir1}
% functions from GNU Octave).
% We define binary
% variable $\delta_{ij}$ that has value 1 if stage~$i$ is in configuration~$j$
% and 0 otherwise. The new equations are as follows:
%
% \begin{align}
% a_i & = \sum_{j=1}^p \delta_{ij} \times C_{ij} \times (\pi_{ij}^C + \pi_i^-), & \forall i \in [1, n] \label{eq:areadef2} \\
% r_i & = \sum_{j=1}^p \delta_{ij} \times F(C_{ij}, \pi_{ij}^C), & \forall i \in [1, n] \label{eq:rejectiondef2} \\
% \pi_i^+ & = \pi_i^- + \left(\sum_{j=1}^p \delta_{ij} \pi_{ij}^C\right) - \pi_i^S, & \forall i \in [1, n] \label{eq:bits2} \\
% \sum_{j=1}^p \delta_{ij} & \leq 1, & \forall i \in [1, n] \label{eq:config}
% \end{align}
%
% Equations \ref{eq:areadef2}, \ref{eq:rejectiondef2} and \ref{eq:bits2} replace
% respectively equations \ref{eq:areadef}, \ref{eq:rejectiondef} and \ref{eq:bits}.
% Equation~\ref{eq:config} states that for each stage, a single configuration is chosen at most.
%
% % r2.13
% This modified model is quadratic since we multiply two variables in the
% equation~\ref{eq:areadef2} ($\delta_{ij}$ by $\pi_{ij}^-$) but it can be linearised if necessary.
% The Gurobi
% (\url{www.gurobi.com}) optimization software is used to solve this quadratic
% model, and since Gurobi is able to linearize, the model is left as is. This model
% has $O(np)$ variables and $O(n)$ constraints.

Two problems will be addressed using the workflow described in the next section: on the one
hand maximizing the rejection capability of a set of cascaded filters occupying a fixed arbitrary
silicon area (section~\ref{sec:fixed_area}) and on the other hand the dual problem of minimizing the silicon area
for a fixed rejection criterion (section~\ref{sec:fixed_rej}). In the latter case, the
objective function is replaced with:
\begin{align}
\text{Minimize } & \sum_{i=1}^n a_i \notag
\end{align}
We adapt the constraints of the quadratic program by replacing equation~\ref{eq:area}
with equation~\ref{eq:rejection_min}, where $\mathcal{R}$ is the minimal
rejection required.
\begin{align}
\sum_{i=1}^n r_i & \geq \mathcal{R} & \label{eq:rejection_min}
\end{align}

\section{Design workflow}
\label{sec:workflow}

In this section, we describe the workflow used to compute all the results presented in sections~\ref{sec:fixed_area}
and \ref{sec:fixed_rej}. Figure~\ref{fig:workflow} shows the global workflow and the different steps involved
in the computation of the results.

\begin{figure}
\centering
\begin{tikzpicture}[node distance=0.75cm and 2cm]
\node[draw,minimum size=1cm] (Solver) { Filter Solver } ;
\node (Start) [left= 3cm of Solver] { } ;
\node[draw,minimum size=1cm] (TCL) [right= of Solver] { TCL Script } ;
\node (Input) [above= of TCL] { } ;
\node[draw,minimum size=1cm] (Deploy) [below= of Solver] { Deploy Script } ;
\node[draw,minimum size=1cm] (Bitstream) [below= of TCL] { Bitstream } ;
\node[draw,minimum size=1cm,rounded corners] (Board) [below right= of Deploy] { Board } ;
\node[draw,minimum size=1cm] (Postproc) [below= of Deploy] { Post-Processing } ;
\node (Results) [left= of Postproc] { } ;

\draw[->] (Start) edge node [above] { $\mathcal{A}, n, \Pi^I$ } node [below] { $(C_{ij}, \pi_{ij}^C), F$ } (Solver) ;
\draw[->] (Input) edge node [left] { ADC or PRN } (TCL) ;
\draw[->] (Solver) edge node [below] { (1a) } (TCL) ;
\draw[->] (Solver) edge node [right] { (1b) } (Deploy) ;
\draw[->] (TCL) edge node [left] { (2) } (Bitstream) ;
\draw[->,dashed] (Bitstream) -- (Deploy) ;
\draw[->] (Deploy) to[out=-30,in=120] node [above] { (3) } (Board) ;
\draw[->] (Board) to[out=150,in=-60] node [below] { (4) } (Deploy) ;
\draw[->] (Deploy) edge node [left] { (5) } (Postproc) ;
\draw[->] (Postproc) -- (Results) ;
\end{tikzpicture}
\caption{Design workflow from the input parameters to the results {\color{red} allowing for
a fully automated optimal solution search.}}
\label{fig:workflow}
\end{figure}

The filter solver is a C++ program that takes as input the maximum area
$\mathcal{A}$, the number of stages $n$, the size of the input signal $\Pi^I$,
the FIR configurations $(C_{ij}, \pi_{ij}^C)$ and the function $F$. It creates
the quadratic programs and uses the Gurobi solver to estimate the optimal results.
Then it produces two scripts: a TCL script ((1a) on figure~\ref{fig:workflow})
and a deploy script ((1b) on figure~\ref{fig:workflow}).

The TCL script describes the whole digital processing chain from the beginning
(the raw signal data) to the end (the filtered data) in a language compatible
with proprietary synthesis software, namely Vivado for Xilinx and Quartus for
Intel/Altera. The raw input data are generated by a 20-bit Pseudo Random Number (PRN)
generator inside the FPGA, and $\Pi^I$ is fixed at 16~bits.
The script then builds each stage of the chain with a generic FIR task that
comes from a skeleton library. The generic FIR is highly configurable:
both the number of coefficients and the size of the coefficients can be set. The coefficients
themselves are not stored in the script.
As the signal is processed in real time, the output signal is stored as
consecutive bursts of data for post-processing, mainly to assess the consistency of the
implemented FIR cascade transfer function with the design criteria and the expected
transfer function.

The TCL script is used by Vivado to produce the FPGA bitstream ((2) on figure~\ref{fig:workflow}).
We use the 2018.2 version of Xilinx Vivado and we execute the synthesized
bitstream on a Redpitaya board fitted with a Xilinx Zynq-7010 series
FPGA (xc7z010clg400-1) and two LTC2145 14-bit 125~MS/s ADCs, loaded with 50~$\Omega$ resistors to
provide a broadband noise source.
The board runs the Linux kernel and surrounding environment produced from the
Buildroot framework available at \url{https://github.com/trabucayre/redpitaya/}. Configuring
the Zynq FPGA, feeding the FIR with the set of coefficients, executing the simulation and
fetching the results are automated.

The deploy script uploads the bitstream to the board ((3) on
figure~\ref{fig:workflow}), flashes the FPGA, loads the different drivers and
configures the coefficients of the FIR filters. It then waits for the results
and retrieves the data to the main computer ((4) on figure~\ref{fig:workflow}).

Finally, an Octave post-processing script computes the final results from
the output data ((5) on figure~\ref{fig:workflow}).
The results are normalized so that the Power Spectral Density (PSD) starts at zero
and the different configurations can be compared.
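
One possible reading of this normalization, given here as an assumption rather than as a
description of the actual Octave script, is to subtract from each spectrum expressed in dB
its value at the lowest frequency bin $f_0$ (the symbols $S$, $S_{\text{norm}}$ and $f_0$ are
introduced for this illustration only):
\begin{align}
S_{\text{norm}}(f) = S(f) - S(f_0) \quad \Rightarrow \quad S_{\text{norm}}(f_0) = 0~\text{dB} \notag
\end{align}
With this convention, and provided $f_0$ lies in the passband, the level reached in the
stopband can be read directly as the rejection.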

\section{Maximizing the rejection at fixed silicon area}
\label{sec:fixed_area}
This section presents the output of the filter solver {\em i.e.} the computed
configurations for each stage, the computed rejection and the computed silicon area.
Such results help in understanding the choices made by the solver to compute its solutions.

The experimental setup is composed of three cases. The raw input is generated
by a Pseudo Random Number (PRN) generator, which fixes the input data size $\Pi^I$.
The total silicon area $\mathcal{A}$ is then fixed to either 500, 1000 or 1500
arbitrary units. Hence, the three cases are named MAX/500, MAX/1000 and MAX/1500.
The number of configurations $p$ is 1827, with $C_i$ ranging from 3 to 60 and $\pi^C$
ranging from 2 to 22. In each case, the quadratic program was able to provide a
result for up to five stages ($n = 5$) in the cascaded filter.

Tables~\ref{tbl:gurobi_max_500}, \ref{tbl:gurobi_max_1000} and~\ref{tbl:gurobi_max_1500} show
the results obtained by the filter solver for MAX/500, MAX/1000 and MAX/1500, respectively.

\renewcommand{\arraystretch}{1.4}

\begin{table}
\caption{Configurations $(C_i, \pi_i^C, \pi_i^S)$, rejections and areas (in arbitrary units) for MAX/500}
\label{tbl:gurobi_max_500}
\centering
{\scalefont{0.77}
\begin{tabular}{|c|ccccc|c|c|}
\hline
$n$ & $i = 1$ & $i = 2$ & $i = 3$ & $i = 4$ & $i = 5$ & Rejection & Area \\
\hline
1 & (21, 7, 0) & - & - & - & - & 32~dB & 483 \\
2 & (3, 3, 15) & (31, 9, 0) & - & - & - & 58~dB & 460 \\
3 & (3, 3, 15) & (27, 9, 0) & (5, 3, 0) & - & - & 66~dB & 488 \\
4 & (3, 3, 15) & (19, 7, 0) & (11, 5, 0) & (3, 3, 0) & - & 74~dB & 499 \\
5 & (3, 3, 15) & (23, 8, 0) & (3, 3, 1) & (3, 3, 0) & (3, 3, 0) & 78~dB & 489 \\
\hline
\end{tabular}
}
\end{table}

\begin{table}
\caption{Configurations $(C_i, \pi_i^C, \pi_i^S)$, rejections and areas (in arbitrary units) for MAX/1000}
\label{tbl:gurobi_max_1000}
\centering
{\scalefont{0.77}
\begin{tabular}{|c|ccccc|c|c|}
\hline
$n$ & $i = 1$ & $i = 2$ & $i = 3$ & $i = 4$ & $i = 5$ & Rejection & Area \\
\hline
1 & (37, 11, 0) & - & - & - & - & 56~dB & 999 \\
2 & (3, 3, 15) & (51, 14, 0) & - & - & - & 87~dB & 975 \\
3 & (3, 3, 15) & (35, 11, 0) & (19, 7, 0) & - & - & 99~dB & 1000 \\
4 & (3, 4, 16) & (27, 8, 0) & (19, 7, 1) & (11, 5, 0) & - & 103~dB & 998 \\
5 & (3, 3, 15) & (31, 9, 0) & (19, 7, 0) & (3, 3, 1) & (3, 3, 0) & 111~dB & 984 \\
\hline
\end{tabular}
}
\end{table}

\begin{table}
\caption{Configurations $(C_i, \pi_i^C, \pi_i^S)$, rejections and areas (in arbitrary units) for MAX/1500}
\label{tbl:gurobi_max_1500}
\centering
{\scalefont{0.77}
\begin{tabular}{|c|ccccc|c|c|}
\hline
$n$ & $i = 1$ & $i = 2$ & $i = 3$ & $i = 4$ & $i = 5$ & Rejection & Area \\
\hline
1 & (47, 15, 0) & - & - & - & - & 71~dB & 1457 \\
2 & (19, 6, 15) & (51, 14, 0) & - & - & - & 103~dB & 1489 \\
3 & (3, 3, 15) & (35, 11, 0) & (35, 11, 0) & - & - & 122~dB & 1492 \\
4 & (3, 3, 15) & (27, 8, 0) & (19, 7, 0) & (27, 9, 0) & - & 129~dB & 1498 \\
5 & (3, 3, 15) & (23, 9, 2) & (27, 9, 0) & (19, 7, 0) & (3, 3, 0) & 136~dB & 1499 \\
\hline
\end{tabular}
}
\end{table}

\renewcommand{\arraystretch}{1}

From these tables, we can first state that the more stages are used to define
the cascaded FIR filters, the better the rejection. This was an expected result as it has
been previously observed that many small filters are better than
a single large filter \cite{lim_1988, lim_1996, young_1992}, despite such conclusions
being hardly used in practice due to the lack of tools for identifying individual filter
coefficients in the cascaded approach.

Second, the larger the silicon area, the better the rejection. This was also an
expected result as more area means a filter of better quality with more coefficients
or more bits per coefficient.

Then, we also observe that the first stage can have a larger shift than the other
stages. This is explained by the fact that the solver tries to use just enough
bits for the computed rejection after each stage. In the first stage, a
balance between a strong rejection and a low number of bits is targeted. Equation~\ref{eq:maxshift}
gives the relation between both values.

Finally, we note that the solver consumes nearly all the given silicon area.

The following graphs present the rejection for real data on the FPGA. In all the following
figures, the solid lines represent the actual rejection of the filtered
data on the FPGA as measured experimentally and the dashed lines are the noise levels
given by the quadratic solver. The configurations are those computed by the solver and listed in the tables above.

Figures~\ref{fig:max_500_result}, \ref{fig:max_1000_result} and~\ref{fig:max_1500_result} show
the rejection of the different configurations in the cases of MAX/500, MAX/1000 and MAX/1500, respectively.

% \begin{figure}
% \centering
% \includegraphics[width=\linewidth]{images/max_500}
% \caption{Signal spectrum for MAX/500}
% \label{fig:max_500_result}
% \end{figure}
%
% \begin{figure}
% \centering
% \includegraphics[width=\linewidth]{images/max_1000}
% \caption{Signal spectrum for MAX/1000}
% \label{fig:max_1000_result}
% \end{figure}
%
% \begin{figure}
% \centering
% \includegraphics[width=\linewidth]{images/max_1500}
% \caption{Signal spectrum for MAX/1500}
% \label{fig:max_1500_result}
% \end{figure}

% r2.14, r2.15 and r2.16
\begin{figure}
\centering
\begin{subfigure}{\linewidth}
\includegraphics[width=\linewidth]{images/max_500}
\caption{\color{red}Filter transfer functions for varying number of cascaded filters solving
the MAX/500 problem of maximizing rejection for a given resource allocation (500~arbitrary units).}
\label{fig:max_500_result}
\end{subfigure}

\begin{subfigure}{\linewidth}
\includegraphics[width=\linewidth]{images/max_1000}
\caption{\color{red}Filter transfer functions for varying number of cascaded filters solving
the MAX/1000 problem of maximizing rejection for a given resource allocation (1000~arbitrary units).}
\label{fig:max_1000_result}
\end{subfigure}

\begin{subfigure}{\linewidth}
\includegraphics[width=\linewidth]{images/max_1500}
\caption{\color{red}Filter transfer functions for varying number of cascaded filters solving
the MAX/1500 problem of maximizing rejection for a given resource allocation (1500~arbitrary units).}
\label{fig:max_1500_result}
\end{subfigure}
\caption{\color{red}Solutions for the MAX/500, MAX/1000 and MAX/1500 problems of maximizing
rejection for a given resource allocation.
The filter shape constraint (bandpass and bandstop) is shown as thick
horizontal lines on each chart.}
\end{figure}

In all cases, we observe that the actual rejection is close to the rejection computed by the solver.

We compare the actual silicon resources given by Vivado to the
resources in arbitrary units.
The goal is to check that our arbitrary units of silicon area model the real
resources on the FPGA well enough. In particular, we want to verify that, for a given
number of arbitrary units, the actual silicon resources do not depend on the
number of stages $n$. Most significantly, our approach aims
at remaining far enough from the practical logic gate implementation used by
various vendors to remain platform independent and be portable from one
architecture to another.

Table~\ref{tbl:resources_usage} shows the resource usage in the case of MAX/500, MAX/1000 and
MAX/1500 \emph{i.e.} when the maximum allowed silicon area is fixed to 500, 1000
and 1500 arbitrary units. We have taken care to extract solely the resources used by
the FIR filters and remove additional processing blocks including FIFO and Programmable
Logic (PL -- FPGA) to Processing System (PS -- general purpose processor) communication.

\begin{table}[h!tb]
\caption{Resource occupation {\color{red}following synthesis of the solutions found for
the problem of maximizing rejection for a given resource allocation}. The last column refers to available resources on a Zynq-7010 as found on the Redpitaya.}
\label{tbl:resources_usage}
\centering
\begin{tabular}{|c|c|ccc|c|}
\hline
$n$ & & MAX/500 & MAX/1000 & MAX/1500 & \emph{Zynq 7010} \\ \hline\hline
& LUT & 249 & 453 & 627 & \emph{17600} \\
1 & BRAM & 1 & 1 & 1 & \emph{120} \\
& DSP & 21 & 37 & 47 & \emph{80} \\ \hline
& LUT & 2374 & 5494 & 691 & \emph{17600} \\
2 & BRAM & 2 & 2 & 2 & \emph{120} \\
& DSP & 0 & 0 & 70 & \emph{80} \\ \hline
& LUT & 2443 & 3304 & 3521 & \emph{17600} \\
3 & BRAM & 3 & 3 & 3 & \emph{120} \\
& DSP & 0 & 19 & 35 & \emph{80} \\ \hline
& LUT & 2634 & 3753 & 2557 & \emph{17600} \\
4 & BRAM & 4 & 4 & 4 & \emph{120} \\
& DSP & 0 & 19 & 46 & \emph{80} \\ \hline
& LUT & 2423 & 3047 & 2847 & \emph{17600} \\
5 & BRAM & 5 & 5 & 5 & \emph{120} \\
& DSP & 0 & 22 & 46 & \emph{80} \\ \hline
\end{tabular}
\end{table}

In some cases, Vivado replaces the DSPs by Look Up Tables (LUTs). We assume that,
when the filter coefficients are small enough, or when the input size is small
enough, Vivado optimizes resource consumption by selecting multiplexers to
implement the multiplications instead of a DSP. In this case, it is quite difficult
to compare the whole silicon budget.

However, a rough estimation can be made with a simple equivalence: looking at
the first column (MAX/500), where the number of LUTs is quite stable for $n \geq 2$,
we can deduce that a DSP is roughly equivalent to 100~LUTs in terms of silicon
area use. With this equivalence, our 500 arbitrary units correspond to 2500 LUTs,
1000 arbitrary units correspond to 5000 LUTs and 1500 arbitrary units correspond
to 7300 LUTs. The conclusion is that the orders of magnitude of our arbitrary
units map well to actual hardware resources. The relatively small differences can probably be explained
by the optimizations done by Vivado based on the detailed map of available processing resources.
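
These orders of magnitude can be cross-checked against Table~\ref{tbl:resources_usage}. The
sketch below assumes the equivalence of 100~LUTs per DSP and introduces the notation
$\text{LUT}_{\text{eq}}$ for this illustration only:
\begin{align}
\text{LUT}_{\text{eq}} = \ & \text{LUT} + 100 \times \text{DSP} \notag \\
\text{MAX/500},\ n = 1: \ & 249 + 100 \times 21 = 2349 \notag \\
\text{MAX/500},\ n = 2: \ & 2374 + 100 \times 0 = 2374 \notag \\
\text{MAX/1000},\ n = 3: \ & 3304 + 100 \times 19 = 5204 \notag \\
\text{MAX/1500},\ n = 5: \ & 2847 + 100 \times 46 = 7447 \notag
\end{align}
For MAX/500, the single-stage design (21~DSPs and 249~LUTs) and the two-stage design (no DSP
and 2374~LUTs) occupy similar areas in arbitrary units, which gives
$(2374 - 249) / 21 \approx 101$~LUTs per DSP, consistent with the equivalence used above.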

We now present the computation time needed to solve the quadratic problem.
For each case, the filter solver software is executed on an Intel(R) Xeon(R) CPU E5606
clocked at 2.13~GHz. The CPU has 8 cores that are used by Gurobi to solve
the quadratic problem. Table~\ref{tbl:area_time} shows the time needed to solve the quadratic
problem when the maximal area is fixed to 500, 1000 and 1500 arbitrary units.

\begin{table}[h!tb]
\caption{Time needed to solve the quadratic program with Gurobi}
\label{tbl:area_time}
\centering
\begin{tabular}{|c|c|c|c|}\hline
$n$ & Time (MAX/500) & Time (MAX/1000) & Time (MAX/1500) \\\hline\hline
1 & 0.1~s & 0.1~s & 0.3~s \\
2 & 1.1~s & 2.2~s & 12~s \\
3 & 17~s & 137~s ($\approx$ 2~min) & 275~s ($\approx$ 4~min) \\
4 & 52~s & 5448~s ($\approx$ 90~min) & 5505~s ($\approx$ 91~min) \\
5 & 286~s ($\approx$ 4~min) & 4119~s ($\approx$ 68~min) & 235479~s ($\approx$ 3~days) \\\hline
\end{tabular}
\end{table}

As expected, the computation time seems to rise exponentially with the number of stages. % TODO: exponential?
When the area is limited, the design exploration space is smaller and the solver is able to
find an optimal solution faster.
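
A rough way to look at this trend, using only the MAX/500 column of Table~\ref{tbl:area_time},
is to compute the ratio of solve times between consecutive values of $n$ (the notation $t_n$
for the solve time with $n$ stages is introduced here for this check only):
\begin{equation*}
\frac{t_2}{t_1} = \frac{1.1}{0.1} = 11, \qquad
\frac{t_3}{t_2} = \frac{17}{1.1} \approx 15, \qquad
\frac{t_4}{t_3} = \frac{52}{17} \approx 3.1, \qquad
\frac{t_5}{t_4} = \frac{286}{52} = 5.5,
\end{equation*}
so that the average multiplicative factor per added stage is
\begin{equation*}
\left(\frac{t_5}{t_1}\right)^{1/4} = 2860^{1/4} \approx 7.3 .
\end{equation*}
The individual ratios are scattered and five data points are too few to establish a growth law,
so this figure is only indicative of a roughly geometric increase.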

\subsection{Minimizing resource occupation at fixed rejection}\label{sec:fixed_rej}

This section presents the results of the complementary quadratic program aimed at
minimizing the area occupation for a targeted rejection level.

The experimental setup is composed of four cases. The raw input is the same
as in the previous section, from a PRN generator, which fixes the input data size $\Pi^I$.
The targeted rejection $\mathcal{R}$ is then fixed to either 40, 60, 80 or 100~dB.
Hence, the four cases have been named MIN/40, MIN/60, MIN/80 and MIN/100.
The number of configurations $p$ is the same as in the previous section.

Tables~\ref{tbl:gurobi_min_40}, \ref{tbl:gurobi_min_60}, \ref{tbl:gurobi_min_80} and
\ref{tbl:gurobi_min_100} show the results obtained by the filter solver for MIN/40, MIN/60,
MIN/80 and MIN/100, respectively.

\renewcommand{\arraystretch}{1.4}

\begin{table}[h!tb]
\caption{Configurations $(C_i, \pi_i^C, \pi_i^S)$, rejections and areas (in arbitrary units) for MIN/40}
\label{tbl:gurobi_min_40}
\centering
{\scalefont{0.77}
\begin{tabular}{|c|ccccc|c|c|}
\hline
$n$ & $i = 1$ & $i = 2$ & $i = 3$ & $i = 4$ & $i = 5$ & Rejection & Area \\
\hline
1 & (27, 8, 0) & - & - & - & - & 41~dB & 648 \\
2 & (3, 2, 14) & (19, 7, 0) & - & - & - & 40~dB & 263 \\
3 & (3, 3, 15) & (11, 5, 0) & (3, 3, 0) & - & - & 41~dB & 192 \\
4 & (3, 3, 15) & (3, 3, 0) & (3, 3, 0) & (3, 3, 0) & - & 42~dB & 147 \\
\hline
\end{tabular}
}
\end{table}

\begin{table}[h!tb]
\caption{Configurations $(C_i, \pi_i^C, \pi_i^S)$, rejections and areas (in arbitrary units) for MIN/60}
\label{tbl:gurobi_min_60}
\centering
{\scalefont{0.77}
\begin{tabular}{|c|ccccc|c|c|}
\hline
$n$ & $i = 1$ & $i = 2$ & $i = 3$ & $i = 4$ & $i = 5$ & Rejection & Area \\
\hline
1 & (39, 13, 0) & - & - & - & - & 60~dB & 1131 \\
2 & (3, 3, 15) & (35, 10, 0) & - & - & - & 60~dB & 547 \\
3 & (3, 3, 15) & (27, 8, 0) & (3, 3, 0) & - & - & 62~dB & 426 \\
4 & (3, 2, 14) & (11, 5, 1) & (11, 5, 0) & (3, 3, 0) & - & 60~dB & 344 \\
5 & (3, 2, 14) & (3, 3, 1) & (3, 3, 0) & (3, 3, 0) & (3, 3, 0) & 60~dB & 279 \\
\hline
\end{tabular}
}
\end{table}

\begin{table}[h!tb]
\caption{Configurations $(C_i, \pi_i^C, \pi_i^S)$, rejections and areas (in arbitrary units) for MIN/80}
\label{tbl:gurobi_min_80}
\centering
{\scalefont{0.77}
\begin{tabular}{|c|ccccc|c|c|}
\hline
$n$ & $i = 1$ & $i = 2$ & $i = 3$ & $i = 4$ & $i = 5$ & Rejection & Area \\
\hline
1 & (55, 16, 0) & - & - & - & - & 81~dB & 1760 \\
2 & (3, 3, 15) & (47, 14, 0) & - & - & - & 80~dB & 903 \\
3 & (3, 3, 15) & (23, 9, 0) & (19, 7, 0) & - & - & 80~dB & 698 \\
4 & (3, 3, 15) & (27, 9, 0) & (7, 7, 4) & (3, 3, 0) & - & 80~dB & 605 \\
5 & (3, 2, 14) & (27, 8, 0) & (3, 3, 1) & (3, 3, 0) & (3, 3, 0) & 81~dB & 534 \\
\hline
\end{tabular}
}
\end{table}

\begin{table}[h!tb]
\caption{Configurations $(C_i, \pi_i^C, \pi_i^S)$, rejections and areas (in arbitrary units) for MIN/100}
\label{tbl:gurobi_min_100}
\centering
{\scalefont{0.77}
\begin{tabular}{|c|ccccc|c|c|}
\hline
$n$ & $i = 1$ & $i = 2$ & $i = 3$ & $i = 4$ & $i = 5$ & Rejection & Area \\
\hline
1 & - & - & - & - & - & - & - \\
2 & (15, 7, 17) & (51, 14, 0) & - & - & - & 100~dB & 1365 \\
3 & (3, 3, 15) & (27, 9, 0) & (27, 9, 0) & - & - & 100~dB & 1002 \\
4 & (3, 3, 15) & (31, 9, 0) & (19, 7, 0) & (3, 3, 0) & - & 101~dB & 909 \\
5 & (3, 3, 15) & (23, 8, 1) & (19, 7, 0) & (3, 3, 0) & (3, 3, 0) & 101~dB & 810 \\
\hline
\end{tabular}
}
\end{table}
\renewcommand{\arraystretch}{1}

From these tables, we first observe that almost all configurations reach the targeted rejection
level, or even exceed it, thanks to our underestimate of the cascade rejection as the sum of the
individual filter rejections. The only exception is the monolithic case ($n = 1$) in
MIN/100: no solution is found for a single monolithic filter to reach a 100~dB rejection.
Furthermore, the area of the monolithic filter is about twice that of the two cascaded filters
(1131 and 1760 arbitrary units vs. 547 and 903 arbitrary units for 60 and 80~dB rejection,
respectively). More generally, the more filters are cascaded, the lower the occupied area.
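
As a consistency check on these tables, the areas can be recomputed by hand under two
assumptions that are inferred here from the tabulated values rather than restated from the
model: a 16-bit input ($\Pi^I = 16$), and a stage cost $a_i = C_i \times (\pi_i^C + \pi_i^-)$,
where $\pi_i^-$ denotes the input width of stage $i$ and the output width is
$\pi_i^+ = \pi_i^- + \pi_i^C - \pi_i^S$ (the symbols $a_i$, $\pi_i^-$ and $\pi_i^+$ are notation
introduced for this check). For the $n = 3$ row of MIN/40
(Table~\ref{tbl:gurobi_min_40}), this gives
\begin{align*}
a_1 &= 3 \times (3 + 16) = 57,  & \pi_1^+ &= 16 + 3 - 15 = 4, \\
a_2 &= 11 \times (5 + 4) = 99,  & \pi_2^+ &= 4 + 5 - 0 = 9, \\
a_3 &= 3 \times (3 + 9) = 36,   & & \\
a_1 + a_2 + a_3 &= 192, & &
\end{align*}
which matches the tabulated area. Under the same assumptions the monolithic MIN/40 filter
costs $27 \times (8 + 16) = 648$ arbitrary units, which illustrates why a short first stage
with a large shift $\pi_1^S$ is attractive: it narrows the data path seen by the longer
stages that follow.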

As in the previous section, the solver always chooses a small filter as the first
filter stage, and the second one is often the biggest filter. This choice can be explained
as in the previous section: the solver uses just enough bits not to degrade the input
signal, and in the second filter it selects a better filter to improve rejection without
having too many bits in the output data.

For the specific case of MIN/40 with $n = 5$, the solver has determined that the optimal
number of filters is 4, so it did not choose any configuration for the last filter. Hence this
solution is equivalent to the result for $n = 4$.

The following graphs present the rejection for real data on the FPGA. In all the following
figures, the solid line represents the actual rejection of the filtered
data on the FPGA as measured experimentally and the dashed line is the noise level
given by the quadratic solver.

Figures~\ref{fig:min_40}, \ref{fig:min_60}, \ref{fig:min_80} and \ref{fig:min_100} show the
rejection of the different configurations in the case of MIN/40, MIN/60, MIN/80 and MIN/100,
respectively.

% \begin{figure}
% \centering
% \includegraphics[width=\linewidth]{images/min_40}
% \caption{Signal spectrum for MIN/40}
% \label{fig:min_40}
% \end{figure}
%
% \begin{figure}
% \centering
% \includegraphics[width=\linewidth]{images/min_60}
% \caption{Signal spectrum for MIN/60}
% \label{fig:min_60}
% \end{figure}
%
% \begin{figure}
% \centering
% \includegraphics[width=\linewidth]{images/min_80}
% \caption{Signal spectrum for MIN/80}
% \label{fig:min_80}
% \end{figure}
%
% \begin{figure}
% \centering
% \includegraphics[width=\linewidth]{images/min_100}
% \caption{Signal spectrum for MIN/100}
% \label{fig:min_100}
% \end{figure}

% r2.14 and r2.15 and r2.16
\begin{figure}
\centering
\begin{subfigure}{\linewidth}
\includegraphics[width=.91\linewidth]{images/min_40}
\caption{\color{red}Filter transfer functions for varying number of cascaded filters solving
the MIN/40 problem of minimizing resource allocation for reaching a 40~dB rejection.}
\label{fig:min_40}
\end{subfigure}

\begin{subfigure}{\linewidth}
\includegraphics[width=.91\linewidth]{images/min_60}
\caption{\color{red}Filter transfer functions for varying number of cascaded filters solving
the MIN/60 problem of minimizing resource allocation for reaching a 60~dB rejection.}
\label{fig:min_60}
\end{subfigure}

\begin{subfigure}{\linewidth}
\includegraphics[width=.91\linewidth]{images/min_80}
\caption{\color{red}Filter transfer functions for varying number of cascaded filters solving
the MIN/80 problem of minimizing resource allocation for reaching an 80~dB rejection.}
\label{fig:min_80}
\end{subfigure}

\begin{subfigure}{\linewidth}
\includegraphics[width=.91\linewidth]{images/min_100}
\caption{\color{red}Filter transfer functions for varying number of cascaded filters solving
the MIN/100 problem of minimizing resource allocation for reaching a 100~dB rejection.}
\label{fig:min_100}
\end{subfigure}
\caption{\color{red}Solutions for the MIN/40, MIN/60, MIN/80 and MIN/100 problems of reaching a
given rejection while minimizing resource allocation. The filter shape constraint (bandpass and
bandstop) is shown as thick
horizontal lines on each chart.}
\end{figure}

We observe that all rejections given by the quadratic solver are close to the experimentally
measured rejection. All curves show that the constraint to reach the target rejection is
met with both monolithic (except in MIN/100, which has no monolithic solution) and cascaded filters.

Table~\ref{tbl:resources_usage_comp} shows the resource usage in the case of MIN/40, MIN/60,
MIN/80 and MIN/100, \emph{i.e.} when the target rejection is fixed to 40, 60, 80 and 100~dB. We
have taken care to extract solely the resources used by
the FIR filters and to remove additional processing blocks, including the FIFO and the PL to
PS communication.

\renewcommand{\arraystretch}{1.2}
\begin{table}
\caption{Resource occupation. The last column refers to available resources on a Zynq-7010 as found on the Redpitaya.}
\label{tbl:resources_usage_comp}
\centering
{\scalefont{0.90}
\begin{tabular}{|c|c|cccc|c|}
\hline
$n$ & & MIN/40 & MIN/60 & MIN/80 & MIN/100 & \emph{Zynq 7010} \\ \hline\hline
& LUT & 343 & 334 & 772 & - & \emph{17600} \\
1 & BRAM & 1 & 1 & 1 & - & \emph{120} \\
& DSP & 27 & 39 & 55 & - & \emph{80} \\ \hline
& LUT & 1252 & 2862 & 5099 & 640 & \emph{17600} \\
2 & BRAM & 2 & 2 & 2 & 2 & \emph{120} \\
& DSP & 0 & 0 & 0 & 66 & \emph{80} \\ \hline
& LUT & 891 & 2148 & 2023 & 2448 & \emph{17600} \\
3 & BRAM & 3 & 3 & 3 & 3 & \emph{120} \\
& DSP & 0 & 0 & 19 & 27 & \emph{80} \\ \hline
& LUT & 662 & 1729 & 2451 & 2893 & \emph{17600} \\
4 & BRAM & 4 & 4 & 4 & 4 & \emph{120} \\
& DSP & 0 & 0 & 7 & 19 & \emph{80} \\ \hline
& LUT & - & 1259 & 2602 & 2505 & \emph{17600} \\
5 & BRAM & - & 5 & 5 & 5 & \emph{120} \\
& DSP & - & 0 & 0 & 19 & \emph{80} \\ \hline
\end{tabular}
}
\end{table}
\renewcommand{\arraystretch}{1}

If we keep the previous estimate of the cost of one DSP in terms of LUTs (1~DSP $\approx$ 100~LUTs),
the real resource consumption decreases with the number of stages in the cascaded
filter, in agreement
with the solution given by the quadratic solver. Indeed, the consumption always
decreases, even if the difference between the monolithic filter and the two cascaded
filters is smaller than expected.
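
To make this statement concrete, the MIN/80 column of Table~\ref{tbl:resources_usage_comp}
can be converted to a single LUT-equivalent figure with the 1~DSP $\approx$ 100~LUTs estimate.
The BRAM count is left out of this sketch, and the symbol $L_n$ for the LUT-equivalent
consumption with $n$ stages is introduced here only for this check:
\begin{align*}
L_1 &= 772 + 55 \times 100 = 6272, \\
L_2 &= 5099 + 0 \times 100 = 5099, \\
L_3 &= 2023 + 19 \times 100 = 3923, \\
L_4 &= 2451 + 7 \times 100 = 3151, \\
L_5 &= 2602 + 0 \times 100 = 2602 .
\end{align*}
The sequence is strictly decreasing, while the step from $n = 1$ to $n = 2$ is a factor of
$6272 / 5099 \approx 1.2$, to be compared with the factor $1760 / 903 \approx 1.9$ between
the corresponding areas in Table~\ref{tbl:gurobi_min_80}.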

Finally, Table~\ref{tbl:area_time_comp} shows the computation time needed to solve
the quadratic program.

\renewcommand{\arraystretch}{1.2}
\begin{table}[h!tb]
\caption{Time to solve the quadratic program with Gurobi}
\label{tbl:area_time_comp}
\centering
{\scalefont{0.90}
\begin{tabular}{|c|c|c|c|c|}\hline
$n$ & Time (MIN/40) & Time (MIN/60) & Time (MIN/80) & Time (MIN/100) \\\hline\hline
1 & 0.07~s & 0.02~s & 0.01~s & - \\
2 & 7.8~s & 16~s & 14~s & 1.8~s \\
3 & 4.7~s & 14~s & 28~s & 39~s \\
4 & 39~s & 20~s & 193~s & 522~s ($\approx$ 9~min) \\
5 & - & 12~s & 170~s & 1048~s ($\approx$ 17~min) \\\hline
\end{tabular}
}
\end{table}
\renewcommand{\arraystretch}{1}
